Is this a new feature, an enhancement, or a change to existing functionality?
New Feature
How would you describe the priority of this feature request?
Critical (currently preventing usage)
Please provide a clear description of the problem this feature solves
NICo needs better observability for its deployments. I am filing a single "feat:observability" request instead of multiple tickets; I'd consider this a JIRA Epic.
Need a reference architecture for logging
- Where does NICo currently log to?
- Which components log?
- What's the log format / schema?
- How do I tune log levels?
- How do I send them to a file or stdout/stderr?
- An example of a centralized logging implementation with ELK or equivalent
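
To make the logging questions concrete, here is a minimal sketch of the kind of setup I'd expect the reference architecture to document: one JSON object per line, a tunable level, and a choice of output stream. It uses only the Python standard library. The field names (`ts`, `level`, `component`, `msg`), the `LOG_LEVEL` environment variable, and the `build_logger` helper are my own assumptions for illustration and do not describe how NICo logs today.

```python
import json
import logging
import os
import sys
from typing import Optional, TextIO


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line (hypothetical schema)."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "component": record.name,
            "msg": record.getMessage(),
        })


def build_logger(component: str,
                 stream: Optional[TextIO] = None,
                 level: Optional[str] = None) -> logging.Logger:
    """Return a logger for one component, writing JSON lines to `stream`."""
    # Level comes from the argument, else a hypothetical LOG_LEVEL
    # environment variable, else INFO.
    level_name = (level or os.environ.get("LOG_LEVEL", "INFO")).upper()
    logger = logging.getLogger(component)
    logger.setLevel(level_name)
    logger.handlers.clear()
    # Default to stdout; pass an open file object to log to a file instead.
    handler = logging.StreamHandler(stream or sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

A line-per-event JSON format like this is what most centralized stacks (ELK or equivalent) can ingest without custom parsing, which is why I'd like the actual schema documented.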
Need a reference architecture for metrics
- Grafana dashboards that I can import into my installation
- Schema for the metrics that are available and can be used for debugging
- Important KPIs / metrics for me to watch (e.g., time a machine spends in a given state, nodes stuck in a state)
- Alert thresholds I should set
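
As an illustration of the "time in state" KPI and the alert thresholds I have in mind, here is a small sketch that flags machines that have stayed in one state longer than a per-state limit. The `MachineState` record, the state names, and the threshold values are all hypothetical; they are not taken from NICo and only show the shape of the check I'd like guidance on.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List


@dataclass
class MachineState:
    """Hypothetical record: which state a machine is in and since when."""
    machine_id: str
    state: str
    entered_at: datetime


def time_in_state(machine: MachineState, now: datetime) -> timedelta:
    """How long the machine has been in its current state."""
    return now - machine.entered_at


def find_stuck(machines: Iterable[MachineState],
               thresholds: Dict[str, timedelta],
               now: datetime) -> List[str]:
    """Return IDs of machines that exceeded the limit for their state.

    States without an entry in `thresholds` are never flagged.
    """
    stuck = []
    for machine in machines:
        limit = thresholds.get(machine.state)
        if limit is not None and time_in_state(machine, now) > limit:
            stuck.append(machine.machine_id)
    return stuck
```

What I'm missing today is which states matter and what reasonable limits look like, so that a check like this can back a dashboard panel or an alert rule.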
Need a reference architecture for traces
- What parts of the code base have tracing enabled?
- How do I configure traces?
- How do I send them somewhere?
Feature Description
As an operator, I'd like to have enough observability to operate NICo.
Describe your ideal solution
No response
Describe any alternatives you have considered
No response
Additional context
No response
Code of Conduct