feat: Observability #1989

Description

@nvaprado

Is this a new feature, an enhancement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Critical (currently preventing usage)

Please provide a clear description of problem this feature solves

NICo deployments need better observability. I am filing a single "feat: Observability" issue instead of multiple tickets; I'd consider this the equivalent of a JIRA Epic.

Need a reference architecture for logging

  • Where does NICo currently log to?
  • Which components emit logs?
  • What is the log format / schema?
  • How do I tune log levels?
  • How do I send logs to a file or to stdout/stderr?
  • An example of a centralized logging implementation with ELK or an equivalent stack

Need a reference architecture for metrics

  • Grafana dashboards that I can import into my installation
  • Schema for the metrics that are available and can be used for debugging
  • Important KPIs / metrics to watch (e.g., time a machine spends in a given state, nodes stuck in a state)
  • Alert thresholds I should set
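
To make the KPI request above concrete, here is a minimal sketch of a "time in state" and "stuck node" check. It is not NICo code and does not reflect any existing NICo API: the `MachineState` record, the state names, and the threshold values are all hypothetical, chosen only to illustrate the kind of metric and alert threshold being asked for.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical per-state alert thresholds. Real values would come from the
# reference architecture requested in this issue.
STUCK_THRESHOLDS = {
    "provisioning": timedelta(minutes=30),
    "rebooting": timedelta(minutes=10),
}


@dataclass(frozen=True)
class MachineState:
    """Hypothetical record: which state a machine is in and since when."""
    machine_id: str
    state: str
    entered_at: datetime


def time_in_state(machine: MachineState, now: datetime) -> timedelta:
    """How long the machine has been in its current state."""
    return now - machine.entered_at


def find_stuck(machines: list[MachineState], now: datetime) -> list[str]:
    """Return IDs of machines that exceeded the threshold for their state.

    States without a configured threshold are never reported as stuck.
    """
    stuck = []
    for machine in machines:
        limit = STUCK_THRESHOLDS.get(machine.state)
        if limit is not None and time_in_state(machine, now) > limit:
            stuck.append(machine.machine_id)
    return stuck


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    fleet = [
        MachineState("node-a", "provisioning", now - timedelta(minutes=45)),
        MachineState("node-b", "rebooting", now - timedelta(minutes=5)),
    ]
    print(find_stuck(fleet, now))
```

In a real deployment the same quantity would more likely be exported as a gauge or histogram and alerted on from the metrics backend; the sketch only shows the calculation an operator would want documented.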

Need a reference architecture for traces

  • Which parts of the code base have tracing enabled?
  • How do I configure traces?
  • How do I export them to a tracing backend?

Feature Description

As an operator, I'd like to have enough observability (logs, metrics, and traces) to operate NICo.

Describe your ideal solution

No response

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow NCX Infra Controller's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request

Metadata

Labels

  • feature: Feature (deprecated - use issue type, but it's needed for reporting now)
  • interest/dsx
  • roadmap: Roadmap item with program-level tracking

Projects

Status
In Progress
