Skip to content

Logs export / Observability #7858

@cjen1-msft

Description

@cjen1-msft

Currently logging is expensive in production and hence we need to be careful and 'minimal' in the logs which we do generate.
The idea here is that we want to simplify the logging story using more 'standard-practice' techniques rather than just logging on stdout, and hence also free up logging budget for observability.
The eventual aim is to enable in production tracing of: 'occasional' state machines (node startup, snapshotting, reconfiguration), per-request profiling, active trace validation of ccfraft.tla; however that is all quite far away still.

For now I propose the following rough logging structure to make it easy to enable telemetry in production and simplify and improve the current state, primarily by adding log export to fluentd.

flowchart LR
  subgraph Node["CCF node"]
    PROD["Producer threads<br/>LOG_*_FMT macros"]
    TRACE["Producer threads<br/>TRACE(<tag>, data)"]
    CONS["Console sink<br/>(stdout / journald)"]
    FLU["Fluentd sink<br/>per-thread queues<br/>+ host reader thread"]
    PROD --> CONS
    PROD --> FLU
    TRACE --> CONS
    TRACE --> FLU
  end
  FLUENTD["fluentd<br/>in_forward :24224"]
  MDSD["mdsd"]
  GENEVA["Geneva Logs"]
  FLU -->|msgpack Forward<br/>over TCP| FLUENTD
  FLUENTD --> MDSD
  MDSD --> GENEVA
Loading

Configuration UX

This is what the config could roughly look like:

"logging": {
  "sinks": [
    { "type": "console", "format": "Text", "level": "Info" },
    {
      "type": "fluentd_forward",
      "level": "Debug",
      "fluentd_tag": "ccf.node.1",
      "address": "127.0.0.1:24224",
      "thread_queue_capacity_bytes": "128KB",
      "reconnect_backoff": "1s",
      "connect_timeout": "2s",
    }
  ]
}

The default should be to preserve compatibility with what we have currently.

Note: log level should still be controlled by the CLI but that should be the most verbose level that we allow and per-sink levels are clamped silently to that.
That allows us to have just FAIL to STDOUT but INFO or DEBUG to fluentd.

Constraints

  • Performance impact should be minimal
    • Backpressure from fluentd cannot block the cchost process
      • Likely only option is to drop packets, but log to stderr that it happened (w/ debounce and aggregate count)
  • Log level captured at CLI
  • How to log before fluentd connected? Just buffer?

Rough implementation plan

  • Micro-msgpack library for emitting the trace messages we need
  • Export to fluentd
  • Performance overhead test with trace logging of requests, for landmark style performance metrics

Non-goals

  • Logging semantics
    • Message ordering vs timestamp
      • Just sort the flow by timestamp
    • Clock sync in a distributed deployment
      • We're considering adding vector clocks or lamport clocks to the raft messages to allow a total order over these that respects causality.
      • Other cases MUST NOT rely on cross instance timestamps
    • Dynamic export sinks
  • Private data in loglines controlled on a per-log-level basis, CLI arg takes care of this
  • Migration of existing json logging to msgpack
    • I think we want json as a midpoint which is machine and human readable
  • TLS between cchost and fluentd
    • There is support for TLS for fluentd in_forward but as its all on the same vnet (or localhost) this is unnecessary for v0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels
    No fields configured for Feature.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions