feat(parquet): one-off event annotations on recordings #914

Merged
thinkingfish merged 3 commits into main from claude/add-oneoff-events-4TZFu
May 12, 2026

@thinkingfish
Member

Summary

Attach one-off events (restarts, config changes, anomalies, deploys, ...) to parquet recordings so the viewer can mark key moments on existing charts. Schema + annotate pipeline only — no viewer/UI work yet.

Schema (dashboard::events::Event)

Events live as a JSON blob in the parquet footer under the new events key:

{
  "events": [
    {
      "timestamp":   1778599380000000000,        // required, ns since epoch
      "description": "vllm restart for new config", // required
      "kind":        "restart",                  // free-form (restart, config_change, deploy, incident, anomaly, marker, note, ...)
      "details":     "swapped tp 2 → 4",         // optional long text
      "source":      "vllm",                     // optional scope
      "node":        "gpu01",
      "instance":    "0",
      "labels":      {"reason": "OOM"},          // optional free-form tags
      "duration_ns": 30000000000,                // optional — absent = point, present = band
      "id":          "evt-2026-05-12-001"        // optional, used for dedupe on combine
    }
  ]
}

Key design choices (per brainstorm):

  • Self-describing scope — each event carries its own source/node/instance instead of inheriting from file metadata; combined files therefore keep events at the top level rather than under per_source_metadata.
  • No auto-fill from file metadata — combined files mix sources and nodes; inferring scope would be silently error-prone.
  • kind is free-form — documented conventions only, no enum. Too early to lock down.
  • timestamp + optional duration_ns — unifies point events and intervals in one struct.
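The point-versus-band distinction can be sketched in Rust as follows. This is a minimal illustration, not the actual `dashboard::events::Event` definition: the field names follow the JSON payload above, but the Rust types, the choice of which fields are `Option`, and the `is_band` / `span_ns` helpers are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical mirror of the event payload. Field names follow the JSON
/// shape in the PR description; the types are assumptions.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Event {
    pub timestamp: u64,          // required, ns since epoch
    pub description: String,     // required
    pub kind: Option<String>,    // free-form, conventions only
    pub details: Option<String>, // optional long text
    pub source: Option<String>,  // optional scope
    pub node: Option<String>,
    pub instance: Option<String>,
    pub labels: BTreeMap<String, String>, // optional free-form tags
    pub duration_ns: Option<u64>,         // absent = point, present = band
    pub id: Option<String>,               // optional, used for dedupe
}

impl Event {
    /// A band is any event that carries a duration (helper name is hypothetical).
    pub fn is_band(&self) -> bool {
        self.duration_ns.is_some()
    }

    /// Start and end in ns since epoch. For a point event both are equal.
    pub fn span_ns(&self) -> (u64, u64) {
        let end = self
            .timestamp
            .saturating_add(self.duration_ns.unwrap_or(0));
        (self.timestamp, end)
    }
}
```

Because a point event is just a band of zero length, a consumer that only needs a time range can call one method and ignore the distinction.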

rezolus parquet annotate additions

# Add events from a JSON / JSON-array / JSONL file
rezolus parquet annotate file.parquet --add-events events.json

# Read events from stdin (mirrors --systeminfo -)
cat events.json | rezolus parquet annotate file.parquet --add-events -

# Inline one-shot event, repeatable
rezolus parquet annotate file.parquet \
  --event 'time=2026-05-12T15:23Z,kind=restart,description="vllm restart",node=gpu01' \
  --event 'time=2026-05-12T16:00:00Z,kind=marker,description="benchmark start",label.run=ci-42'

# Wipe existing events; combine with --add-events / --event for "replace"
rezolus parquet annotate file.parquet --clear-events
rezolus parquet annotate file.parquet --clear-events --add-events new.json

Order within a single invocation: clear → add-events files → inline --event flags. Operations append by default; on write, events are sorted by timestamp and deduplicated by their stable id. The CLI --event flag accepts label.<name>=<value> for free-form labels.
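One way the inline `--event` shorthand could be tokenized is sketched below. The PR does not describe the real parser, so this is an assumption-laden illustration: `split_fields` and `parse_inline_event` are hypothetical names, double quotes are treated as simple delimiters with no escape handling, and keys prefixed with `label.` are routed into a separate map.

```rust
use std::collections::BTreeMap;

/// Split an inline spec on commas that are not inside double quotes.
/// Quote characters delimit a value and are dropped from the output.
fn split_fields(spec: &str) -> Vec<String> {
    let mut fields = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for c in spec.chars() {
        match c {
            '"' => in_quotes = !in_quotes,
            ',' if !in_quotes => fields.push(std::mem::take(&mut current)),
            _ => current.push(c),
        }
    }
    fields.push(current);
    fields
}

/// Parse `key=value,...` into (plain fields, labels).
/// Keys of the form `label.<name>` land in the second map under `<name>`.
pub fn parse_inline_event(
    spec: &str,
) -> Result<(BTreeMap<String, String>, BTreeMap<String, String>), String> {
    let mut fields = BTreeMap::new();
    let mut labels = BTreeMap::new();
    for field in split_fields(spec) {
        // Split on the first '=' only, so values may themselves contain '='.
        let (key, value) = field
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got {field:?}"))?;
        let key = key.trim();
        match key.strip_prefix("label.") {
            Some(name) => labels.insert(name.to_string(), value.to_string()),
            None => fields.insert(key.to_string(), value.to_string()),
        };
    }
    Ok((fields, labels))
}
```

Feeding it the second `--event` example above would yield `time`, `kind`, and `description` as plain fields and `run = ci-42` as a label. Validation of required fields and timestamp parsing would happen in a later step.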

Inputs accept RFC3339 timestamps (including the seconds-omitted short form 2026-05-12T15:23Z) and humantime durations (30s, 1m30s); canonical storage is u64 nanoseconds.
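To make the conversion to nanoseconds concrete, here is a std-only sketch that handles a small subset of both input forms. It is not the PR's implementation (the PR does not say which crates do the parsing): it assumes UTC timestamps ending in `Z` with no fractional seconds or offsets, does not reject impossible dates such as February 31, and understands only the `ms`, `s`, `m`, and `h` duration units without whitespace. The function names are hypothetical.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y - era * 400; // year of era, 0..=399
    let mp = if m > 2 { m - 3 } else { m + 9 }; // March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of year, March-based
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Parse `YYYY-MM-DDTHH:MM[:SS]Z` into ns since epoch (subset of RFC3339).
pub fn parse_rfc3339_utc_ns(s: &str) -> Option<u64> {
    let body = s.strip_suffix('Z')?;
    let (date, time) = body.split_once('T')?;
    let dp = date
        .split('-')
        .map(|p| p.parse::<i64>().ok())
        .collect::<Option<Vec<i64>>>()?;
    let tp = time
        .split(':')
        .map(|p| p.parse::<i64>().ok())
        .collect::<Option<Vec<i64>>>()?;
    let (year, month, day) = match dp.as_slice() {
        &[y, m, d] => (y, m, d),
        _ => return None,
    };
    // Seconds may be omitted, as in the short form 2026-05-12T15:23Z.
    let (hour, minute, second) = match tp.as_slice() {
        &[h, m] => (h, m, 0),
        &[h, m, s] => (h, m, s),
        _ => return None,
    };
    let in_range = (1..=12).contains(&month)
        && (1..=31).contains(&day)
        && (0..24).contains(&hour)
        && (0..60).contains(&minute)
        && (0..60).contains(&second);
    if !in_range {
        return None;
    }
    let secs = days_from_civil(year, month, day) * 86_400 + hour * 3_600 + minute * 60 + second;
    u64::try_from(secs).ok()?.checked_mul(1_000_000_000)
}

/// Parse durations such as `30s` or `1m30s` into ns (subset of humantime syntax).
pub fn parse_duration_ns(s: &str) -> Option<u64> {
    let mut rest = s.trim();
    if rest.is_empty() {
        return None;
    }
    let mut total: u64 = 0;
    while !rest.is_empty() {
        // Leading digits, then a unit that runs until the next digit.
        let digits = rest.find(|c: char| !c.is_ascii_digit())?;
        let value: u64 = rest[..digits].parse().ok()?;
        rest = &rest[digits..];
        let unit_len = rest.find(|c: char| c.is_ascii_digit()).unwrap_or(rest.len());
        let scale: u64 = match &rest[..unit_len] {
            "ms" => 1_000_000,
            "s" => 1_000_000_000,
            "m" => 60_000_000_000,
            "h" => 3_600_000_000_000,
            _ => return None,
        };
        total = total.checked_add(value.checked_mul(scale)?)?;
        rest = &rest[unit_len..];
    }
    Some(total)
}
```

Under these assumptions the short form `2026-05-12T15:23Z` maps to 1778599380000000000, the timestamp used in the schema example, and `30s` maps to the 30000000000 shown for `duration_ns`.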

rezolus parquet combine

Concatenates events from every input, sorts by timestamp, and dedupes by id. No per_source_metadata indirection because each event already carries its scope.
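The merge described above can be sketched as concatenate, stable sort, then dedupe. The struct here is trimmed to the fields the merge needs, and two behaviors are assumptions the PR does not spell out: the first copy of a duplicate `id` in sorted order is the one kept, and events without an `id` are never treated as duplicates. `merge_events` is a hypothetical name.

```rust
use std::collections::HashSet;

/// Trimmed-down event carrying only what the merge needs (illustrative).
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub timestamp: u64, // ns since epoch
    pub description: String,
    pub id: Option<String>,
}

/// Concatenate events from every input, sort by timestamp, dedupe by id.
pub fn merge_events(inputs: Vec<Vec<Event>>) -> Vec<Event> {
    let mut merged: Vec<Event> = inputs.into_iter().flatten().collect();

    // sort_by_key is stable, so events with equal timestamps keep input order.
    merged.sort_by_key(|e| e.timestamp);

    // Assumption: keep the first event seen for each id; id-less events always stay.
    let mut seen: HashSet<String> = HashSet::new();
    merged.retain(|e| match &e.id {
        Some(id) => seen.insert(id.clone()),
        None => true,
    });
    merged
}
```

Since every event carries its own scope, the merge never needs to know which input file an event came from, which is why no `per_source_metadata` indirection is required.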

Test plan

  • cargo test -p dashboard (Event struct, normalize/dedupe)
  • cargo test --bin rezolus parquet_tools — 73 tests pass (events: 27 incl. e2e against real parquet, plus existing combine/filter/annotate tests still green)
  • cargo clippy --bin rezolus — no new warnings in touched files
  • CLI smoke tests: file input, inline --event, RFC3339 short form, stdin via -, --clear-events, combine merging events from two annotated files
  • Viewer rendering — explicitly deferred to a follow-up PR

Generated by Claude Code

claude added 2 commits May 12, 2026 04:17
Adds support for attaching one-off events (restarts, config changes,
anomalies, deploys, ...) to parquet recordings as a JSON blob in the
file footer. Events are self-describing — each carries optional
source/node/instance scope rather than inheriting from file-level
metadata — so combined files keep them at the top level instead of
nesting under per_source_metadata.

`rezolus parquet annotate` gains:
- --add-events FILE  (also '-' for stdin; accepts JSON, JSON array, or JSONL)
- --event key=value,...  (inline shorthand, repeatable)
- --clear-events  (composes with --add-events for replace)

Inputs accept RFC3339 timestamps (including the seconds-omitted short
form) and humantime durations; canonical storage is u64 nanoseconds.
`parquet combine` concatenates events from all inputs, sorts by
timestamp, and dedupes by stable `id`.

No viewer/UI work yet; this is the schema + annotate pipeline only.

thinkingfish marked this pull request as ready for review May 12, 2026 04:44

Adds an `events` section to docs/parquet_metadata.md covering the
payload shape, CLI surface (--add-events / --event / --clear-events),
and combine merge behavior. Extends the post-recording mutator table
and combine "what changed" table to include events. Drops a handful
of inline comments that just restated the adjacent code.

thinkingfish merged commit 72acada into main May 12, 2026
27 checks passed
thinkingfish deleted the claude/add-oneoff-events-4TZFu branch May 13, 2026 19:27