feat(parquet): one-off event annotations on recordings #914
Merged
Adds support for attaching one-off events (restarts, config changes, anomalies, deploys, ...) to parquet recordings as a JSON blob in the file footer. Events are self-describing: each carries optional source/node/instance scope rather than inheriting from file-level metadata, so combined files keep them at the top level instead of nesting under per_source_metadata.

`rezolus parquet annotate` gains:
- --add-events FILE (also '-' for stdin; accepts JSON, JSON array, or JSONL)
- --event key=value,... (inline shorthand, repeatable)
- --clear-events (composes with --add-events for replace)

Inputs accept RFC3339 timestamps (including the seconds-omitted short form) and humantime durations; canonical storage is u64 nanoseconds.

`parquet combine` concatenates events from all inputs, sorts by timestamp, and dedupes by stable `id`.

No viewer/UI work yet; this is the schema + annotate pipeline only.
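To illustrate the "humantime durations in, u64 nanoseconds out" rule, here is a minimal sketch of the conversion. It is a hand-rolled subset written for this example, not the `humantime` crate and not the code in this PR: the function name `parse_duration_ns` is hypothetical, only the units `ns`, `us`, `ms`, `s`, `m`, and `h` are handled, and whitespace between parts is rejected.

```rust
/// Parse a humantime-style duration such as "30s" or "1m30s" into nanoseconds.
/// Illustrative subset only (hypothetical helper, not the PR's implementation).
fn parse_duration_ns(input: &str) -> Result<u64, String> {
    let mut rest = input.trim();
    if rest.is_empty() {
        return Err("empty duration".to_string());
    }
    let mut total: u64 = 0;
    while !rest.is_empty() {
        // Leading run of ASCII digits is the numeric value of this part.
        let digits_end = rest
            .find(|c: char| !c.is_ascii_digit())
            .unwrap_or(rest.len());
        if digits_end == 0 {
            return Err(format!("expected a number at {:?}", rest));
        }
        let value: u64 = rest[..digits_end]
            .parse()
            .map_err(|e| format!("bad number: {}", e))?;
        rest = &rest[digits_end..];

        // The run of ASCII letters that follows is the unit.
        let unit_end = rest
            .find(|c: char| !c.is_ascii_alphabetic())
            .unwrap_or(rest.len());
        let factor: u64 = match &rest[..unit_end] {
            "ns" => 1,
            "us" => 1_000,
            "ms" => 1_000_000,
            "s" => 1_000_000_000,
            "m" => 60 * 1_000_000_000,
            "h" => 3_600 * 1_000_000_000,
            other => return Err(format!("unknown unit {:?}", other)),
        };
        rest = &rest[unit_end..];

        // Checked arithmetic so an oversized input is an error, not a wraparound.
        let part = value.checked_mul(factor).ok_or("duration overflows u64")?;
        total = total.checked_add(part).ok_or("duration overflows u64")?;
    }
    Ok(total)
}

fn main() {
    for text in ["30s", "1m30s", "250ms", "30"] {
        println!("{:>6} -> {:?}", text, parse_duration_ns(text));
    }
}
```

Storing a single integer unit keeps sorting and interval arithmetic on events free of string parsing; the human-friendly forms exist only at the CLI boundary.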
Adds an `events` section to docs/parquet_metadata.md covering the payload shape, CLI surface (--add-events / --event / --clear-events), and combine merge behavior. Extends the post-recording mutator table and combine "what changed" table to include events. Drops a handful of inline comments that just restated the adjacent code.
Summary
Attach one-off events (restarts, config changes, anomalies, deploys, ...) to parquet recordings so the viewer can mark key moments on existing charts. Schema + annotate pipeline only — no viewer/UI work yet.
Schema (`dashboard::events::Event`)

Events live as a JSON blob in the parquet footer under the new `events` key:

```json
{
  "events": [
    {
      "timestamp": 1778599380000000000, // required, ns since epoch
      "description": "vllm restart for new config", // required
      "kind": "restart", // free-form (restart, config_change, deploy, incident, anomaly, marker, note, ...)
      "details": "swapped tp 2 → 4", // optional long text
      "source": "vllm", // optional scope
      "node": "gpu01",
      "instance": "0",
      "labels": {"reason": "OOM"}, // optional free-form tags
      "duration_ns": 30000000000, // optional: absent = point, present = band
      "id": "evt-2026-05-12-001" // optional, used for dedupe on combine
    }
  ]
}
```

Key design choices (per brainstorm):
- Events are self-describing: each carries optional `source`/`node`/`instance` instead of inheriting from file metadata; combined files therefore keep events at the top level rather than under `per_source_metadata`.
- `kind` is free-form: documented conventions only, no enum. Too early to lock down.
- `timestamp` + optional `duration_ns` unifies point events and intervals in one struct.

`rezolus parquet annotate` additions

Order within a single invocation: clear → add-events files → inline `--event` flags. Operations append by default; events are sorted by timestamp on write and deduped by stable `id`. CLI `--event` accepts `label.<name>=<value>` for free-form labels.

Inputs accept RFC3339 timestamps (including the seconds-omitted short form `2026-05-12T15:23Z`) and humantime durations (`30s`, `1m30s`); canonical storage is `u64` nanoseconds.

`rezolus parquet combine`

Concatenates events from every input, sorts by timestamp, and dedupes by `id`. No `per_source_metadata` indirection because each event already carries its scope.

Test plan
- `cargo test -p dashboard` (Event struct, normalize/dedupe)
- `cargo test --bin rezolus parquet_tools`: 73 tests pass (events: 27 incl. e2e against real parquet, plus existing combine/filter/annotate tests still green)
- `cargo clippy --bin rezolus`: no new warnings in touched files
- `--event`, RFC3339 short form, stdin via `-`, `--clear-events`, combine merging events from two annotated files

Generated by Claude Code
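For readers who want a concrete picture of the normalize/dedupe and combine behaviour exercised by the tests, here is a minimal std-only sketch. It is not the code from this PR: the trimmed-down `Event` struct, the `ev` helper, and `merge_events` are hypothetical names. Two details are assumptions, since the description does not specify them: when two events share an `id` the first one in sorted order is kept, and events without an `id` are never treated as duplicates.

```rust
use std::collections::HashSet;

/// Trimmed-down stand-in for the real event type (hypothetical; only the
/// fields needed to show sorting and dedupe are included).
#[derive(Debug, Clone, PartialEq)]
struct Event {
    timestamp: u64, // ns since epoch
    description: String,
    id: Option<String>,
}

/// Small constructor to keep the example short.
fn ev(timestamp: u64, description: &str, id: Option<&str>) -> Event {
    Event {
        timestamp,
        description: description.to_string(),
        id: id.map(str::to_string),
    }
}

/// Concatenate events from every input, sort by timestamp, dedupe by `id`.
/// Assumptions: the first event carrying a given id wins, and events
/// without an id are always kept.
fn merge_events(inputs: Vec<Vec<Event>>) -> Vec<Event> {
    let mut all: Vec<Event> = inputs.into_iter().flatten().collect();

    // `sort_by_key` is stable, so events with equal timestamps keep
    // their input order.
    all.sort_by_key(|e| e.timestamp);

    // `HashSet::insert` returns false when the id was already present,
    // which makes `retain` drop the later duplicate.
    let mut seen: HashSet<String> = HashSet::new();
    all.retain(|e| match &e.id {
        Some(id) => seen.insert(id.clone()),
        None => true,
    });
    all
}

fn main() {
    // Two annotated recordings that both contain the same deploy event.
    let file_a = vec![ev(30, "deploy", Some("evt-1")), ev(10, "restart", None)];
    let file_b = vec![ev(30, "deploy", Some("evt-1")), ev(20, "note", None)];

    for event in merge_events(vec![file_a, file_b]) {
        println!("{:?}", event);
    }
}
```

Because each event carries its own scope, the merge needs no knowledge of which input file an event came from, which is what allows combined files to keep a single top-level list.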