
feat(telemetry): migrate stream_with_chunking orchestration span to plugin hooks #1290

Description

@ajbozarth

Background

PR #1289 closed #1048 by moving session and action span emission off direct trace_application(...) calls in mellea/stdlib/session.py and mellea/stdlib/functional.py. After that PR, mellea/stdlib/streaming.py is the only remaining file under mellea/stdlib/ that imports from mellea.telemetry.tracing.

streaming.py was added by PR #1095 (feat(stdlib): add streaming event types, events() iterator, and OTEL bridge, merged 2026-05-19) — twelve days after the Phase 2 sub-issues were filed (2026-05-07), so #1048's original plan didn't account for it. PR #1289 explicitly scoped streaming out and called out a follow-up issue (this one) to track the migration.

Current state on main

mellea/stdlib/streaming.py imports tracing helpers and uses them inside stream_with_chunking:

  • from ..telemetry.tracing import set_span_error, set_span_status_error, trace_application (line 33)
  • with trace_application("stream_with_chunking") as span: (line 469) — opens an orchestration span around the whole streaming run
  • span.add_event("quick_check", {...}) — per-chunk validation result
  • set_span_status_error(span, ...) — when a quick-check fails (twice, lines 550 and 575)
  • set_span_error(span, exc) — on outer exception (line 665)

Four helpers in mellea.telemetry.tracing (trace_application, set_span_attribute, set_span_error, set_span_status_error) are kept alive solely because streaming.py still depends on them. Once this issue lands, they can be retired.
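For orientation, the inline pattern described above has roughly the following shape. This is a simplified sketch built on local stand-ins, not the code in streaming.py: the `_FakeSpan` class, the bodies of the stand-in helpers, the second argument of `set_span_status_error`, the event attribute names, and `run_stream` are all assumptions made for illustration.

```python
from contextlib import contextmanager


class _FakeSpan:
    """Hypothetical stand-in for a tracing span; records what is stamped on it."""

    def __init__(self, name):
        self.name = name
        self.events = []
        self.status = "UNSET"
        self.error = None

    def add_event(self, name, attributes=None):
        self.events.append((name, dict(attributes or {})))


# Stand-ins that reuse the helper names from the issue. Their bodies are assumptions.
@contextmanager
def trace_application(name):
    yield _FakeSpan(name)


def set_span_status_error(span, message):
    span.status = "ERROR"


def set_span_error(span, exc):
    span.status = "ERROR"
    span.error = exc


def run_stream(chunks, quick_check):
    """Simplified orchestration loop with telemetry stamped inline."""
    with trace_application("stream_with_chunking") as span:
        try:
            for index, chunk in enumerate(chunks):
                passed = quick_check(chunk)
                # Per-chunk validation result recorded directly on the span.
                span.add_event("quick_check", {"chunk_index": index, "passed": passed})
                if not passed:
                    set_span_status_error(span, f"quick check failed on chunk {index}")
                    break
        except Exception as exc:
            set_span_error(span, exc)
            raise
        return span


span = run_stream(["ok", "bad", "ok"], quick_check=lambda chunk: chunk != "bad")
```

The point of the sketch is the coupling: the orchestration loop itself knows about spans, events, and error status, which is what this issue proposes to move behind hooks.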

Scope

Migrate the stream_with_chunking orchestration span and its events onto plugin hooks, mirroring how PR #1289 moved action spans onto component_* hooks and how PR #1181 moved backend spans onto generation_* hooks. The metrics calls currently inline in streaming.py move onto the same hooks at the same time — once the hook surface exists, subscribing the existing metrics plugins is trivial and there's no reason to leave streaming as the lone stdlib module still importing from mellea.telemetry.

Likely shape — hook names below are illustrative; the final hook surface (names, granularity, payload fields) should be designed and agreed before implementation begins:

  • New hook types in mellea/plugins/types.py, e.g.:
    • STREAMING_ORCHESTRATION_PRE — fired at the start of stream_with_chunking (opens the orchestration span)
    • STREAMING_CHUNK_VALIDATED — fired after each per-chunk quick-check validation (records the quick_check event currently added inline; also drives record_requirement_check / record_requirement_failure metrics)
    • STREAMING_ORCHESTRATION_POST — fired on successful completion (closes the span; drives record_sampling_outcome)
    • STREAMING_ORCHESTRATION_ERROR — fired on exception (marks the span ERROR and closes it; drives record_error)
  • New payload dataclasses in mellea/plugins/hooks/streaming.py carrying the data currently stamped inline (chunk index, requirement count, pass/fail, exception, etc.). Include a correlation field (e.g. streaming_id: str) so pre/post hooks pair cleanly across _run_async_in_thread Tasks — same pattern as generation_id / component_id.
  • New StreamingTracingPlugin in mellea/telemetry/tracing_plugins.py, subscribed to the new hooks and emitting the orchestration span via typed helpers in mellea/telemetry/tracing.py.
  • The existing RequirementMetricsPlugin / ErrorMetricsPlugin / SamplingMetricsPlugin (in mellea/telemetry/metrics_plugins.py) gain subscriptions to the relevant new streaming hooks, replacing the direct record_* calls in streaming.py.
  • New typed helpers in tracing.py: start_streaming_orchestration_span, add_streaming_chunk_event, finish_streaming_orchestration_span_success, finish_streaming_orchestration_span_error — internal, not exported on the public mellea.telemetry surface.
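The shape listed above could be prototyped along these lines before the hook surface is agreed. Everything here is hypothetical: `HookType`, `StreamingPayload`, `HookRegistry`, `ToyStreamingTracingPlugin`, and `stream_with_hooks` are local stand-ins rather than existing mellea classes, and the plugin records plain dicts instead of emitting real spans. The hook names are the illustrative ones from the list.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional


class HookType(str, Enum):
    # Illustrative names from the issue; not a final hook surface.
    STREAMING_ORCHESTRATION_PRE = "streaming_orchestration_pre"
    STREAMING_CHUNK_VALIDATED = "streaming_chunk_validated"
    STREAMING_ORCHESTRATION_POST = "streaming_orchestration_post"
    STREAMING_ORCHESTRATION_ERROR = "streaming_orchestration_error"


@dataclass(frozen=True)
class StreamingPayload:
    streaming_id: str  # correlation field pairing pre/post hooks
    chunk_index: Optional[int] = None
    passed: Optional[bool] = None
    exception: Optional[BaseException] = None


class HookRegistry:
    """Minimal in-process dispatcher (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: Dict[HookType, List[Callable[[StreamingPayload], None]]] = {}

    def subscribe(self, hook: HookType, handler: Callable[[StreamingPayload], None]) -> None:
        self._handlers.setdefault(hook, []).append(handler)

    def fire(self, hook: HookType, payload: StreamingPayload) -> None:
        for handler in self._handlers.get(hook, []):
            handler(payload)


class ToyStreamingTracingPlugin:
    """Tracks one pseudo-span per streaming_id instead of a real tracing span."""

    def __init__(self, registry: HookRegistry) -> None:
        self.open_spans: Dict[str, dict] = {}
        self.finished: List[dict] = []
        registry.subscribe(HookType.STREAMING_ORCHESTRATION_PRE, self.on_pre)
        registry.subscribe(HookType.STREAMING_CHUNK_VALIDATED, self.on_chunk)
        registry.subscribe(HookType.STREAMING_ORCHESTRATION_POST, self.on_post)
        registry.subscribe(HookType.STREAMING_ORCHESTRATION_ERROR, self.on_error)

    def on_pre(self, payload: StreamingPayload) -> None:
        self.open_spans[payload.streaming_id] = {"events": [], "status": "UNSET"}

    def on_chunk(self, payload: StreamingPayload) -> None:
        event = ("quick_check", payload.chunk_index, payload.passed)
        self.open_spans[payload.streaming_id]["events"].append(event)

    def on_post(self, payload: StreamingPayload) -> None:
        self._finish(payload, "OK")

    def on_error(self, payload: StreamingPayload) -> None:
        self._finish(payload, "ERROR")

    def _finish(self, payload: StreamingPayload, status: str) -> None:
        span = self.open_spans.pop(payload.streaming_id)
        span["status"] = status
        self.finished.append(span)


def stream_with_hooks(registry: HookRegistry, chunks, quick_check) -> str:
    """Orchestration loop that only fires hooks and never touches telemetry."""
    streaming_id = uuid.uuid4().hex
    registry.fire(HookType.STREAMING_ORCHESTRATION_PRE, StreamingPayload(streaming_id))
    try:
        for index, chunk in enumerate(chunks):
            passed = quick_check(chunk)
            registry.fire(
                HookType.STREAMING_CHUNK_VALIDATED,
                StreamingPayload(streaming_id, chunk_index=index, passed=passed),
            )
    except Exception as exc:
        registry.fire(
            HookType.STREAMING_ORCHESTRATION_ERROR,
            StreamingPayload(streaming_id, exception=exc),
        )
        raise
    registry.fire(HookType.STREAMING_ORCHESTRATION_POST, StreamingPayload(streaming_id))
    return streaming_id


registry = HookRegistry()
plugin = ToyStreamingTracingPlugin(registry)
run_id = stream_with_hooks(registry, ["a", "b"], quick_check=lambda chunk: True)
```

Keying the plugin's state on `streaming_id` rather than on ambient context is what lets the pre and post handlers pair up even if they run in different Tasks. A metrics plugin could subscribe to the same registry in the same way.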

Acceptance criteria

  • mellea/stdlib/streaming.py has no imports from mellea.telemetry.tracing or mellea.telemetry.metrics.
  • mellea/stdlib/ as a whole has no imports from mellea.telemetry, which completes the architectural goal that #1048 (feat: session and act spans via plugin hooks) originally aimed at.
  • trace_application, set_span_attribute, set_span_error, set_span_status_error are removed from mellea.telemetry.tracing (no longer needed).
  • Existing streaming spans, events, and metrics render with the same names and attributes as on main today (no regressions in observability).
  • Streaming orchestration span nests correctly under the action span when called via a session: the same session > action > stream_with_chunking parent chain that PR #1289 (refactor(telemetry): migrate application spans to plugin hooks) delivered for non-streaming generations.
  • Tests cover: pre/post hook pairing, ERROR status on exception, chunk-validation events, metrics emission via the existing metrics plugins, parent-child nesting.
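The first two criteria are mechanical enough to check in a test. A minimal sketch, assuming a standard-library `ast` walk is acceptable for this: the function names `find_forbidden_imports` and `scan_tree` are hypothetical, and the repository root passed to `scan_tree` is assumed to be the directory that contains the `mellea` package.

```python
import ast
from pathlib import Path

FORBIDDEN = "mellea.telemetry"


def _resolve(module, level, package_parts):
    """Turn a possibly relative import into an absolute dotted name."""
    if level == 0:
        return module or ""
    keep = len(package_parts) - (level - 1)
    if keep < 0:
        return ""  # import climbs above the top-level package; ignore it
    parts = list(package_parts[:keep])
    if module:
        parts.append(module)
    return ".".join(parts)


def find_forbidden_imports(source, package_parts, forbidden=FORBIDDEN):
    """Return sorted (line, dotted_name) pairs for imports under `forbidden`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            base = _resolve(node.module, node.level, package_parts)
            # Also consider "from .. import telemetry", where the module is in the names.
            names = [base] + [f"{base}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == forbidden or name.startswith(forbidden + "."):
                hits.append((node.lineno, name))
                break
    return sorted(hits)


def scan_tree(root, package="mellea/stdlib"):
    """Map each offending file under root/package to its forbidden imports."""
    root = Path(root)
    report = {}
    for path in sorted((root / package).rglob("*.py")):
        package_parts = list(path.parent.relative_to(root).parts)
        hits = find_forbidden_imports(path.read_text(encoding="utf-8"), package_parts)
        if hits:
            report[str(path.relative_to(root))] = hits
    return report


sample = "from ..telemetry.tracing import set_span_error\nimport os\n"
sample_hits = find_forbidden_imports(sample, ["mellea", "stdlib"])
```

A test could then assert that `scan_tree(repo_root)` returns an empty dict. The check is static, so it does not catch imports performed dynamically through `importlib`.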

Phase & dependencies

Phase 2 (coverage). Builds on Phase 1 (#1181) and the action/session migration (PR #1289).

Parent epic: #444

Metadata

Labels

  • area/stdlib (Core abstractions: Context, MOT, SamplingStrategy, formatters, serialization)
  • area/telemetry (OTel spans, metrics, tracing, semconv)
  • enhancement (New feature or request)
  • refactor
