feat!: Dual-Context Model (AsyncContext + SyncContext) - v1.0.0 #25

Open
Yaming-Hub wants to merge 15 commits into main from feat/dual-context-redesign

Summary

Introduces a dual-context storage model that separates async (task-local) and sync (thread-local) context into independent modules, eliminating the scope chain leak bug caused by depth mismatches between DcontextLayer and application code sharing the same store.

This is a major version bump (0.7.1 -> 1.0.0) as it introduces significant new architecture.

Problem

In production, span_stack in Kusto telemetry grows monotonically over process lifetime. The root cause is a depth mismatch between DcontextLayer (using force_thread_local) and application code (scope_async, set_context) sharing the same thread-local store. When a future yields inside an instrumented span, the span exit/re-enter cycle creates orphaned scope entries.

Solution

New Modules

  • dcontext::async_ctx - all operations target tokio task-local store exclusively
  • dcontext::sync_ctx - all operations target thread-local store exclusively

New Tracing Layers

  • AsyncDcontextLayer - pushes scope on first span enter, persists across yields, pops on close. Structurally prevents the monotonic growth bug.
  • SyncDcontextLayer - standard push-on-enter/pop-on-exit for sync code.

Bridging

One-way async->sync bridging via async_ctx::snapshot() + sync_ctx::restore().
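
To make the one-way direction concrete, here is a minimal std-only sketch of the snapshot/restore idea. It is not the dcontext API: `Snapshot`, `SYNC_STORE`, `get_sync`, and `bridge_to_worker` are hypothetical names, and a plain `HashMap` stands in for the task-local store.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::thread;

/// Owned copy of context values that can be moved to another thread.
/// Hypothetical stand-in for whatever `async_ctx::snapshot()` returns.
#[derive(Clone, Debug)]
pub struct Snapshot(HashMap<String, String>);

thread_local! {
    // Toy thread-local store, playing the role of the sync context.
    static SYNC_STORE: RefCell<HashMap<String, String>> = RefCell::new(HashMap::new());
}

/// Copy the async-side values into an owned snapshot.
pub fn snapshot(async_values: &HashMap<String, String>) -> Snapshot {
    Snapshot(async_values.clone())
}

/// Replace the current thread's sync store with the snapshot contents.
pub fn restore(snap: &Snapshot) {
    SYNC_STORE.with(|s| *s.borrow_mut() = snap.0.clone());
}

/// Read a value from the current thread's sync store.
pub fn get_sync(key: &str) -> Option<String> {
    SYNC_STORE.with(|s| s.borrow().get(key).cloned())
}

/// Take a snapshot on the caller's side, restore it on a worker thread,
/// and return what the worker sees for `key`.
pub fn bridge_to_worker(
    async_values: &HashMap<String, String>,
    key: &'static str,
) -> Option<String> {
    let snap = snapshot(async_values);
    thread::spawn(move || {
        restore(&snap);
        get_sync(key)
    })
    .join()
    .expect("worker thread panicked")
}
```

In this sketch the worker thread sees the restored values, while the calling thread's own thread-local store is left untouched, which is the sense in which the bridge is one-way.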

Why the Leak Is Structurally Impossible

  1. AsyncDcontextLayer writes to task-local -> scoped to the task
  2. async_ctx::scope() writes to the SAME task-local -> no store conflict
  3. AsyncDcontextLayer pushes only on first enter, pops only on close -> no depth mismatch with application scopes
  4. Thread-local is never touched by async code
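
The push-on-first-enter / pop-on-close policy in point 3 can be sketched with a toy model. This is an illustration under assumptions, not the layer's code: `ScopeChain`, `SpanState`, and `simulate` are hypothetical, and a `Vec` stands in for the task-local scope chain.

```rust
/// Toy stand-in for a task-local scope chain.
#[derive(Default)]
pub struct ScopeChain {
    names: Vec<String>,
}

impl ScopeChain {
    pub fn push(&mut self, name: &str) {
        self.names.push(name.to_string());
    }
    pub fn pop(&mut self) -> Option<String> {
        self.names.pop()
    }
    pub fn depth(&self) -> usize {
        self.names.len()
    }
}

/// Per-span state: records whether this span has already pushed its scope.
#[derive(Default)]
pub struct SpanState {
    pushed: bool,
}

/// Push only the first time the span is entered.
pub fn on_enter(chain: &mut ScopeChain, span: &mut SpanState, name: &str) {
    if !span.pushed {
        chain.push(name);
        span.pushed = true;
    }
}

/// Exit (for example, when the future yields) deliberately does nothing,
/// so the scope persists across polls.
pub fn on_exit(_chain: &mut ScopeChain, _span: &mut SpanState) {}

/// Pop exactly once, when the span is closed.
pub fn on_close(chain: &mut ScopeChain, span: &mut SpanState) {
    if span.pushed {
        chain.pop();
        span.pushed = false;
    }
}

/// Simulate a span that is entered and exited `polls` times, then closed.
/// Returns (maximum depth observed, depth after close).
pub fn simulate(polls: usize) -> (usize, usize) {
    let mut chain = ScopeChain::default();
    let mut span = SpanState::default();
    let mut max_depth = 0;
    for _ in 0..polls {
        on_enter(&mut chain, &mut span, "request");
        max_depth = max_depth.max(chain.depth());
        on_exit(&mut chain, &mut span);
    }
    on_close(&mut chain, &mut span);
    (max_depth, chain.depth())
}
```

However many times the span is re-entered, the model pushes one scope and pops one scope, so the depth cannot grow with the number of polls.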

Backward Compatibility

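All existing APIs remain available. Later commits in this PR deprecate force_thread_local, scope_async, and named_scope_async, and keep DcontextLayer as a type alias for SyncDcontextLayer.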

Changes

  • dcontext/src/async_ctx.rs - new async context module
  • dcontext/src/sync_ctx.rs - new sync context module
  • dcontext-tracing/src/async_layer.rs - AsyncDcontextLayer
  • dcontext-tracing/src/sync_layer.rs - SyncDcontextLayer
  • dcontext-tracing/tests/dual_context_tests.rs - 13 integration tests
  • Updated design doc, usage guide, README
  • Version bump: all crates -> 1.0.0

Test Results

All 139 tests pass (88 dcontext + 38 dcontext-tracing + 13 new dual-context integration tests).

Introduces a dual-context storage model that separates async (task-local)
and sync (thread-local) context into independent modules, eliminating the
scope chain leak bug caused by depth mismatches between DcontextLayer and
application code sharing the same store.

## New Modules

- `dcontext::async_ctx` — all operations target tokio task-local store
  exclusively. Includes push_scope, pop_scope, scope_chain, set_context,
  get_context, snapshot, with_context, and scope (async scoped context).

- `dcontext::sync_ctx` — all operations target thread-local store
  exclusively. Includes push_scope, pop_scope, scope_chain, set_context,
  get_context, restore (from snapshot), and clear.

## New Tracing Layers

- `AsyncDcontextLayer` — pushes scope on first span enter, persists across
  yields, pops on span close. Structurally prevents the monotonic growth bug.

- `SyncDcontextLayer` — standard push-on-enter/pop-on-exit for sync code.

## Bridging

One-way async->sync bridging via async_ctx::snapshot() + sync_ctx::restore().

## Backward Compatibility

All existing APIs (DcontextLayer, scope_async, force_thread_local, fork,
with_fork, etc.) remain available and unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Yaming-Hub force-pushed the feat/dual-context-redesign branch from 4ed402e to 32bc488 on May 10, 2026 23:52
Yaming-Hub and others added 14 commits May 10, 2026 16:56
Add 5 new samples demonstrating the v0.8 dual-context model:

- dual_async_ctx: async_ctx module usage (task-local context)
- dual_sync_ctx: sync_ctx module usage (thread-local context)
- dual_bridging: async→sync snapshot bridging patterns
- dual_cross_process: serialize context and restore remotely
- dual_tracing_layers: AsyncDcontextLayer + SyncDcontextLayer

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Store per-span tracking state in span extensions (Send+Sync) rather than
thread_local!, fixing correctness in multi-threaded tokio runtimes where
tasks can migrate between threads across polls.

Also ensures the layer is completely transparent (a no-op) when no task-local
context is available, for example when code is not running inside a tokio task.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
async_ctx::try_push_scope returns None when no task-local context is
available, making the no-context case explicit at the call site.
AsyncDcontextLayer uses this to cleanly skip when outside tokio.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Store scope_name in AsyncScopeState. On close, verify the top of the
task-local scope chain matches what we pushed before popping. Skips
the pop if the scope was externally manipulated, preventing mismatched
pops. Also adds async_ctx::peek_scope() public helper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace name-based peek check with depth-based check. Depth is
monotonically increasing and never reused within a store, making it
a natural unique identifier. on_close now verifies current_depth()
matches before popping, preventing mismatched pops.

Removed uuid dependency — no external crate needed.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
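
A minimal sketch of the depth check described in the commit above, under assumptions: `Chain`, `SpanScope`, `on_first_enter`, and `on_close` here are hypothetical toy definitions, and depth is modelled as plain stack length, which is a simplification of whatever `current_depth()` returns in the crate.

```rust
/// Toy scope chain; depth is simply the number of live scopes.
#[derive(Default)]
pub struct Chain {
    scopes: Vec<&'static str>,
}

impl Chain {
    /// Push a scope and return the depth after the push.
    pub fn push(&mut self, name: &'static str) -> usize {
        self.scopes.push(name);
        self.scopes.len()
    }
    pub fn pop(&mut self) -> Option<&'static str> {
        self.scopes.pop()
    }
    pub fn current_depth(&self) -> usize {
        self.scopes.len()
    }
}

/// What the layer remembers about the scope it pushed for a span.
pub struct SpanScope {
    pushed_depth: usize,
}

/// On first enter: push and record the resulting depth.
pub fn on_first_enter(chain: &mut Chain, name: &'static str) -> SpanScope {
    SpanScope { pushed_depth: chain.push(name) }
}

/// On close: pop only if the chain is at the depth this span pushed.
/// Returns true if a pop happened, false if it was skipped.
pub fn on_close(chain: &mut Chain, scope: &SpanScope) -> bool {
    if chain.current_depth() == scope.pushed_depth {
        chain.pop();
        true
    } else {
        false
    }
}
```

If application code has pushed a scope on top of the span's scope and not yet popped it, the depths differ and the layer leaves the chain alone instead of popping someone else's entry.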
…ontextLayer

Deleted layer.rs. All features (field extraction, span info, span
recording) are now in both AsyncDcontextLayer and SyncDcontextLayer.
DcontextLayer is retained as a type alias for SyncDcontextLayer for
backward compatibility. Both layers share the same 4-level integration:
auto-scoping, field extraction, span info, and span recording.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When constructed with .async_aware(), the sync layer checks on first
span enter whether an async task-local context is available. The result
is stored in span extensions (SyncSkipMarker) so that on_exit and
on_close consistently skip without re-probing. This allows mounting
both AsyncDcontextLayer and SyncDcontextLayer together — async tasks
use the async layer, sync code uses the sync layer, no duplication.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
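
The probe-once behaviour described above could look roughly like the following toy model. All names are hypothetical: `SpanExt` stands in for the marker stored in span extensions, and `async_ctx_available` stands in for the result of probing the task-local context.

```rust
/// Stand-in for the per-span marker kept in span extensions.
/// `None` means the span has not been entered yet.
#[derive(Default)]
pub struct SpanExt {
    skip: Option<bool>,
}

/// Toy sync layer with its own scope stack.
pub struct SyncLayer {
    async_aware: bool,
    stack: Vec<String>,
}

impl SyncLayer {
    pub fn new(async_aware: bool) -> Self {
        Self { async_aware, stack: Vec::new() }
    }

    pub fn depth(&self) -> usize {
        self.stack.len()
    }

    /// The probe result is consulted only on the first enter; afterwards
    /// the stored marker decides, so enter and exit always agree.
    pub fn on_enter(&mut self, ext: &mut SpanExt, name: &str, async_ctx_available: bool) {
        let async_aware = self.async_aware;
        let skip = *ext
            .skip
            .get_or_insert_with(|| async_aware && async_ctx_available);
        if !skip {
            self.stack.push(name.to_string());
        }
    }

    /// Pop only for spans that were not marked as skipped.
    pub fn on_exit(&mut self, ext: &SpanExt) {
        if ext.skip == Some(false) {
            self.stack.pop();
        }
    }
}
```

Storing the decision per span means a span that was skipped on enter is also skipped on exit, even if a later probe would give a different answer.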
…yer logic

- Split storage.rs into async_storage.rs (task-local) and sync_storage.rs (thread-local)
- Keep storage.rs as backward-compatible re-export shim
- Extract shared layer logic into layer_common.rs (field extraction, span recording, span info)
- Refactor AsyncDcontextLayer and SyncDcontextLayer to use shared helpers
- Eliminates ~200 lines of duplicated code between layers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…age/sync_storage directly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move apply_field_extraction, set_span_info, record_context_to_span
  from layer_common.rs into async_layer.rs and sync_layer.rs
- Each layer now directly calls its own context module (async_ctx/sync_ctx)
- Make ContextValue trait and value module public for cross-crate use
- Update ExtractFns closures to return Option<Arc<dyn ContextValue>>
- Update collect_log_fields/WithContextFields to try async then sync
- Fix get_context to return Option<T> (matching existing API)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…atch in top-level API

- Remove FORCE_THREAD_LOCAL_DEPTH and task-local dispatch from sync_storage
- sync_storage now always operates on thread-local only
- Top-level API (get_context, set_context, scope, etc.) dispatches:
  tries task-local first, falls back to thread-local (backward compat)
- ScopeGuard now tracks is_async flag to pop the correct store
- Deprecate force_thread_local, scope_async, named_scope_async
- Update wire.rs and snapshot.rs to dispatch appropriately
- All 144 tests passing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
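
A rough sketch of the "task-local first, thread-local fallback" dispatch, with loud assumptions: `TaskStore`, `THREAD_STORE`, and `set_thread_value` are hypothetical, the task-local store is passed explicitly as an `Option`, and the fallback is assumed to trigger when no task-local store exists (not when a key is merely missing from it).

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Toy stand-in for the task-local store.
pub type TaskStore = HashMap<String, String>;

thread_local! {
    // Toy stand-in for the thread-local (sync) store.
    static THREAD_STORE: RefCell<HashMap<String, String>> = RefCell::new(HashMap::new());
}

/// Write a value into the current thread's store.
pub fn set_thread_value(key: &str, value: &str) {
    THREAD_STORE.with(|s| {
        s.borrow_mut().insert(key.to_string(), value.to_string());
    });
}

/// Dispatching read: use the task store when one is available,
/// otherwise fall back to the thread-local store.
pub fn get_context(task_store: Option<&TaskStore>, key: &str) -> Option<String> {
    match task_store {
        Some(store) => store.get(key).cloned(),
        None => THREAD_STORE.with(|s| s.borrow().get(key).cloned()),
    }
}
```

Under this assumption a caller inside a task never observes thread-local values, which keeps the two stores from shadowing each other.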
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Create store.rs with shared ContextStore struct
- Merge async_storage.rs (TASK_CONTEXT) into async_ctx.rs
- Merge sync_storage.rs (thread-local, scope mgmt, value access,
  deprecated compat, fork support) into sync_ctx.rs
- Delete async_storage.rs and sync_storage.rs
- Update all internal references

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Push beyond the limit creates dead scopes that only bump the depth
counter. Pop decrements until back at the limit, then does real pops.
Adds 3 unit tests verifying dead scope behavior, value survival,
and full push/pop roundtrip.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
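
The dead-scope behaviour in the commit above can be modelled in a few lines. This is a sketch, not the crate's store: `BoundedChain` and its fields are hypothetical, and the limit is passed to the constructor for illustration.

```rust
/// Toy scope chain with a maximum number of real scopes.
pub struct BoundedChain {
    limit: usize,
    scopes: Vec<String>, // real scopes, never longer than `limit`
    dead: usize,         // pushes beyond the limit; they hold no data
}

impl BoundedChain {
    pub fn new(limit: usize) -> Self {
        Self { limit, scopes: Vec::new(), dead: 0 }
    }

    /// Logical depth: real scopes plus dead scopes.
    pub fn depth(&self) -> usize {
        self.scopes.len() + self.dead
    }

    /// Number of scopes that actually hold data.
    pub fn real_len(&self) -> usize {
        self.scopes.len()
    }

    /// Below the limit, push a real scope; beyond it, only bump the counter.
    pub fn push(&mut self, name: &str) {
        if self.scopes.len() < self.limit {
            self.scopes.push(name.to_string());
        } else {
            self.dead += 1;
        }
    }

    /// Unwind dead scopes first (returning None for each),
    /// then pop real scopes once back at the limit.
    pub fn pop(&mut self) -> Option<String> {
        if self.dead > 0 {
            self.dead -= 1;
            None
        } else {
            self.scopes.pop()
        }
    }
}
```

Because dead scopes never touch the real stack, values set in the scopes below the limit survive an over-limit push/pop roundtrip, and every push still has a matching pop.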