fix: parity & bug-fix wave on the 0.6.6/0.6.7 features — EvalSession TS port, explicit-kwarg precedence, env-key path, telemetry chain guard #172
Merged
…TS) + bug-3 regression tests in both SDKs
…LLOWED already includes w/W
…ne marker (TS parity, sentinel defaults)
…no-competitor-references)
…() factory (TS parity)
…ume, preemptive generation, smart-turn
… (parity with Python)
…rom the TS stream handler
…ckfills the call-path config (both SDKs)
This was referenced Jun 11, 2026
Merged
Summary
- EvalSession TS port on the real `StreamHandler`, with chainable assertions, a scripted LLM provider, and an unstubbed `getpatter eval` CLI.
- Bug fixes: explicit `voice=`/`model=` precedence over engine markers (Python), `OPENAI_API_KEY` env handling on the provider-string path (both SDKs; TS had a latent dead call), a telemetry close/chain race guard (TS), `barge_in_mode`/`barge_in_confirm_ms` exposed on the Python `agent()` factory, and the missing regression tests for the 0.6.7 telemetry delivery fix.

Implementation
- EvalSession TS port (`src/evals/`): `EvalSession` constructs a real pipeline-mode `StreamHandler` and injects turns through the live-call path, faking only the STT/TTS/audio-sender/LLM boundary. Includes `expect()` assertions, `LLMJudge`, `EvalCase.agent`/`llmProvider` routing, a JSON/YAML suite loader, and the CLI `eval run` command with exit codes 0/1/2. Field names, defaults, and report rows are byte-compatible with Python. Reviewed post-port: `arguments` field parity, bounded await of aborted dispatch before the timeout throw, crypto-based call ids, judge response guard.
- `voice`/`model` moved to sentinel `None` defaults so that explicit values, even ones equal to the documented default, win over `engine=` marker values (mirrors the TS `??` resolution). Telemetry model resolution updated consistently.
- `provider="openai_realtime"` with no configured key now backfills `OPENAI_API_KEY` from the environment into the local config so the call path actually uses it. Python previously rejected this case despite its error message promising env support; TS accepted it at validation but dialed with an empty key.
- Telemetry (TS): close/chain race guard for the case where `close()` has begun (parity with Python's `not _closed`); added the missing regression tests in both SDKs for the 0.6.7 "event recorded during an in-flight flush" fix, using a gated real HTTP collector.
- `barge_in_mode`/`barge_in_confirm_ms` exposed as `agent()` keywords with validation (previously dataclass-only).
- `local_recording`/`localRecording` documented across both SDK doc trees (8 pages); runnable examples added for pause-resume barge-in, preemptive generation, and smart-turn semantic EOU (both languages), plus an examples index.

Breaking change?
No. Two edge-case behavior corrections, both documented under `## Unreleased` → Fixed:
- `model="gpt-realtime-mini"` + an engine with a different model now runs the explicit model in Python (previously the engine's; TS already did this).
- `provider="openai_realtime"` with only an env key is now accepted in Python (previously rejected) and actually works in TS (previously a dead call).

Test plan
- `pytest tests/`: 2648 passed, 8 skipped, 2 xfailed
- `npm test` (2096 passed, 9 skipped) + `npm run lint` + `npm run build`
- Test runs with `OPENAI_API_KEY` set in the env (the env-set run is what caught the env-key divergence)

Docs updates
- `docs/python-sdk/{features,reference,local-mode,events}.mdx`, `docs/typescript-sdk/{features,reference,local-mode,events}.mdx` (local recording)
- `docs/examples/`: `pause-resume-barge-in.{py,ts}`, `preemptive-generation.{py,ts}`, `smart-turn-detection.{py,ts}`, `README.md` index
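
Illustrative sketch

The two behavior corrections listed under "Breaking change?" can be sketched in a few lines of Python. This is a minimal illustration of the resolution rules as described in this PR, not the SDK's code: `EngineMarker`, `resolve_model`, `backfill_api_key`, the `api_key` config field, the placeholder engine model name, and the assumption that `gpt-realtime-mini` is the documented default are all hypothetical and chosen only for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: treated here as the documented default model.
DEFAULT_MODEL = "gpt-realtime-mini"


@dataclass(frozen=True)
class EngineMarker:
    """Hypothetical stand-in for an engine= marker that carries its own model."""

    model: Optional[str] = None


def resolve_model(explicit: Optional[str], engine: Optional[EngineMarker]) -> str:
    """Resolve in the order: explicit kwarg, then engine marker, then default.

    None is the sentinel for "not passed", so an explicit value that happens
    to equal the default still wins over the engine's value.
    """
    if explicit is not None:
        return explicit
    if engine is not None and engine.model is not None:
        return engine.model
    return DEFAULT_MODEL


def backfill_api_key(
    config: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return a copy of config whose api_key is filled from the env if missing.

    The key is written into the returned config, so the call path reads the
    same value that validation accepted instead of dialing with an empty key.
    """
    env = os.environ if env is None else env
    if config.get("api_key"):
        return dict(config)
    key = env.get("OPENAI_API_KEY", "")
    if not key:
        raise ValueError("no API key in config and OPENAI_API_KEY is not set")
    return {**config, "api_key": key}


if __name__ == "__main__":
    engine = EngineMarker(model="some-other-model")  # placeholder model name
    print(resolve_model("gpt-realtime-mini", engine))  # explicit value wins
    print(resolve_model(None, engine))  # falls back to the engine's model
    print(backfill_api_key({"provider": "openai_realtime"},
                           {"OPENAI_API_KEY": "sk-example"}))
```

With a plain `model="gpt-realtime-mini"` default instead of the `None` sentinel, the first call could not distinguish "caller passed the default" from "caller passed nothing", which is the case the Python fix addresses.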