feat(ui): surface gen_ai spans in AI Traces #1331
Adds a sentryPythonAIHandler that detects OpenTelemetry gen_ai.* spans emitted by sentry-python AI integrations (PydanticAI, OpenAI, Anthropic) and maps their standard fields (model, provider, tokens, messages, response text, tool calls) into the AI Traces view. The handler claims gen_ai.* spans that lack vercel.ai.* fields, and extractAllAIRootSpans now attributes each root span to a single handler so spans are never surfaced twice. Fixes #1330
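A minimal sketch of the single-handler attribution described above, assuming a first-match-wins rule over an ordered handler list. The `Span` and `Handler` shapes, `attributeRootSpans`, `hasVercelFields`, and `exampleHandlers` are hypothetical names for illustration, not the code in this PR.

```typescript
// Hypothetical minimal shapes; the real Spotlight types carry more fields.
interface Span {
  span_id: string;
  op?: string;
  data?: Record<string, unknown>;
}

interface Handler {
  id: string;
  canHandle(span: Span): boolean;
}

// Assign each root span to the first handler that claims it.
// Because a span gets at most one owner, it cannot be listed twice.
function attributeRootSpans(roots: Span[], handlers: Handler[]): Map<string, string> {
  const owners = new Map<string, string>();
  for (const root of roots) {
    const owner = handlers.find((handler) => handler.canHandle(root));
    if (owner) owners.set(root.span_id, owner.id);
  }
  return owners;
}

// True when any data key uses the proprietary vercel.ai.* prefix.
const hasVercelFields = (span: Span): boolean =>
  Object.keys(span.data ?? {}).some((key) => key.startsWith("vercel.ai."));

// Order matters in this sketch: the more specific handler is listed first.
const exampleHandlers: Handler[] = [
  { id: "vercel", canHandle: hasVercelFields },
  { id: "gen_ai", canHandle: (span) => (span.op ?? "").startsWith("gen_ai.") },
];
```

A span that both handlers could claim ends up with the Vercel handler only, and spans no handler claims are left out of the map.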
|
Codecov Results 📊✅ Patch coverage is 89.80%. Project has 862 uncovered lines. Files with missing lines (1)
Coverage diff
@@ Coverage Diff @@
## main #PR +/-##
==========================================
- Coverage 59.68% 57.68% -2%
==========================================
Files 53 57 +4
Lines 1672 2048 +376
Branches 1204 1476 +272
==========================================
+ Hits 994 1186 +192
- Misses 678 862 +184
- Partials 128 154 +26
Generated by Codecov Action |
Codecov Results 📊✅ Patch coverage is 89.61%. Project has 863 uncovered lines. Files with missing lines (1)
Coverage diff
@@ Coverage Diff @@
## main #PR +/-##
==========================================
- Coverage 59.68% 57.77% -1.91%
==========================================
Files 53 57 +4
Lines 1672 2055 +383
Branches 1204 1492 +288
==========================================
+ Hits 994 1192 +198
- Misses 678 863 +185
- Partials 128 155 +27
Generated by Codecov Action |
|
Holding off on auto-merge — the diff is 496 lines (gate is 150), so this one warrants a human look. @sergical, over to you for review. |
extractRootSpans recursed into children of any non-python span, including Vercel gen_ai.* parents, so nested gen_ai.chat/tool children lacking vercel.ai.* fields were collected as separate roots and the same workflow appeared twice in AI Traces. Stop descending at any gen_ai.* span since it's already the root for its tree.
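The traversal rule in this commit can be sketched roughly as follows. `SpanNode`, `isGenAISpan`, and `collectGenAIRoots` are hypothetical names, and the op values in the usage note are illustrative; the real extractRootSpans also deals with handler-specific checks that are left out here.

```typescript
// Hypothetical tree shape for illustration only.
interface SpanNode {
  span_id: string;
  op?: string;
  children?: SpanNode[];
}

const isGenAISpan = (span: SpanNode): boolean => (span.op ?? "").startsWith("gen_ai.");

// Collect the topmost gen_ai.* span of each subtree.
// Once a gen_ai.* span is found it is treated as the root of its tree,
// so the walk does not descend into its children.
function collectGenAIRoots(spans: SpanNode[], roots: SpanNode[] = []): SpanNode[] {
  for (const span of spans) {
    if (isGenAISpan(span)) {
      roots.push(span);
    } else {
      collectGenAIRoots(span.children ?? [], roots);
    }
  }
  return roots;
}
```

For a tree such as `http.server` > `gen_ai.invoke_agent` > (`gen_ai.chat`, `gen_ai.execute_tool`), only the agent span is returned, so the nested chat and tool spans no longer show up as separate workflows.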
extractResponseData captured response text/finish reason only while empty, so an agent root with multiple gen_ai.chat children showed the first reply. Spans iterate in tree pre-order, so take the last span's values to reflect the latest model turn.
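A small sketch of the "last span wins" behaviour, assuming the spans are flattened in tree pre-order as the commit message states. `SpanNode`, `flattenPreOrder`, and `latestResponseText` are hypothetical helpers, and only string-valued `gen_ai.response.text` is handled here.

```typescript
// Hypothetical tree shape for illustration only.
interface SpanNode {
  span_id: string;
  data?: Record<string, unknown>;
  children?: SpanNode[];
}

// Pre-order: visit the parent first, then its children left to right.
function flattenPreOrder(span: SpanNode, out: SpanNode[] = []): SpanNode[] {
  out.push(span);
  for (const child of span.children ?? []) flattenPreOrder(child, out);
  return out;
}

// Keep overwriting while iterating, so the value from the last span
// that carries response text (the latest model turn) is the one returned.
function latestResponseText(root: SpanNode): string | undefined {
  let latest: string | undefined;
  for (const span of flattenPreOrder(root)) {
    const text = span.data?.["gen_ai.response.text"];
    if (typeof text === "string" && text.length > 0) latest = text;
  }
  return latest;
}
```

With an agent root that has two chat children, the second child's text is returned instead of the first reply.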
extractTokenUsage stopped at the first usage, so a multi-turn agent root under-reported tokens. Sum input/output tokens across all spans and source trace.metadata token counts from the same aggregate so the prompt, token total, and response stay consistent.
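The aggregation could look roughly like this. It assumes token counts live under the `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens` attributes, which this thread does not spell out; `FlatSpan`, `TokenUsage`, `toCount`, and `sumTokenUsage` are hypothetical names.

```typescript
// Hypothetical shapes for illustration only.
interface FlatSpan {
  data?: Record<string, unknown>;
}

interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Treat anything that is not a finite number as zero.
const toCount = (value: unknown): number =>
  typeof value === "number" && Number.isFinite(value) ? value : 0;

// Sum usage across every span instead of stopping at the first one found.
function sumTokenUsage(spans: FlatSpan[]): TokenUsage {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const span of spans) {
    // Attribute names are an assumption, see the note above.
    inputTokens += toCount(span.data?.["gen_ai.usage.input_tokens"]);
    outputTokens += toCount(span.data?.["gen_ai.usage.output_tokens"]);
  }
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens };
}
```

Deriving the metadata counts from this one aggregate keeps the displayed totals consistent. The sketch does not handle a parent span that already reports an aggregate of its children, which would be double counted.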
|
why are we making a pydantic distinction if its just |
Rename the sentry-python handler to a generic gen_ai handler keyed off the standard sentry-conventions gen_ai.* attributes. The provider is just the gen_ai.system attribute, not a reason to fork per-SDK handlers. Vercel keeps its own handler only because of its proprietary vercel.ai.* fields.
|
good call — there was no real reason for a provider distinction. the handler was already keyed off the standard renamed it to a generic the only handler I kept separate is Vercel AI SDK, since it emits proprietary |
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit cca6119.
const responseText = span.data[GEN_AI_RESPONSE_TEXT_FIELD];
if (responseText) {
  trace.response.text = Array.isArray(responseText) ? responseText.map(String).join("") : String(responseText);
}
Finish reason can stay stale
Low Severity
When merging response data across spans, extractResponseData always updates gen_ai.response.text from the latest span but only changes finishReason if that span includes gen_ai.response.finish_reasons. A later turn’s text can appear with an earlier turn’s finish reason.
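One possible way to address this, sketched under the assumption that text and finish reason should always come from the same span: whenever a span supplies new text, the finish reason is overwritten too, even if that means clearing it. `FlatSpan`, `ResponseData`, and `extractLatestResponse` are hypothetical names, and this is not necessarily the fix the PR adopted.

```typescript
// Hypothetical shapes for illustration only.
interface FlatSpan {
  data?: Record<string, unknown>;
}

interface ResponseData {
  text: string | undefined;
  finishReason: string | undefined;
}

// Take text and finish reason from the same span so they cannot drift apart.
function extractLatestResponse(spans: FlatSpan[]): ResponseData {
  const response: ResponseData = { text: undefined, finishReason: undefined };
  for (const span of spans) {
    const text = span.data?.["gen_ai.response.text"];
    if (typeof text !== "string" || text.length === 0) continue;

    const reasons = span.data?.["gen_ai.response.finish_reasons"];
    response.text = text;
    // Overwrite unconditionally: a span without finish reasons clears
    // the value left behind by an earlier turn.
    response.finishReason =
      Array.isArray(reasons) && reasons.length > 0 ? String(reasons[0]) : undefined;
  }
  return response;
}
```

The trade-off is that a final turn without `gen_ai.response.finish_reasons` shows no finish reason at all, which seems preferable to showing one that belongs to a different turn.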
if (responseText) {
  trace.response.text = Array.isArray(responseText) ? responseText.map(String).join("") : String(responseText);
}
Bug: The code incorrectly handles an empty responseText array. Because [] is truthy, it gets processed into an empty string, which can overwrite valid response text from previous spans.
Severity: MEDIUM
Suggested Fix
Modify the condition to ensure responseText is not just truthy, but also contains data. For arrays, check responseText.length > 0. Alternatively, only assign to trace.response.text if the processed value is a non-empty string. For example: if (responseText && (!Array.isArray(responseText) || responseText.length > 0)) { ... }.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location:
packages/spotlight/src/ui/telemetry/components/insights/aiTraces/sdks/genAI.ts#L289-L291
Potential issue: In multi-turn AI agent traces, the code iterates through spans to
extract the final response text. The check `if (responseText)` is truthy for an empty
array `[]`. The code then processes this empty array into an empty string
(`[].map(String).join("")` results in `""`). This empty string then overwrites
`trace.response.text`, potentially erasing a valid response from a previous span. This
can happen if an intermediate span, such as a tool call, has an empty
`gen_ai.response.text` array. If this is the last span in the trace, the response text
will be permanently lost.
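A sketch of the second option from the suggested fix, checking the processed value rather than the raw attribute. `normalizeResponseText` is a hypothetical helper name; the idea is that callers assign to `trace.response.text` only when it returns a string.

```typescript
// Normalize a gen_ai.response.text value to a non-empty string, or
// undefined when there is nothing worth keeping.
// Handles the truthy-but-empty array case: [] joins to "" and is rejected.
function normalizeResponseText(value: unknown): string | undefined {
  if (value === undefined || value === null) return undefined;
  const text = Array.isArray(value) ? value.map(String).join("") : String(value);
  return text.length > 0 ? text : undefined;
}

// Example of guarding the assignment with the normalized value.
const previousText = "earlier valid reply";
const nextText = normalizeResponseText([]) ?? previousText;
```

Here `nextText` keeps the earlier reply, because the empty array normalizes to `undefined` instead of overwriting it with an empty string.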
|
can we confirm PydanticAI uses gen_ai traces and doesn't need a separate handler like the vercel ai sdk v1 and v2 since those are the ones that need |
|
confirmed — PydanticAI emits only standard
so your read is right: |
|
and this isn't only sentry-python specific, this will work across all sentry sdks if they want to use spotlight, right? |
|
right — nothing in the handler is Python-specific. it matches purely on the span shape: op prefixed |


The AI Traces view only recognized Vercel AI SDK spans, so standard gen_ai.* spans emitted by Sentry SDKs (sentry-python's PydanticAI/OpenAI/Anthropic integrations, and any other SDK following sentry-conventions) never showed up there even though the span data was present.

Adds a generic genAIHandler that detects sentry-conventions gen_ai.* spans lacking vercel.ai.* fields and maps their standard fields (model, provider, token usage, request messages, response text, finish reason, and tool calls) into the AI Traces view. Detection is purely span-shape based (op prefixed gen_ai., a gen_ai.* data key, no vercel.ai.*), so it's SDK- and provider-agnostic: the provider is just the gen_ai.system attribute. extractAllAIRootSpans now attributes each root span to a single handler so spans are never surfaced twice. Vercel AI SDK keeps its own handler only because it emits proprietary vercel.ai.* fields (v1/v2) on top of gen_ai.*.

Testing
pnpm test for genAI.spec.ts (13 tests, all pass).

Closes #1330
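For reference, the three span-shape checks from the description can be sketched as a single predicate. `Span`, `hasDataKeyWithPrefix`, and `isGenericGenAISpan` are hypothetical names, and the data keys in the usage note other than `gen_ai.system` are illustrative; the handler in this PR may structure the checks differently.

```typescript
// Hypothetical minimal span shape for illustration only.
interface Span {
  op?: string;
  data?: Record<string, unknown>;
}

const hasDataKeyWithPrefix = (span: Span, prefix: string): boolean =>
  Object.keys(span.data ?? {}).some((key) => key.startsWith(prefix));

// A span is claimed by the generic handler when all three hold:
//   1. its op is prefixed with "gen_ai."
//   2. it has at least one gen_ai.* data key
//   3. it has no vercel.ai.* data key
function isGenericGenAISpan(span: Span): boolean {
  return (
    (span.op ?? "").startsWith("gen_ai.") &&
    hasDataKeyWithPrefix(span, "gen_ai.") &&
    !hasDataKeyWithPrefix(span, "vercel.ai.")
  );
}
```

Nothing in the predicate inspects the SDK or the provider, so a span like `{ op: "gen_ai.chat", data: { "gen_ai.system": "anthropic" } }` matches regardless of which Sentry SDK emitted it.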