Skip to content

Add call transcript analysis via Grok (xAI)#430

Open
gianpaj wants to merge 4 commits into
mainfrom
claude/call-transcript-summarization-bymrs9
Open

Add call transcript analysis via Grok (xAI)#430
gianpaj wants to merge 4 commits into
mainfrom
claude/call-transcript-summarization-bymrs9

Conversation

@gianpaj

@gianpaj gianpaj commented Jun 21, 2026

Copy link
Copy Markdown
Owner

Summary

Adds automatic analysis of call transcripts using xAI's Grok model to extract structured insights (language, topic, engagement, sentiment, etc.). Includes a webhook-triggered API route for real-time analysis when calls complete, a daily cron script for recent calls, and a backfill script for historical sessions. All three paths write identical rows to a new call_session_analysis table.

Changes

  • Database schema: Added call_session_analysis (per-call analysis) and call_session_analytics (aggregate insights) tables with indexes and RLS policies. Includes a Postgres trigger that enqueues analysis via webhook when calls complete.
  • API route (/api/call-sessions/analyze): Accepts webhook POST from Supabase, validates the session, calls analyzeTranscript(), and inserts the result. Includes idempotency guards and error handling.
  • Core analysis library (lib/ai/analyze-call.ts): Exports analyzeTranscript(), extractMessages(), buildConversationSummary(), and toAnalysisRow(). Uses Zod schema for structured LLM output. Handles variable transcript shapes (bare arrays, { messages }, { user_transcriptions }).
  • Scripts:
    • scripts/analyze-call-sessions.mjs: Analyzes recent calls (default 24h) in batches, writes CSV + insights JSON, saves to DB. Supports --dry-run, --hours, --limit, --debug, --smoke-test.
    • scripts/backfill-call-analysis.mjs: Analyzes all completed sessions without an analysis row. Supports --limit, --min-duration, --models filtering.
  • Tests (apps/web/tests/analyze-call.test.ts): Unit tests for message extraction, conversation summary formatting, and row mapping.
  • Config: Added XAI_API_KEY and optional XAI_SUMMARY_MODEL env vars; updated .env.example and package.json dependencies.

How to test

  1. Webhook flow: Complete a call ≥2 min long; verify a row appears in call_session_analysis within seconds.
  2. Recent cron: Run pnpm analyze-call-sessions --dry-run --hours=1 to see CSV + insights without writing.
  3. Backfill: Run pnpm backfill-call-analysis --dry-run --limit=10 to test batch processing.
  4. Unit tests: pnpm test analyze-call.test.ts covers message extraction, truncation, and row mapping.

Scope

  • Backend
  • Database
  • Infra / config

Checklist

  • I self-reviewed this PR
  • I ran pnpm run fixall
  • I ran pnpm run type-check
  • I added or updated tests where needed
  • I documented env vars, migrations, or rollout notes if needed

Notes for reviewers

  • The webhook is fire-and-forget; failures to enqueue don't roll back the call completion. The backfill script reprocesses any sessions that still lack an analysis row.
  • Transcript shape varies (bare array, { messages }, { user_transcriptions }); the extraction logic normalizes all variants and sorts chronologically.
  • When user transcription is missing, the analysis notes this in notable_patterns so it's not conflated with actual disengagement.
  • The same prompt, schema, and extraction logic are shared across the webhook, recent-cron, and backfill paths to ensure consistency.

https://claude.ai/code/session_01NcdZdGkUc2UWCsg2gx32rg

claude added 2 commits June 20, 2026 17:48
When a call session transitions to 'completed' with a non-empty
transcript, a Supabase Database Webhook (pg_net trigger) POSTs the
session id to a new Next.js route handler, which calls Grok (xAI) via
the AI SDK to generate a structured analysis (title, summary, key
points, sentiment, action items) and stores it in a new
call_sessions.analysis JSONB column.

- migration: add analysis/analysis_status/analysis_generated_at columns,
  partial index, pg_net webhook trigger gated on the completed transition
- lib/ai/summarize-call.ts: defensive transcript normalizer + generateObject
- app/api/call-sessions/analyze: authenticated, idempotent handler
- types, .env.example, @ai-sdk/xai dependency, unit tests

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NcdZdGkUc2UWCsg2gx32rg
…ipts

Pivot from the simple single-column summary to the richer, table-based
analysis shape established on feat/scripts-call-analysis.

- migration: replace the analysis JSONB columns with call_session_analysis
  (one rich row per call: language, topic, engagement, sentiment, key
  requests, etc.) and call_session_analytics (aggregate per run); keep the
  pg_net webhook trigger on the completed transition (transcript-shape safe)
- lib/ai/analyze-call.ts: shared transcript extraction + Grok generateObject
  with the rich schema, mapped onto the table columns
- route /api/call-sessions/analyze: refactored to insert call_session_analysis,
  idempotent (skip if a row exists / too short / not completed)
- scripts/analyze-call-sessions.mjs (recent/daily-cron) and
  scripts/backfill-call-analysis.mjs (historical backfill, batch pattern),
  wired into scripts + root package.json
- types, tests updated; drop the old summarize-call lib/test

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NcdZdGkUc2UWCsg2gx32rg
@bolt-new-by-stackblitz

Copy link
Copy Markdown

Review PR in StackBlitz Codeflow Run & review this pull request in StackBlitz Codeflow.

@vercel

vercel Bot commented Jun 21, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
sexyvoice Ready Ready Preview, Comment Jun 21, 2026 11:50am
sexyvoice-docs Ready Ready Preview, Comment Jun 21, 2026 11:50am

Request Review

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@claude

claude Bot commented Jun 21, 2026

Copy link
Copy Markdown
Contributor

Claude finished @gianpaj's task in 2m 55s —— View job


PR Review: Call transcript analysis via Grok (xAI)

Nice, well-documented PR — the three-path design (webhook / recent-cron / backfill), idempotency guards, fire-and-forget webhook, and set search_path = '' in the security definer function are all good. I read through the core lib, the API route, the migration, both scripts, and the tests. Findings below (details posted as inline comments).

🔴 Correctness / consistency

  1. Errored sessions are permanently blocked from reprocessing (scripts/analyze-call-sessions.mjs). buildAnalysisRecord + saveAllSessionAnalyses insert a row even when analysis failed, and filterAnalyzedSessions treats any existing row as "analyzed". So a one-time failure means the recent-cron and backfill scripts will never retry it — directly contradicting the webhook's deliberate "no row on failure" design (route.ts:95-97) and the PR's claim that backfill reprocesses sessions lacking an analysis row.

  2. Parse failures are recorded as fake successes (analyze-call-sessions.mjs:476-484). A JSON parse error is stuffed into analysis with no top-level error, so it's counted as a valid analysis in aggregateInsights and saved as an all-null row with error: null.

  3. Duplicate rows possible (route.ts:70-84). The read-then-insert idempotency check isn't atomic and there's no unique constraint on session_id. Concurrent webhook retries, or a webhook firing during a backfill run, can both insert. Recommend a unique index + upsert/conflict handling.

  4. "Shared prompt/schema/extraction" isn't actually shared. The PR notes say all three paths share prompt, schema, and extraction, but lib/ai/analyze-call.ts uses generateObject + a Zod schema, while scripts/analyze-call-sessions.mjs re-implements the prompt as free-text generateText + manual parseLlmJson, and duplicates extractMessages/buildConversationSummary. Two sources of truth that will drift (and have different reliability characteristics). The tests only cover the lib version, not the script's parsing path.

🟠 Security / privacy

  1. Unset secret → open endpoint (route.ts:24). If CALL_SUMMARY_SECRET is undefined, the check compares against the literal "Bearer undefined". Add an explicit guard (return 500 when unset); consider crypto.timingSafeEqual.

  2. Telemetry records full transcripts (analyze-call.ts:226-230). recordInputs/recordOutputs: true writes the (intimate) conversation into trace spans. Worth reconsidering for this content type.

🟡 Minor

  1. Code default model grok-4 vs .env.example's grok-4-fast-non-reasoning — align these (analyze-call.ts:13).
  2. buildConversationSummary: a single line longer than maxLength yields only the truncation marker with no content.
  3. Messages lacking a timestamp sort to epoch 0 (front of the list), which can interleave oddly with timestamped turns — acceptable, but worth noting.

✅ Looks good

  • Migration structure, RLS-by-default (service-role only), webhook trigger condition, and pg_net usage.
  • Idempotency/no-op 200 design in the webhook for not-yet-completed / too-short calls.
  • Tests are clear and cover the extraction/summary/row-mapping helpers in the lib.

None of these are blockers for the happy path, but #1#3 undermine the retry/backfill guarantees the PR describes and are worth addressing before merge. Happy to help with any of the fixes.

@socket-security

socket-security Bot commented Jun 21, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addednpm/​@​ai-sdk/​xai@​3.0.96761008798100

View full report

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces automated call transcript analysis using xAI's Grok model, adding a Next.js API route, database migrations for analysis tables, a webhook trigger, and CLI scripts for running recent analyses and backfills. The review feedback highlights several critical areas for improvement: a potential security bypass in the API route if the shared secret is undefined, database query inefficiencies in the scripts due to in-memory filtering instead of using LEFT JOINs, a missing UNIQUE constraint on session_id in the analysis table, inadequate error handling in the API route, and inefficient/unstable date sorting inside .sort() comparators.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +23 to +26
const authHeader = request.headers.get('authorization');
if (authHeader !== `Bearer ${process.env.CALL_SUMMARY_SECRET}`) {
return new NextResponse('Unauthorized', { status: 401 });
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

If process.env.CALL_SUMMARY_SECRET is not set or is empty (e.g., in a development or misconfigured environment), process.env.CALL_SUMMARY_SECRET will be undefined. This allows any request with an Authorization: Bearer undefined header to bypass the authentication check. Ensure that the secret is defined and not empty before performing the comparison.

Suggested change
const authHeader = request.headers.get('authorization');
if (authHeader !== `Bearer ${process.env.CALL_SUMMARY_SECRET}`) {
return new NextResponse('Unauthorized', { status: 401 });
}
const authHeader = request.headers.get('authorization');
const secret = process.env.CALL_SUMMARY_SECRET;
if (!secret || authHeader !== `Bearer ${secret}`) {
return new NextResponse('Unauthorized', { status: 401 });
}

Comment on lines +173 to +197
async function getRecentCallSessions(supabase, hoursAgo, limit = null) {
const cutoffTime = new Date(
Date.now() - hoursAgo * 60 * 60 * 1000,
).toISOString();

let query = supabase
.from('call_sessions')
.select(SESSION_COLUMNS)
.gte('started_at', cutoffTime)
.eq('status', 'completed')
.not('transcript', 'is', null)
.gte('duration_seconds', MIN_ANALYSIS_CALL_DURATION_SECONDS)
.order('started_at', { ascending: false });

if (limit) {
query = query.limit(limit);
}

const { data, error } = await query;
if (error) {
throw new Error(`Error fetching call sessions: ${error.message}`);
}

return filterAnalyzedSessions(supabase, data || []);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Fetching all recent completed sessions and then filtering them in memory by making chunked database queries to call_session_analysis is highly inefficient. As the database grows, this will cause significant performance degradation. Instead, perform a LEFT JOIN with an IS NULL check directly in the Supabase query to fetch only the unanalyzed sessions in a single database round-trip.

async function getRecentCallSessions(supabase, hoursAgo, limit = null) {
  const cutoffTime = new Date(
    Date.now() - hoursAgo * 60 * 60 * 1000,
  ).toISOString();

  let query = supabase
    .from('call_sessions')
    .select(`${SESSION_COLUMNS}, call_session_analysis!left(id)`)
    .is('call_session_analysis', null)
    .gte('started_at', cutoffTime)
    .eq('status', 'completed')
    .not('transcript', 'is', null)
    .gte('duration_seconds', MIN_ANALYSIS_CALL_DURATION_SECONDS)
    .order('started_at', { ascending: false });

  if (limit) {
    query = query.limit(limit);
  }

  const { data, error } = await query;
  if (error) {
    throw new Error(`Error fetching call sessions: ${error.message}`);
  }

  return data || [];
}

Comment on lines +203 to +239
export async function getAllCompletedCallSessions(supabase, options = {}) {
const { minDuration = MIN_ANALYSIS_CALL_DURATION_SECONDS, models = [] } =
options;
const rows = [];
let from = 0;

while (true) {
let query = supabase
.from('call_sessions')
.select(SESSION_COLUMNS)
.eq('status', 'completed')
.not('transcript', 'is', null)
.gte('duration_seconds', minDuration)
.order('started_at', { ascending: false })
.range(from, from + DB_FETCH_PAGE_SIZE - 1);

if (models.length > 0) {
query = query.in('model', models);
}

const { data, error } = await query;
if (error) {
throw new Error(`Error fetching call sessions: ${error.message}`);
}
if (!data || data.length === 0) {
break;
}

rows.push(...data);
if (data.length < DB_FETCH_PAGE_SIZE) {
break;
}
from += DB_FETCH_PAGE_SIZE;
}

return filterAnalyzedSessions(supabase, rows);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Fetching all completed sessions in pages of 1000 and then filtering them in memory by querying call_session_analysis in chunks of 200 will result in a massive number of database queries and high memory usage on large datasets. Optimize this by performing a LEFT JOIN with an IS NULL check directly in the database query to fetch only the unanalyzed sessions.

export async function getAllCompletedCallSessions(supabase, options = {}) {
  const { minDuration = MIN_ANALYSIS_CALL_DURATION_SECONDS, models = [] } =
    options;
  const rows = [];
  let from = 0;

  while (true) {
    let query = supabase
      .from('call_sessions')
      .select(`${SESSION_COLUMNS}, call_session_analysis!left(id)`)
      .is('call_session_analysis', null)
      .eq('status', 'completed')
      .not('transcript', 'is', null)
      .gte('duration_seconds', minDuration)
      .order('started_at', { ascending: false })
      .range(from, from + DB_FETCH_PAGE_SIZE - 1);

    if (models.length > 0) {
      query = query.in('model', models);
    }

    const { data, error } = await query;
    if (error) {
      throw new Error(`Error fetching call sessions: ${error.message}`);
    }
    if (!data || data.length === 0) {
      break;
    }

    rows.push(...data);
    if (data.length < DB_FETCH_PAGE_SIZE) {
      break;
    }
    from += DB_FETCH_PAGE_SIZE;
  }

  return rows;
}

Comment on lines +19 to +21
create table if not exists public.call_session_analysis (
id uuid primary key default gen_random_uuid(),
session_id uuid not null references public.call_sessions(id) on delete cascade,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The call_session_analysis table is designed to have a 1-to-1 relationship with call_sessions ("one rich row per analysed call"). However, there is no UNIQUE constraint on session_id. Without a unique constraint, concurrent webhook invocations or manual script runs could insert duplicate analysis rows for the same session. Add a unique constraint to enforce data integrity.

create table if not exists public.call_session_analysis (
  id uuid primary key default gen_random_uuid(),
  session_id uuid not null unique references public.call_sessions(id) on delete cascade,

Comment on lines +53 to +58
if (selectError || !session) {
return NextResponse.json(
{ error: 'Call session not found' },
{ status: 404 },
);
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If a database error occurs during the query (e.g., connection timeout or permission issue), selectError will be populated. Treating this as a 404 Not Found is misleading and makes debugging difficult. It is better to log the error and return a 500 Internal Server Error when selectError is present.

  if (selectError) {
    console.error('Error fetching call session:', selectError);
    return NextResponse.json(
      { error: 'Failed to fetch call session' },
      { status: 500 },
    );
  }

  if (!session) {
    return NextResponse.json(
      { error: 'Call session not found' },
      { status: 404 },
    );
  }

Comment on lines +70 to +77
const { count: existing } = await supabase
.from('call_session_analysis')
.select('id', { count: 'exact', head: true })
.eq('session_id', id);

if ((existing ?? 0) > 0) {
return NextResponse.json({ skipped: true });
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If the query to check for an existing analysis fails, countError is ignored and existing will be null. This causes the code to proceed to call the LLM and attempt an insert, which is wasteful and could fail. Properly handle the query error by logging it and returning a 500 status.

  const { count: existing, error: countError } = await supabase
    .from('call_session_analysis')
    .select('id', { count: 'exact', head: true })
    .eq('session_id', id);

  if (countError) {
    console.error('Error checking existing analysis:', countError);
    return NextResponse.json(
      { error: 'Failed to check existing analysis' },
      { status: 500 },
    );
  }

  if ((existing ?? 0) > 0) {
    return NextResponse.json({ skipped: true });
  }

Comment thread apps/web/lib/ai/analyze-call.ts Outdated
Comment on lines +162 to +166
return [...normalizedAssistant, ...normalizedUser].sort((a, b) => {
const timeA = a.timestamp ? new Date(a.timestamp).getTime() : 0;
const timeB = b.timestamp ? new Date(b.timestamp).getTime() : 0;
return timeA - timeB;
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Parsing date strings using new Date() inside a .sort() comparator is inefficient because it instantiates new objects repeatedly. Furthermore, if any timestamp is invalid or malformed, new Date().getTime() returns NaN, which leads to unstable and incorrect sorting behavior in JavaScript. Use Date.parse and handle NaN values safely.

  return [...normalizedAssistant, ...normalizedUser].sort((a, b) => {
    const timeA = a.timestamp ? Date.parse(a.timestamp) : 0;
    const timeB = b.timestamp ? Date.parse(b.timestamp) : 0;
    const scoreA = Number.isNaN(timeA) ? 0 : timeA;
    const scoreB = Number.isNaN(timeB) ? 0 : timeB;
    return scoreA - scoreB;
  });

Comment thread scripts/analyze-call-sessions.mjs Outdated
Comment on lines +343 to +347
return [...normalizedAssistant, ...normalizedUser].sort((a, b) => {
const timeA = a.timestamp ? new Date(a.timestamp).getTime() : 0;
const timeB = b.timestamp ? new Date(b.timestamp).getTime() : 0;
return timeA - timeB;
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Parsing date strings using new Date() inside a .sort() comparator is inefficient because it instantiates new objects repeatedly. Furthermore, if any timestamp is invalid or malformed, new Date().getTime() returns NaN, which leads to unstable and incorrect sorting behavior in JavaScript. Use Date.parse and handle NaN values safely.

  return [...normalizedAssistant, ...normalizedUser].sort((a, b) => {
    const timeA = a.timestamp ? Date.parse(a.timestamp) : 0;
    const timeB = b.timestamp ? Date.parse(b.timestamp) : 0;
    const scoreA = Number.isNaN(timeA) ? 0 : timeA;
    const scoreB = Number.isNaN(timeB) ? 0 : timeB;
    return scoreA - scoreB;
  });

// CRON_SECRET pattern used by /api/daily-stats, but a distinct secret so the
// webhook isn't conflated with the cron.
const authHeader = request.headers.get('authorization');
if (authHeader !== `Bearer ${process.env.CALL_SUMMARY_SECRET}`) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guard against an unset secret. If CALL_SUMMARY_SECRET is undefined, this compares the incoming header against the literal string "Bearer undefined", which an attacker can trivially send — turning a misconfiguration into an open, unauthenticated endpoint that triggers paid Grok calls and DB writes.

const secret = process.env.CALL_SUMMARY_SECRET;
if (!secret) {
  Sentry.captureMessage('CALL_SUMMARY_SECRET not configured');
  return new NextResponse('Server misconfigured', { status: 500 });
}
if (authHeader !== `Bearer ${secret}`) {
  return new NextResponse('Unauthorized', { status: 401 });
}

A constant-time comparison (crypto.timingSafeEqual) would also be a nice hardening here.

Comment on lines +70 to +84
const { count: existing } = await supabase
.from('call_session_analysis')
.select('id', { count: 'exact', head: true })
.eq('session_id', id);

if ((existing ?? 0) > 0) {
return NextResponse.json({ skipped: true });
}

try {
const analysis = await analyzeTranscript(session);

const { error: insertError } = await supabase
.from('call_session_analysis')
.insert(toAnalysisRow(session, analysis));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race condition / duplicate rows. This read-then-insert isn't atomic, and there's no unique constraint on call_session_analysis.session_id. Two webhook deliveries (Supabase retries, or a webhook firing while the backfill script runs) can both pass the "row exists" check and both insert — producing duplicate analysis rows for one session.

Recommend adding a unique index on session_id in the migration and relying on upsert/onConflict (or catching the unique-violation) instead of a separate count query. See the migration comment for the index suggestion.

Comment thread scripts/analyze-call-sessions.mjs Outdated
Comment on lines +264 to +271
export async function saveAllSessionAnalyses(supabase, results) {
let successCount = 0;
let errorCount = 0;

for (const result of results) {
const { error } = await supabase
.from('call_session_analysis')
.insert(buildAnalysisRecord(result));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errored sessions get a row that permanently blocks reprocessing. buildAnalysisRecord always builds an insert row even when result.error is set (no analysis), and saveAllSessionAnalyses inserts it. But filterAnalyzedSessions excludes any session that has a row in call_session_analysis. So a session that errored once can never be picked up again by the recent-cron or backfill scripts.

This directly contradicts:

  • the webhook path, which deliberately does not write a row on failure so retries/backfill can reprocess (see route.ts lines 95-97), and
  • the PR description's claim that backfill "reprocesses any sessions that still lack an analysis row."

Consider skipping the insert when result.error is set, or have filterAnalyzedSessions only treat rows with error is null as "analyzed".

Comment on lines +476 to +484
let analysis;
try {
analysis = parseLlmJson(responseText);
} catch (parseError) {
analysis = {
raw_response: responseText,
parse_error: parseError.message,
};
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A JSON parse failure produces analysis = { raw_response, parse_error } with no top-level result.error. Downstream this means:

  • aggregateInsights counts it as a valid analysis (r.analysis && !r.error), inflating success metrics.
  • buildAnalysisRecord writes a row with all analysis columns null and error: null — i.e. a "successful" empty row that then blocks reprocessing (see filter issue above).

Set result.error (e.g. error: \parse failed: ${parseError.message}`) instead of stashing it inside analysis`, so it's treated as a failure consistently.

Comment on lines +53 to +54
create index if not exists idx_call_session_analysis_session_id
on public.call_session_analysis (session_id);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a unique index here to make idempotency enforceable at the DB level and close the webhook race (see route.ts):

create unique index if not exists idx_call_session_analysis_session_id
  on public.call_session_analysis (session_id);

This only works cleanly if you stop writing error rows for the same session (see the script comments) — otherwise an error row and a later success row would collide. If you want to keep both, a partial unique index where error is null is the alternative.

Comment on lines +226 to +230
experimental_telemetry: {
isEnabled: true,
recordInputs: true,
recordOutputs: true,
},

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recordInputs: true / recordOutputs: true will write the full transcript into telemetry spans (and anything that ingests them). Given the explicitly intimate/adult nature of these conversations, this is a privacy exposure worth a second look — consider recordInputs: false (or redaction) unless the traces are tightly access-controlled.

Comment thread apps/web/lib/ai/analyze-call.ts Outdated
export const MIN_ANALYSIS_CALL_DURATION_SECONDS = 120;

// Confirm the exact current Grok model id against xAI docs; override via env.
const DEFAULT_MODEL = 'grok-4';

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: the code default is grok-4, but .env.example sets XAI_SUMMARY_MODEL=grok-4-fast-non-reasoning. Worth aligning so the documented default matches the code default (and confirm the chosen model supports the structured-output / generateObject path reliably).

@argos-ci

argos-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build Status Details Updated (UTC)
default (Inspect) ✅ No changes detected - Jun 21, 2026, 11:51 AM

@pullfrog pullfrog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Solid, well-structured feature, but a few things warrant a look before merge: verbatim call transcripts are shipped to Sentry, the idempotency guard has no DB-level uniqueness backing it, and the default Grok model id is unverified (and diverges from .env.example).

Reviewed changes — initial review of the full PR adding Grok-based call transcript analysis across a webhook route, two scripts, and a new schema.

  • New webhook route (app/api/call-sessions/analyze/route.ts) — bearer-secret auth, session lookup, idempotency guards, then analyzeTranscript + insert.
  • Core analysis lib (lib/ai/analyze-call.ts) — extractMessages, buildConversationSummary, analyzeTranscript (generateObject + Zod), toAnalysisRow.
  • Scriptsanalyze-call-sessions.mjs (recent/cron) and backfill-call-analysis.mjs, sharing extraction logic but using generateText + manual JSON parse.
  • Migrationcall_session_analysis + call_session_analytics tables, indexes, RLS (enabled, no policies), and a security definer trigger firing a pg_net POST on status → completed.
  • Config@ai-sdk/xai dependency, new env vars, package scripts, unit tests.

ℹ️ Webhook and script paths can produce non-identical rows despite the "identical rows" claim

The PR notes that the webhook, recent-cron, and backfill paths share prompt/schema/extraction "to ensure consistency", but only the extraction helpers are actually shared. The route uses generateObject with a Zod schema and a system prompt; the scripts use generateText with a JSON-in-prompt instruction, no system message, and a manual parser. The resulting rows can diverge in practice.

  • Enum enforcement — the Zod schema rejects/coerces off-schema topic_category values; the script stores whatever string the model returns.
  • Null/falsy coercionbuildAnalysisRecord uses || null / || [] (so duration_seconds: 0, empty-string where_conversation_died, and null key_user_requests are coerced), while toAnalysisRow uses ?? null and passes schema output through unchanged.
Technical details
# Webhook vs script row divergence

## Affected sites
- `apps/web/lib/ai/analyze-call.ts``analyzeTranscript` uses `generateObject` + Zod + `system` prompt; `toAnalysisRow` uses `?? null`.
- `scripts/analyze-call-sessions.mjs``analyzeCallSessionsWithLLM` uses `generateText` + `buildPrompt` (no system message) + `parseLlmJson`; `buildAnalysisRecord` uses `|| null` / `|| []`.

## Required outcome
- Either make the two paths genuinely equivalent (shared system prompt + shared null-handling), or soften the "identical rows" claim in the PR/code comments so downstream consumers don't assume column-level parity between webhook-written and script-written rows.

## Open questions for the human
- Is exact row parity a real requirement (e.g. analytics joins assume it), or is "same schema, best-effort same semantics" acceptable? That decides whether this needs a fix or just a doc correction.

ℹ️ Nitpicks

  • extractMessages / extractMessages sort comparator (analyze-call.ts:163, analyze-call-sessions.mjs:344): a non-null but unparseable timestamp yields NaN from getTime(), which the ts ? … : 0 ternary doesn't guard — NaN comparisons give an undefined sort order. new Date(ts).getTime() || 0 would coerce NaN to 0.

Pullfrog  | Fix all ➔Fix 👍s ➔View workflow run | Using Claude Opus𝕏

Comment thread apps/web/lib/ai/analyze-call.ts
Comment thread apps/web/lib/ai/analyze-call.ts Outdated
Comment thread apps/web/supabase/migrations/20260620000000_add_call_session_analysis.sql Outdated
Comment thread scripts/analyze-call-sessions.mjs Outdated
Comment thread apps/web/app/api/call-sessions/analyze/route.ts Outdated
…privacy)

- route: guard unset CALL_SUMMARY_SECRET (500), split selectError (500) from
  not-found (404) via maybeSingle, handle countError, upsert onConflict
  session_id to close the read-then-insert race
- migration: UNIQUE(session_id) on call_session_analysis (DB-level idempotency);
  drop now-redundant plain index
- analyze-call lib: disable AI-SDK telemetry for the analysis call so verbatim
  transcripts are never shipped to Sentry; safe epoch sort (Date.parse + NaN)
- scripts: PostgREST !left/is-null anti-join instead of in-memory filtering;
  don't persist rows for failed/parse-failed analyses (keeps them retryable);
  treat parse failures as errors not fake successes; upsert on save; safe sort
- align default Grok model to grok-4 across code and .env.example

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NcdZdGkUc2UWCsg2gx32rg

@pullfrog pullfrog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No new issues found.

Reviewed changes — incremental re-review of commit 62fdcac, which addresses all five inline findings from the prior pullfrog review.

  • Auth guard for unset secretroute.ts now returns 500 when CALL_SUMMARY_SECRET is unset, before the bearer comparison, closing the Bearer undefined bypass.
  • DB-level idempotency — migration adds unique on call_session_analysis.session_id (redundant plain index dropped), and the route plus both script save paths switched to upsert(..., { onConflict: 'session_id', ignoreDuplicates: true }).
  • Sentry transcript leak fixedanalyzeTranscript sets experimental_telemetry to isEnabled/recordInputs/recordOutputs: false, so verbatim transcripts no longer reach the Sentry AI-SDK span.
  • Parse-failure handling — the recent-cron script now records result.error and skips the insert on unparseable LLM output, and saveAllSessionAnalyses skips failed analyses (new skippedCount) so sessions stay retryable.
  • Anti-join replaces in-memory filteringfilterAnalyzedSessions removed in favor of a PostgREST call_session_analysis!left(id) + .is(..., null) anti-join across both fetch paths.
  • Stable timestamp sort — shared toEpoch() helper coerces missing/unparseable timestamps to 0, fixing the NaN sort instability; .env.example realigned to the grok-4 default.

The anti-join migration is the most substantive change; it's sound because the backfill fetch loop completes before any inserts run, so .range() pagination over the server-filtered set stays stable.

The one prior point not carried into commits is the informational "webhook vs script can produce non-identical rows" observation (the route uses generateObject+Zod+system prompt; scripts use generateText+manual parse with || coercion). That remains a documentation/scope decision for you — no action required to merge.

Pullfrog  | View workflow run | Using Claude Opus𝕏

Per maintainer decision, use the current xAI flagship (grok-4.3) as the
default analysis model across the lib, scripts, and .env.example. It supports
structured outputs (generateObject); bare `grok-4` was not a listed current id.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01NcdZdGkUc2UWCsg2gx32rg

@pullfrog pullfrog Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ No new issues found.

Reviewed changes — incremental re-review of commit 94eea55, which bumps the default Grok model from grok-4 to grok-4.3.

  • Default model bumpDEFAULT_MODEL updated to grok-4.3 in both apps/web/lib/ai/analyze-call.ts and scripts/analyze-call-sessions.mjs, with .env.example and the JSDoc comments realigned so all four sites agree.
  • Comment clarified — the prior "confirm the exact current Grok model id against xAI docs" note is replaced with "xAI flagship; supports structured outputs (generateObject)".

This closes the last open thread from the initial review (the model id was previously flagged as unverified). grok-4.3 is a current xAI model (released 2026-04-30) that supports structured outputs via the native xAI API, so the webhook's generateObject path is sound. The change is applied consistently across every site, so no new divergence is introduced.

Pullfrog  | View workflow run | Using Claude Opus𝕏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants