feat(build): word-level transcript → per-beat compose prompt (#204) by kiyeonjeon21 · Pull Request #237 · vericontext/vibeframe

kiyeonjeon21 · 2026-06-22T11:08:11Z

Closes the vibe build half of #204 — word-synced caption / kinetic-type animations.

What

The LLM composer now receives Whisper word-level narration timings so it can sync word reveals to speech. The deterministic vibe scene add emit path already did this; vibe build did not.

How

New _shared/transcribe-narration.ts — transcribeNarrationWords() (provider-agnostic Whisper word-level transcription of generated narration), readBeatTranscript(), beatTranscriptRelPath(). The inline transcribe block in vibe scene add is refactored to reuse it (single source of truth).
Asset stage dispatchTranscript() — after narration, transcribe to assets/transcript-<beat>.json when narration exists and an OpenAI key is configured. Cached (skipped when the file exists and narration wasn't freshly regenerated); best-effort (no key / no words → skip, never fails the build). Opt out with --skip-transcript.
Prompt injection (formatTranscriptSection) — compact, deterministic timing section with a token-budget guard: word-level [start, "word"] table at/below 120 words; phrase-level approximate anchors above. Folded into the user prompt → the compose cache key invalidates when timings change. Carries the existing visual-sync-only guard (no <audio>/SFX).
Host-agent path (compose-prompts) — exposes transcriptPath and feeds the same timings into its prompt.
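
To make the token-budget guard concrete, here is a minimal sketch of how a formatter like `formatTranscriptSection` might switch between the two modes. Only the 120-word threshold and the `[start, "word"]` row shape come from this PR; the `TranscriptWord` shape, the phrase size of 8, the two-decimal rounding, and the section headings are assumptions for illustration, not the actual implementation.

```typescript
// Hypothetical word shape; the real transcript type in the PR may differ.
interface TranscriptWord {
  word: string;
  start: number; // seconds from the start of the beat's narration
}

const WORD_LEVEL_LIMIT = 120; // threshold stated in the PR description
const PHRASE_SIZE = 8; // assumed grouping size, not taken from the PR

// Clamp a timestamp into [0, maxSeconds] and round to two decimals.
function roundTime(seconds: number, maxSeconds: number): number {
  const clamped = Math.min(Math.max(seconds, 0), maxSeconds);
  return Math.round(clamped * 100) / 100;
}

function formatTranscriptSketch(words: TranscriptWord[], durationSec: number): string {
  // No transcript: emit nothing so the prompt is unchanged.
  if (words.length === 0) return "";

  // At or below the limit: one [start, "word"] row per word.
  if (words.length <= WORD_LEVEL_LIMIT) {
    const rows = words.map(
      (w) => `[${roundTime(w.start, durationSec)}, ${JSON.stringify(w.word)}]`
    );
    return ["Narration word timings (seconds):", ...rows].join("\n");
  }

  // Above the limit: one approximate anchor per phrase to stay within budget.
  const rows: string[] = [];
  for (let i = 0; i < words.length; i += PHRASE_SIZE) {
    const phrase = words.slice(i, i + PHRASE_SIZE);
    const first = phrase[0];
    if (first === undefined) break;
    const text = phrase.map((w) => w.word).join(" ");
    rows.push(`[~${roundTime(first.start, durationSec)}, ${JSON.stringify(text)}]`);
  }
  return ["Narration phrase anchors (approximate, seconds):", ...rows].join("\n");
}
```

Because the output depends only on the input words and duration, the same transcript always yields the same prompt text, which is what lets the compose cache key stay stable between runs.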

Behavior

Default on when narration and an OpenAI key both exist; --skip-transcript opts out. No key → graceful skip (narration still plays, no word-sync). Transcription cost is negligible (~$0.002/beat).
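
The skip rules can be read as a single decision function. The sketch below is an illustration only: the `TranscriptInputs` fields and the decision labels are hypothetical names, and the order of the checks is an assumption; the PR states the conditions but not how `dispatchTranscript()` evaluates them.

```typescript
// Hypothetical decision labels, for illustration only.
type TranscriptDecision =
  | "transcribe"
  | "skip-opt-out"
  | "skip-no-narration"
  | "skip-no-key"
  | "skip-cached";

// Hypothetical inputs; the real asset stage gathers these differently.
interface TranscriptInputs {
  skipTranscript: boolean; // --skip-transcript was passed
  hasNarration: boolean; // the beat has generated narration audio
  hasOpenAiKey: boolean; // an OpenAI key is configured
  transcriptExists: boolean; // assets/transcript-<beat>.json is on disk
  narrationRegenerated: boolean; // narration was freshly regenerated this run
}

function decideTranscript(i: TranscriptInputs): TranscriptDecision {
  if (i.skipTranscript) return "skip-opt-out";
  if (!i.hasNarration) return "skip-no-narration";
  // Best-effort: a missing key skips word-sync but never fails the build.
  if (!i.hasOpenAiKey) return "skip-no-key";
  // Cached: reuse the file unless the narration it describes has changed.
  if (i.transcriptExists && !i.narrationRegenerated) return "skip-cached";
  return "transcribe";
}
```

Every branch other than `"transcribe"` is a quiet skip, which matches the stated behavior that the build proceeds and narration still plays without word-sync.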

Tests

formatTranscriptSection: no / short / oversized transcript + clamping/rounding.
buildUserPrompt: section omitted without transcript, word-level with, cache-key changes when transcript changes.
asset stage: writes transcript-<beat>.json when words exist; --skip-transcript writes nothing.
readBeatTranscript / path helper / non-fatal failure path.
vibe build --describe schema snapshot updated for --skip-transcript.
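
The cache-key test relies on the timing section being part of the text that gets hashed. A minimal sketch of that idea follows; `buildUserPromptSketch` and `composeCacheKeySketch` are hypothetical names, and the FNV-1a hash is a simple stand-in chosen to keep the example self-contained, since the PR does not say which hash the compose cache uses.

```typescript
// FNV-1a (32-bit): a stand-in hash for illustration only.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Hypothetical prompt builder: the section is omitted when there is no transcript.
function buildUserPromptSketch(beatBrief: string, transcriptSection: string): string {
  return transcriptSection === "" ? beatBrief : `${beatBrief}\n\n${transcriptSection}`;
}

// Hypothetical cache key: derived from the full prompt text, so any change
// to the timing section changes the key.
function composeCacheKeySketch(systemPrompt: string, userPrompt: string): string {
  return fnv1a(`${systemPrompt}\u0000${userPrompt}`);
}

const withoutTimings = composeCacheKeySketch("system", buildUserPromptSketch("beat brief", ""));
const withTimings = composeCacheKeySketch(
  "system",
  buildUserPromptSketch("beat brief", '[0.12, "hello"]')
);
console.log(withoutTimings !== withTimings);
```

Folding the timings into the prompt, rather than tracking them as a separate cache input, means no extra invalidation logic is needed: identical timings reuse the cached composition, and changed timings miss the cache.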

Notes / follow-ups

Dry-run does not itemise the transcription cost (~$0.002/beat) as a separate line; this is intentional, since the amount is below the plan's rounding noise.
Tuning the prompt for how aggressively the LLM uses the timings is left for a separate quality pass.

Closes the `vibe build` half of word-sync animations (#204): the LLM composer now receives Whisper word-level narration timings so it can sync caption / kinetic-type reveals to speech. The deterministic `scene add` emit path already supported this; build did not.

- Add `_shared/transcribe-narration.ts`: `transcribeNarrationWords()` (provider-agnostic Whisper word-level transcription of generated narration) + `readBeatTranscript()` + `beatTranscriptRelPath()`. Refactor the inline transcribe block in `vibe scene add` to reuse it.
- Asset stage gains `dispatchTranscript()`: after narration, transcribe to `assets/transcript-<beat>.json` when narration exists and an OpenAI key is configured. Cached (skipped when the file exists and narration was not freshly regenerated); best-effort (missing key / no words → skip, never fails the build). Gated by `--skip-transcript`.
- `buildUserPrompt` injects a compact, deterministic timing section with a token-budget guard (`formatTranscriptSection`): word-level table at/below 120 words, phrase-level "approximate" anchors above. Folded into the user prompt, so the compose cache key invalidates when timings change. Carries the visual-sync-only guard (no `<audio>`/SFX).
- Host-agent path (`compose-prompts`) exposes `transcriptPath` and feeds the same timings into its prompt.
- Tests: no/short/oversized transcript, cache invalidation, asset-stage generation + `--skip-transcript`, transcript read/validation.

vercel · 2026-06-22T11:08:17Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
vibeframe	Ready	Preview, Comment	Jun 22, 2026 11:08am

kiyeonjeon21 added 2 commits June 22, 2026 20:00

chore: bump version to 0.113.11

31057bf

vercel Bot deployed to Preview June 22, 2026 11:08 View deployment

kiyeonjeon21 merged commit 6972357 into main Jun 22, 2026
6 checks passed

kiyeonjeon21 deleted the feat/word-sync-transcript branch June 22, 2026 11:12

kiyeonjeon21 mentioned this pull request Jun 22, 2026

compose-scenes-with-skills: pass word-level transcript into per-beat compose prompt (word-sync animations) #204

Closed

5 tasks
