
docs: real steam tts with timestamp #80

Merged: Kilerd merged 1 commit into main from docs/streaming-tts-with-timestamp on May 12, 2026

@Kilerd Kilerd commented May 12, 2026

Summary by CodeRabbit

  • Documentation
    • Updated text-to-speech streaming with timestamps endpoint documentation with refined timestamp handling and alignment snapshot semantics
    • Refreshed OpenAPI specification reflecting updated streaming response payload structure
    • Updated Python and Node.js code examples for streaming integration
    • Enhanced developer guide with clarified guidance for consuming timestamped streams


coderabbitai Bot commented May 12, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 27ad4db1-6ac9-4be5-bbea-35cacec13b1e

📥 Commits

Reviewing files that changed from the base of the PR and between 7c055a2 and 5bb0708.

📒 Files selected for processing (3)
  • api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx
  • api-reference/openapi.json
  • developer-guide/core-features/text-to-speech.mdx

📝 Walkthrough


This PR updates the OpenAPI schema and endpoint documentation for the /v1/tts/stream/with-timestamp endpoint to introduce chunk-relative audio offsets and snapshot-based alignment storage. Alignment updates are now keyed by chunk_seq, so a client replaces the stored snapshot for a chunk instead of collecting alignments sequentially.
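
The replacement semantics can be sketched in Python as follows. This is a minimal sketch, not the documented client: `chunk_seq` and `chunk_audio_offset_sec` come from the event schema described in this PR, while the `apply_event` helper, the sample event dictionaries, and the shape of `alignment` (a list of segments with `text`, `start`, and `end` in chunk-local seconds) are assumptions made for illustration.

```python
from typing import Any

Snapshot = dict[str, Any]


def apply_event(alignment_by_chunk: dict[int, Snapshot], event: dict[str, Any]) -> None:
    """Keep only the latest alignment snapshot for the event's chunk."""
    alignment = event.get("alignment")
    if alignment is None:
        # Assumption: an event may carry no alignment; nothing to store then.
        return
    # Assigning by key replaces any earlier snapshot for the same chunk_seq.
    alignment_by_chunk[event["chunk_seq"]] = {
        "offset": event["chunk_audio_offset_sec"],
        "alignment": alignment,
    }


# Hypothetical events: two cumulative snapshots for chunk 0, then one for chunk 1.
events = [
    {"chunk_seq": 0, "chunk_audio_offset_sec": 0.0,
     "alignment": [{"text": "Hello", "start": 0.0, "end": 0.5}]},
    {"chunk_seq": 0, "chunk_audio_offset_sec": 0.0,
     "alignment": [{"text": "Hello", "start": 0.0, "end": 0.5},
                   {"text": "world", "start": 0.5, "end": 1.0}]},
    {"chunk_seq": 1, "chunk_audio_offset_sec": 1.5,
     "alignment": [{"text": "Again", "start": 0.0, "end": 0.25}]},
]

alignment_by_chunk: dict[int, Snapshot] = {}
for event in events:
    apply_event(alignment_by_chunk, event)
```

After the loop, chunk 0 holds only its second, more complete snapshot, which is the behavior the walkthrough describes as replacement rather than sequential collection.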

Changes

TTS Timestamped Stream Endpoint Update

| Layer / File(s) | Summary |
|---|---|
| OpenAPI Request and Event Schema (api-reference/openapi.json) | Added TTSStreamWithTimestampRequest schema; updated endpoint request body to use it instead of generic TTSRequest; extended TTSTimestampStreamEvent to include chunk_seq and chunk_audio_offset_sec as required fields with updated descriptions. |
| OpenAPI Examples and Response Description (api-reference/openapi.json) | Updated SSE example payloads to show chunk_audio_offset_sec and chunk_seq in event structures; clarified 200-response description to emphasize latest cumulative alignment snapshots keyed by chunk_seq with replacement semantics. |
| Endpoint Documentation: Payload, Alignment, and Stream Parsing (api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx) | Updated stream payload field table to include chunk_audio_offset_sec; documented alignment shape with local-second semantics and instruction to add offset for global timeline; revised "Parsing the Stream" to replace alignment snapshots by chunk_seq instead of collecting alignments sequentially. |
| Code Examples: Python and Node.js Snapshot Storage (api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx) | Updated Python example to store alignments in alignment_by_chunk dictionary keyed by chunk_seq with {content, offset, alignment} tuples; updated Node.js example to use alignmentByChunk Map with matching snapshot storage pattern. |
| Timeline Construction and Format Guidance (api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx) | Rewrote global timeline construction to group by chunk_seq and offset each segment using snapshot's chunk_audio_offset_sec, replacing prior approach that accumulated audio_duration; updated MP3 format warning to focus on audio window and timestamp snapshot misalignment. |
| Developer Guide Update (developer-guide/core-features/text-to-speech.mdx) | Updated "Streaming with Timestamps" section to describe cumulative alignment snapshots keyed by chunk_seq and clarify client behavior for audio concatenation and snapshot replacement. |
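
The timeline construction summarized above (group by chunk_seq, then shift each chunk-local segment by the snapshot's chunk_audio_offset_sec) might look like the following Python sketch. It is not the example from the changed docs: the `build_global_timeline` helper, the `offset` and `alignment` keys of each stored snapshot, and the segment fields `text`, `start`, and `end` are hypothetical names chosen for illustration.

```python
from typing import Any


def build_global_timeline(alignment_by_chunk: dict[int, dict[str, Any]]) -> list[dict[str, Any]]:
    """Convert chunk-local segment times to positions on the full audio timeline."""
    timeline = []
    # Walk chunks in chunk_seq order so the result follows playback order.
    for chunk_seq in sorted(alignment_by_chunk):
        snapshot = alignment_by_chunk[chunk_seq]
        offset = snapshot["offset"]  # the chunk's chunk_audio_offset_sec
        for segment in snapshot["alignment"]:
            timeline.append({
                "text": segment["text"],
                "start": offset + segment["start"],
                "end": offset + segment["end"],
            })
    return timeline


# Hypothetical snapshots, stored out of order to show that sorting handles it.
snapshots = {
    1: {"offset": 1.5,
        "alignment": [{"text": "Again", "start": 0.0, "end": 0.25}]},
    0: {"offset": 0.0,
        "alignment": [{"text": "Hello", "start": 0.0, "end": 0.5},
                      {"text": "world", "start": 0.5, "end": 1.0}]},
}

timeline = build_global_timeline(snapshots)
```

Because each snapshot carries its own offset, the timeline does not depend on summing per-chunk audio durations, which is the approach the PR says it replaced.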

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly Related PRs

  • fishaudio/docs#79: Establishes the initial timestamped TTS stream endpoint documentation and schemas; this PR builds on it by refining the SSE event structure with chunk_seq and chunk_audio_offset_sec fields and updating alignment snapshot semantics.

Poem

🐰 With chunks and offsets, we leap ahead,
Each snapshot keyed, the waypoints spread,
No more sequences, just the latest way—
Alignment's story, clear as the day! 🎵

@Kilerd Kilerd merged commit fc585de into main May 12, 2026
2 of 3 checks passed
@Kilerd Kilerd deleted the docs/streaming-tts-with-timestamp branch May 12, 2026 11:43

mintlify Bot commented May 12, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
|---|---|---|---|
| hanabiaiinc | 🔴 Failed | | May 12, 2026, 12:55 PM |
