✨ Support for voxtral realtime stt #1277

Open
cameledev wants to merge 11 commits into main from add-voxtral-realtime-stt

@cameledev cameledev commented Apr 16, 2026

Collaborator

Purpose

Add support for Voxtral realtime STT (live transcription).

Why we can't use mistralai plugin

The official livekit-plugins-mistralai plugin is hardwired to Mistral's hosted cloud API, in both the URL path and the wire protocol. We self-host Voxtral on vLLM, which speaks a different protocol on a different path, so the plugin can't be pointed at it. Four distinct issues, in the order they occurred:

1. Hardcoded WebSocket path

The Mistral SDK's RealtimeTranscription._build_url hardcodes the path /v1/audio/transcriptions/realtime. server_url only controls the scheme and host; there is no kwarg to change the suffix, and the LiveKit plugin calls connect() without overriding it either. vLLM serves Voxtral realtime at /v1/realtime, so the hardcoded route doesn't exist on vLLM and every WebSocket connection attempt returned HTTP 500. A monkeypatch on _build_url (/v1/audio/transcriptions/realtime → /v1/realtime) fixed this issue.
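The path rewrite can be sketched roughly as follows. This is an illustration only: `StubRealtimeClient` is a hypothetical stand-in for the SDK class (the real `_build_url` signature is not reproduced here), and `patch_realtime_path` and the example host are made-up names.

```python
from urllib.parse import urlsplit, urlunsplit

MISTRAL_PATH = "/v1/audio/transcriptions/realtime"
VLLM_PATH = "/v1/realtime"


class StubRealtimeClient:
    """Hypothetical stand-in for the SDK class; NOT the real Mistral API."""

    def __init__(self, server_url: str) -> None:
        self.server_url = server_url

    def _build_url(self) -> str:
        # Mimics the hardcoded suffix described above.
        return self.server_url.rstrip("/") + MISTRAL_PATH


def patch_realtime_path(cls: type) -> None:
    """Wrap cls._build_url so the hosted path is rewritten to the vLLM path."""
    original = cls._build_url

    def patched(self, *args, **kwargs) -> str:
        parts = urlsplit(original(self, *args, **kwargs))
        if parts.path.endswith(MISTRAL_PATH):
            new_path = parts.path[: -len(MISTRAL_PATH)] + VLLM_PATH
            parts = parts._replace(path=new_path)
        return urlunsplit(parts)

    cls._build_url = patched


patch_realtime_path(StubRealtimeClient)
url = StubRealtimeClient("wss://stt.example.internal")._build_url()
```

Wrapping the original method, rather than replacing it outright, keeps whatever scheme, host, and query handling the wrapped method already does and only swaps the path suffix.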

2. Incompatible wire protocol (the real showstopper)

Even with the path fixed, the two sides speak different languages:

| | Mistral plugin/SDK | vLLM Voxtral |
|---|---|---|
| Protocol | Mistral proprietary | OpenAI-Realtime derived |
| Server msgs | RealtimeTranscriptionSessionCreated, TranscriptionStreamTextDelta, TranscriptionStreamDone | session.created, transcription.delta, transcription.done |
| Client msgs | send_audio / flush_audio / end_audio | input_audio_buffer.append / .commit |

The Mistral SDK cannot parse vLLM's messages. Monkeypatching this would mean reimplementing the SDK's entire receive path.
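To make the vLLM side of the table concrete, here is a minimal sketch of encoding client messages and folding server events into a transcript. The message type names come from the table above; the payload keys (`audio`, `delta`, `text`) and the helper names are assumptions for illustration, not a confirmed schema.

```python
import base64
import json


def encode_append(pcm_chunk: bytes) -> str:
    """Client -> server: one audio chunk (the "audio" key is an assumption)."""
    return json.dumps(
        {
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(pcm_chunk).decode("ascii"),
        }
    )


def encode_commit() -> str:
    """Client -> server: mark the buffered audio as ready to transcribe."""
    return json.dumps({"type": "input_audio_buffer.commit"})


def apply_event(transcript: str, raw: str) -> tuple[str, bool]:
    """Fold one server message into the running transcript.

    Returns (transcript, is_final). The "delta" and "text" keys are assumptions.
    """
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "transcription.delta":
        return transcript + event.get("delta", ""), False
    if kind == "transcription.done":
        return event.get("text", transcript), True
    return transcript, False  # session.created and unknown events are ignored


text, final = "", False
for raw in (
    '{"type": "session.created"}',
    '{"type": "transcription.delta", "delta": "hello"}',
    '{"type": "transcription.delta", "delta": " world"}',
    '{"type": "transcription.done", "text": "hello world"}',
):
    text, final = apply_event(text, raw)
```

A dedicated client built around plain JSON events like this avoids touching the SDK's receive path at all.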

3. The handshake failures (how the mismatch actually surfaced)

rt.connect() does two sequential handshakes, and the plugin broke at both:

  • WS upgrade handshake (transport). Before the path fix, the upgrade itself was rejected: websockets.exceptions.InvalidStatus: server rejected WebSocket connection: HTTP 500, thrown out of ConnectionPool.prewarm.
  • Realtime session handshake (application). After the _build_url monkeypatch, the upgrade succeeded ([accepted] / connection open in vLLM logs), but _recv_handshake then blocked on await websocket.recv() waiting for Mistral's session.created event. vLLM never sends that event (its OpenAI-style handshake has a different shape), so the 10s LiveKit connect timeout fired → CancelledError → TimeoutError → APIConnectionError: Connection error. The plugin retried, then the session closed as unrecoverable.
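The application-level handshake failure mode can be illustrated with a small sketch that bounds the wait for the first server message and checks its type, so a mismatch fails fast with a clear error. `recv_handshake` and `HandshakeError` are hypothetical names, not the SDK's `_recv_handshake`, and an `asyncio.Queue` stands in for the WebSocket.

```python
import asyncio
import json


class HandshakeError(Exception):
    """Raised when the first server message is missing or unexpected."""


async def recv_handshake(recv, expected_type: str, timeout: float) -> dict:
    """Await the first message via `recv()` and check its "type" field."""
    try:
        raw = await asyncio.wait_for(recv(), timeout)
    except asyncio.TimeoutError as exc:
        raise HandshakeError(f"no message within {timeout}s") from exc
    event = json.loads(raw)
    if event.get("type") != expected_type:
        raise HandshakeError(
            f"expected {expected_type!r}, got {event.get('type')!r}"
        )
    return event


async def demo() -> tuple[str, str]:
    # A server that never sends anything: the bounded wait times out.
    silent: asyncio.Queue = asyncio.Queue()
    try:
        await recv_handshake(silent.get, "session.created", timeout=0.05)
        silent_result = "ok"
    except HandshakeError:
        silent_result = "timeout"

    # A server that sends the expected first event: the handshake completes.
    talkative: asyncio.Queue = asyncio.Queue()
    talkative.put_nowait('{"type": "session.created"}')
    event = await recv_handshake(talkative.get, "session.created", timeout=0.05)
    return silent_result, event["type"]


results = asyncio.run(demo())
```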

4. Dependency resolution failure (packaging)

Independently, livekit-plugins-mistralai==1.5.4 pulls in mistralai[realtime]>=2.0.0. Under uv's universal resolution across requires-python = ">=3.12" (which now includes 3.14), that extra is unresolvable → requirements are unsatisfiable. (It "worked before" only because pip resolved against the single Docker Python 3.13.)

@cameledev cameledev changed the title Support for voxtral realtime stt ✨ Support for voxtral realtime stt Apr 16, 2026
Comment thread src/agents/multi-user-transcriber.py Outdated
Comment thread src/agents/multi-user-transcriber.py Outdated
Comment thread src/agents/multi-user-transcriber.py Fixed
@cameledev cameledev force-pushed the add-voxtral-realtime-stt branch from 5d4ec36 to ec7eb87 Compare May 12, 2026 16:33
@cameledev cameledev force-pushed the add-voxtral-realtime-stt branch from ec7eb87 to de37643 Compare May 19, 2026 16:35
Comment thread src/agents/voxtral_vllm_stt.py Fixed
Comment thread src/agents/voxtral_vllm_stt.py Fixed
Comment thread src/agents/voxtral_vllm_stt.py Fixed
Comment on lines +80 to +88
const existingIndex = prevSegments.findIndex(
  (s: TranscriptionSegmentWithParticipant) => s.id === segment.id
)
if (existingIndex === -1) {
  return [...prevSegments, { participant, ...segment }]
}
const next = prevSegments.slice()
next[existingIndex] = { ...next[existingIndex], ...segment }
return next

Collaborator Author

allow update of previously received segments

Comment thread src/agents/voxtral_vllm_stt.py Fixed
Comment thread src/agents/voxtral_vllm_stt.py Fixed
@cameledev cameledev marked this pull request as ready for review May 28, 2026 18:13
@cameledev cameledev requested a review from lebaudantoine May 28, 2026 18:13

@cameledev cameledev left a comment

Collaborator Author

.

Comment thread src/backend/core/api/__init__.py Outdated
  "default_country": settings.ROOM_TELEPHONY_DEFAULT_COUNTRY,
  },
- "subtitle": {"enabled": settings.ROOM_SUBTITLE_ENABLED},
+ "subtitle": {"enabled": True}, # settings.ROOM_SUBTITLE_ENABLED},

Collaborator Author

REVERT BEFORE MERGE

Comment on lines 589 to 590

Collaborator Author

REVERT BEFORE MERGE (2/2)

send_t.cancel()
try:
    await send_t
except (asyncio.CancelledError, websockets.WebSocketException):
Comment thread src/backend/core/api/viewsets.py Outdated
  authentication_classes=[LiveKitTokenAuthentication],
- )
- @FeatureFlag.require("subtitle")
+ ) # @FeatureFlag.require("subtitle")

@lebaudantoine lebaudantoine May 29, 2026

Collaborator

revert before merge

Comment thread compose.yml
build:
  context: ./src/agents
  target: development
command: ["python", "multi_user_transcriber.py", "dev"]

@lebaudantoine lebaudantoine May 29, 2026

Collaborator

It's not necessary, but I understand if you want to keep more control over the compose service.

@sonarqubecloud

Quality Gate failed

Failed conditions
C Reliability Rating on New Code (required ≥ A)
