Summary
A recording's Recording row stays status=active indefinitely (never transitions to stopped/saved) when the egress is terminated out-of-band — i.e. by anything other than the stop-recording flow. The clearest trigger: changing the room access while recording, which stops the LiveKit egress directly. After this, the room is effectively locked — POST /api/v1.0/rooms/{id}/start-recording/ returns 409 Conflict ("already recording") and never recovers without a manual DB edit.
Steps to reproduce
- Start a recording in a room (observed with mode=transcript, but the lifecycle is mode-independent).
- Change the room's access while it is recording (this stops the egress).
- LiveKit fires egress_ended → meet receives it (POST /api/v1.0/rooms/webhooks-livekit/ → 200), but the Recording row stays active.
- Try to record in that room again → 409 Conflict, permanently.
Contrast — the normal paths work
| End method | Recording row |
| --- | --- |
| Stop button / leave room (→ stop-recording) | stopped ✅ |
| Change room access (egress stops out-of-band) | stays active ❌ → 409 lock |
Expected
Finalize the recording (active → stopped/saved) on the egress_ended webhook regardless of how the egress ended, not only via the stop-recording request — or add a reconciliation/janitor for active recordings whose egress has already ended. As-is, a single access change permanently 409-locks the room and leaks active rows.
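A minimal sketch of the first option, to make the expected behaviour concrete. It is not meet's actual code: `Recording`, `RecordingStore`, and `handle_webhook` are hypothetical in-memory stand-ins for the model, the table, and the webhook view, and the payload shape (`event`, `egressInfo.egressId`) is assumed from LiveKit's JSON webhook format.

```python
from dataclasses import dataclass
from enum import Enum


class Status(str, Enum):
    ACTIVE = "active"
    STOPPED = "stopped"
    SAVED = "saved"


@dataclass
class Recording:
    # Hypothetical stand-in for the Recording row.
    room_id: str
    egress_id: str
    status: Status = Status.ACTIVE


class RecordingStore:
    """Hypothetical in-memory stand-in for the Recording table."""

    def __init__(self):
        self._by_egress = {}

    def add(self, recording):
        self._by_egress[recording.egress_id] = recording

    def get_by_egress(self, egress_id):
        return self._by_egress.get(egress_id)

    def has_active(self, room_id):
        # Mirrors the check that makes start-recording answer 409.
        return any(
            r.room_id == room_id and r.status is Status.ACTIVE
            for r in self._by_egress.values()
        )


def handle_webhook(store, event):
    """Finalize on egress_ended, whoever stopped the egress.

    Returns True if a row was transitioned, False otherwise.
    """
    if event.get("event") != "egress_ended":
        return False
    egress_id = event.get("egressInfo", {}).get("egressId")
    recording = store.get_by_egress(egress_id)
    if recording is None or recording.status is not Status.ACTIVE:
        # Unknown egress or already finalized: keep the handler idempotent.
        return False
    recording.status = Status.STOPPED
    return True
```

Because the handler only acts on rows that are still active, it stays safe when the stop-recording flow has already finalized the row, and a repeated webhook delivery is a no-op. A janitor for the second option could reuse the same transition for active rows whose egress is reported as ended.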
Notes
- meet does receive the egress webhook at the moment it ends (200 response), so the signal is available — it just isn't used to finalize the recording.
- Minor, likely related: when the egress ends this way, stopping the metadata-collector agent dispatch logs TwirpError(code=unavailable, "no response from servers", 503) because the agent already self-exited on participant disconnect, so the stop is a no-op on a finished job. Worth guarding so it doesn't surface as an error.
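One possible shape for that guard, as a sketch only. The `TwirpError` class below is a local stand-in with the fields seen in the log line (code, message, status), not an import from the real client. `stop_dispatch_quietly` and its `job_finished` flag are hypothetical names, and how meet would know the job has finished is left open.

```python
import logging

logger = logging.getLogger(__name__)


class TwirpError(Exception):
    """Local stand-in for the client's error type (assumption)."""

    def __init__(self, code, msg, status):
        super().__init__(msg)
        self.code = code
        self.msg = msg
        self.status = status


def stop_dispatch_quietly(stop, *, job_finished):
    """Call stop(); downgrade 'unavailable' to info if the job already ended.

    Returns True if the stop call succeeded, False if it was a no-op
    on a finished job. Any other failure is re-raised unchanged.
    """
    try:
        stop()
    except TwirpError as exc:
        if exc.code == "unavailable" and job_finished:
            logger.info("agent dispatch already finished; nothing to stop")
            return False
        raise
    return True
```

The guard is deliberately narrow: it only swallows the error when the caller already knows the job is over, so a real outage while the agent is still running is still reported.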