Problem
Room deletion in Resonate is not atomic. When deletion fails mid-process, the system is left in an inconsistent state across Appwrite and LiveKit.
This results in:
- Orphaned participant documents referencing deleted rooms
- LiveKit rooms without corresponding database records
- Silent cleanup failures with no recovery mechanism
Affected Code Paths
- delete-room
- livekit-webhook (room_finished)
- database-cleaner
Root Cause
- Deletions run inside forEach with an async callback, so the returned promises are never awaited
- No verification after deletion steps
- No rollback or compensation logic
- No reconciliation for partial failures
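The first root cause can be reproduced in isolation. The sketch below is not taken from the Resonate codebase: `deleteDocument` is a hypothetical stand-in for any single deletion call, and the `deleted` array only records which calls actually finished. It contrasts the un-awaited pattern with an awaited one.

```javascript
// Hypothetical stand-in for one real deletion call (e.g. removing a participant document).
const deleted = [];
async function deleteDocument(id) {
  await new Promise((resolve) => setTimeout(resolve, 10)); // simulate network latency
  deleted.push(id);
}

// Problem pattern: forEach discards the promises returned by the async callback,
// so this function resolves before any deletion has finished.
async function deleteAllWithForEach(ids) {
  ids.forEach(async (id) => {
    await deleteDocument(id);
  });
}

// Awaited pattern: resolves only after every deletion has completed,
// and rejects if any single deletion rejects.
async function deleteAllAwaited(ids) {
  await Promise.all(ids.map((id) => deleteDocument(id)));
}
```

With the forEach version, a caller that goes on to delete the room document right after `await deleteAllWithForEach(ids)` does so while participant deletions are still in flight. A rejection inside the callback also never reaches the caller's try/catch; it surfaces as an unhandled rejection instead.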
Proposed Fix
- Replace async forEach with Promise.all() or for...of
- Verify completion after each deletion phase
- Add retry logic for transient failures
- Log failed deletions with sufficient detail
- Implement a reconciliation job to clean up orphaned rooms and participants
- Ensure LiveKit and Appwrite states remain consistent
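A minimal sketch of how the retry, verification, and logging steps might fit together. Everything here is an assumption for illustration: the `deps` functions (`listParticipantIds`, `deleteParticipant`, `deleteLiveKitRoom`, `deleteRoomDocument`, `log`) are hypothetical wrappers around the real Appwrite and LiveKit calls, the `roomId` field on participant documents is assumed, and the phase order (participants, then LiveKit room, then room document) is one possible choice, not the project's confirmed design.

```javascript
// Retry an async operation with exponential backoff (illustrative helper).
async function withRetry(operation, { attempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

// Delete a room in phases. The room document is removed last, so a failure in an
// earlier phase leaves a record that a later retry or reconciliation run can find.
async function deleteRoom(roomId, deps, retryOptions) {
  const ids = await deps.listParticipantIds(roomId);

  // Phase 1: delete participants, collecting failures instead of stopping at the first.
  const results = await Promise.allSettled(
    ids.map((id) => withRetry(() => deps.deleteParticipant(id), retryOptions))
  );
  const failed = ids.filter((_, index) => results[index].status === "rejected");
  if (failed.length > 0) {
    deps.log({ roomId, phase: "participants", failed });
    throw new Error(`Failed to delete ${failed.length} participant(s) for room ${roomId}`);
  }

  // Verify the phase completed before moving on.
  const remaining = await deps.listParticipantIds(roomId);
  if (remaining.length > 0) {
    deps.log({ roomId, phase: "verify-participants", remaining });
    throw new Error(`Participants still present for room ${roomId}`);
  }

  // Phase 2 and 3: remove the LiveKit room, then the database record.
  await withRetry(() => deps.deleteLiveKitRoom(roomId), retryOptions);
  await withRetry(() => deps.deleteRoomDocument(roomId), retryOptions);
}

// Reconciliation helper: participants whose (assumed) roomId field points at no existing room.
function findOrphanedParticipants(participants, existingRoomIds) {
  const known = new Set(existingRoomIds);
  return participants.filter((participant) => !known.has(participant.roomId));
}
```

This does not give true cross-service atomicity, which Appwrite and LiveKit cannot provide jointly. It narrows the failure window and makes every partial failure visible, and a scheduled job using something like `findOrphanedParticipants` would clean up whatever still slips through.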
Acceptance Criteria
- Room deletion is atomic (all-or-nothing)
- Orphaned participants are automatically cleaned
- Deletion failures are logged and traceable
- No inconsistent room state across services
- Safe handling of concurrent deletions
Impact
High-priority data integrity issue affecting:
- System reliability
- Database cleanliness and storage costs
- User experience and operational overhead