feat: add meeting transcription feature for macOS #789
Add meeting transcription capability using system audio capture via ScreenCaptureKit. This allows recording meetings with mic and/or desktop audio, transcribing in real-time every 30 seconds using existing STT providers (OpenAI/Groq).

Features:
- System audio capture using macos-system-audio-recorder npm package
- Support for mic-only, system-only, or both audio sources
- Real-time transcription with time-stamped segments
- Meeting history with search, rename, and delete
- Full transcript view with segment details

Note: This feature is macOS-only (requires macOS 12.3+ for ScreenCaptureKit)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
🤖 Augment PR Summary

Summary: This PR adds a macOS-only “Meetings” feature that records meeting audio (mic, system audio, or both) and transcribes it in near real-time using the app’s existing STT providers.

Changes:

Technical Notes: Transcription requests are guarded against overlap and use request timeouts, and audio buffers are bounded to prevent unbounded growth; meeting data is persisted as JSON under the app’s data directory.
    // Mutations
    const startMutation = useMutation({
      mutationFn: (audioSource: MeetingAudioSource) =>
        tipcClient.startMeetingRecording({ audioSource }),

    }

    private startTranscriptionLoop(): void {
      this.transcriptionTimer = setInterval(async () => {
- Make transcription loop non-reentrant to prevent overlapping runs
- Add microphone capture in renderer for mic/both audio sources
- Wait for in-progress transcription before stopping recording

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
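The first fix listed in this commit, a non-reentrant transcription loop, can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: `createNonReentrantTick` and `work` are hypothetical names, and scheduling is left to the caller.

```typescript
// Hypothetical helper: wraps an async job so that a new run is skipped
// while the previous one is still in flight.
function createNonReentrantTick(work: () => Promise<void>): () => Promise<boolean> {
  let running = false
  return async () => {
    if (running) return false // previous run still active: skip this tick
    running = true
    try {
      await work()
      return true
    } finally {
      running = false // always release the guard, even if work() throws
    }
  }
}

// Usage sketch (transcribePendingAudio is a placeholder name):
// const tick = createNonReentrantTick(() => transcribePendingAudio())
// const timer = setInterval(() => { tick().catch(console.error) }, 30_000)
```

A boolean guard is enough here because the timer callback and the job share one event loop; a slow STT request simply causes later ticks to be skipped rather than queued.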
augment review

      )}
    </Button>

    {process.platform !== "darwin" && (
    const audioData = Buffer.concat(chunks)

    // Create WAV header
    const wavHeader = this.createWavHeader(audioData.length, SAMPLE_RATE, CHANNELS, BYTES_PER_SAMPLE * 8)
The WAV header is always written as 48kHz/mono/16-bit, but macos-system-audio-recorder can output different sample rates / channel counts / bit depths via getAudioDetails(). If these don’t match the actual PCM stream, the WAV will be invalid and transcription quality can degrade or fail.
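To make the mismatch concrete, a 44-byte PCM WAV header can be derived from the stream's reported parameters instead of constants. This is a sketch under assumptions: the `AudioDetails` field names are illustrative and are not confirmed to match what `getAudioDetails()` returns, and `buildWavHeader` is a hypothetical stand-in for the PR's `createWavHeader`.

```typescript
// Illustrative shape only; the real getAudioDetails() result may differ.
interface AudioDetails {
  sampleRate: number
  channels: number
  bitsPerSample: number
}

// Builds a canonical 44-byte RIFF/WAVE header for uncompressed integer PCM.
function buildWavHeader(dataLength: number, details: AudioDetails): Uint8Array {
  const { sampleRate, channels, bitsPerSample } = details
  const blockAlign = (channels * bitsPerSample) / 8 // bytes per sample frame
  const byteRate = sampleRate * blockAlign
  const header = new Uint8Array(44)
  const view = new DataView(header.buffer)
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) header[offset + i] = tag.charCodeAt(i)
  }

  writeTag(0, "RIFF")
  view.setUint32(4, 36 + dataLength, true) // file size minus the first 8 bytes
  writeTag(8, "WAVE")
  writeTag(12, "fmt ")
  view.setUint32(16, 16, true) // fmt chunk size for PCM
  view.setUint16(20, 1, true) // format 1 = integer PCM (float PCM would need 3)
  view.setUint16(22, channels, true)
  view.setUint32(24, sampleRate, true)
  view.setUint32(28, byteRate, true)
  view.setUint16(32, blockAlign, true)
  view.setUint16(34, bitsPerSample, true)
  writeTag(36, "data")
  view.setUint32(40, dataLength, true)
  return header
}
```

Because `byteRate` and `blockAlign` are computed from the same inputs as the other fields, the header stays internally consistent whatever format the recorder reports.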
    const groqBaseUrl = config.groqBaseUrl || "https://api.groq.com/openai/v1"
    const openaiBaseUrl = config.openaiBaseUrl || "https://api.openai.com/v1"

    const response = await fetch(
fetch() here has no timeout/abort; if the STT request hangs, isTranscribing can remain true and stopRecording() will wait indefinitely in its while (this.isTranscribing) loop. Consider adding an AbortController with a reasonable timeout for transcription requests.
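One way to apply the suggestion is a small wrapper that aborts after a deadline. This is a hedged sketch, not the PR's code: `withTimeout` is a hypothetical helper, and it only helps when the wrapped operation honors the `AbortSignal`, as `fetch` does.

```typescript
// Hypothetical helper: runs an abortable operation and aborts it if it has
// not settled within timeoutMs.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    return await run(controller.signal)
  } finally {
    clearTimeout(timer) // avoid a stray timer once the operation settles
  }
}

// Usage sketch (url and form are placeholders):
// const response = await withTimeout(
//   (signal) => fetch(url, { method: "POST", body: form, signal }),
//   60_000,
// )
```

With the request guaranteed to settle, a flag such as `isTranscribing` can be reset in a `finally` block, so a loop that waits on it cannot spin forever.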
- Replace process.platform with process.env.IS_MAC in renderer for platform detection
- Use actual audio details from macos-system-audio-recorder getAudioDetails() instead of hardcoded values
- Add AbortController with 60s timeout for transcription API requests to prevent hanging

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
augment review

      logApp(`[MeetingRecorder] Started recording meeting ${meetingId}`)
      return meeting
    } catch (error) {

      startTime: this.systemAudioBuffer.startTime,
      audioDetails: { ...this.systemAudioBuffer.audioDetails },
    })
    this.systemAudioBuffer.data = []
This clears the buffered audio before attempting transcription; if transcribeAudio() fails, that chunk is dropped and the meeting transcript can become permanently incomplete. Consider retaining the buffer until transcription succeeds (or otherwise enabling retry) to avoid silent data loss.
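The retention idea can be sketched as a buffer that is trimmed only after the transcription call resolves. `PendingAudio` and its methods are hypothetical names for illustration, and the sketch assumes that `flush` is never called concurrently.

```typescript
// Hypothetical buffer: chunks are removed only after transcribe() resolves.
class PendingAudio {
  private chunks: Uint8Array[] = []

  push(chunk: Uint8Array): void {
    this.chunks.push(chunk)
  }

  get length(): number {
    return this.chunks.length
  }

  // Returns the transcript text, or null when there is nothing to send.
  async flush(
    transcribe: (chunks: Uint8Array[]) => Promise<string>,
  ): Promise<string | null> {
    if (this.chunks.length === 0) return null
    const snapshot = this.chunks.slice() // later pushes are not part of this request
    const text = await transcribe(snapshot) // a throw here leaves this.chunks untouched
    this.chunks.splice(0, snapshot.length) // drop only what was actually transcribed
    return text
  }
}
```

Audio that arrives while a request is in flight stays in the buffer for the next flush, and a failed request leaves everything in place for a retry.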
    }

    // Send to main process
    tipcClient.addMeetingMicrophoneData({ audioData: int16Data.buffer })
tipcClient.addMeetingMicrophoneData(...) returns a Promise but it’s not awaited/handled; if IPC rejects, this can surface as an unhandled rejection in the renderer during recording. Consider attaching a .catch (or otherwise handling errors) to keep failures contained.
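A fire-and-forget call can stay non-blocking while still containing failures. The helper below is a hypothetical illustration (`sendAndForget` is not part of the PR or of the IPC client), and it handles only asynchronous rejections, not synchronous throws from `send`.

```typescript
// Hypothetical helper: the send is not awaited, but a rejection is routed to
// onError instead of surfacing as an unhandled rejection.
function sendAndForget(
  send: () => Promise<unknown>,
  onError: (error: unknown) => void,
): void {
  send().catch(onError)
}

// Usage sketch (names taken from the diff above; the logging is a placeholder):
// sendAndForget(
//   () => tipcClient.addMeetingMicrophoneData({ audioData: int16Data.buffer }),
//   (error) => console.error("mic chunk send failed", error),
// )
```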
- Add cleanupRecordingResources() to properly clean up system recorder on error
- Retain audio buffer until transcription succeeds to avoid data loss
- Add .catch() to tipcClient.addMeetingMicrophoneData to handle promise rejections

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
augment review

    }

    source.connect(processor)
    processor.connect(audioContext.destination)
processor.connect(audioContext.destination) will typically route the microphone signal to the user’s speakers, which can be distracting and can create an echo/feedback loop (especially if “system audio” capture is enabled). Consider ensuring the mic capture path is “silent” while still keeping the processing callback running.
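A common way to keep the path silent is to insert a gain node set to zero between the processor and the destination. The sketch below uses minimal structural types so that it runs outside a browser; in the renderer these would be the real Web Audio `AudioContext`, `AudioNode`, and `GainNode`. `connectSilently` is a hypothetical name.

```typescript
// Minimal stand-ins for the Web Audio types this sketch touches.
interface NodeLike {
  connect(destination: NodeLike): unknown
}
interface GainLike extends NodeLike {
  gain: { value: number }
}
interface ContextLike {
  destination: NodeLike
  createGain(): GainLike
}

// Wires source -> processor -> muted gain -> destination.
function connectSilently(
  ctx: ContextLike,
  source: NodeLike,
  processor: NodeLike,
): GainLike {
  const mute = ctx.createGain()
  mute.gain.value = 0 // nothing audible reaches the speakers
  source.connect(processor)
  processor.connect(mute) // the processor is still part of a live graph
  mute.connect(ctx.destination)
  return mute
}
```

The graph still terminates at the destination, which is the reason the original code connected the processor there, but the zero gain keeps the microphone signal out of the output.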
      }
    } catch (error) {
      // Keep the buffer data for retry on next interval
      logApp(`[MeetingRecorder] Transcription error for ${source} (will retry):`, error)
Since the buffer is retained on failures, a prolonged outage (or missing API key) can cause the WAV payload to grow without bound and eventually exceed upstream STT size limits, making retries fail indefinitely. Consider bounding/splitting buffered audio when retrying to avoid permanent transcript loss/memory growth.
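One possible bounding strategy is to discard the oldest chunks once a byte limit is exceeded. This is a sketch under assumptions: `trimToLimit` is a hypothetical helper, the limit is supplied by the caller, and the newest chunk is always kept even if it alone exceeds the limit.

```typescript
// Hypothetical helper: drops the oldest chunks in place until the total size
// fits within maxBytes. Returns the number of bytes discarded.
function trimToLimit(chunks: Uint8Array[], maxBytes: number): number {
  let total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0)
  let dropped = 0
  while (total > maxBytes && chunks.length > 1) {
    const oldest = chunks.shift()!
    total -= oldest.byteLength
    dropped += oldest.byteLength
  }
  return dropped
}
```

Dropping whole chunks trades a gap in the transcript for a bounded payload. Splitting the backlog into several smaller requests would keep the audio instead, at the cost of more retry bookkeeping.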
…audio echo

- Add MAX_BUFFER_SIZE_BYTES (25MB) to prevent unbounded audio buffer growth
- Discard oldest audio chunks when buffer limit is reached
- Route microphone audio through silent gain node to prevent echo/feedback
- Add getBufferSize helper for calculating buffer usage

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
augment review
Summary
Add meeting transcription capability using system audio capture via ScreenCaptureKit. This allows recording meetings with mic and/or desktop audio, transcribing in real-time every 30 seconds using existing STT providers (OpenAI/Groq).
Features
- macos-system-audio-recorder npm package (Swift/ScreenCaptureKit)
Technical Details
New Files
- apps/desktop/src/main/meeting-recorder.ts - Main process service for audio capture and transcription
- apps/desktop/src/renderer/src/pages/meetings.tsx - React UI for meeting recording
Modified Files
- apps/desktop/src/shared/types.ts - Added Meeting-related types
- apps/desktop/src/main/tipc.ts - Added IPC handlers for meeting operations
- apps/desktop/src/renderer/src/router.tsx - Added /meetings route
- apps/desktop/src/renderer/src/components/app-layout.tsx - Added Meetings nav link
Dependencies
- macos-system-audio-recorder@0.0.1 - Swift binary for ScreenCaptureKit audio capture
Platform Support
Testing
Screenshots
The Meetings page is accessible from the sidebar under Settings and provides: