feat: add meeting transcription feature for macOS by aj47 · Pull Request #789 · aj47/SpeakMCP

aj47 · 2025-12-26T23:18:18Z

Summary

Add meeting transcription capability using system audio capture via ScreenCaptureKit. This allows recording meetings from the microphone, desktop audio, or both, and transcribes the audio in near real time (every 30 seconds) using the existing STT providers (OpenAI/Groq).

Features

System audio capture using macos-system-audio-recorder npm package (Swift/ScreenCaptureKit)
Flexible audio sources: Mic-only, system-only, or both
Real-time transcription: Audio buffered and transcribed every 30 seconds
Meeting history: View, search, rename, and delete past meetings
Transcript viewer: Full transcript with time-stamped segments showing audio source

Technical Details

New Files

apps/desktop/src/main/meeting-recorder.ts - Main process service for audio capture and transcription
apps/desktop/src/renderer/src/pages/meetings.tsx - React UI for meeting recording

Modified Files

apps/desktop/src/shared/types.ts - Added Meeting-related types
apps/desktop/src/main/tipc.ts - Added IPC handlers for meeting operations
apps/desktop/src/renderer/src/router.tsx - Added /meetings route
apps/desktop/src/renderer/src/components/app-layout.tsx - Added Meetings nav link

Dependencies

Added macos-system-audio-recorder@0.0.1 - Swift binary for ScreenCaptureKit audio capture

Platform Support

⚠️ macOS only - Requires macOS 12.3+ for ScreenCaptureKit. Users on other platforms will see a warning message.

Testing

TypeScript compiles without errors
All existing tests pass (38/38)

Screenshots

The Meetings page is accessible from the sidebar under Settings and provides:

Audio source selector (Both/Mic/System)
Start/Stop recording with live timer
Meeting history grouped by date
Detail dialog with full transcript and segments

Add meeting transcription capability using system audio capture via ScreenCaptureKit. This allows recording meetings with mic and/or desktop audio, transcribing in real-time every 30 seconds using existing STT providers (OpenAI/Groq).

Features:
- System audio capture using macos-system-audio-recorder npm package
- Support for mic-only, system-only, or both audio sources
- Real-time transcription with time-stamped segments
- Meeting history with search, rename, and delete
- Full transcript view with segment details

Note: This feature is macOS-only (requires macOS 12.3+ for ScreenCaptureKit)

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

augmentcode · 2025-12-26T23:26:42Z

🤖 Augment PR Summary

Summary: This PR adds a macOS-only “Meetings” feature that records meeting audio (mic, system audio, or both) and transcribes it in near real-time using the app’s existing STT providers.

Changes:

Introduces a new main-process MeetingRecorderService that captures system audio via ScreenCaptureKit (through macos-system-audio-recorder), buffers audio, and transcribes every 30 seconds.
Adds IPC procedures in apps/desktop/src/main/tipc.ts for starting/stopping recording, streaming microphone PCM chunks from the renderer, and CRUD operations on meeting history.
Adds a new Meetings page (/meetings) with recording controls, meeting history list, rename/delete actions, and a transcript viewer dialog.
Extends shared types (Meeting, MeetingTranscriptSegment, state/config types) to support the new feature.
Registers the new route and adds a navigation entry in the desktop sidebar.
Adds dependency macos-system-audio-recorder@0.0.1 for system audio capture on macOS 12.3+.

Technical Notes: Transcription requests are guarded against overlap, use request timeouts, and audio buffers are bounded to prevent unbounded growth; meeting data is persisted as JSON under the app’s data directory.


augmentcode

Review completed. 2 suggestions posted.

Comment augment review to trigger a new review at any time.

augmentcode · 2025-12-26T23:26:43Z

+  // Mutations
+  const startMutation = useMutation({
+    mutationFn: (audioSource: MeetingAudioSource) =>
+      tipcClient.startMeetingRecording({ audioSource }),


The UI offers microphone/both sources, but this page never captures mic audio nor calls tipcClient.addMeetingMicrophoneData, so those modes likely won’t include microphone input in the transcript.


augmentcode · 2025-12-26T23:26:43Z

+  }
+
+  private startTranscriptionLoop(): void {
+    this.transcriptionTimer = setInterval(async () => {


setInterval(async () => ...) can overlap if transcribeBufferedAudio() takes longer than 30s, which can lead to concurrent writes/segment ordering issues (and potentially race with stopRecording()). Consider ensuring transcription runs are non-reentrant.
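One way to make the interval callback non-reentrant is a simple in-flight guard. The sketch below is illustrative only: `makeNonReentrant` is a hypothetical helper, not code from this PR, and it assumes that skipping a tick while the previous run is still in progress is acceptable.

```typescript
type AsyncTask = () => Promise<void>

// Hypothetical helper: wraps an async task so that a call made while a
// previous run is still in flight is skipped instead of overlapping.
function makeNonReentrant(task: AsyncTask): () => Promise<boolean> {
  let running = false
  return async () => {
    if (running) return false // previous run still in flight; skip this tick
    running = true
    try {
      await task()
      return true
    } finally {
      running = false // always release the guard, even if the task throws
    }
  }
}
```

A guarded callback could then be passed to `setInterval` in place of the raw async function. Note that the wrapper still rejects if the task throws, so the caller would need to attach its own error handling.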


- Make transcription loop non-reentrant to prevent overlapping runs
- Add microphone capture in renderer for mic/both audio sources
- Wait for in-progress transcription before stopping recording

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

aj47 · 2025-12-26T23:33:05Z

augment review

augmentcode

Review completed. 3 suggestions posted.


augmentcode · 2025-12-26T23:38:27Z

+              )}
+            </Button>
+
+            {process.platform !== "darwin" && (


process.platform may be undefined in the renderer (this app generally uses build-time process.env.IS_MAC), which would crash this page when rendered. Consider switching this check to process.env.IS_MAC (or another renderer-safe platform flag).


augmentcode · 2025-12-26T23:38:27Z

+    const audioData = Buffer.concat(chunks)
+
+    // Create WAV header
+    const wavHeader = this.createWavHeader(audioData.length, SAMPLE_RATE, CHANNELS, BYTES_PER_SAMPLE * 8)


The WAV header is always written as 48kHz/mono/16-bit, but macos-system-audio-recorder can output different sample rates / channel counts / bit depths via getAudioDetails(). If these don’t match the actual PCM stream, the WAV will be invalid and transcription quality can degrade or fail.
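For reference, a canonical 44-byte PCM WAV header can be derived from the reported stream format roughly as follows. This is a sketch rather than the PR's `createWavHeader`: the `AudioDetails` field names are assumptions (the actual shape returned by `getAudioDetails()` is not shown in this thread), it uses `DataView` instead of `Buffer`, and it assumes integer PCM samples (format code 1).

```typescript
// Hypothetical shape; the real getAudioDetails() result may use different names.
interface AudioDetails {
  sampleRate: number
  channels: number
  bitsPerSample: number
}

function createWavHeader(dataLength: number, details: AudioDetails): Uint8Array {
  const header = new Uint8Array(44)
  const view = new DataView(header.buffer)
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i))
  }
  // Bytes per sample frame across all channels.
  const blockAlign = details.channels * (details.bitsPerSample / 8)

  writeAscii(0, "RIFF")
  view.setUint32(4, 36 + dataLength, true) // RIFF chunk size (file size minus 8)
  writeAscii(8, "WAVE")
  writeAscii(12, "fmt ")
  view.setUint32(16, 16, true) // fmt chunk size for plain PCM
  view.setUint16(20, 1, true) // audio format 1 = integer PCM (assumed)
  view.setUint16(22, details.channels, true)
  view.setUint32(24, details.sampleRate, true)
  view.setUint32(28, details.sampleRate * blockAlign, true) // byte rate
  view.setUint16(32, blockAlign, true)
  view.setUint16(34, details.bitsPerSample, true)
  writeAscii(36, "data")
  view.setUint32(40, dataLength, true) // size of the PCM payload in bytes
  return header
}
```

If the recorder delivers floating-point samples instead, the format code and bit depth would need to change accordingly.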


augmentcode · 2025-12-26T23:38:27Z

+    const groqBaseUrl = config.groqBaseUrl || "https://api.groq.com/openai/v1"
+    const openaiBaseUrl = config.openaiBaseUrl || "https://api.openai.com/v1"
+
+    const response = await fetch(


fetch() here has no timeout/abort; if the STT request hangs, isTranscribing can remain true and stopRecording() will wait indefinitely in its while (this.isTranscribing) loop. Consider adding an AbortController with a reasonable timeout for transcription requests.
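A minimal sketch of the suggested timeout, assuming a small wrapper is acceptable. `withTimeout` is a hypothetical helper name, and the URL and request options in the usage comment are placeholders rather than the PR's actual request.

```typescript
// Hypothetical helper: runs an abortable operation and aborts it after timeoutMs.
async function withTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    return await run(controller.signal)
  } finally {
    clearTimeout(timer) // avoid a stray timer once the request has settled
  }
}

// Usage (placeholder URL and body):
// const response = await withTimeout(
//   (signal) => fetch(url, { method: "POST", body, signal }),
//   60_000,
// )
```

Because the abort surfaces as a rejected promise, the caller's existing `try`/`finally` can reset `isTranscribing` on timeout just as it would on any other failure.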


- Replace process.platform with process.env.IS_MAC in renderer for platform detection
- Use actual audio details from macos-system-audio-recorder getAudioDetails() instead of hardcoded values
- Add AbortController with 60s timeout for transcription API requests to prevent hanging

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

aj47 · 2025-12-26T23:44:32Z

augment review

augmentcode

Review completed. 3 suggestions posted.


augmentcode · 2025-12-26T23:50:44Z

+
+      logApp(`[MeetingRecorder] Started recording meeting ${meetingId}`)
+      return meeting
+    } catch (error) {


If an error happens after startSystemAudioRecording() succeeds, this catch resets flags but doesn’t stop this.systemRecorder, so system capture could continue running in the background. Consider adding cleanup for the recorder/buffers on this failure path.
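The failure path described here could be handled along these lines. This is a sketch under assumptions: `RecorderLike` and `startWithCleanup` are hypothetical names, and the real recorder may expose a differently named or asynchronous stop method.

```typescript
// Hypothetical minimal recorder surface; the real package's API may differ.
interface RecorderLike {
  stop(): void
}

// Starts the recorder, runs the remaining setup, and stops the recorder again
// if that setup fails, so capture does not keep running in the background.
async function startWithCleanup<R extends RecorderLike>(
  startRecorder: () => Promise<R>,
  finishSetup: (recorder: R) => Promise<void>,
): Promise<R> {
  let recorder: R | null = null
  try {
    recorder = await startRecorder()
    await finishSetup(recorder)
    return recorder
  } catch (error) {
    recorder?.stop() // only reached with a recorder if startRecorder succeeded
    throw error
  }
}
```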


augmentcode · 2025-12-26T23:50:44Z

+        startTime: this.systemAudioBuffer.startTime,
+        audioDetails: { ...this.systemAudioBuffer.audioDetails },
+      })
+      this.systemAudioBuffer.data = []


This clears the buffered audio before attempting transcription; if transcribeAudio() fails, that chunk is dropped and the meeting transcript can become permanently incomplete. Consider retaining the buffer until transcription succeeds (or otherwise enabling retry) to avoid silent data loss.
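One possible shape for "retain until success" is to transcribe a snapshot and remove only the chunks that were actually sent. The sketch below is illustrative: `ChunkBuffer` and `flushBuffer` are hypothetical, the PR's buffer also tracks `startTime` and `audioDetails`, and the code assumes nothing else removes chunks from the front of the buffer while a request is in flight.

```typescript
// Hypothetical minimal buffer; the PR's buffer carries more metadata.
interface ChunkBuffer {
  data: Uint8Array[]
}

async function flushBuffer(
  buffer: ChunkBuffer,
  transcribe: (chunks: Uint8Array[]) => Promise<string>,
): Promise<string | null> {
  const pending = buffer.data.slice() // snapshot; the live buffer keeps receiving chunks
  if (pending.length === 0) return null
  try {
    const text = await transcribe(pending)
    // Drop only what was sent; chunks that arrived during the request stay queued.
    buffer.data.splice(0, pending.length)
    return text
  } catch {
    // Buffer left untouched, so the next interval retries the same audio.
    return null
  }
}
```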


augmentcode · 2025-12-26T23:50:44Z

+          }
+
+          // Send to main process
+          tipcClient.addMeetingMicrophoneData({ audioData: int16Data.buffer })


tipcClient.addMeetingMicrophoneData(...) returns a Promise but it’s not awaited/handled; if IPC rejects, this can surface as an unhandled rejection in the renderer during recording. Consider attaching a .catch (or otherwise handling errors) to keep failures contained.


- Add cleanupRecordingResources() to properly clean up system recorder on error
- Retain audio buffer until transcription succeeds to avoid data loss
- Add .catch() to tipcClient.addMeetingMicrophoneData to handle promise rejections

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>

aj47 · 2025-12-26T23:55:39Z

augment review

augmentcode

Review completed. 2 suggestions posted.


augmentcode · 2025-12-26T23:59:43Z

+        }
+
+        source.connect(processor)
+        processor.connect(audioContext.destination)


processor.connect(audioContext.destination) will typically route the microphone signal to the user’s speakers, which can be distracting and can create an echo/feedback loop (especially if “system audio” capture is enabled). Consider ensuring the mic capture path is “silent” while still keeping the processing callback running.
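A common way to keep the processing callback alive without audible output is to route the processor through a gain node set to zero. The sketch below is written against small structural interfaces so it can run outside a browser; the `*Like` interfaces and `connectSilently` are hypothetical, and in the renderer the real `AudioContext` and Web Audio nodes would be expected to take their place.

```typescript
// Hypothetical structural stand-ins for the Web Audio types used here.
interface AudioNodeLike {
  connect(destination: AudioNodeLike): unknown
}
interface GainNodeLike extends AudioNodeLike {
  gain: { value: number }
}
interface AudioContextLike {
  destination: AudioNodeLike
  createGain(): GainNodeLike
}

// Wires source -> processor -> muted gain -> destination, so the graph stays
// connected to the output while no microphone signal reaches the speakers.
function connectSilently(
  context: AudioContextLike,
  source: AudioNodeLike,
  processor: AudioNodeLike,
): GainNodeLike {
  const mute = context.createGain()
  mute.gain.value = 0 // silence everything that passes through this node
  source.connect(processor)
  processor.connect(mute)
  mute.connect(context.destination)
  return mute
}
```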


augmentcode · 2025-12-26T23:59:43Z

+      }
+    } catch (error) {
+      // Keep the buffer data for retry on next interval
+      logApp(`[MeetingRecorder] Transcription error for ${source} (will retry):`, error)


Since the buffer is retained on failures, a prolonged outage (or missing API key) can cause the WAV payload to grow without bound and eventually exceed upstream STT size limits, making retries fail indefinitely. Consider bounding/splitting buffered audio when retrying to avoid permanent transcript loss/memory growth.


…audio echo

- Add MAX_BUFFER_SIZE_BYTES (25MB) to prevent unbounded audio buffer growth
- Discard oldest audio chunks when buffer limit is reached
- Route microphone audio through silent gain node to prevent echo/feedback
- Add getBufferSize helper for calculating buffer usage

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
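The bounded-buffer behaviour described in this commit could look roughly like the following. Only the names `getBufferSize` and `MAX_BUFFER_SIZE_BYTES` come from the commit message; their implementations here, and the `appendBounded` helper, are assumptions made for illustration.

```typescript
// Name taken from the commit message; the value follows its "25MB" note.
const MAX_BUFFER_SIZE_BYTES = 25 * 1024 * 1024

// Total number of bytes currently held across all buffered chunks.
function getBufferSize(chunks: Uint8Array[]): number {
  return chunks.reduce((total, chunk) => total + chunk.byteLength, 0)
}

// Hypothetical helper: appends a chunk, then discards the oldest chunks until
// the buffer fits within maxBytes. Returns the number of bytes dropped.
function appendBounded(
  chunks: Uint8Array[],
  chunk: Uint8Array,
  maxBytes: number = MAX_BUFFER_SIZE_BYTES,
): number {
  chunks.push(chunk)
  let size = getBufferSize(chunks)
  let dropped = 0
  // Always keep at least the newest chunk, even if it alone exceeds the limit.
  while (size > maxBytes && chunks.length > 1) {
    const oldest = chunks.shift()
    if (!oldest) break
    size -= oldest.byteLength
    dropped += oldest.byteLength
  }
  return dropped
}
```

Dropping from the front trades completeness for a payload that stays under the limit, so the oldest audio is lost if the outage lasts long enough.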

aj47 · 2025-12-27T00:05:11Z

augment review

augmentcode

Review completed. No suggestions at this time.

