A private, local-first meeting recorder, transcriber & voice toolkit for Windows.
Record your meetings, transcribe and diarize them on your own GPU, get AI meeting notes, dictate with your voice, and have any text read aloud — all from a lightweight tray app.
Most meeting tools upload your audio to the cloud. Cadence keeps recording, transcription and speaker separation entirely on your machine — nothing leaves your computer unless you turn on an optional cloud feature (AI notes / read-aloud). It captures your microphone and the meeting audio as separate tracks, so "you vs. them" is perfectly split before any AI runs.
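
Because the two tracks are captured separately, attributing speech to "you" or "them" needs no model at all: each transcript segment inherits the label of the track it came from. The sketch below illustrates that idea only. The `Segment` shape and the `mergeTracks` helper are hypothetical names chosen for illustration, not Cadence's actual code.

```typescript
// Hypothetical shape of one transcribed segment (an assumption, not Cadence's real type).
interface Segment {
  start: number; // seconds from the beginning of the recording
  end: number;
  text: string;
}

interface LabeledSegment extends Segment {
  speaker: "You" | "Them";
}

// Label each segment by its source track, then interleave both tracks by start time.
function mergeTracks(mic: Segment[], system: Segment[]): LabeledSegment[] {
  const labeled: LabeledSegment[] = [
    ...mic.map((s) => ({ ...s, speaker: "You" as const })),
    ...system.map((s) => ({ ...s, speaker: "Them" as const })),
  ];
  // sort() is stable in ES2019+ engines, so ties keep mic-before-system order.
  return labeled.sort((a, b) => a.start - b.start);
}

const timeline = mergeTracks(
  [
    { start: 0.0, end: 2.1, text: "Shall we start?" },
    { start: 5.0, end: 6.0, text: "Great." },
  ],
  [{ start: 2.4, end: 4.8, text: "Yes, I have the agenda." }],
);

for (const seg of timeline) {
  console.log(`[${seg.start.toFixed(1)}s] ${seg.speaker}: ${seg.text}`);
}
```

Diarization is only needed to split the "Them" track further when several remote participants share it.
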
- 🎙️ Dual-track recording — your mic and the meeting/system audio recorded separately (FLAC), plus a mixed track for playback. No virtual cable needed (WASAPI loopback).
- 📝 Local transcription — NVIDIA Parakeet TDT v3 on your GPU. Fast, multilingual, free.
- 👥 Speaker diarization — pyannote splits multiple remote speakers (optional, needs a free Hugging Face token).
- 🧠 AI meeting notes — concise summaries, action items and open questions via DeepSeek / OpenRouter / Mistral (editable prompt).
- ⌨️ Voice dictation — 3 global-hotkey modes:
  - Dictation → raw speech-to-text inserted at the cursor.
  - Dictation + DeepSeek → cleaned-up, polished text.
  - Dictation + DeepSeek + Translate → polished and translated into a language of your choice.
- 🔊 Read aloud — select any text, press a hotkey, and Cadence speaks it (Microsoft Edge neural voices).
- 🌍 14 interface languages — English, Русский, 中文, Español, Français, Deutsch, Português, Italiano, 日本語, 한국어, العربية (RTL), हिन्दी, Türkçe, Polski.
- 🔒 Private by default — recording, transcription and diarization run locally. Cloud is opt-in and clearly marked.
- 🖥️ Lives in the tray — autostart with Windows, mac-style UI, one-click installer or portable build.
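
One way to picture the three dictation modes in the list above is as progressively longer pipelines that all begin with the same local speech-to-text step. This is a rough sketch under assumptions: the mode and stage names are invented here, and whether Cadence runs polishing and translation as separate steps or as a single LLM call is not stated in this README.

```typescript
// Hypothetical identifiers for the three hotkey modes and their processing stages.
type DictationMode = "raw" | "polish" | "polish+translate";
type Stage = "speech-to-text" | "llm-polish" | "llm-translate";

// Return the ordered stages a mode would run through (assumed split, for illustration).
function stagesFor(mode: DictationMode): Stage[] {
  switch (mode) {
    case "raw":
      return ["speech-to-text"];
    case "polish":
      return ["speech-to-text", "llm-polish"];
    case "polish+translate":
      return ["speech-to-text", "llm-polish", "llm-translate"];
  }
}

console.log(stagesFor("polish+translate").join(" -> "));
```
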
| Feature | Where it runs |
|---|---|
| Recording, transcription (Parakeet), diarization (pyannote) | 100% local on your machine |
| Meeting notes, dictation polish/translate (DeepSeek) | Cloud API — only if you add a key |
| Read-aloud (Edge voices) | Microsoft online service — only when you use it |
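
The table can also be read as a simple rule for when data may leave the machine. The sketch below encodes that rule as a lookup plus a check. It is an illustration of the policy described here, not Cadence's implementation; the `Feature` names, the `Settings` shape, and `mayLeaveMachine` are all hypothetical.

```typescript
// Hypothetical feature and location identifiers mirroring the table rows.
type Feature =
  | "recording"
  | "transcription"
  | "diarization"
  | "notes"
  | "dictation-polish"
  | "read-aloud";
type Location = "local" | "cloud-api" | "microsoft-online";

const WHERE_IT_RUNS: Record<Feature, Location> = {
  recording: "local",
  transcription: "local",
  diarization: "local",
  notes: "cloud-api",
  "dictation-polish": "cloud-api",
  "read-aloud": "microsoft-online",
};

// Assumed settings shape: the LLM key is absent until the user adds one.
interface Settings {
  llmApiKey?: string;
}

// True only when using the feature could send data off the machine.
function mayLeaveMachine(feature: Feature, settings: Settings, userInvoked: boolean): boolean {
  switch (WHERE_IT_RUNS[feature]) {
    case "local":
      return false; // never leaves the computer
    case "cloud-api":
      return Boolean(settings.llmApiKey); // only if a key was added
    case "microsoft-online":
      return userInvoked; // only when read-aloud is actually used
  }
}

console.log(mayLeaveMachine("transcription", { llmApiKey: "example-key" }, true));
console.log(mayLeaveMachine("notes", {}, true));
```
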
- Download the latest `Cadence-Setup-x.y.z.exe` (installer) or `Cadence-x.y.z-portable.exe` from Releases.
- Run it. The app appears in the system tray.
- On the first transcription, Cadence automatically sets up its local engine: it installs the `uv` package manager (if missing), creates a Python environment, and downloads the speech model (cached afterwards). No console, no manual steps. Transcription uses the ONNX Parakeet engine in one of two modes (Settings → Recording):
  - CPU: the lightweight default (~1 GB, no GPU).
  - GPU: onnxruntime-gpu for ~7× faster transcription of long meetings (downloads the CUDA libraries, ~1.8 GB, the first time you switch).

  Speaker diarization uses a separate NeMo/PyTorch engine (set up only if you use it).
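
The first-run setup described above amounts to three idempotent steps that are skipped once their result exists. A minimal sketch of that planning logic follows, assuming a hypothetical `EngineState` snapshot and step names; how Cadence actually detects each condition is not covered in this README.

```typescript
// Hypothetical snapshot of what is already present on the machine.
interface EngineState {
  uvInstalled: boolean;
  envExists: boolean;
  modelCached: boolean;
}

type SetupStep = "install-uv" | "create-python-env" | "download-model";

// Return only the steps still missing, in the order the README lists them.
function pendingSetupSteps(state: EngineState): SetupStep[] {
  const steps: SetupStep[] = [];
  if (!state.uvInstalled) steps.push("install-uv");
  if (!state.envExists) steps.push("create-python-env");
  if (!state.modelCached) steps.push("download-model");
  return steps;
}

// A fresh machine needs everything; a later run needs nothing.
console.log(pendingSetupSteps({ uvInstalled: false, envExists: false, modelCached: false }));
console.log(pendingSetupSteps({ uvInstalled: true, envExists: true, modelCached: true }));
```
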
The build is unsigned, so Windows SmartScreen may warn on first launch → More info → Run anyway.
- Windows 10 / 11 (x64).
- NVIDIA GPU recommended for fast transcription. The default CPU mode runs fine for short clips and dictation; GPU mode (requires an NVIDIA CUDA card) is ~7× faster on long meetings (an 84-minute recording takes about 1 min on GPU vs. about 8 min on CPU).
- Internet on first run (to download the engine + models) and for the optional cloud features.
- For speaker diarization: a free Hugging Face read token, and accept the licenses of `pyannote/speaker-diarization-3.1` and `pyannote/segmentation-3.0`.
- For AI notes / dictation polish & translate: an API key (DeepSeek by default; OpenRouter / Mistral also supported).
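
To get a feel for the CPU vs. GPU trade-off, the single benchmark quoted in the requirements (84 minutes of audio, read here as about 1 minute on GPU and about 8 minutes on CPU) can be extrapolated to other lengths. The sketch assumes processing time scales linearly with audio length, which is a simplification; real timings depend on hardware, and the function name is made up for illustration.

```typescript
type EngineMode = "cpu" | "gpu";

// Processing minutes per minute of audio, derived from the quoted 84-minute example.
const PROCESSING_RATIO: Record<EngineMode, number> = {
  cpu: 8 / 84,
  gpu: 1 / 84,
};

// Rough linear estimate; an assumption for illustration, not a measured guarantee.
function estimateTranscriptionMinutes(audioMinutes: number, mode: EngineMode): number {
  if (audioMinutes < 0) throw new RangeError("audioMinutes must be non-negative");
  return audioMinutes * PROCESSING_RATIO[mode];
}

for (const mode of ["cpu", "gpu"] as const) {
  const minutes = estimateTranscriptionMinutes(30, mode);
  console.log(`${mode}: ~${minutes.toFixed(1)} min for a 30-minute meeting`);
}
```
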
```bash
npm install
npm run dev        # development with HMR
npm run typecheck  # main + renderer type checks
npm run build:win  # NSIS installer + portable .exe in dist/
```

Electron + React + TypeScript + Tailwind (electron-vite, electron-builder) · NVIDIA Parakeet TDT v3 (NeMo) · pyannote.audio · FastAPI worker · edge-tts · uiohook-napi (global hotkeys) · ffmpeg-static.
- Parakeet TDT v3 (ASR) and pyannote (diarization) are downloaded from Hugging Face on first use. pyannote models are gated — you must accept their licenses on huggingface.co and use a read token.
- Read-aloud uses Microsoft Edge online neural voices via the community `edge-tts` client (intended for personal use; subject to Microsoft's terms).
- Meeting notes / dictation polish & translate call the LLM provider you configure (DeepSeek by default) with your own API key.
Cadence itself is MIT-licensed; the models and services above keep their own licenses/terms.
Issues and pull requests are welcome — see CONTRIBUTING.md.
Cadence is free and open-source. If it saves you time, you can buy me a coffee ☕ — it directly supports development.
MIT © Jurijs Ivanenko


