Skip to content

bykcyc/Cadence

Repository files navigation

🎙️ Cadence

A private, local-first meeting recorder, transcriber & voice toolkit for Windows.

Record your meetings, transcribe and diarize them on your own GPU, get AI meeting notes, dictate with your voice, and have any text read aloud — all from a lightweight tray app.

Latest release Downloads Platform License: MIT Made with Electron

⬇️ Download the latest release


Why Cadence?

Most meeting tools upload your audio to the cloud. Cadence keeps recording, transcription and speaker separation entirely on your machine — nothing leaves your computer unless you turn on an optional cloud feature (AI notes / read-aloud). It captures your microphone and the meeting audio as separate tracks, so "you vs. them" is perfectly split before any AI runs.

📸 Screenshots

Recordings list — local transcription, speaker separation and AI-notes status per meeting

Meeting detail — dual-track audio player and the diarized transcript by speaker   Settings — recording devices and the three dictation hotkey modes

✨ Features

  • 🎙️ Dual-track recording — your mic and the meeting/system audio recorded separately (FLAC), plus a mixed track for playback. No virtual cable needed (WASAPI loopback).
  • 📝 Local transcription — NVIDIA Parakeet TDT v3 on your GPU. Fast, multilingual, free.
  • 👥 Speaker diarizationpyannote splits multiple remote speakers (optional, needs a free Hugging Face token).
  • 🧠 AI meeting notes — concise summaries, action items and open questions via DeepSeek / OpenRouter / Mistral (editable prompt).
  • ⌨️ Voice dictation — 3 global-hotkey modes:
    • Dictation → raw speech-to-text inserted at the cursor.
    • Dictation + DeepSeek → cleaned-up, polished text.
    • Dictation + DeepSeek + Translate → polished and translated into a language of your choice.
  • 🔊 Read aloud — select any text, press a hotkey, and Cadence speaks it (Microsoft Edge neural voices).
  • 🌍 14 interface languages — English, Русский, 中文, Español, Français, Deutsch, Português, Italiano, 日本語, 한국어, العربية (RTL), हिन्दी, Türkçe, Polski.
  • 🔒 Private by default — recording, transcription and diarization run locally. Cloud is opt-in and clearly marked.
  • 🖥️ Lives in the tray — autostart with Windows, mac-style UI, one-click installer or portable build.

🔐 Privacy

Feature Where it runs
Recording, transcription (Parakeet), diarization (pyannote) 100% local on your machine
Meeting notes, dictation polish/translate (DeepSeek) Cloud API — only if you add a key
Read-aloud (Edge voices) Microsoft online service — only when you use it

🚀 Getting started

  1. Download the latest Cadence-Setup-x.y.z.exe (installer) or Cadence-x.y.z-portable.exe from Releases.
  2. Run it. The app appears in the system tray.
  3. On the first transcription, Cadence automatically sets up its local engine — it installs the uv package manager (if missing), creates a Python environment, and downloads the speech model (cached afterwards). No console, no manual steps. Transcription uses the ONNX Parakeet engine in one of two modes (Settings → Recording): CPU — the lightweight default (~1 GB, no GPU); or GPU — onnxruntime-gpu for ~7× faster transcription of long meetings (downloads the CUDA libraries, ~1.8 GB, the first time you switch). Speaker diarization uses a separate NeMo/PyTorch engine (set up only if you use it).

The build is unsigned, so Windows SmartScreen may warn on first launch → More info → Run anyway.

💻 Requirements

  • Windows 10 / 11 (x64).
  • NVIDIA GPU recommended for fast transcription. The default CPU mode runs fine for short clips & dictation; GPU mode (an NVIDIA CUDA card) is ~7× faster on long meetings (84 min ≈ 1 min vs ~8 min).
  • Internet on first run (to download the engine + models) and for the optional cloud features.
  • For speaker diarization: a free Hugging Face read token, and accept the licenses of pyannote/speaker-diarization-3.1 and pyannote/segmentation-3.0.
  • For AI notes / dictation polish & translate: an API key (DeepSeek by default; OpenRouter / Mistral also supported).

🛠️ Build from source

npm install
npm run dev          # development with HMR
npm run typecheck    # main + renderer type checks
npm run build:win    # NSIS installer + portable .exe in dist/

🧱 Tech stack

Electron + React + TypeScript + Tailwind (electron-vite, electron-builder) · NVIDIA Parakeet TDT v3 (NeMo) · pyannote.audio · FastAPI worker · edge-tts · uiohook-napi (global hotkeys) · ffmpeg-static.

⚖️ Models & third-party services

  • Parakeet TDT v3 (ASR) and pyannote (diarization) are downloaded from Hugging Face on first use. pyannote models are gated — you must accept their licenses on huggingface.co and use a read token.
  • Read-aloud uses Microsoft Edge online neural voices via the community edge-tts client (intended for personal use; subject to Microsoft's terms).
  • Meeting notes / dictation polish & translate call the LLM provider you configure (DeepSeek by default) with your own API key.

Cadence itself is MIT-licensed; the models and services above keep their own licenses/terms.

🤝 Contributing

Issues and pull requests are welcome — see CONTRIBUTING.md.

❤️ Support

Cadence is free and open-source. If it saves you time, you can buy me a coffee ☕ — it directly supports development.

📄 License

MIT © Jurijs Ivanenko