LogNotes is a lightweight, local speech-to-text application that transcribes your recorded notes and pastes the result wherever your cursor is placed. It's primarily designed to "log" short notes. I use it quite often when instructing coding agents (e.g., when providing feedback, describing bugs, or outlining requirements).
The app uses Whisper for transcription. The current version is still very much a work in progress, but I'll definitely be working on further improvements.
LogNotes is built as an Electron front end (the UI) over a Python back end that runs the ML pipeline and OS integration. See ARCHITECTURE.md for how it all fits together.
I built LogNotes because I wanted a free, fully local alternative to Wispr Flow: something I could run on-device without a subscription. The goal was to recreate the same core experience, even if it was a bit slower, while processing voice recordings locally.
When looking into open-source solutions, I came across Handy by @cjpais. I used that app's code as a reference for several optimizations that make LogNotes faster and more extensible. I would definitely recommend checking out Handy as well.
- Python 3.10+ (the back-end ML pipeline)
- Node.js 18+ (the Electron front end)
cd LogNotes
python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
pip install -r requirements.txt
cd electron
npm install
Build a Windows installer with build\build-electron.ps1 (produces dist-electron\LogNotes Setup *.exe).
- Transcription Speed: Larger speech-to-text models are generally more accurate but take longer to transcribe. I have optimized the speed as much as I can, and it should be reasonably fast even on CPUs.
- First Transcription: The very first transcription may be slow since the speech detection tool has to start up. Expect the same delay on the first transcription after restarting the app.
- Hotkeys: If you're using LogNotes primarily on a specific app (e.g., Cursor, Obsidian, Jira), make sure your hotkey doesn't conflict with any existing shortcuts in those apps.
- Windows App Builds: If you change the code and rebuild the Windows app, expect the build to take several minutes.
- Antivirus Scans: The first time you run the Windows app, your antivirus software may need to scan it before you can use it. Once the scan is over, reopen the app and it should work as expected.
- Local Processing - All transcription happens on your machine. Silence is automatically filtered out.
- Flexible Recording Modes - Choose between Hold mode (press and hold) or Toggle mode (click to start/stop).
- Whisper Transcription - Choose between Whisper base / small. CUDA is auto-detected and used when available.
- Session Activity Tab - Every transcription this session is retained in RAM so you can retry it with a different model, copy the text, or delete it.
- Always-Visible Recording Overlay - Small borderless status indicator pinned to a screen corner; drag to reposition, right-click to cycle corners.
- Checkpoint Pasting - Sentences are pasted as soon as Whisper finishes each one, so partial text is preserved if processing fails mid-stream.
- No Audio Storage - Recordings are held in memory only during the app session and never written to disk. The Activity tab keeps recent clips in RAM so you can retry a transcription with a different model. Everything is cleared on app close.
- Config Validation - All configuration values are validated against whitelists on load.
- Atomic Config Permissions - The config file is created with 0o600 permissions in a single os.open() call, with no readable window between creation and chmod.
- Bounded Activity Memory - The session audio cap is enforced before adding each new entry, preventing a single long recording from temporarily spiking RAM past the limit.
- Pinned Model Versions - External model downloads use pinned versions.
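
To make the two config-related points above concrete, here is a minimal sketch of whitelist validation followed by a single-call 0o600 file creation. The setting names (`model`, `mode`), the whitelist contents, the defaults, and the helper names `validate_config` and `save_config` are assumptions for illustration only; the real schema lives in src/config.py and is described in documentation/configuration.md.

```python
import json
import os

# Hypothetical whitelist and defaults, not the project's actual schema.
ALLOWED_VALUES = {
    "model": {"base", "small"},
    "mode": {"hold", "toggle"},
}
DEFAULTS = {"model": "base", "mode": "hold"}


def validate_config(raw: dict) -> dict:
    """Return a config holding only whitelisted keys and values.

    Unknown keys are dropped; invalid values fall back to the default.
    """
    clean = dict(DEFAULTS)
    for key, allowed in ALLOWED_VALUES.items():
        value = raw.get(key)
        if isinstance(value, str) and value in allowed:
            clean[key] = value
    return clean


def save_config(path: str, config: dict) -> None:
    """Write the config, creating the file as owner read/write only.

    The mode is passed to os.open() itself, so a newly created file is
    never readable by other users, unlike open() followed by chmod().
    The mode only applies when the file is created, and it is only
    meaningful on POSIX systems.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(config, handle, indent=2)
```

Falling back to defaults instead of raising keeps the app usable after a hand-edited config goes wrong, though rejecting the file outright would be an equally reasonable choice.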
cd electron && npm start
This launches the Electron UI, which spawns the Python back end automatically. Activate the venv first so the back end's Python dependencies are available.
Hold Mode (default):
- Open any text editor or input field where you want to paste text
- Press and hold the hotkey (default: Ctrl+Shift+D)
- Speak your text
- Release the hotkey
- Wait for processing - the transcribed text will be pasted at your cursor
Toggle Mode:
- Open any text editor or input field where you want to paste text
- Press the hotkey once to start recording
- Speak your text
- Press the hotkey again to stop recording
- Wait for processing - the transcribed text will be pasted at your cursor
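
The difference between the two modes comes down to how hotkey press and release events map to "recording on" and "recording off". The sketch below shows one way to model that. The `Mode` and `RecordingSwitch` names are hypothetical and this is not the code in src/input/hotkey.py. It also assumes the keyboard listener may deliver repeated press events while a key is held, which is common with OS key repeat.

```python
from enum import Enum


class Mode(Enum):
    HOLD = "hold"
    TOGGLE = "toggle"


class RecordingSwitch:
    """Tracks whether recording is active, driven by hotkey events."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False
        self._key_down = False  # True while the hotkey is physically held

    def on_press(self) -> bool:
        """Handle a hotkey press; returns the new recording state."""
        if self._key_down:
            # Repeated press while the key is held: ignore it.
            return self.recording
        self._key_down = True
        if self.mode is Mode.HOLD:
            self.recording = True
        else:
            self.recording = not self.recording
        return self.recording

    def on_release(self) -> bool:
        """Handle a hotkey release; returns the new recording state."""
        self._key_down = False
        if self.mode is Mode.HOLD:
            self.recording = False
        return self.recording
```

In Hold mode the release event stops the recording. In Toggle mode the release is ignored and only the next fresh press flips the state.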
LogNotes/
├── sidecar.py # Python back end: WebSocket/RPC server + bridge
├── requirements.txt # Python dependencies
├── ARCHITECTURE.md # Full architecture reference
├── electron/ # Electron front end
│ ├── main.js # Process spawn/supervision, tray, lifecycle
│ ├── preload.js # Hardened contextBridge surface
│ ├── package.json # electron-builder config
│ └── renderer/ # index.html/renderer.js (tabs), overlay.html/.js
├── src/
│ ├── controller.py # LogNotesController — the orchestrator (Tk-free)
│ ├── ui_bridge.py # UIBridge protocol the controller talks through
│ ├── config.py # Schema, validation, 0o600 save, ConfigStore
│ ├── activity.py # In-memory ActivityStore (session-scoped)
│ ├── paths.py # User data / cache dir + bundled-asset resolution
│ ├── audio/ # recorder.py (sounddevice)
│ ├── transcription/ # registry.py, whisper.py, device.py
│ ├── input/ # hotkey.py (pynput), paster.py (paste/clipboard)
│ └── ui/ # Legacy Tk UI (reference only; entry main.py)
├── build/ # PyInstaller specs + build-electron.ps1
├── tests/ # unittest suite + sidecar protocol smoke test
└── documentation/
    ├── configuration.md # Config file, schema, settings, validation
    └── troubleshooting.md # Common issues and fixes
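
For readers curious how a session-scoped store like the one in activity.py can enforce its audio cap before adding an entry, here is a rough sketch. The class name `BoundedActivityStore`, the `Entry` fields, and the eviction policy (drop the audio of the oldest entries first, keep their text) are all assumptions made for illustration; the actual ActivityStore may behave differently.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Entry:
    text: str
    audio: Optional[bytes]  # None once the audio has been dropped


class BoundedActivityStore:
    """In-memory store whose total audio size never exceeds a cap."""

    def __init__(self, max_audio_bytes: int) -> None:
        self.max_audio_bytes = max_audio_bytes
        self._entries: Deque[Entry] = deque()
        self._audio_bytes = 0

    @property
    def audio_bytes(self) -> int:
        return self._audio_bytes

    @property
    def entries(self) -> List[Entry]:
        return list(self._entries)

    def add(self, text: str, audio: bytes) -> None:
        size = len(audio)
        if size > self.max_audio_bytes:
            # Assumption: an oversized clip keeps its text but not its audio.
            self._entries.append(Entry(text, None))
            return
        # Free space BEFORE storing the new clip, oldest entries first,
        # so the running total never rises above the cap.
        for entry in self._entries:
            if self._audio_bytes + size <= self.max_audio_bytes:
                break
            if entry.audio is not None:
                self._audio_bytes -= len(entry.audio)
                entry.audio = None
        self._entries.append(Entry(text, audio))
        self._audio_bytes += size
```

Checking the cap first, instead of appending and trimming afterwards, is what avoids the temporary spike: at no point does the store hold both the old clips and the new one beyond the limit.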
| Component | Library |
|---|---|
| Front end | Electron |
| Back-end IPC | WebSocket (loopback) |
| Transcription | faster-whisper |
| Audio Recording | sounddevice |
| Global Hotkeys | pynput |
| Text Pasting | pynput + pyperclip |
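
faster-whisper returns transcription segments lazily as a generator, which is what makes paste-as-you-go behavior possible. The sketch below illustrates the idea with a stand-in generator so it runs without a model or audio file. The names `paste_as_completed` and `fake_segments`, and the `paste` callback, are hypothetical and do not come from the LogNotes code.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


def paste_as_completed(
    segments: Iterable[str],
    paste: Callable[[str], None],
) -> Tuple[List[str], Optional[Exception]]:
    """Paste each piece of text as soon as it becomes available.

    Returns the pieces that were pasted plus the error, if any, that
    interrupted the stream. Text pasted before a failure stays pasted.
    """
    pasted: List[str] = []
    try:
        for text in segments:
            text = text.strip()
            if not text:
                continue  # skip empty segments
            paste(text + " ")
            pasted.append(text)
    except Exception as exc:
        return pasted, exc
    return pasted, None


def fake_segments() -> Iterator[str]:
    """Stand-in for a lazy transcription stream that fails midway."""
    yield " First sentence."
    yield " Second sentence."
    raise RuntimeError("simulated mid-stream failure")


if __name__ == "__main__":
    output: List[str] = []
    done, error = paste_as_completed(fake_segments(), output.append)
    print(done, error)
```

With a real model, the text stream would presumably come from something like `(segment.text for segment in segments)`, where `segments` is the first value returned by `WhisperModel.transcribe(...)`.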
- Architecture — The full picture: the Electron + Python-back-end split, the transcription pipeline, the IPC protocol, module layout, packaging, and the security model.
- Configuration — Covers the config file location, full settings schema, valid values for each option, and the validation rules applied on load.
- Troubleshooting — Step-by-step fixes for common issues including hotkeys not firing, audio problems, transcription quality, and packaged build failures.
This project was built entirely with Claude Code and OpenAI Codex. While the code has been reviewed, AI-generated code can contain bugs or issues that are easy to miss. If you spot anything significant, please open an issue. I'd genuinely appreciate it.
MIT