GPU-powered local dictation tool by LAB37.
Hold a key, speak, release. Text appears wherever your cursor is. No cloud, no API keys, no subscription. Runs entirely on your machine.
- Hold-to-dictate - Push-to-talk trigger (Caps Lock on Windows, Right Option on Mac)
- Auto-stop - Stops recording automatically when you stop speaking (Silero VAD)
- Pre-buffer - Captures audio before you press the key so first words aren't cut off
- Multi-language - Auto-detects Swedish, English, and 90+ other languages
- Fast - ~0.2s on RTX 4090 (CUDA), ~0.5s on Apple Silicon (MLX Metal)
- Clean output - Strips filler words, detects Whisper hallucinations
- Visual overlay - Floating status bar with live waveform during recording
- System tray - Full settings menu from the system tray icon (Windows) or menu bar (Mac)
- Audio feedback - Subtle terminal-style blips on start/done
- Clipboard paste - Works in any app (restores clipboard after)
- Cross-platform - Windows and macOS
- macOS 13+ (Ventura or later)
- Apple Silicon (M1/M2/M3/M4)
git clone https://github.com/delarc0/bark.git
cd bark
chmod +x setup-mac.sh
./setup-mac.sh

The setup script handles everything automatically: Homebrew, Python, dependencies, and building Bark.app.
Double-click Bark.app (or drag it to your Dock). On first launch:
- Grant Microphone permission when prompted
- Grant Accessibility permission (System Settings > Privacy & Security > Accessibility > toggle Bark on)
- The Whisper model downloads automatically (~1.5 GB, cached after first time)
If the overlay doesn't appear after granting permissions, quit and relaunch.
- A small green overlay appears at the bottom center of your screen
- Hold Right Option (the Option key on the right side) to record
- Speak -- the overlay shows a waveform animation
- Release to transcribe, or just stop talking (auto-stop after 1.5s of silence)
- Transcribed text is pasted at your cursor position
- Right-click the overlay or the menu bar icon to access settings or quit
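
The auto-stop step above can be pictured as a small silence counter fed by the voice activity detector. This is a minimal sketch, not Bark's actual code: the `SilenceTimer` class, the 32 ms frame length, and the rule that auto-stop only triggers after some speech has been heard are all assumptions made for illustration. Only the 1.5 s timeout comes from this README.

```python
from dataclasses import dataclass


@dataclass
class SilenceTimer:
    """Hypothetical helper: decides when trailing silence should end a recording."""
    timeout_s: float = 1.5     # auto-stop threshold described in this README
    frame_s: float = 0.032     # assumed frame length (512 samples at 16 kHz)
    silent_frames: int = 0
    heard_speech: bool = False

    def update(self, is_speech: bool) -> bool:
        """Feed one per-frame VAD decision; return True when recording should stop."""
        if is_speech:
            self.heard_speech = True
            self.silent_frames = 0      # any speech resets the silence counter
        else:
            self.silent_frames += 1
        # Assumption: never auto-stop before the user has said anything.
        return self.heard_speech and self.silent_frames * self.frame_s >= self.timeout_s
```

In use, each audio frame's speech/no-speech decision would be passed to `update()`, and the recording loop would stop the first time it returns `True`.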
cd bark
git pull
./setup-mac.sh

The setup script detects your existing virtual environment and reuses it -- only new/changed packages are installed. Bark.app is rebuilt automatically.
Coming from v1.0 or an old version? If you have problems after updating, do a clean install:
cd bark
git pull
rm -rf .venv Bark.app
./setup-mac.sh

Your settings (bark_config.json) and history (bark_history.txt) are preserved.
| Problem | Fix |
|---|---|
| Nothing happens when I press Right Option | Grant Accessibility permission: System Settings > Privacy & Security > Accessibility. Toggle Bark on. Restart Bark. |
| "Bark.app is damaged" | Run xattr -cr Bark.app in Terminal, then try again. |
| Overlay doesn't appear | Check dictation.log for errors. Run ./setup-mac.sh to rebuild. |
| Transcription is slow first time | Normal -- model loads into Metal memory once. Subsequent runs are ~0.5s. |
Manual setup (advanced)
If you prefer to set things up manually:
brew install python ffmpeg
/opt/homebrew/bin/python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-mac.txt
chmod +x create-app.sh
./create-app.sh

Important: use /opt/homebrew/bin/python3, not the system /usr/bin/python3 (too old, wrong architecture).
- Windows 10/11
- NVIDIA GPU with CUDA support (recommended) or CPU-only mode
Download and run Bark-Setup.exe from the latest release. No Python or developer tools needed -- everything is bundled.
After install, search "Bark" in the Start Menu or use the Desktop shortcut.
Note: Windows SmartScreen may show an "Unknown Publisher" warning. Click "More info" then "Run anyway". This is because the installer is not code-signed (yet).
Install from source (developers)
git clone https://github.com/delarc0/bark.git
cd bark
installer\setup-win.bat

The setup script creates a virtual environment, detects your GPU, installs PyTorch (CUDA or CPU), and launches Bark when done.
Requirements for source install: Python 3.11+, NVIDIA drivers with CUDA 12.x (for GPU mode).
- The Whisper model downloads automatically (~1.5 GB, one-time download)
- A small green overlay appears near the bottom of your screen
- Bark appears in your system tray (notification area) with a menu for all settings
- Hold Caps Lock to record (the key press is suppressed, so it won't toggle capitals)
- Speak -- the overlay shows a waveform animation
- Release to transcribe, or just stop talking (auto-stop after 1.5s of silence)
- Transcribed text appears at your cursor position
- Right-click the overlay or the system tray icon to access settings (language, trigger key, dark mode, etc.)
Download the latest installer from releases and run it again. Your settings and history are preserved.
Edit config.py to customize:
| Setting | Default | Description |
|---|---|---|
| MODEL_SIZE | Auto-detected | Whisper model (platform-specific) |
| LANGUAGE | None (auto) | Force a specific language, e.g. "en" or "sv" |
| AUTO_STOP | True | Auto-stop recording after silence |
| SILENCE_TIMEOUT | 1.5 | Seconds of silence before auto-stop |
| PRE_BUFFER | 0.5 | Seconds of audio kept before recording starts |
| BEEP_VOLUME | 0.3 | Sound effect volume (0.0 - 1.0) |
| PASTE_DELAY | 0.15 | Clipboard paste delay in seconds |
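
As an illustration, a customized set of values might look like the sketch below. It assumes the settings are plain module-level constants in config.py, which this README does not confirm. The names and ranges come from the table, while the chosen values and the sanity checks at the end are hypothetical.

```python
# Hypothetical excerpt of config.py with a few defaults changed.
LANGUAGE = "sv"          # force Swedish instead of auto-detect (default: None)
AUTO_STOP = True         # keep auto-stop enabled
SILENCE_TIMEOUT = 2.5    # wait longer before auto-stop (default: 1.5)
PRE_BUFFER = 0.5         # seconds of audio kept before the key press
BEEP_VOLUME = 0.1        # quieter feedback sounds (default: 0.3)
PASTE_DELAY = 0.15       # clipboard paste delay in seconds

# Illustrative sanity checks, not part of Bark itself.
assert 0.0 <= BEEP_VOLUME <= 1.0
assert SILENCE_TIMEOUT > 0 and PRE_BUFFER >= 0 and PASTE_DELAY >= 0
```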
dictation.py Main entry point, orchestrates everything
audio.py Always-on mic stream, pre-buffer, Silero VAD
transcriber.py Whisper transcription (faster-whisper on Windows, mlx-whisper on Mac)
keyboard_hook.py Trigger key hook (pynput), clipboard paste
overlay.py Tkinter floating overlay with waveform visualization
feedback.py Audio feedback via sounddevice
config.py Platform detection + configurable settings
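
The pre-buffer that audio.py is described as keeping can be sketched as a fixed-size ring buffer over the always-on mic stream. This is only an illustration under stated assumptions: the `PreBuffer` class, the 16 kHz sample rate, and the 512-sample block size are hypothetical. Only the 0.5 s default comes from the settings table.

```python
from collections import deque


class PreBuffer:
    """Hypothetical ring buffer holding the most recent audio blocks."""

    def __init__(self, seconds=0.5, sample_rate=16000, block_size=512):
        # Number of blocks needed to cover `seconds` of audio (at least one).
        max_blocks = max(1, round(seconds * sample_rate / block_size))
        # A deque with maxlen silently discards the oldest block when full.
        self._blocks = deque(maxlen=max_blocks)

    def push(self, block):
        """Store one block from the mic stream while idle."""
        self._blocks.append(block)

    def drain(self):
        """Return the buffered blocks, oldest first, and empty the buffer."""
        blocks = list(self._blocks)
        self._blocks.clear()
        return blocks
```

When the trigger key goes down, the recorder would call `drain()` and prepend those blocks to the new recording, so the first syllables spoken just before the key press are not lost.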
- faster-whisper - CTranslate2 Whisper (Windows/CUDA)
- mlx-whisper - MLX Whisper (Mac/Metal)
- Silero VAD - Voice activity detection
- pynput - Keyboard hooks
Built by LAB37