Free, Open-Source Voice-to-Text for Windows using OpenAI Whisper
Press a hotkey, speak, and your words get typed into any text field. Simple as that.
Tired of slow, inaccurate, or subscription-based dictation software? freevoice is a free, offline voice typing app for Windows that actually works.
The problem: Most speech-to-text tools are cloud-based (privacy concerns, requires internet, often paid), and the built-in Windows voice typing is limited and inconsistent.
The solution: freevoice runs entirely on your machine using OpenAI's Whisper, an open-source speech recognition model. No subscriptions, no data leaving your computer, and no internet required once the model has been downloaded.
Whether you're writing emails, coding, taking notes, or just prefer speaking over typing, freevoice gives you fast, accurate voice-to-text in any application. It works everywhere - browsers, Word, Slack, VS Code, even games with chat.
- Push-to-talk - Hold the hotkey to record, release to transcribe
- Hands-free mode - Lock recording to speak without holding keys
- System tray app - Runs quietly in the background
- Auto-type - Transcribed text is automatically typed into the focused field
- Smart Dictionary - Automatically corrects brand names and technical terms
- Text Cleanup - Removes filler words (um, uh, etc.)
- Multiple Whisper models - Choose speed vs accuracy
- Customizable shortcuts - Set your own hotkeys
- Notification sounds - Audio feedback for recording start/stop
- Offline - Everything runs locally, no internet needed after model download
- GPU acceleration - Supports CUDA for faster transcription (optional)
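The Text Cleanup feature listed above can be pictured with a small regex-based sketch. This is not freevoice's actual implementation, which this README does not show; the `FILLERS` list and the `remove_fillers` name are assumptions made for illustration.

```python
import re

# Hypothetical filler list; the app's real list is not documented here.
FILLERS = ("um", "uh", "erm", "hmm")

# Match a standalone filler word, an optional trailing comma, and the spaces after it.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, FILLERS)) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def remove_fillers(text: str) -> str:
    """Drop standalone filler words and tidy the leftover spacing."""
    cleaned = _FILLER_RE.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned)         # collapse doubled spaces
    cleaned = re.sub(r"\s+([,.!?])", r"\1", cleaned)  # no space before punctuation
    return cleaned.strip()
```

The word boundaries keep words such as "umbrella" intact. The sketch does not re-capitalize a sentence whose first word was removed.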
- Install Python 3.10+ from python.org
  - Check "Add Python to PATH" during installation
- Download this repository (Code → Download ZIP) and extract it
- Install dependencies - Open a terminal in the folder and run:

      pip install -r requirements.txt

- Run freevoice - Double-click freevoice.bat

That's it! The app will download the Whisper model on first run, then you're ready to go.
Note: This is plain Python code - feel free to look around, customize it, or learn from it. No hidden magic here.
- Start the app - It appears in your system tray (bottom-right, near the clock)
- Click on any text field where you want to type
- Hold your hotkey and speak - Release to transcribe
| Action | Keys | Description |
|---|---|---|
| Push-to-talk | Hold Ctrl+Alt | Hold to record, release to transcribe |
| Lock recording | Ctrl+Alt + Space | Hands-free mode - keeps recording after release |
| Stop locked recording | Ctrl+Alt | Stops recording and transcribes |
| Cancel | Esc | Cancels recording without transcribing |
All shortcuts can be customized in Settings.
The app icon shows a colored dot in the corner to indicate status:
| Dot | Status |
|---|---|
| (none) | Idle, ready to record |
| 🔴 Red | Recording (push-to-talk) |
| 🟣 Purple | Recording locked (hands-free) |
| 🟠 Orange | Processing/transcribing |
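
The shortcuts and status dots described above amount to a small state machine. The sketch below is one way to model it, not the app's actual code; the `State` and `Event` names and the transition table are assumptions derived from the two tables.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # no dot
    RECORDING = auto()   # red dot (push-to-talk)
    LOCKED = auto()      # purple dot (hands-free)
    PROCESSING = auto()  # orange dot (transcribing)


class Event(Enum):
    HOTKEY_DOWN = auto()  # Ctrl+Alt pressed
    HOTKEY_UP = auto()    # Ctrl+Alt released
    LOCK = auto()         # Space pressed while Ctrl+Alt is held
    CANCEL = auto()       # Esc
    DONE = auto()         # transcription finished


# (current state, event) -> next state; any pair not listed leaves the state unchanged.
TRANSITIONS = {
    (State.IDLE, Event.HOTKEY_DOWN): State.RECORDING,
    (State.RECORDING, Event.HOTKEY_UP): State.PROCESSING,
    (State.RECORDING, Event.LOCK): State.LOCKED,
    (State.RECORDING, Event.CANCEL): State.IDLE,
    (State.LOCKED, Event.HOTKEY_DOWN): State.PROCESSING,
    (State.LOCKED, Event.CANCEL): State.IDLE,
    (State.PROCESSING, Event.DONE): State.IDLE,
}


def next_state(state: State, event: Event) -> State:
    """Return the state that follows `state` when `event` occurs."""
    return TRANSITIONS.get((state, event), state)


def run(events, state: State = State.IDLE) -> State:
    """Feed a sequence of events through the machine and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

Because unlisted pairs are ignored, releasing Ctrl+Alt while locked keeps the recording going, which matches the hands-free behavior in the table.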
Right-click the tray icon → Settings (or double-click the icon)
- Model - Choose Whisper model (tiny → large-v3)
- Language - Auto-detect or specify language
- Shortcuts - Customize all hotkeys
- Text Cleanup - Remove filler words automatically
- Dictionary - Add custom terms and brand names
- Sounds - Toggle notification sounds
- Startup - Launch automatically with Windows
| Model | Download | Speed | Accuracy | Best for |
|---|---|---|---|---|
| tiny | ~75 MB | Fastest | Basic | Quick notes |
| base | ~150 MB | Fast | Good | Recommended |
| small | ~500 MB | Medium | Better | General use |
| medium | ~1.5 GB | Slow | Great | Important content |
| large-v2 | ~3 GB | Slowest | Excellent | Maximum accuracy |
| large-v3 | ~3 GB | Slowest | Best | Maximum accuracy |
Models are downloaded automatically when first selected.
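
For readers customizing the code, here is a rough sketch of how a model from the table might be loaded with faster-whisper, which this README lists among its credits. It assumes the app loads models through that library. The `pick_device` and `load_model` helpers and the chosen compute types are illustrative, not taken from the freevoice source.

```python
def pick_device() -> tuple[str, str]:
    """Return (device, compute_type), falling back to CPU when CUDA is unavailable."""
    try:
        import ctranslate2  # inference backend that faster-whisper builds on

        if ctranslate2.get_cuda_device_count() > 0:
            return "cuda", "float16"
    except Exception:
        # Library missing or CUDA not usable: fall through to the CPU settings.
        pass
    return "cpu", "int8"


def load_model(name: str = "base"):
    """Load a Whisper model by name, e.g. "tiny", "base" or "large-v3"."""
    # Imported lazily so this module can be imported without faster-whisper installed.
    from faster_whisper import WhisperModel

    device, compute_type = pick_device()
    # faster-whisper fetches the model files on first use and caches them locally.
    return WhisperModel(name, device=device, compute_type=compute_type)
```

With a model loaded, `model.transcribe(path, language="en")` returns an iterator of segments plus an info object. Passing `language` skips the detection step.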
freevoice automatically recognizes similar-sounding words and replaces them with your preferred spelling. Just add your terms - no need to list every variation!
Edit in Settings → Dictionary, or directly in dictionary.json:

    {
      "terms": [
        "ChatGPT",
        "JavaScript",
        "TypeScript",
        "PostgreSQL",
        "Kubernetes",
        "YourBrandName"
      ]
    }

How it works:
- Say "chat gpt" → types "ChatGPT"
- Say "javascript" → types "JavaScript"
- Say "post gres" → types "PostgreSQL"
Uses phonetic matching - just add the correct spelling and it figures out the rest!
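
To make the idea concrete, here is a simplified sketch that loads terms in the dictionary.json format shown above and replaces near-matches. It is not freevoice's algorithm: it uses plain string similarity from Python's `difflib` as a stand-in for phonetic matching, and the function names and the 0.85 threshold are assumptions.

```python
import json
import re
from difflib import SequenceMatcher

DICTIONARY_JSON = '{"terms": ["ChatGPT", "JavaScript", "TypeScript", "PostgreSQL", "Kubernetes"]}'
TERMS = json.loads(DICTIONARY_JSON)["terms"]


def _key(s: str) -> str:
    """Lowercase and keep only letters and digits, so 'chat gpt' becomes 'chatgpt'."""
    return re.sub(r"[^a-z0-9]", "", s.lower())


def _best(phrase: str, terms: list[str]) -> tuple[float, str]:
    """Return (similarity, term) for the term closest to `phrase`."""
    return max((SequenceMatcher(None, _key(phrase), _key(t)).ratio(), t) for t in terms)


def _tail(word: str) -> str:
    """Trailing punctuation of a word, so it can be kept after a replacement."""
    return re.search(r"\W*$", word).group()


def apply_dictionary(text: str, terms: list[str], threshold: float = 0.85) -> str:
    """Replace single words or two-word pairs that closely resemble a term."""
    if not terms:
        return text
    words = text.split()
    out, i = [], 0
    while i < len(words):
        one_score, one_term = _best(words[i], terms)
        if i + 1 < len(words):
            pair_score, pair_term = _best(words[i] + words[i + 1], terms)
            next_score, _ = _best(words[i + 1], terms)
            # Merge two words only when the pair matches better than either word alone.
            if pair_score >= threshold and pair_score > max(one_score, next_score):
                out.append(pair_term + _tail(words[i + 1]))
                i += 2
                continue
        out.append(one_term + _tail(words[i]) if one_score >= threshold else words[i])
        i += 1
    return " ".join(out)
```

For example, `apply_dictionary("use post gres here", TERMS)` merges "post gres" into "PostgreSQL" because the joined pair is closer to the term than either word alone. A real phonetic matcher would also catch variants that are spelled very differently but sound alike, which this sketch would miss.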
- Speak clearly with natural pauses between sentences
- GPU acceleration - If you have an NVIDIA GPU with CUDA, transcription will be much faster
- Set your language - Specifying a language is faster than auto-detect
- Clipboard preserved - Your clipboard contents are restored after typing
- If nothing is recorded, check that your microphone is set as the default in Windows Sound Settings
- Make sure no other app is using the microphone exclusively
- If transcription is slow, use a smaller model (Settings → Model → tiny or base)
- If you have an NVIDIA GPU, ensure CUDA drivers are installed
- If text is not typed, make sure a text field is focused before releasing the hotkey
- Some applications may block simulated keyboard input
- Check tray menu → "Copy Last" to get the text to your clipboard
- If hotkeys do not respond, another app may be capturing the global hotkey - try different shortcuts in Settings
- Run as administrator if shortcuts don't work in elevated apps
freevoice/
├── assets/ # Icons and sounds
├── scripts/ # Development tools
├── src/ # Source code modules
├── config.json # User settings
├── dictionary.json # Custom terms
├── main.py # Entry point
├── freevoice.bat # Launch script
└── requirements.txt # Dependencies
For development with auto-reload (restarts the app when you change files):

    python scripts/dev.py

Press Ctrl+C to stop. Requires watchdog (installed automatically on first run).
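
The auto-reload idea can be illustrated without any dependencies. The sketch below polls file modification times with the standard library instead of using watchdog, so it is a stand-in for scripts/dev.py rather than a copy of it. All function names here are hypothetical.

```python
import os
import subprocess
import sys
import time


def snapshot(root: str, suffix: str = ".py") -> dict[str, float]:
    """Map each matching file under `root` to its last-modified time."""
    times = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                path = os.path.join(folder, name)
                times[path] = os.path.getmtime(path)
    return times


def changed_files(before: dict[str, float], after: dict[str, float]) -> list[str]:
    """Return paths that were added, removed, or modified between two snapshots."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))


def watch(root: str, command: list[str], interval: float = 1.0) -> None:
    """Run `command` and restart it whenever a watched file changes. Ctrl+C stops."""
    proc = subprocess.Popen(command)
    seen = snapshot(root)
    try:
        while True:
            time.sleep(interval)
            current = snapshot(root)
            if changed_files(seen, current):
                proc.terminate()
                proc.wait()
                proc = subprocess.Popen(command)
                seen = current
    except KeyboardInterrupt:
        pass
    finally:
        proc.terminate()
```

A call such as `watch("src", [sys.executable, "main.py"])` would restart the app on every change under src/. Polling is simpler but slower to react than the file-system events watchdog provides.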
- Windows 10/11
- Python 3.10+
- Microphone
- ~200 MB disk space (plus model size)
- OpenAI Whisper - Speech recognition model
- faster-whisper - Optimized Whisper implementation
Created by @estrimaitis
- GitHub: github.com/estrimaitis/freevoice
MIT License - Use it however you want!


