Local-first realtime voice assistant with:
- Python backend for local speech orchestration, memory, and model-host adapters
- React browser client for mic capture, live transcript, and OpenAI WebRTC mode
- Support for both LM Studio and Ollama in local generation mode
- Optional OpenAI Realtime boost mode
For a fuller explanation of the project, see docs/manual.md.
Use the Windows setup script to create backend/.env and backend/data/runtime_config.json before first launch:
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\setup-voice.ps1
Optional flags for scripted runs:
powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\setup-voice.ps1 -UseDefaults -SkipModelPull
The setup script currently provisions Ollama as the local model host, writes the app config files, and can optionally capture an OpenAI API key for boost mode.
Recommended runtime: Python 3.11 or 3.12 for the smoothest local STT stack, though the base backend package itself targets Python 3.11+.
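
Before starting the backend, it can help to confirm that setup has run and that the interpreter matches the guidance above. The following is a minimal sketch, not part of the project: the function names are hypothetical, and it assumes it is run from the repository root.

```python
import sys
from pathlib import Path

# Files the setup script is described as creating, relative to the repo root.
REQUIRED_FILES = ("backend/.env", "backend/data/runtime_config.json")


def missing_setup_files(repo_root: Path) -> list[str]:
    """Return the setup-generated files that do not exist under repo_root."""
    return [rel for rel in REQUIRED_FILES if not (repo_root / rel).is_file()]


def python_status(version: tuple[int, int]) -> str:
    """Classify a (major, minor) Python version against the README guidance."""
    if version in ((3, 11), (3, 12)):
        return "recommended"
    if version >= (3, 11):
        return "supported by the base package; the STT stack may be less smooth"
    return "below the 3.11 minimum"


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    for rel in missing_setup_files(Path.cwd()):
        print(f"missing: {rel} (run scripts/setup-voice.ps1 first)")
    print("python:", python_status(sys.version_info[:2]))
```
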
cd backend
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -e .[stt]
copy .env.example .env
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
Notes:
- faster-whisper is used for local STT.
- pyttsx3 is the default offline TTS backend on Windows.
- LM Studio should expose its local server on http://127.0.0.1:1234.
- Ollama should run on http://127.0.0.1:11434.
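
The default endpoints in the notes above can be probed before launching the app. This is a rough sketch rather than anything the backend ships: `is_reachable` and `MODEL_HOSTS` are hypothetical names, and the request paths assume the model-listing routes of each host (`/v1/models` on LM Studio's OpenAI-compatible server, `/api/tags` on Ollama).

```python
import http.client
import urllib.request

# Default local endpoints from the notes above, plus each host's
# model-listing route (an assumption about which route to probe).
MODEL_HOSTS = {
    "LM Studio": "http://127.0.0.1:1234/v1/models",
    "Ollama": "http://127.0.0.1:11434/api/tags",
}


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to url answers with HTTP 200 within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, timeouts, and refused connections are all
        # OSError subclasses; malformed replies raise HTTPException.
        return False


if __name__ == "__main__":
    for name, url in MODEL_HOSTS.items():
        state = "up" if is_reachable(url) else "not reachable"
        print(f"{name}: {state} ({url})")
```

A host reported as not reachable usually just means its local server has not been started yet.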
cd client
npm install
npm run dev
The client runs on http://127.0.0.1:5173 by default and proxies API/WebSocket traffic to the backend.
To expose the web app to other devices on your LAN or tailnet, run the backend on 127.0.0.1:8001 as usual and start the client dev server on all interfaces:
cd client
npm run dev -- --host 0.0.0.0 --port 5173
The browser client now uses same-origin /api and /ws paths by default, so remote devices will talk to the backend through the Vite proxy instead of trying to reach their own 127.0.0.1.
For Tailscale, the clean path is to expose the client with:
tailscale serve --bg 5173
Then open the https://<your-device>.<tailnet>.ts.net URL from another Tailscale device.
Notes:
- Vite now accepts .ts.net hostnames for Tailscale access.
- Plain LAN HTTP can expose the page, but browser microphone access may still require a secure context. localhost works, and Tailscale HTTPS works reliably for voice input.
- A one-command launcher is available:
  powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\start-voice-network.ps1
- Stop everything with:
  powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\stop-voice-network.ps1
Set VOICE_APP_OPENAI_API_KEY in backend/.env to enable the OpenAI Realtime session endpoint.
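
To check whether boost mode is configured without starting the backend, the .env file can be inspected directly. The sketch below is an illustration only: `parse_env` and `boost_mode_configured` are hypothetical helpers, the parser handles only simple KEY=VALUE lines, and the backend's own settings loader may behave differently.

```python
from pathlib import Path

KEY_NAME = "VOICE_APP_OPENAI_API_KEY"


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def boost_mode_configured(env_text: str) -> bool:
    """Return True when the OpenAI key is present and non-empty."""
    return bool(parse_env(env_text).get(KEY_NAME))


if __name__ == "__main__":
    # Assumption: run from the repository root.
    env_path = Path("backend/.env")
    text = env_path.read_text(encoding="utf-8") if env_path.is_file() else ""
    print("boost mode configured:", boost_mode_configured(text))
```
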