Voice Testing

Local-first realtime voice assistant with:

Python backend for local speech orchestration, memory, and model-host adapters
React browser client for mic capture, live transcript, and OpenAI WebRTC mode
Support for both LM Studio and Ollama in local generation mode
Optional OpenAI Realtime boost mode

For a fuller explanation of the project, see docs/manual.md.

Interactive setup

Use the Windows setup script to create backend/.env and backend/data/runtime_config.json before first launch:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\setup-voice.ps1

Optional flags for scripted runs:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\setup-voice.ps1 -UseDefaults -SkipModelPull

The setup script currently provisions Ollama as the local model host, writes the app config files, and can optionally capture an OpenAI API key for boost mode.
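Before first launch, it can help to confirm that both config files named above exist. The following is a minimal sketch, assuming it is run from the repository root; the helper name `missing_setup_files` is hypothetical and not part of the project.

```python
from pathlib import Path

# Files the setup script is described as creating, relative to the repo root.
EXPECTED_FILES = ("backend/.env", "backend/data/runtime_config.json")


def missing_setup_files(repo_root="."):
    """Return the expected setup files that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_setup_files()
    if missing:
        print("Missing, rerun the setup script:", ", ".join(missing))
    else:
        print("Setup files found.")
```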

Backend

Recommended runtime: Python 3.11 or 3.12. The base backend package targets Python 3.11+, but the local STT stack runs most smoothly on those two versions.

cd backend
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -e .[stt]
copy .env.example .env
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000

Notes:

faster-whisper is used for local STT.
pyttsx3 is the default offline TTS backend on Windows.
LM Studio should expose its local server on http://127.0.0.1:1234.
Ollama should run on http://127.0.0.1:11434.
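A quick way to check that either model host is listening on the default address listed above is to request its model-listing route. This is a rough sketch rather than part of the backend: it assumes LM Studio's OpenAI-compatible `/v1/models` route and Ollama's `/api/tags` route, and the helper name `is_reachable` is hypothetical.

```python
import http.client
import urllib.error
import urllib.request

# Default local endpoints from the notes above, plus each host's
# model-listing route (assumed here: /v1/models and /api/tags).
PROBE_URLS = {
    "LM Studio": "http://127.0.0.1:1234/v1/models",
    "Ollama": "http://127.0.0.1:11434/api/tags",
}


def is_reachable(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or malformed URL.
        return False


if __name__ == "__main__":
    for name, url in PROBE_URLS.items():
        state = "reachable" if is_reachable(url) else "not reachable"
        print(f"{name}: {state} ({url})")
```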

Client

cd client
npm install
npm run dev

The client runs on http://127.0.0.1:5173 by default and proxies API/WebSocket traffic to the backend.

Remote access

To expose the web app to other devices on your LAN or tailnet, run the backend on 127.0.0.1:8000 as usual and start the client dev server on all interfaces:

cd client
npm run dev -- --host 0.0.0.0 --port 5173

The browser client uses same-origin /api and /ws paths by default, so remote devices reach the backend through the Vite proxy instead of trying to contact their own 127.0.0.1.
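To illustrate what same-origin means here, the sketch below derives the API and WebSocket URLs a browser would use from the address of the page it loaded. It is written in Python purely for illustration (the real client runs in the browser), and the function name `same_origin_endpoints` and the example LAN address are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def same_origin_endpoints(page_url):
    """Build the /api and /ws URLs that share the page's host and port."""
    parts = urlsplit(page_url)
    # WebSocket URLs use wss on HTTPS pages and ws on plain HTTP pages.
    ws_scheme = "wss" if parts.scheme == "https" else "ws"
    return {
        "api": urlunsplit((parts.scheme, parts.netloc, "/api", "", "")),
        "ws": urlunsplit((ws_scheme, parts.netloc, "/ws", "", "")),
    }


if __name__ == "__main__":
    # Example LAN address; substitute the address of the machine running Vite.
    print(same_origin_endpoints("http://192.168.1.20:5173/"))
```

Because both URLs point back at the dev server's host and port, the remote device never needs a direct route to the backend's loopback address.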

For Tailscale, the clean path is to expose the client with:

tailscale serve --bg 5173

Then open the https://<your-device>.<tailnet>.ts.net URL from another Tailscale device.

Notes:

Vite now accepts .ts.net hostnames for Tailscale access.
Plain LAN HTTP can serve the page, but browsers generally allow microphone access only in a secure context. http://localhost qualifies, and Tailscale HTTPS works reliably for voice input.
A one-command launcher is available:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\start-voice-network.ps1

Stop everything with:

powershell -NoProfile -ExecutionPolicy Bypass -File .\scripts\stop-voice-network.ps1
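The secure-context note above can be approximated in code. This sketch is a simplification of the browser rule, assuming that only HTTPS pages and loopback hosts over HTTP count as secure; the function name `likely_secure_context` and the example hostnames are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def likely_secure_context(page_url):
    """Approximate the browser rule: HTTPS, or a loopback host over HTTP."""
    parts = urlsplit(page_url)
    if parts.scheme == "https":
        return True
    if parts.scheme != "http":
        return False
    host = (parts.hostname or "").lower()
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so a non-loopback hostname over plain HTTP.
        return False


if __name__ == "__main__":
    for url in (
        "http://127.0.0.1:5173",
        "http://192.168.1.20:5173",
        "https://my-device.example.ts.net",
    ):
        print(url, likely_secure_context(url))
```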

OpenAI boost mode

Set VOICE_APP_OPENAI_API_KEY in backend/.env to enable the OpenAI Realtime session endpoint.
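As a rough way to confirm the key is set, the sketch below scans dotenv-style text for a non-empty VOICE_APP_OPENAI_API_KEY entry. How the backend actually loads its settings is not described here, so the parsing logic and the helper names `read_env_value` and `boost_mode_configured` are assumptions made for illustration.

```python
from pathlib import Path


def read_env_value(env_text, key):
    """Return the value assigned to key in dotenv-style text, or None."""
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return value.strip().strip("\"'")
    return None


def boost_mode_configured(env_text):
    """True when VOICE_APP_OPENAI_API_KEY is present and non-empty."""
    return bool(read_env_value(env_text, "VOICE_APP_OPENAI_API_KEY"))


if __name__ == "__main__":
    env_path = Path("backend/.env")  # assumes the repository root as cwd
    text = env_path.read_text(encoding="utf-8") if env_path.is_file() else ""
    print("Boost mode key set:", boost_mode_configured(text))
```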
