A FastAPI-based TTS (Text-to-Speech) server that provides OpenAI-compatible API endpoints using KittenTTS. This server can be easily integrated with Open WebUI and other applications that support OpenAI's TTS API format.
Note: KittenTTS must be installed separately on your system.
- OpenAI-compatible TTS API endpoints
- Multiple voice options with voice mapping
- Fast and efficient speech synthesis using KittenTTS
- GPU acceleration for Apple Silicon and CUDA
- Configurable speech speed (0.25x to 4.0x)
- Health check and model status endpoints
- Easy integration with Open WebUI
- Python 3.8 or higher
- pip package manager
- Clone the repository:
  git clone https://github.com/drivenfast/kitten-tts-server
  cd kitten-tts-server
- Create and activate a virtual environment:
  python -m venv venv
  source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Install KittenTTS:
  # Follow KittenTTS installation instructions from their repository
  # This typically involves installing from source or using their provided wheels
- Start the server:
  python server.py
Or use the startup script:
chmod +x start_server.sh
./start_server.sh
The server will be available at http://localhost:8001
POST /v1/audio/speech
Request Body:
{
"model": "tts-1-hd",
"input": "Hello, this is a test of the KittenTTS server!",
"voice": "alloy",
"response_format": "mp3",
"speed": 2.0
}
Example using curl:
curl -X POST "http://localhost:8001/v1/audio/speech" \
-H "Content-Type: application/json" \
-d '{
"model": "tts-1-hd",
"input": "Hello world!",
"voice": "alloy",
"speed": 1.0
}' \
--output speech.wav
GET /v1/models
GET /v1/audio/voices
GET /health
The server maps OpenAI-compatible voice names to KittenTTS voices:
| OpenAI Voice | KittenTTS Voice | Description |
|---|---|---|
| alloy | expr-voice-5-m | Male voice |
| echo | expr-voice-2-m | Male voice |
| fable | expr-voice-3-f | Female voice |
| onyx | expr-voice-4-m | Male voice |
| nova | expr-voice-5-f | Female voice |
| shimmer | expr-voice-2-f | Female voice |
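The speech endpoint and the voice names in the table can also be exercised from Python. The following is a minimal client sketch using only the standard library, not part of the server itself. The function names `build_speech_request` and `synthesize` are hypothetical, and the client-side checks on voice and speed are an assumption that mirrors the documented voice table and the 0.25x to 4.0x speed range; the server may validate differently.

```python
import json
import urllib.request

# Voice names copied from the mapping table above.
VOICE_MAP = {
    "alloy": "expr-voice-5-m",
    "echo": "expr-voice-2-m",
    "fable": "expr-voice-3-f",
    "onyx": "expr-voice-4-m",
    "nova": "expr-voice-5-f",
    "shimmer": "expr-voice-2-f",
}


def build_speech_request(text, voice="alloy", speed=1.0,
                         response_format="mp3",
                         base_url="http://localhost:8001/v1"):
    """Build (but do not send) a POST request for the speech endpoint."""
    # Assumption: reject values outside the documented ranges before sending.
    if voice not in VOICE_MAP:
        raise ValueError(f"unknown voice {voice!r}; expected one of {sorted(VOICE_MAP)}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    payload = {
        "model": "tts-1-hd",
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    return urllib.request.Request(
        f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path, **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return len(audio)
```

With the server running, calling `synthesize("Hello world!", "speech.mp3", voice="shimmer")` should write the generated audio to `speech.mp3` and return the number of bytes written.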
- Start the KittenTTS server:
  python server.py
- Configure Open WebUI:
  - Go to Settings → Audio
  - Set TTS Engine to "OpenAI"
  - Set API Base URL to: http://localhost:8001/v1
  - Leave API Key empty (not required)
  - Enter one of the OpenAI voice names from the mapping table (e.g. shimmer) in the TTS Voice field
  - Leave the TTS Model field as tts-1-hd
- Test the integration:
  - Try using TTS in Open WebUI chat
  - The server logs will show generation requests
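Before pointing Open WebUI at the server, it can help to confirm that the documented endpoints respond. The sketch below probes `/health`, `/v1/models`, and `/v1/audio/voices` and reports the HTTP status of each. The helper names `http_status` and `check_server` are hypothetical, and the sketch makes no assumption about the shape of the response bodies.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def check_server(base_url="http://localhost:8001", fetch=http_status):
    """Probe the endpoints listed in this README and map each path to a status."""
    paths = ["/health", "/v1/models", "/v1/audio/voices"]
    return {path: fetch(base_url + path) for path in paths}
```

Calling `check_server()` returns a dictionary keyed by path. A value of `None` for every path usually means the server is not running or is listening on a different port.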
- KITTENTTS_HOST: Server host (default: "0.0.0.0")
- KITTENTTS_PORT: Server port (default: 8001)
- KITTENTTS_LOG_LEVEL: Logging level (default: "info")
- KITTENTTS_USE_GPU: Enable GPU acceleration (default: "true")
- KITTENTTS_GPU_PROVIDER: GPU provider preference (default: "auto")
- KITTENTTS_ONNX_THREADS: ONNX Runtime threads (default: 0 = auto)
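As an illustration of how these variables and their defaults fit together, here is a sketch that reads them into a typed settings object. It is not the project's actual `config.py`: the `Settings` class and `load_settings` function are hypothetical, and the rule for parsing the boolean is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    log_level: str
    use_gpu: bool
    gpu_provider: str
    onnx_threads: int


def load_settings(env=None):
    """Read the KITTENTTS_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: "1", "true" and "yes" (any case) count as enabled.
    use_gpu = env.get("KITTENTTS_USE_GPU", "true").strip().lower() in {"1", "true", "yes"}
    return Settings(
        host=env.get("KITTENTTS_HOST", "0.0.0.0"),
        port=int(env.get("KITTENTTS_PORT", "8001")),
        log_level=env.get("KITTENTTS_LOG_LEVEL", "info"),
        use_gpu=use_gpu,
        gpu_provider=env.get("KITTENTTS_GPU_PROVIDER", "auto"),
        onnx_threads=int(env.get("KITTENTTS_ONNX_THREADS", "0")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out, for example `load_settings({"KITTENTTS_PORT": "9000"})`.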
The server automatically detects and uses GPU acceleration when available:
Apple Silicon (M1/M2/M3/M4):
- Uses CoreML execution provider for GPU/Neural Engine acceleration
- Automatically enabled on macOS with Apple Silicon
NVIDIA CUDA:
- Uses CUDA execution provider when CUDA is available
- Requires CUDA runtime and ONNX Runtime GPU package
Intel/AMD Systems:
- Falls back to CPU execution with optimized threading
- Can use OpenVINO if available
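The detection order described above can be sketched as a small pure function. This is an illustration under stated assumptions, not the server's actual logic: `choose_providers` is a hypothetical name, the preference values (auto, coreml, cuda, cpu) follow those accepted by KITTENTTS_GPU_PROVIDER, and OpenVINO is left out for brevity. The provider strings are the standard ONNX Runtime execution provider names, and the `available` list would typically come from `onnxruntime.get_available_providers()`.

```python
def choose_providers(preference, available, use_gpu=True):
    """Return ONNX Runtime execution providers in priority order.

    preference: "auto", "coreml", "cuda", or "cpu"
    available:  provider names reported by the installed ONNX Runtime build
    """
    cpu = "CPUExecutionProvider"
    gpu_by_name = {
        "coreml": "CoreMLExecutionProvider",
        "cuda": "CUDAExecutionProvider",
    }
    if not use_gpu or preference == "cpu":
        return [cpu]
    if preference == "auto":
        candidates = list(gpu_by_name.values())
    elif preference in gpu_by_name:
        candidates = [gpu_by_name[preference]]
    else:
        raise ValueError(f"unknown provider preference: {preference!r}")
    # Keep only the accelerated providers this machine actually offers.
    chosen = [name for name in candidates if name in available]
    return chosen + [cpu]  # always keep CPU as the final fallback
```

Keeping the CPU provider at the end of the list means a requested but unavailable GPU provider degrades to CPU execution instead of failing.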
Configuration Options:
# Enable/disable GPU acceleration
export KITTENTTS_USE_GPU=true
# Force specific provider (auto, coreml, cuda, cpu)
export KITTENTTS_GPU_PROVIDER=auto
# Set number of CPU threads (0 = auto-detect)
export KITTENTTS_ONNX_THREADS=4
Check GPU Status:
curl http://localhost:8001/gpu/status
You can modify the voice mapping and other settings by editing the config.py file.
# Build the Docker image
docker build -t kittentts-server .
# Run the container
docker run -p 8001:8001 kittentts-server
docker-compose up -d
kittentts-server/
├── server.py              # Main FastAPI server
├── config.py              # Configuration settings
├── requirements.txt       # Python dependencies
├── start_server.sh        # Startup script
├── Dockerfile             # Docker configuration
├── docker-compose.yml     # Docker Compose configuration
├── tests/                 # Test files
│   ├── test_api.py
│   └── test_integration.py
└── docs/                  # Additional documentation
    ├── api.md
    └── deployment.md
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/
# Run with coverage
pytest tests/ --cov=. --cov-report=html
# Format code with black
black .
# Sort imports
isort .
# Lint with flake8
flake8 .
- KittenTTS not found:
  - Ensure KittenTTS is properly installed in your environment
  - Check that all dependencies are installed
- Audio format issues:
  - The server currently supports WAV and MP3 formats
  - MP3 support may require additional audio codecs
- Port already in use:
  - Change the port in config.py or set the KITTENTTS_PORT environment variable
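To confirm whether the port conflict above is the actual problem, a short standard-library check can tell if something is already listening on the port. This is a sketch; `port_in_use` is a hypothetical helper and is not part of the server.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. a listener exists.
        return probe.connect_ex((host, port)) == 0
```

If `port_in_use(8001)` returns `True` before the server has been started, another process holds the port and a different value for KITTENTTS_PORT is needed.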
Server logs are output to the console. For production deployments, consider using a proper logging configuration.
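One possible starting point for such a configuration is Python's standard `logging.config.dictConfig`. The sketch below keeps console output and optionally adds a rotating log file. The function name `build_logging_config`, the logger name `kittentts`, and the rotation sizes are all assumptions chosen for illustration; how the server wires up its own logging is not shown here.

```python
import logging
import logging.config


def build_logging_config(level="info", log_file=None):
    """Return a dictConfig-style logging configuration.

    The rotating file handler is only added when log_file is given.
    """
    handlers = {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    }
    if log_file:
        handlers["file"] = {
            "class": "logging.handlers.RotatingFileHandler",
            "formatter": "plain",
            "filename": log_file,
            "maxBytes": 5_000_000,  # rotate after roughly 5 MB (assumed value)
            "backupCount": 3,       # keep three old files (assumed value)
        }
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
        },
        "handlers": handlers,
        "root": {"level": level.upper(), "handlers": list(handlers)},
    }


# Apply a console-only configuration and emit one message through it.
logging.config.dictConfig(build_logging_config("info"))
logging.getLogger("kittentts").info("logging configured")
```

The `level` argument accepts the same lowercase names used by KITTENTTS_LOG_LEVEL, since it is upper-cased before being handed to the logging module.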
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- KittenTTS for the excellent TTS engine
- FastAPI for the web framework
- Open WebUI for TTS integration support
Note: This server requires KittenTTS to be installed separately. Please refer to the KittenTTS documentation for installation instructions specific to your system.