Generate MP4 videos from MP3 voice-overs — with real transcription and burned-in subtitles.

    ┌──────────────────┐       ┌────────────────────┐
    │  React + Vite    │──────▶│  FastAPI Backend   │
    │  (port 5173)     │ proxy │  (port 8000)       │
    └──────────────────┘       └────────┬───────────┘
                                        │
                               ┌────────▼───────────┐
                               │ Background Worker  │
                               │                    │
                               │ 1. Whisper (STT)   │
                               │ 2. FFmpeg (video)  │
                               │ 3. Burn subtitles  │
                               └────────┬───────────┘
                                        │
                               ┌────────▼───────────┐
                               │ SQLite + Local FS  │
                               └────────────────────┘

- Drag & drop audio upload (MP3, WAV, M4A, OGG, FLAC, AAC)
- Real-time job tracking with progress bar and pipeline logs
- Local Whisper transcription — no API keys needed
- Word-level SRT subtitles burned into video
- Audio-reactive waveform video background
- Fallback mode — static subtitles if transcription fails
- SQLite persistence — jobs survive server restarts
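To illustrate the word-level SRT feature, here is a small sketch that turns word timings into SRT cues. The input shape, a list of `(text, start_seconds, end_seconds)` tuples, is an assumption made for the example, and the function names are hypothetical; the project's actual conversion code may look different.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def words_to_srt(words):
    """Build one numbered SRT cue per (text, start, end) tuple."""
    cues = []
    for index, (text, start, end) in enumerate(words, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line, as the SRT format expects.
    return "\n".join(cues)
```

For example, `words_to_srt([("Hello", 0.0, 0.4), ("world", 0.4, 0.9)])` yields two cues, the first spanning `00:00:00,000 --> 00:00:00,400`.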
Python: download from https://www.python.org/downloads/
Node.js: download from https://nodejs.org/

FFmpeg on Windows (recommended, using winget):

    winget install Gyan.FFmpeg

Windows (alternative, using Chocolatey):

    choco install ffmpeg

Windows (manual):

- Download from https://www.gyan.dev/ffmpeg/builds/
- Extract to `C:\ffmpeg`
- Add `C:\ffmpeg\bin` to your system PATH

Verify installation:

    ffmpeg -version

Backend setup:

    cd backend
    python -m venv venv
    .\venv\Scripts\Activate.ps1
    pip install -r requirements.txt

Frontend setup:

    cd frontend
    npm install

Terminal 1 (backend):

    cd backend
    .\venv\Scripts\Activate.ps1
    uvicorn main:app --reload --port 8000

Terminal 2 (frontend):

    cd frontend
    npm run dev

Navigate to http://localhost:5173
- Upload an MP3 file (max 50 MB, up to 2 minutes long)
- Click Generate Video
- Watch the pipeline progress in real time
- Download the final MP4 when complete
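The upload limits mentioned in these steps could be checked with something like the sketch below. It is an illustration only: `validate_upload` is a hypothetical helper, "50 MB" is assumed to mean 50 × 1024 × 1024 bytes, and the audio duration is assumed to have been measured elsewhere before the check runs.

```python
from pathlib import Path

# Formats listed in the feature list; limits taken from the usage steps.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".aac"}
MAX_FILE_BYTES = 50 * 1024 * 1024   # assumption: 50 MB interpreted as 50 MiB
MAX_DURATION_SECONDS = 120

def validate_upload(filename, size_bytes, duration_seconds):
    """Return a list of problems; an empty list means the upload is acceptable."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file type")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 50 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio longer than 2 minutes")
    return problems
```

Returning every problem at once, rather than failing on the first, lets a UI show the user all the reasons an upload was rejected.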
Edit backend/config.py:
| Setting | Default | Description |
|---|---|---|
| `WHISPER_MODEL` | `base` | Whisper model size (tiny/base/small) |
| `VIDEO_WIDTH` | `1280` | Output video width |
| `VIDEO_HEIGHT` | `720` | Output video height |
| `MAX_AUDIO_DURATION` | `120` | Max audio length in seconds |
| `FONT_SIZE` | `28` | Subtitle font size |
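
As a rough picture of how these settings might be organized in code, the sketch below collects the defaults from the table and lets `WHISPER_MODEL` be overridden from the environment. This is a hypothetical layout, not the contents of the real `backend/config.py`; `DEFAULTS` and `load_settings` are names invented for the example.

```python
import os

# Defaults copied from the settings table.
DEFAULTS = {
    "WHISPER_MODEL": "base",
    "VIDEO_WIDTH": 1280,
    "VIDEO_HEIGHT": 720,
    "MAX_AUDIO_DURATION": 120,  # seconds
    "FONT_SIZE": 28,
}

def load_settings(env=None):
    """Return the defaults, with WHISPER_MODEL taken from the environment if set."""
    env = os.environ if env is None else env
    settings = dict(DEFAULTS)
    settings["WHISPER_MODEL"] = env.get("WHISPER_MODEL", DEFAULTS["WHISPER_MODEL"])
    return settings
```

Passing a plain dictionary as `env` makes the override easy to exercise without touching the real process environment.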
You can also set the Whisper model via an environment variable. In Command Prompt:

    set WHISPER_MODEL=small

In PowerShell (the shell used for the venv activation script):

    $env:WHISPER_MODEL = "small"

API endpoints:

| Method | Path | Description |
|---|---|---|
| POST | `/api/jobs/upload` | Upload audio, create job |
| GET | `/api/jobs/{id}` | Get job status |
| GET | `/api/jobs/{id}/download` | Download generated MP4 |
| GET | `/api/jobs/` | List all jobs |
| GET | `/api/health` | Health check |
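
A client could poll the job status endpoint roughly as sketched below. The response schema is not documented here, so the code makes a loud assumption: that `GET /api/jobs/{id}` returns a JSON object with a `status` field whose terminal values are `"completed"` and `"failed"`. The helper names `get_json` and `wait_for_job` are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000"  # backend port from the architecture diagram

def get_json(path):
    """GET a path on the backend and decode the JSON body."""
    with urllib.request.urlopen(BASE_URL + path) as response:
        return json.load(response)

def wait_for_job(job_id, fetch=get_json, interval=1.0, max_polls=120):
    """Poll GET /api/jobs/{id} until the job reports a terminal status.

    ASSUMPTION: the JSON body has a "status" field, and "completed" or
    "failed" mark the end of processing. Adjust to the real schema.
    """
    for _ in range(max_polls):
        job = fetch(f"/api/jobs/{job_id}")
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

The `fetch` parameter exists so the polling logic can be tried against a fake function without a running server.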
- Frontend: React 18 + Vite
- Backend: Python FastAPI + Uvicorn
- Transcription: OpenAI Whisper (local)
- Video: FFmpeg (H.264 + AAC)
- Database: SQLite (WAL mode)
- Worker: Background thread (daemon)
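
The SQLite (WAL mode) entry can be illustrated with a short sketch using Python's built-in `sqlite3` module. The table layout (`id`, `status`, `progress`) and the helper names `open_job_db` and `set_status` are assumptions made for the example; the project's real schema is not shown in this README.

```python
import sqlite3

def open_job_db(path):
    """Open (or create) the job database with write-ahead logging enabled."""
    conn = sqlite3.connect(path)
    # WAL lets readers keep working while the worker writes updates.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "id TEXT PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "progress INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn

def set_status(conn, job_id, status, progress):
    """Insert or overwrite a job row so its state survives a restart."""
    conn.execute(
        "INSERT OR REPLACE INTO jobs (id, status, progress) VALUES (?, ?, ?)",
        (job_id, status, progress),
    )
    conn.commit()
```

Because the rows live in a file on disk rather than in process memory, reopening the same path after a restart returns the previously stored job state.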