yt_scribe.py is a command-line tool that downloads audio from YouTube videos, transcribes the audio using OpenAI's Whisper model, and exports both the transcription and relevant metadata as files.
- Audio Download: Extracts audio from YouTube videos.
- Automatic Transcription: Transcribes audio using Whisper with GPU/CPU support.
- Language Auto-Detection: Detects language automatically if not specified.
- Metadata Export: Saves video metadata (title, channel, publish date, and detected language) to a JSON file.
- Customizable Output: Configurable model size, language, and output directory.
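
The download and transcription features above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper names (`build_ydl_opts`, `pick_device`, `download_audio`, `transcribe`), the MP3 codec, and the ID-based file naming are all assumptions. It also assumes `ffmpeg` is available for audio extraction.

```python
from pathlib import Path


def build_ydl_opts(output_dir: str) -> dict:
    """Return yt-dlp options that keep only the audio track (hypothetical helper)."""
    return {
        "format": "bestaudio/best",
        # Assumption: name the file after the video ID to avoid odd characters.
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        # Assumption: convert to MP3; requires ffmpeg on the PATH.
        "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
        "quiet": True,
    }


def pick_device() -> str:
    """Prefer a CUDA GPU when PyTorch can see one, otherwise fall back to CPU."""
    import torch  # imported lazily so the option helper works without torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def download_audio(url: str, output_dir: str):
    """Download one video's audio and return (audio_path, info_dict)."""
    import yt_dlp

    with yt_dlp.YoutubeDL(build_ydl_opts(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
    return Path(output_dir) / f"{info['id']}.mp3", info


def transcribe(audio_path, model_size: str = "base", language=None):
    """Transcribe with Whisper; language=None lets Whisper detect the language."""
    import whisper

    model = whisper.load_model(model_size, device=pick_device())
    result = model.transcribe(str(audio_path), language=language)
    return result["text"], result["language"]
```

Keeping the heavy imports inside the functions is only a convenience for this sketch; the real script may import everything at module level.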
- Python 3.8+
- Libraries:
  - torch
  - whisper (installed from PyPI as `openai-whisper`)
  - yt-dlp
  - argparse (standard library)
  - json (standard library)
- Download the script.
- Install dependencies:
  `pip install torch openai-whisper yt-dlp`

`python yt_scribe.py -u "<YouTube_URL>" -o <output_directory>`

| Argument | Description | Default |
|---|---|---|
| `-u`, `--urls` | Comma-separated YouTube URLs or file path | Required |
| `-o`, `--output_dir` | Output directory | Current directory |
| `-m`, `--model_size` | Whisper model size (tiny, base, small, medium, large) | base |
| `-l`, `--language` | Language code (e.g., en, es) or auto-detection | Auto-detect |
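
For illustration, the arguments listed above could be declared with `argparse` along these lines. This is a sketch under assumptions rather than the script's actual code: `build_parser` and `resolve_urls` are hypothetical names, and the URL file is assumed to hold one URL per line.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Declare the four command-line options with their documented defaults."""
    parser = argparse.ArgumentParser(
        description="Download and transcribe YouTube audio."
    )
    parser.add_argument("-u", "--urls", required=True,
                        help="Comma-separated YouTube URLs or file path")
    parser.add_argument("-o", "--output_dir", default=".",
                        help="Output directory (default: current directory)")
    parser.add_argument("-m", "--model_size", default="base",
                        choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model size")
    # None stands for "auto-detect" in this sketch.
    parser.add_argument("-l", "--language", default=None,
                        help="Language code (e.g., en, es); omit to auto-detect")
    return parser


def resolve_urls(value: str) -> list:
    """Read URLs from a file if the value is an existing path, else split on commas."""
    if os.path.isfile(value):
        # Assumption: the file lists one URL per line.
        with open(value, encoding="utf-8") as handle:
            return [line.strip() for line in handle if line.strip()]
    return [part.strip() for part in value.split(",") if part.strip()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(resolve_urls(args.urls), args.output_dir, args.model_size, args.language)
```

Checking for an existing file before splitting on commas lets a single `-u` option cover both input styles.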

`python yt_scribe.py -u "https://www.youtube.com/watch?v=dQw4w9WgXcQ" -o transcriptions/ -m base -l en`

`python yt_scribe.py -u youtube_urls.txt -o transcriptions/`

- Transcription File: `<video_title>_transcription.txt`
- Metadata File: `<video_title>_metadata.json`
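
As a rough sketch of how the two output files might be written: the function names, the JSON key names, and the filename sanitizing rule below are assumptions, not taken from the script. `info` stands in for a yt-dlp info dictionary, which provides `title`, `channel`, and `upload_date` fields.

```python
import json
import re
from pathlib import Path


def safe_name(title: str) -> str:
    """Replace characters that are invalid in filenames on common platforms."""
    return re.sub(r'[\\/:*?"<>|]+', "_", title).strip()


def write_outputs(output_dir, info: dict, text: str, language: str):
    """Write <video_title>_transcription.txt and <video_title>_metadata.json."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = safe_name(info["title"])

    transcript_path = out / f"{stem}_transcription.txt"
    transcript_path.write_text(text, encoding="utf-8")

    # Assumed key names; the README only lists which fields are saved.
    metadata = {
        "title": info["title"],
        "channel": info.get("channel"),
        "publish_date": info.get("upload_date"),
        "detected_language": language,
    }
    metadata_path = out / f"{stem}_metadata.json"
    metadata_path.write_text(
        json.dumps(metadata, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return transcript_path, metadata_path
```

Passing `ensure_ascii=False` keeps non-English titles readable in the JSON file.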
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions, reach out by email.