Voice Transcriber

A VS Code extension that records your voice and transcribes it using OpenAI Whisper or a local Whisper-compatible API. It can optionally clean up the text with an LLM.

Features

Record audio directly in VS Code with real-time visualization
Upload audio or video files — audio track is extracted automatically (MP4, MKV, MOV, AVI, WebM)
Transcribe long recordings (1 hour or more), which are split into 10-minute chunks behind the scenes
Transcribe via OpenAI Whisper or your own local server
Clean up filler words and fix punctuation with an LLM (optional)
Keep your last 10 transcriptions
Auto-copy results to clipboard
Recover recordings if VS Code crashes

ffmpeg Installation (required for native recording and long files)

The extension uses ffmpeg for native recording, splitting long files, and extracting audio from video uploads. Without ffmpeg, recording falls back to the browser, which works but gives lower audio quality and does not support long files.

macOS:

brew install ffmpeg

Linux (Debian/Ubuntu):

sudo apt install ffmpeg

Windows:

winget install ffmpeg
# or
choco install ffmpeg

Usage

Click the microphone icon in the top-right of your editor
Set up your provider (OpenAI or local)
Hit "Start Recording" and speak
Hit "Stop" — text is automatically copied to clipboard

Configuration

OpenAI

Get an API key from platform.openai.com/api-keys, select "OpenAI" as provider, paste your key, and save.

Local server

Any Whisper-compatible API works:

faster-whisper-server
whisper.cpp server
Anything with a /v1/audio/transcriptions endpoint

Just enter the URL, e.g. http://localhost:8000/v1/audio/transcriptions.
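For anyone wiring up their own server, the request such an endpoint expects can be sketched roughly as below. This is not the extension's actual code: the helper names are hypothetical, the multipart field names (file, model, language) and the { text } JSON response follow the OpenAI-style transcription API, and the model name "whisper-1" is an assumption that a local server may ignore or replace. It relies on the global fetch, FormData, and Blob available in Node 18+.

```typescript
// Sketch only: helper names are hypothetical, not taken from the extension.
function buildTranscriptionForm(audio: Blob, filename: string, language?: string): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  // Assumed model name; a local server may expect a different value.
  form.append("model", "whisper-1");
  // Leave the language out to let the server auto-detect it.
  if (language) {
    form.append("language", language);
  }
  return form;
}

function transcribe(endpoint: string, form: FormData, apiKey?: string): Promise<string> {
  const headers: Record<string, string> = {};
  // Local servers usually need no key; OpenAI expects a Bearer token.
  if (apiKey) {
    headers["Authorization"] = "Bearer " + apiKey;
  }
  return fetch(endpoint, { method: "POST", headers: headers, body: form })
    .then(function (res) {
      if (!res.ok) {
        throw new Error("Transcription failed: HTTP " + res.status);
      }
      return res.json();
    })
    .then(function (data) {
      // The default JSON response carries the transcript in a "text" field.
      return String((data as { text: string }).text);
    });
}
```

With a server running locally, a call would look like transcribe("http://localhost:8000/v1/audio/transcriptions", buildTranscriptionForm(blob, "recording.webm")).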

LLM text cleanup

When using OpenAI, you can enable "Clean up text with LLM" to remove filler words, fix punctuation, and add paragraph breaks.

Models available: gpt-4o-mini (default, cheapest), gpt-4o, gpt-4-turbo, gpt-3.5-turbo.
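As an illustration of what the cleanup step might send, here is a minimal sketch of a Chat Completions request body. The prompt wording, the temperature of 0, and the helper name are assumptions made for this example; only the model names come from the list above.

```typescript
// Sketch only: the prompt text and helper name are hypothetical.
type CleanupModel = "gpt-4o-mini" | "gpt-4o" | "gpt-4-turbo" | "gpt-3.5-turbo";

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatRequest {
  model: CleanupModel;
  temperature: number;
  messages: ChatMessage[];
}

// Assumed instruction; the extension's real prompt may differ.
const CLEANUP_PROMPT =
  "Remove filler words, fix punctuation, and add paragraph breaks. " +
  "Do not change the meaning or add new content. Return only the cleaned text.";

function buildCleanupRequest(transcript: string, model: CleanupModel = "gpt-4o-mini"): ChatRequest {
  return {
    model: model,
    temperature: 0, // assumed: keeps the rewrite close to the original wording
    messages: [
      { role: "system", content: CLEANUP_PROMPT },
      { role: "user", content: transcript }
    ]
  };
}
```

The resulting object would be serialized with JSON.stringify and posted to https://api.openai.com/v1/chat/completions with the same API key used for transcription.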

Languages

Auto-detect or pick manually: English, Russian, Ukrainian, Spanish, French, German, Italian, Portuguese, Polish, Japanese, Korean, Chinese, and more.

Troubleshooting

Microphone access denied

macOS: System Settings → Privacy & Security → Microphone → enable VS Code → restart VS Code

Windows: Settings → Privacy (Privacy & security on Windows 11) → Microphone → allow app access

Linux: Check PulseAudio/PipeWire settings with pavucontrol and make sure no other app is blocking the mic

How to check logs

Command Palette (Ctrl+Shift+P / Cmd+Shift+P) → "Developer: Open Webview Developer Tools" → pick Voice Transcriber → Console tab

Transcription fails

Check your API key
For a local API, make sure the server is running and the URL is correct
Check your internet connection

Large files and video uploads

Recordings over 24 MB are transcoded to 128 kbps MP3 and split into 10-minute chunks, each transcribed separately and concatenated. Video uploads (MP4, MKV, MOV, AVI, WebM) have their audio track extracted automatically. Both features require ffmpeg.
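The splitting logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the extension's implementation: the helper names are hypothetical, the 24 MB threshold is taken as binary megabytes, and the ffmpeg argument list is one plausible way to cut a 128 kbps MP3 chunk.

```typescript
// Sketch only: names and the exact ffmpeg arguments are assumptions.
interface Chunk {
  index: number;
  startSec: number;
  durationSec: number;
}

const CHUNK_SECONDS = 600; // 10 minutes
const SIZE_LIMIT_BYTES = 24 * 1024 * 1024; // assumes "24 MB" means binary megabytes

function needsSplitting(fileSizeBytes: number): boolean {
  return fileSizeBytes > SIZE_LIMIT_BYTES;
}

// Divide a recording into consecutive chunks; the last one may be shorter.
function planChunks(totalSec: number, chunkSec: number = CHUNK_SECONDS): Chunk[] {
  const chunks: Chunk[] = [];
  if (totalSec <= 0 || chunkSec <= 0) {
    return chunks;
  }
  for (let start = 0, i = 0; start < totalSec; start += chunkSec, i++) {
    chunks.push({
      index: i,
      startSec: start,
      durationSec: Math.min(chunkSec, totalSec - start)
    });
  }
  return chunks;
}

// ffmpeg arguments for one chunk: seek, limit the duration, drop video, encode at 128 kbps.
function chunkArgs(input: string, chunk: Chunk, output: string): string[] {
  return [
    "-ss", String(chunk.startSec),
    "-t", String(chunk.durationSec),
    "-i", input,
    "-vn",
    "-b:a", "128k",
    output
  ];
}

// Join the per-chunk transcripts in order, skipping empty results.
function joinTranscripts(parts: string[]): string {
  return parts
    .map(function (p) { return p.trim(); })
    .filter(function (p) { return p.length > 0; })
    .join(" ");
}
```

For example, a 25-minute recording (1500 seconds) yields three chunks of 600, 600, and 300 seconds.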

Privacy

API keys are stored in VS Code's secure storage (system keychain)
Audio goes directly to OpenAI or your local API
Nothing is saved to disk

For Developers

Setup

npm install
npm run compile

Press F5 to launch the Extension Development Host.

Commands

npm run compile   # build once
npm run watch     # rebuild on changes

Publishing to VS Code Marketplace

Prerequisites

Microsoft account — account.microsoft.com
Azure DevOps org — dev.azure.com
Publisher ID — marketplace.visualstudio.com/manage

Get a Personal Access Token (PAT)

Go to dev.azure.com → profile → Personal access tokens → New Token
Organization: All accessible organizations
Scopes: Custom defined → Marketplace → Manage
Copy the token (shown only once)

Update package.json

{
  "publisher": "your-publisher-id",
  "icon": "resources/icon.png"
}

The icon must be a PNG of at least 128×128 pixels.

Publish

npm install -g @vscode/vsce
vsce login your-publisher-id
vsce publish

Update version

vsce publish patch  # 0.1.0 → 0.1.1
vsce publish minor  # 0.1.0 → 0.2.0
vsce publish major  # 0.1.0 → 1.0.0

Other useful commands

vsce package                      # create .vsix without publishing
vsce show publisher.extension     # show extension info
vsce unpublish publisher.extension  # remove from marketplace

License

MIT
