Download Google Gemini's "Listen" TTS audio responses as files.
No waiting through playback. No menu hunting. One click, file in Downloads.
Gemini's Listen feature is buried behind a 3-dot menu, and there's no built-in way to keep a copy of the audio without sitting through real-time playback. This Chrome extension fixes both:
- Adds a Download option right inside the 3-dot menu, next to Listen
- Pulls the audio directly out of Gemini's blob URL — typically saves in 2-3 seconds regardless of how long the clip would take to play
- Saves silently — no audio plays during capture
- Names files <conversation-title>-<message-index>.mp3 so they're sortable
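A naming scheme like this can be sketched as a small pure function. The version below is only an illustration: `buildFilename`, its sanitizing rules, the zero-padding, and the `gemini` fallback are assumptions, not the extension's actual filename module.

```javascript
// Hypothetical sketch of the <conversation-title>-<message-index>.mp3 scheme.
// The sanitizing rules, padding width, and fallback title are assumptions.
function buildFilename(conversationTitle, messageIndex) {
  const safeTitle = String(conversationTitle)
    .trim()
    .replace(/[\\\/:*?"<>|\u0000-\u001f]/g, "") // drop characters that are invalid in filenames
    .replace(/\s+/g, "-")                       // collapse whitespace runs into single hyphens
    .slice(0, 80);                              // keep the name reasonably short
  // Zero-pad the index so "02" sorts before "10" in a file manager.
  const index = String(messageIndex).padStart(2, "0");
  return `${safeTitle || "gemini"}-${index}.mp3`;
}
```

Under these assumptions, `buildFilename("Trip plan: Lisbon", 3)` yields `Trip-plan-Lisbon-03.mp3`.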
Drop a GIF or screenshot here once you record one.
┌──────────────────────────────┐
│ [3-dot menu opens]           │
│ ─────────────────────        │
│ 🔊 Listen                    │
│ ⬇️ Download  ← injected      │
│ 📋 Copy                      │
│ ⋮                            │
└──────────────────────────────┘
The extension is not (yet) on the Chrome Web Store. To install from source:
- Clone this repo:
    git clone https://github.com/zachisparanoid/GeminiTTS-Download.git
    cd GeminiTTS-Download
- Open chrome://extensions in Chrome
- Toggle Developer mode on (top-right)
- Click Load unpacked
- Select the extension/ directory inside this repo
- Visit https://gemini.google.com/, hover over an assistant message, and open the 3-dot menu. Download sits right next to Listen
For more detail, see extension/README.md.
┌─────────────────────────────────────────────────────────────┐
│  gemini.google.com                                          │
│                                                             │
│  ┌─────────────────┐   postMessage    ┌──────────────────┐  │
│  │ content script  │ ◄──────────────► │ page-world script│  │
│  │ (isolated world)│                  │ (main world)     │  │
│  └────────┬────────┘                  └────────┬─────────┘  │
│           │                                    │            │
│  - injects "Download"       - patches URL.createObjectURL   │
│  - tracks 3-dot clicks      - patches HTMLMediaElement.play │
│  - builds Blob locally      - mutes/pauses audio            │
│  - triggers <a download>    - posts captured bytes back     │
└─────────────────────────────────────────────────────────────┘
- The content script watches Gemini's DOM and injects a Download item into the per-message 3-dot menu.
- The page-world script monkey-patches URL.createObjectURL. The moment Gemini creates an object URL for a TTS Blob, we read its bytes via blob.arrayBuffer(), with no waiting for playback to finish.
- While capture is in flight, HTMLMediaElement.prototype.play is short-circuited so no audio actually plays.
- The bytes are passed back to the content script (via window.postMessage with a transferable ArrayBuffer), wrapped in a Blob, and triggered as a download via a hidden <a download> anchor click.
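The two page-world patches described above can be sketched roughly as follows. This is a minimal illustration, not the code in extension/injected.js: `installCapture`, its parameters, and the `state.capturing` flag are hypothetical names, and the real script also mutes or pauses audio, which is omitted here.

```javascript
// Sketch only: installCapture, its parameters, and the `state` flag are
// illustrative names, not the extension's real API.
function installCapture(urlApi, mediaProto, onBytes) {
  const state = { capturing: false };

  // Wrap createObjectURL: while a capture is pending, read the Blob's bytes
  // right away instead of waiting for playback.
  const originalCreate = urlApi.createObjectURL;
  urlApi.createObjectURL = function (obj) {
    const url = originalCreate.call(urlApi, obj);
    if (state.capturing && obj instanceof Blob) {
      obj.arrayBuffer()
        .then((buffer) => onBytes(buffer, obj.type))
        .catch(() => {}); // capture errors must never break the page
    }
    return url; // the page always gets its normal object URL back
  };

  // Wrap play(): during capture, resolve without starting playback.
  const originalPlay = mediaProto.play;
  mediaProto.play = function (...args) {
    if (state.capturing) return Promise.resolve();
    return originalPlay.apply(this, args);
  };

  return state;
}
```

In a page-world script this would presumably be wired up as `installCapture(URL, HTMLMediaElement.prototype, handler)`, with `state.capturing` switched on when the user clicks Download and off again once the bytes arrive.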
For implementation details, the source files are heavily commented — start with extension/content-script.js and extension/injected.js.
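As a rough companion to those files, the hand-off between the two worlds might look like the sketch below. The message type string, both function names, and the `audio/mpeg` default are assumptions made for illustration. The only API relied on is the standard `postMessage(message, targetOrigin, transfer)` signature.

```javascript
// Sketch only: MESSAGE_TYPE and both function names are hypothetical.
const MESSAGE_TYPE = "gemini-tts-captured-audio";

// Page-world side: hand the bytes over. Listing the buffer in the transfer
// list moves it to the receiver instead of copying it.
function postCapturedBytes(target, buffer, mimeType, origin) {
  target.postMessage({ type: MESSAGE_TYPE, mimeType, buffer }, origin, [buffer]);
}

// Content-script side: accept only messages from the same window that carry
// the expected type, then wrap the bytes in a Blob.
function listenForCapturedBytes(target, onAudio) {
  target.addEventListener("message", (event) => {
    if (event.source !== target) return;
    const data = event.data;
    if (!data || data.type !== MESSAGE_TYPE || !data.buffer) return;
    // "audio/mpeg" is an assumed default for when no MIME type is supplied.
    onAudio(new Blob([data.buffer], { type: data.mimeType || "audio/mpeg" }));
  });
}
```

In the browser both functions would be given `window` as `target`, with `window.location.origin` as the target origin so the message is not delivered to an unexpected origin.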
- Vanilla JavaScript, no bundler, no transpiler, no dependencies
- Manifest V3, content scripts only (no service worker — capture and save both happen in-page)
- Filename module unit tests via Node's built-in node:test
    cd extension
    npm test
host_permissions: https://gemini.google.com/* and nothing else.
No downloads, no tabs, no storage, no notifications. The download itself is triggered via a hidden <a download> element click, so no privileged API is required. No telemetry, no remote calls, no data leaving your machine.
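The anchor-click technique needs nothing beyond ordinary DOM calls, which is why no extra permission is involved. A minimal sketch, assuming a helper named `triggerDownload` (hypothetical, as are its parameters and the one-second revoke delay):

```javascript
// Sketch only: triggerDownload and its parameters are illustrative names.
// `doc` and `urlApi` default to the browser globals and can be swapped in tests.
function triggerDownload(blob, filename, doc = document, urlApi = URL) {
  const url = urlApi.createObjectURL(blob);
  const anchor = doc.createElement("a");
  anchor.href = url;
  anchor.download = filename; // asks the browser to save instead of navigating
  anchor.style.display = "none";
  doc.body.appendChild(anchor);
  anchor.click();
  anchor.remove();
  // Release the object URL after the browser has had time to start the download.
  // The 1000 ms delay is an arbitrary, assumed value.
  setTimeout(() => urlApi.revokeObjectURL(url), 1000);
}
```

Because the file is saved through the page's own download path, the browser decides the destination folder, normally the user's Downloads directory.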
- Gemini's DOM selectors are best-effort. If Google ships a redesign, the menu-injection selectors may need tweaks. They all live in one SELECTORS object at the top of extension/content-script.js.
- The capture path assumes Gemini delivers TTS via a blob: URL on an <audio> element. If that ever changes, the existing fetch interception is a fallback path.
- A 30-second timeout is enforced: if no audio response arrives within that window, the download is abandoned with a toast error.
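A timeout of this kind is commonly built by racing the pending capture against a timer. The helper below is a generic sketch of that pattern. `withTimeout` and its default message are hypothetical and may not match how the extension implements its 30-second limit.

```javascript
// Sketch only: withTimeout is an illustrative helper, not the extension's code.
function withTimeout(promise, ms, message = "Timed out waiting for audio") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  // Whichever promise settles first wins; the timer is always cleared afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A caller could then wrap whatever promise represents the pending capture, for example `withTimeout(capturePromise, 30000)`, and show the toast error in the rejection handler (`capturePromise` is a placeholder name).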
Issues and PRs welcome. The codebase is small (≈700 lines of JS total) and source files are commented to explain architectural choices.
If a Gemini DOM change breaks the extension, the fix is usually one entry in the SELECTORS object at the top of extension/content-script.js. Please open a PR.
MIT — © 2026 Zachary Winchester
