Roadmap

Not a promise list — a running record of what's in, what's queued, and what's explicitly out.

V1 (shipped)

Kokoro TTS with voice/speed/language config
faster-whisper STT with listen, listen --copy, listen --paste
Claude Code Stop hook with per-session pinning
readback CLI: say, stop, listen, voices, models, pin, unpin, list, config, install-hook, uninstall-hook, update-hook, doctor
Markdown / code-block / URL stripping before TTS
Config at ~/.config/readback/config
install.sh that doesn't touch existing installations elsewhere on disk
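
To make the "Markdown / code-block / URL stripping" item concrete, here is a minimal sketch of what a regex-based cleaner can look like. It is not readback's actual implementation: the function name `clean_for_tts` and every pattern below are assumptions chosen for illustration.

```python
import re

# Hypothetical patterns; readback's real stripper may differ.
FENCED_CODE = re.compile(r"`{3}.*?`{3}", re.DOTALL)      # whole fenced blocks
MD_LINK = re.compile(r"\[([^\]]+)\]\([^)]*\)")           # [label](target)
BARE_URL = re.compile(r"https?://\S+")
INLINE_CODE = re.compile(r"`([^`]*)`")
HEADING = re.compile(r"^[ ]{0,3}#{1,6}[ \t]+", re.MULTILINE)
# Asterisk emphasis only; underscores are left alone so snake_case survives.
EMPHASIS = re.compile(r"(\*\*|\*)(.+?)\1")


def clean_for_tts(text: str) -> str:
    """Reduce markdown to plain text that is reasonable to speak aloud."""
    text = FENCED_CODE.sub(" ", text)      # code blocks are dropped entirely
    text = MD_LINK.sub(r"\1", text)        # keep the link label, drop the target
    text = BARE_URL.sub(" ", text)         # bare URLs are not worth reading out
    text = INLINE_CODE.sub(r"\1", text)    # keep inline code text, drop backticks
    text = HEADING.sub("", text)           # drop leading '#' markers
    text = EMPHASIS.sub(r"\2", text)       # unwrap *italic* and **bold**
    return re.sub(r"\s+", " ", text).strip()
```

Order matters here: links are unwrapped before bare URLs are removed, so a link's label survives while its target does not. Nested or unbalanced markup will still slip through a cleaner of this kind.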

Queued for V1.1

Cursor hook support. Cursor uses a similar hook system to Claude Code. Should be a 50-line addition.
Codex CLI hook support. Same — another LLM CLI with a hook mechanism.
More voice presets. readback preset reading / preset glance / preset focus — named combinations of voice + speed + volume.
readback say --voice X --speed Y — per-invocation overrides without touching config.
Per-session config overrides. Pin session X with voice A, session Y with voice B.
A test suite. Right now there isn't one. A few bats-core tests for the CLI and pytest tests for the Python engines would cover most regressions.
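
The preset and per-invocation override items above imply a precedence order. A minimal sketch, assuming flags beat presets and presets beat the config file: the preset names come from this roadmap, but the `Settings` fields, the `resolve` helper, and all values (including the voice names) are placeholders, not anything readback ships.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Settings:
    voice: str
    speed: float
    volume: float


# Preset names are from the roadmap; the values are placeholders.
PRESETS = {
    "reading": Settings(voice="af_heart", speed=1.0, volume=0.8),
    "glance": Settings(voice="af_heart", speed=1.4, volume=0.8),
    "focus": Settings(voice="am_michael", speed=0.9, volume=0.6),
}


def resolve(base: Settings, preset: Optional[str] = None, **overrides) -> Settings:
    """Layer settings: config file, then preset, then per-invocation flags."""
    current = PRESETS[preset] if preset else base
    # Flags the user did not pass arrive as None and must not clobber values.
    given = {key: value for key, value in overrides.items() if value is not None}
    return replace(current, **given)
```

With this shape, something like `readback say --voice X --speed Y` would map to `resolve(config, voice=X, speed=Y)` and never write to the config file. Per-session overrides could be one more layer between the config and the preset.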

Considered for V2

Global push-to-talk hotkey for listen — press a key anywhere in macOS, speak, release, get the transcript pasted into the focused app. This is what Superwhisper spent years on. I'd need Hammerspoon or Karabiner integration. Unclear if it fits the "small and auditable" scope.
Streaming STT. Text appears as you speak instead of at the end of recording. faster-whisper yields segments lazily as it decodes, though live microphone input would still need audio chunking on top. Needs a UI decision: print partials to stderr? Update a TUI?
Smarter markdown cleaning. Current regex-based stripper misses some edge cases. A proper markdown parser would handle nested structures better.
Audio output routing. Pick which output device to use (for people with multiple speakers / headphones).
Whisper model auto-fallback. Try small, fall back to base, fall back to tiny if loading fails.
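
The auto-fallback idea can be sketched independently of the STT engine. The helper below is hypothetical (`load_with_fallback` is not an existing readback function) and takes the loader as a parameter so the fallback order can be exercised without downloading any model.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


def load_with_fallback(
    load: Callable[[str], T],
    sizes: Sequence[str] = ("small", "base", "tiny"),
) -> Tuple[str, T]:
    """Try each model size in order; return the first that loads, with its name."""
    errors = []
    for size in sizes:
        try:
            return size, load(size)
        except Exception as exc:  # loading can fail for many reasons (memory, download)
            errors.append(f"{size}: {exc}")
    # Every size failed: surface all reasons so `doctor`-style output can show them.
    raise RuntimeError("no Whisper model could be loaded: " + "; ".join(errors))
```

With faster-whisper, the loader could be something like `lambda size: WhisperModel(size, device="cpu", compute_type="int8")`. Returning the size that actually loaded lets the CLI tell the user which model they ended up with.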

Explicitly out of scope (probably forever)

GUI application. This is a CLI. Adding a GUI would change the maintenance burden, the audit surface, and the audience.
Proprietary cloud voices. ElevenLabs, Google TTS, Azure — all great, but they take audio off your machine. Not what this is for.
Full voice command system. "Open this file, scroll down, run the tests." That's Talon. Use Talon.
Windows support. The whole stack is macOS-shaped right now (pbcopy, osascript, Homebrew). Porting would double the maintenance surface.
Linux support. Would be valuable, but I don't daily-drive Linux. Accepting PRs from people who do.
Publishing as a Python package on PyPI. Would require more packaging ceremony. If the project grows, revisit.
A Homebrew formula. Same — nice-to-have but not worth the overhead for a single-user-count project.

How to suggest something

Open an issue on the repo or fork and send a PR. The bar for new features is: does it stay small, does it preserve the "read it in an afternoon" audit budget, and does it help the accessibility core use case?
