Deferred improvements. Not urgent, but worth capturing.
- Warn on very long recordings. Parakeet is optimized for 15-60s
utterances and degrades beyond ~5 min. Fire a
notify-sendwarning when a recording crosses ~3 min, and hard-stop + warn at ~10 min (model accuracy drops, inference memory balloons). - Recording-state indicator (optional). The system mic icon
(Plasma tray, GNOME top bar, macOS Control Center) already lights up
during capture, but a first-party cue would be nicer. On macOS:
switch the menu-bar
NSStatusItem.button.imageto a distinct variant in the CGEventTap's keyDown handler, revert on keyUp (template vs. non-template affects how macOS tints it in light/dark/tinted menu bars). On Linux: a small OSD overlay (Wispr-Flow-style) via a Qt/GTK app or a KDE widget. - Transcription history / undo. Keep the last N transcriptions so the user can paste an earlier one without re-dictating.
- Detect & handle more distros in install.sh. Currently: Fedora,
Debian/Ubuntu, Arch. Add openSUSE (
zypper), NixOS (flake), Alpine (apk) if requests come in. - Per-compositor input injection. We default to ydotool (works
everywhere via uinput). On wlroots/GNOME,
wtypeavoids needing a privileged daemon. Detect the compositor and prefer the lighter tool where available. - X11 fallback. We assume Wayland for
wl-copyandydotool. An X11 path would usexclip/xselandxdotool. Not a priority.
v0.1.0 ships prebuilt .deb and .rpm (both arm64 and amd64) as
GitHub release assets via nfpm + GitHub Actions, plus a one-liner
installer (scripts/install-release.sh). The remaining wins are
about discoverability and auto-updates:
- COPR repo (Fedora).
sudo dnf copr enable jguice/utter && sudo dnf install utteradds auto-updates viadnf. The spec file can mostly reuse the nfpm config. - Self-hosted apt repo on GitHub Pages (via
aptlyorreprepro) forsudo apt install utter+ auto-updates on Debian/Ubuntu. - Launchpad PPA (Ubuntu-only alternative to the apt repo).
- AUR package (Arch).
- Flatpak. Probably painful because of /dev/uinput and /dev/input access requirements — Flatpak sandboxes heavily restrict both.
- Bundle model as a GitHub release asset. ~650 MB fits under the 2 GB-per-asset limit. Acts as a mirror if the HF repo ever disappears.
- In-app update check. Utter daemon could compare its running
version to the latest GitHub release and
notify-sendwhen a new one is available. Pairs well with the COPR/apt repo once those land.
- Replace the
wl-pastepoll inemit_textwith the Wayland protocol viawl-clipboard-rs. Current flow spawnswl-copy --primaryas a subprocess, then pollswl-paste --primaryevery 10 ms until the primary selection matches our text (up to a 300 ms budget) before firing the Shift+Insert chord. That's observable and bounded — but it's still polling.wl-clipboard-rstalks the Wayland protocol directly:set_selection(primary, ...)returns only after the compositor has accepted the offer, which is the actual event we want to wait on. Trade-offs: adds a dep, requires the daemon to maintain a Wayland connection for the lifetime of the selection (wl-copy does this via a fork-to-background daemon today), and error handling gets more involved. Do this once we have another compositor-specific bug that pushes us toward direct-protocol access anyway. - Silence the
onnxruntime cpuid_info warningon every CLI invocation. On Apple Silicon / Asahi, ONNX Runtime printsUnknown CPU vendor. cpuinfo_vendor value: 0to stderr at shared-library load time — i.e. beforemain()runs, so it fires on everyutter --version,utter start,utter stop, etc. even though those subcommands don't need the model at all. Cleanest fix: split the binary so only thedaemonsubcommand linkstranscribe-rs/ort, and the short-lived subcommands are a thin shim that talks to the daemon over the socket. An intermediate fix: switchortto itsload-dynamicfeature (dlopen on first use) — the warning then only appears when the daemon actually loads the model. Needs a PR against transcribe-rs (or a feature flag) since it pins ort's features itself. - Hotplug support in the watcher. Currently enumerates
/dev/input/event*once at startup. If you plug in a new keyboard later, the watcher won't pick it up without a restart. Subscribe to udev events instead. - Graceful daemon shutdown. On SIGTERM, finish any in-flight transcription before exiting. Right now a restart mid-transcription drops the audio.
- Local-LLM cleanup (tier 2). Tier 1 (regex) ships today — a
token-scan drops fillers (uh/um/er/ah/erm/hmm), collapses 3+
same-word repetitions, and folds short-token stutters into the full
word that follows ("wh wh wh what" → "what"). Handles the obvious
cases with zero deps. Gated behind
UTTER_CLEANUP=1(default on). Tier 2: match OpenWhispr's quality by bundling a small instruct model (Qwen 2.5 3B or Llama 3.2 3B, q5_k_m) viallama-cpp-rsor a llama.cpp sidecar; apply OpenWhispr's cleanup prompt. Gate behindUTTER_LLM_CLEANUP=1+ model path. Adds ~2-4 GB + ~500 ms latency on a 3B model. Opt-in for users who want full LLM polish. - Initial prompt support. transcribe-rs doesn't expose Parakeet's initial-prompt feature yet; would help with domain vocabulary (code identifiers, proper nouns).
- Voice activity detection. Trim leading/trailing silence before feeding the model — small accuracy + speed win.
- Streaming mode. Parakeet TDT supports streaming; transcribe-rs doesn't wire it up yet. Would unlock real-time incremental output for long dictations.
- Re-introduce paste-method selection IF a real app breaks
Shift+Insert. The current default (Shift+Insert, primary
selection) covers every terminal + GTK/Qt input we've tested. The
ctrl-v/ctrl-shift-vbranches were removed because they were speculative — no app in our test set needed them. If a user reports a specific app where Shift+Insert fails, add a per-app or config-file override with that app documented as the motivation. - Per-app paste method. Different apps want different keystrokes (Claude Code wants Shift+Insert, Konsole wants Ctrl+Shift+V, most GUIs want Ctrl+V). Look up the focused window class via KWin D-Bus / wlroots toplevel-management and dispatch accordingly. Only worth building if the previous item turns up real-world breakage.