Local voice-to-text for Apple Silicon macOS — plus opt-in cloud captions when you want them.
Private by default. Memory-friendly. Steady Traditional Chinese output.
Canonical repository: github.com/felixfu824/HushType
HushType is a free, open-source, offline speech-to-text app for macOS and iOS. It uses Qwen3-ASR (macOS / iOS mirrors) running locally on Apple Silicon (MLX) to transcribe voice input in English, Chinese, and Japanese — including mixed-language sentences. Delivers steady Traditional Chinese (繁體中文) output via OpenCC. A privacy-first alternative to cloud dictation services and Whisper-based tools (higher RAM consumption, tendency to output Simplified Chinese), built to be lightweight enough to coexist with all the applications you're already running.
Private and local-first. Voice never leaves your machine. The speech model runs on your Mac's GPU — no cloud, no account, no telemetry. The only network call is the one-time model download (~675 MB); after that, fully offline. iPhone audio travels to YOUR Mac over Tailscale or local WiFi. Pre-download the model on another machine and HushType is fully air-gappable.
Memory-friendly — coexists with your agents. Built for how Apple Silicon owners actually use their Macs in 2026: two or three coding agents (Claude Code, Cursor, Codex), a browser, normal workload. Qwen3-ASR-0.6B 4-bit is ~675 MB on disk — less than half of Whisper Large v3 Turbo — yet competitive with models 3× its size. Small enough for an entry-level 8 / 16 GB M-series Mac. One click unloads it and frees ~2 GB; auto-reloads on the next Right ⌥ hold (~3s cold start).
Traditional Chinese that actually works. Whisper's single zh code defaults to Simplified with no reliable way to force Traditional; most open-source models mix Simplified or Mainland phrasing (软件, not 軟體). HushType chains Qwen3-ASR with OpenCC s2twp for Taiwan-specific output — 軟體, 滑鼠, 品質. Native EN / ZH code-switching in one pass (mix in a single sentence, get one clean transcription). Optional deterministic ITN converts Chinese numerals to Arabic digits in context (一零一大樓 → 101 大樓), on by default.
Live captions in two flavors — local for transcription, cloud for translation. New in v0.5. Live Caption runs your audio through the same on-device Qwen3-ASR pipeline as dictation and writes captions onto a floating panel — fully local, free, works on a plane. Live Translated Caption is the opt-in deviation from local-only: it streams audio directly from your Mac to OpenAI's latest gpt-realtime-translate model so you can subtitle a Japanese YouTube clip, a Korean podcast, or a Spanish-language meeting in English (or 13 other target languages) in real time.
When you turn it on, the key is yours, the audio is yours, the bill is yours. Your API key sits in plaintext at ~/Library/Application Support/HushType/openai.json (same security profile as .env). Audio streams Mac → OpenAI directly over WSS; HushType operates no servers, sees no audio, no key, no spend. Cost guardrails (auto-stop, daily cap) and a per-session cost chip are built in. Cloud mode never auto-starts — first use triggers a one-time disclosure, and the engine resets to local on every app launch.
| Feature | Default | Requirement |
|---|---|---|
| Hold Right ⌥ to dictate (macOS) | ON | macOS 15+ |
| Tap Right ⌥ to translate selected text | OFF | macOS 14+ |
| Live Caption (local, free) — floating panel from mic or system audio | OFF | macOS 15+ |
| Live Translated Caption (cloud, ~$2/hr) — real-time foreign-language subtitles via OpenAI | OFF (opt-in) | Your own OpenAI API key |
| Right ⌘ + / — toggle whichever caption mode you used last | — | macOS 15+ |
| EN / ZH / JA + native code-switching | ON | — |
簡體 → 繁體 post-processing (OpenCC s2twp) |
ON | — |
| 阿拉伯數字 conversion (deterministic ITN) | ON | — |
| AI Cleanup — filler removal, self-correction resolution | OFF (opt-in beta) | macOS 26 + Apple Intelligence |
| Customized dictionary (proper nouns / jargon) | File-driven | — |
| Floating "Listening / Transcribing" pill | ON | — |
| Unload speech-to-text model | One-click | — |
| iOS app + custom keyboard (Mac as server) | Optional | iOS 17+, Python on Mac |
Talking to AI agents. Giving Claude or ChatGPT a detailed prompt takes 5 minutes to type, 30 seconds to say. Hold Right ⌥, speak your entire prompt (mixing languages as needed), release — text appears in the chat input. Local transcription means your prompts never leave your machine even if you're driving cloud-hosted agents.
Voice notes on the go. On the subway, Mac at home. Tap "Start Listening" on iPhone, switch to Notes, tap the mic button on the HushType keyboard. Audio travels over Tailscale to your Mac, transcribes in ~1 second, text appears.
Reading in another language. Select any text in Safari, Mail, Notes — anywhere — and tap Right ⌥. A translucent card pops up with the translation via Apple's on-device Translation Framework. Auto-dismisses after 10s, pauses on hover. No API key, no cloud.
Watching foreign-language content. Korean drama, Japanese news, Spanish football commentary. Open the source in any app, click Live Translated Caption → From System Audio… in the menu bar, pick the app — translated English (or whichever target you set) streams onto a floating caption panel anchored at the bottom of your screen. Right ⌘ + / toggles it on and off. The original-language line shows above the translation as a confidence check; cost chip in the header tracks the session bill against your own OpenAI key.
macOS (standalone — zero network required):
Hold Right Option (≥0.3s) → speak → release → text at cursor
Tap Right Option (<0.3s) with text selected → translation card
Pipeline: mic → Qwen3-ASR (MLX, on-device) → OpenCC s2twp → ITN → paste
iOS (via your Mac as server):
Open HushType → Start Listening → switch to any app → HushType keyboard → tap mic
Pipeline: iPhone mic → WiFi/Tailscale → Mac server → Qwen3-ASR → OpenCC → result back → text inserted
┌──────────────────────────────────┐
│ Mac (Apple Silicon) │
┌──────────────┐ WiFi/Tailscale │ │
│ iPhone │ ──── HTTP POST ──►│ ios_server.py (port 8000) │
│ HushType KB │◄── JSON result ───│ ↓ │
└──────────────┘ │ mlx-audio (port 8199) │
│ → Qwen3-ASR 0.6B (MLX/Metal) │
│ → OpenCC s2twp │
│ │
│ HushType.app (menu bar) │
│ → Right Option hotkey │
│ → Local transcription │
└──────────────────────────────────┘
- Download
HushType.dmgfrom the latest release - Open the DMG and drag HushType to Applications
- Right-click HushType.app → Open (required on first launch — the app is ad-hoc signed, not notarized)
- Grant Accessibility and Microphone permissions when prompted
- Wait for the model to download (~675 MB, one-time, progress shown in menu bar)
The DMG is self-contained — OpenCC and all dependencies are bundled. No Homebrew, no terminal commands.
iOS server support: The DMG also includes the iOS server toggle in the menu bar. It requires Python 3 and additional packages to be installed separately — see the iOS setup guide below. If dependencies are missing, the app will show an error with the exact
pip3 installcommand needed.
See Prerequisites and macOS Setup Guide below.
Updating means replacing the .app bundle. Preferences, the ASR model, and user data live outside the bundle and are preserved.
DMG: quit HushType, drag the new HushType.app onto the Applications shortcut in the new DMG (click Replace), relaunch from Spotlight.
From source: git pull && make install.
Permission re-grant: because HushType is ad-hoc signed, macOS may require Accessibility to be enabled again after an update. The setup window will show the current permission state. Click Open System Settings, enable HushType in Accessibility, then click Restart HushType so macOS applies the grant. If you see duplicate HushType entries, cannot find HushType, or the switch does not work, use Reset Old HushType Entry in the setup window and add/enable HushType again.
Full uninstall: Trash /Applications/HushType.app, then optionally defaults delete com.felix.hushtype and rm -rf ~/.cache/huggingface/hub/models--*Qwen3-ASR* to remove preferences and the model cache.
Note: If you installed via DMG (Option A), skip this section — everything is bundled. These are only needed for building from source or setting up the iOS server.
Hardware and OS:
| Requirement | Purpose |
|---|---|
| Mac with Apple Silicon (M1+) | MLX inference requires Metal GPU |
| macOS 15.0+ | Minimum OS for speech-swift |
| iPhone with iOS 17+ | iOS client (optional) |
Software dependencies (build from source):
| Dependency | Purpose | Install | Required for |
|---|---|---|---|
| Homebrew | Package manager | See brew.sh | Build from source |
| opencc | Simplified → Traditional Chinese | brew install opencc |
Build from source (bundled in DMG) |
| speech-swift | Qwen3-ASR on Apple Silicon (MLX) | Automatic via SPM | Build from source |
| Python 3.13+ | iOS server runtime | brew install python |
iOS only |
| mlx-audio | STT server for iOS | pip3 install "mlx-audio[stt,server]" |
iOS only |
| httpx | Async HTTP for proxy server | pip3 install httpx |
iOS only |
| webrtcvad-wheels, setuptools | mlx-audio runtime deps | pip3 install webrtcvad-wheels setuptools |
iOS only |
| xcodegen | iOS Xcode project generation | brew install xcodegen |
iOS only |
| Xcode 16+ | Building the iOS app | Mac App Store | iOS only |
| Tailscale | Encrypted iPhone-to-Mac from anywhere | See tailscale.com | Optional |
git clone https://github.com/felixfu824/HushType.git
cd HushType
# Install dependencies
brew install opencc
# Build and install to /Applications
make install- Launch HushType from Spotlight (Cmd+Space → HushType)
- On first launch, the Set Up HushType window shows the required permissions: Accessibility and Microphone.
- Click Open System Settings in the Accessibility card. Find HushType in the Accessibility list and toggle it on. If HushType is missing, use the small helper panel to drag HushType into the list.
- Click Allow Microphone and approve the macOS microphone prompt.
- Return to HushType and click Restart HushType — the app relaunches itself with the new Accessibility permission active. (macOS caches the permission check per-process, so a restart is mandatory after granting — HushType handles it for you.)
- Wait for the model to download (~675 MB, one-time, progress shown in menu bar)
- Hold Right Option (≥0.3s) — record. A "Listening" pill with a live audio meter shows at the bottom of the screen.
- Release — pill switches to "Transcribing"; transcribed text pastes at your cursor and stays on the clipboard.
- Tap Right Option (<0.3s) — with text selected, translates via Apple Translation Framework into a floating card. See Text Translation.
Menu bar:
- Language — Auto / English / Chinese / Japanese
- Show Floating Indicator — toggle the listening pill (default on)
- Number Conversion — Chinese numeral → Arabic digit pass (default on)
- Text Translation — enable tap-to-translate (macOS 14+)
- AI Cleanup — Apple Foundation Models post-processing (macOS 26+, off by default)
- Unload Speech-to-Text Model — frees ~2 GB RAM; reload from the same menu (~3s cold start)
- Edit Customized Dictionary —
~/Library/Application Support/HushType/dictionary.txt, plain text,source -> targetper line, hot-reloads
That's it. No server, no network, no configuration.
Two products sharing the same floating caption panel. Mutually exclusive at runtime — starting one auto-stops the other.
Live Caption (free, local, on-device):
- Status-bar menu → click Live Caption to toggle (uses last-known source — defaults to mic on first use), or pick From Microphone / From System Audio… explicitly.
- System Audio first time → pick the app whose audio you want to caption from the picker.
- Captions stream onto a floating panel pinned near the bottom of your screen. The panel is draggable and resizable; its frame is remembered across stops.
Live Translated Caption (~$2/hr against your own OpenAI account):
- Get an API key at https://platform.openai.com/api-keys.
- Status-bar menu → Live Translated Caption → Translated Caption Settings… → click Open file in TextEdit and paste your key into
openai.jsonas theapi_keyfield. - Pick a target language in the same settings window (default: English; 13 others including 繁體中文 / 简体中文 / 日本語 / 한국어 / Español / Français / Deutsch).
- Click Live Translated Caption → From Microphone (or From System Audio…) in the menu. First time you do this, a one-time disclosure modal explains the cost and privacy profile — accept once and it stays accepted.
- A cost chip in the caption panel header (e.g.
12:34 · $0.42) shows session duration and spend. Auto-stop minutes and daily-cap warnings are configurable in the same settings window.
Hotkey (both products): Right ⌘ + / toggles whichever product you last started. First-use default is local (Live Caption). The menu items are the authoritative way to pick a specific product + source.
Mid-session switching: Clicking the other product's menu item while one is running auto-stops the current session and starts the new one. Clicking the same product's other source switches in place without rebuilding the panel.
On-device translation via Apple Translation Framework. Select any text → tap Right Option (<0.3s) → translucent card appears with the translation, also auto-copied to clipboard. Card auto-dismisses after 10s; hover to pause, click or Escape to dismiss now.
Direction: Chinese → English; everything else → Traditional Chinese. Override via menu bar or defaults write hushtype.translateTargetLanguage.
Enable: Menu bar → Text Translation. The toggle runs a sanity-check; if Translation Framework isn't available, the toggle stays off with a clear error.
HushType ships with AI Cleanup off by default. When enabled, each transcription is passed through Apple's on-device Foundation Models framework, which (1) strips leading filler words (um, uh, 嗯, 那個), (2) collapses immediate duplicates while preserving emphatic repetitions, and (3) resolves explicit self-corrections (I'll send it Wednesday no actually Friday → I'll send it Friday).
Why off by default: AI Cleanup rewrites your transcription content. The deterministic ITN layer (Chinese numeral → Arabic digit) is on by default because it's reversible and bounded; AI Cleanup is opt-in because semantic rewriting is a stronger commitment.
Requirements: macOS 26 (Tahoe) + Apple Intelligence enabled + Apple Silicon.
How to enable: Menu bar → AI Cleanup. HushType runs a quick round-trip test against the on-device model; if Apple Intelligence isn't available, you get a clear error and the toggle stays off. On success, future transcriptions are cleaned automatically. If the on-device model errors mid-transcription, HushType silently falls back to the uncleaned text — you never see a broken result.
Known limitations (beta): Occasional over-pruning of Chinese adverbs (我一直都在 may become 我一直在); trailing particles may leak through after self-correction resolution; English numerals inside Chinese context get converted (我買了 five 本書 → 我買了 5 本書, accepted behavior); Japanese is tested minimally.
The iOS app uses your Mac as the transcription server. Your iPhone sends audio to your Mac over WiFi or Tailscale, and receives the transcribed text back.
# Python packages for the transcription server
pip3 install "mlx-audio[stt,server]" webrtcvad-wheels setuptools httpx
# OpenCC for Traditional Chinese + xcodegen for iOS project
brew install opencc xcodegen# If using Tailscale (works from anywhere):
tailscale ip -4
# Example output: 100.x.x.x
# If using LAN only (same WiFi):
ipconfig getifaddr en0
# Example output: 192.168.50.50Write down this IP — you'll enter it on your iPhone later.
Option A — From HushType menu bar (recommended): Click the HushType icon in menu bar → "Start iOS Server"
Option B — From terminal:
cd HushType
python3 scripts/ios_server.py
# Server starts on 0.0.0.0:8000
# First transcription request will download the model (~675 MB)Verify the server is running:
curl http://localhost:8000/
# Should return: {"status":"ok","service":"HushType iOS Server","opencc":true}cd iOS
xcodegen generate
open HushType.xcodeprojIn Xcode:
- Click the HushType project in the navigator (top left)
- Select the HushType target → Signing & Capabilities → set Team to your Apple ID
- Select the HushTypeKeyboard target → same thing, set Team
- If Xcode shows "Update to recommended settings" → click Perform Changes
- Connect iPhone via USB cable
- Select your iPhone as the run destination (top bar)
- Click Run (Cmd+R)
First-time build takes ~1 minute. Subsequent builds are faster.
These steps happen on the iPhone itself:
5a. Enable Developer Mode (one-time):
- Settings → Privacy & Security → Developer Mode → toggle On
- iPhone will restart. After restart, confirm "Turn On" when prompted.
5b. Trust the developer (one-time):
- Settings → General → VPN & Device Management
- Tap your Apple ID under "Developer App"
- Tap Trust
5c. Add the HushType keyboard (one-time):
- Settings → General → Keyboard → Keyboards → Add New Keyboard
- Scroll down to "Third-Party Keyboards" → tap HushType
- Tap HushType in the keyboard list → toggle Allow Full Access → confirm
Important: Full Access must be enabled. Without it, the keyboard cannot communicate with the main app or access the network. If the mic button doesn't respond, this is the most common cause.
- Open the HushType app on iPhone
- Enter your Mac's IP address:
http://<your-ip>:8000(the IP from Step 2) - Tap Test Connection → should show green "Connected"
- Tap Start Listening — the orange microphone indicator appears at the top of the screen
- The app shows a 5-minute countdown timer
- Switch to any app (Messages, Notes, Safari, etc.)
- Long-press the globe key on your keyboard → select HushType
- Tap the mic button → speak → tap stop
- Wait 1-2 seconds → transcribed text appears at your cursor
- Use space, backspace, and return buttons for basic editing
When the 5-minute session expires, return to the HushType app and tap "Start Listening" again.
You only need to repeat Steps 3 + 6-7 each day:
- Make sure the iOS server is running on Mac (menu bar → "Start iOS Server")
- Open HushType on iPhone → Start Listening
- Switch to your app → use the keyboard
The USB cable is only needed for installing/updating the app from Xcode. Normal usage is wireless.
Note: With free Apple ID provisioning, the app expires every 7 days. When it stops launching, reconnect USB → Xcode → Cmd+R to reinstall. Your settings are preserved. A paid Apple Developer account ($99/year) extends this to 1 year.
# View all settings
defaults read com.felix.hushtype
# Language: nil=auto, "english", "chinese", "japanese"
defaults write com.felix.hushtype hushtype.language -string "chinese"
# Model: default "aufklarer/Qwen3-ASR-0.6B-MLX-4bit" on macOS;
# alternative "mlx-community/Qwen3-ASR-1.7B-8bit" for better quality.
defaults write com.felix.hushtype hushtype.modelId -string "mlx-community/Qwen3-ASR-1.7B-8bit"
# Traditional Chinese conversion (default: true)
defaults write com.felix.hushtype hushtype.chineseConversionEnabled -bool false
# Number conversion / ITN — Chinese numeral → Arabic digit (default: true)
defaults write com.felix.hushtype hushtype.numberConversionEnabled -bool false
# Floating "Listening / Transcribing" indicator (default: true)
defaults write com.felix.hushtype hushtype.floatingOverlayEnabled -bool false
# AI Cleanup via Apple Foundation Models (default: false, requires macOS 26+)
# Prefer toggling from the menu bar — the menu validates FoundationModels
# availability and shows a clear error if Apple Intelligence isn't enabled.
defaults write com.felix.hushtype hushtype.aiCleanupEnabled -bool true
# Text Translation via Apple Translation Framework (default: false, requires macOS 14+)
defaults write com.felix.hushtype hushtype.textTranslationEnabled -bool true
# Translation target language (default: nil = auto — Chinese→English, other→繁體中文)
# Set to a specific language code to override (e.g., "en", "zh-Hant-TW", "ja")
defaults write com.felix.hushtype hushtype.translateTargetLanguage -string "en"- Server URL: configured in the app UI (persisted in App Group)
- Session duration: 5 minutes (hardcoded in BackgroundAudioManager.swift)
- Model:
mlx-community/Qwen3-ASR-0.6B-4bit(hardcoded in RemoteTranscriber.swift)
Edit Sources/HushType/HotkeyManager.swift:
private static let rightOptionKeyCode: Int64 = 61Common keycodes: Right Option (61), Right Command (54), Left Option (58), Left Control (59), Fn/Globe (63).
- No audio is stored. Voice data exists only in RAM during the recording → transcription pipeline, then discarded. Nothing is written to disk — not on macOS, not on the iOS server.
- No network after setup. The only internet access is the one-time model download (~675 MB) on first launch. After that, the app and the model run fully offline with zero outbound connections.
- No telemetry. No analytics, no usage tracking, no phone-home. The macOS app contains zero network code beyond the initial model fetch (handled by the HuggingFace Hub SDK inside speech-swift) and an optional GitHub releases check for update notifications.
- Cloud Live Translated Caption uses YOUR key directly to OpenAI. Opt-in, off by default, never auto-resumes across launches. Your API key is stored in plaintext at
~/Library/Application Support/HushType/openai.json(same security profile as.env) —chmod 600 ~/Library/Application\ Support/HushType/openai.jsonif you want a tighter mode bit. Audio streams Mac → OpenAI via WSS; HushType operates no servers, intermediates no traffic, and never sees your audio, your key, or your spend. The engine resets to local on every app launch — you re-opt-in each time you want the cloud path. - iOS audio stays on your network. iPhone audio travels directly to your Mac over local WiFi or Tailscale (WireGuard-encrypted). No third-party server is involved.
- Fully air-gappable. Pre-download the model folder on another machine (
~/.cache/huggingface/hub/models--aufklarer--Qwen3-ASR-0.6B-MLX-4bit/for the macOS app,~/.cache/huggingface/hub/models--mlx-community--Qwen3-ASR-0.6B-4bit/for the iOS server) and copy it over — the app will never need internet.
HushType/
├── Package.swift SPM config (macOS target)
├── Makefile build / install / clean / dmg
├── Sources/HushType/ macOS menu bar app
│ ├── main.swift NSApplication bootstrap
│ ├── AppDelegate.swift Orchestrator + state machine
│ ├── StatusBarController.swift Menu bar icon + menus + iOS server toggle
│ ├── IOSServerManager.swift Manages ios_server.py subprocess
│ ├── OnboardingManager.swift First-launch / repair permission orchestration
│ ├── OnboardingSetupWindowController.swift Setup window for Accessibility + Microphone
│ ├── PermissionSettingsGuidePanel.swift Floating helper for macOS permission lists
│ ├── DraggableAppTileView.swift Draggable HushType.app tile for System Settings
│ ├── SystemAudioPermissionFlow.swift Screen & System Audio permission flow
│ ├── SystemAudioPermissionWindowController.swift System-audio permission setup panel
│ ├── HotkeyManager.swift CGEvent tap for Right Option
│ ├── AudioCaptureService.swift AVAudioEngine mic capture (16kHz mono, RMS publisher)
│ ├── TranscriptionEngine.swift Protocol + Qwen3ASR wrapper (MLX)
│ ├── ChineseConverter.swift OpenCC s2twp (Simplified → Traditional)
│ ├── NumberNormalizer.swift Deterministic Chinese-numeral → Arabic-digit ITN
│ ├── DictionaryReplacer.swift Customized dictionary (final post-processing step)
│ ├── TextInserter.swift Clipboard + Cmd+V paste (result persists on clipboard)
│ ├── InputSourceManager.swift CJK input method detection
│ ├── FloatingOverlayWindow.swift Borderless NSPanel for the listening pill
│ ├── FloatingOverlayView.swift SwiftUI pill: RMS bars + transcribing spinner
│ ├── AICleaner.swift Non-gated façade over FoundationModels cleanup
│ ├── FoundationModelsCleaner.swift macOS 26+ gated Apple FM wrapper
│ ├── CleanupPrompt.swift Phase 4 AI Cleanup prompt (filler + self-correction)
│ ├── TranslationManager.swift Apple Translation Framework integration
│ ├── TranslationCardWindow.swift Floating translation card NSPanel
│ ├── TranslationCardView.swift SwiftUI translation card view
│ ├── LiveCaptionManager.swift Local/cloud live caption orchestration
│ ├── LiveCaptionWindow.swift Floating live caption panel
│ ├── LiveCaptionView.swift SwiftUI live caption panel view
│ ├── LiveCaptionWorker.swift Streaming local ASR worker
│ ├── LocalQwen3Backend.swift Local Live Caption backend
│ ├── OpenAITranslateBackend.swift Cloud translated-caption backend
│ ├── OpenAIKeyStore.swift User OpenAI API key file handling
│ ├── CloudOnboardingAlert.swift One-time cloud disclosure
│ ├── SystemAudioSource.swift ScreenCaptureKit system-audio source
│ ├── SystemAudioPicker.swift App/source picker for system audio
│ ├── MemoryUtils.swift Process memory reading utilities
│ └── AppConfig.swift UserDefaults wrapper
├── scripts/
│ ├── ios_server.py FastAPI proxy: mlx-audio + OpenCC
│ └── build_mlx_metallib.sh MLX Metal shader compilation
├── Resources/
│ ├── Info.plist LSUIElement, mic usage description
│ ├── HushType.png App icon (1024x1024)
│ └── HushType.icns macOS app icon
└── iOS/ iPhone app + keyboard extension
├── project.yml xcodegen project spec
├── Shared/ Shared between app + keyboard extension
│ ├── AppGroupConstants.swift App Group keys + file-based IPC
│ ├── IPCConstants.swift Darwin notification names
│ └── WAVEncoder.swift Float32 → 16-bit PCM WAV
├── VoxKey/ Main iOS app (directory name kept from v1)
│ ├── VoxKeyApp.swift SwiftUI entry point (@main HushTypeApp)
│ ├── Views/ContentView.swift Server config, listening session, countdown
│ ├── Services/
│ │ ├── AudioRecorder.swift AVAudioEngine with listening + recording modes
│ │ ├── BackgroundAudioManager.swift Session timer, IPC polling, background
│ │ └── RemoteTranscriber.swift HTTP multipart POST to Mac server
│ ├── Assets.xcassets/ App icon asset catalog
│ └── Resources/silence.wav Background audio fallback
└── VoxKeyKeyboard/ Custom keyboard extension
└── KeyboardViewController.swift Mic, space, backspace, return, globe
To run HushType on your own devices, change these:
| What | Where | Example |
|---|---|---|
| Bundle ID | iOS/project.yml (both targets) + iOS/Shared/AppGroupConstants.swift |
com.yourname.hushtype / group.com.yourname.hushtype |
| Server URL default | iOS/VoxKey/Views/ContentView.swift |
Your Tailscale or LAN IP |
| Hotkey | Sources/HushType/HotkeyManager.swift |
Any modifier keycode |
| Model | iOS/VoxKey/Services/RemoteTranscriber.swift + scripts/ios_server.py |
mlx-community/Qwen3-ASR-1.7B-8bit for better quality |
| Session timeout | iOS/VoxKey/Services/BackgroundAudioManager.swift |
sessionDuration property |
| OpenCC config | Sources/HushType/ChineseConverter.swift + scripts/ios_server.py |
Change s2twp to s2t for standard Traditional |
macOS: "MLX error: Failed to load the default metallib"
Run: bash scripts/build_mlx_metallib.sh release
macOS: Hotkey not working Check Accessibility permission in System Settings → Privacy & Security → Accessibility. HushType must be in the list and toggled on. If HushType is missing or the switch does not work, relaunch HushType and use Reset Old HushType Entry from the setup window, then add/enable HushType again. If you just granted Accessibility and the hotkey still does not work, click Restart HushType in the setup window or quit and relaunch manually — macOS caches the permission check per-process.
iOS: "App Transport Security" error
The Info.plist must have NSAllowsArbitraryLoads = true with NO NSExceptionDomains — they conflict and cause iOS to ignore the global allow.
iOS: Mic button does nothing (no recording starts) Most common cause: Full Access is not enabled. Go to Settings > General > Keyboard > Keyboards > HushType > toggle Allow Full Access. Without this, the keyboard extension cannot communicate with the main app.
iOS: Keyboard stuck on "Transcribing..." The main app isn't receiving commands. Ensure:
- HushType app is open and showing "Listening" with the orange mic dot
- The Mac server is running (
curl http://<mac-ip>:8000/) - App Group container works (check Xcode console for "App Group container: /path...")
iOS: "Open HushType app first" The main app isn't running or the listening session expired (5-min timeout). Open HushType app and tap "Start Listening" again.
iOS: App stops working after 7 days Free provisioning signing expires. Reconnect iPhone via USB, open Xcode, Cmd+R to reinstall. Settings are preserved.
Server: Port already in use
lsof -ti :8000 :8199 | xargs kill- iOS requires Mac to be on and server running (no cloud fallback)
- Free provisioning: iOS app expires every 7 days (re-sign via Xcode)
- Session timeout is fixed at 5 minutes (no UI to change yet)
- Mac must be reachable from iPhone (same WiFi or Tailscale)
- DMG is ad-hoc signed (not notarized) — macOS Gatekeeper will warn on first launch. Right-click → Open to bypass.
