Sona

Sona is a powerful, offline transcript editor built with Tauri, React, and Sherpa-onnx. It provides fast, accurate, and private speech-to-text capabilities directly on your local machine using a high-performance Rust backend.

✨ Features

🔒 Offline & Private: All speech processing happens locally on your device. No data leaves your machine.
🎙️ Real-time Transcription: Record and transcribe audio in real-time with low latency.
📁 Batch Processing: Import multiple audio or video files for bulk transcription in the background.
🗂️ Workspace Organization: Use Workspace, Projects, and Inbox to organize saved recordings and imports.
📝 Interactive Editor: A rich text editor synchronized with audio playback for corrections, speaker labels, and version snapshots.
👥 Speaker Profiles & Review: Build local speaker profiles, correct speaker badges segment by segment, and review suggested or anonymous speaker groups before export.
✨ LLM Assistant: Polish, translate, and summarize transcripts using OpenAI, Anthropic, Gemini, or Ollama.
🗣️ Live Caption & Voice Typing: Reuse the same offline live transcription stack for floating captions or dictation into other applications.
📤 Smart Export: Export in multiple formats (TXT, SRT, VTT, JSON) with bilingual support.
🛟 Recovery, Backup & Diagnostics: Resume interrupted work, export lightweight backups of config, workspace, and text history, and inspect model/runtime health from the app.
🔔 Notifications & Automation: Use the header notification center for updates, recovery, and automation results, and configure watched-folder automation rules in Settings.
🤖 Advanced AI Models: Powered by state-of-the-art models like SenseVoice, Whisper, and Paraformer.

🚀 Getting Started

Download from GitHub Releases

The easiest way to install Sona is to download the pre-built binaries for your platform from the GitHub Releases page.

User Guide

For end-user setup and daily workflows, read the User Guide. It covers first-run setup, Live Record, Batch Import, Workspace / Projects / Inbox, transcript editing, speaker review, version snapshots, LLM features, Voice Typing, export, Dashboard / backup / recovery entry points, and troubleshooting.

CLI

Sona supports offline batch transcription commands directly through the main desktop executable. Packaged installs do not add it to your shell PATH, so invoke the app binary itself with CLI subcommands.

Installed package locations:

Windows: run Sona.exe transcribe ... from the installation directory
macOS: run /Applications/Sona.app/Contents/MacOS/Sona transcribe ...
Linux: run the packaged Sona binary with CLI subcommands from the install location
AppImage: run the mounted AppImage executable with CLI subcommands

Source builds can still run the CLI directly with Cargo:

cargo run --manifest-path src-tauri/Cargo.toml -- transcribe ./sample.mp4 --config ./sona-cli.toml --output ./sample.srt

Current CLI scope is intentionally narrow:

Single-file offline transcription
Export to json, txt, srt, or vtt
Exposed through the main desktop executable, but not registered on PATH

For the full CLI guide and a minimal TOML example, read docs/cli.md.

Build from Source

Prerequisites

Node.js: v20 or later (for frontend build).
Rust: Stable release (required for the Tauri backend).
Package Manager: pnpm via Corepack (recommended).

Linux Requirements

If you are running on Linux (Ubuntu/Debian), ensure you have the necessary system dependencies:

sudo apt-get update
sudo apt-get install libwebkit2gtk-4.1-dev \
    build-essential \
    curl \
    wget \
    file \
    libssl-dev \
    libgtk-3-dev \
    libayatana-appindicator3-dev \
    librsvg2-dev \
    libasound2-dev

Installation

Clone the repository

git clone https://github.com/AirSodaz/sona.git
cd sona

Install dependencies
```
corepack enable
pnpm install
```
Run the application
```
pnpm run tauri dev
```
Run frontend tests
```
pnpm test
```

📦 Model Management

Sona allows you to choose the AI model that best fits your needs, both for offline transcription and online assistance.

Offline Transcription

Navigate to Settings > Model Settings.
Choose from a curated list of high-performance models:
- SenseVoice: Best for multilingual support and emotion recognition.
- Whisper (Tiny): Lightweight version of OpenAI's Whisper model.
- Paraformer: Optimized for streaming.
Click Download. The model will be automatically stored locally.

LLM Assistant (Polish, Translate, Summary)

Navigate to Settings > LLM Service.
Select your provider (OpenAI, Anthropic, Gemini, or Ollama).
Enter your API Key and Base URL (if applicable).
Select the models that power polish, translation, and summary generation.

🏗️ Building

To build the application for production:

pnpm run tauri build

The executable will be generated in src-tauri/target/release/bundle.

Name		Name	Last commit message	Last commit date
Latest commit History 989 Commits
.github		.github
.husky		.husky
.vscode		.vscode
docs		docs
public		public
scripts		scripts
src-tauri		src-tauri
src		src
.editorconfig		.editorconfig
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
GEMINI.md		GEMINI.md
README.md		README.md
README.zh-CN.md		README.zh-CN.md
commitlint.config.cjs		commitlint.config.cjs
eslint.config.js		eslint.config.js
index.html		index.html
package.json		package.json
playwright.config.ts		playwright.config.ts
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
tsconfig.json		tsconfig.json
tsconfig.node.json		tsconfig.node.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sona

✨ Features

🚀 Getting Started

Download from GitHub Releases

User Guide

CLI

Build from Source

Prerequisites

Linux Requirements

Installation

📦 Model Management

Offline Transcription

LLM Assistant (Polish, Translate, Summary)

🏗️ Building

About

Uh oh!

Releases 25

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Sona

✨ Features

🚀 Getting Started

Download from GitHub Releases

User Guide

CLI

Build from Source

Prerequisites

Linux Requirements

Installation

📦 Model Management

Offline Transcription

LLM Assistant (Polish, Translate, Summary)

🏗️ Building

About

Topics

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 25

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages