A Rust port of birdnet-go — a self-hosted, realtime soundscape analyzer that listens to a microphone 24/7, identifies bird species with the BirdNET neural network, filters false positives, stores detections, saves audio clips, and shows them live in a web dashboard.
soundcard (cpal) -> downmix mono -> resample 48 kHz (rubato)
-> 3 s windows w/ overlap -> BirdNET ONNX inference (birdnet-onnx crate, on ort)
-> top-k + activation -> threshold + false-positive + dynamic-threshold filters
-> actions: store (SeaORM/SQLite) + WAV clip (hound) + SSE broadcast
-> Leptos dashboard
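The first stages of the pipeline above (downmix to mono, then cut into overlapping 3 s windows) can be sketched with the standard library alone. This is an illustration under assumptions, not the project's code: `downmix_mono` and `windows` are hypothetical names, averaging channels is one common downmix choice, and dropping the trailing partial window is a simplification.

```rust
/// Downmix interleaved multi-channel samples to mono by averaging each frame.
/// (Averaging is an assumption; the project may weight channels differently.)
pub fn downmix_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Split mono audio into fixed-length windows that overlap by `overlap` samples.
/// A trailing partial window is dropped in this sketch.
pub fn windows(mono: &[f32], window: usize, overlap: usize) -> Vec<&[f32]> {
    assert!(overlap < window, "overlap must be shorter than the window");
    let step = window - overlap;
    let mut out = Vec::new();
    let mut start = 0;
    while start + window <= mono.len() {
        out.push(&mono[start..start + window]);
        start += step;
    }
    out
}
```

At 48 kHz a 3 s window is 144000 samples, so an overlap of 1.5 s corresponds to `windows(&mono, 144_000, 72_000)`, which advances by 72000 samples per window.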
A single Leptos fullstack crate built with cargo-leptos:
- The server (ssr feature) runs the realtime daemon and an axum server.
- The browser app (hydrate feature -> WASM) is the same crate, compiled for wasm32-unknown-unknown; backend deps are ssr-only and never reach the WASM bundle.
- Queries use #[server] functions (e.g. list_detections); the realtime push feed (live detections + audio meter) and binary media (clip WAV, spectrogram PNG) are plain axum routes mounted on the same server, since those don't fit request/response server functions.
- Inference via the birdnet-onnx crate (the birdnet-go author's own library, built on ort 2.0-rc). It auto-detects the model type and applies the correct activation (BirdNET -> sigmoid, Perch -> softmax, BSG -> pre-sigmoid).
- Leptos fullstack (SSR + hydration + server functions), one crate.
- 64-bit targets (x86_64, aarch64); 32-bit ARM is not supported.
- SeaORM + SQLite datastore (MySQL/Postgres possible later).
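To make the activation step concrete, here is a minimal sketch of sigmoid, softmax and top-k selection over raw model outputs. It does not use or reproduce the birdnet-onnx API; the function names are hypothetical and the code only illustrates the maths behind "BirdNET -> sigmoid, Perch -> softmax".

```rust
/// Logistic sigmoid: maps one logit to an independent score in (0, 1).
pub fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Softmax: maps a vector of logits to scores that sum to 1.
/// The maximum is subtracted first for numerical stability.
pub fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Return the `k` highest scores as (class index, score), best first.
pub fn top_k(scores: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = scores.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}
```

Sigmoid treats every species independently, so several species can score high in the same window; softmax forces the scores to compete.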
BirdNET v2.4 ships officially as TensorFlow Lite, but birdnet-rs runs ONNX. You have two options: grab the pre-converted model (fast, no toolchain — this is what the packaged builds use), or download + convert it yourself from the official weights.
The packaged builds (.deb, .tar.gz, .zip, .msi) download the
pre-converted ONNX model, labels and taxonomy from this repo's GitHub releases
automatically on first run — no Python, TensorFlow or git needed. To fetch
it manually (e.g. for a cargo-built checkout):
scripts/fetch-model.sh # -> models/{*.onnx, *_Labels_en_us.txt, eBird_taxonomy_*.json}
Set BIRDNET_MODEL_BASE_URL to pin a specific release tag instead of latest.
The model is BirdNET GLOBAL 6K v2.4, CC BY-NC-SA 4.0 (Cornell Lab) — see
Licensing and the MODEL_LICENSE shipped beside the model.
scripts/download_model.sh --convert # -> models/BirdNET_GLOBAL_6K_V2.4.onnx + labels
download_model.sh pulls the FP32 model and the index-aligned label file from
birdnet-go's own embedded data (so you don't have to run birdnet-go first, and
the model/labels are guaranteed to match). Useful flags:
- --locale de: common names in another language (run with --help for the list)
- --out DIR: download directory (default models)
- --force: re-download even if files exist
- --convert: also run the ONNX conversion afterwards
scripts/download_model.sh # -> models/*.tflite + *_Labels_en_us.txt
scripts/convert_model.sh # -> models/BirdNET_GLOBAL_6K_V2.4.onnx
convert_model.sh needs git and a Python interpreter in the 3.10–3.13
range (TensorFlow has no wheels for 3.14+; the script auto-detects python3.12/
python3.11/… or takes --python <path>). On first run it checks out the
official BirdNET ONNX converter
into .cache/ and builds an isolated virtualenv (this downloads TensorFlow, so
the first conversion takes a few minutes; later runs reuse it). Pass --fp16
for a half-precision model (smaller, good on a Raspberry Pi 5).
Why not a plain tf2onnx one-liner? BirdNET bakes its mel-spectrogram front-end (an STFT) into the model graph, which tf2onnx lowers to an RFFT2D op it can't convert. The official converter handles this by keeping RFFT2D as a custom op and rewriting it to a MatMul with a precomputed DFT matrix during optimization. convert_model.sh just drives that tool with the dependency versions it pins.
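The idea behind that rewrite is that a real-input DFT is a linear map, so it can be expressed as a multiplication by a fixed matrix. The sketch below shows that identity for a single frame; it is not the converter's code, and `dft_matrices` and `matvec` are hypothetical names chosen for illustration.

```rust
use std::f32::consts::PI;

/// Precompute the real and imaginary DFT matrices for a real input of length `n`.
/// Each matrix has n/2 + 1 rows (the non-redundant frequency bins) and n columns.
pub fn dft_matrices(n: usize) -> (Vec<Vec<f32>>, Vec<Vec<f32>>) {
    let bins = n / 2 + 1;
    let mut re = vec![vec![0.0_f32; n]; bins];
    let mut im = vec![vec![0.0_f32; n]; bins];
    for k in 0..bins {
        for t in 0..n {
            // Reduce k*t modulo n before converting to keep the angle small.
            let angle = 2.0 * PI * ((k * t) % n) as f32 / n as f32;
            re[k][t] = angle.cos();
            im[k][t] = -angle.sin();
        }
    }
    (re, im)
}

/// Plain matrix-vector product: one output value per matrix row.
pub fn matvec(m: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}
```

Multiplying a frame by `re` and `im` yields the real and imaginary parts of its spectrum, which is the same result an RFFT produces, at the cost of O(n²) work per frame instead of O(n log n).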
The resulting ONNX model keeps the embedded mel front-end, so it takes raw
48 kHz PCM ([1, 144000]) and outputs one logit per species ([1, 6522]).
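The input length follows directly from the window size: 3 s at 48 kHz is 144000 samples. A small sketch of preparing one input buffer and reading a label line is shown below. Zero-padding a short buffer is an assumption, as are the helper names and the "Scientific name_Common name" label layout (taken from upstream BirdNET label files, not confirmed for this repo).

```rust
pub const SAMPLE_RATE: usize = 48_000;
pub const WINDOW_SECONDS: usize = 3;
/// Number of samples the model expects per inference: 48_000 * 3 = 144_000.
pub const INPUT_LEN: usize = SAMPLE_RATE * WINDOW_SECONDS;

/// Force a mono buffer to exactly INPUT_LEN samples:
/// longer input is truncated, shorter input is zero-padded (an assumption).
pub fn fit_window(samples: &[f32]) -> Vec<f32> {
    let mut out = vec![0.0_f32; INPUT_LEN];
    let n = samples.len().min(INPUT_LEN);
    out[..n].copy_from_slice(&samples[..n]);
    out
}

/// Split a label line into (scientific name, common name) at the first underscore.
/// Returns None when the line has no underscore.
pub fn split_label(line: &str) -> Option<(&str, &str)> {
    line.trim().split_once('_')
}
```

Because the label file is index-aligned with the model output, line `i` of the labels names the species for logit `i`.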
Models are not committed to the repo (size + licensing); they're published as
release assets and downloaded on demand instead — see Licensing.
Two separate licenses apply:
- birdnet-rs source code: MIT.
- BirdNET model (the .onnx weights, species labels, and eBird taxonomy): CC BY-NC-SA 4.0, © K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology (source). birdnet-rs re-hosts a format-converted (TFLite → ONNX) copy as a release asset and downloads it on first run; only the format changed. Using the model, including via birdnet-rs, is therefore non-commercial, requires attribution, and any redistribution of the model must stay under CC BY-NC-SA 4.0. The full notice ships as MODEL_LICENSE next to the model files.
Needs the wasm target and cargo-leptos:
rustup target add wasm32-unknown-unknown
cargo install cargo-leptos
# Build server + WASM + bundle the site, then run.
cargo leptos build --release
# First run writes a default config.yaml you can edit (model/audio settings).
LEPTOS_SITE_ROOT=target/site ./target/release/birdnet-rs
Or, for development with hot reload of both server and UI:
cargo leptos watch
Then open http://localhost:8080 (the address is the Leptos site-addr).
- Live audio monitor: a VU meter (RMS + peak) and a scrolling spectrogram of the microphone input, fed by audio SSE events.
- Input device selector: pick the capture device from a dropdown; the choice is applied live (no restart needed) and persisted to the OS user-config directory (~/.config/birdnet-rs/preferences.json on Linux, %APPDATA%\… on Windows, ~/Library/Application Support/… on macOS) so it's restored next launch.
- Sample-rate selector: choose the capture rate (44.1 kHz … 256 kHz, filtered to what the device reports); also persisted. The BirdNET v2.4 model runs at 48 kHz, so higher rates are captured and downsampled for inference.
- Closest match: the current best-guess species + match % shown in realtime as audio is analyzed (live SSE event).
- Detections: recent detections (loaded via the list_detections server function, live-prepended via detection SSE), with search by name/code, per-row review (✓ correct / ✗ false-positive), a rendered clip spectrogram thumbnail, and an audio player.
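The two VU-meter values mentioned above are simple to compute per audio block. The following is a minimal sketch, not the project's implementation: the function names are hypothetical, and the dBFS conversion with a small floor is an added convenience that the dashboard may or may not use.

```rust
/// Root-mean-square level of a block of samples in [-1.0, 1.0].
pub fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}

/// Peak level: the largest absolute sample value in the block.
pub fn peak(samples: &[f32]) -> f32 {
    samples.iter().fold(0.0_f32, |m, s| m.max(s.abs()))
}

/// Convert a linear level to dBFS, clamping at a small floor to avoid log10(0).
pub fn to_dbfs(level: f32) -> f32 {
    20.0 * level.max(1e-9).log10()
}
```

RMS tracks perceived loudness over the block, while peak shows how close the input is to clipping, which is why meters usually display both.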
Detections are stored in birdnet.db; clips are written under clips/<date>/.
Audio/model/storage settings live in config.yaml (path overridable with
BIRDNET_CONFIG); the web address is the Leptos site-addr
(Cargo.toml [package.metadata.leptos], or LEPTOS_SITE_ADDR, default
0.0.0.0:8080).
birdnet:
  model_path: models/BirdNET_GLOBAL_6K_V2.4.onnx
  labels_path: models/BirdNET_GLOBAL_6K_V2.4_Labels_en_us.txt
  threshold: 0.3 # minimum confidence to report
  overlap: 0.0 # seconds of overlap between 3 s windows (1.5 = 50%)
  locale: en_us
  latitude: 0.0
  longitude: 0.0
  taxonomy_path: models/eBird_taxonomy_codes_2021E.json # species codes
  range_filter: # location/date plausibility filter (BirdNET meta model)
    enabled: false # also needs the range model + non-zero lat/lon
    model_path: models/BirdNET_GLOBAL_6K_V2.4_RangeModel.onnx
    threshold: 0.01
    rerank: false # true -> multiply confidence by location score
realtime:
  audio:
    source: default # device name, or "default"
    export:
      enabled: true
      path: clips
      retention: # automatic clip cleanup
        enabled: false
        max_age_days: 30
output:
  sqlite:
    path: birdnet.db
To enable the range filter + species codes, fetch and convert the extras:
scripts/download_model.sh --range --convert # adds taxonomy + range ONNX model
Then set birdnet.latitude/longitude and birdnet.range_filter.enabled: true.
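How range_filter.threshold and rerank interact can be sketched as follows. This is a guess at the logic based only on the config comments: the `Candidate` type, the function name, and the treatment of species missing from the location scores (scored as 0) are all assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub species: String,
    pub confidence: f32,
}

/// Keep candidates whose location score reaches `threshold`.
/// With `rerank`, the confidence is also multiplied by that score.
/// Species absent from `location_score` are treated as score 0 (assumption).
pub fn apply_range_filter(
    candidates: Vec<Candidate>,
    location_score: &HashMap<String, f32>,
    threshold: f32,
    rerank: bool,
) -> Vec<Candidate> {
    candidates
        .into_iter()
        .filter_map(|mut c| {
            let score = location_score.get(&c.species).copied().unwrap_or(0.0);
            if score < threshold {
                return None; // implausible for this location/date
            }
            if rerank {
                c.confidence *= score;
            }
            Some(c)
        })
        .collect()
}
```

With the default threshold of 0.01 the filter only removes species the range model considers very unlikely; turning on rerank additionally lowers the confidence of merely uncommon ones.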
Env overrides: BIRDNET_CONFIG, BIRDNET_THRESHOLD, BIRDNET_MODEL_PATH,
BIRDNET_LABELS_PATH, BIRDNET_DB_PATH.
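An environment override of this kind typically means "use the variable if it is set and valid, otherwise keep the value from config.yaml". The sketch below shows that pattern for BIRDNET_THRESHOLD; the function names and the rule that out-of-range or unparsable values fall back to the file value are assumptions, not documented behaviour.

```rust
use std::env;

/// Pick the threshold: a valid override in [0, 1] wins, otherwise the file value.
/// Falling back silently on invalid input is an assumption of this sketch.
pub fn resolve_threshold(raw: Option<&str>, from_file: f32) -> f32 {
    raw.and_then(|s| s.trim().parse::<f32>().ok())
        .filter(|t| (0.0..=1.0).contains(t))
        .unwrap_or(from_file)
}

/// Read BIRDNET_THRESHOLD from the process environment and resolve it.
pub fn threshold_from_env(from_file: f32) -> f32 {
    resolve_threshold(env::var("BIRDNET_THRESHOLD").ok().as_deref(), from_file)
}
```

Keeping the parsing in a pure function makes the precedence rule testable without touching the process environment.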
| Route | Description |
|---|---|
| POST /api/list_detections | #[server] function: recent detections (JSON) |
| POST /api/search_detections | #[server] function: search by name / species code |
| POST /api/review_detection | #[server] function: mark correct / false-positive |
| POST /api/list_audio_devices / set_audio_device | #[server] functions: input device list / switch |
| GET /stream | live SSE: detection + audio + live (best-guess) events |
| GET /media/clip/{id} | the WAV clip |
| GET /media/spectrogram/{id} | clip rendered as a spectrogram PNG |
| GET /, /pkg/* | SSR dashboard + WASM/JS/CSS bundle |
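A client other than the bundled dashboard can consume GET /stream by parsing the standard Server-Sent Events text format. The following sketch parses already-received text into events; it covers only the event and data fields, and the JSON payload used in the usage note is hypothetical, since the actual message schema is not described here.

```rust
#[derive(Debug, PartialEq)]
pub struct SseEvent {
    pub event: String,
    pub data: String,
}

/// Parse SSE text into events. A blank line ends an event, multiple `data:`
/// lines are joined with '\n', and lines starting with ':' are comments.
/// Events without any data, or not terminated by a blank line, are dropped.
pub fn parse_sse(text: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event = String::new();
    let mut data: Vec<&str> = Vec::new();
    for line in text.lines() {
        if line.is_empty() {
            if !data.is_empty() {
                let name = if event.is_empty() { "message" } else { event.as_str() };
                events.push(SseEvent { event: name.to_string(), data: data.join("\n") });
            }
            event.clear();
            data.clear();
        } else if let Some(rest) = line.strip_prefix("event:") {
            event = rest.strip_prefix(' ').unwrap_or(rest).to_string();
        } else if let Some(rest) = line.strip_prefix("data:") {
            data.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
        // Comment lines and other fields (id, retry) are ignored in this sketch.
    }
    events
}
```

For example, the text `event: detection`, `data: {"species":"x"}` followed by a blank line parses to one event named detection; a client would then dispatch on the event name (detection, audio or live).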
cargo test # server-side unit + route tests (default `ssr` feature)
cargo leptos build # full fullstack build (server + WASM + bundle)
# End-to-end inference test (needs the converted model on disk):
BIRDNET_TEST_MODEL=models/BirdNET_GLOBAL_6K_V2.4.onnx \
BIRDNET_TEST_LABELS=models/BirdNET_GLOBAL_6K_V2.4_Labels_en_us.txt \
cargo test -- --ignored
Implemented: soundcard capture, resampling, windowing, ONNX inference, confidence/dynamic/false-positive filtering, location/date range filter, eBird taxonomy (species codes), SQLite storage, clip export, clip retention, clip + live spectrograms, a live audio monitor (VU meter + scrolling spectrogram), a Leptos (WASM) dashboard with live SSE updates, detection search, review (mark correct / false-positive), MQTT publishing (+ Home Assistant discovery), BirdWeather upload, and a species-image provider (Wikipedia, cached) shown in the detection list.
Planned: RTSP & multiple sources, Perch/Bat models, notifications, weather, auth/OIDC, MySQL/Postgres.
Enable in config.yaml (all off by default except the image provider):
mqtt:
  enabled: false
  broker: mqtt://localhost:1883 # mqtts:// for TLS
  topic: birdnet-rs/detections
  username: ""
  password: ""
  retain: false
  qos: 1
  home_assistant: { enabled: false, discovery_prefix: homeassistant, device_name: BirdNET-RS }
birdweather:
  enabled: false
  id: "" # BirdWeather station token
  threshold: 0.7
  location_accuracy: 500 # GPS fuzz radius (m); also set birdnet.latitude/longitude
imageprovider:
  enabled: true
  cache_dir: images
  ttl_days: 30
BirdWeather encodes each soundscape to loudness-normalized FLAC via ffmpeg
(EBU R128, −23 LUFS) — ffmpeg must be on PATH. MQTT publishes a JSON
message per detection (+ an online/offline status and an HA discovery sensor).
The image provider resolves species -> a free Wikipedia thumbnail, caches it
under cache_dir, and the dashboard shows it (served via /media/image/{id}).