Releases: MapleEve/VoScript
VoScript v0.8.4
VoScript v0.8.4 completes the 0.8 Rust-kernel foundation while keeping the public HTTP contract compatible with existing clients.
Highlights
- Adds the optional Rust kernel foundation behind `RUST_KERNEL_MODE`, with Python remaining the default runtime path.
- Adds selected Rust-backed validation paths for voiceprint scoring, result post-processing, artifact manifests, and status helper contracts.
- Adds optional artifact manifest metadata to completed results without exposing runtime paths or private operator data.
- Keeps speaker labels stable and disambiguates duplicate display names instead of merging speakers in the result contract.
- Updates public configuration, security, and changelog docs for the v0.8.4 release surface.
- Strengthens release gates with public release scanning, CI security scan, FOSSA policy checks, Rust wheel tests, and Docker packaging smoke.
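The duplicate-name behavior in the highlights can be illustrated with a minimal sketch. It assumes speakers arrive as `(speaker_label, display_name)` pairs and that collisions are resolved with a numeric suffix; the function name and the suffix format are hypothetical and not taken from the VoScript codebase.

```python
from collections import Counter


def disambiguate_display_names(speakers):
    """Return {speaker_label: display_name} with duplicate names made unique.

    `speakers` is a list of (speaker_label, display_name) pairs. Labels are
    never changed or merged; only colliding display names get a suffix.
    The "(n)" suffix format is an assumption made for this sketch.
    """
    totals = Counter(name for _, name in speakers)
    seen = Counter()
    result = {}
    for label, name in speakers:
        if totals[name] > 1:
            seen[name] += 1
            result[label] = f"{name} ({seen[name]})"
        else:
            result[label] = name
    return result


speakers = [("SPEAKER_00", "Alex"), ("SPEAKER_01", "Alex"), ("SPEAKER_02", "Sam")]
print(disambiguate_display_names(speakers))
# {'SPEAKER_00': 'Alex (1)', 'SPEAKER_01': 'Alex (2)', 'SPEAKER_02': 'Sam'}
```

Both speakers named "Alex" stay separate entries, so a client that keys on the speaker label sees no change in identity.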
Validation
- CI lint, unit/security tests, coverage, and dependency audit passed on `main`.
- FOSSA policy check passed on `main`.
- Rust Foundation Heavy Gate passed on `main`, including Rust wheel build and Docker packaging smoke.
- Public release scan passed before release text publication.
Remote deployment and internal live validation are handled separately after Docker image publication.
VoScript v0.7.6
Patch release focused on runtime stability, alignment isolation, transcription quality, and release-gate hardening.
Highlights
- Keeps health checks responsive during GPU cleanup by avoiding full Python GC on active job boundaries.
- Isolates WhisperX alignment runtime from ASR, diarization, and embedding model placement, with CPU alignment as the safe default.
- Keeps the ASR runtime on the cuDNN9-compatible faster-whisper / CTranslate2 stack while installing WhisperX without replacing ASR dependencies.
- Removes the full NLTK runtime dependency by providing the small sentence-span compatibility surface required by WhisperX alignment.
- Filters short stock outro hallucinations while preserving normal meeting-context language.
- Reads normalized audio once for embedding and slices it by diarization turns, reducing repeated native decoding risk.
- Updates safe timing logs and public release-gate checks.
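The "read once, slice by turns" idea can be sketched as follows. This is an illustration only: it assumes the normalized audio is already decoded into one in-memory sample sequence and that diarization turns are `(start_seconds, end_seconds, speaker)` tuples. The function name and data shapes are hypothetical.

```python
def slice_by_turns(samples, sample_rate, turns):
    """Yield (speaker, segment) pairs cut from one in-memory sample buffer.

    The audio is decoded a single time by the caller; each diarization turn
    is then served by plain slicing instead of another native decode.
    """
    total = len(samples)
    for start, end, speaker in turns:
        lo = max(0, int(round(start * sample_rate)))
        hi = min(total, int(round(end * sample_rate)))
        if hi > lo:  # skip empty or fully out-of-range turns
            yield speaker, samples[lo:hi]


# Toy example: 3 seconds of "audio" at 4 samples per second.
samples = list(range(12))
turns = [(0.0, 1.0, "A"), (1.0, 2.5, "B"), (2.5, 5.0, "A")]
for speaker, segment in slice_by_turns(samples, 4, turns):
    print(speaker, segment)
```

Turn boundaries that run past the end of the buffer are clamped, so a slightly overlong final turn still yields the remaining samples.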
Validation
- CI lint, tests, security scan, Codecov, FOSSA, and Claude review passed on the release PR.
- Internal live validation covered health stability, alignment runtime behavior, hallucination filtering, and embedding audio slicing.
VoScript v0.7.5
What's Changed
- Adds safer GPU model lifecycle defaults: idle model unload defaults to 180 seconds, and Docker exposes all available NVIDIA GPUs unless the operator overrides visibility.
- Improves CUDA device selection for lazy-loaded ASR/faster-whisper, diarization/pyannote, and embedding/WeSpeaker models while preserving explicit indexed CUDA pinning.
- Fixes faster-whisper CUDA argument handling when the internal torch device is indexed.
- Improves pyannote local snapshot loading by generating runtime-localized configs for nested models.
- Expands CI quality gates and aligns public configuration, API docs, and changelogs for 0.7.5.
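The indexed-device fix can be pictured with a small helper. faster-whisper's `WhisperModel` takes the device type and the device index as separate arguments (`device` and `device_index`), so a torch-style string such as `cuda:1` has to be split before it is passed on. The helper names below are hypothetical, and this is a sketch of the general technique rather than the project's actual code.

```python
def split_torch_device(device):
    """Split a torch-style device string into (device_type, index).

    "cuda:1" -> ("cuda", 1); "cuda" -> ("cuda", None); "cpu" -> ("cpu", None).
    """
    device_type, sep, index = device.partition(":")
    if not sep:
        return device_type, None
    if not index.isdigit():
        raise ValueError(f"invalid device index in {device!r}")
    return device_type, int(index)


def faster_whisper_device_kwargs(device):
    """Build the device keyword arguments for a faster-whisper model."""
    device_type, index = split_torch_device(device)
    kwargs = {"device": device_type}
    if index is not None:
        # Explicit pinning is preserved only when the caller asked for it.
        kwargs["device_index"] = index
    return kwargs


print(faster_whisper_device_kwargs("cuda:1"))  # {'device': 'cuda', 'device_index': 1}
print(faster_whisper_device_kwargs("cuda"))    # {'device': 'cuda'}
```

The resulting dictionary could then be unpacked into the model constructor, for example `WhisperModel(model_name, **kwargs)`, where `model_name` is whatever model the deployment uses.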
Validation
- CI lint, tests, security scan, and dependency checks passed on the release commit.
- Public release privacy scan passed.
- Internal live validation completed for idle unload, multi-GPU lazy load, pinned CUDA behavior, pyannote local snapshot loading, and faster-whisper CUDA handling.
- Remote service health was rechecked after merge: Docker healthy, `/healthz` 200, OpenAPI 0.7.5, two GPUs visible, and idle timeout set to 180 seconds.
VoScript v0.7.4
Highlights
- Preserves a loaded or persisted AS-norm cohort during automatic rebuild checks, so transcription cleanup cannot indirectly shrink the cohort.
- Keeps the v0.7.4 API/result contract updates for speaker labels, optional alignment metadata, voiceprint matching, and denoise defaults.
- Adds regression coverage for direct-load, empty-source, small-source, and explicit manual rebuild cohort behavior.
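One way to read the cohort-preservation rule is as a small selection policy. The sketch below is an assumption about the intended behavior, not the project's implementation: automatic checks keep any cohort that is already loaded or persisted, and only an explicit manual rebuild may replace it. The function name and arguments are hypothetical.

```python
def select_cohort(existing, candidate, manual_rebuild=False):
    """Decide which AS-norm cohort to keep after a rebuild check.

    `existing` is the loaded or persisted cohort (a list, or None).
    `candidate` is what a rebuild from the current sources would produce.
    """
    if manual_rebuild:
        # Assumed: an explicit operator request wins, even if the result is smaller.
        return candidate
    if existing:
        # Automatic checks never replace a cohort that is already in place,
        # so cleanup that removes source transcriptions cannot shrink it.
        return existing
    return candidate


print(len(select_cohort(existing=[0.1, 0.2, 0.3], candidate=[0.1])))             # 3
print(len(select_cohort(existing=[0.1, 0.2, 0.3], candidate=[0.1],
                        manual_rebuild=True)))                                   # 1
```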
Validation
- PR checks passed: lint, tests, security scan, and WIP gate.
- Remote deployment verified healthy with OpenAPI 0.7.4.
- Internal live validation completed with a synthetic non-sensitive transcription, speaker_label/result contract checks, voiceprint API checks, and cohort-preservation checks before and after transcription cleanup.
VoScript v0.7.3
VoScript v0.7.3 — Runtime stability hotfixes
This patch release focuses on runtime resilience for warm deployments, Chinese alignment prerequisites, and safer failure metadata.
Bug fixes
- Pyannote diarization and WeSpeaker embedding loading now prefer complete local Hugging Face snapshot caches when available, reducing unnecessary network access during warm deployments.
- The runtime disables the Hugging Face Xet/CAS path by default unless operators explicitly override it, avoiding a failure-prone download path in some environments.
- Docker and compose defaults add faster Hub metadata fallback behavior for slow or unreliable networks.
- Prompt-contaminated ASR repetition runs are filtered before diarization, punctuation, and artifact generation.
- The Docker base image is upgraded to PyTorch 2.6 so current transformers safety checks can load WhisperX's default Chinese alignment weights.
- Chinese alignment remains enabled by default, with explicit alignment policy configuration available for temporary operational fallback.
- Pyannote checkpoint loading scopes trusted metadata types to the model load call instead of using a process-global allowlist.
- Alignment failure metadata is sanitized and classified without exposing raw exception details.
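The last item, sanitized and classified failure metadata, can be sketched as a mapping from exception types to coarse categories. The category names and the shape of the returned record are hypothetical; the point is only that the raw exception message never reaches the result.

```python
def classify_alignment_failure(exc):
    """Map an exception to a coarse, non-sensitive failure record."""
    if isinstance(exc, MemoryError):
        category = "out_of_memory"
    elif isinstance(exc, (FileNotFoundError, ImportError)):
        category = "model_unavailable"
    elif isinstance(exc, TimeoutError):
        category = "timeout"
    else:
        category = "unknown"
    # str(exc) and the traceback are deliberately omitted: they can contain
    # file paths, hostnames, or other operator-specific details.
    return {"status": "failed", "reason": category}


try:
    raise FileNotFoundError("/srv/private/models/align.bin")
except Exception as exc:
    print(classify_alignment_failure(exc))
    # {'status': 'failed', 'reason': 'model_unavailable'}
```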
Deployment
- Rebuild the container image to pick up the PyTorch 2.6 base image.
- Existing model cache volumes remain compatible.
Compatibility
- Existing transcription results remain compatible.
- The `alignment` object is additive, and `words[]` remains optional.
VoScript 0.7.2
Highlights
- Adds architecture foundation layering across pipeline, providers, infra, application, and voiceprints.
- Hardens failure paths around uploads, artifact persistence, in-flight dedup, corrupt persisted results, and export names.
- Aligns FastAPI metadata with 0.7.2 and fixes Docker healthcheck behavior without adding curl to the runtime image.
- Cleans public release materials so local-only planning and validation assets stay out of the public repository.
Compatibility
- Existing HTTP endpoints and persisted status/result shapes remain compatible.
- Provider preset/API selection, streaming, and speaker memory remain future work, not part of the 0.7.2 public contract.
Validation
- CI passed: lint, unit/security tests, and dependency audit.
- Internal live validation covered API regression, overlap bench, batch transcription, and AS-norm enrollment/probe paths.
Docker
- Publishing this release triggers the existing Docker release workflow for GHCR and Docker Hub tags.
v0.7.0 — Speaker consolidation + ngram dedup
What's new
Bug fix
- Speaker cluster consolidation: multiple diarization clusters matching the same enrolled speaker are now merged into a single canonical label. Previously the same person could appear as separate speakers.
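A minimal sketch of the consolidation step, assuming each segment carries a diarization cluster id and that voiceprint matching has already produced a cluster-to-enrolled-speaker mapping. The names and data shapes are hypothetical, and using the first cluster seen as the canonical label is an assumption made for this example.

```python
def consolidate_clusters(segments, cluster_to_speaker):
    """Relabel segments so clusters matched to one enrolled speaker share a label.

    `segments` is a list of dicts with a "cluster" key.
    `cluster_to_speaker` maps a cluster id to an enrolled speaker id; clusters
    without a voiceprint match are simply absent from the mapping.
    """
    canonical = {}  # enrolled speaker id -> first cluster label seen for it
    merged = []
    for seg in segments:
        cluster = seg["cluster"]
        speaker = cluster_to_speaker.get(cluster)
        if speaker is None:
            label = cluster  # unmatched clusters keep their own label
        else:
            label = canonical.setdefault(speaker, cluster)
        merged.append({**seg, "label": label})
    return merged


segments = [{"cluster": "SPEAKER_00"}, {"cluster": "SPEAKER_01"}, {"cluster": "SPEAKER_02"}]
matches = {"SPEAKER_00": "alice", "SPEAKER_02": "alice"}
print([s["label"] for s in consolidate_clusters(segments, matches)])
# ['SPEAKER_00', 'SPEAKER_01', 'SPEAKER_00']
```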
Features
- `no_repeat_ngram_size` parameter: new optional field on `POST /api/transcribe` (default `0`). When ≥ 3, suppresses n-gram repetitions in the transcript (e.g. "like like like" → "like"). Non-integer values return 422.
- `params` field: completed results now include `no_repeat_ngram_size` in the `params` object.
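The documented rules for the parameter (default `0`, active from `3` upward, non-integers rejected) can be summarized in a short validation sketch. The helper names are hypothetical, and raising `ValueError` stands in for the HTTP 422 response; how the service treats negative values or the values 1 and 2 is not stated in these notes, so the sketch leaves them untouched.

```python
def parse_no_repeat_ngram_size(raw=None):
    """Validate the submitted value; ValueError marks where the API answers 422."""
    if raw is None:
        return 0  # documented default
    try:
        return int(raw)
    except (TypeError, ValueError):
        raise ValueError("no_repeat_ngram_size must be an integer") from None


def suppression_enabled(value):
    """Per the release notes, repetition suppression applies when the value is >= 3."""
    return value >= 3


print(parse_no_repeat_ngram_size(), suppression_enabled(0))     # 0 False
print(parse_no_repeat_ngram_size("3"), suppression_enabled(3))  # 3 True
```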
Tests
- 43 new E2E tests; suite: 84 collected, 78 pass, 6 expected skips.
Deployment
- `docker-compose.yml`: added `./app:/app` volume mount for hot-reload without image rebuild.
Docker
`docker pull mapleeve/voscript:0.7.0`
Full changelog: doc/changelog.en.md