Refactor architecture validation gates #43

Open
MapleEve wants to merge 24 commits into main from refactor/v0.8.5-architecture-convergence

@MapleEve

Owner

Summary

  • Fix WhisperX alignment cache-only mode so imported HF/Transformers offline flags are honored.
  • Harden release validation: strict live E2E fallback, forced Rust health smoke, heavy gate synchronize trigger.
  • Align API docs with public alignment/admission contracts.
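
The cache-only fix in the first bullet concerns libraries that read their offline flag once, at import time: setting an environment variable afterwards does not change a constant the module has already cached. Below is a minimal sketch of the general technique, not the PR's implementation. `sync_offline_flags` and the `CACHED_OFFLINE_FLAGS` map are hypothetical, and the cached attribute name is an assumption to check against the installed library versions. The demo uses a stand-in module so it runs without any ML dependencies.

```python
import os
import sys
import types

# Hypothetical map: already-imported module -> flag it caches at import time.
# The attribute name is an assumption for this sketch, not a verified contract.
CACHED_OFFLINE_FLAGS = {
    "huggingface_hub.constants": "HF_HUB_OFFLINE",
}


def sync_offline_flags(offline: bool, flags=CACHED_OFFLINE_FLAGS) -> list[str]:
    """Set the offline env vars and patch flags cached by imported modules."""
    value = "1" if offline else "0"
    os.environ["HF_HUB_OFFLINE"] = value
    os.environ["TRANSFORMERS_OFFLINE"] = value

    patched = []
    for module_name, attr in flags.items():
        # sys.modules.get never triggers a fresh import.
        module = sys.modules.get(module_name)
        if module is not None and hasattr(module, attr):
            setattr(module, attr, offline)
            patched.append(f"{module_name}.{attr}")
    return patched


# Demo with a stand-in module whose flag was frozen at "import" time.
fake = types.ModuleType("fake_hub_constants")
fake.HF_HUB_OFFLINE = False
sys.modules["fake_hub_constants"] = fake

patched = sync_offline_flags(True, {"fake_hub_constants": "HF_HUB_OFFLINE"})
print(patched, fake.HF_HUB_OFFLINE)
```

Looking modules up in `sys.modules` rather than importing them keeps the helper from loading heavy dependencies just to flip a flag.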

Validation

  • ruff check app/ tests/e2e/test_api_core.py tests/unit/test_provider_registry.py tests/unit/test_kernel_release_gates.py --ignore E501
  • ruff format --check app/ tests/unit/test_provider_registry.py tests/unit/test_kernel_release_gates.py
  • python voscript-api/scripts/public_release_scan.py --root .
  • python voscript-api/scripts/docs_code_drift_gate.py --root . --check
  • python voscript-api/scripts/architecture_gate.py --root . --check
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 .venv/bin/python -m pytest tests/unit/ tests/test_security.py tests/test_voiceprint_db.py tests/test_job_service.py -p pytest_cov --tb=short --no-header --cov=app --cov-report=term --cov-fail-under=90
  • PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 .venv/bin/python -m pytest tests/e2e/test_api_core.py --collect-only -q
  • cargo fmt --manifest-path crates/voscript_core/Cargo.toml -- --check && cargo clippy --manifest-path crates/voscript_core/Cargo.toml --features python-bindings --all-targets -- -D warnings && cargo test --manifest-path crates/voscript_core/Cargo.toml

Remote ai-wan deployment is running image tag voscript:6725b41c540e-aiwan-full-rust with RUST_KERNEL_MODE=required.
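
For readers unfamiliar with a fail-closed kernel mode, here is a rough sketch of what `RUST_KERNEL_MODE=required` could mean at load time. Everything except the variable name and the `required` value is an assumption: `load_kernel` and `KernelUnavailableError` are hypothetical, the importable module name `voscript_core` is guessed from the crate path, and any non-`required` value (such as `off` below) is illustrative only.

```python
import importlib
import os
from collections.abc import Mapping


class KernelUnavailableError(RuntimeError):
    """Hypothetical error: the kernel is required but cannot be imported."""


def load_kernel(module_name: str = "voscript_core",
                env: Mapping[str, str] = os.environ):
    """Return the kernel module, or None when a fallback is permitted."""
    mode = env.get("RUST_KERNEL_MODE", "required")
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        if mode == "required":
            # Fail closed: refuse to continue on a silent Python fallback.
            raise KernelUnavailableError(
                f"{module_name} is required but could not be imported"
            ) from exc
        return None  # assumed non-required mode: caller uses the Python path


# Demo with a module name that does not exist.
try:
    load_kernel("no_such_kernel_module", {"RUST_KERNEL_MODE": "required"})
except KernelUnavailableError as err:
    print("refused to start:", err)
```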

@github-actions


👍 @MapleEve

Thank you for raising your pull request and contributing to VoScript.
Please make sure you have followed our contributing guidelines. We will review it as soon as possible.
If you encounter any problems, please feel free to connect with us.

@codecov

codecov Bot commented Jun 13, 2026


Codecov Report

❌ Patch coverage is 90.20568% with 100 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.52%. Comparing base (2353652) to head (022b9df).
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
app/application/transcription_submission.py 80.82% 37 Missing ⚠️
app/infra/transcription_records.py 91.50% 9 Missing ⚠️
app/application/transcription_records.py 95.00% 8 Missing ⚠️
app/infra/job_persistence.py 53.84% 6 Missing ⚠️
app/pipeline/orchestrator.py 60.00% 6 Missing ⚠️
app/application/admission.py 95.83% 3 Missing ⚠️
app/application/transcription_jobs.py 80.00% 2 Missing ⚠️
app/postprocess/alignment.py 95.45% 2 Missing ⚠️
app/providers/artifacts/__init__.py 33.33% 2 Missing ⚠️
app/providers/asr/__init__.py 50.00% 2 Missing ⚠️
... and 14 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #43      +/-   ##
==========================================
- Coverage   90.90%   90.52%   -0.39%     
==========================================
  Files          84       95      +11     
  Lines        3696     4359     +663     
==========================================
+ Hits         3360     3946     +586     
- Misses        336      413      +77     
Flag Coverage Δ
unit 90.52% <90.20%> (-0.39%) ⬇️
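
The figures in the report can be rechecked from the hit and line counts in the diff table. The sketch below reproduces them under two assumptions. First, the displayed percentages are consistent with flooring to two decimals; that is an observation from these numbers, not a statement about how Codecov computes them. Second, the patch size of 1021 lines is derived here from "100 lines missing" and the 90.20568% patch coverage; the report does not state it.

```python
import math


def pct_floor(value: float, digits: int = 2) -> float:
    """Floor a percentage to the given number of decimals."""
    scale = 10 ** digits
    return math.floor(value * scale) / scale


# Project coverage = hits / lines, taken from the diff table.
base = 3360 / 3696 * 100   # main
head = 3946 / 4359 * 100   # this PR
delta = head - base

# Patch coverage: 100 missing lines out of a derived total of 1021.
patch_total = 1021
patch_pct = (patch_total - 100) / patch_total * 100

print(pct_floor(base), pct_floor(head), pct_floor(delta))
print(round(patch_pct, 5))
```

The counts are internally consistent as well: 3360 + 336 = 3696 and 3946 + 413 = 4359, and the line delta 4359 - 3696 = 663 equals the sum of the hit and miss deltas (586 + 77).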

Flags with carried forward coverage won't be shown.

@MapleEve MapleEve marked this pull request as ready for review June 13, 2026 13:41
Copilot AI review requested due to automatic review settings June 13, 2026 13:41
@claude

claude Bot commented Jun 13, 2026


Claude encountered an error after 1m 38s.


Code Review in Progress

Reviewing PR #43: Refactor architecture validation gates

Tasks

  • Read REVIEW.md guidelines
  • Review security and privacy (tokens, paths, logs, IDs)
  • Review model lifecycle (lazy loading, GPU semaphore, CUDA cache)
  • Review GPU/CPU fallback behavior
  • Review API compatibility (endpoints, contracts, auth)
  • Review test coverage for new functionality
  • Verify English/Chinese documentation synchronization
  • Review architecture changes (admission, submission, records)
  • Review CI/CD and release gate changes
  • Post findings

Copilot AI left a comment


Pull request overview

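This PR centers on two themes: architecture ring/boundary enforcement and hardened release/CI gates. It refactors the layering of transcription upload and persistence reads/writes (application/infra) and switches the Rust kernel's default posture to required. It also fills in consistency validation for the alignment (WhisperX) offline/cache mode, resource budgets (disk, concurrency, over-long audio), and public docs/contracts.

Changes:

  • Moves transcription upload submission, record reading/export, and the admission (active/in-flight/disk) policy into app/application/*, and adds or strengthens the infra repositories and the runtime job API.
  • Strengthens the duration-budget short-circuit paths for alignment, denoising, and embedding, and fixes a WhisperX cache-only case in which already-imported HF/Transformers modules could still reach the network.
  • Hardens the CI/release workflows: introduces exact source ref resolution, adds architecture and docs drift gates, re-runs the heavy gate (including on synchronize), and moves the default runtime posture and docs to 0.8.5 / RUST_KERNEL_MODE=required.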
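
To make the admission (active/in-flight/disk) policy mentioned above concrete, here is a small sketch of how such a check might look. It is not the code in app/application/admission.py: `AdmissionLimits`, `AdmissionRejected`, and `check_admission` are hypothetical names, and the exact meaning of "active" versus "in-flight" is left to the caller, who supplies the counts. The per-file summary suggests the API reports a rejection as a 503.

```python
import shutil
from dataclasses import dataclass


class AdmissionRejected(Exception):
    """Hypothetical typed rejection; an API layer could map it to HTTP 503."""


@dataclass(frozen=True)
class AdmissionLimits:
    max_active: int       # jobs currently being processed
    max_in_flight: int    # uploads accepted but not yet finished
    min_free_bytes: int   # free disk space that must remain available


def check_admission(limits: AdmissionLimits, *, active: int,
                    in_flight: int, free_bytes: int) -> None:
    """Raise AdmissionRejected if any budget is exhausted; otherwise return."""
    if active >= limits.max_active:
        raise AdmissionRejected("too many active jobs")
    if in_flight >= limits.max_in_flight:
        raise AdmissionRejected("too many in-flight submissions")
    if free_bytes < limits.min_free_bytes:
        raise AdmissionRejected("insufficient free disk space")


limits = AdmissionLimits(max_active=2, max_in_flight=4, min_free_bytes=1)
try:
    check_admission(limits, active=0, in_flight=0,
                    free_bytes=shutil.disk_usage(".").free)
    print("admitted")
except AdmissionRejected as err:
    print("rejected:", err)
```

Keeping the policy free of HTTP types lets the router decide the status code, in line with the typed-error direction described for this PR.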

Reviewed changes

Copilot reviewed 100 out of 101 changed files in this pull request and generated 2 comments.

File Description
tests/unit/test_voiceprint_db.py Update the concurrent dedup test to use the new application/infra boundary
tests/unit/test_transcription_submission.py Add failure-path and ring-constraint tests for the upload submission use case
tests/unit/test_transcription_records.py Add tests for record reading/export/correction and the infra repository boundary
tests/unit/test_provider_registry.py Verify the separation of responsibilities between the providers facade and pipeline.registry, and the override semantics
tests/unit/test_provider_capabilities.py Add tests for static capability metadata and alignment capability ownership
tests/unit/test_pipeline_runner.py Add runner tests for capability preflight and the public-safe metadata contract
tests/unit/test_kernel_release_gates.py Harden release/heavy gate assertions and the default Rust required posture
tests/unit/test_kernel_bridge.py Update the Rust bridge version and the rollback semantics tests
tests/unit/test_job_runtime.py Add tests for the runtime job store API and admission counting/atomicity
tests/unit/test_docs_code_drift_gate.py Add docs/code drift gate regression tests
tests/unit/test_config_defaults.py Assert the default RUST_KERNEL_MODE=required and compose/docs consistency
tests/unit/test_audio_layers.py Add tests for audio duration metadata and denoise/normalize typed error behavior
tests/unit/test_artifact_status_schema_contracts.py Move the status helper down to infra; add pipeline metadata and public alignment normalization contract tests
tests/unit/test_api_route_coverage.py Update API behavior coverage: records/submission, admission 503, input validation, and voiceprints path error mapping
tests/unit/test_admission.py Add unit tests for the admission policy (active/in-flight/disk)
tests/test_voiceprint_db.py Explicitly pin the legacy DB tests to Python semantics (Rust paths disabled)
tests/test_security.py Security regression: module reload scope, error message assertions, and submission/jobs migration
tests/e2e/test_api_core.py E2E no longer falls back by default (requires an explicit switch)
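docker-compose.yml Add admission/budget and alignment/embedding/denoise budget env vars; default to Rust required
doc/security.zh.md Bump documentation version to 0.8.5
doc/security.en.md Bump documentation version to 0.8.5
doc/quickstart.zh.md Update the noisereduce gate wording and budget item descriptions
doc/quickstart.en.md Update the noisereduce gate wording and budget item descriptions
doc/configuration.zh.md Bump the configuration index to 0.8.5; document admission/budget settings and the Rust required default
doc/configuration.en.md Bump the configuration index to 0.8.5; document admission/budget settings and the Rust required default
doc/changelog.zh.md Add the 0.8.5 changelog (Rust required default, CI gates, stricter E2E, etc.)
doc/changelog.en.md Add the 0.8.5 changelog (Rust required default, CI gates, stricter E2E, etc.)
doc/api.zh.md API docs: add audio download and voiceprints get, admission 503 wording, and alignment field adjustments
doc/api.en.md API docs: add audio download and voiceprints get, admission 503 wording, and alignment field adjustments
crates/voscript_core/src/lib.rs Update the Rust package version assertion to 0.8.5
crates/voscript_core/Cargo.toml Bump the Rust crate version to 0.8.5
CLAUDE.md Add repository writing-language and evidence-retention rules
Cargo.lock Update the Rust crate version in the lock file to 0.8.5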
app/providers/voiceprint_match/__init__.py providers facade forbids named selection (default only) and points callers to PipelineRunner/registry
app/providers/vad/__init__.py Same as above: facade allows only default
app/providers/punc/__init__.py Same as above: facade allows only default
app/providers/postprocess/__init__.py Same as above: facade allows only default
app/providers/normalize/default.py normalize timeout becomes a typed error (decoupled from FastAPI)
app/providers/normalize/__init__.py normalize facade allows only default
app/providers/kernel_bridge/release_gates.py Update the release gate set/wording and the Python owner paths
app/providers/ingest/__init__.py ingest facade allows only default
app/providers/enhance/default.py Add a denoise duration-budget short-circuit to avoid fully loading over-long audio
app/providers/enhance/__init__.py enhance facade allows only default
app/providers/embedding/default.py embedding adds a duration budget for full preload; long audio is read segment by segment, per turn
app/providers/embedding/__init__.py embedding facade allows only default
app/providers/diarization/default.py WhisperX cache-only fix (sync module offline flags); alignment duration-budget short-circuit
app/providers/diarization/__init__.py diarization facade allows only default
app/providers/capabilities.py Introduce static capability metadata, alias normalization, and a NotFound error type
app/providers/asr/__init__.py asr facade allows only default
app/providers/artifacts/default.py Normalize alignment metadata to a public-safe form before writing it into artifacts
app/providers/artifacts/__init__.py artifacts facade allows only default
app/providers/_registry.py Add a facade selection guard (rejects provider_name != default)
app/providers/__init__.py providers package switches to lazy exports to avoid re-exporting pipeline registry helpers
app/postprocess/alignment.py Extract pure alignment post-processing helpers (used by diarization)
app/pipeline/step_keys.py Extract canonical step key/alias normalization
app/pipeline/stages/voiceprint_match/__init__.py Stage now calls resolve_provider directly and invokes the provider interface
app/pipeline/stages/vad/__init__.py Same as above: stage calls resolve_provider directly
app/pipeline/stages/punc/__init__.py Same as above: stage calls resolve_provider directly
app/pipeline/stages/postprocess/__init__.py Same as above: stage calls resolve_provider directly
app/pipeline/stages/normalize/__init__.py Same as above: stage calls resolve_provider directly and uses a contract request
app/pipeline/stages/ingest/__init__.py Same as above: stage calls resolve_provider directly
app/pipeline/stages/enhance/__init__.py Same as above: stage calls resolve_provider directly and uses a contract request
app/pipeline/stages/embedding/__init__.py Same as above: stage calls resolve_provider directly and uses a contract request
app/pipeline/stages/diarization/alignment.py Old path becomes a compatibility re-export
app/pipeline/stages/diarization/__init__.py diarization stage normalizes alignment metadata to a public-safe subset
app/pipeline/stages/asr/__init__.py Stage calls resolve_provider directly and uses a contract request
app/pipeline/stages/artifacts/__init__.py Stage calls resolve_provider directly and invokes provider.build
app/pipeline/runner.py Add capability preflight/skip metadata; call infra cleanup via dynamic import
app/pipeline/registry.py Normalize with step_keys and add is_provider_override
app/pipeline/orchestrator.py Break the static pipeline->infra/provider dependency edge via dynamic import
app/pipeline/errors.py Extract pipeline lookup errors into a standalone module
app/pipeline/contracts/requests.py provider/stage key normalization now uses step_keys
app/pipeline/contracts/normalize.py Add normalize typed errors
app/pipeline/contracts/metadata.py Add the pipeline metadata ownership and public-safe alignment metadata normalization contract
app/pipeline/contracts/errors.py Compatibility re-export to the new errors module
app/pipeline/contracts/__init__.py Update aggregated exports: add metadata/normalize errors, remove the direct status helper export
app/infra/transcription_records.py Add a filesystem repository for transcription records (status/result/audio)
app/infra/job_status.py status payload contract helper moved down to infra
app/infra/job_runtime.py runtime job store API, in-flight/active admission primitives, and counting helpers
app/infra/job_persistence.py Add a public atomic-write adapter and status write/discard helpers
app/infra/audio/paths.py Path validation switches from HTTPException to typed errors
app/infra/audio/metadata.py Add an audio duration metadata read helper
app/infra/audio/errors.py Add typed errors for audio paths
app/infra/audio/__init__.py Export the new metadata/errors
app/Dockerfile Require the wheel by default (fail closed when missing) and clarify rollback semantics
app/config.py Bump version to 0.8.5, default to Rust required, and add admission/budget config
app/application/transcription_records.py Add application records use cases (status/list/audio/export/correction)
app/application/transcription_jobs.py Job status writes go through the infra API; admission is released in finally
app/application/admission.py Add the admission policy implementation (active/in-flight/disk)
app/api/routers/voiceprints.py Map audio path typed errors to a 400 HTTPException
.github/workflows/rust-foundation-heavy.yml heavy gate adds resolve-source, triggers on PR synchronize, and enforces an exact ref
.github/workflows/release.yml Split and harden the release workflow: resolve-source, lint/unit/security, wheel artifact, docker smoke, exact-ref build/push
.github/workflows/ci.yml Add architecture/docs drift gates and update the pip-audit ignore list
.env.example Default to Rust required; add admission/budget and alignment/embedding/denoise budget env vars
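
Several rows above describe a duration-budget short-circuit: read the audio duration from metadata first, and skip the expensive step when the file exceeds the budget instead of loading it in full. The sketch below illustrates that pattern with the standard-library `wave` module. It is an illustration under assumptions, not the provider code: `wav_duration_seconds`, `denoise_with_budget`, and the result dictionary shape are hypothetical, and the real helpers may read other formats.

```python
import wave
from typing import Callable


def wav_duration_seconds(path: str) -> float:
    """Read duration from the WAV header without loading the samples."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def denoise_with_budget(path: str, max_seconds: float,
                        denoise: Callable[[str], object]) -> dict:
    """Run `denoise` only when the audio fits the duration budget."""
    duration = wav_duration_seconds(path)
    if duration > max_seconds:
        # Short-circuit: the expensive step is never invoked.
        return {"skipped": True, "reason": "duration_budget",
                "duration": duration}
    return {"skipped": False, "output": denoise(path)}
```

A caller might pass the budget from configuration, for example `denoise_with_budget("meeting.wav", 3600.0, run_noisereduce)`, where `run_noisereduce` stands in for whatever function performs the denoising.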


Comment on lines +164 to +172
posix_path = PurePosixPath(value)
windows_path = PureWindowsPath(value)
if (
    value in {".", ".."}
    or posix_path.is_absolute()
    or windows_path.is_absolute()
    or posix_path.name != value
    or windows_path.name != value
):
}
transcriptions._write_status(job_id, "completed", filename=audio_path.name)
out_dir = transcriptions.TRANSCRIPTIONS_DIR / job_id
job_persistence._write_status(job_id, "completed", filename=audio_path.name)
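
The first quoted hunk checks that a value is a bare filename under both POSIX and Windows path rules. A self-contained version of that condition might look like the sketch below. The condition itself is copied from the hunk; the function name `require_bare_filename` and the error type `UnsafeFilenameError` are hypothetical, since the hunk does not show what happens when the check fails.

```python
from pathlib import PurePosixPath, PureWindowsPath


class UnsafeFilenameError(ValueError):
    """Hypothetical typed error for values that are not bare filenames."""


def require_bare_filename(value: str) -> str:
    """Return `value` unchanged if it is a plain filename on both platforms."""
    posix_path = PurePosixPath(value)
    windows_path = PureWindowsPath(value)
    if (
        value in {".", ".."}
        or posix_path.is_absolute()
        or windows_path.is_absolute()
        # A separator or drive prefix makes .name differ from the input.
        or posix_path.name != value
        or windows_path.name != value
    ):
        raise UnsafeFilenameError(f"not a bare filename: {value!r}")
    return value


for candidate in ["speaker.wav", "../speaker.wav", "dir\\speaker.wav", "C:speaker.wav"]:
    try:
        print("ok:", require_bare_filename(candidate))
    except UnsafeFilenameError as err:
        print("rejected:", err)
```

Parsing with both pure path flavors matters because a backslash is an ordinary filename character on POSIX but a separator on Windows, so a check under only one flavor would let the other platform's traversal forms through.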