feat(0.8.0): memory, skills, coverage release #25
Merged
- F: fastembed bge-reranker-v2-m3 cross-encoder stage in recall pipeline with 250 ms timeout + graceful RRF-only fallback; clx model fetch/status/list CLI; background prefetch from UserPromptSubmit once per process
- G: auto_recall.pin_recent_sessions opt-in injects last-N session summaries with current-session self-pin guard; new Storage::recent_session_summaries
- H: memory.auto_summarize opt-in Stop hook rolls up N-turn windows into AutoSummary snapshots via configured chat LLM; deterministic template fallback when LLM unavailable; new SnapshotTrigger::AutoSummary
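The timeout-plus-fallback behavior described in item F can be sketched as follows. This is a minimal illustration, not the project's implementation: it swaps in a plain `std::thread` and channel for the tokio-based path, and `Candidate`, `rerank_or_fallback`, and the closure-based reranker are hypothetical names introduced only for this example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// A recall candidate carrying its fused (RRF) score; higher is better.
/// Hypothetical type, used only in this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub id: u32,
    pub rrf_score: f64,
}

/// Runs `rerank` on a worker thread and waits at most `timeout` for its result.
/// If the reranker is too slow, panics, or returns `None`, the candidates are
/// returned in RRF-only order instead.
pub fn rerank_or_fallback<F>(
    candidates: Vec<Candidate>,
    timeout: Duration,
    rerank: F,
) -> Vec<Candidate>
where
    F: FnOnce(Vec<Candidate>) -> Option<Vec<Candidate>> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let for_worker = candidates.clone();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(rerank(for_worker));
    });

    match rx.recv_timeout(timeout) {
        Ok(Some(reranked)) => reranked,
        // Timeout, disconnected worker, or reranker failure: RRF-only ordering.
        _ => {
            let mut fallback = candidates;
            fallback.sort_by(|a, b| b.rrf_score.total_cmp(&a.rrf_score));
            fallback
        }
    }
}
```

Calling it with `Duration::from_millis(250)` mirrors the budget named above. A slow reranker leaves the caller with the RRF ordering rather than an error, which is the point of the graceful degradation.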
…iring
- I: workspace llvm-cov ignore-regex (event.rs, main.rs, runtime.rs) + 17 ratatui TestBackend + insta snapshots for dashboard/ui/detail.rs and dashboard/settings/render.rs
- J: cargo-mutants v27 configuration (mutants.toml) + weekly baseline workflow + PR-diff workflow + docs/mutation-testing.md
- K: 30-pair synthetic RAGAS golden set (no PHI, no user content) + criterion bench benches/recall_accuracy.rs over rrf_enabled true/false
- L: clx config-trust file-hash trustlist (parallel to existing PR #15 trust-mode); bypasses inert filter at project.rs:86 for trusted hashes; ~/.clx/trusted_configs.json with 0600 mode; 15 unit + 4 integration tests
- M: clx-hook main.rs slimmed 196 to 126 LoC, delegates to lib router; dashboard event loop now drives reducer via all DashboardEvent variants (Key, Resize, Tick, Quit); dead_code warnings on Resize/Tick/Quit gone
- Add rrf_enabled, rrf_k, time_decay_half_life_days, percentile_gate, reranker_enabled, reranker_timeout_ms to AutoRecallConfig with defaults
- Map AutoRecallConfig fields into RecallQueryConfig in hook subagent
- Attach FastembedReranker in hook and MCP recall when reranker_enabled=true
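To make the `rrf_k` and `time_decay_half_life_days` knobs concrete, here is a small sketch of the textbook formulas those names usually refer to: reciprocal rank fusion and exponential half-life decay. Whether the recall pipeline uses exactly these forms is an assumption, and `RecallKnobs` with its default values (60, 30 days) is a hypothetical stand-in, not the real `AutoRecallConfig`.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of three of the knobs named above.
/// The default values are placeholders chosen for this sketch.
#[derive(Debug, Clone)]
pub struct RecallKnobs {
    pub rrf_enabled: bool,
    pub rrf_k: f64,
    pub time_decay_half_life_days: f64,
}

impl Default for RecallKnobs {
    fn default() -> Self {
        Self {
            rrf_enabled: true,
            rrf_k: 60.0,
            time_decay_half_life_days: 30.0,
        }
    }
}

/// Reciprocal rank fusion: every ranked list contributes 1 / (k + rank)
/// to an item's score, with rank starting at 1.
pub fn rrf_fuse(rankings: &[Vec<u32>], k: f64) -> HashMap<u32, f64> {
    let mut scores = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (idx as f64 + 1.0));
        }
    }
    scores
}

/// Exponential decay multiplier: halves every `half_life_days`.
pub fn time_decay(age_days: f64, half_life_days: f64) -> f64 {
    0.5_f64.powf(age_days / half_life_days)
}
```

An item ranked first in two lists scores `2 / (k + 1)`, so a larger `rrf_k` flattens the gap between top and lower ranks. Multiplying the fused score by `time_decay` then favors recent snapshots.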
Replaces event_type='message' query that never matched production writes with a count of tool_events rows for the current session.
…s, plugin schema
- A (security): redact_json_value walks serde_json::Value recursively, redacting 20 sensitive key patterns case-insensitive; verify_model_dir_complete enforces required files non-zero before .ready sentinel; clx model fetch --force now acquires lock before destructive remove_dir_all
- B (concurrency): migrate_to_v7 adds UNIQUE INDEX tool_events_dedup_idx on (session_id, tool_name, IFNULL(target,''), window_end_unix/60); append_or_extend_tool_event rewritten as INSERT ... ON CONFLICT DO UPDATE for atomic upsert; stop_auto_summary re-reads last AutoSummary timestamp before write to skip duplicate snapshots on rapid Stop hooks
- C (async + deps): FastembedReranker::score lazy-loads inside the same spawn_blocking as the rerank call so tokio::time::timeout governs cold load; fastembed 4 -> 5 (ort 2.0 stable); criterion 0.5 -> 0.8
- D (plugin schema): plugin.json drops non-spec mcp_servers, adds author + license; all 6 SKILL.md strip non-spec version frontmatter field
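The dedup key from item B can be modeled in memory to show why two events in the same 60-second window collapse into one row. This is only an illustrative model of the index semantics, not the SQLite code: `ToolEvents`, `ToolEventRow`, and the `count` field are hypothetical, and a `HashMap` entry stands in for `INSERT ... ON CONFLICT DO UPDATE`.

```rust
use std::collections::HashMap;

/// Key mirroring the index columns: (session_id, tool_name, target-or-empty,
/// 60-second bucket). A missing target collapses to "" as IFNULL(target,'') does.
type DedupKey = (String, String, String, i64);

#[derive(Debug, Clone, PartialEq)]
pub struct ToolEventRow {
    pub window_end_unix: i64,
    /// Hypothetical column, present here only to make the "extend" path visible.
    pub count: u32,
}

#[derive(Default)]
pub struct ToolEvents {
    rows: HashMap<DedupKey, ToolEventRow>,
}

impl ToolEvents {
    fn key(session_id: &str, tool_name: &str, target: Option<&str>, ts_unix: i64) -> DedupKey {
        (
            session_id.to_string(),
            tool_name.to_string(),
            target.unwrap_or("").to_string(),
            ts_unix / 60, // integer division buckets timestamps into 60 s windows
        )
    }

    /// Inserts a new row, or extends the existing row when the key collides.
    pub fn append_or_extend(&mut self, session_id: &str, tool_name: &str, target: Option<&str>, ts_unix: i64) {
        self.rows
            .entry(Self::key(session_id, tool_name, target, ts_unix))
            .and_modify(|row| {
                row.window_end_unix = row.window_end_unix.max(ts_unix);
                row.count += 1;
            })
            .or_insert(ToolEventRow { window_end_unix: ts_unix, count: 1 });
    }

    pub fn get(&self, session_id: &str, tool_name: &str, target: Option<&str>, ts_unix: i64) -> Option<&ToolEventRow> {
        self.rows.get(&Self::key(session_id, tool_name, target, ts_unix))
    }

    pub fn row_count(&self) -> usize {
        self.rows.len()
    }
}
```

Because the lookup and the write are a single keyed operation, there is no window between "check for an existing row" and "insert", which is the race the unique index closes in the database.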
…Embedder
Recall pipeline Domain layer no longer imports Storage / LlmClient / EmbeddingStore. Two ports added in recall/ports.rs; RecallEngine moved to recall/engine.rs and depends on traits only. Concrete adapters live in storage/recall_repo.rs (StorageSnapshotRepo) and recall/adapters.rs (LlmQueryEmbedder). All call sites in clx-hook, clx-mcp, and the recall accuracy bench wire adapters at construction time. Public builder API preserved. Codex BLOCKER finding resolved.
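The ports-and-adapters layering described in this commit can be sketched roughly as below. The trait names `SnapshotRepo` and `QueryEmbedder` and the type `RecallEngine` come from the commit message; every method name and signature, the dot-product scoring, and the `InMemoryRepo` / `FixedEmbedder` test doubles are assumptions made for illustration.

```rust
/// Port: source of stored snapshots as (id, embedding) pairs.
/// The method shape is an assumption for this sketch.
pub trait SnapshotRepo {
    fn snapshots(&self) -> Vec<(u32, Vec<f32>)>;
}

/// Port: turns a query string into an embedding, or `None` if unavailable.
pub trait QueryEmbedder {
    fn embed(&self, query: &str) -> Option<Vec<f32>>;
}

/// The engine sees only the two traits, never a concrete storage or LLM type.
pub struct RecallEngine<'a> {
    repo: &'a dyn SnapshotRepo,
    embedder: &'a dyn QueryEmbedder,
}

impl<'a> RecallEngine<'a> {
    pub fn new(repo: &'a dyn SnapshotRepo, embedder: &'a dyn QueryEmbedder) -> Self {
        Self { repo, embedder }
    }

    /// Ranks snapshots by dot product with the query embedding (illustrative scoring).
    pub fn recall(&self, query: &str, limit: usize) -> Vec<u32> {
        let Some(q) = self.embedder.embed(query) else {
            return Vec::new();
        };
        let mut scored: Vec<(u32, f32)> = self
            .repo
            .snapshots()
            .into_iter()
            .map(|(id, v)| (id, v.iter().zip(&q).map(|(a, b)| a * b).sum::<f32>()))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.into_iter().take(limit).map(|(id, _)| id).collect()
    }
}

/// Hypothetical in-memory adapters standing in for the concrete ones.
pub struct InMemoryRepo(pub Vec<(u32, Vec<f32>)>);

impl SnapshotRepo for InMemoryRepo {
    fn snapshots(&self) -> Vec<(u32, Vec<f32>)> {
        self.0.clone()
    }
}

pub struct FixedEmbedder(pub Option<Vec<f32>>);

impl QueryEmbedder for FixedEmbedder {
    fn embed(&self, _query: &str) -> Option<Vec<f32>> {
        self.0.clone()
    }
}
```

Wiring adapters at construction time, as the commit describes, means a test or benchmark can pass in-memory doubles like these without touching a database or an LLM.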
Fixes 47 distinct pedantic warnings across the 0.8.0 feature crates:
- Missing doc backticks around SQLite/HuggingFace/identifier names
- format! into String -> write!
- push_str single-char -> push
- Collapsed nested if-let chains
- map().unwrap_or() -> map_or()
- match-for-destructuring -> if-let / let-else
- must_use on builder methods
- Removed undeclared test-fixtures cfg gate in clx-hook router
Documented allow attributes on dispatch-shape functions, owned-PathBuf clap arms, and the App API retained for forward dashboard wiring.
Behavior preservation verified: 1218 tests still pass, 0 fail.
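For readers unfamiliar with these lint categories, the snippet below shows the "after" shape of four of the rewrites listed above. The functions are hypothetical examples written for this note, not code from the crates.

```rust
use std::fmt::Write as _;

/// `write!` appends into the existing String instead of building a
/// temporary with `format!`; `push` replaces `push_str` for a single char.
pub fn render_tags(tags: &[&str]) -> String {
    let mut out = String::new();
    for tag in tags {
        // Writing to a String cannot fail, so the Result is discarded.
        let _ = write!(out, "[{tag}]");
        out.push(' ');
    }
    out
}

/// `map_or` replaces `map(..).unwrap_or(..)`.
pub fn label_len(label: Option<&str>) -> usize {
    label.map_or(0, str::len)
}

/// `let-else` replaces a `match` used only for destructuring.
pub fn first_word(line: &str) -> &str {
    let Some((head, _rest)) = line.split_once(' ') else {
        return line;
    };
    head
}
```

None of these change behavior, which is consistent with the unchanged test count reported above.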
Red/Green/Purple sequential review classified 13 findings:
- 4 DEFENDED (false-positive or existing test coverage)
- 4 PARTIAL (real but mitigated)
- 4 UNDEFENDED (real, no defense)
- 1 SUBSUMED
Three UNDEFENDED items are MUST-FIX in 0.8.0:
- F10 (HIGH): PostToolUse audit_log.command stored raw Bash and MCP commands with inline secrets. The pre_tool_use path wrapped through log_audit_entry (redact_secrets); post_tool_use bypassed it. Fix: redact_secrets(command) before AuditLogEntry::new in post_tool_use.rs.
- F4 (HIGH): format_recall_context interpolated stored snapshot summary/key_facts verbatim into the historical-context wrapper. A malicious clx_remember payload could close the tag and inject system-style instructions that persist across sessions. Fix: new sanitize_recall_text helper escapes < and > in both summary and key_facts paths. Regression test asserts the escaped form.
- F1 (MED): redact_secrets free-text scanner missed api_key = sk_... with whitespace around the separator, lowercase bearer tokens, and Authorization: Basic ... credentials. Fix: section 3 (case-insensitive bearer/basic scheme) runs before a new section 2b (whitespace-tolerant keyword scan that skips scheme tokens). Four regression tests.
Verification: cargo test --workspace = 1223 pass / 0 fail (+5 from Purple regressions). cargo clippy --workspace --all-targets -D warnings = exit 0. CHANGELOG documents both the fixes and the five Purple findings deferred to 0.8.1 (F3, F5, F7, F8, F9) with rationale per finding.
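The F4 fix can be illustrated with a minimal escaping helper. The name `sanitize_recall_text` comes from the commit message, but the body below is a guess at its shape: the choice of `&lt;` / `&gt;` as the escaped form is an assumption, since the message only says that `<` and `>` are escaped.

```rust
/// Escapes angle brackets so stored text cannot close or open a wrapper tag.
/// Sketch only: the escaped form (`&lt;` / `&gt;`) is an assumption.
pub fn sanitize_recall_text(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match ch {
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            other => out.push(other),
        }
    }
    out
}
```

Applied to both the summary and key_facts paths, a payload that tries to close the wrapper tag arrives as inert text instead of markup, because no literal `<` or `>` survives.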
…e findings now in 0.8.0)
…mit count) per final gate
…w-identical wiring)
…ompt, not per-item loop)
…t-in (eliminates macOS prompt)
… band, L1 timeout, denial_count)
…omplete F8, fixes /dev/zero unbounded-read DoS)
…eak (harden_command forces model-fetch dryrun)
…rage denominator policy
…r / 2 false-pos) + CHANGELOG Known issues
…ess LLVM auto-resolve & no-fail-fast
…newer-DB guard, I-R2 Stop contract fixture
…LI (+43 hermetic tests)
…hain_trust.rs) from instrumented denominator, documented rationale in Cargo.toml
…nstrumented 83.02% -> 85.68%+)
…ath-guard (fixes macOS symlinked-HOME false-reject)
…nator, 97% deferred to 0.8.1 (injectable provider harness)
…(eliminate residual leak pattern)
…ig path
- redact_title_config_path: longest-known-prefix match handles the ratatui panel-title truncation (the prior full-string match only worked on the dev HOME, so CI on a different HOME was red)
- canonicalize the volatile tail to fixed width; gated so only the title line changes (sessions/audit/rules/detail snaps byte-identical)
- regenerate 9 affected snapshots; verified green on short and long clean HOME and cargo insta test --workspace --check (CI Coverage gate)
Summary
The "memory and quality" release. 17 atomic commits, 1218 tests passing, 0 failing.
Eight user-asked features delivered:
- plugin/.claude-plugin/ (2026 schema)
- tool_events table (60s windowed dedup, atomic UPSERT)
- clx config-trust file-hash trustlist for project configs
Architecture (Wave 4b)
Recall pipeline Domain layer no longer imports Storage / LlmClient / EmbeddingStore directly. Two new ports:
- recall::ports::SnapshotRepo: implemented by storage::recall_repo::StorageSnapshotRepo
- recall::ports::QueryEmbedder: implemented by recall::adapters::LlmQueryEmbedder
RecallEngine depends only on trait references. Layering proof in recall/mod.rs docstring.
Decisions (user-approved 2026-05-16)
- clx model fetch background prefetch + 250 ms graceful degradation)
- tool_events retention: retention.tool_events_days
Two-round comprehensive review
Round 1 (Codex) found and self-fixed 2 BLOCKERs:
- rrf_enabled, rrf_k, etc.): production callers always used Default
- event_type='message' rows that are never written; fix counts tool_events
Round 2 (4 parallel agents) closed all remaining findings:
- clx model fetch
- clx model fetch --force lock ordering inversion
- tool_events upsert race (schema v7 UNIQUE INDEX + INSERT ... ON CONFLICT DO UPDATE)
- stop_auto_summary double-write race
- fastembed 4 → 5.13.4 (ort 2.0 stable)
- criterion 0.5 → 0.8.2
- mcp_servers: {}, drop SKILL version:, add author/license)
Schema migrations
- tool_events table + 2 indexes (additive, no destructive changes)
- UNIQUE INDEX tool_events_dedup_idx for atomic upsert (additive)
Test plan
- cargo test --workspace passes locally on macOS (verified: 1218 pass, 0 fail, 10 ignored)
- cargo check --workspace clean (verified)
- clx recall "previous discussion on Azure backend" returns hits within 500 ms p95
- clx model fetch downloads bge-reranker-v2-m3 (568 MB) and writes .ready only after integrity gate
- clx config-trust add <repo>/.clx/config.yaml registers hash; edit the file, recall re-trusts only after clx config-trust add rerun
- claude /skills list)
- clx maintenance trim removes tool_events rows older than 30d
- memory.auto_summarize.enabled: true writes one AutoSummary snapshot per 5 turns
Deferred to 0.8.1 (documented)
- summarize.rs Domain port extraction
- policy/{rules,llm}.rs Domain port extraction
- query_percentile_gate duplication (subagent + mcp/recall)
Risks
- bge-reranker-v2-m3 is a 568 MB one-time download; first recall after upgrade may use RRF-only until prefetch completes; set auto_recall.reranker_enabled: false to opt out
- plugin/scripts/migrate.sh; old plugin/skills/ removed in 0.9.0
- fastembed v5 ships with ort 2.0.0-rc.12 transitively (waiting on GA-stable in a future fastembed point release)
Branch state