Add aim-sot Source-of-Truth subsystem#187
Merged
5d3e6ad to
79ded32
First build step of the aim-sot source-of-truth subsystem: the registry contract that later steps and every user's registry depend on.
- schema/registry.schema.json (JSON Schema draft-07) for .sot/registry.yaml: 6 required core fields (id, kind, boundary_type, sot_location, owner, description), kind/boundary_type/status enums, provenance and pointer-link fields. additionalProperties:false at document and entry level enforces the no-copy / no-machine-field invariants on the committed registry.
- templates/: user-facing registry.yaml + README starters; no project data, illustrative examples only.
- SKILL.md stub; consult/detect-propose/verify modes land in later steps.
- 23 contract tests: schema structure, enum/required validation, instance-level rejection of unknown fields, and template validity.
Second build step. Adds the consult engine over .sot/registry.yaml.
- Subcommands list/get/where/who/drift; --json and --registry flags.
- Registry resolution: --registry override -> git-root -> parent-walk (non-git fallback), for portability across project layouts.
- Read-only: reads the registry only; light structural validation (YAML parse + entries-is-list). Full schema validation is deferred to verify mode; resilient to a malformed sibling entry.
- _load_entries() is the single seam for the Item-3 derived memory cache.
- run-with-env.sh (Pattern B) invocation documented in SKILL.md.
- 48 tests incl. git-root/parent-walk resolution and a full main()-level read-only guarantee.
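The resolution order described above (explicit override, then git root, then parent walk) can be sketched as follows. This is a minimal illustration under stated assumptions, not the shipped consult code: the function names `resolve_registry` and `_git_root` are hypothetical, and the sketch assumes the git-root step only wins when a registry file actually exists there.

```python
import subprocess
from pathlib import Path
from typing import Optional

REGISTRY_RELPATH = Path(".sot") / "registry.yaml"


def _git_root(start: Path) -> Optional[Path]:
    """Return the git top-level directory for `start`, or None outside a repo."""
    try:
        proc = subprocess.run(
            ["git", "-C", str(start), "rev-parse", "--show-toplevel"],
            capture_output=True, text=True, timeout=5, check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # git missing or hung: fall through to the parent walk
    top = proc.stdout.strip()
    return Path(top) if proc.returncode == 0 and top else None


def resolve_registry(start: Path, override: Optional[Path] = None) -> Optional[Path]:
    """Hypothetical resolver: --registry override -> git root -> parent walk."""
    if override is not None:
        return override
    root = _git_root(start)
    if root is not None and (root / REGISTRY_RELPATH).is_file():
        return root / REGISTRY_RELPATH
    # Non-git fallback: walk upward from the starting directory.
    for directory in (start, *start.parents):
        candidate = directory / REGISTRY_RELPATH
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` rather than raising leaves the "no registry" decision to the caller, which suits a read-only tool that has to behave sensibly in projects that have not opted in.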
…very Move test_registry_schema.py and test_sot_consult.py from co-located _ai-memory/skills/aim-sot/tests/ (invisible to CI's pytest tests/, testpaths=["tests"]) to top-level tests/, matching the repo convention. Convert the consult test import to the importlib.util.spec_from_file_location pattern; repoint the schema test data paths. Behavior-preserving, no test-logic changes; 48 tests pass.
Wave-1 Item 3. Adds the detect-propose mode: hybrid auto-discovery (manifest->component, top-dir->path, ADR->concern; semantic fields human-owned), the four-type drift taxonomy (location, temporal-staleness, content-hash, declaration-vs-reality/K1, local-hash only), and propose-only output (root-cause + impact; never writes .sot/registry.yaml). Adds a per-install drift cache (atomic write + advisory lock + 7-day TTL throttle, bypassed on a registry edit while preserving hash baselines) and a derived memory cache that reindexes registry entries into the conventions collection (MemoryType.SOT_ENTRY); consult reads through it with a graceful file fallback. Stateless first-run candidate cap. Adds MemoryType.SOT_ENTRY and updates dependent type-count/coverage assertions. Unit tests cover each drift type, the never-writes invariant, cache atomicity/lock/TTL, and reindex determinism.
Add aim_sot_verify.py implementing the BP-024 verification taxonomy (schema, referential, completeness, content checks) over .sot/registry.yaml, emitting a structured PASS/CONDITIONAL/FAIL verdict. Structural checks are driven from registry.schema.json. URL resolution and drift-check execution are opt-in; owner validation uses CODEOWNERS when present. Adds the verify mode entry in SKILL.md and a hermetic test suite.
Propose-only end-of-turn hook that invokes the detect-propose engine and surfaces drift/new-candidate counts to stderr; ships unregistered (opt-in). Also corrects the SKILL.md invocation blocks to pass full script paths to run-with-env.sh.
…ve 2)
Per-CLI propose-only trigger adapters that invoke the aim-sot detect-propose engine on each CLI's end-of-session event, replicating the Claude Stop hook:
- codex: Stop (src/memory/adapters/codex/sot_drift.py)
- cursor: PreCompact (src/memory/adapters/cursor/sot_drift.py)
- gemini: PreCompact (native PreCompress; src/memory/adapters/gemini/sot_drift.py)
Each adapter normalizes its native stdin via the existing per-CLI schema, gates on the presence of .sot/registry.yaml (opt-in), and runs the engine in propose-only mode via run-with-env.sh. Fail-open throughout; never writes any committed file. Ships unregistered per the portability rule. 30 hermetic tests.
Add manual-enablement instructions for the Codex/Cursor/Gemini sot_drift trigger adapters (they ship unregistered per the portability rule), mirroring the Claude Stop hook opt-in section, with each CLI's exact hook-config format. Also reword the Modes section to drop internal build-sequencing references that don't belong in user-facing skill content.
Harden the aim-sot detect-propose engine so an unconfirmed drift can no longer advance the baseline it is meant to flag, and bound its filesystem and store interactions.
Baseline / throttle (5a cache):
- Stat the artifact (mtime/size) before honouring the 7-day TTL skip and re-check when it changed, closing the window where an artifact-only edit was invisible for up to a week.
- Hold last_verified_sha on detected drift instead of advancing it, so the proposal re-fires until resolved; advance only when the entry is clean or the registry's human last_verified is newer (a re-confirmation). A first-sighting entry records drift_status="unverified", not "clean".
Reindex (5b cache):
- Load the registry and construct storage before deleting existing points (prepare-then-replace), and keep existing points on a transiently-empty or unparseable registry. Return an explicit success/failure result and advance registry_sha only on success. Serialize the reindex per project_id.
Discovery / safety:
- Derive the project root only from a conforming <root>/.sot/registry.yaml and skip discovery for a flat registry path, preventing an unbounded scan.
- Prune skipped directories during traversal and cap the walk; hash files in chunks; fsync before replace and clean up the temp file on a failed write.
Companion tests cover edit-within-TTL detection, baseline-hold then re-fire then human-reconfirm re-baseline, cold-start unverified, reindex failure/empty preservation and idempotency, lock release on error, and the flat-registry no-scan guard.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
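Two of the safety measures named above, chunked hashing and "fsync before replace, clean up the temp file on a failed write", follow a well-known pattern that can be sketched with the standard library alone. The helper names `sha256_chunked` and `atomic_write_json` are hypothetical, the advisory lock is omitted, and the engine's real cache format is not shown in this PR text.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def sha256_chunked(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks so large artifacts are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write JSON via a sibling temp file, fsync it, then atomically replace."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(
        dir=str(path.parent), prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh)
            fh.flush()
            os.fsync(fh.fileno())  # data is on disk before the rename
        os.replace(tmp_name, path)  # atomic within one filesystem
    except BaseException:
        # A failed dump or replace must not leave a stray temp file behind.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Creating the temp file in the destination directory keeps `os.replace` on a single filesystem, which is what makes the swap atomic; a reader sees either the old record or the new one, never a partial write.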
K1 previously treated a missing or unreadable 5a drift baseline as a silent
PASS, so a registry whose content was never machine-confirmed could read as
verified. K1 now emits a skipped-no-baseline CONDITIONAL warning ("manual human
confirmation required") for cold-start (no cache record / drift_status=
"unverified"), baseline-loss, and resolution-failure (project_id unresolvable
while a drift cache exists), distinguishing genuine cold-start from baseline
loss in the message. A resolution failure also prints a stderr warning, and a
new --project-id flag lets CI or teammates supply the cache key explicitly.
The verdict now reports ran-pass / no-op (R3/C2/C4) / skipped-no-baseline
distinctly; pass_count counts only ran-pass checks so a flat count can no
longer masquerade as full content verification.
K4 now accepts committed locations seeded in proposal mode (symmetric with S2's
id seeding), so a proposed sot_location colliding with a committed entry's
location fails the gate.
Adds tests for cold-start / unverified / baseline-loss / resolution-failure K1
outcomes, the --project-id override, the distinct verdict buckets, the
proposal-vs-committed K4 collision, and the --exec-drift-checks paths
(non-zero exit, timeout, OSError) all resolving to CONDITIONAL.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
General/cascading conventions recall applied no type filter on UNKNOWN intent because get_target_types(UNKNOWN) returns [] → effective_types=None. After a project runs reindex, sot_entry JSON blobs surfaced in ordinary rule/guideline recall and per-turn injection. Add a must_not guard in MemorySearch.search(): when searching the conventions collection without an explicit memory_type and without a caller-provided must_not_types, inject sot_entry into must_not_conditions at the Qdrant layer. Explicit memory_type=["sot_entry"] (the aim-sot consult path) bypasses the guard and continues to retrieve sot_entry records as before.
…ters Cursor: change hook event from preCompact to stop (the actual end-of-turn event per spec §4 and _CURSOR_HOOK_MAP). Update SKILL.md hooks.json snippet and adapter docstring accordingly. Gemini: add AfterAgent→Stop mapping to _GEMINI_HOOK_MAP (was absent); change adapter from PreCompress to AfterAgent. Fix resolve_cwd to prefer GEMINI_PROJECT_DIR env var before falling through to stdin cwd and GEMINI_CWD (BP-032 §Finding 1). Update SKILL.md hooks.json snippet. Codex: no event change needed (Stop was already correct). Add loop-guard documentation to all three Wave-2 adapters (no known CLI-level loop-active field; propose-only design is structurally loop-free per BP-032 §Finding 4). Tests: add per-adapter event-contract assertions (T-CA11, T-GA11, T-CD11) confirming native event → expected canonical name → validates against VALID_HOOK_EVENTS. Add GEMINI_PROJECT_DIR precedence test (T-GA12).
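The Gemini cwd precedence described above (GEMINI_PROJECT_DIR, then the stdin cwd, then GEMINI_CWD) can be illustrated with a small sketch. The signature below is an assumption made for testability (the environment is injectable); the adapter's actual `resolve_cwd` signature and its final fallback are not shown in this commit message.

```python
import os
from typing import Mapping, Optional


def resolve_cwd(event: Mapping[str, object],
                environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Pick the project directory: env GEMINI_PROJECT_DIR > stdin cwd > env GEMINI_CWD."""
    env = os.environ if environ is None else environ
    candidates = (
        env.get("GEMINI_PROJECT_DIR"),  # preferred: the CLI's declared project dir
        event.get("cwd"),               # cwd carried on the normalized stdin event
        env.get("GEMINI_CWD"),          # last resort
    )
    for candidate in candidates:
        if candidate:
            return str(candidate)
    return None  # assumption: the caller decides what "no cwd" means
```

Passing the environment in explicitly is also what makes the hermetic tests mentioned later in this PR straightforward: a test supplies its own mapping instead of depending on the caller's shell.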
Document the M3 DD-D verdict outcome buckets: ran_pass (substantive checks that passed), no_op (inert checks: R3/C2/C4), and skipped (checks that could not run due to missing drift baseline). pass_count now reflects ran_pass only — inert and skipped checks are not counted as passed, so a human reading pass_count knows exactly how many checks substantively verified content. Update K1 taxonomy description and soft-check ruling: K1 on no baseline (cold-start, baseline-loss, project-id resolution failure) now emits a skipped_no_baseline CONDITIONAL warning rather than silently passing as if content were verified.
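As a rough illustration of the bucket semantics documented above, the sketch below sorts check results into ran_pass / no_op / skipped and derives pass_count from ran_pass only. It is a simplification under stated assumptions: `CheckResult`, `build_verdict`, and the status strings are hypothetical, and each check is assumed to produce a single result, which the real engine does not require.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

NO_OP_CHECKS = frozenset({"R3", "C2", "C4"})  # inert checks named in the PR


@dataclass(frozen=True)
class CheckResult:
    check_id: str
    status: str  # "pass" | "warn" | "fail" | "skipped_no_baseline" (hypothetical)


def build_verdict(results: Iterable[CheckResult]) -> Dict[str, object]:
    """Bucket results; only substantive passes contribute to pass_count."""
    buckets: Dict[str, List[str]] = {
        "ran_pass": [], "no_op": [], "skipped": [], "cond": [], "fail": [],
    }
    for result in results:
        if result.status == "fail":
            buckets["fail"].append(result.check_id)
        elif result.status == "warn":
            buckets["cond"].append(result.check_id)
        elif result.status == "skipped_no_baseline":
            buckets["skipped"].append(result.check_id)
        elif result.check_id in NO_OP_CHECKS:
            buckets["no_op"].append(result.check_id)
        else:
            buckets["ran_pass"].append(result.check_id)
    if buckets["fail"]:
        verdict = "FAIL"
    elif buckets["cond"] or buckets["skipped"]:
        verdict = "CONDITIONAL"  # a skipped K1 is a warning, never a silent pass
    else:
        verdict = "PASS"
    return {"verdict": verdict, "pass_count": len(buckets["ran_pass"]), **buckets}
```

For example, a run with one substantive pass, one inert R3, and a K1 with no baseline yields a CONDITIONAL verdict with pass_count 1, so the flat count cannot be read as full content verification.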
User-facing documentation for the aim-sot feature: a [Unreleased] CHANGELOG entry and a new docs/AIM-SOT.md covering the registry model and schema, the three skill modes (consult / detect-propose / verify), the propose-only + HITL verify-gate guarantee, opt-in-OFF-by-default enablement instructions for all four CLIs, and the 5a/5b runtime caches. Behavior described reflects the post-cycle-1-fix state: propose-only, fail-open triggers, cold-start unverified, baseline held on drift. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…red cache
Fix two data-safety defects in the 5b reindex path of the SOT detect-propose engine, plus documentation and discovery-scan precision:
- An unquoted registry last_verified (e.g. 2026-06-01) parses as a YAML-native datetime.date and lands in the reindex payload. json.dumps raised TypeError on it, failing the entire reindex so registry_sha never advanced and the rebuild retried-and-failed every run (also disabling the throttle via force_recheck). Serialize with default=str so dates become isoformat strings.
- When existing points were deleted but every store_memory raised, the reindex returned success with stored=0, letting the caller advance registry_sha over an emptied-not-restored cache with no retry. Return failure when the prepared set is non-empty but nothing was stored.
- Document that cold-start unverified is an intentionally-transient in-session marker; tighten the human-reconfirm heal-condition comment to state the strict day-granular comparison; emit a stderr note when discovery truncates at the directory cap; correct the empty-registry branch comment.
Adds companion tests for each fix (date serialization, delete-then-all-stores-fail, drift-cache temp cleanup on dump error, reindex-lock release on exception).
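The date-serialization defect is easy to reproduce with the standard library. The sketch below builds the payload by hand rather than parsing YAML, and the entry contents are illustrative only.

```python
import datetime
import json

# An unquoted YAML date such as 2026-06-01 arrives as a datetime.date object.
entry = {"id": "billing-api", "last_verified": datetime.date(2026, 6, 1)}

try:
    json.dumps(entry)
    raised = False
except TypeError:
    raised = True  # date objects are not JSON serializable by default

# default=str is called only for objects json cannot encode natively;
# str(date) yields the ISO form, e.g. "2026-06-01".
serialized = json.dumps(entry, default=str)
```

Note that `default=str` stringifies any otherwise-unserializable value, not just dates, so it trades strictness for robustness; that seems a reasonable trade for a rebuildable derived cache.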
FV-1: cond_checks now built first from non-skipped_no_baseline warnings so a mixed-baseline registry (hash-drift K1 + unverified K1 in same run) correctly lands K1 in cond, not skipped (DD-D). Companion test confirms FAILs on old logic. V-LOW-1 (message-only): K1 cold-start vs baseline-loss label for the project_id_resolved path now uses bool(components) — this project's own cache — instead of the global _drift_state_populated() glob, preventing multi-project mislabeling. Companion test and updated test_K1_baseline_loss_message. V-LOW-2: _sha256_short returning None (file exists but unreadable) now emits a K1 CONDITIONAL warning instead of silently passing. Companion test added. FV-2: add test_K1_empty_sha_conditional (comp present, last_verified_sha="" → CONDITIONAL "no recorded content hash" — existing behavior, test was missing). FV-3/FV-4 (NITs): update test_verdict_conditional docstring; add dead-code comment to _build_verdict skipped_checks - fail_checks step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…recall Use `not memory_types` instead of `memory_types is None` so a caller passing `memory_type=[]` also triggers the sot_entry exclusion (defense-in-depth; unreachable today as upstream coerces []→None, but blocks future regressions). Strengthen test_explicit_must_not_types_caller_override_respected to assert sot_entry absent from caller-supplied exclusion list. Add test_empty_list_memory_type_conventions_excludes_sot_entry for the [] case. Note test_rule_and_guideline_types_unaffected as representative for port/structure variants.
…ts (cycle-2)
- F-ADP-1: replace wrong verify JSON example (ran_pass=12, R1/R4 in ran_pass, fail_count:0 vs failures:[R1]) with a verified clean-run shape: verdict=PASS, ran_pass=13 (incl. R2), pass_count=13, failures/warnings empty.
- F-ADP-2: rewrite T-CA11 to spy on normalize_cursor_event via adapter.main(), asserting the adapter wires event_name='stop'; fails on preCompact reversion.
- F-ADP-3: make three Gemini tests hermetic: strip GEMINI_PROJECT_DIR + GEMINI_CWD so tests pass regardless of whether those env vars are set in the caller's env.
- F-ADP-4: align adapter module docstrings with exact SKILL.md section headings (Codex/Cursor/Gemini § references).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A flat --registry path (not under <root>/.sot/) makes _project_root_from_registry return None, which reached the path-resolving checks (R1/R4/C3/discovery/K1) unguarded and raised TypeError. A validation gate must emit a verdict, not traceback. Resolve a fallback root from the registry's own directory when the project root is None (mirroring the detect-propose guard) and pass it to the check fan-out, so every consumer receives a usable root. Add a regression test asserting the flat-registry path emits a structured FAIL rather than crashing.
…jection
The 5b reindex stored derived sot_entry rows with
source_hook="aim_sot_detect_propose", but the core payload allow-list
(src/memory/validation.py valid_hooks) had no SOT value, so every write was
rejected by validation and the cache stayed empty. store_memory raises
ValueError("Validation failed: ...") on rejection; the reindex inner loop
swallowed it via a bare except, so a 0-row rebuild was misreported by
cmd_reindex as "store unreachable" (exit 0) — sending operators after a phantom
connectivity issue.
- Add "aim_sot_detect_propose" to the core valid_hooks allow-list.
- Distinguish validation rejection from store-unreachable: ReindexResult gains a
reason field; the inner loop counts ValueError("Validation failed") rejections;
cmd_reindex reports the rejection accurately and exits non-zero, while a
genuinely unreachable store still exits zero and leaves the cache intact.
Regression tests hit the real memory.validation.validate_payload end-to-end (no
mocked store): the SOT source_hook is accepted, a bad hook is still rejected, a
row persists through the real allow-list via the reindex path, and the
validation-rejection path reports the correct reason and a non-zero exit.
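The distinction drawn above, validation rejection versus an unreachable store, can be sketched as a result object carrying a reason. Only the `reason` field and the ValueError("Validation failed: ...") shape come from the commit message; the other field names, the reason strings, and the treatment of partial failures are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ReindexResult:
    ok: bool
    stored: int
    rejected: int
    reason: str  # hypothetical values: "ok" | "validation_rejected" | "store_unreachable"


def reindex(rows: Iterable[dict], store_memory: Callable[[dict], None]) -> ReindexResult:
    """Store each row, counting validation rejections separately from other errors."""
    stored = rejected = errored = 0
    for row in rows:
        try:
            store_memory(row)
            stored += 1
        except ValueError as exc:
            if "Validation failed" in str(exc):
                rejected += 1  # payload refused by the allow-list, not a network issue
            else:
                errored += 1
        except Exception:
            errored += 1  # assumption: anything else means the store is unavailable
    if rejected:
        return ReindexResult(False, stored, rejected, "validation_rejected")
    if errored:
        return ReindexResult(False, stored, rejected, "store_unreachable")
    return ReindexResult(True, stored, rejected, "ok")


def exit_code(result: ReindexResult) -> int:
    """Non-zero only for validation rejection; an unreachable store stays fail-open."""
    return 1 if result.reason == "validation_rejected" else 0
```

Catching `ValueError` before the broad `Exception` clause is the key ordering: a bare except would collapse both cases into one, which is the misreport this commit fixes.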
detect-propose run bailed at a "No registry found" early-return before the
discovery scan whenever no .sot/registry.yaml existed, with a circular message
("Run aim-sot detect-propose to create one" — you are running it). A project
could not be bootstrapped from zero even though the discovery engine works.
- Add _run_cold_start_discovery: when no registry exists, run the discovery scan
and emit candidate proposals to stdout, rooted at the conforming project root
or the current working directory (flat/absent path). Kept separate from the
registry-present drift path (no drift detection, 5b reindex, or 5a cache).
- Stay propose-only: the registry is never created or written; the empty-state
message now points to the bootstrap steps instead of being circular.
- Document the first-run/bootstrap flow in SKILL.md and correct the stale
registry template header about the propose -> verify -> approve gate.
Regression tests: cold-start emits discovery candidates via the cwd fallback and
creates no .sot/registry.yaml; the empty-state message is a bootstrap hint, not
the circular bail.
consult added --json and --registry on the top-level parser, so the documented form (consult list --json) failed with 'unrecognized arguments' — only the pre-subcommand form (consult --json list) worked. detect-propose and verify both accept these flags after the subcommand. Move both flags onto the subcommands via a shared parents=[common] parser so consult list --json / consult get <id> --registry PATH work, consistent with the sibling scripts and the SKILL.md example. Existing test invocations updated to the post-subcommand ordering; regression tests (C24-C27) cover the post-subcommand forms and assert the pre-subcommand ordering is now rejected.
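The argparse arrangement described above, shared flags attached to each subcommand through a parent parser, looks roughly like this. The parser layout is a minimal sketch (only `list` and `get` are shown, and `build_parser` is a hypothetical name), not the script's actual definition.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Shared flags live on a help-less parent so every subcommand inherits them.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--json", action="store_true", help="emit JSON output")
    common.add_argument("--registry", metavar="PATH", help="registry override")

    parser = argparse.ArgumentParser(prog="consult")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("list", parents=[common])
    get = sub.add_parser("get", parents=[common])
    get.add_argument("id")
    return parser


args = build_parser().parse_args(["list", "--json"])
```

Because the top-level parser no longer defines `--json`, the pre-subcommand ordering (`consult --json list`) is rejected as an unrecognized argument, which matches the behavior the regression tests assert.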
…low-up) A flat --registry override makes _project_root_from_registry return None, so verify resolved declared locations against the registry's own directory (the DEFECT-3 fix). But _run_all_checks still ran C1 auto-discovery unconditionally against that directory, firing spurious "discovered component(s) not registered" warnings. Thread a discover flag (discover=project_root is not None) so verify skips discovery for a flat root, matching detect-propose's M5 skip-discovery contract; path resolution for R1/R4/C3/K1 is unchanged. Add the missing DEFECT-3 CHANGELOG entry. Tests: test_flat_registry_override_skips_discovery (no spurious C1) and test_conforming_registry_still_discovers_C1 (conforming path still discovers).
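The flat-registry handling in this commit and its predecessor reduces to two decisions: which directory to resolve declared locations against, and whether to run discovery at all. The sketch below shows that shape; the function names are hypothetical stand-ins for `_project_root_from_registry` and the root hand-off to the check fan-out.

```python
from pathlib import Path
from typing import Optional, Tuple


def project_root_from_registry(registry: Path) -> Optional[Path]:
    """Return <root> only for a conforming <root>/.sot/registry.yaml path."""
    if registry.name == "registry.yaml" and registry.parent.name == ".sot":
        return registry.parent.parent
    return None  # flat path: no project root can be inferred


def resolve_check_root(registry: Path) -> Tuple[Path, bool]:
    """Return (root for path-resolving checks, whether discovery should run)."""
    project_root = project_root_from_registry(registry)
    discover = project_root is not None  # skip discovery for a flat registry
    check_root = project_root if project_root is not None else registry.parent
    return check_root, discover
```

Keeping the fallback root and the discover flag separate is the point: path checks still get a usable directory, while auto-discovery does not scan an arbitrary directory and raise spurious "not registered" warnings.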
Add the authoring loop that categorizes a project's boundaries by type and writes gradeable source-of-truth entry descriptions:
- SKILL.md ## Authoring: categorize -> per-type checklist -> describe -> D1-D4 grade -> coverage + schema-bind pre-emit gates -> emit
- references/authoring-guide.md: per-type categorization signals, canonical-parts checklists, the D1-D4 description rubric, the registry-entry emit template (kind/boundary_type enums inline), and a persistence + compute-tier prompt for multi-service stacks
- references/grading-exemplars.md: contrastive PASS/WEAK/FAIL examples
- tests/test_gc08_authoring_guidance.py: structural contract test, including an all-enum kind-token validity guard
Guidance only; no engine or schema changes.
Add a 'digest' subcommand to aim_sot_consult.py that renders the registry as a compact, line-capped summary: one line per component (id, kind, owner, sot_location) plus a drift rollup (clean / N stale / unverified). Truncates to a count + pointer beyond 200 lines; emits nothing for an empty registry. Reads the derived entry index via the existing read-through path; read-only, registry never written. Add tests/test_sot_consult_digest.py (19 hermetic tests) covering the render format, the three-way drift rollup across the full drift_status enum, line-cap truncation, empty/no-registry handling, JSON output, and a static never-writes scan.
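A line-capped digest of the kind described above might be rendered as follows. The exact line format, the rollup wording, the treatment of a missing drift_status as unverified, and the mapping of every other status to "stale" are all assumptions for illustration; only the per-line fields, the 200-line cap, the count-plus-pointer truncation, and the empty-registry behavior come from the commit message.

```python
from typing import Dict, List

MAX_LINES = 200


def render_digest(entries: List[Dict[str, str]], max_lines: int = MAX_LINES) -> str:
    """Render one line per component plus a drift rollup, capped at max_lines."""
    if not entries:
        return ""  # an empty registry emits nothing
    statuses = [e.get("drift_status", "unverified") for e in entries]
    clean = statuses.count("clean")
    unverified = statuses.count("unverified")
    stale = len(entries) - clean - unverified  # assumption: everything else is stale
    header = f"drift: {clean} clean / {stale} stale / {unverified} unverified"
    lines = [
        f"{e['id']} [{e['kind']}] owner={e['owner']} -> {e['sot_location']}"
        for e in entries
    ]
    budget = max_lines - 1  # one line is reserved for the rollup header
    if len(lines) > budget:
        shown = budget - 1  # leave room for the truncation pointer
        hidden = len(lines) - shown
        lines = lines[:shown] + [f"... {hidden} more (see consult list)"]
    return "\n".join([header, *lines])
```

Reserving lines for the header and the pointer inside the cap keeps the total output at or below the limit, which matters when the digest is injected as ambient session context.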
Add standalone, opt-in session-start hooks for Claude, Codex, Cursor, and Gemini that inject the SOT registry digest as ambient context at session start. Each hook invokes the consult digest mode (read-only) and emits the summary into its CLI's session-start channel; all are fail-open (a hook failure never blocks the session) and gated on registry presence (no registry = not opted in). The hooks ship unregistered — per-CLI opt-in instructions are documented in the skill. Add hermetic tests for all four.
Add a Lifecycle section to the aim-sot skill documenting the create (bootstrap) and update flows end-to-end — discover, author, verify, apply — with the cross-cutting invariants (propose-only, verify gates apply, human applies, never auto-rewrite). Point the bootstrap step at the Authoring guidance instead of leaving the description fields to be written by hand. Documentation only; no engine change.
added 4 commits
June 14, 2026 21:26
…ross all CLIs
Register the SOT ambient-digest (session-start) and drift hooks into the installer's project-level config writers for Claude, Gemini, Cursor, and Codex. Gated by AI_MEMORY_SOT_HOOKS (default on; set to off to skip registration). Hooks self-guard on .sot/registry.yaml presence, so projects without a registry are unaffected.
- Claude: generate_settings.generate_hook_config (digest->SessionStart resume|compact, drift->Stop); merge_settings reuses it, so existing installs upgrade idempotently (dedup-safe, dead-hook-sweep-safe).
- Gemini/Cursor/Codex: write_*_config dict entries with each CLI's existing session-start + drift event and matcher shape.
- Tests: per-CLI registration, gate-off suppression, re-merge idempotency, dead-hook strip-survival, and langfuse+sot Stop combination.
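The AI_MEMORY_SOT_HOOKS gate can be pictured as a single predicate. This is a hedged sketch: `sot_hooks_enabled` is a hypothetical name, and the whitespace stripping is an assumption. The rule that only "off" (case-insensitive) disables registration, while values such as false/0/no leave hooks on, is documented in a later commit in this PR.

```python
import os
from typing import Mapping, Optional


def sot_hooks_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Default on; only the token "off" (any case) skips hook registration."""
    env = os.environ if environ is None else environ
    value = env.get("AI_MEMORY_SOT_HOOKS", "on")
    return value.strip().lower() != "off"  # strip() is an assumption of this sketch
```

A single recognized opt-out token avoids ambiguity about which "falsy" spellings count, at the cost of surprising users who expect `false` or `0` to work.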
…on/endpoint SOT row Lifecycle Create step 3: a fresh-registry verify is CONDITIONAL with 0 failures, not PASS — cold-start K1 skipped_no_baseline and advisory (C1) warnings are expected and do not block apply; literal PASS requires an established drift baseline. The prior 'all 16 checks must pass' was unachievable on the bootstrap the step describes. authoring-guide 2.3 Service/API: add 'API version + served endpoint' canonical part (api/concern; openapi info.version + servers[]) so the published contract revision and base URL are declared as a source of truth.
test_idempotency_no_duplicates hard-coded the pre-feature counts (1 SessionStart group, 1 Stop group). With the SOT digest (SessionStart) and drift (Stop) hooks registered by default (AI_MEMORY_SOT_HOOKS on), the merged config has 2 SessionStart groups and 2 Stop groups; the merge stays idempotent across a double run (dedup by command). Update the two count assertions to match; per-group hook counts are unchanged.
…exclusion
- Rewrite Stop Hook §, Digest Session-Start Hook §, and Trigger Hooks § in SKILL.md: hooks are registered by default on install (alongside core ai-memory hooks). Reconcile BP-032 explicitly: the portability concern is bounded, because hooks self-guard on .sot/registry.yaml presence (no-op in non-SOT projects) and the settings merge is backup + atomic + non-clobbering. Supersedes stale "opt-in/never auto-registers" wording.
- Add MED-1 note: AI_MEMORY_SOT_HOOKS=off prevents adding hooks on install but does not remove already-registered hooks (settings merge is append/base-wins). Document opt-out token: only "off" (case-insensitive) disables; false/0/no leave hooks on.
- Update all 9 hook/adapter module docstrings to match default-on policy.
- Harden sot_entry auto-exclusion in search.py (H1 guard): append sot_entry to must_not_types unconditionally for untyped conventions queries (union), so a caller passing an unrelated must_not_types cannot resurface sot_entry rows. Explicit memory_type=["sot_entry"] consult path is unaffected.
- Update test_explicit_must_not_types_caller_gets_sot_entry_appended to assert the new union behavior.
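The H1 guard's union behavior can be shown without touching the vector store. The sketch below only computes the effective exclusion list; `effective_must_not_types` is a hypothetical helper, and how the list is turned into Qdrant must_not conditions is left out rather than guessed.

```python
from typing import List, Optional, Sequence

SOT_ENTRY = "sot_entry"
CONVENTIONS = "conventions"


def effective_must_not_types(
    collection: str,
    memory_types: Optional[Sequence[str]],
    must_not_types: Optional[Sequence[str]],
) -> List[str]:
    """Union sot_entry into the exclusions for untyped conventions queries."""
    excluded = list(must_not_types or [])
    # `not memory_types` covers both None and [] (the defense-in-depth case).
    untyped = not memory_types
    if collection == CONVENTIONS and untyped and SOT_ENTRY not in excluded:
        excluded.append(SOT_ENTRY)
    return excluded
```

An explicit `memory_types=["sot_entry"]` query is typed, so the guard leaves it alone and the consult path keeps retrieving registry rows.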
79ded32 to
66b71c6
added 6 commits
June 15, 2026 02:47
…fixes consult read the 5b derived memory cache first and returned it whenever non-empty, with no binding to the committed .sot/registry.yaml. After any registry edit, or when another project's rows shared the group_id, it shadowed the file with stale or cross-state entries. consult now uses the cache only when the 5a drift cache's registry_sha (advanced solely on a successful reindex) matches the committed file's SHA; otherwise it falls back to the committed file. consult stays strictly read-only. Also: - verify --proposal flags a proposal lacking an 'entries' key instead of silently verifying it as an empty set. - detect-propose run prunes drift-cache records for components no longer in the committed registry (parity with the 5b reindex prune). - SKILL.md 5b-cache field name corrected to type=sot_entry to match the code. Tests: realistic-registry regression for the consult staleness fix (fails without the gate), proposal wrong-shape coverage, and drift-cache orphan prune coverage.
aim-sot's engine ships under _ai-memory/skills/ with no full .claude/skills/ copy, so installed projects could not discover it as a skill even though its SOT session and drift hooks were registered. deploy_ai_memory_skills now generates a thin discovery shim in .claude/skills/ for aim-* skills that live under _ai-memory/skills/ without a full copy. It runs on every install (matching aim-sot's always-on SOT hooks), never clobbers a skill that already has a full copy, and is idempotent on re-install. The aim-* prefix excludes oversight-internal parzival-save-* skills, which are correctly absent from an end-user project. Document the AI_MEMORY_SOT_HOOKS opt-out as a commented line in docker/.env.example. Add tests/test_install_aim_sot_skill_surface.py exercising the real deploy_ai_memory_skills: aim-sot is surfaced as a shim, a full-copy sibling stays unshimmed, parzival-save-* is not surfaced, and re-install is idempotent.
Lane A (_ai-memory/skills/aim-sot/): - Guard consult's detect_propose sibling import; when the freshness helpers are unavailable, _cache_is_fresh now treats the cache as not fresh so consult still serves the committed registry file instead of failing. - Drop the unreachable default in verify's proposal loader (the missing-entries guard already returns), reading data["entries"] directly. - Rework the A1 consult-freshness regression test to drive the real CLI path (consult list --json) and fail on the staleness assertion rather than a missing-symbol error when run against the pre-gate code. - Add an A3 case asserting throttle-skipped components that remain in the registry survive the orphan prune; add a consult test covering the committed-file fallback when the sibling helpers are unavailable. Lane B (scripts/install.sh): - Clarify the shim "no clobber" comment (the skip-guard protects full copies; the shim itself is rebuilt every run) and document the shim loop's ordering dependency on the canonical-files copy. - Extend the install idempotency test to change the canonical SKILL.md between deploys and assert the regenerated shim reflects the change.
…5b.py A1 updated _try_memory_cache(registry_path, project_id) signature; align the three direct call sites in T-CR1/T-CR2/T-CR3 to pass "test-project". T-CR4 uses patch.object and is unaffected.
…consult freshness Non-SOT writes now pass exempt_handle_pii=False to scanner; update test assertions to assert the correct call signature as regression guard.
… _scroll_sot_payloads seam C4-1: _cache_is_fresh now requires both SHA uniformity and row-count equality with the committed file's entry count. A partial reindex (stored < total due to validation rejection) leaves all rows stamped with the committed SHA but row count short; the old SHA-only check trusted that subset forever. Fix: _load_entries parses the committed file once when payloads exist, passes len(file_entries) to _cache_is_fresh, and reuses file_entries as the fallback return value (no second YAML parse). New test: 2-of-3 rows stamped committed SHA → file fallback, all 3 committed entries returned. C4-2: F2 refactor made _load_entries bypass _try_memory_cache (calls _scroll_sot_payloads directly), orphaning both the helper and its tests. Removed _try_memory_cache; rewired tests/test_sot_consult_5b.py to the _scroll_sot_payloads seam via monkeypatch (mirrors test_a1_consult_freshness.py). Updated test_a1_consult_freshness.py::test_fresh_cache_used_preserves_drift_status to supply all 12 enriched rows (cardinality guard requires full-set cache). C4-3: Added one-line comment at the store_memories_batch scanner loop noting exempt_handle_pii is intentionally omitted (SOT uses per-row store_memory; batching SOT rows would silently re-redact owner handles). Comment only, no logic change.
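The C4-1 freshness rule (every cached row stamped with the committed SHA, and exactly as many rows as committed entries) can be sketched as a pure function. The per-row field name `registry_sha` and the function signatures are assumptions for illustration; the real `_cache_is_fresh` and `_load_entries` are not shown in this PR text.

```python
from typing import Dict, List, Sequence


def cache_is_fresh(
    payloads: Sequence[Dict[str, object]],
    committed_sha: str,
    committed_entry_count: int,
) -> bool:
    """Trust the cache only if it is complete and uniformly stamped."""
    if not payloads:
        return False
    if len(payloads) != committed_entry_count:
        return False  # a partial reindex leaves the row count short
    return all(row.get("registry_sha") == committed_sha for row in payloads)


def load_entries(
    payloads: Sequence[Dict[str, object]],
    file_entries: List[Dict[str, object]],
    committed_sha: str,
) -> List[Dict[str, object]]:
    """Serve the cache when fresh, otherwise the already-parsed committed file."""
    if cache_is_fresh(payloads, committed_sha, len(file_entries)):
        return list(payloads)
    return file_entries  # reuse the single parse as the fallback
```

With 2 of 3 rows stamped with the committed SHA, the cardinality check fails and all 3 committed entries come back from the file, which is the regression case the new test describes.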
Summary

Ships `aim-sot`, a Source-of-Truth subsystem that tracks where the truth lives for each part of the end-user's own project, and whether it's still accurate. It is a generic, multi-CLI feature: the registry lives in the user's repo, consultable by the user's agents, kept current by a propose-only trigger behind a mandatory in-skill verify gate. Behavior model throughout: detect-and-propose + verify-gate, never silent auto-rewrite.

The feature ships on all four CLIs: Claude Code, Codex, Cursor, and Gemini. The registry, consult, detect-propose, and verify modes are shared; each CLI gets a propose-only end-of-turn trigger.

Design spec: `oversight/specs/SOT-SUBSYSTEM-DESIGN-2026-06-07b.md` (in the oversight workspace).

What's in this PR (5 parts)
1. Registry of record: `.sot/registry.yaml` lives in the user's repo, committed, human-authored. Shipped as a JSON Schema (draft-07) plus `.sot/registry.yaml` / `.sot/README.md` templates. No project data is baked into the skill; `additionalProperties:false` enforces the no-copy / no-machine-field invariant. (8-kind taxonomy, `path|component|concern` boundary types, 6 core fields + `status`.)
2. `aim-sot` skill, consult mode: read-only "where does X live / who owns it / how is it drift-checked," served from the derived memory cache with a registry fallback.
3. `aim-sot` skill, detect-propose engine: hybrid auto-discovery → proposes a minimal patch on drift or new candidates. Never writes the registry. Maintains a per-install drift cache (`~/.ai-memory/drift-state/`, atomic + lock + 7-day TTL throttle) and a derived `conventions` memory cache (`memory_type="sot_entry"`, rebuildable).
4. `aim-sot` skill, verify mode: the 16-check gate (Schema / Referential / Completeness / Content) → PASS / CONDITIONAL / FAIL. Schema checks are driven from `registry.schema.json` at runtime (stdlib only, no new dependency). Mandatory in-skill; CI / pre-commit is opt-in only, never auto-installed into the user's repo.
5. Per-CLI trigger hooks: a propose-only end-of-turn trigger per CLI that invokes the detect-propose engine and surfaces drift / new-candidate counts to stderr; fail-open on every path (never blocks the CLI):
   - Claude: `.claude/hooks/scripts/sot_drift_stop.py` (`Stop`), wired via `merge_settings.py`. Loop-guarded (`stop_hook_active`), worktree-cwd-normalized.
   - Codex: `src/memory/adapters/codex/sot_drift.py` (`Stop`)
   - Cursor: `src/memory/adapters/cursor/sot_drift.py` (`PreCompact`)
   - Gemini: `src/memory/adapters/gemini/sot_drift.py` (`PreCompress` → canonical `PreCompact`)

The three non-Claude adapters normalize their native stdin via the existing per-CLI schema, take `cwd` from the normalized event, gate on the presence of `.sot/registry.yaml`, and reuse the shared engine-invocation core. All git-agnostic (TTL re-check, no git dependency).

Notes for review
- The Claude hook lives under `.claude/hooks/scripts/` (wired via `merge_settings.py`), not `src/memory/adapters/claude/`. Only Codex/Cursor/Gemini use the `src/memory/adapters/{cli}/` layer. This matches the convention in "Ship an AI-Memory agent-guidance file to projects via the installer (TD-600)" #185 (Claude → `.claude/…`, the other three → `adapters/templates/{cli}/`).
- Native events per CLI (Codex `Stop`, Cursor `preCompact`, Gemini `PreCompress`); the canonical event vocabulary (`schema.py` `VALID_HOOK_EVENTS`) is untouched.
- `SKILL.md` documents the manual-enablement snippet for each CLI (`.claude/settings.json`, `.codex/hooks.json`, `.cursor/hooks.json`, `.gemini/settings.json`). The engine also runs standalone (manual/cron) as the no-hook default.
- `SKILL.md` invocation paths: the consult / detect-propose / verify invocation blocks pass the full script path to `run-with-env.sh` (`${AI_MEMORY_INSTALL_DIR:-$HOME/.ai-memory}/_ai-memory/skills/aim-sot/scripts/…`); the bare-name form resolved to `scripts/memory/` and failed "Script not found" from a normal CWD.
- The `SOT_ENTRY` memory type was added to the enum but is intentionally system-internal (not wired into search / intent / classifier).

Testing

Each mode ships a companion test deliverable (contract tests for schema/templates; hermetic unit tests for the engine modes and every trigger). All tests live under top-level `tests/`. The Claude hook has 14 hermetic tests; each of the three adapters has 10 (30 total). They cover the registry opt-in gate, invocation correctness (engine argv + mutation guards), non-git fallback, the propose-only / no-write guarantee (AST-based source scan with a mutation guard), and fail-open paths (empty / malformed stdin, engine non-zero exit, subprocess timeout, SIGALRM). The non-Claude adapter tests exercise the real per-CLI schema normalizers.

Deferred (tracked as code TODOs, not in this PR)

- … dir-tree digest) for `component`/`path` boundaries whose `sot_location` is a directory.
- … `sot_entry` into the search / intent / classifier surfaces (kept system-internal for now).