feat: multi-backend reviewers (fan-out) + pre-edit block + native host-hook packaging by metaphorics · Pull Request #2 · OutlineDriven/odin-reflector

metaphorics · 2026-06-08T21:08:45Z

Summary

Extends the reflector from a single OpenAI Codex reviewer into a two-axis system: pick (and stack) who reviews and run it inside any of five host agents.

Axis A — Reviewer(s) via -p: codex (default), claude, cursor-agent, grok, agy (Antigravity), selectable and concurrently fanned out (REFLECTOR_BACKENDS=codex,claude,…). Verdicts merge with an asymmetric lattice (any-FAIL→FAIL, else any-UNCERTAIN→UNCERTAIN, else PASS; infra-empty results excluded, never coerced to UNCERTAIN). Default ["codex"] short-circuits to the inline single call — byte-identical to before.
Axis B — Hosts: native hook packaging for Codex (.codex-plugin/ + hooks/codex-hooks.json + trust-by-hash installer), Cursor (.cursor-plugin/ + Claude-compat installer), Grok (advisory on post/Stop, real hard-block on PreToolUse), Antigravity (enable_json_hooks-gated, Stop-centric), alongside the existing Claude Code native hooks. A thin per-host input-normalizer + output-renderer seam keeps the router host-agnostic.
Opt-in pre-edit hard-block: REFLECTOR_PREEDIT_BLOCK=1 adds a PreToolUse deny gate (deny on exit-0 stdout permissionDecision, fail-open on UNCERTAIN, deny-loop breaker, never writes fail-state). Off by default; absent from the committed hooks/hooks.json so the default Claude Code path spawns no extra process.

Security fix (notable)

The behavioral read-only harness (scripts/test-readonly.sh, new) caught that the default codex reviewer was not actually read-only: _codex_argv passed --sandbox read-only --full-auto, and --full-auto (deprecated on codex 0.137.0) resolves the effective sandbox to workspace-write, overriding read-only. Removed --full-auto — codex now runs read-only or refuses to run (never unsandboxed). Pre-existing bug, preserved byte-identically by the feature's stability invariant until the new harness exposed it. Documented in docs/solutions/security/.

Invariants (named, tested)

INV-CODEX-PATH-STABLE (default path stable, with the deliberate documented --full-auto exception), INV-MERGE, INV-DENY-STDOUT, INV-PREBLOCK-NOSTATE, INV-STOP-DELIVERY, INV-READONLY, INV-VERDICT-TEXT.

Testing

python3 scripts/codex-reflector.py --test-parse → 270/270 passed (was 41); ruff clean; shellcheck clean across all installers + wrappers + harness.
Behavioral read-only (scripts/test-readonly.sh, this env): claude PASS, grok PASS (write-attempt blocked), codex SKIP (read-only sandbox unenforceable on this Landlock-restricted kernel → refuses to run, safe), cursor-agent/agy SKIP (not authenticated here).
Reviewed across multiple adversarial passes (0 blockers); see commits.

Known limitations (honest)

Antigravity is advisory-only until firing gates (Stop re-injection, PreToolUse deny) are confirmed live (KTD-10).
Non-codex behavioral read-only verified for claude/grok in-env; cursor-agent/agy verified only by argv + vendor-doc (rerun the harness where authed).
Per-host live firing (Codex trust gate, Cursor cloud-Stop, Grok deny wire-shape, agy enable_json_hooks) is documented as a manual checklist in scripts/test-readonly.sh — the adapters/installers are built and offline-tested; live firing needs an authed host.
install-codex.sh trust-by-hash is best-effort (Codex canonicalization undocumented); fails safe (wrong hash → Codex skips the hook, never wrong-trusts).

Plan: docs/plans/ (multi-backend reflectors). stdlib-only Python + POSIX-sh installers; no new dependencies.

Summary by CodeRabbit

New Features
- Multi-host reflector support (Codex, Claude Code, Cursor, Grok, Antigravity)
- Optional opt-in pre-edit hard-block review gate
- New host-specific hook manifests and installer flows
Documentation
- Expanded README and CLAUDE.md with host/reviewer contracts, fan-out/merge rules, safety guarantees, and operational checklists
- Multiple new security guides (sandbox verification, redact-before-truncate, prompt-metadata injection, fan-out failure modes)
Tests
- Read-only sandbox validation harness and installer idempotency test suite
Chores
- Added hook wrappers and per-host installers; improved install idempotency and smoke-test guidance

Generalize the single-Codex reviewer into a data-driven backend registry that can fan out to N reviewers per event and merge their verdicts. Default REFLECTOR_BACKENDS=["codex"] short-circuits to the inline single call, keeping the codex path byte-identical (INV-CODEX-PATH-STABLE). - U1: Backend spec table + invoke_backend(); shared _codex_argv() used by both the codex reviewer row and the summarizer invoke_codex (byte-identity incl. CODEX_REFLECTOR_MODEL override + LIGHTNING effort bump). Reviewer call sites routed through invoke_backend; summarizer stays pinned to codex+FAST_MODEL. - U2: four prompt-delivery mechanisms, stdout/file capture, per-spec timeout, read-only levers per backend (codex --sandbox read-only; claude/grok --permission-mode plan; cursor --mode plan; grok --sandbox read-only; agy --sandbox + stdin=DEVNULL). No write-enabling flag on any path. - U3: fan_out() (ThreadPoolExecutor, N=1 short-circuit) + pure merge_verdicts() (drop infra-empty, parse per-reviewer raw, asymmetric lattice, empty-set sentinel) + resolve_backends() + REVIEWER_LABEL(set) + Stop delivery budget (_compact_output_stop caps compaction to 1 layer). - U4: codex-only effort gating; non-codex force backend-native default_model; model override is codex-scoped only. - U5: self-test extended to 129 cases (byte-identity x3, levers, merge lattice, resolve precedence, label, codex-scoped override, summarizer pinning). Security fix (adversarial review): build_bash_failure_prompt now wraps its attacker-controllable inputs in _sandbox_content (was _redact only) — closing a prompt-injection hole; sandbox audit widened to all 8 builders. Fixed claude/cursor-agent default_model (were inheriting the codex gpt-5.5 id) to sonnet / sonnet-4; added a guard that no non-codex backend uses DEFAULT_MODEL. Op: extend

Add an opt-in PreToolUse pre-flight review that can DENY a high-severity edit before it lands. Off by default (REFLECTOR_PREEDIT_BLOCK); respond_pretooluse gates-first and returns None before any work, and no PreToolUse entry is added to committed hooks/hooks.json — so the default Claude Code path is unchanged (INV-CODEX-PATH-STABLE). Opt-in installers wire the event later. - build_pretooluse_prompt: reviews the PROPOSED edit (tool_input), never the on-disk file or tool_response; _redact + _sandbox_content; higher FAIL bar (block only high-confidence/high-severity; when in doubt PASS). - respond_pretooluse: single primary reviewer (resolve_backends[0], never fan_out); FAIL -> exit-0 stdout hookSpecificOutput.permissionDecision="deny" (INV-DENY-STDOUT, never exit-2/stderr — the only channel Grok honors); PASS/UNCERTAIN/empty -> None quiet allow (fail-open, never permissionDecision ="allow"); never writes fail-state (INV-PREBLOCK-NOSTATE). - Deny-loop breaker (M-B): per-(file,edit-hash) counter in a SEPARATE /tmp/codex-reflector-denies-{sid}.json (flock); after N denies of the same edit, allow + advisory; cwd-derived fallback key when session_id is absent so the breaker can't silently disable (e.g. under Grok). - _ME_PRE_EDIT preset bounds effort/latency on the synchronous path. Adversarial-review fixes: deny reason and the >400K prompt-build path hard- truncate instead of running unbounded compaction (INV-STOP-DELIVERY on the synchronous path — a host timeout mid-compaction would drop the deny and let the edit land). _sandbox_content neutralizes the closing fence via an explicit escape so a crafted proposed edit can't break out and force PASS (applies to all builders). Deny label attributes the one reviewer that ran. Self-test 149/149. Op: extend

Introduce the Axis-B host seam so future hosts plug in without touching the router core (KTD-7). Two symmetric seams + a canonical schema: - resolve_host(payload): REFLECTOR_HOST env wins (validated), else infer from distinguishing payload keys, ambiguous -> claude identity (safe default). - _normalize_input(host, hook_data): host payload -> canonical Claude-shaped hook_data; cursor -> existing _normalize_cursor_input, claude/codex identity, grok/antigravity clean identity extension points for U10/U11. - Canonical namedtuple (systemMessage/additionalContext/reason/blocking/ permission_decision/permission_decision_reason + raw) + _to_canonical lift. - _render_host_output / _render_identity_output: the claude/codex/cursor renderer re-emits the responder dict VERBATIM from canonical.raw using the exact pre-refactor exit-code expression, so default-host output is byte- identical by construction (INV-CODEX-PATH-STABLE; verified by a 48-case differential oracle). - probe_backends (m3): absent reviewer binary -> visible notice, treated as infra-empty (never UNCERTAIN) so it can't wedge Stop. Also fixes a regression where gating _normalize_cursor_input on workspace keys under-detected bare Cursor payloads; resolve_host now keys on Cursor's camelCase event name. Self-test 190/190. Op: extend

…y (KTD-11) A present-but-logged-out stdout-capture backend (grok/claude/cursor-agent/agy) can exit nonzero while printing auth-error text to stdout. Without a returncode check that text was returned as a "verdict", parsed to UNCERTAIN, and on Stop (fail-closed) blocked the agent — the exact wedge KTD-11 forbids. Guard stdout-capture backends: nonzero exit -> "" (infra-empty, excluded from the merge). codex (file-capture) is unchanged, so INV-CODEX-PATH-STABLE holds. Op: correct Restores: spec:KTD-11

install-cursor.sh gains a --pre-edit flag that adds an opt-in preToolUse hook (maps via Cursor's _CURSOR_EVENT_MAP) so the pre-edit deny gate can run under Cursor. Default install is unchanged (4 events, no PreToolUse); the hook also self-gates on REFLECTOR_PREEDIT_BLOCK at runtime, so it is inert unless enabled. Cursor is otherwise already a working host via the U7 seam; .cursor-plugin/ plugin.json manifest already present. Cloud-Stop degradation + settings.json drift docs are consolidated in U12. Op: extend

_normalize_grok_input maps Grok's Claude-compat camelCase stdin envelope (hookEventName, workspaceRoot/CLAUDE_PROJECT_DIR, sessionId/conversationId) to canonical Claude-shaped fields. _render_grok_output is asymmetric per KTD-9: PreToolUse emits permissionDecision=deny on stdout (Grok's only honored hard-block channel); post/Stop/PreCompact are advisory — logged to a /tmp side-channel + best-effort systemMessage, never additionalContext, never exit 2. install-grok.sh wires the 5 hooks into Claude-compat discovery and enables REFLECTOR_PREEDIT_BLOCK unconditionally (Grok's sole enforcement seam). Wired both seams (_normalize_input, _render_host_output); self-test +18 grok cases (209/209). Firing+deny wire-shape is KTD-10 smoke-test-gated. Op: extend

…e (U11) Op: extend

Package the reflector to run inside Codex's native hook system and make the failure-diagnostic flow live on Codex. - .codex-plugin/plugin.json: Codex plugin manifest mirroring .claude-plugin, declaring a hooks capability (-> ./hooks/codex-hooks.json). - hooks/codex-hooks.json: native wiring for PostToolUse/Stop/PreCompact via a committed wrapper (codex-reflector-hook.sh) using ${CLAUDE_PLUGIN_ROOT}. PreToolUse pre-edit hard-block is OPT-IN, split into codex-hooks-preedit.json. - scripts/install-codex.sh: idempotent installer (POSIX sh, python3 read-modify- write, no jq). Merges ~/.codex/hooks.json and PRE-TRUSTS hook commands BY HASH ([hooks.state] in config.toml, backup + parse-validate + rollback). Generates abs-path wrappers. --preedit opt-in; --dangerously-bypass-hook-trust footnote. - B4: _normalize_codex_input re-emits PostToolUseFailure from error/non-zero PostToolUse payloads so classify() routes to bash_failure / code_change_failure. Wired host=="codex" in _normalize_input. INV-CODEX-PATH-STABLE preserved: a non-error codex PostToolUse passes through byte-identical (ambiguous code/status spellings excluded from exit-key detection). - run_self_test: B4 re-emit + invariant cases; manifest/hooks JSON schema-shape validation. 218/218 passed; ruff + shellharden clean. Op: extend

Holistic /ce-code-review of the merged host layer (0 blockers) surfaced two shipped-behavior enforcement bugs + hardening: - Cursor --pre-edit shipped INERT: the wired preToolUse hook omitted REFLECTOR_PREEDIT_BLOCK=1, so respond_pretooluse gated off and never denied. Now the --pre-edit command sets it inline (Cursor runs hooks via a shell, like grok/codex); default install still wires no preToolUse hook. - Antigravity host-pin relied on an inline `VAR=val cmd` env-prefix the codex unit deliberately distrusted. If agy execs without shell interpretation, the Stop payload (no workspacePaths) infers host=claude and reads the bare /tmp fail-state file — silently losing the FAIL (INV-STOP-DELIVERY). Now mirrors codex: a committed wrapper (hooks/antigravity-reflector-hook.sh) and an installer-generated absolute-path wrapper `export REFLECTOR_HOST=antigravity`, referenced as `sh <wrapper>` (plain two-arg exec, no shell-interp dependency). - Refreshed the stale _normalize_input docstring (grok/antigravity now dispatch, not identity passthrough). - Extended the INV-READONLY forbidden-flag self-test set with --no-plan, --dangerously-bypass-approvals-and-sandbox, workspace-write, danger-full-access. Self-test 270/270; ruff clean; installers sh -n clean. Op: correct Restores: spec:INV-STOP-DELIVERY

…2 simplify) /ce-simplify-code pass: moved the _IDENTITY_HOSTS frozenset from the host-seam section down to just before its sole consumer _state_path, so the definition precedes use, and deleted the apologetic forward-reference comment that only existed to excuse the out-of-order reference. Byte-identical value, single consumer (the identity renderer dispatches by string literal, not the symbol); strictly behavior-preserving (adversarially verified). The simplify pass deliberately left the per-host normalizers/renderers separate (merging would hide host-specific wire-shape structure — speculative DRY) and the rare-path double parse_verdict alone (factoring needs restructuring). Self-test 270/270. Op: compress

…g checklist (U12) - CLAUDE.md: fixed the stale model-override gotcha (now codex/reviewer-scoped; summarizer pinned to FAST_MODEL ignoring the env vars); added the two-axis Reviewer×Host model, fan-out + merge_verdicts, the opt-in pre-edit gate, the env-var table, the named-invariants table, reviewer×host capability matrix, and per-host gaps/known-limitations. - README.md: user-facing reviewer selection/fan-out, pre-edit gate, per-host install, the matrix, per-host gaps, auth prerequisites, version snapshot. - scripts/test-readonly.sh (NEW): behavioral INV-READONLY harness — drives each reviewer with a write-attempt prompt in a fresh scratch repo and asserts the repo is unchanged; SKIPs absent/unauthed CLIs (nonce liveness probe) and sandbox-unenforceable kernels. Carries the KTD-10 per-host firing checklist. Live results: claude PASS, grok PASS, codex FAIL (see below), cursor/agy SKIP. The harness surfaced a CRITICAL pre-existing finding: the codex reviewer is NOT read-only — `_codex_argv` passes `--sandbox read-only --full-auto`, but `--full-auto` (deprecated on codex 0.137.0) resolves the sandbox to workspace-write, overriding read-only. Documented honestly in CLAUDE.md/README; the argv fix is deferred to an explicit decision (it touches the byte-identity self-test + changes default behavior on Landlock-restricted kernels). Op: extend

…ad-only (INV-READONLY) The U12 behavioral harness proved the DEFAULT codex reviewer was NOT read-only: _codex_argv passed `--sandbox read-only --full-auto`, but on codex 0.137.0 `--full-auto` is deprecated and resolves the effective sandbox to workspace-write, OVERRIDING read-only — so prompt-injection from reviewed content could make codex write the repo. Pre-existing (the original reflector shipped --full-auto; the multi-backend feature preserved it byte-identically). Removed --full-auto from _codex_argv (single source for the codex reviewer AND the codex-pinned summarizer). codex now runs read-only on a sandbox-capable host, or REFUSES to run where the kernel cannot set up its sandbox (Landlock EPERM) — the review fails open (no review, no writes), never unsandboxed. Harness confirms: codex now SKIPs on the Landlock-restricted test kernel instead of writing. Deliberate, security-justified break of the codex argv byte-identity: INV-READONLY wins over INV-CODEX-PATH-STABLE byte-identity. Updated the two byte-identity self-test assertions and synced scripts/test-readonly.sh + CLAUDE.md + README.md. Self-test 270/270; ruff + shellcheck clean. Op: correct Restores: spec:INV-READONLY

The docstring said grok 'falls back to identity until its renderer lands' — stale since U10; grok has a dedicated asymmetric renderer. Documentation-only, zero behavioral impact. Closes the final code-review minor. Self-test 270/270. Op: correct

…arning (ce-compound) The --full-auto-defeats-read-only bug as a compounding solution doc: argv-presence checks pass while the effective sandbox is workspace-write; behavioral write-attempt testing is the arbiter. Bug track, security category. Op: extend

One-line pointer so future agents search the learnings store before debugging recurring issues (ce-compound discoverability check). Op: extend

chatgpt-codex-connector · 2026-06-08T21:08:51Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

gemini-code-assist

Code Review

This pull request expands the Codex Reflector plugin to support multiple reviewer backends (such as Claude, Grok, and Antigravity) and multiple agent hosts, adding corresponding installer scripts, native hook configurations, and a behavioral read-only test harness. Feedback on the changes highlights a potential failure in the newly added shell wrappers (antigravity-reflector-hook.sh, codex-reflector-hook.sh, and codex-reflector-preedit-hook.sh) if both CLAUDE_PLUGIN_ROOT and PLUGIN_ROOT are unset, and suggests robust relative path fallbacks to prevent execution errors.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

coderabbitai · 2026-06-08T21:10:32Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4e1004b4-2dd2-4c47-8151-05e86e16cf58

📥 Commits

Reviewing files that changed from the base of the PR and between 1f25be2 and b574c78.

📒 Files selected for processing (1)

docs/solutions/conventions/end-of-options-double-dash-is-portable-and-load-bearing.md

✅ Files skipped from review due to trivial changes (1)

docs/solutions/conventions/end-of-options-double-dash-is-portable-and-load-bearing.md

📝 Walkthrough

Walkthrough

Converts Codex Reflector into a multi-host reflector: adds plugin manifest, per-host hook wiring and wrappers, host installers (idempotent/trusted), optional pre-edit hard-block wiring, read-only verification and installer idempotency tests, and extensive docs and security conventions.

Changes

Multi-Backend Reflector Plugin

Layer / File(s)	Summary
Plugin manifest and Codex hook wiring `.codex-plugin/plugin.json`, `hooks/codex-hooks.json`, `hooks/codex-reflector-hook.sh`, `hooks/codex-hooks-preedit.json`, `hooks/codex-reflector-preedit-hook.sh`	Plugin manifest and Codex hook configurations wire PostToolUse/Stop/PreCompact to the reflector and provide an opt-in PreToolUse pre-edit wrapper; wrappers export REFLECTOR_HOST/REFLECTOR_PREEDIT_BLOCK and exec the Python entrypoint.
Codex installer (idempotent, trust-by-hash) `scripts/install-codex.sh`	Installer generates wrapper(s), merges hook entries into `hooks.json` idempotently, computes canonical SHA-256 digests for installed hooks, writes `[hooks.state.*] trusted_hash` into `config.toml`, and validates/backups on failure.
Antigravity hooks and installer `antigravity/hooks.json`, `hooks/antigravity-reflector-hook.sh`, `scripts/install-antigravity.sh`	Antigravity JSON hook bundle wires PostToolUse/PostToolUseFailure/Stop/PreCompact to a host-pinned wrapper; installer idempotently merges `hooks.json` and updates `settings.json` to enable JSON hooks with safe-merge/backup behavior.
Grok installer and wiring `scripts/install-grok.sh`	Grok installer emits Grok-compatible `settings.json` including PreToolUse and per-event matchers/timeouts with REFLECTOR_PREEDIT_BLOCK and REFLECTOR_HOST=grok; merges idempotently via `jq` or replaces with `--force`.
Cursor pre-edit extension `scripts/install-cursor.sh`	Cursor installer adds `--pre-edit` to optionally inject PreToolUse with REFLECTOR_PREEDIT_BLOCK and updates dedupe logic to match both bare and pre-edit command variants exactly.
Installer idempotency tests `scripts/test-install-idempotent.sh`	Test harness runs per-host installers twice, inspects hook JSON totals and exact-command duplicates, and asserts idempotent merges; includes regression checks against over-stripping.
Read-only sandbox behavioral verification `scripts/test-readonly.sh`, `docs/solutions/security/readonly-sandbox-levers-need-behavioral-verification.md`	Test harness runs reviewer backends against a scratch git repo to assert INV-READONLY behavior (tracked-file fingerprinting and PASS/FAIL/SKIP) and docs describe flag-override failure modes and behavioral verification recommendations.
Documentation, security solutions, and conventions `README.md`, `CLAUDE.md`, `docs/solutions/*`	README rewritten for multi-backend architecture, CLAUDE.md expanded with reviewer/host contracts and gotchas, and many security/convention write-ups added (fan-out exceptions, redact-before-truncate, unsandboxed metadata, installer deduping, key-normalizer convention).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I hopped through hooks and manifest light,
Wrappers set hosts and pre-edit might,
Installers merge, idempotent and neat,
A sandbox test proves the read-only seat,
Carrots, docs, and safety — small crunchy delight!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely describes the three main changes: multi-backend reviewer fan-out, pre-edit block functionality, and native host-hook packaging.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch v2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR expands the reflector from a single Codex-based reviewer running in Claude Code into a two-axis system: (1) selectable, stackable reviewer backends with concurrent fan-out and merged verdicts, and (2) host-specific adapters/installers so the reflector can run inside multiple agent hosts (Claude Code, Codex, Cursor, Grok, Antigravity), including an opt-in synchronous pre-edit deny gate.

Changes:

Added a backend registry + fan-out execution in scripts/codex-reflector.py, with deterministic merge semantics and host input/output seams (including Grok and Antigravity renderers).
Introduced an opt-in PreToolUse pre-edit hard-block gate (REFLECTOR_PREEDIT_BLOCK=1) with a deny-loop breaker and strict fail-open behavior on uncertainty/infra failures.
Added behavioral read-only verification harness and new host installers/packaging artifacts (Codex native hooks + trust-by-hash, Grok/Cursor Claude-compat wiring, Antigravity JSON hook wiring + enablement).

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
scripts/test-readonly.sh	New behavioral harness to verify reviewer “read-only” behavior via write-attempts on a scratch repo.
scripts/install-grok.sh	New installer wiring Grok via Claude-compat hooks, always enabling pre-edit blocking.
scripts/install-cursor.sh	Adds `--pre-edit` option to wire the opt-in `PreToolUse` gate for Cursor.
scripts/install-codex.sh	New installer for Codex native hooks, with wrapper scripts and best-effort trust-by-hash.
scripts/install-antigravity.sh	New installer for Antigravity JSON hooks plus `enable_json_hooks` settings toggle.
scripts/codex-reflector.py	Major refactor: backend registry, fan-out + merge, pre-edit gate, host normalization/rendering seams, and read-only/security hardening.
README.md	Updated docs for multi-backend reviewers, pre-edit gate, per-host install, and limitations.
hooks/codex-reflector-preedit-hook.sh	Opt-in pre-edit wrapper for Codex plugin-dir packaging.
hooks/codex-reflector-hook.sh	Codex plugin-dir wrapper pinning `REFLECTOR_HOST=codex`.
hooks/codex-hooks.json	Codex native hook wiring (default events) for plugin-dir packaging.
hooks/codex-hooks-preedit.json	Separate opt-in Codex `PreToolUse` wiring fragment.
hooks/antigravity-reflector-hook.sh	Antigravity plugin-dir wrapper pinning `REFLECTOR_HOST=antigravity`.
docs/solutions/security/readonly-sandbox-levers-need-behavioral-verification.md	New security writeup documenting the `--full-auto` sandbox override issue and behavioral testing approach.
CLAUDE.md	Expanded architecture/invariants documentation for reviewer×host axes and new safety contracts.
antigravity/hooks.json	Antigravity JSON hook wiring (plugin-dir artifact) pinning `REFLECTOR_HOST=antigravity`.
.codex-plugin/plugin.json	Codex plugin manifest for native hook packaging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

scripts/install-cursor.sh (1)

136-155: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pre-edit hook entries are not deduplicated in merge mode.

Line 153 removes only hooks matching codex_command, but PreToolUse uses REFLECTOR_PREEDIT_BLOCK=1 .... That leaves stale pre-edit hooks and duplicates on repeated --pre-edit installs.

Proposed fix

 codex_command="python3 \"${reflector_script}\""
+codex_preedit_command="REFLECTOR_PREEDIT_BLOCK=1 ${codex_command}"
@@
-    jq --arg codex_command "$codex_command" -s '
-      def without_codex($command):
+    jq --arg codex_command "$codex_command" --arg codex_preedit_command "$codex_preedit_command" -s '
+      def without_codex($command; $preedit):
         map(
           if has("hooks") then
-            . + {hooks: (.hooks | map(select((.command // "") != $command)))}
+            . + {hooks: (.hooks | map(select(((.command // "") != $command) and ((.command // "") != $preedit))))}
           else
             .
           end
         )
@@
-              (($existing.hooks[$key] // []) | without_codex($codex_command))
+              (($existing.hooks[$key] // []) | without_codex($codex_command; $codex_preedit_command))
               + ($codex.hooks[$key] // [])

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/install-cursor.sh` around lines 136 - 155, The merge leaves duplicate
PreToolUse/pre-edit entries because without_codex($codex_command) only removes
exact matches on $codex_command but PreToolUse adds entries that include
REFLECTOR_PREEDIT_BLOCK=1; update the jq pipeline around without_codex and the
final hooks merge to filter out any hook whose (.command // "") equals
$codex_command OR contains "REFLECTOR_PREEDIT_BLOCK" (or otherwise normalize by
applying unique_by(.command) / unique on command strings after merging), so
modify the without_codex function and/or the reduce that builds hooks to exclude
those patterns and/or deduplicate by .command to prevent repeated pre-edit
hooks.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLAUDE.md`:
- Around line 102-106: The fenced code block containing the flow diagram
starting with "stdin JSON → resolve_host() → _normalize_input(host) → classify()
→ _gate_model_effort(codex-only) → ..." is missing a language tag; change the
opening fence from ``` to ```text so the block is typed (e.g., replace the
untyped fenced block with one that begins with ```text) to satisfy markdownlint
MD040.

In `@scripts/install-antigravity.sh`:
- Around line 90-117: Normalize settings_path and hooks_path to absolute paths
before computing settings_dir/hooks_dir and before creating wrapper_path and
hook_command; for example, resolve settings_path and the chosen hooks_path (from
AGY_HOOKS or default) with a realpath/readlink -f step so settings_dir,
hooks_dir and wrapper_path are all absolute, then write the wrapper and set
hook_command using that absolute wrapper_path (ensure hook_command uses the
resolved wrapper_path variable). Update references around settings_dir,
hooks_path, hooks_dir, wrapper_path and hook_command so all path variables are
the resolved absolute values.

In `@scripts/install-codex.sh`:
- Around line 161-162: The merge code is always including preedit_command in the
local list named "ours" (currently set as "ours = {hook_command,
preedit_command}"), which causes existing pre-edit hooks to be stripped when
reinstalling without the --preedit flag; change the construction of "ours" so
that preedit_command is only included when the pre-edit feature is enabled
(e.g., check the existing preedit flag/variable such as "preedit" or "--preedit"
and append preedit_command to ours conditionally), and keep the downstream merge
logic that compares "ours" and existing hooks unchanged so that a reinstall
without preedit does not remove an existing pre-edit hook.

---

Outside diff comments:
In `@scripts/install-cursor.sh`:
- Around line 136-155: The merge leaves duplicate PreToolUse/pre-edit entries
because without_codex($codex_command) only removes exact matches on
$codex_command but PreToolUse adds entries that include
REFLECTOR_PREEDIT_BLOCK=1; update the jq pipeline around without_codex and the
final hooks merge to filter out any hook whose (.command // "") equals
$codex_command OR contains "REFLECTOR_PREEDIT_BLOCK" (or otherwise normalize by
applying unique_by(.command) / unique on command strings after merging), so
modify the without_codex function and/or the reduce that builds hooks to exclude
those patterns and/or deduplicate by .command to prevent repeated pre-edit
hooks.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d932802a-93a3-4378-bca8-bf6ebb94ef2d

📥 Commits

Reviewing files that changed from the base of the PR and between 301c7e4 and 72a0817.

📒 Files selected for processing (16)

.codex-plugin/plugin.json
CLAUDE.md
README.md
antigravity/hooks.json
docs/solutions/security/readonly-sandbox-levers-need-behavioral-verification.md
hooks/antigravity-reflector-hook.sh
hooks/codex-hooks-preedit.json
hooks/codex-hooks.json
hooks/codex-reflector-hook.sh
hooks/codex-reflector-preedit-hook.sh
scripts/codex-reflector.py
scripts/install-antigravity.sh
scripts/install-codex.sh
scripts/install-cursor.sh
scripts/install-grok.sh
scripts/test-readonly.sh

When both CLAUDE_PLUGIN_ROOT and PLUGIN_ROOT are unset/empty, plugin_root collapsed to "" and the wrapper exec'd /scripts/codex-reflector.py at the filesystem root (fails). Fall back to the script's own parent dir (hooks/.. = repo root) so direct/absolute invocation of the wrapper still resolves the reflector. Addresses gemini-code-assist PR #2 review. Note: the ${X:-$(...)} default only evaluates when the var is unset, so the normal (env-set) path is unchanged. This does NOT rescue `sh "$EMPTY/hooks/..."` — that resolves to /hooks/... and fails in sh before this script runs. Op: correct Restores: spec:wrapper-resolves-reflector-path

_normalize_input dispatches host=="antigravity" to _normalize_antigravity_input (codex-reflector.py:2882), so the comment claiming "antigravity has no normalizer yet" was stale. The test still holds because the normalizer is a no-op on already-Claude-shaped input; the comment now states that idempotency property. Addresses Copilot PR #2 review. Op: correct Restores: spec:comment-matches-code

A relative --hooks-path or AGY_HOOKS baked a relative path into the wrapper's hook_command (`sh "rel/.../hook.sh"`), which fails once agy runs the hook from its own cwd. Re-resolve settings_dir/hooks_dir via `cd && pwd -P` after mkdir (the idiom install-codex.sh already uses) and rebuild the *_path vars, so the generated command is always absolute. Verified: relative --hooks-path now emits an absolute hook_command. Addresses CodeRabbit PR #2 (Major). Op: correct Restores: spec:hook-command-absolute-path

`ours` (the set of commands the idempotent merge strips before re-adding) unconditionally included preedit_command, but `desired` only re-adds the PreToolUse hook when --preedit is passed. So a plain reinstall stripped a previously opted-in pre-edit hard-block without re-adding it. Only claim ownership of preedit_command when pre-edit is enabled this run. Verified: install --preedit then plain reinstall preserves PreToolUse; PostToolUse stays a single group. Addresses CodeRabbit PR #2 (Major). Op: correct Restores: spec:preedit-hook-survives-default-reinstall

The data-flow fenced block was untyped; markdownlint MD040 flags it. Addresses CodeRabbit PR #2 (Minor). Op: correct Restores: spec:fenced-blocks-have-language

Two compress-only, behavior-preserving simplifications surfaced by an axis-decomposed simplify pass and confirmed by adversarial audit: - parse_verdict computed raw.strip() twice; bind once as `stripped` (raw is a never-reassigned param, so both sites were provably equal). - _read_tail had two comments narrating one readline(); keep the richer inline one, drop the redundant standalone. Oracle: --test-parse 270/270, ruff clean. Op: compress

…rawl) spec_needs_outfile() was a pure single-caller forwarder for `spec.output_capture == "file"` — and its sibling _build_backend_argv already inlines that exact test. Inline it at the one call site and drop the wrapper, so the file-capture predicate has one consistent form. Oracle: --test-parse 270/270, ruff clean. grep confirms no residual refs. Op: compress

respond_code_review / respond_plan_review / respond_subagent_review each hand-inlined the identical FAIL-write / PASS-clear / UNCERTAIN-no-op block (differing only in tool_name + key). CLAUDE.md names this exact triple under the "UNCERTAIN preserves prior state" invariant — a shared contract enforced only by copy-paste, free to drift. Extract _apply_verdict_state() as the single owner; the load-bearing UNCERTAIN WHY comment moves into it once. It calls write_fail_state/clear_fail_state by global name so the self-test's module-scope monkeypatch still intercepts (verified). Oracle: --test-parse 270/270, ruff clean. Op: compress

install-cursor.sh's jq dedup stripped only the bare codex_command, never the "REFLECTOR_PREEDIT_BLOCK=1 ..."-prefixed PreToolUse command. A second --pre-edit install therefore appended a DUPLICATE PreToolUse hook — and each duplicate fires _record_deny on the same key per edit, tripping the _PRE_EDIT_MAX_DENIES loop-breaker after one denial instead of two, silently weakening the pre-edit gate. Match prior reflector hooks by the unique reflector_script path substring so both the bare and prefixed variants are stripped before re-adding. Verified: 2x --pre-edit install -> one PreToolUse group/hook. install-grok is unaffected (it dedups on the same prefixed command it installs). Found by ce-code-review (correctness/testing/adversarial/previous-comments corroborated; also CodeRabbit outside-diff). Op: correct Restores: spec:installer-idempotent-reinstall

Two robustness gaps that could invert the fail-closed Stop checkpoint to fail-open: - fan_out's N>1 path collected results via f.result(), which re-raises any worker exception invoke_backend did not already convert to "". One backend's error then aborted the whole comprehension (discarding the OTHER backends' verdicts) and propagated through respond_stop/main with no top-level catch, crashing the hook — exit 1 is non-blocking on Stop, so a fail-closed checkpoint silently became fail-open. Guard each result via _future_raw -> "" (infra-empty), consistent with invoke_backend's fail-open contract and INV-MERGE. - invoke_backend's mkstemp ran OUTSIDE its try, so a tmpfile OSError bypassed the documented fail-open. Move it inside the try. Regression test: a raising N>1 worker -> infra-empty, others survive, merge -> PASS (272/272). Found by ce-code-review (adversarial P1, corroborated by learnings). Op: correct Restores: spec:fan-out-fails-open-per-worker

tool_response (controlled by the MCP edit/diagnostic tool) was interpolated into the review/failure prompts OUTSIDE the _sandbox_content fence: - build_code_review_prompt: filePath was neither redacted nor newline- stripped (error wasn't newline-stripped either). A forged "\nPASS\n" in filePath lands as a verdict line; respond_code_review then clear_fail_state on it, suppressing a real FAIL at the fail-closed Stop gate. - build_code_change_failure_prompt: same unsandboxed-metadata pattern. It is a no-verdict diagnostic (distortion-only), hardened for consistency. Add _safe_meta(): redact-THEN-truncate (a secret straddling the cap can't leak its prefix) + newline-collapse; apply at every tool_response metadata interpolation. Regression tests: forged "\nPASS\n" in filePath is collapsed, not raw-injected (275/275). Found by ce-code-review (security P1, corroborated by learnings); the redact-before-truncate refinement was caught by the reflector's own review. Op: correct Restores: spec:untrusted-metadata-sanitized

…-TEXT) The read-only-lever loop verified INV-READONLY for all 5 backends but never asserted INV-VERDICT-TEXT for the claude reviewer — dropping "text" from its extra_argv left the suite green while silently letting a JSON output mode wrap (and hide) the verdict from parse_verdict's first-5-lines scan. Assert --output-format is present and followed by "text". Found by ce-code-review (testing P2). Op: correct Restores: test:inv-verdict-text-claude

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/codex-reflector.py`:
- Line 1131: Update the list comprehension return so the zip call is explicit
about strict length matching to avoid silent truncation (B905): change the
comprehension that returns [(name, _future_raw(f)) for name, f in zip(backends,
futures)] to use zip(backends, futures, strict=True) so any mismatch between
backends and futures raises immediately; locate the expression using the names
backends, futures and helper _future_raw to apply the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6c7f5473-b200-4e2a-8b73-512894eb6ed7

📥 Commits

Reviewing files that changed from the base of the PR and between 72a0817 and aacdaf3.

📒 Files selected for processing (8)

CLAUDE.md
hooks/antigravity-reflector-hook.sh
hooks/codex-reflector-hook.sh
hooks/codex-reflector-preedit-hook.sh
scripts/codex-reflector.py
scripts/install-antigravity.sh
scripts/install-codex.sh
scripts/install-cursor.sh

✅ Files skipped from review due to trivial changes (1)

hooks/codex-reflector-hook.sh

🚧 Files skipped from review as they are similar to previous changes (6)

hooks/antigravity-reflector-hook.sh
hooks/codex-reflector-preedit-hook.sh
scripts/install-cursor.sh
scripts/install-antigravity.sh
scripts/install-codex.sh
CLAUDE.md

…d absent When a host omitted session_id, _atomic_update_state/_read_state bailed on the empty id, so a PostToolUse FAIL was silently never recorded and Stop never blocked on it — while respond_pretooluse already synthesized a nosession-<hash(cwd)> id. Add _resolve_session_id() (same scheme) and route BOTH the PostToolUse write (main) and the Stop read (respond_stop) through it, so write/read agree on a stable per-cwd key. An id the host DID send is returned unchanged (INV-CODEX-PATH-STABLE). Regression: respond_stop blocks on a missing-session FAIL via the pending- FAIL fast path (read-side wiring guarded end-to-end); empty-id drop asserted (281/281). main()'s one-line write-side resolution is not unit-testable (sys.exit) and shares the same helper. Found by ce-code-review (adversarial P3, pre-existing); the read-both-paths refinement was caught by the reflector's own review. Op: correct Restores: spec:fail-state-recorded-without-session-id

The cursor --pre-edit duplicate-hook bug (afb4478) reached review only because nothing tested installer idempotency — --test-parse is hermetic and never shells out to the installers. Add scripts/test-install-idempotent.sh: run each merging installer twice against a scratch dir and assert the hook set is stable AND free of duplicate commands per event (a count-only check would miss a duplicate already present after run 1). Covers cursor/grok/ codex/antigravity; skips jq-less installers (skip != pass); cleans scratch dirs via a trap. shellcheck-clean; negative control confirms the dupe check fails on a planted duplicate. Document it in CLAUDE.md + README; bump the documented self-test count 270 -> 281 (tests added across the review fixes). Found by ce-code-review (testing P2). Op: extend

The cursor idempotency fix (afb4478) switched the jq dedup to $reflector_script, leaving codex_command with no remaining reference — the python heredoc derives its command from sys.argv[1] independently. shellcheck SC2034. Remove it. Op: compress

…n_id (Graft) The _resolve_session_id helper (fc9ab2b) unified the nosession-<hash(cwd)> fallback for the fail-state write/read paths, but its original mirror site — respond_pretooluse's deny-loop key — kept the inline copy, so the formula lived in two places (free to drift). Route it through the helper too; now all three session-key sites share one owner. Behavior-identical (the inline expression was byte-for-byte the helper body). Docstring updated: the helper is the canonical owner, no longer "mirroring" respond_pretooluse. Oracle: --test-parse 281/281, ruff clean. Found by ce-simplify-code (reuse + quality + efficiency all flagged it). Op: compress

The idempotency fix (afb4478) used contains($reflector_script) to strip prior hooks — but that ALSO strips a user's own hook that merely references the reflector path (e.g. `mylogger python3 ".../codex-reflector.py" --tail`), clobbering it on every reinstall. Match the two EXACT commands this installer emits (bare + REFLECTOR_PREEDIT_BLOCK=1 variant); this re-adds codex_command (and a pre_edit_command sibling) that 18d7fef removed once the substring approach orphaned it. Still idempotent for --pre-edit and plain reinstalls. Harness gains an over-strip regression: a user hook referencing the path survives reinstall (now 5 checks). Found by ce-code-review round 2 (adversarial P2, reproduced) — a regression the first round's substring fix introduced. Op: correct Restores: spec:dedup-strips-only-own-commands

The metadata-sanitization fix (9a51a39) routed tool_response.* through _safe_meta but left the sibling `File: {file_path}` / `Tool: {tool_name}` header lines raw — same outside-the-fence position, same verdict-bearing prompt (build_code_review_prompt), same forged-"\nPASS\n" -> clear_fail_state injection class under a poisoned-agent threat model. Route them (and build_code_change_failure_prompt's File/Tool/Error lines) through _safe_meta, matching build_pretooluse_prompt's existing safe_path/safe_tool. Regression: a forged newline in tool_input.file_path's header is collapsed (282/282). Found by ce-code-review round 2 (security P1, corroborated by correctness). Op: correct Restores: spec:untrusted-metadata-sanitized

The two lists are built in lockstep so this is behavior-neutral, but strict= turns any future length desync into a hard error instead of silently truncating — a dropped backend on a fail-closed Stop would invert it to fail-open. Satisfies ruff B905 (CodeRabbit flagged it; the repo has no ruff config so default ruff didn't). Found by ce-code-review round 2 (correctness/adversarial/previous-comments; CodeRabbit PR #2). Op: correct Restores: spec:fan-out-no-silent-truncation

The idempotency harness used `set -e`, so a broken installer aborted the whole run before the remaining installers were tested and without a clear per-installer FAIL line. Drop `set -e` (keep `set -u`): a failed installer leaves a missing/unparseable settings file, report() reads it as -1 -> FAIL, and the harness continues through the rest of the matrix. Verified: happy path still 5/5; a missing settings file reports FAIL and continues. Found by ce-code-review round 2 (testing P3). Op: correct Restores: spec:harness-reports-each-installer

Five recur-prone learnings surfaced by this session's two ce-code-review rounds + simplify, each grounded in the commits that fixed them and adversarially audited (accuracy / schema / no-overlap): - security/unsandboxed-prompt-metadata-injection — untrusted metadata outside the _sandbox_content fence is a forged-verdict vector (_safe_meta; 9a51a39/5494478) - security/fan-out-worker-exception-inverts-fail-closed — a worker exception crashes the hook; exit 1 is non-blocking on Stop -> fail-open (_future_raw; cf19cad) - security/redact-before-truncate-ordering — truncate-first leaks a secret's prefix; redact before slicing (_safe_meta; 9a51a39) - conventions/installer-dedup-must-match-all-command-variants — dedup on the exact command-set emitted, not a path substring (install-cursor.sh; afb4478/5372732) - conventions/key-normalizer-helper-must-route-all-call-sites — a shared-key helper must route every read+write site or write/read silently disagree (_resolve_session_id; fc9ab2b/c667abc) Authored via an ultracode workflow (5 drafters -> adversarial audit); CLAUDE.md already surfaces docs/solutions/ for discovery. Op: extend

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

scripts/install-cursor.sh (1)

141-166: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Critical: Unconditional dedup strips opted-in pre-edit hook on plain reinstall.

The jq filter strips BOTH $cmd and $precmd unconditionally (line 148), but the Python heredoc only emits the PreToolUse hook when pre_edit==1 (lines 117-132). This means running install-cursor.sh (without --pre-edit) after a prior install-cursor.sh --pre-edit run will silently remove the user's opted-in PreToolUse hook.

The strip-set must match the emit-set: strip $precmd only when --pre-edit is passed, matching the Codex installer's conditional pattern (see context snippet from install-codex.sh lines 164-167 where ours conditionally includes preedit_command).

🛡️ Proposed fix: conditional strip-set

Pass the pre_edit flag to jq and conditionally strip the pre-edit command:

-    jq --arg cmd "$codex_command" --arg precmd "$pre_edit_command" -s '
+    jq --arg cmd "$codex_command" --arg precmd "$pre_edit_command" --argjson pre_edit "$pre_edit" -s '
       # Strip ONLY our two exact generated commands (bare + pre-edit variant),
-      # so reinstall is idempotent for both --pre-edit and plain runs WITHOUT
-      # clobbering a user-authored hook that merely references the reflector path.
-      def without_reflector($cmd; $precmd):
+      # when --pre-edit is passed (matching emit-set). Plain runs strip only the
+      # bare command so they do not clobber a prior --pre-edit opt-in.
+      def without_reflector($cmd; $precmd; $pre_edit):
         map(
           if has("hooks") then
-            . + {hooks: (.hooks | map(select((.command // "") as $c | $c != $cmd and $c != $precmd)))}
+            . + {hooks: (.hooks | map(select((.command // "") as $c |
+              if $pre_edit then ($c != $cmd and $c != $precmd)
+              else ($c != $cmd)
+              end
+            )))}
           else
             .
           end
         )
         | map(select((has("hooks") | not) or ((.hooks // []) | length > 0)));
 
       .[0] as $existing
       | .[1] as $codex
       | (($existing.hooks // {}) + ($codex.hooks // {}) | keys_unsorted | unique) as $keys
       | $existing + {
           hooks: reduce $keys[] as $key ({};
             .[$key] = (
-              (($existing.hooks[$key] // []) | without_reflector($cmd; $precmd))
+              (($existing.hooks[$key] // []) | without_reflector($cmd; $precmd; $pre_edit))
               + ($codex.hooks[$key] // [])
             )
           )
         }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/install-cursor.sh` around lines 141 - 166, The jq filter's
without_reflector currently always removes both $cmd and $precmd, which deletes
an opted-in PreToolUse hook when reinstalling without --pre-edit; modify the
invocation and filter so the pre-edit command is only stripped when pre_edit is
true: pass a boolean pre_edit flag into jq from the shell, change
without_reflector($cmd; $precmd) to conditionally include $precmd only when
pre_edit is true (e.g., call without_reflector($cmd; ($precmd |
select($pre_edit)))) and update the jq call site that reads "$settings_path" and
"$tmp_new" and writes "$tmp_merged" so it supplies the pre_edit value derived
from the script's pre_edit variable.

🧹 Nitpick comments (1)

scripts/test-install-idempotent.sh (1)

89-119: ⚡ Quick win

Consider adding cross-run idempotency test case.

The current Cursor test runs --pre-edit twice, catching duplicate accumulation and user-hook over-stripping. Consider adding a test case that runs --pre-edit then plain (without the flag) to verify that an opted-in PreToolUse hook is preserved, catching the conditional strip-set bug flagged in install-cursor.sh.

This would match the Codex installer's convention (context snippet 3, lines 164-167) where a plain reinstall does not remove a previously opted-in --preedit hook.

💡 Proposed additional test case

After line 95, add:

  # Cross-run preservation: opt-in with --pre-edit, then plain reinstall should
  # preserve the PreToolUse hook (not strip it).
  d2=$(mkdir_scratch)
  sh "$repo/scripts/install-cursor.sh" --pre-edit "$d2" >/dev/null 2>&1
  r1_prehook=$(grep -c 'REFLECTOR_PREEDIT_BLOCK=1' "$d2/.claude/settings.json" || echo 0)
  sh "$repo/scripts/install-cursor.sh" "$d2" >/dev/null 2>&1
  r2_prehook=$(grep -c 'REFLECTOR_PREEDIT_BLOCK=1' "$d2/.claude/settings.json" || echo 0)
  if [ "$r1_prehook" -gt 0 ] && [ "$r1_prehook" -eq "$r2_prehook" ]; then
    printf 'PASS install-cursor.sh preserves --pre-edit opt-in across plain reinstall\n'
    pass=$((pass + 1))
  else
    printf 'FAIL install-cursor.sh stripped --pre-edit hook on plain reinstall (%s -> %s)\n' "$r1_prehook" "$r2_prehook"
    fail=$((fail + 1))
  fi

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/test-install-idempotent.sh` around lines 89 - 119, Add a cross-run
idempotency test that verifies an opted-in PreToolUse hook survives a plain
reinstall: create a second scratch dir via mkdir_scratch, run sh
"install-cursor.sh" --pre-edit on it, record presence/count of the
REFLECTOR_PREEDIT_BLOCK (or equivalent PreToolUse marker) in
"$d2/.claude/settings.json", then run sh "install-cursor.sh" (plain) on the same
dir and re-check the marker; assert that the marker count before and after are
equal and non-zero and fail the test if the plain reinstall stripped the
PreToolUse hook. Use the existing pattern for reporting/pass/fail and reference
install-cursor.sh, PreToolUse, and REFLECTOR_PREEDIT_BLOCK to locate where to
add the snippet.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@scripts/install-cursor.sh`:
- Around line 141-166: The jq filter's without_reflector currently always
removes both $cmd and $precmd, which deletes an opted-in PreToolUse hook when
reinstalling without --pre-edit; modify the invocation and filter so the
pre-edit command is only stripped when pre_edit is true: pass a boolean pre_edit
flag into jq from the shell, change without_reflector($cmd; $precmd) to
conditionally include $precmd only when pre_edit is true (e.g., call
without_reflector($cmd; ($precmd | select($pre_edit)))) and update the jq call
site that reads "$settings_path" and "$tmp_new" and writes "$tmp_merged" so it
supplies the pre_edit value derived from the script's pre_edit variable.

---

Nitpick comments:
In `@scripts/test-install-idempotent.sh`:
- Around line 89-119: Add a cross-run idempotency test that verifies an opted-in
PreToolUse hook survives a plain reinstall: create a second scratch dir via
mkdir_scratch, run sh "install-cursor.sh" --pre-edit on it, record
presence/count of the REFLECTOR_PREEDIT_BLOCK (or equivalent PreToolUse marker)
in "$d2/.claude/settings.json", then run sh "install-cursor.sh" (plain) on the
same dir and re-check the marker; assert that the marker count before and after
are equal and non-zero and fail the test if the plain reinstall stripped the
PreToolUse hook. Use the existing pattern for reporting/pass/fail and reference
install-cursor.sh, PreToolUse, and REFLECTOR_PREEDIT_BLOCK to locate where to
add the snippet.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 829c40a9-d13c-4c6b-b738-dfd60dd045c0

📥 Commits

Reviewing files that changed from the base of the PR and between aacdaf3 and 1f25be2.

📒 Files selected for processing (10)

CLAUDE.md
README.md
docs/solutions/conventions/installer-dedup-must-match-all-command-variants.md
docs/solutions/conventions/key-normalizer-helper-must-route-all-call-sites.md
docs/solutions/security/fan-out-worker-exception-inverts-fail-closed.md
docs/solutions/security/redact-before-truncate-ordering.md
docs/solutions/security/unsandboxed-prompt-metadata-injection.md
scripts/codex-reflector.py
scripts/install-cursor.sh
scripts/test-install-idempotent.sh

✅ Files skipped from review due to trivial changes (5)

docs/solutions/conventions/installer-dedup-must-match-all-command-variants.md
docs/solutions/conventions/key-normalizer-helper-must-route-all-call-sites.md
docs/solutions/security/fan-out-worker-exception-inverts-fail-closed.md
docs/solutions/security/redact-before-truncate-ordering.md
README.md

🚧 Files skipped from review as they are similar to previous changes (1)

CLAUDE.md

…on learning A sixth recur-prone learning, drafted in the prior session but never persisted before compaction (confirmed via session-history search). The review cycle flagged `cp --` / `readlink --` as non-POSIX "GNUisms"; both claims are false, and stripping the guard would re-introduce a dash-prefixed-path misparse bug. - conventions/end-of-options-double-dash-is-portable-and-load-bearing — `cp` shall conform to POSIX XBD 12.2 (Guideline 10); `readlink` is not in POSIX but GNU+BSD both accept `--` via getopt; reject-with-spec-citation is the correct disposition, not the "fix" Authored via a Full ce-compound run (3 research drafters + session-history search), then adversarially fact-checked against primary sources (Open Group POSIX Issue 8, GNU coreutils, FreeBSD) — all five load-bearing claims confirmed. CLAUDE.md/AGENTS.md already surface docs/solutions/ for discovery. Op: extend

metaphorics · 2026-06-09T13:04:37Z

@coderabbitai review

coderabbitai · 2026-06-09T13:04:45Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copilot

Pull request overview

Copilot reviewed 22 out of 23 changed files in this pull request and generated 4 comments.

+settings_path="${settings_dir}/settings.json"
+tmp_new=$(mktemp)
+tmp_merged=$(mktemp)
+trap 'rm -f "$tmp_new" "$tmp_merged"' EXIT INT HUP TERM


+tmp_hooks=$(mktemp)
+tmp_merged=$(mktemp)
+tmp_settings=$(mktemp)
+trap 'rm -f "$tmp_hooks" "$tmp_merged" "$tmp_settings"' EXIT INT HUP TERM


 settings_path="${settings_dir}/settings.json"
 tmp_new=$(mktemp)
 tmp_merged=$(mktemp)
 trap 'rm -f "$tmp_new" "$tmp_merged"' EXIT INT HUP TERM


+mkdir_scratch() {
+  _d=$(mktemp -d)
+  tmpdirs="$tmpdirs $_d"
+  printf '%s' "$_d"
+}


metaphorics added 15 commits June 9, 2026 01:44

feat(host): Antigravity host — Stop-centric + host-discriminated stat…

6187ed5

…e (U11) Op: extend

docs: surface docs/solutions/ in CLAUDE.md for agent discoverability

72a0817

One-line pointer so future agents search the learnings store before debugging recurring issues (ce-compound discoverability check). Op: extend

Copilot AI review requested due to automatic review settings June 8, 2026 21:08

Copilot started reviewing on behalf of metaphorics June 8, 2026 21:08 View session

gemini-code-assist Bot reviewed Jun 8, 2026

View reviewed changes

Comment thread hooks/antigravity-reflector-hook.sh Outdated

Comment thread hooks/codex-reflector-hook.sh Outdated

Comment thread hooks/codex-reflector-preedit-hook.sh Outdated

Copilot AI reviewed Jun 8, 2026

View reviewed changes

Comment thread scripts/codex-reflector.py Outdated

Comment thread scripts/install-antigravity.sh

Comment thread scripts/test-readonly.sh

coderabbitai Bot reviewed Jun 8, 2026

View reviewed changes

Comment thread CLAUDE.md Outdated

Comment thread scripts/install-antigravity.sh

Comment thread scripts/install-codex.sh Outdated

metaphorics added 8 commits June 9, 2026 06:25

docs: tag CLAUDE.md data-flow fence as text (MD040)

2a6abcf

The data-flow fenced block was untyped; markdownlint MD040 flags it. Addresses CodeRabbit PR #2 (Minor). Op: correct Restores: spec:fenced-blocks-have-language

metaphorics added 4 commits June 9, 2026 07:23

coderabbitai Bot reviewed Jun 8, 2026

View reviewed changes

Comment thread scripts/codex-reflector.py Outdated

metaphorics added 9 commits June 9, 2026 07:43

coderabbitai Bot reviewed Jun 9, 2026

View reviewed changes

metaphorics requested a review from Copilot June 9, 2026 13:04

Copilot started reviewing on behalf of metaphorics June 9, 2026 13:04 View session

Copilot AI reviewed Jun 9, 2026

View reviewed changes

Conversation

metaphorics commented Jun 8, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Security fix (notable)

Invariants (named, tested)

Testing

Known limitations (honest)

Summary by CodeRabbit

Uh oh!

chatgpt-codex-connector Bot commented Jun 8, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot commented Jun 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

metaphorics commented Jun 9, 2026

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

metaphorics commented Jun 8, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 8, 2026 •

edited

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading