feat(v0.9.5): full VM reversal methodology in ciphertext-recovery SKILL#7

Merged
icloudza merged 1 commit into main from claude/v0.9.5-vmp-methodology on May 11, 2026

Conversation

@icloudza

Owner

Summary

skills/ciphertext-recovery/SKILL.md previously had a one-paragraph treatment of VMP / in-house VMs, plus a "bypass via IO-buffer semantic ops" strategy. The bypass strategy is correct as the default path for most VMP tasks, but the SKILL had no guidance for cases where the user explicitly asks for a complete byte-code → executable Python decoder, or where bypass deadlocks because the IO buffer state lives entirely inside the VM context (invisible to trace).

Without structured guidance for these escalation cases, the agent would either give up or, worse, ship a half-reversed "decoder" with fabricated handler semantics. VMP reversal is brittle: one wrong bit in the opcode bit-field decode produces 100+ lines of plausible-looking but semantically false listing. This PR closes that gap.
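To make the brittleness concrete, here is a minimal sketch of how a one-bit error in the assumed opcode width yields a different but equally plausible decode. The 16-bit word, the field layout, and the names `Insn` and `decode` are illustrative assumptions, not the layout of any real VM or anything defined in the SKILL.

```python
from typing import NamedTuple


class Insn(NamedTuple):
    opcode: int
    dst: int
    src: int


def decode(word: int, opcode_bits: int, src_bits: int = 5) -> Insn:
    """Split a 16-bit word into opcode / dst / src fields.

    Hypothetical layout: opcode in the top `opcode_bits` bits, src in the
    low `src_bits` bits, dst in whatever lies between them.
    """
    dst_bits = 16 - opcode_bits - src_bits
    src = word & ((1 << src_bits) - 1)
    dst = (word >> src_bits) & ((1 << dst_bits) - 1)
    opcode = word >> (16 - opcode_bits)
    return Insn(opcode, dst, src)


word = 0x0E22
right = decode(word, opcode_bits=6)  # the layout we pretend is correct
wrong = decode(word, opcode_bits=5)  # same word, opcode width off by one bit
print(right)  # Insn(opcode=3, dst=17, src=2)
print(wrong)  # Insn(opcode=1, dst=49, src=2)
```

Both results look like valid instructions, so nothing in the listing itself reveals the wrong guess. Only a check against the trace can tell them apart.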

What's added

A "Full VM reversal: 4-stage workflow" section (完整 VM 还原 4 阶段流程, +140 lines in SKILL.md) with strict brittleness gates at every transition:

| Stage | Purpose | Gate (must pass to advance) |
| --- | --- | --- |
| A. VMP identification | confirm it's actually VMP, not OLLVM-fla | 3 necessary conditions (high-frequency dispatcher + computed-goto + persistent VM context register), ANDed: all must ✓ |
| B. Opcode schema derivation | determine word size / endian / bit-field / encoding state / PC stride | 100-opcode frequency check + multi-handler hit check + no-ghost-opcode check. 99% does NOT pass; needs 100% |
| C. Per-handler iterative decompilation | reverse each handler with round-trip emulation | per-handler hypothesis_add → conclude with falsification_evidence proving Python emulator output = trace mem_w output |
| D. Business-level closed-loop verification | bit-for-bit business-level match | 3 levels (instruction 100% / block 100% / business bit-for-bit), all must pass |
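The Stage C gate (emulator output must equal the trace's memory writes) can be sketched as follows. This is a toy under stated assumptions: `MemWrite`, `emulate_xor_store`, `round_trip_mismatches`, and `gate_passes` are hypothetical names, the XOR-store semantics is an invented handler hypothesis, and the sample records stand in for whatever the real trace tooling exports.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class MemWrite:
    addr: int
    value: int


# (registers before the handler ran, operand, memory writes seen in the trace)
Sample = Tuple[Dict[str, int], int, List[MemWrite]]
Handler = Callable[[Dict[str, int], int], List[MemWrite]]


def emulate_xor_store(regs: Dict[str, int], operand: int) -> List[MemWrite]:
    """Hypothesised semantics: store (r0 XOR operand) at the address in r1."""
    return [MemWrite(regs["r1"], (regs["r0"] ^ operand) & 0xFFFFFFFF)]


def round_trip_mismatches(handler: Handler, samples: List[Sample]) -> List[int]:
    """Return the indices of samples where emulation disagrees with the trace."""
    return [
        i
        for i, (regs, operand, traced) in enumerate(samples)
        if handler(regs, operand) != traced
    ]


def gate_passes(handler: Handler, samples: List[Sample]) -> bool:
    """Pass only with at least one sample and zero mismatches (100%, not 99%)."""
    return len(samples) > 0 and not round_trip_mismatches(handler, samples)


samples: List[Sample] = [
    ({"r0": 0x0F, "r1": 0x1000}, 0xF0, [MemWrite(0x1000, 0xFF)]),
    ({"r0": 0xAA, "r1": 0x1004}, 0xFF, [MemWrite(0x1004, 0x55)]),
]
print(gate_passes(emulate_xor_store, samples))  # True
```

The list of mismatching sample indices is the kind of concrete record that could serve as falsification evidence when a handler hypothesis is concluded or rejected.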

Anti-hallucination scaffold wiring

Every stage explicitly invokes the v0.9.0–v0.9.3 scaffold:

  • Stage B schema concludes with falsification_evidence (FIX#5)
  • Stage C handler audits via hypothesis-reviewer (FIX#6 hard gate; mandatory every 10 handlers when count > 30)
  • Stages chain via depends_on — Stage A abandon cascades to B/C/D (FIX#4 abandon-cascade)
  • Stage D pass → write_artifact [H<n>] citations link the whole chain (FIX#7 — non-load-bearing handlers can be hypothesis_archived)
  • The v0.9.3 high-confidence tier marker gate ensures that "complete decoder" language appears ONLY when Stage D passes bit-for-bit
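The abandon-cascade over depends_on links can be pictured as a transitive walk over a dependency graph. The sketch below is a toy model, not the plugin's implementation: the function `abandon_cascade`, the dictionary shape, and the hypothesis IDs are all assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def abandon_cascade(depends_on: Dict[str, List[str]], root: str) -> Set[str]:
    """Return `root` plus every hypothesis that transitively depends on it."""
    # Invert the edges: parent -> hypotheses that list it in depends_on.
    dependents: Dict[str, List[str]] = {}
    for hyp, parents in depends_on.items():
        for parent in parents:
            dependents.setdefault(parent, []).append(hyp)

    abandoned = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in abandoned:
                abandoned.add(child)
                queue.append(child)
    return abandoned


# Hypothetical chain: each stage's hypothesis depends on the previous stage.
chain = {
    "H1_stage_A": [],
    "H2_stage_B": ["H1_stage_A"],
    "H3_stage_C": ["H2_stage_B"],
    "H4_stage_D": ["H3_stage_C"],
    "H5_unrelated": [],
}
print(sorted(abandon_cascade(chain, "H1_stage_A")))
# ['H1_stage_A', 'H2_stage_B', 'H3_stage_C', 'H4_stage_D']
```

Abandoning the Stage A hypothesis takes B, C, and D with it, while an unrelated hypothesis is left untouched.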

Explicit non-automatable boundary

Four scenarios are documented as out-of-scope for the pure-trace + algokiller workflow. The agent MUST either record the blocker as contradicting evidence (FIX#2 auto-caps confidence at low) or hypothesis_abandon the full VM-reversal track:

  • Encrypted opcode with runtime-decrypted key
  • Self-modifying opcode stream
  • VM-internal state-integrity checks
  • JIT-style runtime native code emission (that's JIT, NOT VMP)

"Half-decoder is 10× more misleading than 'I can't see it' is" — verbatim in the SKILL.
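The confidence cap described above can be illustrated with a very small model. This is only a sketch of the stated rule (contradicting evidence caps confidence at low). The `Confidence` enum and `effective_confidence` function are hypothetical and do not mirror the plugin's real data model.

```python
from enum import IntEnum
from typing import List


class Confidence(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def effective_confidence(claimed: Confidence, contradicting: List[str]) -> Confidence:
    """Cap confidence at LOW as soon as any contradicting evidence is recorded."""
    if contradicting:
        return min(claimed, Confidence.LOW)
    return claimed


blockers = ["opcode stream is rewritten at runtime (self-modifying)"]
print(effective_confidence(Confidence.HIGH, blockers).name)  # LOW
print(effective_confidence(Confidence.HIGH, []).name)        # HIGH
```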

Brand hygiene maintained

Methodology distilled from a generic VM reversal playbook circulating in the security research community. No brand-specific opcode tables, no app-specific case studies; the v0.9.4 brand hygiene policy is fully respected.

Test plan

  • No engine change → native 146/146 PASS unchanged
  • No handler change → Python 83/83 PASS unchanged
  • Brand residue scan: 0 hits in skills/agents/docs/server/tools/examples
  • CI on PR

Why its own release (v0.9.5 not v0.9.4 amendment)

Adds a new behavioural path the agent didn't have before (full reversal vs bypass). Server / handlers / schemas / gates unchanged — plugin functions identically at the API level; the difference is agent-visible methodology only. Big enough to deserve a version.

skills/ciphertext-recovery/SKILL.md previously had a one-paragraph
treatment of VMP / in-house VMs with a "bypass via IO-buffer semantic ops"
strategy. The bypass strategy is correct as the default path for most
VMP tasks. But the SKILL had no guidance for cases where:

- The user explicitly asks for a complete byte-code → executable Python
  decoder, OR
- The bypass path deadlocks because the IO buffer's intermediate state
  lives entirely inside the VM context (invisible to trace).

Without a structured methodology for these escalation cases, the agent
would either give up or, worse, ship a half-reversed "decoder" with
fabricated handler semantics. VMP reversal is brittle — one wrong bit
in the opcode bit-field decode produces 100+ lines of plausible but
semantically false listing.

Added a "Full VM reversal: 4-stage workflow" section (+140 lines) with
strict brittleness gates at every transition:

Stage A: VMP identification (3 necessary conditions, all must ✓)
Stage B: opcode schema derivation (100-opcode frequency check +
          multi-handler hit check + no-ghost-opcode check; 99% does NOT
          pass)
Stage C: per-handler iterative decompilation (per-handler round-trip
          emulation; handler count > 30 → mandatory reviewer audit
          every 10)
Stage D: business-level closed-loop verification (3 levels:
          instruction 100% / block 100% / business bit-for-bit; all
          must pass, otherwise the work is not finished)

Every stage explicitly wires into the v0.9.0-v0.9.3 anti-hallucination
scaffold:

- Stage B schema concludes with falsification_evidence (FIX#5)
- Stage C handler audits via hypothesis-reviewer (FIX#6 hard gate)
- Stages chain via depends_on (FIX#4 abandon-cascade)
- Stage D pass → write_artifact with [H<n>] citations
- v0.9.3 high-confidence tier marker enforces "complete decoder"
  language ONLY appears when Stage D passes bit-for-bit

An explicit non-automatable boundary documents 4 scenarios as
out-of-scope for the pure-trace + algokiller workflow:

- Encrypted opcode with runtime-decrypted key
- Self-modifying opcode stream
- VM-internal state-integrity checks
- JIT-style runtime native code emission (this is JIT, NOT VMP)

In these cases the SKILL mandates the agent either record the blocker
as contradicting evidence (FIX#2 auto-caps confidence at low) or
hypothesis_abandon the full VM-reversal track. "Half-decoder is 10×
more misleading than 'I can't see it' is" — verbatim in the SKILL.

Reference: methodology framework distilled from a generic VM reversal
playbook circulating in the security research community; specific
opcode values / brand-bound case studies deliberately excluded per
v0.9.4 brand hygiene policy.

Tests: no engine change. native 146/146 PASS. python 83/83 PASS.
Plugin behaviour at API level is identical to v0.9.4; the difference
is agent-visible methodology in the ciphertext-recovery skill.
@icloudza icloudza merged commit eed26ab into main May 11, 2026
2 checks passed