Skip to content

spec: jolt-core transcript migration to spongefish split-trait surface#1586

Open
shreyas-londhe wants to merge 22 commits into
a16z:mainfrom
shreyas-londhe:spec/jolt-core-transcript-migration
Open

spec: jolt-core transcript migration to spongefish split-trait surface#1586
shreyas-londhe wants to merge 22 commits into
a16z:mainfrom
shreyas-londhe:spec/jolt-core-transcript-migration

Conversation

@shreyas-londhe

Copy link
Copy Markdown
Contributor

Summary

Spec for follow-up PR (b) of the jolt-transcript spongefish port (#1455). Migrates jolt-core — and all remaining legacy-facade consumers (jolt-verifier, jolt-dory, jolt-sumcheck, jolt-openings, jolt-crypto) — onto jolt-transcript's spongefish-native split-trait surface (ProverTranscript / VerifierTranscript / OptimizedChallenge), deletes jolt-core's duplicate transcript stack (~1,370 LOC) and the legacy facade (legacy.rs), so a single spongefish-backed Fiat-Shamir surface remains workspace-wide.

Key decisions captured

  • Split-trait surface (positional + NARG), not the facade — the facade is retired.
  • Clean break on Fiat-Shamir bytes; correctness gate is internal prover/verifier consistency (muldiv host+zk), not byte-equality. Downstream verifier regeneration (Solidity / gnark / Lean) stays a coordinated follow-up.
  • OptimizedChallenge for Poseidon added, with a pinned 128-bit squeeze rule + entropy-audit requirement.
  • Single big-bang PR; two distinct migrations (jolt-core's own Transcript trait + the differently-shaped facade Transcript trait) each mapped per-method.

Notes

  • Planning/spec PR — no code changes; only specs/jolt-core-transcript-migration.md.
  • Self-reviewed via three independent zero-context analysis passes (15.8% → 28.4% → 9.5% ambiguity); the second pass surfaced the facade-consumer scope, now mapped.
  • Run /analyze-spec or add the spec-review label for an external pass.

Follow-up (b) of the jolt-transcript spongefish port: migrate jolt-core and
all remaining legacy-facade consumers onto jolt-transcript's split-trait
surface and retire the facade, so Fiat-Shamir routes through a single
spongefish-backed implementation workspace-wide.
@github-actions github-actions Bot added the spec Tracking issue for a feature spec label Jun 3, 2026
@moodlezoup moodlezoup added the claude-spec-review-request Triggers Claude spec analysis label Jun 3, 2026
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Claude spec review session started: https://claude.ai/code/session_01C18w8YD7eWxsDS2P9HoLd6

Copy link
Copy Markdown
Collaborator

Spec Analysis: jolt-core Transcript Migration to Spongefish Split-Trait Surface

Dimension Score Gap
Goal 0.92 Clear
Constraints 0.78 One explicit deferred design decision (transcript-poseidonchallenge-254-bit coupling) and one conditional fallback on a property the implementer must audit
Success Criteria 0.85 New jolt_core_transcript_roundtrip invariant is described conceptually — the "representative sequence" of calls is not pinned to an artifact or hand-written sequence
Context 0.90 Mostly accurate citations; minor drift (preamble at zkvm/mod.rs:312-355, not :277-320; declared 4 RV64IMAC* pairs but 2 resolve to identical Blake2b types)
Ambiguity ~13%

Status: Approved — The spec is clear enough for one-shot implementation.

Summary of what will be built:

  • Delete jolt-core/src/transcripts/ (1,370 LOC confirmed) and jolt-transcript/src/legacy.rs (287 LOC confirmed), plus the lib.rs re-exports of SpongeTranscript, MAX_LABEL_LEN, domain::{Label, LabelWithCount, U64Word}, and the Blake2bTranscript/KeccakTranscript/PoseidonTranscript aliases.
  • Re-point every transcript callsite (jolt-core: 27 files importing crate::transcripts; facade consumers jolt-verifier, jolt-dory, jolt-sumcheck, jolt-openings, jolt-crypto: 54 T: Transcript<Challenge = …> bound sites and ~55 AppendToTranscript impls per exploration) onto jolt_transcript::{ProverTranscript, VerifierTranscript, OptimizedChallenge}. Per-method mapping tables are in §Architecture for both migrations (jolt-core trait and facade trait have different shapes — explicit).
  • Add OptimizedChallenge for PoseidonSponge (confirmed currently unimplemented in crates/jolt-transcript/src/prover.rs) with a pinned 128-bit squeeze rule: squeeze one Fr via verifier_message::<Fr>, truncate to low-128 bits of its canonical LE bigint representation, re-embed via Fr::from(u128). Sponge advances by exactly one squeeze on both sides.
  • Rewire jolt-core's transcript-* Cargo features to forward to jolt-transcript's; keep challenge-254-bit (it gates JoltField::Challenge in field/ark.rs:44-49, not transcripts/).

Key invariants pinned:

  1. Prover/verifier challenge agreement under clean-break FS bytes — muldiv (host + zk) is the e2e gate; downstream Solidity/gnark/Lean verifiers are explicit non-goals (coordinated follow-ups).
  2. Existing jolt-eval soundness RedTeam invariant must continue to hold post-swap.
  3. OptimizedChallenge::challenge_128 agreement across all three sponges, including the newly-added Poseidon impl.
  4. Workspace-wide single transcript implementation: grep gate on crate::transcripts::* and jolt_transcript::{Transcript, AppendToTranscript, SpongeTranscript, …}.

Critical evaluation criteria:

  • cargo nextest run -p jolt-core muldiv --features host and --features host,zk both pass.
  • cargo nextest run -p jolt-verifier -p jolt-blindfold -p jolt-eval passes (model crate, BlindFold, invariants).
  • cargo nextest run -p jolt-sumcheck -p jolt-openings -p jolt-crypto passes (facade-consumer migration validation).
  • Clippy clean in both host and host,zk modes.
  • Extended transcript_symmetry invariant (new shared Op::OptimizedChallenge variant) plus new jolt_core_transcript_roundtrip invariant pass under cargo nextest run -p jolt-eval.

Residual items to resolve during implementation (flagged in spec, not blocking approval):

  • The "open question" on whether transcript-poseidon should retain its challenge-254-bit enable in Cargo.toml:46. The spec correctly notes that OptimizedChallenge for PoseidonSponge truncates to 128 bits regardless of JoltField::Challenge width, so the historical coupling looks vestigial — but the spec explicitly defers the decision. Recommend: keep the coupling unless the audit below proves it inert, since Mont254BitChallenge vs MontU128Challenge may affect non-transcript callers.
  • The Poseidon low-128-bit min-entropy audit (audit (b) in §Architecture). If it fails for the Circom-compatible BN254 parameters, fall back to Fr::from(s) mod 2^128 via the field's own reduction. Document the chosen rule inline at the impl.
  • The state() → spongefish state peek mapping for facade consumers is conditional on "verify no production use" — grep first; if any production callsite is found, the mapping needs to be defined upfront.

Next step: Run /implement-spec to implement this spec:

Note: the claude-spec-approved label should be added to this PR. (The remote review environment lacks the GitHub label-management MCP tool; @shreyas-londhe please add it manually, or the reviewing maintainer can.)


Generated by Claude Code

@moodlezoup moodlezoup added claude-spec-approved Claude spec analysis found no ambiguities and removed claude-spec-review-request Triggers Claude spec analysis labels Jun 3, 2026 — with Claude
Comment thread specs/jolt-core-transcript-migration.md Outdated
### Non-Goals

1. **Byte-identical Fiat-Shamir output.** This is an explicit clean break. The migrated transcript uses spongefish's duplex-sponge + NARG layout; serialized proofs produced before this PR will not verify after it, and that is acceptable. No backward-compatibility shim for old proof bytes.
2. **Downstream verifier regeneration.** The Solidity verifier (`jolt-core/src/zkvm/transpilable_verifier.rs`), the gnark transpiler (`transpiler/`), and the Lean extractor (`zklean-extractor/`) consume the transcript byte layout and will need regeneration — these are coordinated follow-up PRs, named as non-goals in the parent spec, and are out of scope here.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no Solidity verifier

Comment thread specs/jolt-core-transcript-migration.md Outdated

Clustering: `zkvm/mod.rs:277-320` (`fiat_shamir_preamble`), `zkvm/prover.rs`, `zkvm/verifier.rs`, `subprotocols/sumcheck.rs`, `subprotocols/univariate_skip.rs`, `poly/opening_proof.rs`, `poly/commitment/dory/commitment_scheme.rs`, `subprotocols/blindfold/*`. The optimized-challenge family (`challenge_scalar_optimized` / `challenge_vector_optimized`) is ~50 of the production calls — confirm by grep during implementation rather than trusting this estimate.
- **`JoltToDoryTranscript`:** the jolt-core bridge (`jolt-core/src/poly/commitment/dory/wrappers.rs:415`) is rewired to wrap a `ProverTranscript` / `VerifierTranscript` instead of jolt-core's `Transcript`. The jolt-dory bridge (`crates/jolt-dory/src/transcript.rs`) is moved from the facade onto the split-trait surface.
- **Poseidon `OptimizedChallenge`:** new impl in `jolt-transcript/src/{prover,verifier}.rs` defining `challenge_128` for `PoseidonSponge`. The Blake2b/Keccak impls (`prover.rs:64`, `:74`) are `Fr::from(verifier_message::<u128>(self))` — a 16-byte squeeze reinterpreted as a `u128`. PoseidonSponge squeezes a full 254-bit `Fr`, not bytes, so the **pinned rule** is: squeeze one field element `s = verifier_message::<Fr>(self)`, take the low 128 bits of its canonical little-endian representation, and re-embed: `Fr::from(u128::from_le_bytes(s.into_bigint().to_bytes_le()[..16].try_into()?))`. This keeps the 128-bit challenge width identical across all three sponges. **Sponge-advancement pin:** the Poseidon `challenge_128` consumes exactly **one** `Fr`-worth of squeeze on both prover and verifier (the truncation happens on the squeezed value, not by squeezing fewer elements), so both sides advance the sponge state identically and a `challenge_128` at a given position leaves the sponge where a `verifier_message::<Fr>()` at that position would (one squeeze). This is what makes prover/verifier agreement (Invariant #3) hold despite Blake2b/Keccak squeezing a 16-byte `u128` while Poseidon squeezes a full field element. **Audit requirements (must be checked, not assumed):** (a) the prover and verifier `OptimizedChallenge` impls squeeze in lock-step so they derive the same value (Invariant #3); (b) the low-128-bit truncation preserves ≥128 bits of min-entropy for the Poseidon sponge output (the canonical low limbs must be uniform). If (b) fails for the Circom-compatible BN254 parameters, fall back to `Fr::from(s) ` reduced mod `2^128` via the field's own reduction. The chosen rule and its entropy justification are documented inline at the impl.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can leave challenge_128 unimplemented! for Poseidon. For recursion, the truncation itself is costly and somewhat defeats the purpose of using Poseidon in the first place. So to answer the earlier open question, transcript-poseidon should still enable challenge-254-bit.

Now that every consumer is on the spongefish-native split traits, delete
jolt-transcript's source-compatible facade (Transcript, AppendToTranscript,
the Blake2bTranscript/KeccakTranscript aliases, the  module, legacy.rs
and its tests/benches) and replace it with the field-typed FsAbsorb /
FsChallenge / FsTranscript vocabulary in messages.rs.

Migrate the modular crates (jolt-crypto, jolt-dory, jolt-hyperkzg,
jolt-openings, jolt-sumcheck, jolt-verifier) and jolt-core's consumers:
- group/commitment types implement CanonicalSerialize instead of
  AppendToTranscript; transcripts absorb them generically
- absorb whole collections in one message and drop per-field FS labels,
  matching jolt-core's granularity
- jolt-core reuses jolt_transcript::FsAbsorb instead of a duplicate trait
- tests call prover_transcript/verifier_transcript directly

Add jolt-eval invariants for the NARG round-trip: prover/verifier symmetry,
check_eof malleability rejection, and instance-digest domain separation across
the Blake2b/Keccak/Poseidon sponges.
@Vishalkulkarni45 Vishalkulkarni45 force-pushed the spec/jolt-core-transcript-migration branch 2 times, most recently from b8991bd to dfd11a6 Compare June 8, 2026 18:40
@github-actions github-actions Bot added the implementation PR contains implementation of a spec label Jun 10, 2026
shreyas-londhe and others added 6 commits June 10, 2026 12:12
…ature

The FsChallenge impls (jolt-core/src/transcript_msgs.rs and
crates/jolt-transcript/src/messages.rs) were selected by the
`transcript-poseidon` Cargo feature via a blanket impl over every sponge.
Under that feature, a Blake2b/Keccak-backed state would have silently
switched to full-field challenges — the doc claimed gating was "on the
sponge, not the width", but the code gated on the feature. Feature
unification across a workspace build could therefore change what a byte
sponge derives.

Replace the cfg-gated blanket impls with per-sponge-type impls: Blake2b512
and Keccak keep the 128-bit u128 squeeze; PoseidonSponge gets the full-field
Fr squeeze. The modular vocabulary (messages.rs) is now implemented only for
the byte sponges, so instantiating a modular verifier over a Poseidon state
is a compile error instead of reaching the `unimplemented!()` challenge_u128
at runtime (the Poseidon impl is retained only so generic OptimizedChallenge
bounds resolve, per the maintainer decision on a16z#1586). Callsites that bound
`OptimizedChallenge` for challenge derivation move to `FsTranscript`/`FsChallenge`.

Also guard the attacker-controlled `zk_mode` proof flag: a verifier can only
check the mode its `zk` feature was built for (the proof struct and BlindFold
code differ per mode at compile time), so reject a mismatched flag up front in
JoltVerifier::new with the new ProofVerifyError::ZkModeMismatch rather than
failing downstream with an empty accumulator. Adds a regression test.

Thicken the jolt_core_transcript_consistency seed corpus with challenge runs,
back-to-back NARG frames, and a dense multi-stage interleaving so a
write/read or absorb/challenge transposition is caught below the muldiv e2e.
The spec still described the spec-phase design while the implementation had
moved on under maintainer review. Set Status=implemented/PR=1586 and amend the
sections review changed: Invariant a16z#3 and the Poseidon section now record the
maintainer's decision to leave challenge_128 unimplemented for Poseidon (full-
field challenge-254-bit instead of a 128-bit truncation, which was the old
pinned rule); the transcript-construction section describes the as-built
fiat_shamir_instance statement digest (superseding the EmptyInstance pin); the
"Solidity verifier" reference is removed per the maintainer correction; and the
transpiler workspace exclusion is noted.

Drop the two dangling references to an uncommitted notes/ directory (DEV-21 in
Cargo.toml, NARG-limits-and-requirements.md in CLAUDE.md), keeping the rationale
inline.
…pt fuzz

CI ran the fuzz workflow for the first time on this branch and surfaced two
breakages the migration left behind (fuzz targets were never compiled before):

- jolt-sumcheck `valid_prefix_proof`: the local `prover_transcript` binding
  shadowed the imported `prover_transcript` function, so the second
  construction call failed (E0618); and on a concrete `ProverState` the bare
  `.challenge()` resolved to spongefish's deprecated inherent method
  (`-D warnings` error) instead of the `FsChallenge` trait method. Rename the
  locals to `pt`/`vt` and call `FsChallenge::<Fr>::challenge` explicitly.
- jolt-transcript fuzz: the crate's `fuzz/` directory was deleted in the
  migration (its target fuzzed the now-removed legacy facade), but the
  fuzz-crates workflow still listed `jolt-transcript` in its matrix, so
  `cargo fuzz list` failed on the missing manifest. Drop the matrix entry; the
  spongefish-native surface is exercised by jolt-eval's Fuzz invariants.
…lity

0.6.2 carries the upstream fix (arkworks-rs/spongefish#171) making the
`Hash<D>` duplex sponge encode its squeeze counter at fixed `u64` width instead
of native `usize`. Under 0.6.1 a 64-bit prover and a 32-bit (wasm32) verifier
derived different Fiat-Shamir challenges, so the WASM Verifier E2E rejected
valid proofs (UniSkipVerificationError). The change is a no-op on 64-bit, so
native muldiv (host + zk) is unaffected.

Verified locally: muldiv host + host,zk pass; a wasm32 verifier built against
0.6.2 verifies a 64-bit-produced fibonacci proof end-to-end (was failing on
0.6.1).
@socket-security

socket-security Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Updatedcargo/​spongefish@​0.6.1 ⏵ 0.6.2100100100100100

View full report

shreyas-londhe and others added 2 commits June 11, 2026 13:16
…hing verifier

The `valid_prefix_proof` fuzz target mirrors the verifier's per-round transcript
to build a known-valid proof. It absorbed the round polynomial coefficient-by-
coefficient (`absorb_field` in a loop), but the verifier absorbs the whole
coefficient slice as ONE message (`RoundMessage for UnivariatePoly` ->
`absorb_field_slice`). N separate length-prefixed messages are a different sponge
input than one concatenated message, so prover-side and verifier-side challenges
diverged and a fully-valid proof's final eval disagreed with the prover-side
running sum -- the fuzzer hit the assert_eq! at line 131.

Latent until the target was made to compile again; absorb as a single slice so
the prover model matches the verifier exactly. Verified with a deterministic
replica over 8 num_vars x 6 degrees x 20 seeds.
Vishalkulkarni45 and others added 2 commits June 15, 2026 09:55
Address two HIGH review findings on the ported field-aligned Poseidon
work: replace #[allow(non_snake_case)] with #[expect(...)] per repo lint
policy, and assert the reachable-child invariant in AST arena compaction
so a broken topology fails loudly instead of indexing usize::MAX.
@moodlezoup moodlezoup added the claude-review-request Request a review from Claude Code label Jun 16, 2026
@github-actions

Copy link
Copy Markdown
Contributor

Claude code review session started: https://claude.ai/code/session_01BmmGr6eb3uqgTbvbzbuSYq

@moodlezoup moodlezoup left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the transcript migration across four parallel passes (semantic consistency, deep-bug, soundness/security, simplification), focused on the Fiat-Shamir-critical surface.

No soundness or correctness issues found. Specifically verified and clean:

  • Absorb/squeeze ordering preserved 1:1 vs. the pre-PR code across the preamble, all 8 stages, sumcheck/uni-skip round loops, BlindFold, opening proofs, and the Dory bridge — every commitment/round-poly/opening is bound before the challenge that depends on it.
  • public_message vs prover_message classification is correct: prover-only values (witness/advice commitments, round polys, openings) go into the NARG; shared/recomputable values are absorbed on both sides.
  • Challenge embeddings preserved (no plain↔optimized swaps); Poseidon's challenge_u128/challenge_128 unimplemented!() and parse_scalar_units' asserts are unreachable from attacker-controlled proof bytes.
  • fiat_shamir_instance digest binds the same field set/order the old fiat_shamir_preamble did, computed byte-identically on both sides; NARG framing is length-prefixed and usize::try_from-guarded, with staged cursors and a forward-progress check; check_eof runs on the success path.

One minor maintainability note inline. Also: the PR description still says "Planning/spec PR — no code changes; only specs/jolt-core-transcript-migration.md," but this is the full 173-file implementation (the spec header reads Status: implemented) — worth updating so reviewers aren't misled about scope.


Generated by Claude Code

Comment thread transpiler/src/narg_parser.rs Outdated
.unwrap(),
);
let body_start = offset + FRAME_LEN_PREFIX_BYTES;
let body_remaining = (narg.len() - body_start) as u64;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

parse_narg re-implements the 8-byte LE length-prefix framing that crates/jolt-transcript/src/codec.rs::read_length_prefixed_body declares "the one authoritative parser ... so the accept/reject behavior cannot drift." Two independent copies of the NARG framing is exactly the drift that doc set out to prevent.

Not a correctness bug today — the as u64/as usize casts here are safe because the len > body_remaining check (line 119) runs in u64 space before the len as usize at line 126, so a length above usize::MAX is rejected just as read_length_prefixed_body's usize::try_from rejects it. But consider making read_length_prefixed_body (or a thin split_frame wrapper) pub and looping over it here, so the framing logic lives in one place.


Generated by Claude Code

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx for the review will fix it

@moodlezoup moodlezoup left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to keep the new crates decoupled from ark-serialize.
Also, can you split this into multiple PRs to make it easier to review?

@Vishalkulkarni45

Copy link
Copy Markdown
Contributor

We should try to keep the new crates decoupled from ark-serialize. Also, can you split this into multiple PRs to make it easier to review?

sure, do you want this in two PRs (one for the spongefish migration and cleanup, and one for the new transpiler), or should I split it up even further?

@Vishalkulkarni45

Copy link
Copy Markdown
Contributor

We should try to keep the new crates decoupled from ark-serialize. Also, can you split this into multiple PRs to make it easier to review?

do you mean we should remove ark-serialize usage entirely from the new crates, rather than just hiding it behind a local trait/API boundary and tests?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

claude-review-request Request a review from Claude Code claude-spec-approved Claude spec analysis found no ambiguities implementation PR contains implementation of a spec spec Tracking issue for a feature spec

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants