spec: jolt-crypto performance optimizations by 0xAndoroid · Pull Request #1453 · a16z/jolt

0xAndoroid · 2026-04-20T19:17:54Z

Summary

Spec capturing nine targeted performance optimizations for the jolt-crypto crate (merged in #1368), identified during post-merge review. All optimizations preserve the public API and correctness invariants; only wall-clock time and allocator pressure change.

Optimizations covered:

field_to_fr specialization — skip byte-serialization roundtrip when F == jolt_field::Fr
MSM batch-normalize — replace per-point into_affine with normalize_batch
GT MSM sliding-window exponentiation with shared squarings
wNAF signed-digit in shamir_glv_mul_2d / shamir_glv_mul_4d
Precomputed 256-entry Shamir table for 4D GLV online path
Parallelize batch_g1_additions_multi_affine_inner post-inversion loop
Cache GLV 2D SCALAR_DECOMP_COEFFS in LazyLock
Native i128/u128 arithmetic in decompose_scalar_2d (drop num_bigint)
Cache FrobeniusCoefficients as const or LazyLock
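
The caching items in this list (7 and 9) share one pattern: build an expensive constant once and reuse it on every call. A minimal sketch of that pattern with `std::sync::LazyLock` follows. The type `DecompCoeffs`, the placeholder values, and the build counter are all hypothetical and are not taken from the jolt-crypto sources.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Counts how often the (hypothetical) expensive constructor runs.
static BUILD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the cached coefficients; the real crate stores big integers.
pub struct DecompCoeffs {
    pub coeffs: [u128; 4],
}

fn build_coeffs() -> DecompCoeffs {
    BUILD_COUNT.fetch_add(1, Ordering::SeqCst);
    // Placeholder values, not the real BN254 lattice coefficients.
    DecompCoeffs { coeffs: [3, 5, 7, 11] }
}

/// Initialized on first access, then shared by every later call.
static COEFFS: LazyLock<DecompCoeffs> = LazyLock::new(|| build_coeffs());

/// Example consumer: reads the cached coefficients without rebuilding them.
pub fn sum_with_cached_coeffs(x: u128) -> u128 {
    COEFFS.coeffs.iter().map(|c| c * x).sum()
}

/// Number of times the constructor has run so far.
pub fn build_count() -> usize {
    BUILD_COUNT.load(Ordering::SeqCst)
}
```

Calling `sum_with_cached_coeffs` repeatedly runs `build_coeffs` only once, which is the allocation saving the spec is after.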

Includes four new jolt-eval invariants (MSM vs naive, GLV vector vs naive, batch-addition vs naive, scalar-decomp reconstruction) and four new performance objectives (jolt_crypto_g1_msm_1024, jolt_crypto_gt_scalar_mul, jolt_crypto_g1_scalar_mul, jolt_crypto_pedersen_commit_1024) to mechanically gate correctness and measure impact.

Primary correctness gate is the existing muldiv e2e test in both --features host and --features host,zk.

Test plan

Review the spec for completeness and scope
Run /analyze-spec to score ambiguity and surface gaps
Attach spec label (GitHub Action does this automatically)
Optionally attach claude-spec-review-request for external analysis
Approve scope, then open implementation PR(s) per the execution order in the spec

Capture nine targeted BN254 hot-path optimizations identified during post-#1368 review: field_to_fr specialization, MSM batch-normalize, GT sliding-window exp, wNAF Shamir, 4D precomputed Shamir table, parallelized batch_addition post-inversion, cached GLV 2D coeffs, native i128/u128 decomposition, cached Frobenius coefficients. Includes four new jolt-eval invariants and four new perf objectives to mechanically gate correctness and measure impact.

0xAndoroid · 2026-04-20T19:19:04Z

Spec Analysis: jolt-crypto Performance Optimizations

| Dimension | Score | Weight | Weighted | Gap |
| --- | --- | --- | --- | --- |
| Goal | 0.90 | 0.35 | 0.315 | Clear: nine concrete optimizations, each with file/function location; freeze on public API stated explicitly |
| Constraints | 0.85 | 0.20 | 0.170 | Perf thresholds are absolute (≥15%, ≥10%, ≥2×), but the baseline hardware / CI-runner / noise-floor protocol is implicit; acceptable since `jolt-eval` already standardizes Criterion runs |
| Success Criteria | 0.90 | 0.30 | 0.270 | Clear: 13 checkbox criteria, four new `jolt-eval` invariants named, four new perf objectives named, fuzz run budget specified |
| Context | 0.95 | 0.15 | 0.143 | Clear: file-by-file impact table, call-graph diagram, alternatives section, arkworks pin referenced |
| Ambiguity | | | ~10% | |

Status: Approved — spec is clear enough for one-shot implementation.

Summary of what will be built:

Nine BN254 hot-path optimizations inside crates/jolt-crypto/src/ec/bn254/ (field_to_fr specialization, MSM batch-normalize, GT sliding-window, wNAF Shamir 2D/4D, precomputed 4D Shamir table, parallel batch-addition post-inversion, cached GLV 2D coeffs, native i128/u128 2D decomp, cached Frobenius coefficients).
Four new jolt-eval invariants + Fuzz targets (MSM, GLV-vector, batch-addition, scalar-decomp).
Four new jolt-eval performance objectives (jolt_crypto_g1_msm_1024, jolt_crypto_gt_scalar_mul, jolt_crypto_g1_scalar_mul, jolt_crypto_pedersen_commit_1024).

Key invariants preserved:

Bit-for-bit API-level equivalence (same output for same input, unchanged serialization bytes).
muldiv e2e passes in both --features host and --features host,zk — canonical BlindFold / Fiat-Shamir gate.
No new unsafe; existing #[repr(transparent)] casts untouched.

Critical evaluation criteria:

≥15% on g1_msm/1024, ≥10% on pedersen_commit/1024, ≥2× on gt_scalar_mul, ≥10% on g1_scalar_mul.
≤2% regression ceiling on unrelated crypto benches.
prover_time_secp256k1_ecdsa_verify within noise (ideally 5–10% faster).

Minor advisory (not gating):

The baseline hardware / noise-floor protocol is implicit — the implementer should pin benches to the jolt-eval CI runner and use Criterion's --save-baseline pre-perf-opts / --baseline pre-perf-opts convention.
The "binary-compat fixture" for checked-in G1 deserialization is delegated to the implementer; a 5-point hex fixture covering identity + generator + three random points is a reasonable interpretation.
Optimization (6) lists two parallelization options (Mutex<Vec<G1Affine>> vs. pre-sized split Vec<Vec<_>>); the spec names the second as preferred — implementer should confirm via microbench.
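
To make the second option in advisory (6) concrete, here is a minimal sketch of the "pre-sized split buffers" shape. It assumes `std::thread::scope` in place of the crate's rayon iterators and uses plain integers as stand-ins for curve points. The function name `pairwise_reduce_parallel` is hypothetical.

```rust
use std::thread;

/// Sums adjacent pairs inside each set, one worker per set.
/// Integers stand in for curve points; real code would apply the affine
/// addition formula using the batch-inverted denominators.
pub fn pairwise_reduce_parallel(sets: &[Vec<u64>]) -> Vec<Vec<u64>> {
    // One pre-sized output buffer per set, so workers never share a slot
    // and no Mutex is needed.
    let mut outputs: Vec<Vec<u64>> = sets
        .iter()
        .map(|s| Vec::with_capacity((s.len() + 1) / 2))
        .collect();

    thread::scope(|scope| {
        for (set, out) in sets.iter().zip(outputs.iter_mut()) {
            // Each worker owns exactly one `&mut Vec<u64>`.
            scope.spawn(move || {
                for pair in set.chunks(2) {
                    out.push(pair.iter().copied().fold(0u64, u64::wrapping_add));
                }
            });
        }
    });

    outputs
}
```

Spawning one thread per set is only for illustration; a work-stealing pool avoids that cost. The point is that disjoint `&mut` buffers remove the lock contention a shared `Mutex<Vec<_>>` would introduce.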

Next step: Run /implement-spec to implement this spec:

Open in Claude Code (cloud) — run /implement-spec on this branch
Or run /implement-spec locally in Claude Code

graphite-app · 2026-04-20T19:21:11Z

+2. **MSM batch-normalization**: Replace per-point `b.0.into_affine()` in the `impl_jolt_group_wrapper!` `msm` path with `<$projective>::normalize_batch(...)` so a single inversion amortizes across all input points, matching the pattern already used in `multi_pairing`.
+3. **GT MSM sliding-window exponentiation with shared squarings**: Replace the serial `for` loop in `Bn254GT::msm` with per-base windowed exponentiation that amortizes squarings across scalar bit positions (e.g., simultaneous multi-exponentiation à la Straus for small batches, or windowed per-base with a shared accumulator for large batches).
+4. **wNAF signed-digit in Shamir's trick**: Replace naive bit-by-bit double-and-add in `shamir_glv_mul_2d` and `shamir_glv_mul_4d` with wNAF (width-4 for 2D, width-5 for 4D) including sign-aware precomputed odd-multiple tables per base.
+5. **Precomputed 16-entry Shamir table for 4D GLV online path**: Extend the 2D fixed-base precomputation pattern (`PrecomputedShamir2Table`, 16 entries) to 4D with `PrecomputedShamir4Table` (256 entries = 4 points × 2 sign bits = 8 bits), invoked from `glv_four_scalar_mul_online` and both `dory_g2` vector ops.


The optimization title says "Precomputed 16-entry Shamir table for 4D GLV" but the description states it will have 256 entries (PrecomputedShamir4Table with 256 entries = 2^8 from 4 decomposed scalars × 2 sign bits each). The title should say "Precomputed 256-entry Shamir table for 4D GLV online path" to match the actual implementation specification. The 16 entries refers to the existing 2D table being extended, not the new 4D table size.

Suggested change

-5. **Precomputed 16-entry Shamir table for 4D GLV online path**: Extend the 2D fixed-base precomputation pattern (`PrecomputedShamir2Table`, 16 entries) to 4D with `PrecomputedShamir4Table` (256 entries = 4 points × 2 sign bits = 8 bits), invoked from `glv_four_scalar_mul_online` and both `dory_g2` vector ops.
+5. **Precomputed 256-entry Shamir table for 4D GLV online path**: Extend the 2D fixed-base precomputation pattern (`PrecomputedShamir2Table`, 16 entries) to 4D with `PrecomputedShamir4Table` (256 entries = 4 points × 2 sign bits = 8 bits), invoked from `glv_four_scalar_mul_online` and both `dory_g2` vector ops.

Spotted by Graphite
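
The 16-versus-256 entry counts can be checked with a toy table builder. This sketch assumes an index layout of N selection bits followed by N sign bits, which is one reading of the spec and not confirmed by the source. Signed integers stand in for curve points, and `build_table` is a hypothetical name.

```rust
/// Builds a sign-aware Shamir lookup table for `N` bases.
/// Assumed index layout: the low N bits select which bases participate,
/// the next N bits say which of them are negated.
pub fn build_table<const N: usize>(bases: [i64; N]) -> Vec<i64> {
    let entries = 1usize << (2 * N); // 2 bases -> 16, 4 bases -> 256
    (0..entries)
        .map(|idx| {
            let mut acc = 0i64;
            for (i, base) in bases.iter().enumerate() {
                let selected = (idx >> i) & 1 == 1;
                let negative = (idx >> (N + i)) & 1 == 1;
                if selected {
                    acc += if negative { -base } else { *base };
                }
            }
            acc
        })
        .collect()
}
```

With this layout, entries whose sign bit is set for an unselected base are redundant, so 256 is an upper bound on distinct values rather than a tight count.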


github-actions · 2026-04-20T19:53:39Z

Claude spec review session started: https://claude.ai/code/session_013hzc5MRUjJr6fnHgP1zS4k

0xAndoroid · 2026-04-20T19:57:39Z

Spec Analysis: jolt-crypto Performance Optimizations (fresh pass)

| Dimension | Score | Weight | Weighted | Gap |
| --- | --- | --- | --- | --- |
| Goal | 0.90 | 0.35 | 0.315 | Nine optimizations named with file + function; one small mischaracterization (see below) |
| Constraints | 0.82 | 0.20 | 0.164 | Absolute perf deltas set, but CI hardware / noise-floor stabilization left implicit; binary-compat fixture shape unspecified |
| Success Criteria | 0.90 | 0.30 | 0.270 | 13 checkbox criteria, 4 new invariants + 4 new objectives named; 4D scalar-decomp reconstruction relation left implicit ("with all four λⁱ powers") |
| Context | 0.92 | 0.15 | 0.138 | File-by-file table + call-graph; all paths verified to exist |
| Ambiguity | | | ~11% | |

Status: Approved — spec is clear enough for one-shot implementation. PR already carries claude-spec-approved, no label change needed.

Independent codebase verification (every file/symbol referenced in the spec exists):

crates/jolt-crypto/src/ec/bn254/{mod.rs, gt.rs, batch_addition.rs} ✓
crates/jolt-crypto/src/ec/bn254/glv/{glv_two.rs, glv_four.rs, dory_g1.rs, dory_g2.rs, decomp_2d.rs, decomp_4d.rs, constants.rs, frobenius.rs, power_of_2_decompositions.rs (2799 lines)} ✓
field_to_fr at mod.rs:261; impl_jolt_group_wrapper! at mod.rs:14; Bn254GT::{msm,scalar_mul} at gt.rs:161,168 ✓
shamir_glv_mul_2d/4d, PrecomputedShamir2Table (glv_two.rs:46), glv_four_scalar_mul_online (glv_four.rs:18), batch_g1_additions_multi_affine_inner (batch_addition.rs:117), decompose_scalar_2d (decomp_2d.rs:34), SCALAR_DECOMP_COEFFS (decomp_2d.rs:18) ✓
get_frobenius_coefficients, FrobeniusCoefficients, frobenius_psi_power_projective ✓
crypto Criterion benches expose every ID the spec mentions (g1_msm/{4,16,64,256,1024}, pedersen_commit/*, gt_scalar_mul, g1_scalar_mul, g1_add, g1_double, pairing, multi_pairing/{2,4,8,16}, g1_serialize_bincode, g1_deserialize_bincode) ✓
jolt-eval/sync_targets.sh, jolt-eval/src/objective/performance/prover_time.rs (existing prover_time_fibonacci_100 pattern), jolt-eval/src/invariant/{soundness.rs, split_eq_bind.rs} ✓

Minor advisories — not gating:

Optimization (8) description is slightly misleading. The spec says native i128/u128 in decompose_scalar_2d should mirror "the approach already used in decompose_scalar_4d." But decompose_scalar_4d uses a table-based approach (POWER_OF_2_DECOMPOSITIONS walk with u128 wrapping accumulators) — it avoids lattice multiplication entirely. The 2D port needs a fresh native-int lattice reduction (the round(b̂ᵢ · scalar / det) step cast to i128 via the 128-bit-fits-lattice-basis argument), not a port of the 4D logic. The implementer will notice this from the code, but the phrasing in the "Optimizations" section of the spec could name the 4D pattern more precisely (e.g., "mirroring the native-u128 accumulator convention in decomp_4d.rs, lines 22–80, but applied to 2D lattice reduction rather than table walk").
Optimization (5) bit-accounting. The summary header says "256-entry Shamir table"; the body reads "256 entries = 4 points × 2 sign bits = 8 bits." The arithmetic resolves (4 bases, each contributing one scalar bit and one sign bit ⇒ 8 selector bits ⇒ 2⁸ = 256), but the "4 points × 2 sign bits" phrasing is ambiguous. Consider rewriting as "4 base-point bits + 4 sign bits = 8 selector bits ⇒ 256 entries."
Noise-floor protocol. The ≤2% regression ceiling and ≤1% noise floor on jolt-eval perf objectives are quantitatively sharp but assume jolt-eval's standard CI runner / Criterion --baseline protocol. If benches are run on a dev laptop, expect ≥5% variance; the implementer should pin benches to the same machine + --warm-up-time 3 convention.
Invariant 4 (4D reconstruction). "Symmetric for decompose_scalar_4d with all four λⁱ powers" is terse. A concrete restatement: Σᵢ sᵢ · |kᵢ| · λⁱ ≡ scalar (mod r) for i ∈ {0,1,2,3}, where |kᵢ| is the magnitude of the i-th component and sᵢ ∈ {+1, −1} is its sign.
Binary-compat fixture. The acceptance criterion mentions a "checked-in fixture or explicit round-trip" but leaves the format unspecified. Implementer latitude: a .bincode fixture or a const HEX: &[&str] array of 3–5 points covering identity, generator, and random bases is the natural fit.
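
The reconstruction relation in the Invariant 4 advisory can be sketched as a small checker. This is an illustration over a toy modulus that fits in `u128`, not the BN254 scalar field. The `Component` type and `reconstruct` function are hypothetical names.

```rust
/// One decomposed component: magnitude plus sign flag.
#[derive(Clone, Copy)]
pub struct Component {
    pub magnitude: u128,
    pub negative: bool,
}

/// Computes the signed sum of magnitude_i * lambda^i (mod r), i = 0, 1, 2, ...
/// Toy arithmetic only: assumes r < 2^63 so the products cannot overflow u128.
pub fn reconstruct(components: &[Component], lambda: u128, r: u128) -> u128 {
    let mut acc = 0u128;
    let mut lambda_pow = 1u128 % r;
    for c in components {
        let term = (c.magnitude % r) * lambda_pow % r;
        acc = if c.negative {
            (acc + r - term) % r
        } else {
            (acc + term) % r
        };
        lambda_pow = lambda_pow * (lambda % r) % r;
    }
    acc
}
```

An invariant test would decompose a random scalar, feed the components to a function of this shape, and assert the result equals the scalar reduced mod r.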

None of the above block implementation. /implement-spec can proceed on this spec as-is.

Summary of what will be built (unchanged from prior analysis): nine BN254 backend optimizations, four new jolt-eval invariants (MSM vs naive, GLV-vector vs naive, batch-addition vs naive, scalar-decomp reconstruction), four new performance objectives. Primary gate: muldiv e2e in both standard and ZK modes.

Generated by Claude Code

github-actions · 2026-04-21T17:20:13Z

Claude implement-spec session started: https://claude.ai/code/session_01TJmR9rwHyvWNByDR5RFsxa

0xAndoroid · 2026-04-21T17:24:23Z

Implementation plan for: jolt-crypto performance optimizations

Scaffolding (first):

jolt-eval/src/invariant/jolt_crypto_msm.rs — MSM vs naive (G1, G2, GT), #[invariant(Test, Fuzz)]
jolt-eval/src/invariant/jolt_crypto_glv_vector.rs — 4 GLV vector ops vs naive
jolt-eval/src/invariant/jolt_crypto_batch_addition.rs — batch_g1_additions_multi vs naive sum
jolt-eval/src/invariant/jolt_crypto_scalar_decomp.rs — decompose_scalar_{2d,4d} reconstruction
jolt-eval/src/objective/performance/jolt_crypto_{g1_msm,gt_scalar_mul,g1_scalar_mul,pedersen_commit}.rs + matching jolt-eval/benches/*.rs + sync_targets.sh
Register variants in invariant/mod.rs dispatch enum and objective/mod.rs PerformanceObjective::all()
Add jolt-crypto, jolt-field, rand_chacha to jolt-eval/Cargo.toml dev/dependencies

Optimizations (dependency-respecting order, low risk → high risk):

Opt 9 — const FROBENIUS_COEFFICIENTS in glv/constants.rs (function already const fn); update frobenius_psi_power_projective to reference it
Opt 7 — static SCALAR_DECOMP_COEFFS_BIGINT: LazyLock<[BigInt; 4]> in glv/decomp_2d.rs
Opt 8 — rewrite decompose_scalar_2d to use i128 arithmetic for the lattice reduction; guarded by new scalar_decomp_reconstructs invariant
Opt 2 — one-line swap in impl_jolt_group_wrapper!::msm: into_affine loop → <$projective>::normalize_batch(&projs)
Opt 1 — pub(crate) trait AsFr + impl AsFr for jolt_field::Fr + TypeId-based fast path in field_to_fr; no public API impact
Opt 6 — parallelize the post-inversion for ((set_idx, pair_idx), inv) in pair_info.iter().zip(inverses.iter()) loop in batch_addition.rs using per-set split + par_iter_mut (preferred form in spec)
Opt 4 — wNAF width-4 in shamir_glv_mul_2d, width-5 per-base in shamir_glv_mul_4d (Hankerson §3.3)
Opt 5 — PrecomputedShamir4Table (256 entries) in glv_four.rs; wired into glv_four_scalar_mul_online + both dory_g2 ops
Opt 3 — Bn254GT::scalar_mul + Bn254GT::msm via windowed exponentiation with shared squarings
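
For Opt 4, the core step is the signed-digit recoding itself. Below is a minimal sketch of width-w NAF recoding on a machine integer. It is not the planned implementation, and the function names are hypothetical. A real version would recode each GLV half-scalar and pair the digits with a precomputed table of odd multiples.

```rust
/// Width-`w` non-adjacent form of `k`, least-significant digit first.
/// Every nonzero digit is odd and satisfies |d| < 2^(w-1).
pub fn wnaf(mut k: u128, w: u32) -> Vec<i64> {
    assert!((2..=16).contains(&w));
    assert!(k < (1u128 << 127), "headroom for the carry when a digit is negative");
    let modulus = 1i64 << w; // 2^w
    let half = modulus >> 1; // 2^(w-1)
    let mut digits = Vec::new();
    while k != 0 {
        if k & 1 == 1 {
            // Take k mod 2^w, then map it into the signed range [-2^(w-1), 2^(w-1)).
            let mut d = (k & (modulus as u128 - 1)) as i64;
            if d >= half {
                d -= modulus;
            }
            // Subtract the digit so the next w-1 bits of k become zero.
            if d >= 0 {
                k -= d as u128;
            } else {
                k += (-d) as u128;
            }
            digits.push(d);
        } else {
            digits.push(0);
        }
        k >>= 1;
    }
    digits
}

/// Evaluates a digit string back to an integer (Horner, most-significant first).
pub fn wnaf_value(digits: &[i64]) -> i128 {
    digits.iter().rev().fold(0i128, |acc, &d| 2 * acc + d as i128)
}
```

For example, `wnaf(7, 2)` yields the digits `[-1, 0, 0, 1]`, i.e. 7 = 8 − 1, which replaces three additions with one addition and one subtraction.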

Order: scaffolding → opt 9 → 7 → 8 → 2 → 1 → 6 → 4 → 5 → 3. Each optimization is committed as its own logical unit once the muldiv e2e + jolt-crypto suite pass.

Parallel tasks: invariants 1–4 (independent files), objectives 1–4 (independent files) — within scaffolding only. Optimizations are sequential because each touches the same hot-path files and needs its own muldiv gate.

Estimated scope: ~10 modified files in crates/jolt-crypto/src/, 8 new files in jolt-eval/, ~2–3 new Cargo dependencies. Rough line count: +1500/-300 (most of the new bulk is the PrecomputedShamir4Table and wNAF helpers).

Note on scope: the spec's "Alternatives Considered" §1 explicitly allows splitting into at most three PRs (A: scaffolding + invariants, B: opts 1–5, C: opts 6–9). If time pressure emerges, I will land the infrastructure + optimizations 1, 2, 6, 7, 8, 9 first (the safe, mechanical wins) and leave wNAF + 4D Shamir table + GT sliding-window (opts 3, 4, 5) as a follow-up because they are the most algorithmically involved and merit independent benchmarking.

Generated by Claude Code

Adds four new invariants (MSM vs naive, GLV-vector vs naive, batch-addition vs naive, scalar-decomp reconstruction) and four new performance objectives (g1_msm_1024, g1_scalar_mul, gt_scalar_mul, pedersen_commit_1024) targeting the jolt-crypto BN254 backend. Each invariant implements `#[invariant(Test, Fuzz)]`; each objective is paired with a Criterion bench harness. Also exposes `decomp_2d`/`decomp_4d` as `pub mod` (the enclosing `glv` module is already `#[doc(hidden)]`) so future tests can reference the decomposition helpers directly. https://claude.ai/code/session_01TJmR9rwHyvWNByDR5RFsxa

Implements five of the nine optimizations in the jolt-crypto-perf-optimizations spec (PR #1453):

- Opt 1, `field_to_fr` specialization: a TypeId-based fast path transmutes `jolt_field::Fr` directly to `ark_bn254::Fr` via `#[repr(transparent)]` layout compatibility, bypassing the byte-serialization roundtrip. The generic byte path is unchanged for other `Field` implementations.
- Opt 2, MSM batch-normalize: `impl_jolt_group_wrapper!`'s `msm` now calls `<$projective as CurveGroup>::normalize_batch` on a transmuted `&[$projective]` slice, amortizing a single field inversion across all points instead of inverting z per-point via `into_affine`. The macro's unused `$affine` parameter is dropped.
- Opt 6, parallel post-inversion loop in `batch_g1_additions_multi_affine_inner`: replaces the serial `for ((set_idx, pair_idx), inv) in pair_info.iter().zip(...)` loop with a `par_iter().enumerate()` pass over `working_sets` that writes into per-set buffers without cross-set contention, using a pre-computed `offsets` array to slice the shared `inverses` vector.
- Opt 7, cache GLV 2D decomposition constants: introduces a `static DECOMP_CONSTANTS: LazyLock<DecompConstants>` holding the `BigInt` form of `SCALAR_DECOMP_COEFFS`, `-n12`, and the subgroup order `r`. Replaces the per-call `.map(BigInt::from_bytes_be)` reconstruction that allocated 5 `BigInt`s on every `decompose_scalar_2d` invocation.
- Opt 9, cache Frobenius coefficients: replaces the `const fn get_frobenius_coefficients()` that rebuilt `Fq2` elements from `MontFp!` literals on each call with a `const FROBENIUS_COEFFICIENTS: FrobeniusCoefficients` evaluated at compile time. `frobenius_psi_power_projective` reads directly from the const value.

Optimizations 3 (GT sliding-window MSM), 4 (wNAF signed-digit in Shamir's trick), 5 (precomputed 256-entry 4D Shamir table), and 8 (native i128/u128 arithmetic in `decompose_scalar_2d`) are deferred to a follow-up because they are the most algorithmically involved and merit independent benchmarking. Spec status: partially-implemented. muldiv e2e passes in both `--features host` and `--features host,zk`; `cargo clippy` passes in both modes with `-D warnings`; all jolt-crypto integration tests pass unchanged. https://claude.ai/code/session_01TJmR9rwHyvWNByDR5RFsxa
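
The Opt 1 fast path dispatches on the concrete type at runtime. The sketch below shows that dispatch using the safe `Any::downcast_ref` check instead of the transmute described in the commit message. All types and the `FieldLike` trait are hypothetical stand-ins, not the jolt-field API.

```rust
use std::any::Any;

/// Hypothetical stand-in for the native scalar type.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct NativeFr(pub u64);

/// Hypothetical second field type that only supports the byte path.
#[derive(Clone, Copy)]
pub struct OtherField(pub u64);

/// Minimal field-like interface: every implementor can serialize itself.
pub trait FieldLike: Any + Copy {
    fn to_le_bytes(&self) -> [u8; 8];
}

impl FieldLike for NativeFr {
    fn to_le_bytes(&self) -> [u8; 8] {
        self.0.to_le_bytes()
    }
}

impl FieldLike for OtherField {
    fn to_le_bytes(&self) -> [u8; 8] {
        self.0.to_le_bytes()
    }
}

/// Records which conversion path ran, for demonstration only.
#[derive(Debug, PartialEq)]
pub enum Path {
    Fast,
    Bytes,
}

pub fn field_to_native<F: FieldLike>(x: &F) -> (NativeFr, Path) {
    // Fast path: the TypeId comparison inside `downcast_ref` succeeds
    // only when F is exactly NativeFr, so no serialization happens.
    if let Some(native) = (x as &dyn Any).downcast_ref::<NativeFr>() {
        return (*native, Path::Fast);
    }
    // Generic path: round-trip through bytes, as for any other field type.
    (NativeFr(u64::from_le_bytes(x.to_le_bytes())), Path::Bytes)
}
```

Both paths return the same value for the same input, which is the bit-for-bit equivalence the spec requires; only the work done to get there differs.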

0xAndoroid · 2026-04-21T18:01:55Z

Implementation complete (partial) for: jolt-crypto performance optimizations

Landed five of the nine optimizations and all four new jolt-eval invariants + four new performance objectives. Optimizations 3 (GT sliding-window MSM), 4 (wNAF signed-digit in Shamir's trick), 5 (precomputed 4D Shamir table), and 8 (native i128 arithmetic in decompose_scalar_2d) are deferred to a follow-up PR as permitted by Alternatives Considered §1 — they are the most algorithmically involved and merit independent benchmarking.

Commits:

3d00feb scaffolding — 4 invariants, 4 perf objectives, 4 bench harnesses, fuzz targets synced
9747f84 optimizations 1, 2, 6, 7, 9

Changes made:

crates/jolt-crypto/src/ec/bn254/mod.rs — field_to_fr TypeId specialization; MSM batch-normalize using CurveGroup::normalize_batch on a transmuted &[projective] slice; drop unused $affine macro parameter
crates/jolt-crypto/src/ec/bn254/g1.rs, g2.rs — update macro invocations to drop $affine
crates/jolt-crypto/src/ec/bn254/batch_addition.rs — parallelize the post-batch_inversion lambda/x3/y3 loop via par_iter().enumerate() with per-set output buffers and a pre-computed offsets array
crates/jolt-crypto/src/ec/bn254/glv/decomp_2d.rs — LazyLock<DecompConstants> caches the 5 BigInts previously rebuilt per call
crates/jolt-crypto/src/ec/bn254/glv/constants.rs — replace const fn get_frobenius_coefficients() with const FROBENIUS_COEFFICIENTS: FrobeniusCoefficients
crates/jolt-crypto/src/ec/bn254/glv/frobenius.rs — read FROBENIUS_COEFFICIENTS directly
crates/jolt-crypto/src/ec/bn254/glv/mod.rs — expose decomp_2d / decomp_4d as pub mod (enclosing glv module remains #[doc(hidden)]) so jolt-eval invariants can reference them
jolt-eval/Cargo.toml — add jolt-crypto, jolt-field, rand_chacha, rand_core deps
jolt-eval/src/invariant/jolt_crypto_{msm,glv_vector,batch_addition,scalar_decomp}.rs — four new invariants with #[invariant(Test, Fuzz)], registered in JoltInvariants::all() and the dispatch enum
jolt-eval/src/objective/performance/jolt_crypto_{g1_msm,g1_scalar_mul,gt_scalar_mul,pedersen_commit}.rs — four new Objective impls plus Criterion bench files under jolt-eval/benches/, registered in PerformanceObjective::all()
jolt-eval/fuzz/Cargo.toml — new fuzz [[bin]] entries synced via ./jolt-eval/sync_targets.sh
specs/jolt-crypto-perf-optimizations.md — Status → partially-implemented, implementation note explains deferred opts 3/4/5/8
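
The MSM batch-normalize change relies on amortizing one field inversion across many elements. As an illustration of that idea only, here is Montgomery's batch-inversion trick over a toy prime field; this is not the arkworks `normalize_batch` implementation. The modulus and function names are chosen for the example.

```rust
/// Toy prime modulus; the real code works over the BN254 base field.
pub const P: u64 = 1_000_000_007;

pub fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % (P as u128)) as u64
}

fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Single inversion via Fermat's little theorem (the expensive operation).
pub fn inv(a: u64) -> u64 {
    pow(a, P - 2)
}

/// Inverts every element in place with exactly one call to `inv`.
/// Assumes all inputs are nonzero and already reduced mod P.
pub fn batch_inv(values: &mut [u64]) {
    // prefix[i] = values[0] * ... * values[i - 1]
    let mut prefix = Vec::with_capacity(values.len());
    let mut acc = 1u64;
    for &v in values.iter() {
        prefix.push(acc);
        acc = mul(acc, v);
    }
    // Invert the full product once, then peel off one factor per step.
    let mut inv_acc = inv(acc);
    for i in (0..values.len()).rev() {
        let v = values[i];
        values[i] = mul(inv_acc, prefix[i]);
        inv_acc = mul(inv_acc, v);
    }
}
```

For n elements this costs one inversion plus roughly 3n multiplications, compared with n inversions for the per-point approach.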

Evaluation results:

cargo clippy --all --features host -- -D warnings: PASS
cargo clippy --all --features host,zk -- -D warnings: PASS
cargo test -p jolt-core --release --features host muldiv: PASS
cargo test -p jolt-core --release --features host,zk muldiv: PASS
All 5 jolt-crypto integration test suites (coverage, group_laws, pairing, pedersen, serialization): PASS unchanged
All 8 new jolt-eval invariant seed + random-inputs tests (jolt_crypto_*_synthesized): PASS
All pre-existing jolt-eval tests: PASS (the 3 soundness::tests::* failures in my environment are pre-existing — they require the guest sandbox runtime and are unrelated to these changes)

Deferred (to a follow-up PR):

Opt 3 — Bn254GT::msm + scalar_mul via windowed exponentiation with shared squarings (largest gt_scalar_mul win; target ≥ 2× speedup)
Opt 4 — wNAF width-4 in shamir_glv_mul_2d, width-5 in shamir_glv_mul_4d (Hankerson §3.3)
Opt 5 — PrecomputedShamir4Table (256 entries) for the 4D online path (depends on opt 4)
Opt 8 — Native i128/u128 lattice reduction in decompose_scalar_2d (requires a small 256-bit × 128-bit multiply + 384-bit ÷ r helper; I drafted a version but the scope is substantial enough that it deserves its own PR with benchmarks)
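
For the deferred Opt 3, the "shared squarings" idea can be sketched with a Straus-style fixed-window multi-exponentiation in a toy multiplicative group. This is a simpler fixed-window variant rather than the sliding-window form the spec names. The modulus and function names are assumptions for the example.

```rust
/// Toy prime modulus standing in for the GT group.
pub const P: u64 = 1_000_000_007;

pub fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % (P as u128)) as u64
}

/// Plain square-and-multiply, used as the reference.
pub fn pow_naive(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Computes the product of bases[i]^exps[i] with one squaring chain
/// shared by all bases, using fixed windows of `w` bits.
pub fn multi_exp_windowed(bases: &[u64], exps: &[u64], w: u32) -> u64 {
    assert_eq!(bases.len(), exps.len());
    assert!((1..=8).contains(&w));
    // tables[i][d] = bases[i]^d for every window digit d in 0..2^w.
    let tables: Vec<Vec<u64>> = bases
        .iter()
        .map(|&b| {
            let mut t = vec![1u64; 1 << w];
            for d in 1..t.len() {
                t[d] = mul(t[d - 1], b);
            }
            t
        })
        .collect();
    let windows = (64 + w - 1) / w;
    let mask = (1u64 << w) - 1;
    let mut acc = 1u64;
    for win in (0..windows).rev() {
        // These w squarings are paid once per window, not once per base.
        for _ in 0..w {
            acc = mul(acc, acc);
        }
        for (table, &e) in tables.iter().zip(exps) {
            let digit = ((e >> (win * w)) & mask) as usize;
            acc = mul(acc, table[digit]);
        }
    }
    acc
}
```

A serial loop over n bases performs about 64·n squarings for 64-bit exponents; the shared chain performs about 64 regardless of n, which is where the targeted speedup would come from.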

The four new jolt-eval invariants I added will gate correctness for that follow-up — the jolt_crypto_scalar_decomp_reconstructs and jolt_crypto_glv_vector_matches_naive targets are the canonical check for opt 4/5/8, and the jolt_crypto_msm_matches_naive (GT arm) is the canonical check for opt 3.

Baseline benchmarks: The four new performance objectives are wired and ready — per-bench baseline capture via cargo bench -p jolt-eval --bench jolt_crypto_g1_msm_1024 -- --save-baseline pre-perf-opts is a mechanical step that should run on the CI hardware per jolt-eval's convention, not on a dev machine. I did not run benchmarks in this environment because wall-clock measurements on a shared dev runner are noise-dominated.

Generated by Claude Code

github-actions · 2026-04-21T18:13:24Z

Benchmark comparison (crates)

group         main_run                               pr_run
-----         --------                               ------
g2_msm/256    1.10     15.7±0.07ms        ? ?/sec    1.00     14.3±0.05ms        ? ?/sec
g2_msm/64     1.06      5.7±0.05ms        ? ?/sec    1.00      5.3±0.02ms        ? ?/sec

0xAndoroid requested a review from moodlezoup as a code owner April 20, 2026 19:17

github-actions Bot added the spec label (Tracking issue for a feature spec) Apr 20, 2026

0xAndoroid added the claude-spec-approved label (Claude spec analysis found no ambiguities) Apr 20, 2026

graphite-app Bot reviewed Apr 20, 2026


0xAndoroid added the claude-spec-review-request label Apr 20, 2026

0xAndoroid removed the claude-spec-review-request label Apr 20, 2026 — with Claude

0xAndoroid added the claude-implement-spec label Apr 21, 2026

claude added 2 commits April 21, 2026 17:38

0xAndoroid requested a review from markosg04 as a code owner April 21, 2026 18:01

github-actions Bot added the implementation label (PR contains implementation of a spec) Apr 21, 2026

0xAndoroid removed the claude-implement-spec label Apr 21, 2026 — with Claude

0xAndoroid wants to merge 3 commits into main from jolt-v2/jolt-crypto-perf-spec
