perf: Dory/pairing micro-optimizations #87

Merged
MatteoMer merged 1 commit into main from perf/dory-pairing-microopts on Apr 19, 2026

@MatteoMer

Owner

Summary

  • Increase MAX_PREPARED_BATCH from 16 to 64: 64 pairs now share one Fp12.square() per Miller-loop iteration instead of being split into 4 sub-batches of 16, each with its own squaring
  • Combine dbl+add lines in batchedMillerLoopUnprepared — uses fp12Mul034By034 + fp12MulBy01234 (23 Fp2.mul) instead of two separate fp12MulBy034 (26 Fp2.mul), matching batchedMillerLoopPreparedSparse
  • Precomputed 16-entry Shamir table for GLV 4D scalar mul: replaces up to 4 individual addAffine calls per bit with a single table lookup and one addAffine
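
To illustrate why a larger batch saves squarings, here is a minimal toy sketch, not the project's code. It assumes integers modulo a prime as a stand-in for Fp12 elements and random numbers as stand-ins for line evaluations; `P`, `LOOP_BITS`, `miller_loop_batch`, and `miller_loop_chunked` are hypothetical names chosen for this example. It shows that splitting 64 pairs into sub-batches of 16 gives the same product but repeats the accumulator squaring once per sub-batch.

```python
import random

# Toy model only: integers modulo a prime stand in for Fp12 elements.
P = 2**61 - 1      # hypothetical modulus (assumption, not the real field)
LOOP_BITS = 64     # hypothetical Miller-loop length (assumption)

def miller_loop_batch(lines, counter):
    """lines[t] holds one stand-in line value per pair for loop step t."""
    f = 1
    for step in lines:
        f = f * f % P                  # one squaring shared by the whole batch
        counter["squarings"] += 1
        for line in step:
            f = f * line % P           # one multiplication per pair
    return f

def miller_loop_chunked(lines, max_batch, counter):
    """Run the loop in sub-batches of at most max_batch pairs and multiply the results."""
    n_pairs = len(lines[0])
    result = 1
    for start in range(0, n_pairs, max_batch):
        chunk = [step[start:start + max_batch] for step in lines]
        result = result * miller_loop_batch(chunk, counter) % P
    return result

rng = random.Random(0)
lines = [[rng.randrange(1, P) for _ in range(64)] for _ in range(LOOP_BITS)]

c16 = {"squarings": 0}
c64 = {"squarings": 0}
f16 = miller_loop_chunked(lines, 16, c16)
f64 = miller_loop_chunked(lines, 64, c64)

assert f16 == f64                          # same result either way
print(c16["squarings"], c64["squarings"])  # 256 64
```

In this toy model the per-pair multiplications are unchanged; only the squaring count drops, from 4 per loop step to 1 when all 64 pairs fit in a single batch.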

Benchmark (sha256_2048, 3-run median)

| Config    | Median (ms)    |
|-----------|----------------|
| Baseline  | 2663           |
| Optimized | 2599           |
| Delta     | -64 ms (~2.4%) |

Test plan

  • zig build test -Doptimize=ReleaseFast — all tests pass
  • bench/run-bench.sh sha256_2048 — proof verified, no regression

🤖 Generated with Claude Code

Three independent changes targeting pairing and GLV scalar mul:

1. Increase MAX_PREPARED_BATCH from 16 to 64 — shares one Fp12.square()
   across 64 pairs instead of 4 sub-batches of 16.

2. Combine dbl+add lines in batchedMillerLoopUnprepared — at non-zero
   ATE bits, use fp12Mul034By034 + fp12MulBy01234 (23 Fp2.mul) instead
   of two separate fp12MulBy034 calls (26 Fp2.mul). Matches the pattern
   already used in batchedMillerLoopPreparedSparse.

3. Precomputed 16-entry Shamir table for GLV 4D scalar mul — replaces
   up to 4 individual addAffine calls per bit with a single table
   lookup and addAffine.
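
The table idea in item 3 can be sketched as follows. This is a minimal illustration, not the code from this commit: it assumes integers modulo `N` under addition as a stand-in for curve points, non-negative sub-scalars, and hypothetical helper names (`build_table`, `multi_scalar_mul`). The real code works on elliptic-curve points with addAffine and point doubling.

```python
# Stand-in group (assumption): integers modulo N under addition.
N = 2**61 - 1

def build_table(points):
    """table[m] = sum of points[i] for every bit i set in the 4-bit mask m."""
    table = [0] * 16
    for mask in range(1, 16):
        low = mask & -mask                  # lowest set bit of the mask
        i = low.bit_length() - 1            # index of the point it selects
        table[mask] = (table[mask ^ low] + points[i]) % N
    return table

def multi_scalar_mul(scalars, points):
    """Compute sum(k_i * P_i) for 4 scalars with at most one table add per bit."""
    table = build_table(points)
    acc, adds = 0, 0
    nbits = max(k.bit_length() for k in scalars)
    for bit in reversed(range(nbits)):
        acc = (acc + acc) % N               # doubling step
        mask = 0
        for i, k in enumerate(scalars):
            mask |= ((k >> bit) & 1) << i   # gather bit `bit` of each scalar
        if mask:
            acc = (acc + table[mask]) % N   # single lookup and single add
            adds += 1
    return acc, adds

scalars = [0b1011, 0b0110, 0b1111, 0b0001]
points = [3, 5, 7, 11]
result, adds = multi_scalar_mul(scalars, points)

expected = sum(k * p for k, p in zip(scalars, points)) % N
assert result == expected
assert adds <= 4    # one add per bit; adding per set bit would need 10 here
```

Building the table costs 11 extra additions up front (the masks with two or more bits set), which is then amortized over every bit of the scalars.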

Median benchmark (3 runs, sha256_2048):
  Baseline: 2663 ms → Optimized: 2599 ms (~2.4% improvement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@MatteoMer MatteoMer merged commit 2bc7f5c into main Apr 19, 2026
17 checks passed
