Sumfold [DRAFT: NOT READY FOR REVIEW] #209

Draft
wu-s-john wants to merge 52 commits into NethermindEth:main-beta from wu-s-john:sumfold

@wu-s-john

No description provided.

wu-s-john added 30 commits June 3, 2026 17:02
Use borrowed/in-place field addition while accumulating rotated and shifted binary polynomial evaluations. These paths run once per relevant set bit across each virtual bit-op or shifted bit-slice column, so avoiding clones removes a large amount of temporary field-element copying without changing the immediate-reduction semantics.
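
A minimal sketch of the borrowed, in-place accumulation pattern, for illustration only. The `Fe` type, the modulus `P`, and `sum_selected` are hypothetical stand-ins and are not names from this PR; the real code works on 4-limb Montgomery field elements. The sketch keeps immediate reduction after every add and only removes the clone.

```rust
use std::ops::AddAssign;

/// Toy modulus (hypothetical): the Mersenne prime 2^61 - 1.
const P: u64 = (1 << 61) - 1;

/// Toy field element (hypothetical). Deliberately not `Copy`, so that
/// passing it by value would force a `clone()`.
#[derive(Clone, Debug, PartialEq)]
struct Fe(u64);

/// In-place addition that borrows the right-hand side.
impl AddAssign<&Fe> for Fe {
    fn add_assign(&mut self, rhs: &Fe) {
        // Both operands are < P < 2^61, so the u64 sum cannot overflow.
        let s = self.0 + rhs.0;
        // Immediate reduction: the result is back in [0, P) after every add.
        self.0 = if s >= P { s - P } else { s };
    }
}

/// Sums `values[i]` for every set bit `i` of `mask`, without cloning.
/// Every set bit of `mask` must index into `values`.
fn sum_selected(values: &[Fe], mask: u64) -> Fe {
    let mut acc = Fe(0);
    let mut m = mask;
    while m != 0 {
        let i = m.trailing_zeros() as usize;
        acc += &values[i]; // borrowed add, no temporary copy
        m &= m - 1; // clear the lowest set bit
    }
    acc
}
```

The loop body runs once per set bit, which is where an `acc = acc + values[i].clone()` style would otherwise allocate a temporary per iteration.
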
Add a narrow delayed modular reduction path for 4-limb Montgomery fields and
use it in the hot binary polynomial evaluation paths.

The new `zinc_utils::delayed_reduction` module introduces:
- `MontgomeryLimbs` for exposing reduced Montgomery-form field limbs.
- `DelayedModularReduction` for sum-only delayed accumulation.
- `BarrettReductionParams` with const `mu` computation.
- A `Uint<5>` accumulator implementation for summing 4-limb field elements.
- An optimized `barrett_reduce_5` path for reducing bounded 5-limb sums.
- Implementations for both `MontyField<4>` and `ConstMontyField<_, 4>`.

This lets binary polynomial evaluation accumulate many selected `eq(r, b)`
values as raw Montgomery limbs, then perform one Barrett reduction per output
coefficient instead of doing a field reduction after every conditional add.
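
The difference between immediate and delayed reduction can be sketched with a toy single-word field. This assumes a 61-bit modulus and a `u128` accumulator in place of the 4-limb elements and `Uint<5>` accumulator described here, and it uses `%` where the PR uses Barrett reduction; `P`, `sum_immediate`, and `sum_delayed` are hypothetical names. Delaying is sound for Montgomery-form inputs because the form is linear: aR + bR = (a + b)R mod p.

```rust
/// Toy modulus (hypothetical): the Mersenne prime 2^61 - 1.
const P: u64 = (1 << 61) - 1;

/// Baseline: reduce after every conditional add.
fn sum_immediate(eq: &[u64], selected: &[bool]) -> u64 {
    let mut acc: u64 = 0;
    for (&e, &s) in eq.iter().zip(selected) {
        if s {
            // acc and e are both < P < 2^61, so the sum fits in u64.
            acc = (acc + e) % P;
        }
    }
    acc
}

/// Delayed: accumulate raw values in a wider integer, reduce once.
/// With 61-bit inputs, a u128 accumulator has room for about 2^67 adds.
fn sum_delayed(eq: &[u64], selected: &[bool]) -> u64 {
    let mut acc: u128 = 0;
    for (&e, &s) in eq.iter().zip(selected) {
        if s {
            acc += e as u128; // no reduction in the hot loop
        }
    }
    (acc % P as u128) as u64 // single reduction at the end
}
```

Both functions return the same value for inputs already reduced below `P`; only the number of reductions differs.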

Apply the accumulator to two hot paths:
- Lifted binary polynomial evaluation in the protocol layer.
- Streaming shifted bit-slice evaluation in the PIOP booleanity code.

The lifted binary evaluation now builds the `eq(point, *)` table once, scans the
binary trace rows, conditionally adds `eq_b` into per-bit `Uint<5>`
accumulators, and reduces once per bit coefficient. The shifted bit-slice
streaming path uses the same delayed accumulation strategy while continuing to
avoid materializing shifted bit-slice MLE buffers.
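
The lifted evaluation flow (build the `eq` table once, scan rows, add `eq_b` into per-bit accumulators, reduce once per bit) can be sketched as follows. This is a toy model under stated assumptions: a single-word field with modulus `P`, 8-bit rows, `u128` accumulators instead of `Uint<5>`, and `%` instead of Barrett reduction. The helper names `mul`, `sub`, `eq_table`, and `eval_lifted` are hypothetical.

```rust
/// Toy modulus (hypothetical): the Mersenne prime 2^61 - 1.
const P: u64 = (1 << 61) - 1;

fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

fn sub(a: u64, b: u64) -> u64 {
    (a + P - b) % P
}

/// Table of eq(point, b) for all b in {0,1}^n, where bit i of the
/// table index is paired with point[i].
fn eq_table(point: &[u64]) -> Vec<u64> {
    let mut table = vec![1u64];
    for &r in point {
        let one_minus_r = sub(1, r);
        let mut next = Vec::with_capacity(table.len() * 2);
        // New top bit = 0: multiply by (1 - r); top bit = 1: multiply by r.
        next.extend(table.iter().map(|&t| mul(t, one_minus_r)));
        next.extend(table.iter().map(|&t| mul(t, r)));
        table = next;
    }
    table
}

/// For each bit position j, returns sum_b eq(point, b) * bit_j(rows[b]).
fn eval_lifted(point: &[u64], rows: &[u8]) -> [u64; 8] {
    let eq = eq_table(point); // built once
    assert_eq!(eq.len(), rows.len());
    let mut acc = [0u128; 8]; // one delayed accumulator per bit
    for (&eq_b, &row) in eq.iter().zip(rows) {
        for (j, a) in acc.iter_mut().enumerate() {
            if (row >> j) & 1 == 1 {
                *a += eq_b as u128; // conditional add, no reduction
            }
        }
    }
    acc.map(|a| (a % P as u128) as u64) // one reduction per coefficient
}
```

At a Boolean point the table is an indicator, so the output is just the bits of the selected row; at a general point, a trace of all-ones rows yields 1 in every coefficient because the `eq` table sums to 1.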

Use `crypto_bigint::Uint<5>` directly as the accumulator rather than a custom
wide-limb wrapper, keeping the representation aligned with the rest of the
integer code. The Barrett reducer is specialized to the actual accumulator
width, avoiding the unused sixth limb from the earlier 6-limb reducer shape.
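
For readers unfamiliar with Barrett reduction, here is a single-word sketch with a constant `mu` computed at compile time. It is not the `barrett_reduce_5` routine from this PR: the 31-bit modulus, the 62-bit input bound `K`, and the names `barrett_mu` and `barrett_reduce` are assumptions chosen so the arithmetic fits in native integers.

```rust
/// Toy modulus (hypothetical): the Mersenne prime 2^31 - 1.
const P: u64 = (1 << 31) - 1;
/// Inputs must be below 2^K.
const K: u32 = 62;
/// Precomputed constant mu = floor(2^K / P), evaluated at compile time.
const MU: u64 = barrett_mu(P, K);

const fn barrett_mu(p: u64, k: u32) -> u64 {
    ((1u128 << k) / p as u128) as u64
}

/// Reduces x modulo P for x < 2^K using one multiply-shift quotient
/// estimate and one conditional subtraction.
fn barrett_reduce(x: u64) -> u64 {
    debug_assert!(x < (1u64 << K));
    // Quotient estimate q = floor(x * mu / 2^K). For x < 2^K it is
    // either floor(x / P) or exactly one less.
    let q = ((x as u128 * MU as u128) >> K) as u64;
    // Since q <= floor(x / P), q * P <= x and the subtraction cannot wrap.
    let r = x - q * P;
    // r is in [0, 2P), so a single conditional subtraction finishes.
    if r >= P { r - P } else { r }
}
```

The same shape scales to multi-limb inputs: the bound on the accumulated sum fixes how wide the quotient estimate needs to be, which is the motivation given above for sizing the reducer to the 5-limb accumulator.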

Also extend the relevant protocol prover/verifier bounds so the optimized paths
can access Montgomery limbs, and generalize `ConstMontyField` projection support
through `FromRef`.