Omri/add math cairo test #2384

Closed

OmriEshhar1 wants to merge 33 commits into main from omri/add_math_cairo_test

OmriEshhar1 commented May 4, 2026 • edited by phil-starkware

Collaborator

TITLE

Description

Description of the pull request changes and motivation.

Checklist

- Linked to GitHub Issue
- Unit tests added
- Integration tests added
- This change requires new documentation
  - Documentation has been added/updated
  - CHANGELOG has been updated

naor-starkware and others added 30 commits

April 6, 2026 20:39


          feat: add test_helpers module (error_utils, test_utils) behind function_runner flag

f97e022

- Create vm/src/test_helpers/ with error_utils.rs and test_utils.rs
- Move from cairo_test_suite/ (fix filename typo: utlis → utils)
- Fix crate:: import paths (were cairo_vm:: when outside the crate)
- Fix $crate in macro_export macro (clippy::crate_in_macro_def)
- Simplify load_cairo_program! path using with_file_name()
- Gate module behind function_runner feature in lib.rs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          test(test_helpers): add unit tests for assert_mr_eq!, load_cairo_program! and error_utils checkers

0edf8b8

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          style: cargo fmt

298b0c0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          test: add conversion-failure and allow-large-err fixes to test_helpers

26c995c

- Add AlwaysFailConversion helper + 2 tests for assert_mr_eq! unwrap_or_else
  panic branch (no-message and message variants)
- Allow clippy::result_large_err on hint_err test helper

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          chore: update CHANGELOG for PR #2378

39fcf0f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          docs: mark load_cairo_program! example as ignore to suppress llvm-cov noise

b4cb481

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix: replace unwrap_or_else closures in macros to avoid llvm-cov empty function name error

2ccebe3

#[macro_export] macros containing closures (|x| ...) cause llvm-cov to
emit a "function name is empty" error. Replaced unwrap_or_else(|e| panic!(...))
with match expressions to eliminate closures from macro expansions.
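
A minimal sketch of the closure-to-match rewrite this commit describes. The macro name `convert_or_panic!` and the helper `to_u8` are hypothetical and not taken from this PR. The real macros also carry `#[macro_export]`, which is left out here so the snippet compiles in any context.

```rust
// Hypothetical macro for illustration only; not the macro from this PR.
// Before (closure inside the expansion, which the commit says upsets llvm-cov):
//     <$ty>::try_from($value).unwrap_or_else(|e| panic!("conversion failed: {:?}", e))
// After: the same behaviour written as a `match`, so the expansion has no closure.
macro_rules! convert_or_panic {
    ($ty:ty, $value:expr) => {
        match <$ty as ::core::convert::TryFrom<_>>::try_from($value) {
            Ok(converted) => converted,
            Err(err) => panic!("conversion failed: {:?}", err),
        }
    };
}

/// Converts an `i64` to `u8`, panicking if the value is out of range.
fn to_u8(value: i64) -> u8 {
    convert_or_panic!(u8, value)
}
```

Both forms panic on a failed conversion; the `match` form only changes what the macro expands to, not what callers observe.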

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          refactor: update function_runner → test_utils cfg gates and docs

153f306

Follow-up to dropping the function_runner feature flag.
Gate test_helpers module and function_runner module under test_utils,
and update the doc comment in function_runner.rs accordingly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          refactor: make function_runner module and its methods public

5e2dd11

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          chore: update CHANGELOG PR link from #2378 to #2381

516a2a5

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          refactor: replace assert_mr_eq! macro with generic function

53a159f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          chore: apply cargo fmt

3e693fb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix: expect_diff_type_comparison and expect_diff_index_comp unwrap Hint::Internal

9b1b8f5

These errors arrive wrapped as Hint(Internal(...)) since they originate
inside hint execution, not as bare VirtualMachineError variants.
Remove now-unused expect_vm_error helper and vm_err test helper.
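
The wrapping described in this commit can be sketched with stand-in types. The enums below (`InnerError`, `HintError`, `VmError`) are simplified assumptions for illustration and do not reproduce the real cairo-vm error definitions; only the checker names come from the commit title.

```rust
// Stand-in error types for illustration; the real cairo-vm enums have many
// more variants and different payloads.
#[derive(Debug)]
enum InnerError {
    DiffTypeComparison,
    DiffIndexComp,
}

#[derive(Debug)]
enum HintError {
    Internal(InnerError),
}

#[derive(Debug)]
enum VmError {
    /// Error raised while a hint was executing.
    Hint(HintError),
    /// Error raised directly by the VM, outside hint execution.
    Bare(InnerError),
}

/// Passes only if the error arrives wrapped as Hint(Internal(DiffTypeComparison)).
fn expect_diff_type_comparison(result: Result<(), VmError>) {
    match result {
        Err(VmError::Hint(HintError::Internal(InnerError::DiffTypeComparison))) => {}
        other => panic!("expected Hint(Internal(DiffTypeComparison)), got {:?}", other),
    }
}

/// Passes only if the error arrives wrapped as Hint(Internal(DiffIndexComp)).
fn expect_diff_index_comp(result: Result<(), VmError>) {
    match result {
        Err(VmError::Hint(HintError::Internal(InnerError::DiffIndexComp))) => {}
        other => panic!("expected Hint(Internal(DiffIndexComp)), got {:?}", other),
    }
}
```

A checker that matched the bare variant would reject errors raised inside a hint, which is the mismatch the commit message reports fixing.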

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          feat(makefile,ci): add cairo_test_suite_programs target and CI integration

c4a8e04

Makefile:
- Add CAIRO_TEST_SUITE_ROOT, CAIRO_TEST_SUITE_FILES, COMPILED_CAIRO_TEST_SUITE
  variables to track cairo test suite sources
- Add pattern rule to compile each .cairo to .json via cairo-compile
- Add cairo_test_suite_programs target
- Hook cairo_test_suite_programs into the test target
- Add cairo_test_suite_programs to .PHONY

CI (rust.yml):
- Add cairo_test_suite_programs to the build-programs matrix
- Add vm/src/tests/cairo_test_suite/**/*.json to CAIRO_PROGRAMS_PATH cache
- Update all program cache keys to hash test suite source files
- Add function_runner feature flag to the tests job

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          chore: update CHANGELOG for PR #2380

828320a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          update CHANGELOG.md

d8f7f68


          fix(ci): remove stale function_runner feature flag from coverage command

f0bb2d0

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          feat: add math cairo tests under vm/src/tests/cairo_test_suite

aa8d90d

- Create cairo_test_suite module skeleton (mod.rs, test_math/mod.rs)
- Add main_math_test.cairo stub (imports all math.cairo functions)
- Add math_test_utils.rs (RC_BOUND, MAX_DIV, is_quad_residue_mod_prime)
- Move test_math_cairo.rs from cairo_test_suite/, update all imports
  - cairo_vm:: → crate:: for all internal paths
  - crate::error_utils → crate::test_helpers::error_utils
  - cairo_function_runner → function_runner module path
  - Fix 5x runner.runner.vm → runner.vm (old struct design remnant)
- Register cairo_test_suite mod behind function_runner feature
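
This page does not show the body of `is_quad_residue_mod_prime`. One common way to write such a check is Euler's criterion: for an odd prime p and a not divisible by p, a is a quadratic residue exactly when a^((p-1)/2) ≡ 1 (mod p). The sketch below assumes that approach and uses `u64` for readability; the function names, signature, and integer width are assumptions, not the helper from this PR.

```rust
/// Modular exponentiation by repeated squaring, using u128 intermediates
/// so that products of two u64 values cannot overflow.
fn pow_mod(base: u64, mut exp: u64, modulus: u64) -> u64 {
    let m = modulus as u128;
    let mut result: u128 = 1 % m;
    let mut b = (base as u128) % m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * b % m;
        }
        b = b * b % m;
        exp >>= 1;
    }
    result as u64
}

/// Hypothetical u64 variant of a quadratic-residue check.
/// Assumes `p` is prime; the result is meaningless otherwise.
fn is_quad_residue_mod_prime_u64(a: u64, p: u64) -> bool {
    let a = a % p;
    if a == 0 {
        // 0 is the square of 0.
        return true;
    }
    // Euler's criterion: a^((p-1)/2) is 1 for residues and p-1 for non-residues.
    pow_mod(a, (p - 1) / 2, p) == 1
}
```

For example, modulo 7 the nonzero squares are 1, 2 and 4, so the check accepts 2 and rejects 3.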

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix: correct CairoFunctionRunner usage in test_math_cairo.rs

15d89e7

- new() → new_for_testing() (type alias has no 1-arg constructor)
- runner.get_return_values() → runner.vm.get_return_values() (method is on VirtualMachine)
- Add missing macro imports: assert_mr_eq! and load_cairo_program!

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix(cairo_test_suite): rename cairo file and add horner_eval import

902b78a

Rename main_math_test.cairo → math_test.cairo for clarity.
Add `from starkware.cairo.common.math_utils import horner_eval` so the
compiled JSON includes horner_eval (used in test_horner_eval).
Update the load_cairo_program! call to reference math_test.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix(cairo_test_suite): import all functions from math.cairo

e41b80f

Add missing assert_not_nullptr, safe_div, safe_mult.
Move horner_eval and assert_is_power_of_2 into the math.cairo import
block (both are defined there, not in math_utils).
Reorder the declarations to match math.cairo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          refactor(cairo_test_suite): remove redundant numeric type suffixes

c775891

- Remove isize from MaybeRelocatable tuple literals (inferred from From impl)
- Remove usize from add_usize() calls (inferred from param type)
- Remove _i64 suffixes in split_int cases (inferred from param types)
- Remove usize from shift/bit operations
- Use type annotation on variable instead of inline i64 suffixes on literals
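
The kind of cleanup listed in this commit can be illustrated as follows. `Address` and the free function `add_usize` are hypothetical stand-ins, not the cairo-vm types; the point is only that Rust infers literal types from a `From` impl, a parameter type, or a variable annotation, so the suffixes add nothing.

```rust
// Hypothetical stand-in for a relocatable address; not the cairo-vm type.
#[derive(Debug, PartialEq)]
struct Address {
    segment: isize,
    offset: usize,
}

impl From<(isize, usize)> for Address {
    fn from((segment, offset): (isize, usize)) -> Self {
        Address { segment, offset }
    }
}

/// Hypothetical helper whose parameter types drive literal inference.
fn add_usize(offset: usize, delta: usize) -> usize {
    offset + delta
}

/// Style before the refactor: every literal carries an explicit suffix.
fn with_suffixes() -> (Address, usize, i64, u64) {
    (
        Address::from((1isize, 2usize)),
        add_usize(2usize, 3usize),
        7_i64,
        1u64 << 3usize,
    )
}

/// Style after the refactor: types come from the From impl, the parameter
/// types, and a single annotation on the variable.
fn without_suffixes() -> (Address, usize, i64, u64) {
    let value: i64 = 7;
    (Address::from((1, 2)), add_usize(2, 3), value, 1u64 << 3)
}
```

Both functions return equal values, so the refactor is purely stylistic.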

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix: remove assert_not_nullptr import not available in cairo-lang 0.13.5

a3f510b

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          test(math_test_utils): add unit tests for is_quad_residue_mod_prime

78fdc0a

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          style: cargo fmt

db0283d

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          chore: update CHANGELOG for PR #2379

fc935bc

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          refactor: replace function_runner cfg gates with test_utils in cairo_test_suite

8c01849

Also fix duplicate and stale CHANGELOG entries for PRs #2377-#2379.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          fix: replace CairoFunctionRunner with CairoRunner in test_math_cairo

454d393

CairoFunctionRunner type alias was removed, use CairoRunner directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


           update CHANGELOG.md

9326c23


          refactor: migrate assert_mr_eq! macro calls to function in test suite

fa68c82

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

naor-starkware and others added 3 commits

April 9, 2026 12:43


          chore: apply cargo fmt

d6a6c58

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>


          feat: add math cairo tests to cairo-vm

efce14d


          fix CI failure

b91624d

OmriEshhar1 self-assigned this

github-actions Bot commented May 4, 2026

**Hyper Threading Benchmark results**




hyperfine -r 2 -n "hyper_threading_main threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_main' -n "hyper_threading_pr threads: 1" 'RAYON_NUM_THREADS=1 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 1
  Time (mean ± σ):     24.594 s ±  0.052 s    [User: 24.142 s, System: 0.448 s]
  Range (min … max):   24.558 s … 24.630 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 1
  Time (mean ± σ):     24.661 s ±  0.014 s    [User: 24.214 s, System: 0.442 s]
  Range (min … max):   24.651 s … 24.671 s    2 runs
 
Summary
  hyper_threading_main threads: 1 ran
    1.00 ± 0.00 times faster than hyper_threading_pr threads: 1




hyperfine -r 2 -n "hyper_threading_main threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_main' -n "hyper_threading_pr threads: 2" 'RAYON_NUM_THREADS=2 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 2
  Time (mean ± σ):     13.260 s ±  0.001 s    [User: 24.346 s, System: 0.453 s]
  Range (min … max):   13.259 s … 13.261 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 2
  Time (mean ± σ):     13.278 s ±  0.019 s    [User: 24.363 s, System: 0.452 s]
  Range (min … max):   13.265 s … 13.291 s    2 runs
 
Summary
  hyper_threading_main threads: 2 ran
    1.00 ± 0.00 times faster than hyper_threading_pr threads: 2




hyperfine -r 2 -n "hyper_threading_main threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_main' -n "hyper_threading_pr threads: 4" 'RAYON_NUM_THREADS=4 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 4
  Time (mean ± σ):      9.944 s ±  0.224 s    [User: 35.839 s, System: 0.596 s]
  Range (min … max):    9.785 s … 10.102 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 4
  Time (mean ± σ):      9.837 s ±  0.296 s    [User: 35.539 s, System: 0.584 s]
  Range (min … max):    9.627 s … 10.046 s    2 runs
 
Summary
  hyper_threading_pr threads: 4 ran
    1.01 ± 0.04 times faster than hyper_threading_main threads: 4




hyperfine -r 2 -n "hyper_threading_main threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_main' -n "hyper_threading_pr threads: 6" 'RAYON_NUM_THREADS=6 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 6
  Time (mean ± σ):      9.741 s ±  0.196 s    [User: 36.020 s, System: 0.608 s]
  Range (min … max):    9.603 s …  9.879 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 6
  Time (mean ± σ):      9.680 s ±  0.064 s    [User: 35.761 s, System: 0.606 s]
  Range (min … max):    9.635 s …  9.726 s    2 runs
 
Summary
  hyper_threading_pr threads: 6 ran
    1.01 ± 0.02 times faster than hyper_threading_main threads: 6




hyperfine -r 2 -n "hyper_threading_main threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_main' -n "hyper_threading_pr threads: 8" 'RAYON_NUM_THREADS=8 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 8
  Time (mean ± σ):      9.621 s ±  0.091 s    [User: 36.306 s, System: 0.629 s]
  Range (min … max):    9.557 s …  9.686 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 8
  Time (mean ± σ):      9.957 s ±  0.085 s    [User: 36.197 s, System: 0.619 s]
  Range (min … max):    9.897 s … 10.017 s    2 runs
 
Summary
  hyper_threading_main threads: 8 ran
    1.03 ± 0.01 times faster than hyper_threading_pr threads: 8




hyperfine -r 2 -n "hyper_threading_main threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_main' -n "hyper_threading_pr threads: 16" 'RAYON_NUM_THREADS=16 ./hyper_threading_pr'
Benchmark 1: hyper_threading_main threads: 16
  Time (mean ± σ):      9.827 s ±  0.028 s    [User: 36.842 s, System: 0.705 s]
  Range (min … max):    9.808 s …  9.847 s    2 runs
 
Benchmark 2: hyper_threading_pr threads: 16
  Time (mean ± σ):      9.775 s ±  0.019 s    [User: 36.927 s, System: 0.666 s]
  Range (min … max):    9.761 s …  9.789 s    2 runs
 
Summary
  hyper_threading_pr threads: 16 ran
    1.01 ± 0.00 times faster than hyper_threading_main threads: 16

github-actions Bot commented May 4, 2026

Benchmark Results for unmodified programs 🚀

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base big_factorial` | 2.101 ± 0.038 | 2.070 | 2.176 | 1.01 ± 0.02 |
| `head big_factorial` | 2.079 ± 0.010 | 2.068 | 2.105 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base big_fibonacci` | 2.022 ± 0.023 | 2.004 | 2.074 | 1.01 ± 0.01 |
| `head big_fibonacci` | 2.009 ± 0.011 | 2.001 | 2.040 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base blake2s_integration_benchmark` | 7.080 ± 0.069 | 7.014 | 7.234 | 1.00 ± 0.02 |
| `head blake2s_integration_benchmark` | 7.076 ± 0.110 | 7.000 | 7.309 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base compare_arrays_200000` | 2.133 ± 0.005 | 2.128 | 2.145 | 1.00 |
| `head compare_arrays_200000` | 2.136 ± 0.012 | 2.120 | 2.160 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base dict_integration_benchmark` | 1.410 ± 0.012 | 1.398 | 1.435 | 1.01 ± 0.01 |
| `head dict_integration_benchmark` | 1.399 ± 0.002 | 1.396 | 1.403 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base field_arithmetic_get_square_benchmark` | 1.208 ± 0.025 | 1.188 | 1.259 | 1.01 ± 0.03 |
| `head field_arithmetic_get_square_benchmark` | 1.202 ± 0.027 | 1.188 | 1.276 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base integration_builtins` | 7.204 ± 0.116 | 7.126 | 7.510 | 1.01 ± 0.02 |
| `head integration_builtins` | 7.166 ± 0.033 | 7.134 | 7.231 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base keccak_integration_benchmark` | 7.269 ± 0.059 | 7.197 | 7.377 | 1.00 |
| `head keccak_integration_benchmark` | 7.270 ± 0.040 | 7.230 | 7.339 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base linear_search` | 2.154 ± 0.028 | 2.127 | 2.211 | 1.01 ± 0.02 |
| `head linear_search` | 2.142 ± 0.029 | 2.122 | 2.221 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base math_cmp_and_pow_integration_benchmark` | 1.490 ± 0.011 | 1.484 | 1.521 | 1.00 ± 0.01 |
| `head math_cmp_and_pow_integration_benchmark` | 1.488 ± 0.003 | 1.484 | 1.492 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base math_integration_benchmark` | 1.466 ± 0.013 | 1.453 | 1.486 | 1.01 ± 0.01 |
| `head math_integration_benchmark` | 1.457 ± 0.005 | 1.450 | 1.466 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base memory_integration_benchmark` | 1.210 ± 0.004 | 1.204 | 1.214 | 1.00 |
| `head memory_integration_benchmark` | 1.214 ± 0.011 | 1.203 | 1.241 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base operations_with_data_structures_benchmarks` | 1.525 ± 0.004 | 1.514 | 1.529 | 1.00 |
| `head operations_with_data_structures_benchmarks` | 1.530 ± 0.006 | 1.525 | 1.545 | 1.00 ± 0.00 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `base pedersen` | 526.9 ± 2.0 | 525.0 | 531.7 | 1.00 |
| `head pedersen` | 527.3 ± 2.7 | 525.6 | 535.0 | 1.00 ± 0.01 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `base poseidon_integration_benchmark` | 606.1 ± 9.2 | 599.6 | 630.4 | 1.01 ± 0.02 |
| `head poseidon_integration_benchmark` | 602.2 ± 2.0 | 598.0 | 604.8 | 1.00 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base secp_integration_benchmark` | 1.790 ± 0.011 | 1.780 | 1.819 | 1.00 ± 0.01 |
| `head secp_integration_benchmark` | 1.784 ± 0.005 | 1.776 | 1.793 | 1.00 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `base set_integration_benchmark` | 668.1 ± 2.8 | 664.5 | 672.8 | 1.00 |
| `head set_integration_benchmark` | 668.3 ± 2.1 | 665.8 | 672.5 | 1.00 ± 0.01 |

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `base uint256_integration_benchmark` | 4.126 ± 0.048 | 4.079 | 4.228 | 1.00 |
| `head uint256_integration_benchmark` | 4.146 ± 0.077 | 4.100 | 4.358 | 1.00 ± 0.02 |

OmriEshhar1 closed this

OmriEshhar1 deleted the omri/add_math_cairo_test branch

May 4, 2026 09:52
