wheels: compress CUDA fat binaries to fit EB under PyPI 320 MB limit by jameslehoux · Pull Request #292 · BASE-Laboratory/OpenImpala

jameslehoux · 2026-05-27T06:47:18Z

EB adds ~100 MB of template instantiations to the AMReX static library. With three CUDA architectures (75-real, 80-real, 90-virtual), the fat binary exceeds PyPI's 320 MB ceiling. Adding -Xfatbin --compress-all to both the AMReX build and the OpenImpala wheel build compresses the embedded SASS/PTX, reducing device code size by ~30-50% at negligible runtime decompression cost.
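
As a rough illustration of the change described above, the sketch below assembles the CMake arguments and checks built wheels against the size ceiling. It is not taken from the PR diff: the helper names `cuda_cmake_args` and `oversized_wheels` and the `dist` directory are hypothetical. Passing the flag through the standard `CMAKE_CUDA_FLAGS` and `CMAKE_CUDA_ARCHITECTURES` variables is an assumption about how the builds are configured, and so is reading "320 MB" as 320 * 1024**2 bytes.

```python
from pathlib import Path

# Ceiling quoted in the PR description; treating it as 320 MiB is an assumption.
PYPI_LIMIT_BYTES = 320 * 1024 * 1024

# Architectures listed in the PR description, in CMake's real/virtual notation.
CUDA_ARCHS = ["75-real", "80-real", "90-virtual"]


def cuda_cmake_args(archs=CUDA_ARCHS, compress=True):
    """Return CMake -D arguments for a CUDA build (hypothetical helper)."""
    args = ["-DCMAKE_CUDA_ARCHITECTURES=" + ";".join(archs)]
    if compress:
        # nvcc forwards the option after -Xfatbin to the fatbinary tool,
        # which then compresses the embedded device code.
        args.append("-DCMAKE_CUDA_FLAGS=-Xfatbin --compress-all")
    return args


def oversized_wheels(dist_dir, limit=PYPI_LIMIT_BYTES):
    """Return (name, size_in_bytes) for each wheel in dist_dir larger than limit."""
    return [
        (whl.name, whl.stat().st_size)
        for whl in sorted(Path(dist_dir).glob("*.whl"))
        if whl.stat().st_size > limit
    ]


if __name__ == "__main__":
    print(" ".join(cuda_cmake_args()))
    for name, size in oversized_wheels("dist"):
        print(f"{name}: {size / 1024**2:.1f} MB exceeds the limit")
```

A check like `oversized_wheels` could run as a CI step after the wheel build so that a size regression fails before upload rather than at PyPI.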

github-actions · 2026-05-27T06:59:40Z

Code Coverage Report

------------------------------------------------------------------------------
                           GCC Code Coverage Report
Directory: .
------------------------------------------------------------------------------
File                                       Lines     Exec  Cover   Missing
------------------------------------------------------------------------------
src/io/CathodeWrite.cpp                       95       83    87%   40-41,97-100,115-116,182-185
src/io/CathodeWrite.H                          1        1   100%
src/io/DatReader.cpp                         136      106    77%   28-29,32,37,94-95,101-102,109-111,137-139,143,146-150,154-157,164,166,210-211,256,259
src/io/DatReader.H                             1        1   100%
src/io/HDF5Reader.cpp                        344       84    24%   40-41,43-44,46-49,52,54-56,58-59,62,64-66,68-74,92-93,126-128,144-145,154-157,174-180,182-187,204,213-215,217,219-228,230-233,236-238,240-251,253-258,266,266,266,266,266,266,266,270,270,270,270,270,270,270,274,276,278,280,282,288,290,297,297,297,297,297,297,297,301,301,301,301,301,301,301,305,305,305,305,305,305,305-306,306,306,306,306,306,306,309,309,309,309,309,309,309-310,310,310,310,310,310,310-311,311,311,311,311,311,311,313,313,313,313,313,313,313-314,314,314,314,314,314,314-315,315,315,315,315,315,315,319,319,319,319,319,319,319,324,324,324,324,324,324,324-325,325,325,325,325,325,325-326,326,326,326,326,326,326-327,327,327,327,327,327,327,332,332,332,332,332,332,332,337,337,337,337,337,337,337-338,338,338,338,338,338,338,343,343,343,343,343,343,343,350,350,350,350,350,350,350,357-358,432-435,437-440
src/io/HDF5Reader.H                            3        3   100%
src/io/ImageLoader.cpp                        61       42    68%   25,38,48,60-62,64-70,72,77,89-90,92,94
src/io/RawReader.cpp                         267      136    50%   51-52,91-92,113-114,117-119,122-123,142-144,157-159,168-170,176-179,187-188,194-198,202-206,211-214,221-226,233-239,273,275-276,278,285-286,303,314,316,320,327,329,333-336,340,348-349,355-357,363-365,367-368,371,374,376,379-382,384-386,388,390-391,393,395-396,398,400-401,403,405-406,408,412-413,415,419-420,422,427,467,473-474,535-538,552,554-556,558,560-562,572,576-578,580,602
src/io/RawReader.H                             1        1   100%
src/io/TiffReader.cpp                        385      131    34%   60-66,68-70,72-74,76-78,80-81,83-85,87-89,91-93,95-97,99-100,102-104,107-109,112-113,115-118,120,123,125-128,144-145,149-151,153-159,161,187,211,218,227,229-232,241,243-246,249,256,289-294,307,310-318,320-321,324-328,332-336,339-343,345-349,352-358,360-364,368,370,376-378,380-394,397,399-403,405-410,414-419,421-426,429-430,433-435,569-589,591-592,595-602,604,607-623,626-628,684,687-688,691-697,699,703-714,716-717
src/io/TiffReader.H                            5        5   100%
src/props/BoundaryCondition.H                131       74    56%   63,68,70,216,224-229,233-236,238-244,247-249,252-253,255,258-261,264-265,271-272,274-279,285-287,290-296,299,303,365-366,371,373
src/props/ConnectedComponents.cpp             71       69    97%   115-116
src/props/ConnectedComponents.H                4        4   100%
src/props/DeffTensor.cpp                      62       59    95%   122,128-129
src/props/Diffusion.cpp                      510      378    74%   93-94,97-98,103-104,106-116,118,123-132,134-141,144-150,153-157,159-163,165,168-173,175-177,179,182-184,186-187,190-191,193,195-198,200,202-203,288-289,297-298,300,349,359-360,368-371,373-375,404-413,415,453,461,465-467,526-527,533,535,539,547,581,610,638,646,735-736,739-740,757-760,771-772,774,824
src/props/EffDiffFillMtx.H                   120      106    88%   58,216-217,221-225,229,231-235
src/props/EffectiveDiffusivityHypre.cpp      413      372    90%   189-191,193-197,352-355,458,610-613,615-617,619-622,631-634,641,670,682-685,687-689,691,706,724,726
src/props/EffectiveDiffusivityHypre.H          7        7   100%
src/props/FloodFill.cpp                       90       87    96%   109-110,250
src/props/HypreStructSolver.cpp              343      210    61%   87-88,121,133-134,145,303,313,315,318,350,360,362,365,371-374,376-380,382-383,385-389,392-393,395-396,398,401-402,405-406,408-411,413-417,419-420,422-426,429-430,432-433,435,438-439,442-443,445-447,449-455,457-461,464-465,467-468,470,473-474,477,479-481,483-489,491-495,498-499,501-502,504,507-508,511,513-515,517-520,522-526,529-530,532-533,535,538-539,542,545-546,559
src/props/HypreStructSolver.H                  6        6   100%
src/props/MacroGeometry.H                     17       17   100%
src/props/ParticleSizeDistribution.cpp        11       11   100%
src/props/ParticleSizeDistribution.H           6        6   100%
src/props/PercolationCheck.cpp                53       46    86%   32-33,49-51,68,73
src/props/PercolationCheck.H                   4        4   100%
src/props/PhysicsConfig.H                     90       89    98%   150
src/props/ResultsJSON.H                      225      222    98%   242,395,416
src/props/REVStudy.cpp                       151      128    84%   72,83-91,159,170-173,175,183-186,188-190
src/props/SolverConfig.H                      32       20    62%   30,32,37-44,75-76
src/props/SpecificSurfaceArea.cpp             56       55    98%   59
src/props/SpecificSurfaceArea.H                6        6   100%
src/props/ThroughThicknessProfile.cpp         38       38   100%
src/props/ThroughThicknessProfile.H            5        5   100%
src/props/Tortuosity.H                         2        2   100%
src/props/TortuosityDirect.cpp               219      191    87%   81-83,86,100-106,113-114,125,134,140,202-209,226,394,424,433
src/props/TortuosityDirect.H                   5        5   100%
src/props/TortuosityHypre.cpp                793      567    71%   149-150,155-156,240-243,246-248,311,335-337,340-341,343,371-373,376-378,408-411,620,644,648,669,686-687,689-691,694-701,708-709,711,713,716-726,730-736,738-742,746-748,750-752,755-762,769-770,772,774-784,788-796,798-801,803,813,819-822,824-826,835-838,840-842,878,881-882,902-904,907,918-921,923,960,965-968,971-973,977-980,982,984-987,989,994-996,998,1047,1056,1061,1064-1069,1085-1088,1102-1106,1111-1116,1126-1130,1135-1140,1145-1149,1152-1155,1162-1165,1176,1185,1187,1191,1193,1218,1259-1260,1346-1348,1474-1477
src/props/TortuosityHypre.H                   15       15   100%
src/props/TortuosityHypreFill.H              127       98    77%   85,203,205-212,237-239,241-245,247-248,250,252,255-256,258-262
src/props/TortuosityKernels.H                 97       53    54%   52,56-60,62-65,69-74,76-80,84-85,90,129,143,157,243,245-248,250-253,257-260,262-265
src/props/TortuosityMLMG.cpp                 149      142    95%   283-285,287-288,293,314
src/props/TortuosityMLMG.H                     1        1   100%
src/props/TortuositySolverBase.cpp           311      247    79%   70-72,74-75,94-100,118,122,124,160-163,218,221,223,409,412-414,416,424-427,429-435,440,445-447,453-454,456-458,494,498-500,503,508-511,513,544,548-550,552,554,558
src/props/TortuositySolverBase.H              13       13   100%
src/props/VolumeFraction.cpp                  25       25   100%
src/props/VolumeFraction.H                     4        4   100%
------------------------------------------------------------------------------
TOTAL                                       5511     3975    72%
------------------------------------------------------------------------------

Generated by CI — coverage data from gcovr

codecov · 2026-05-27T06:59:48Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

jameslehoux merged commit 641f323 into master May 27, 2026
5 checks passed

github-actions Bot added devops gpu labels May 27, 2026

jameslehoux merged 1 commit into master from claude/issue-289-mlmg-eb-migration
