ci: all CI pipeline optimizations combined (P0-P7) #8

Open: huth-stacks wants to merge 19 commits into develop from ci/all-optimizations

What

This branch combines all 7 CI optimization PRs so that their aggregate impact can be measured.

Individual PRs

  • P0: Fix unit test failure masking (correctness)
  • P1: Remove yml from paths-ignore (correctness)
  • P2: Start cache building immediately (latency)
  • P3: Use larger runners for compilation jobs (latency)
  • P4: Increase unit test partitions 8→12 (latency)
  • P6: Reuse cached build for constants check (latency)
  • P7: Enable clippy caching (latency)

Expected Aggregate Impact

| Metric | Baseline | Expected |
|---|---|---|
| nextest-archive | 15m33s | ~8-10m (larger runner) |
| Slowest unit partition | 24m1s | ~14-16m (more partitions) |
| Constants check | 4m10s | ~10-15s (artifact reuse) |
| Cargo hack native | 13m14s | ~7-8m (larger runner) |
| Clippy | 4-8m | ~1-2m (caching, warm) |
| create-cache start delay | ~28s | ~0s (parallel) |
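
As a rough sanity check on the nextest-archive and unit-partition rows, the sketch below redoes the arithmetic under two simplifying assumptions that this PR does not state: that "~30-50% faster" on the larger runner means wall-clock time drops by 30-50%, and that moving from 8 to 12 partitions spreads test time perfectly evenly. All helper names are illustrative.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'XmYs' duration into seconds."""
    return minutes * 60 + seconds


def fmt(seconds: float) -> str:
    """Render seconds back into the 'XmYYs' style used in the table."""
    total = round(seconds)
    return f"{total // 60}m{total % 60:02d}s"


def faster_runner_range(baseline_s: float, low_cut: float = 0.30, high_cut: float = 0.50):
    """Return (best, worst) duration if wall-clock time drops by high_cut..low_cut."""
    return baseline_s * (1 - high_cut), baseline_s * (1 - low_cut)


def repartitioned(slowest_s: float, old_parts: int, new_parts: int) -> float:
    """Scale the slowest partition, assuming work is spread evenly (an idealization)."""
    return slowest_s * old_parts / new_parts


archive_best, archive_worst = faster_runner_range(to_seconds(15, 33))
slowest_partition = repartitioned(to_seconds(24, 1), old_parts=8, new_parts=12)

print("nextest-archive:", fmt(archive_best), "to", fmt(archive_worst))
print("slowest unit partition:", fmt(slowest_partition))
```

Under these assumptions the archive build lands at roughly 8-11 minutes and the slowest partition at about 16 minutes, which is in the same range as the table. Real partitions are rarely perfectly balanced, so the partition figure is best read as a rough guide.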

Total Lines Changed

~30 lines across 7 files. Each change is independently revertable.

Security Checklist

  • No new permissions granted
  • No secrets exposure
  • No new third-party actions
  • Cache isolation preserved
  • All GitHub-hosted runners

Fixed logical operator in tenure-start block validation to correctly
reject blocks where either the coinbase or tenure-change transaction is
in the wrong position, not only when both are wrong
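
The operator mix-up described in this commit message can be illustrated with a small truth table. This is a hypothetical Python model, not the actual Rust code of `is_wellformed_tenure_start_block`; the two boolean parameters are invented for illustration.

```python
from itertools import product


def wellformed_buggy(coinbase_misplaced: bool, tenure_change_misplaced: bool) -> bool:
    # Bug pattern: the block is rejected only when BOTH transactions are misplaced.
    return not (coinbase_misplaced and tenure_change_misplaced)


def wellformed_fixed(coinbase_misplaced: bool, tenure_change_misplaced: bool) -> bool:
    # Fixed pattern: the block is rejected when EITHER transaction is misplaced.
    return not (coinbase_misplaced or tenure_change_misplaced)


# Print every combination so the cases where the two versions disagree are visible.
for coinbase_bad, tenure_bad in product([False, True], repeat=2):
    print(
        f"coinbase_misplaced={coinbase_bad!s:5} tenure_change_misplaced={tenure_bad!s:5} "
        f"buggy={wellformed_buggy(coinbase_bad, tenure_bad)!s:5} "
        f"fixed={wellformed_fixed(coinbase_bad, tenure_bad)}"
    )
```

The two versions disagree exactly when one transaction is misplaced and the other is not, which is the case the fix is meant to reject.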

Wait for state machine updates before proposing block to avoid the
signer rejecting the proposal immediately.

@huth-stacks added the "no changelog" label (Skip changelog fragment check) on Mar 24, 2026
@huth-stacks reopened this on Mar 24, 2026
@huth-stacks force-pushed the ci/all-optimizations branch from d283796 to 314f520 on March 24, 2026 21:30
@huth-stacks reopened this on Mar 24, 2026
@huth-stacks force-pushed the ci/all-optimizations branch from 314f520 to ca1ee69 on March 24, 2026 22:03
…-in-signer

fix: `continue` instead of `return` on unknown signer
fix: logic error in `is_wellformed_tenure_start_block`
…igner-tests

test: fix flakiness in `signer_waits_for_validation_before_signing`
@huth-stacks force-pushed the ci/all-optimizations branch from ca1ee69 to e30f2d5 on March 25, 2026 11:32
The unit-tests job had continue-on-error: true and the check-tests
job did not depend on unit-tests, causing test failures to be
silently swallowed. Remove continue-on-error and add unit-tests
to the check-tests needs array.
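
Why both halves of this fix matter can be sketched with a toy model of a gating job. This is a simplified Python model, not GitHub Actions itself: it assumes a failed job marked continue-on-error never counts as a failure, and that the gate only sees jobs listed in its own needs. The `integration-tests` job is a hypothetical sibling added so the gate has something to depend on.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    passed: bool = True
    continue_on_error: bool = False
    needs: list = field(default_factory=list)


def blocks_dependents(job: Job) -> bool:
    # Simplified model: a failure only counts if continue-on-error is off.
    return not job.passed and not job.continue_on_error


def gate_passes(gate: Job, jobs: dict) -> bool:
    # The gate can only be failed by jobs listed in its own `needs`.
    return not any(blocks_dependents(jobs[name]) for name in gate.needs)


def build(continue_on_error: bool, gate_needs: list) -> dict:
    jobs = [
        Job("integration-tests"),  # hypothetical sibling job, passing
        Job("unit-tests", passed=False, continue_on_error=continue_on_error),
        Job("check-tests", needs=gate_needs),
    ]
    return {job.name: job for job in jobs}


before = build(continue_on_error=True, gate_needs=["integration-tests"])
after = build(continue_on_error=False, gate_needs=["integration-tests", "unit-tests"])

print("before:", gate_passes(before["check-tests"], before))  # failure is masked
print("after: ", gate_passes(after["check-tests"], after))    # failure is surfaced
```

In this model, removing continue-on-error alone is not enough: if unit-tests is still missing from the gate's needs, the gate never looks at its result.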

The paths-ignore block excluded **.yml files, which meant pushes
to master/develop/next containing only workflow file changes would
silently skip CI. Remove this exclusion so workflow changes are
always validated.
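
The skip behavior can be approximated in a few lines. A workflow with paths-ignore is skipped only when every changed file matches an ignore pattern; the sketch below approximates GitHub's glob matching with Python's `fnmatch`, which is close enough for suffix patterns like `**.yml` but is not an exact reimplementation. The `**.md` pattern and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def push_skips_ci(changed_files: list, paths_ignore: list) -> bool:
    """True when every changed file matches at least one ignore pattern."""
    return bool(changed_files) and all(
        any(fnmatchcase(path, pattern) for pattern in paths_ignore)
        for path in changed_files
    )


BEFORE = ["**.md", "**.yml"]  # "**.md" is a hypothetical neighbour of "**.yml"
AFTER = ["**.md"]             # the "**.yml" exclusion removed

workflow_only_push = [".github/workflows/ci.yml"]
mixed_push = [".github/workflows/ci.yml", "src/main.rs"]

print(push_skips_ci(workflow_only_push, BEFORE))  # workflow-only push is skipped
print(push_skips_ci(workflow_only_push, AFTER))   # now it triggers CI
print(push_skips_ci(mixed_push, BEFORE))          # mixed pushes always triggered CI
```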

The create-cache workflow waited for rustfmt, changelog-check, and
check-release before starting the 15-minute nextest archive build.
These gates are independent from compilation. Test workflows already
depend on both create-cache and the format checks independently,
so format failures still block test results.
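
The latency effect of dropping these gates can be sketched as an earliest-start computation over the job dependency graph. The gate durations below are hypothetical, chosen so the slowest gate matches the ~28s start delay quoted in the PR description; 933s is the 15m33s nextest-archive baseline; the job names in the graph are simplified.

```python
def earliest_start(job: str, needs: dict, duration: dict) -> int:
    """A job starts when the slowest job it needs has finished (0 if it needs none)."""
    return max(
        (earliest_start(dep, needs, duration) + duration[dep] for dep in needs.get(job, [])),
        default=0,
    )


# Durations in seconds. Gate durations are assumptions, not measurements.
DURATION = {"rustfmt": 28, "changelog-check": 12, "check-release": 9, "create-cache": 933}
GATES = ["rustfmt", "changelog-check", "check-release"]

# Before: create-cache waits for the gates. After: it starts immediately.
# In both graphs the tests still need the gates directly.
before = {"create-cache": GATES, "tests": ["create-cache"] + GATES}
after = {"create-cache": [], "tests": ["create-cache"] + GATES}

for label, graph in (("before", before), ("after", after)):
    print(
        label,
        "create-cache starts at", earliest_start("create-cache", graph, DURATION),
        "| tests start at", earliest_start("tests", graph, DURATION),
    )
```

Because the tests keep their own dependency on the gates, a formatting failure still blocks them in the "after" graph; only the wait before the archive build disappears.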

Switch nextest-archive, cargo-hack native-targets, and constants-check
to ubuntu-latest-m (4 vCPU, 16 GB RAM) for faster compilation. The
release workflow already uses these runners. Compilation is expected to
be ~30-50% faster, with a negligible cost difference.

Add a Python script that reads JUnit XML timing data from previous
CI runs and uses greedy bin-packing to distribute tests into
time-balanced partitions. Falls back to hash-based partitioning
when no timing data is available (first run, or data expired).
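
A minimal sketch of the approach described above, assuming a longest-first greedy heuristic and a SHA-256 based fallback. This is not the contents of split-tests-by-timing.py; the function names, the test names, and the choice to give unknown tests the average duration are all assumptions made for illustration.

```python
import hashlib
import heapq


def hash_partition(tests: list, n: int) -> list:
    """Fallback: assign each test by a stable hash of its name."""
    parts = [[] for _ in range(n)]
    for test in sorted(tests):
        index = int(hashlib.sha256(test.encode()).hexdigest(), 16) % n
        parts[index].append(test)
    return parts


def split_by_timing(tests: list, timings: dict, n: int) -> list:
    """Greedy bin-packing: place the longest tests first, each into the lightest partition."""
    if not timings:
        return hash_partition(tests, n)
    # Tests with no recorded time get the average known duration (an assumption).
    default = sum(timings.values()) / len(timings)
    weighted = sorted(((timings.get(t, default), t) for t in tests), reverse=True)
    heap = [(0.0, i) for i in range(n)]  # (current load, partition index)
    heapq.heapify(heap)
    parts = [[] for _ in range(n)]
    for seconds, test in weighted:
        load, index = heapq.heappop(heap)
        parts[index].append(test)
        heapq.heappush(heap, (load + seconds, index))
    return parts


timings = {"a": 10.0, "b": 7.0, "c": 5.0, "d": 4.0, "e": 2.0}
parts = split_by_timing(list(timings), timings, n=2)
loads = [sum(timings[t] for t in part) for part in parts]
print(parts, loads)
```

The greedy heuristic is not guaranteed to be optimal, but it is cheap and usually keeps the slowest partition close to the average load, which is the quantity that determines wall-clock time.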

New files:
  .github/scripts/split-tests-by-timing.py - bin-packing script
  .config/nextest.toml - enables JUnit XML output for CI profile

Each partition uploads its JUnit XML as an artifact (90-day retention).
On subsequent runs, all partitions download this timing data and the
script assigns tests to minimize the slowest partition's duration.
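
Reading the timing data back could look roughly like the sketch below, which uses only the standard `classname`, `name`, and `time` attributes of JUnit `testcase` elements. The sample XML and the `classname::name` key format are made up for illustration and may not match what the PR's script or nextest actually produce.

```python
import xml.etree.ElementTree as ET

SAMPLE = """\
<testsuites>
  <testsuite name="example-crate">
    <testcase classname="example-crate" name="tests::fast" time="0.25"/>
    <testcase classname="example-crate" name="tests::slow" time="12.5"/>
  </testsuite>
</testsuites>
"""


def read_junit_timings(xml_text: str) -> dict:
    """Map 'classname::name' to seconds for every testcase in one JUnit report."""
    timings = {}
    for case in ET.fromstring(xml_text).iter("testcase"):
        key = f'{case.get("classname", "")}::{case.get("name", "")}'
        timings[key] = timings.get(key, 0.0) + float(case.get("time", "0"))
    return timings


def merge_timings(reports: list) -> dict:
    """Combine the reports downloaded from all partitions of a previous run."""
    merged = {}
    for xml_text in reports:
        merged.update(read_junit_timings(xml_text))
    return merged


print(merge_timings([SAMPLE]))
```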

POC: inlines the nextest command to test this approach. Production
implementation should integrate with stacks-network/actions.

Instead of compiling stacks-inspect from scratch (4+ minutes), the
nextest-archive job now generates the constants JSON and uploads it
as an artifact. The constants-check job downloads and diffs it,
reducing the check from 4 minutes to ~10 seconds.

Also adds create-cache to constants-check's needs in ci.yml so the
artifact is available before the download step runs.
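
Once the JSON artifact is available, the check itself reduces to a key-by-key comparison. A minimal sketch of such a diff, assuming the constants are a flat JSON object; the constant names and values are hypothetical and the PR's actual diff step may work differently.

```python
import json

MISSING = "<missing>"


def diff_constants(baseline: dict, generated: dict) -> dict:
    """Return {name: (baseline value, generated value)} for every constant that differs."""
    changes = {}
    for key in sorted(baseline.keys() | generated.keys()):
        old = baseline.get(key, MISSING)
        new = generated.get(key, MISSING)
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical contents: the checked-in baseline and the downloaded artifact.
baseline = json.loads('{"EXAMPLE_CHAIN_ID": 1, "EXAMPLE_BLOCK_LIMIT": 5000}')
generated = json.loads('{"EXAMPLE_CHAIN_ID": 1, "EXAMPLE_BLOCK_LIMIT": 6000, "EXAMPLE_NEW": true}')

changes = diff_constants(baseline, generated)
for name, (old, new) in changes.items():
    print(f"{name}: {old} -> {new}")
```

A CI step built on this would exit non-zero when `changes` is non-empty, so an unexpected constant change fails the job without compiling anything.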

The clippy workflow explicitly disabled caching, causing full
recompilation on every PR. Enable the built-in cargo/target
caching from actions-rust-lang/setup-rust-toolchain.

Set CARGO_INCREMENTAL=0 and CARGO_PROFILE_DEV_DEBUG=0 for the test
cache build. Incremental compilation adds ~10% overhead on clean CI
builds and bloats the target directory. Debug info level 2 (default)
increases binary size and link time without benefit in CI where we
don't debug interactively.

Remove -Cinstrument-coverage from RUSTFLAGS in create-cache.yml.
This eliminates ~15% compilation overhead, ~56% binary size bloat,
and per-test .profraw I/O from every PR run. Coverage data is no
longer collected per-PR. The coverage report job will skip gracefully.

This follows standard practice for large Rust projects: coverage on
merge, not per-PR. Can be restored by reverting this one-line change.

Remove pull_request from cargo-hack-check's trigger condition.
Feature combination checks still run on merge queue, releases,
and manual dispatch, catching issues before they reach develop.
PRs skip this 13-minute job for faster feedback.
@huth-stacks force-pushed the ci/all-optimizations branch from e30f2d5 to a97066b on March 25, 2026 11:44