ci: all optimizations v2 (excl P3)#27

Open
huth-stacks wants to merge 25 commits into develop from ci/all-optimizations-v2

@huth-stacks

Owner

Combined P0-P17 excluding P3/larger runners

brice-stacks and others added 25 commits March 21, 2026 08:13
Fixed the logical operator in tenure-start block validation to correctly
reject blocks where either the coinbase or tenure-change transaction is
in the wrong position, not only when both are wrong.
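
The difference between the two operators can be sketched as below. This is a minimal illustration in Python, not the Rust code in `is_wellformed_tenure_start_block`; the function names and the expected positions (`TENURE_CHANGE_POS`, `COINBASE_POS`) are assumptions made for the example.

```python
# Assumed positions, for illustration only; the real validation may differ.
TENURE_CHANGE_POS = 0
COINBASE_POS = 1


def rejects_with_and(tenure_change_pos, coinbase_pos):
    """Buggy form: rejects only when BOTH transactions are misplaced."""
    return tenure_change_pos != TENURE_CHANGE_POS and coinbase_pos != COINBASE_POS


def rejects_with_or(tenure_change_pos, coinbase_pos):
    """Fixed form: rejects when EITHER transaction is misplaced."""
    return tenure_change_pos != TENURE_CHANGE_POS or coinbase_pos != COINBASE_POS


# A block with the tenure-change in place but the coinbase misplaced
# slips past the `and` check and is caught by the `or` check.
print(rejects_with_and(0, 2), rejects_with_or(0, 2))
```
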
Wait for state machine updates before proposing a block to avoid the
signer rejecting the proposal immediately.
…-in-signer

fix: `continue` instead of `return` on unknown signer
fix: logic error in `is_wellformed_tenure_start_block`
…igner-tests

test: fix flakiness in `signer_waits_for_validation_before_signing`
The unit-tests job had continue-on-error: true and the check-tests
job did not depend on unit-tests, causing test failures to be
silently swallowed. Remove continue-on-error and add unit-tests
to the check-tests needs array.
The paths-ignore block excluded **.yml files, which meant pushes
to master/develop/next containing only workflow file changes would
silently skip CI. Remove this exclusion so workflow changes are
always validated.
The create-cache workflow waited for rustfmt, changelog-check, and
check-release before starting the 15-minute nextest archive build.
These gates are independent from compilation. Test workflows already
depend on both create-cache and the format checks independently,
so format failures still block test results.
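
The effect on the critical path can be estimated with a small model. The sketch below assumes each job starts as soon as all of its `needs` have finished; the 15-minute build time comes from the message above, while the one-minute gate durations and the flat dependency shape are placeholders, not measurements from this repository.

```python
from functools import lru_cache


def finish_time(durations, needs, target):
    """Minutes until `target` finishes if every job starts once its needs are done."""

    @lru_cache(maxsize=None)
    def finish(job):
        start = max((finish(dep) for dep in needs.get(job, ())), default=0)
        return start + durations[job]

    return finish(target)


# Gate durations are hypothetical; only the 15-minute build is from the commit message.
durations = {"rustfmt": 1, "changelog-check": 1, "check-release": 1, "create-cache": 15}

needs_before = {"create-cache": ("rustfmt", "changelog-check", "check-release")}
needs_after = {}  # create-cache no longer waits for the gates

print(finish_time(durations, needs_before, "create-cache"))
print(finish_time(durations, needs_after, "create-cache"))
```
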
Add a Python script that reads JUnit XML timing data from previous
CI runs and uses greedy bin-packing to distribute tests into
time-balanced partitions. Falls back to hash-based partitioning
when no timing data is available (first run, or data expired).
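
A minimal sketch of the partitioning idea follows. It is not the contents of `split-tests-by-timing.py`; the function names, the longest-first ordering, the CRC32 hash, and the use of the average duration for tests without timing data are all assumptions made for illustration.

```python
import heapq
import zlib


def partition_by_timing(timings, num_partitions):
    """Greedy packing: assign the slowest remaining test to the lightest partition."""
    heap = [(0.0, index) for index in range(num_partitions)]
    heapq.heapify(heap)
    partitions = [[] for _ in range(num_partitions)]
    # Sort by descending duration, then by name so the result is deterministic.
    for name, seconds in sorted(timings.items(), key=lambda kv: (-kv[1], kv[0])):
        load, index = heapq.heappop(heap)
        partitions[index].append(name)
        heapq.heappush(heap, (load + seconds, index))
    return partitions


def partition_by_hash(test_names, num_partitions):
    """Fallback: a stable hash of the test name picks the partition."""
    partitions = [[] for _ in range(num_partitions)]
    for name in sorted(test_names):
        partitions[zlib.crc32(name.encode()) % num_partitions].append(name)
    return partitions


def split_tests(test_names, timings, num_partitions):
    if not timings:
        return partition_by_hash(test_names, num_partitions)
    # Tests without recorded timing are assumed to take the average duration.
    average = sum(timings.values()) / len(timings)
    estimated = {name: timings.get(name, average) for name in test_names}
    return partition_by_timing(estimated, num_partitions)


names = ["a", "b", "c", "d", "e"]
timings = {"a": 10, "b": 7, "c": 5, "d": 4, "e": 2}
print(split_tests(names, timings, 2))  # both partitions total 14 seconds
```

Greedy packing does not guarantee the optimal split, but it is cheap and usually close when no single test dominates the total.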

New files:
  .github/scripts/split-tests-by-timing.py - bin-packing script
  .config/nextest.toml - enables JUnit XML output for CI profile

Each partition uploads its JUnit XML as an artifact (90-day retention).
On subsequent runs, all partitions download this timing data and the
script assigns tests to minimize the slowest partition duration.
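
Reading the timing data could look roughly like this. The sketch assumes the JUnit files use the common `testsuite`/`testcase` layout with a `time` attribute in seconds; the `classname::name` key format and the sample test names are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_timings(junit_xml_documents):
    """Map each test to its slowest recorded duration across all documents."""
    timings = {}
    for document in junit_xml_documents:
        root = ET.fromstring(document)
        for case in root.iter("testcase"):
            # Hypothetical key format; the real script may identify tests differently.
            key = f'{case.get("classname", "")}::{case.get("name", "")}'
            seconds = float(case.get("time") or 0)
            timings[key] = max(seconds, timings.get(key, 0.0))
    return timings


SAMPLE = """<testsuites>
  <testsuite name="stackslib">
    <testcase classname="stackslib" name="chainstate::tests::a" time="12.5"/>
    <testcase classname="stackslib" name="net::tests::b" time="3.25"/>
  </testsuite>
</testsuites>"""

print(read_timings([SAMPLE]))
```
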

Uses --workspace-remap and --profile ci to ensure JUnit output is
written correctly when running from a nextest archive.

POC: inlines the nextest command to test this approach. Production
implementation should integrate with stacks-network/actions.
Instead of compiling stacks-inspect from scratch (4+ minutes), the
nextest-archive job now generates the constants JSON and uploads it
as an artifact. The constants-check job downloads and diffs it,
reducing the check from 4 minutes to ~10 seconds.

Also adds create-cache to constants-check's needs in ci.yml so the
artifact is available before the download step runs.
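
The diff step itself is cheap once the JSON exists. The sketch below assumes the constants are a flat JSON object; the key names and values are made up for the example and do not reflect the real output of stacks-inspect.

```python
import json

MISSING = "<missing>"


def diff_constants(expected_json, generated_json):
    """Return (key, expected, generated) for every key whose value differs."""
    expected = json.loads(expected_json)
    generated = json.loads(generated_json)
    differences = []
    for key in sorted(set(expected) | set(generated)):
        old = expected.get(key, MISSING)
        new = generated.get(key, MISSING)
        if old != new:
            differences.append((key, old, new))
    return differences


# Hypothetical constants, for illustration only.
expected = '{"chain_id": 1, "epoch": "3.0"}'
generated = '{"chain_id": 1, "epoch": "3.1", "extra": true}'
for key, old, new in diff_constants(expected, generated):
    print(f"{key}: {old!r} -> {new!r}")
```
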
The clippy workflow explicitly disabled caching, causing full
recompilation on every PR. Enable the built-in cargo/target
caching from actions-rust-lang/setup-rust-toolchain.
Set CARGO_INCREMENTAL=0 and CARGO_PROFILE_DEV_DEBUG=0 for the test
cache build. Incremental compilation adds ~10% overhead on clean CI
builds and bloats the target directory. Debug info level 2 (default)
increases binary size and link time without benefit in CI where we
don't debug interactively.
Remove -Cinstrument-coverage from RUSTFLAGS in create-cache.yml.
This eliminates ~15% compilation overhead, ~56% binary size bloat,
and per-test .profraw I/O from every PR run. Coverage data is no
longer collected per-PR. The coverage report job will skip gracefully.

Standard practice for large Rust projects — coverage on merge, not
per-PR. Can be restored by reverting this one-line change.
Remove pull_request from cargo-hack-check's trigger condition.
Feature combination checks still run on merge queue, releases,
and manual dispatch — catching issues before they reach develop.
PRs skip this 13-minute job for faster feedback.
sccache caches individual compilation units (rustc invocations) using the
GitHub Actions cache backend, providing much better cache reuse across
commits than the current cargo target dir caching, which keys on
commit SHA. Research shows 11-35% build speedup with a warm sccache cache.

Adds mozilla-actions/sccache-action and RUSTC_WRAPPER=sccache to:
- create-cache.yml (nextest archive build)
- clippy.yml
- constants-check.yml
- cargo-hack-check.yml (native, wasm, and fuzz targets)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…pendencies

check-release now starts immediately (no dependencies), running in
parallel with rustfmt and changelog-check. All downstream jobs only
depend on check-release and create-cache where needed, not on formatting
checks. This shaves ~0.5-2 min off the critical path of every PR run.

rustfmt and changelog-check still run and report status independently —
they just no longer block test execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the separate setup job that only reads the toolchain version.
Each job now reads it inline. Split the combined native-targets job
into separate linux-targets and windows-targets jobs that run in
parallel, reducing wall-clock time by ~2-6 min.

Job structure: linux-targets, windows-targets, wasm-targets, fuzz-targets
all run in parallel with no inter-dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch nightly/dispatch docker image builds from --profile release (fat
LTO) to --profile release-lite (thin LTO). This saves ~3-8 min per
build. Tagged release builds via release-build.yml are unchanged and
still use the full release profile.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add compression-level: 0 to upload-artifact steps that upload .zip
files. These are already compressed, so re-compressing them wastes
~30-90s per release/image workflow with no size benefit.
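
The reasoning can be checked with a small experiment. The sketch below uses pseudo-random bytes as a stand-in for an already-compressed .zip payload (an assumption, since compressed data is close to incompressible) and compares deflate at level 6 with level 0 using Python's `zlib`; it does not call upload-artifact.

```python
import random
import zlib


def size_ratio(payload, level):
    """Compressed size divided by original size at the given zlib level."""
    return len(zlib.compress(payload, level)) / len(payload)


# Stand-in for an already-compressed archive: seeded pseudo-random bytes.
rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(200_000))

# Level 6 spends CPU time searching for matches but cannot shrink the data;
# level 0 stores it as-is with a few bytes of framing overhead.
print(f"level 6: {size_ratio(payload, 6):.4f}")
print(f"level 0: {size_ratio(payload, 0):.4f}")
```
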

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Install lld and add -Clink-arg=-fuse-ld=lld to rustflags for
x86_64-unknown-linux-gnu targets in release-build.yml and
docker-image.yml. lld is significantly faster than the GNU linker
for linking with LTO, saving ~1-4 min on release builds.

Only applied to x86_64-gnu; aarch64 (cross-compilation), musl,
windows, and macOS targets are unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2 participants