fix(task-148): Toyota Way 500-line refactor + FALSIFY-CORPUS-004 + QLoRA + GPU training backend#1003
Closed
noahgift wants to merge 2 commits into
Conversation
Addresses the 2026-04-22 outage where all 16 intel-clean-room runners went offline because / on intel hit 100% (3.5T/3.6T). Runner diag logs couldn't be written, so GitHub marked the runners offline.

Two layers of defence:
- pre-job hook: aggressive target/ prune when disk >= 85%
- nightly timer: prune target/ older than 7 days

Scripts are runner-host-agnostic — install path and deployment recipe in scripts/runner-infra/README.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
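The 85% pre-job threshold above boils down to a small decision rule. A minimal sketch of that rule, assuming integer byte counts (the function name `should_prune` is hypothetical — the actual hooks live in shell scripts under scripts/runner-infra/):

```rust
/// Hypothetical sketch of the pre-job disk check: prune target/
/// caches once usage crosses the threshold (85% in the hook above).
fn should_prune(used_bytes: u64, total_bytes: u64, threshold_pct: u64) -> bool {
    // Integer arithmetic avoids float rounding right at the boundary.
    used_bytes * 100 >= total_bytes * threshold_pct
}

fn main() {
    let tib = 1u64 << 40;
    // The outage numbers: 3.5T used of 3.6T total is well past 85%.
    assert!(should_prune(35 * tib / 10, 36 * tib / 10, 85));
    // A half-full disk stays untouched.
    assert!(!should_prune(tib / 2, tib, 85));
    println!("ok");
}
```

Keeping the check a pure function of (used, total, threshold) makes it trivial to unit-test the boundary case without touching a real filesystem.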
…or + FALSIFY-CORPUS-004 + QLoRA contract + GPU training backend

Toyota Way (PMAT-689): split 5 files over the 500-line cap via the include!() pattern
- distill.rs 1984→468 (4-way split: types/config_and_execute/train_and_write/text_generate)
- extended_commands.rs →497 (4 sibling sub-enum files: forensics/lints/runs/training)
- dispatch_analysis.rs →453 (+ dispatch_helpers.rs + dispatch_profiling.rs)
- lib_dispatch_coverage.rs 773→158 (3 sibling test files: analysis/profiling/train)
- pull.rs →374 (+ pull_sharded.rs)

FALSIFY-CORPUS-004 pre-flight gate (#142/#144/#145/#146/#147):
- contracts/pretraining-corpus-v1.yaml v2.0.0 (INV-TRAIN-010/011)
- ShardBatchIter::count_tokens static counter
- cycling_iter.rs: BoxedShardIter + optional cycling
- pretrain_preflight.rs + pretrain_report.rs module split
- --allow-shard-cycle CLI flag wired
- pretrain_tests.rs unit tests covering epoch/budget/cycle paths

QLoRA distillation contract (#137):
- contracts/entrenar/qlora-distillation-v1.yaml v1.1.0 PROPOSED
- distill/{preflight,driver,apr_writer}.rs wiring 5 INV-DISTILL invariants
- 14 harness tests at PARTIAL_ALGORITHM_LEVEL

GPU training backend Phase 2 (#132):
- pretrain_real_cuda.rs CUDA dispatch wiring
- evidence/gpu-training-backend/ Phase 2 scaffold

MODEL-2 spec updates:
- ship-two-models-spec.md v2.24.0 (INV-TRAIN-011 + corpus v2.0)
- roadmap.yaml phase tracking for tasks #142/#144/#145/#146/#147

Verification:
- cargo test -p apr-cli --features training --lib → 5307 passed
- cargo fmt --all -- --check → clean
- cargo clippy -p apr-cli --features training --lib -- -D warnings → clean
- cargo clippy -p aprender-train --lib -- -D warnings → clean
- All changed files ≤500 lines (pmat work complete invariant GREEN)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
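The BoxedShardIter + optional cycling item above can be pictured with a minimal sketch. Names and shapes here are assumptions inferred from the commit message, not the real cycling_iter.rs API: a boxed iterator either stops after one pass over the shards or, when the equivalent of `--allow-shard-cycle` is set, cycles until a batch budget is spent.

```rust
/// Hypothetical stand-in for one corpus shard batch.
type ShardBatch = Vec<u32>;

/// Boxed trait object so one-pass and cycling iterators share a single type,
/// mirroring the BoxedShardIter idea from the commit message above.
type BoxedShardIter = Box<dyn Iterator<Item = ShardBatch>>;

/// Build either a one-epoch or a cycling iterator over shards, gated by the
/// (hypothetical) equivalent of the --allow-shard-cycle flag.
fn shard_iter(shards: Vec<ShardBatch>, allow_cycle: bool, max_batches: usize) -> BoxedShardIter {
    if allow_cycle {
        // cycle() re-reads the shards forever; take() caps it so a token or
        // batch budget still terminates the run.
        Box::new(shards.into_iter().cycle().take(max_batches))
    } else {
        Box::new(shards.into_iter().take(max_batches))
    }
}

fn main() {
    let shards = vec![vec![1, 2], vec![3]];
    // One epoch: stops after two shards even though the budget allows five.
    assert_eq!(shard_iter(shards.clone(), false, 5).count(), 2);
    // Cycling: keeps re-reading shards until the batch budget is spent.
    assert_eq!(shard_iter(shards, true, 5).count(), 5);
    println!("ok");
}
```

Boxing the iterator keeps both code paths behind one return type, which is what lets a pre-flight gate count batches the same way regardless of whether cycling is enabled.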
Hey @noahgift, Code Input detected this PR has a merge conflict. The conflicts of this PR can be resolved with a semantic merge driver. Code Input can do that automatically: https://codeinput.com/r/3rcKnuOwggR Let me know if you need more help with this conflict or how Code Input works.
Summary
Toyota Way file-size refactor (PMAT-689 / task #148) bundled with FALSIFY-CORPUS-004 pre-flight gate, QLoRA distillation contract (#137), and GPU training backend Phase 2 (#132).
- Toyota Way refactor via the `include!()` pattern:
  - `distill.rs` 1984→468 (4-way split)
  - `extended_commands.rs` →497 (4 sibling sub-enum files)
  - `dispatch_analysis.rs` →453 (helpers + profiling)
  - `lib_dispatch_coverage.rs` 773→158 (3 sibling test files)
  - `pull.rs` →374 (+ `pull_sharded.rs`)
- FALSIFY-CORPUS-004 pre-flight gate: `pretraining-corpus-v1` v2.0.0 (INV-TRAIN-010/011), `ShardBatchIter::count_tokens`, `BoxedShardIter` + optional cycling, pre-flight module split, `--allow-shard-cycle` CLI flag, unit tests
- QLoRA distillation contract: `contracts/entrenar/qlora-distillation-v1.yaml` v1.1.0 PROPOSED
- GPU training backend Phase 2: `pretrain_real_cuda.rs` CUDA dispatch wiring
- Spec updates: `ship-two-models-spec.md` v2.24.0 + `roadmap.yaml`

Test plan
- `cargo fmt --all -- --check`
- `cargo test -p apr-cli --features training --lib` → 5307 passed, 12 ignored
- `cargo clippy -p apr-cli --features training --lib -- -D warnings`
- `cargo clippy -p aprender-train --lib -- -D warnings`
- `cargo test -p aprender-train --lib cpu_stepfn_exhaustion` → 2 passed (PMAT-688 CPU peer)

🤖 Generated with Claude Code