feat(format): transpose-kernel-v1 6-gate PARTIAL discharge #1386
Closed
noahgift wants to merge 2 commits into
Conversation
Binds FALSIFY-TP-001..006 from transpose-kernel-v1 at PARTIAL_ALGORITHM_LEVEL via 6 verdict functions plus a reference `transpose_rowmajor` helper.

- TP-001: transpose(A)[j][i] == A[i][j] elementwise
- TP-002: transpose(transpose(A)) == A bitwise
- TP-003: non-8-aligned dimensions transpose correctly (7×13, 17×3, 1×N, N×1)
- TP-004: AVX2 vs scalar bit-exact (zero tolerance)
- TP-005: transpose(I) == I for identity matrices
- TP-006: 2048×128 attention shape matches naive reference

## Five Whys

1. Why does transpose-kernel-v1 list 6 falsification IDs without algorithm-level discharge? PMAT lints flagged FALSIFY-TP-001..006 as unbound at PARTIAL_ALGORITHM_LEVEL.
2. Why does that block ship? Coverage % cannot move while peripheral transpose kernel invariants have no algorithm-level verdict module.
3. Why bind both verdicts AND a reference `transpose_rowmajor`? The involution gate (TP-002) needs a deterministic transpose to walk through twice; the non-aligned gate (TP-003) builds its own test matrix and self-transposes. Pinning the reference in-module makes the algorithm-level decision rule self-contained.
4. Why bit-exact (`to_bits()`) for TP-001/002/004 rather than float-tolerant? Transpose is a memory-layout op — there's no float arithmetic to round. Any bit-level drift between input and transposed output is a pure index/permute bug. A float-tolerant compare would miss exactly the regression class the gate exists to catch.
5. Why an explicit 2048×128 hardcoded check for TP-006 (not a parametric range)? The contract calls out that specific shape as the attention-projection real-world dim. A future kernel that accidentally only handles square shapes would pass a parametric square sweep but fail the 2048×128 case. The hardcode locks in the actual production-relevant dim.

Adds 28 unit tests including a 6-pair dim sweep on TP-002 and a 4-shape correctness sweep on TP-003.
Realistic-healthy walks the canonical 2048×128 / 8×8 identity / non-aligned 7×13 mixed scenario; pre-fix walks 6 simultaneous regressions (corrupted element, size mismatch, zero-dim, AVX2 ULP drift, zero-N identity, attention drift). No runtime % shift; algorithm-level coverage advances by 6 gates.
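The involution gate (TP-002) can be sketched as follows. This is a minimal illustration, not the PR's actual module: `transpose_rowmajor` mirrors the described signature (`Option<Vec<f32>>`, `None` on size mismatch), and `involution_holds` is a hypothetical verdict function using the bitwise `to_bits()` compare discussed above.

```rust
/// Sketch of a row-major reference transpose, assuming the contract
/// described in this PR: returns None when src.len() != rows * cols.
fn transpose_rowmajor(src: &[f32], rows: usize, cols: usize) -> Option<Vec<f32>> {
    if src.len() != rows * cols {
        return None; // size mismatch: no valid transpose
    }
    let mut dst = vec![0.0f32; rows * cols];
    for i in 0..rows {
        for j in 0..cols {
            // Element (i, j) of the rows×cols input lands at (j, i)
            // of the cols×rows output.
            dst[j * rows + i] = src[i * cols + j];
        }
    }
    Some(dst)
}

/// TP-002-style check: transpose(transpose(A)) == A, compared bitwise.
fn involution_holds(a: &[f32], rows: usize, cols: usize) -> bool {
    let t = match transpose_rowmajor(a, rows, cols) {
        Some(t) => t,
        None => return false,
    };
    let tt = match transpose_rowmajor(&t, cols, rows) {
        Some(tt) => tt,
        None => return false,
    };
    a.iter().zip(&tt).all(|(x, y)| x.to_bits() == y.to_bits())
}

fn main() {
    // Non-8-aligned 7×13 shape (a TP-003 dimension) survives the round trip.
    let a: Vec<f32> = (0..7 * 13).map(|k| k as f32).collect();
    assert!(involution_holds(&a, 7, 13));
}
```

Walking the reference transpose twice, rather than comparing against a second implementation, is what makes the gate self-contained at algorithm level.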
eb38b38 to d6da4a7
auto-merge was automatically disabled (May 12, 2026 09:21); pull request was closed.
Summary
Binds FALSIFY-TP-001..006 from `transpose-kernel-v1` at `PARTIAL_ALGORITHM_LEVEL` via 6 verdict functions + reference `transpose_rowmajor`.

Gates bound

- `transpose(A)[j][i] == A[i][j]` elementwise (bit-exact)
- `transpose(transpose(A)) == A` bitwise
- `transpose(I) == I` for square identity

Reference helper

`transpose_rowmajor(src, rows, cols) -> Option<Vec<f32>>` — pure-Rust row-major transpose, returns `None` on size mismatch.

Five Whys
See commit message — captures why bit-exact for transpose (memory-layout op, no float arithmetic) and why 2048×128 is hardcoded for TP-006 (production-relevant attention dim).
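The zero-tolerance compare behind TP-004 (AVX2 vs scalar) reduces to bitwise slice equality. A minimal sketch, where `bit_exact` is a hypothetical name for this kind of verdict helper:

```rust
/// TP-004-style verdict: two f32 buffers must agree bit-for-bit.
/// Uses to_bits() equality rather than an epsilon compare, so even
/// 1-ULP drift between the AVX2 and scalar paths fails the gate.
fn bit_exact(scalar: &[f32], simd: &[f32]) -> bool {
    scalar.len() == simd.len()
        && scalar
            .iter()
            .zip(simd)
            .all(|(a, b)| a.to_bits() == b.to_bits())
}

fn main() {
    let a = [1.0f32, -0.0, f32::MIN_POSITIVE];
    assert!(bit_exact(&a, &a));
    // One ULP of drift in a single lane is enough to fail.
    let drifted = [1.0f32, -0.0, f32::from_bits(f32::MIN_POSITIVE.to_bits() + 1)];
    assert!(!bit_exact(&a, &drifted));
}
```

Note that `to_bits()` also distinguishes `-0.0` from `0.0`, which an `==` or epsilon compare would treat as equal; for a pure permutation kernel that distinction is exactly the point.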
Test plan
`cargo test -p aprender-core --lib tp_001_006` — 28 passed

🤖 Generated with Claude Code