This repository contains a from-scratch Rust 2024 implementation of the compiler targeted by the PKU MiniC SysY Compiler Practice. It implements the required Koopa IR, RISC-V, and performance-mode interfaces, and completes the representative optimization scope currently aligned with PKU Lv9+.
The project is intentionally self-contained: the compiler crate has no external Rust dependencies, so it can build inside the PKU compiler-dev container without network access.
| Area | Status |
|---|---|
| SysY language coverage | Lv1-Lv9 functional coverage, including arrays, initializers, function calls, recursion, runtime calls, short-circuit logic, and control flow. |
| Required CLI | Implemented: -koopa, -riscv, and -perf. |
| Baseline backend | Emits RV32IM assembly for the full supported SysY subset. |
| Optimized backend | -perf first tries the allocated-SSA RISC-V fast path, then conservatively falls back to baseline RISC-V plus machine cleanup when needed. |
| Lv9+ optimization scope | PKU-aligned representative optimizations are complete under the current scope. Remaining work is optional generalization, observability, or performance-score tuning. |
| Validation snapshot | Local Rust checks pass; public Docker Lv1-Lv9 Koopa/RISC-V tests pass 130/130; public perf correctness passes 20/20. |
See docs/optimization.md for the optimization scope and docs/verification.md for command history.
compiler -koopa input.sy -o output.koopa
compiler -riscv input.sy -o output.s
compiler -perf input.sy -o output.s
-koopa emits Koopa text IR. -riscv emits baseline RV32IM assembly. -perf emits optimized RV32IM assembly and is the path used by the public performance suite.
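
The three modes share one invocation shape, `<mode> <input> -o <output>`. The following is a minimal sketch of how such a command line could be parsed with the standard library only. The names `Mode`, `Invocation`, `CliError`, and `parse_invocation` are hypothetical and are not taken from `crates/sysyc/src/bin/compiler.rs`; the debug dump flags are deliberately left out.

```rust
/// Output modes named in this README. All type and function names here
/// are illustrative assumptions, not the project's actual API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Koopa,
    Riscv,
    Perf,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Invocation {
    pub mode: Mode,
    pub input: String,
    pub output: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CliError {
    UnknownMode(String),
    MissingArgument(&'static str),
}

/// Parses `<mode> <input> -o <output>`; the program name is assumed to be
/// stripped already (for example via `std::env::args().skip(1)`).
pub fn parse_invocation(args: &[&str]) -> Result<Invocation, CliError> {
    let mode = match args.first().copied() {
        Some("-koopa") => Mode::Koopa,
        Some("-riscv") => Mode::Riscv,
        Some("-perf") => Mode::Perf,
        Some(other) => return Err(CliError::UnknownMode(other.to_string())),
        None => return Err(CliError::MissingArgument("mode")),
    };
    let input = args.get(1).ok_or(CliError::MissingArgument("input"))?;
    match (args.get(2).copied(), args.get(3)) {
        (Some("-o"), Some(output)) => Ok(Invocation {
            mode,
            input: input.to_string(),
            output: output.to_string(),
        }),
        _ => Err(CliError::MissingArgument("-o <output>")),
    }
}
```

Calling `parse_invocation(&["-perf", "input.sy", "-o", "output.s"])` yields an `Invocation` whose `mode` is `Mode::Perf`; an unrecognized first argument is reported as `CliError::UnknownMode`.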
Debug-oriented SSA dumps are also available:
compiler -perf input.sy --dump-ssa -o output.ssa
compiler -perf input.sy --dump-ssa-alloc -o output.alloc
compiler -perf input.sy --dump-ssa-riscv -o output.s
These dumps are intended for compiler development and regression tests. The official -perf product remains RISC-V assembly, not SSA text.
- Handwritten lexer, parser, AST, scopes, and semantic checks for the PKU SysY teaching language.
- Constant evaluation for scalar constants, array dimensions, and global initializers.
- Correct lowering for short-circuit `&&` and `||`.
- SysY globals, locals, arrays, nested brace initializers, array parameters, partial array decay for calls, recursion, and runtime library calls.
- Koopa text backend for official IR validation.
- Full baseline RV32IM backend for required `-riscv` output.
- SSA infrastructure with Koopa-style basic block arguments, verifier, CFG analysis, dominance analysis, scalar mem2reg, and block-argument lowering.
- Lowered-SSA liveness, deterministic linear-scan register allocation, spill-slot assignment and reuse, and allocated pseudo rendering.
- Allocated-SSA RV32IM renderer for the supported fast-path subset, including scalar globals, local/global arrays, function calls, and stack call arguments.
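
As an illustration of the constant-evaluation and short-circuit items above, here is a minimal sketch of a constant-expression evaluator. The `Expr` type and `eval_const` function are hypothetical and much smaller than a real SysY AST. Wrapping 32-bit arithmetic and rejecting division by zero are assumptions of this sketch, not documented behavior of the project.

```rust
/// A deliberately tiny expression tree (hypothetical, for illustration).
#[derive(Debug, Clone)]
pub enum Expr {
    Num(i32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
    Div(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Evaluates a constant expression, returning `None` when the value is
/// not well defined (division by zero or `i32::MIN / -1`).
pub fn eval_const(e: &Expr) -> Option<i32> {
    Some(match e {
        Expr::Num(n) => *n,
        Expr::Add(a, b) => eval_const(a)?.wrapping_add(eval_const(b)?),
        Expr::Mul(a, b) => eval_const(a)?.wrapping_mul(eval_const(b)?),
        Expr::Div(a, b) => eval_const(a)?.checked_div(eval_const(b)?)?,
        // Short-circuit: the right operand is evaluated only when the
        // left operand does not already decide the result.
        Expr::And(a, b) => {
            if eval_const(a)? == 0 {
                0
            } else {
                (eval_const(b)? != 0) as i32
            }
        }
        Expr::Or(a, b) => {
            if eval_const(a)? != 0 {
                1
            } else {
                (eval_const(b)? != 0) as i32
            }
        }
    })
}
```

In this sketch, `0 && (1 / 0)` evaluates to `Some(0)` because the division is never visited, whereas `1 / 0` on its own evaluates to `None`.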
The current optimizer is representative rather than research-grade. It covers the optimization families discussed by the PKU Lv9+ optional chapter while keeping conservative correctness boundaries.
Machine-independent SSA/scalar optimization includes:
- Scalar DCE, constant folding, algebraic simplification, SCCP-style constant propagation, constant-branch simplification, dominance-scoped scalar GVN, and conservative scalar PRE.
- Conservative scalar LICM, pure single-block scalar inlining, CFG block merging, forwarding-block removal, unreachable-block removal, and block-argument DCE.
- Static counted-loop unrolling for small pure scalar loops.
- Guarded unknown-trip-count loop unroll-by-2 with explicit tail semantics.
- Representative phi-aware induction-variable strength reduction for direct `%iv * const` shapes, including factor-left/right cases and additive or subtractive steps.
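
To make the last item concrete, the sketch below shows the source-level effect of induction-variable strength reduction on an `%iv * const` shape. It is written as two ordinary Rust functions rather than as an SSA pass, and the function names are hypothetical. The first recomputes `i * factor` on every iteration; the second carries a running value that advances by `factor` each time the induction variable advances by one.

```rust
/// Original loop shape: each iteration multiplies the induction variable.
pub fn sum_scaled_mul(n: i32, factor: i32) -> i32 {
    let mut sum = 0i32;
    let mut i = 0i32;
    while i < n {
        sum = sum.wrapping_add(i.wrapping_mul(factor));
        i += 1;
    }
    sum
}

/// Strength-reduced shape: the multiply is replaced by an addition on a
/// second loop-carried value.
pub fn sum_scaled_reduced(n: i32, factor: i32) -> i32 {
    let mut sum = 0i32;
    let mut i = 0i32;
    let mut scaled = 0i32; // invariant: scaled == i * factor (wrapping)
    while i < n {
        sum = sum.wrapping_add(scaled);
        i += 1;
        scaled = scaled.wrapping_add(factor);
    }
    sum
}
```

Both functions return the same value for every `n` and `factor`, including when the intermediate products wrap, because wrapping addition and multiplication agree modulo 2^32.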
Machine-dependent optimization includes:
- Unreachable-instruction, fallthrough-jump, jump-chain, unused-label, and static-branch cleanup.
- Conditional-branch inversion for jump-over-fallthrough patterns.
- No-op move cleanup and transparent move propagation.
- Power-of-two multiply strength reduction, zero-register arithmetic simplification, repeated `li` cleanup, and small-immediate ALU folding.
- Redundant load forwarding, store/load forwarding, conservative dead-store elimination, and same-base word-address disjointness checks.
- Bounded-lookahead load-use instruction scheduling.
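
As one example from this list, power-of-two multiply strength reduction can be sketched as a peephole rewrite over a tiny instruction model. The `Inst` enum, the `MulImm` pseudo-instruction, and the use of `t0` as a scratch register are assumptions made for this illustration; they do not describe the project's internal machine representation.

```rust
/// A two-instruction model (hypothetical). `MulImm` stands for
/// "multiply by a known constant", which RV32IM has no single
/// instruction for; `Slli` is the real RV32I shift-left-immediate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Inst {
    MulImm { rd: &'static str, rs: &'static str, imm: i32 },
    Slli { rd: &'static str, rs: &'static str, shamt: u32 },
}

/// Rewrites a multiply by a positive power of two into a left shift.
/// Other instructions are returned unchanged.
pub fn reduce_mul(inst: Inst) -> Inst {
    match inst {
        Inst::MulImm { rd, rs, imm } if imm > 0 && (imm as u32).is_power_of_two() => {
            // For imm == 2^k, trailing_zeros() is exactly k.
            Inst::Slli { rd, rs, shamt: imm.trailing_zeros() }
        }
        other => other,
    }
}

/// Renders the model as assembly text. The multiply form materializes
/// the constant in `t0` (assumed free here) before the `mul`.
pub fn render(inst: &Inst) -> String {
    match inst {
        Inst::MulImm { rd, rs, imm } => format!("li t0, {imm}\nmul {rd}, {rs}, t0"),
        Inst::Slli { rd, rs, shamt } => format!("slli {rd}, {rs}, {shamt}"),
    }
}
```

With these definitions, a multiply of `a1` by 8 into `a0` renders as `slli a0, a1, 3` after the rewrite, while a multiply by 6 keeps its `li` plus `mul` form.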
compiler -perf is designed to optimize first and degrade only when the conservative SSA backend cannot safely handle a program shape:
- Parse and type-check the SysY source.
- Try the allocated-SSA RISC-V fast path for supported whole-program shapes.
- Run scalar SSA optimizations, block-argument lowering, liveness, linear-scan allocation, and RISC-V rendering.
- Run the machine optimizer over SSA-generated assembly.
- If the SSA path rejects the program, fall back to the baseline RISC-V backend and still run the machine optimizer.
This means fallback is correctness-preserving and still optimized at the machine level. The main remaining backend generalization would be a hybrid per-function fast path instead of the current conservative whole-program fallback boundary.
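
The control flow described above can be sketched as follows. Every name here (`Program`, `Unsupported`, `ssa_fast_path`, `baseline_riscv`, `machine_optimize`, `compile_perf`) is a hypothetical stand-in, and the stage bodies are stubs that return fixed assembly text. The only point illustrated is that the machine optimizer runs on both paths.

```rust
/// Stand-in for a type-checked program (hypothetical).
pub struct Program {
    pub has_unsupported_shape: bool,
}

/// Reason the SSA fast path declined a program.
#[derive(Debug, PartialEq, Eq)]
pub struct Unsupported(pub &'static str);

#[derive(Debug, PartialEq, Eq)]
pub enum Path {
    FastPath,
    Fallback,
}

/// Stub: a real fast path would run SSA optimization, allocation, and rendering.
fn ssa_fast_path(p: &Program) -> Result<String, Unsupported> {
    if p.has_unsupported_shape {
        Err(Unsupported("shape not handled by the SSA backend"))
    } else {
        Ok("li a0, 0\nmv a0, a0\nret".to_string())
    }
}

/// Stub: a real baseline backend would handle every supported program.
fn baseline_riscv(_p: &Program) -> String {
    "li t0, 0\nmv a0, t0\nmv a0, a0\nret".to_string()
}

fn is_noop_move(line: &str) -> bool {
    line.trim()
        .strip_prefix("mv ")
        .and_then(|ops| ops.split_once(','))
        .map_or(false, |(rd, rs)| rd.trim() == rs.trim())
}

/// Stub machine pass: drops `mv x, x` lines only.
fn machine_optimize(asm: String) -> String {
    asm.lines()
        .filter(|line| !is_noop_move(line))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Tries the fast path, falls back to baseline on rejection, and runs
/// the machine optimizer in both cases.
pub fn compile_perf(p: &Program) -> (Path, String) {
    let (path, asm) = match ssa_fast_path(p) {
        Ok(asm) => (Path::FastPath, asm),
        Err(_) => (Path::Fallback, baseline_riscv(p)),
    };
    (path, machine_optimize(asm))
}
```

In this sketch the decision is made once for the whole program, which mirrors the whole-program fallback boundary described above; a per-function hybrid would move the `match` inside a loop over functions.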
Local checks:
cargo fmt --check
cargo test
cargo clippy --all-targets --all-features --locked -- -D warnings

Official-style Docker checks:
docker run --rm -v "$PWD:/root/compiler" maxxing/compiler-dev \
autotest -koopa /root/compiler
docker run --rm -v "$PWD:/root/compiler" maxxing/compiler-dev \
autotest -riscv /root/compiler
docker run --rm -v "$PWD:/root/compiler" maxxing/compiler-dev \
autotest -perf -s perf /root/compiler

Latest recorded closeout validation on 2026-05-28:
- `cargo fmt --check`: passed.
- `cargo test`: passed.
- `cargo clippy --all-targets --all-features --locked -- -D warnings`: passed.
- Docker `autotest -koopa /root/compiler`: 130/130 passed.
- Docker `autotest -riscv /root/compiler`: 130/130 passed.
- Docker `autotest -perf -s perf /root/compiler`: 20/20 passed.
The public perf harness reports correctness and coarse local timing only; it should not be treated as a hidden-suite score predictor. See docs/perf-baseline.md for the current timing notes, fast-path/fallback matrix, and noise threshold.
crates/sysyc/src/frontend/ SysY lexing, parsing, AST, and semantic input
crates/sysyc/src/backend/ Koopa, baseline RISC-V, SSA, and machine passes
crates/sysyc/src/bin/compiler.rs Required CLI entry point
crates/sysyc/tests/ Integration smoke tests
docs/optimization.md Current PKU Lv9+ optimization scope
docs/verification.md Validation history and evidence
docs/perf-baseline.md Public perf-suite baseline notes
docs/progress.md Implementation progress log
This project is licensed under the Apache License, Version 2.0. See LICENSE for the full text.
This is a teaching-compiler implementation aligned with the PKU MiniC SysY practice target. It does not copy or port existing SysY compiler implementations. Official PKU tools and public fixtures are used only for validation.
The current Lv9+ optimization work is closed at representative scope. Broader research-grade work, such as graph-coloring allocation, full aggregate or address-taken mem2reg, whole-program pointer analysis, exhaustive SSA optimization coverage, or a complete replacement of the baseline backend, is intentionally out of scope unless the project goal is broadened again.