release: v2.9.0 by octalide · Pull Request #1675 · briar-systems/mach

octalide · 2026-06-27T04:49:48Z

Integration merge of dev into main for the v2.9.0 release.

Since v2.8.0:

Darwin OS + Mach-O dynamic executables: both x86_64-darwin and aarch64-darwin triples (target OS: macOS / Darwin (runtime, syscalls, entry) #1178)
riscv64 RV64A atomic inline-asm mnemonics (riscv64 inline-asm: RV64A atomic mnemonics (lr/sc, amo*, ordering suffixes) #1668)
riscv64 backend hardening: branch relaxation (codegen(riscv64): no long-branch relaxation; out-of-range B-type branch fails to encode #1666), frame-slot overlap (codegen(riscv64): local frame slots overlap the saved ra/s0 record (silent SIGSEGV with any stack local) #1670), 32-bit and/or/xor SIGILL (codegen(riscv64): 32-bit and/or/xor encode illegal word-group ops (SIGILL) #1672)
constant-time secret-qualifier flow typing (sema: secret-qualifier flow typing — join, gate, welded-storage pointers #1645)

Tag v2.9.0 pushed on the merge commit. Makes riscv64 compile + run real std, unblocking mach-std #308/#307.

…ed pointers Implement the information-flow type rules for the `^` secret qualifier (#1645), building on the frontend tokens/AST from #1644. - type universe: TYPE_SECRET wraps an inner type as the top of a two-point secrecy lattice; intern_secret dedups and collapses `^^T`. Classifier predicates (is_integer/is_numeric/is_float/is_pointer_like) and byte_size see through `^`; carries_secret and secrecy_structure_equal back the welded rules. - resolve: bind `^T` inner names. - infer: resolve `^T` to TYPE_SECRET; JOIN secrecy through binary/unary results; taint field/element reads from secret containers; `:^` strip (the sole downgrade); secret coerces up the lattice with no syntax. - gates: secret branch/loop condition, secret memory index, secret `/`/`%` operand, and secret `&&`/`||` operand are compile errors with teaching notes. - welded storage: `*^T` cannot erase to `ptr`, `::`/`:~` cannot add or drop `^`, and a `uni`'s overlapping variants must agree on secrecy. - lowering: secrecy has no runtime representation, so `^T` lowers to `T`'s shape and `:^` is a value passthrough; mangling and generic substitution thread it. The `#[oblivious]` contract and the secret-taint-to-MIR are out of scope here and land in #1646.

- driver: end-to-end sema tests for the secret qualifier (#1645) - the clean join/coerce/strip program, the four leakage gates, `::`/`:~` secrecy preservation, welded-pointer non-launderability, and uni secrecy agreement, each with a passing public counterpart. - doc/language: new secrecy.md reference (lattice, join, gates, `:^`, welded pointers, trusted base); grammar/operators/types/uni cross-link it and drop the "not yet enforced" notes now that sema enforces the rules.

…typing

The darwin os vtable advertised "_main" as the program entry, but the std darwin runtime defines the entry as "start" (the conventional Mach-O thread entry that reads the raw kernel stack). Linking a real darwin executable therefore failed to resolve the entry. Point the vtable at "start" so the linker keys the entry symbol the runtime provides. Refs #1178

`&&` / `||` short-circuit by branching on the LEFT operand (the right is evaluated conditionally), so only a secret left operand is the secret-dependent branch the leakage model observes. A secret right operand merely taints the result via the join. Narrowing the gate from either-operand to left-only stops rejecting the sound `public && secret` while still catching the real branch leak. Test matrix and secrecy.md updated to match.

Implements emit_dyn_exec for the Mach-O format, framing a dyld-loadable MH_EXECUTE: a leading __PAGEZERO, a __TEXT that maps the mach header below the linker's text segment so dyld can read the load commands, the linker's segments, synthesized __STUBS/__GOT segments for function imports, and a __LINKEDIT carrying an LC_DYLD_INFO_ONLY bind stream plus the symbol/string tables. Emits LC_LOAD_DYLINKER (/usr/lib/dyld), one LC_LOAD_DYLIB per dependency, LC_SYMTAB/LC_DYSYMTAB, and the static LC_UNIXTHREAD entry. Each function import's got slot is bound non-lazily and every deferred call site is redirected to a per-arch import stub (x86_64 rip-relative jump, arm64 adrp/ldr/br). Both architectures are supported; the writer is registered on the vtable. Also round-trips a symbol's function-ness through the relocatable object: an undefined symbol carries no n_type class, so emit_object records it as REFERENCE_FLAG_UNDEFINED_LAZY in n_desc and parse_object recovers it, which the linker reads to give a dynamic function import its call stub. Unit tests cover the dynamic-exec skeleton for both architectures. Refs #1178

A `.dylib` link input against a Mach-O target is now recorded as an LC_LOAD_DYLIB dependency by its install name, with no on-disk probe (dyld resolves it at run time, and a system dylib is absent on a cross host). This mirrors the existing PE `.dll` handling and lets a darwin program declare a dynamic dependency, driving the linker down the dynamic-exec path. A `.dylib` against a non-Mach-O target is rejected. Refs #1178

Updates the static fixture and verify lane for the "start" entry symbol, and adds a dynamic fixture (a libprobe.dylib dependency and a probe_add import) that drives emit_dyn_exec. The verify lane now cross-compiles both fixtures for both Darwin triples and byte-verifies the dynamic load commands (LC_LOAD_DYLINKER, LC_LOAD_DYLIB, LC_DYLD_INFO_ONLY bind stream, LC_SYMTAB/LC_DYSYMTAB) and the import stub with the llvm tools. No execution step: Darwin binaries cannot run on the Linux host, and an arm64 Darwin binary additionally needs a code signature this writer does not emit. Refs #1178

feat(sema): secret-qualifier flow typing - join, gate, welded-storage pointers

Extend the riscv64 inline-asm assembler with the A-extension subset that std's riscv64 atomic arms need: lr.w/lr.d, sc.w/sc.d, the amo* read- modify-writes (swap/add/and/or/xor/max/min/maxu/minu) in .w and .d, the .aq/.rl/.aqrl ordering suffixes, and pause for spin_hint. The AMO major opcode (0x2F) is R-type with funct5 in bits [31:27] and the aq/rl bits in [26:25], so the assembled funct7 is (funct5 << 2) | (aq << 1) | rl. Atomics take the base-only (rs1) address form with no displacement, unlike regular loads and stores. lr is the two-operand rd, (rs1) form; sc and the amo* forms are rd, rs2, (rs1). The suffixes are parsed inline so a future rv32 reuses the .w forms. A byte-test covers every mnemonic and ordering variant against llvm-mc -mattr=+a, plus the no-displacement and unsupported-prefix rejections. Part of #1668.

# Conflicts: # src/lang/target/of/macho.mach

…nalyzer The inline-asm clobber analyzer drove the new atomic mnemonics to the conservative all-register fallback. Classify them precisely: lr / sc / amo* each write their rd operand (the leftmost register), and pause writes nothing like fence. A prefix test (lr. / sc. / amo) covers the whole family across its width and ordering suffixes, so the allocator no longer over-preserves around an atomic block. Extends the clobber test. Part of #1668.

feat(target): macOS / Darwin OS support (runtime entry, syscalls, dynamic exec)

RISC-V B-type conditional branches reach only +-4 KiB and J-type jumps only +-1 MiB, so a function whose encoded body exceeds those ranges could not encode: the patcher rejected the overflowing displacement instead of relaxing it. A large stack frame (>2 KiB offsets expanding into multi-instruction sequences) inflates a body past 4 KiB and overflows a conditional branch spanning it. Add branch relaxation to the riscv64 encoder. A conditional past +-4 KiB becomes its inverted guard (a short branch skipping the trampoline) plus a jal to the real target; a jal past +-1 MiB becomes an auipc+jalr pc-relative trampoline. The relaxation runs as a per-function fixpoint after the body is emitted: each growth opens a text gap that slides the rest of the function down and fixes the block / fixup / symbol / relocation tables, so each rescan remeasures against the grown layout. The new shared encode.insert_text owns the ISA-general gap mechanism; encode.block_offset is made public so the pass can resolve targets. Closes #1666

Any riscv64 function with an address-taken local or a spill / aggregate slot (frame.size > 0) corrupted its own saved return address and segfaulted at runtime. The prologue pins s0 at the frame top, with the 16-byte ra / s0 record and the callee-saved areas occupying the bytes immediately below it, but the shared frame phase assigns local slot offsets just below the frame pointer (the x86 model). On riscv64 those landed on top of the saved record - e.g. a [256]u8 buffer at s0-256 while ra was saved at s0-8, so zeroing the buffer clobbered ra. Bias the s0-relative slot offset down by frame_reserved_top (16 + callee-saved GP + FP bytes) so locals land below the reserved region. Localized to frame_slot_offset, mirrored in the assembly printer; s0 stays at the frame top so incoming stack-argument access is unchanged. It byte-encoded fine, so byte-verify never caught it and the register-only freestanding fixtures never exercised it.

Extend the freestanding rv64 fixture with the RV64A atomics: an lr.d/sc.d compare-and-swap loop swaps a stack cell 0 -> 7 (the reservation-retry and mismatch arms are encoded though the fast path takes neither), then amoadd.d adds 4 and returns the swapped value. the folded result (prior 7 + post-rmw 11 = 18) is added to the exit code, so a wrong atomic encoding or a lost store changes it. the atomics run in _start, matching how the fixture already emits its inline-asm exit. verify.sh asserts the new exit code 71 and that lr.d / sc.d / amoadd.d disassemble as the real instructions (--mattr=+a enables the A-extension decode so the words are not <unknown>). Part of #1668.

Extend the freestanding rv64 fixture with two probes folded into the qemu exit code, so a regression in either fix changes the asserted code: - parse_probe: a deterministic parse over a [2048]u8 stack buffer. The large frame inflates the encoded body past 4KiB and leaves a conditional branch out of the B-type +-4KiB range, exercising long-branch relaxation (#1666). verify.sh asserts the inverted-guard + jal sequence is present. - stack_probe: a small stack-local buffer written then read back, exercising frame-slot addressing (#1670). The result matches the unrelaxed x86_64 / aarch64 build (exit code 68). verify.sh: disassemble with --mattr=+m,+f,+d so the <unknown> guard does not false-positive on valid M / F / D words (the backend emits no .riscv.attributes ISA string yet), and read grep inputs from here-strings so a large disassembly does not trip a SIGPIPE under pipefail.

feat(riscv64): RV64A atomic inline-asm mnemonics (lr/sc, amo*, ordering suffixes)

…h-relax # Conflicts: # test/riscv64/src/main.mach # test/riscv64/verify.sh

A 32-bit bitwise and / or / xor encoded an illegal instruction and faulted with SIGILL at runtime (surfaced building std crypto / bignum, which is u32-heavy). encode_alu3 selected the major opcode through alu_opcode(width), which returns OP_32 (the RV64 word group) at 32-bit width - but RISC-V defines no andw / orw / xorw, so the word matched no encoding (funct3 6/7/4 with funct7 0 in OP_32 is reserved). Bitwise ops have no word form: they are bit-for-bit identical at any width, and a bitwise op of two consistently extended 32-bit operands yields a consistently extended result. Thread a word_form flag through encode_alu3 so mul keeps its mulw word form while and / or / xor always encode as the full-register OP. add / sub / shifts / mul / div already use valid word forms, so only the logical ops were affected. Verified: a std sha256 / sha512 / bignum consumer cross-compiled for linux-riscv64 now runs under qemu to the same result as the native build, and a 32-bit bitwise probe in the fixture matches the native value.

fix(riscv64): relax out-of-range branches + frame-slot overlap (#1666)

chore(release): v2.9.0

octalide added 23 commits June 26, 2026 22:54

Merge remote-tracking branch 'origin/dev' into feat/1645-secret-flow-…

c46a60e

…typing

Merge pull request #1662 from octalide/feat/1645-secret-flow-typing

296e533

feat(sema): secret-qualifier flow typing - join, gate, welded-storage pointers

Merge remote-tracking branch 'origin/dev' into feat/1178-darwin-os

5a22a34

# Conflicts: # src/lang/target/of/macho.mach

Merge pull request #1667 from octalide/feat/1178-darwin-os

3877252

feat(target): macOS / Darwin OS support (runtime entry, syscalls, dynamic exec)

Merge pull request #1669 from octalide/feat/1668-riscv64-rv64a-asm

b3ff706

feat(riscv64): RV64A atomic inline-asm mnemonics (lr/sc, amo*, ordering suffixes)

Merge remote-tracking branch 'origin/dev' into fix/1666-riscv64-branc…

61d081f

…h-relax # Conflicts: # test/riscv64/src/main.mach # test/riscv64/verify.sh

Merge pull request #1671 from octalide/fix/1666-riscv64-branch-relax

5afdbdc

fix(riscv64): relax out-of-range branches + frame-slot overlap (#1666)

chore(release): v2.9.0

99b10d1

Merge pull request #1674 from octalide/chore/release-2.9.0

a07382f

chore(release): v2.9.0

octalide merged commit f76d4ee into main Jun 27, 2026
10 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

release: v2.9.0#1675

release: v2.9.0#1675
octalide merged 23 commits into
mainfrom
dev

octalide commented Jun 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Uh oh!

Conversation

octalide commented Jun 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant