Tracking: WASI hardening cycle 2 — wave 12+ follow-ups (post-#583) #616

@cataggar

Continuation of the WASI hardening cycle that began with #583 (closed after waves 7-11). The Preview 3 surface is at 40/40 conformance + 99.1 % (338/341) WIT-method coverage; the remaining work falls into five buckets:

| Bucket | Description |
|--------|-------------|
| A | Direct follow-ups from waves 7-11 PRs (small, well-scoped) |
| B | wasi:threads (legacy P1 surface): multi-PR sub-wave per docs/design/wasi-threads.md (PR #603) |
| C | Maintenance cadence (parity poll, bumps, audit refresh) |
| D | Investigations (noise / flakes / disk hygiene) |
| E | Future scope (await upstream; track but don't start yet) |

Each item below carries enough context that a future wave-N agent can pick it up directly without re-deriving the scope.


A. Direct follow-ups from waves 7-11

  • A1 Honour outbound HTTP request-options timeouts. PR #612 ("feat(wasi:http/types@0.2): 7 missing P2 audit arms (#583)") wired the 6 P2 timeout getters/setters into the WIT surface, but the worker (httpClientLowLevelFetch, from PR #594, "feat(wasi:http): surface outbound response headers (#583 A4)") doesn't consume them. Pass connect-timeout / first-byte-timeout / between-bytes-timeout through to std.http.Client.Request (its timeout_options field), and enforce each of the new fields only when the guest has set it. Add a fixture-style test: configure a 10 ms timeout, target a slow server, expect connection-timeout. Symmetric work applies to P3 if 0.3 ships the same surface.

  • A2 Kernel splice(2) for output-stream.splice. PR #615 ("feat(wasi:io, wasi:sockets): 8 missing P2 audit arms (#583)") shipped a buffer-through MVP. The Linux fast-path is splice(in_fd, NULL, out_fd, NULL, len, 0). Detect when both src/dst are real fds (vs in-memory streams) and use the syscall; preserve the buffer-through path as fallback. Out of scope: macOS/Windows (no equivalent, so keep buffer-through there).

  • A3 HTTPS handshake when std lands std.crypto.tls.Server. Tracked under issue #609 ("wasi:http@0.3 incoming-handler: HTTPS termination blocked on upstream Zig 0.16 std (#583)"). The scaffolding from PR #614 ("feat(wasi:http@0.3): HTTPS incoming-handler TLS plumbing (#583 / #609)"), which covers cert/key load, CLI flags, and ServeHttpOptions.tls_config, is ready; the actual handshake just plugs into the placeholder HttpsTlsConfig.handshake(fd). Monitor upstream Zig releases; when the server-side API ships, flip this PR.

  • A4 W7-5 wrapper "circular-import" investigation. Multiple agents (W8-1, W10-4, W11-6) reported transient Python "circular-import" failures when running wasi-p3-testsuite via tests/wasi-testsuite-runner-patch/wasi_test_runner.py. The host (me) couldn't reproduce them when checking directly. The likely cause is a stale __pycache__ / cache issue. Action items:

    1. Add find . -name __pycache__ -exec rm -rf {} + to the wrapper bootstrap.
    2. If it persists, switch the wrapper from import wasi_testsuite (or whatever does the circular-import-prone thing) to importlib-based lazy loading.
    3. Verify on a fresh worktree.
  • A5 Tighten wasi-microbench budgets + flip continue-on-error: false. PR #611 ("perf(wasi): CoreMark-aware WASI micro-bench infrastructure (#583)") shipped with 1.5× initial median budgets and continue-on-error: true. After a week or two of CI data, recalibrate budgets to actual measured medians + 2σ noise margin, and flip the gate to required. Expected initial pass-rate ≥ 95 % on the recalibrated budgets.

  • A6 Disk-backed wasi:keyvalue (atomic-rename write-through). PR #608 ("feat(wasi:keyvalue@0.2.x): real CAS + optional file-backed persistence (#583 B4 follow-up)") shipped a synchronous JSON-file rewrite on every mutation; its documentation explicitly notes "atomic-rename future hardening." Implement tmp-file + rename(2) for crash-safety. Optionally add fsync between write and rename for durability (off by default, since most testsuite uses don't need the fsync overhead).

  • A7 Finer outbound HTTP cancel via per-TLS-record polling. PR #602 ("feat(wasi:http): finer-grained outbound cancel via Request-phase polling (#583 follow-up)") polls cancel between the connect / send-head / receive-head / read-body / per-64-KiB-body-chunk phases. The bottleneck is TLS record processing: a single Request.receiveHead may pump multiple TLS records over many ms. Wrap the underlying Reader with a cancel-aware adapter that polls between TLS records. May require coordinating with W7-3's error-mapping table.

  • A8 HTTP/1.1 response-body cancel. Symmetric to A7: when the host is writing a large response back to a guest, observe task.cancel on the guest side and short-circuit the writer.
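For A2, the fast-path-with-fallback shape might look roughly like the Python sketch below. It is only an illustration, not the runtime's implementation: copy_stream is a hypothetical name, and os.splice (Python 3.10+, Linux) stands in for the raw syscall. One detail worth keeping in mind is that splice(2) requires at least one of the two fds to be a pipe, so "both ends are real fds" is necessary but not sufficient; the sketch treats a failed splice as the signal to drop to buffer-through.

```python
import os

def copy_stream(in_fd, out_fd, length, chunk=64 * 1024):
    """Move up to `length` bytes from in_fd to out_fd (hypothetical helper).

    Tries the kernel splice fast-path first and falls back to a plain
    read/write loop if splice is unavailable or rejects the fd pair.
    """
    moved = 0
    use_splice = hasattr(os, "splice")  # missing on macOS/Windows
    while moved < length:
        want = min(chunk, length - moved)
        if use_splice:
            try:
                n = os.splice(in_fd, out_fd, want)
            except OSError:
                # e.g. EINVAL when neither fd is a pipe; nothing was
                # consumed, so retry this chunk on the fallback path.
                use_splice = False
                continue
        else:
            data = os.read(in_fd, want)
            n = len(data)
            view = memoryview(data)
            while len(view) > 0:  # handle short writes
                written = os.write(out_fd, view)
                view = view[written:]
        if n == 0:  # end of input
            break
        moved += n
    return moved
```

Real errors such as a bad fd resurface on the fallback path, so swallowing the first OSError does not hide them.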
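For A4, the first two action items could be sketched as follows. Both helper names (purge_pycache, lazy_module) are hypothetical and not taken from wasi_test_runner.py; the lazy loader simply defers importlib.import_module until first use, which is one common way to sidestep import-order cycles.

```python
import importlib
import shutil
from pathlib import Path

def purge_pycache(root):
    """Delete every __pycache__ directory under `root`; return the count."""
    removed = 0
    # Materialise the list first so deletion does not disturb the walk.
    for cache_dir in list(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir, ignore_errors=True)
            removed += 1
    return removed

_loaded = {}

def lazy_module(name):
    """Import `name` on first use instead of at module import time."""
    module = _loaded.get(name)
    if module is None:
        module = importlib.import_module(name)
        _loaded[name] = module
    return module
```

Calling purge_pycache at wrapper bootstrap mirrors the find ... -exec rm -rf one-liner without depending on a POSIX shell.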
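For A5, the "median + 2σ" recalibration is simple arithmetic; a minimal sketch, assuming per-benchmark timing samples are available as a list of numbers (the function names and the sample values are illustrative only):

```python
import statistics

def recalibrated_budget(samples, sigmas=2.0):
    """Budget = median of the samples plus `sigmas` sample standard deviations."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate noise")
    return statistics.median(samples) + sigmas * statistics.stdev(samples)

def pass_rate(samples, budget):
    """Fraction of samples that would have passed under `budget`."""
    return sum(1 for s in samples if s <= budget) / len(samples)

# Illustrative numbers, not real CI data.
timings_ms = [100, 102, 98, 101, 99]
budget = recalibrated_budget(timings_ms)
rate = pass_rate(timings_ms, budget)
```

Checking pass_rate against the historical samples before flipping the gate gives a quick read on whether the ≥ 95 % expectation is realistic.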
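For A6, the tmp-file + rename pattern could be sketched as below. atomic_write_json and its durable flag are hypothetical names; the key points are that the temporary file lives in the same directory as the target (so the rename stays on one filesystem) and that fsync is opt-in.

```python
import json
import os
import tempfile

def atomic_write_json(path, obj, durable=False):
    """Write `obj` as JSON to `path` via a temp file and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".kv-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(obj, f)
            f.flush()
            if durable:
                os.fsync(f.fileno())  # optional: survive power loss
        os.replace(tmp_path, path)  # atomic on POSIX within one filesystem
    except BaseException:
        # Never leave a half-written temp file behind.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader that opens the target concurrently sees either the old or the new contents, never a partial write.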
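For A7 (and, mirrored on the write side, A8), the cancel-aware adapter idea can be illustrated with a small wrapper that checks a cancel flag before every bounded read. This is a sketch under assumptions: CancelAwareReader and Cancelled are hypothetical names, a threading.Event stands in for the task's cancel token, and the 16 KiB cap reflects the maximum TLS record plaintext size.

```python
import threading

class Cancelled(Exception):
    """Raised when the cancel flag is observed between reads."""

class CancelAwareReader:
    """Wraps a file-like reader and polls `cancel` before each bounded read."""

    def __init__(self, inner, cancel, max_chunk=16 * 1024):
        self._inner = inner
        self._cancel = cancel
        self._max_chunk = max_chunk  # roughly one TLS record of plaintext

    def read(self, n):
        if self._cancel.is_set():
            raise Cancelled()
        # Cap each read so cancel is re-checked at record granularity.
        return self._inner.read(min(n, self._max_chunk))
```

Callers that want `n` bytes loop over read, which is exactly where the extra cancel polls come from.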

B. wasi:threads (legacy Preview 1 surface) — multi-PR sub-wave

Per docs/design/wasi-threads.md (PR #603, recommendation Option B: one Interpreter/std.Thread per WASI thread + shared linmem + atomics). Execute in order — each wave gates the next.

  • B1.1 Wave 1: Thread-safe resource tables. Wrap every *_table access in WasiCliAdapter and ComponentInstance with a mutex; gate behind comptime so the single-threaded build keeps zero-overhead. Required deliverable: measured single-threaded coremark perf delta < 2 %. If the delta exceeds 2 %, use per-cache-line padded RW-locks or rethink the locking granularity.

  • B1.2 Wave 2: std.Thread spawning wrapper. Productionise the existing thread_manager.zig prototype mentioned in the design doc. Each spawned thread gets its own Interpreter instance pointing at the shared Memory. Add a smoke fixture: spawn 2 threads, each increments a guest counter, both join, expect 2.

  • B1.3 Wave 3: Wasm threads proposal opcode support. Replace the currently non-atomic atomic_prefix opcodes (search 0xFE in src/runtime/interpreter/) with @atomicLoad / @atomicStore / @atomicRmw / @cmpxchgStrong. Implement i32.atomic.wait / i64.atomic.wait / i32.atomic.notify (spelled memory.atomic.wait32 / memory.atomic.wait64 / memory.atomic.notify in the current threads proposal) against a host-side wait-queue keyed by (memory_base, addr). Fix the atomic.fence no-op.

  • B1.4 Wave 4: thread-spawn host import binding. Bind the Preview 1 thread-spawn core-module import (module wasi, name thread-spawn). Add a pthread end-to-end fixture (tests/wasi-threads/).

  • B1.5 Wave 5: Cancellation across threads. Split the existing trap_flag into trap_flag + cancel_flag on ThreadManager. task.cancel from one thread propagates to siblings.

  • B1.6 Wave 6: Per-thread wasi_ctx slot + TLS. Per-thread WASI context fields (current-task-manager, current-cancel-token) move from ComponentInstanceExecEnv. Thread-local storage via the wasm threads proposal's thread.local globals.

  • B1.7 Wave 7+: shared-everything-threads canon built-ins when upstream stabilises them (current target: Component Model post-Preview-3 epoch). Treat as a future-compat checkpoint until then.
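The host-side wait-queue from B1.3 can be modelled compactly. The Python sketch below is only a behavioural illustration of the wait / notify semantics (compare-then-park under one lock, FIFO wake-up, three wait outcomes); WaitQueue and its method names are hypothetical, and a real implementation would read the expected value with an atomic load from linear memory rather than through a callback.

```python
import threading
from collections import defaultdict, deque

class WaitQueue:
    """Waiters parked per key, where key is e.g. (memory_base, addr)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._waiters = defaultdict(deque)  # key -> FIFO of Events

    def wait(self, key, load_value, expected, timeout=None):
        """Return "ok", "not-equal", or "timed-out" like the wait opcodes."""
        with self._lock:
            # Compare and enqueue under the same lock so a concurrent
            # notify cannot slip between the check and the park.
            if load_value() != expected:
                return "not-equal"
            event = threading.Event()
            self._waiters[key].append(event)
        if event.wait(timeout):
            return "ok"
        with self._lock:
            if event.is_set():  # notify raced with the timeout
                return "ok"
            queue = self._waiters[key]
            queue.remove(event)
            if not queue:
                del self._waiters[key]
            return "timed-out"

    def notify(self, key, count):
        """Wake up to `count` waiters on `key`; return how many were woken."""
        woken = 0
        with self._lock:
            queue = self._waiters.get(key)
            while queue and woken < count:
                queue.popleft().set()
                woken += 1
            if queue is not None and not queue:
                del self._waiters[key]
        return woken
```

Keying on (memory_base, addr) rather than addr alone keeps waiters on different shared memories from colliding.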

C. Maintenance cadence

  • C1 Quarterly WIT-vs-impl audit refresh. Re-run the methodology from PR #604 ("docs(wasi): WIT-vs-impl audit document (#583 D)") / docs/wasi-impl-audit.md. Use the existing reproducible-grep section as the recipe. Flag any new ❌ / ⚠️ rows for follow-up. Target cadence: every 3 months (next: mid-August 2026).

  • C2 Quarterly parity skip-list poll. Re-run the methodology from W11-4 / PR #610 ("ci(wasi-p3-parity): poll upstream fixes + refresh skip-list"): poll the 4 upstream issues (wasmtime#13396/97/98 + wasi-testsuite#228) and any new entries. Remove fixed entries. Update _last_polled in tests/wasi-p3-parity-skip.json. Target cadence: every 3 months (next: mid-August 2026).

  • C3 Quarterly Wasmtime version bump. Currently pinned at 44.0.1 in .github/workflows/wasi-p3-parity.yml. Bump to latest stable; verify the parity baseline doesn't regress. If newer Wasmtime fixes one of the 4 documented bugs, the upstream issue auto-resolves under C2.

  • C4 wasi-testsuite submodule bump. Currently pinned at 40c1f7d3 (vendored in tests/wasi-testsuite/). Upstream evolves; bump and re-run both gates. If the bump introduces new fixtures, either pass them or add them to the skip-list with a rationale.

D. Investigations

E. Future scope (track but don't start)


Wave-12 starter pack (recommended next steps)

If picking up cold, the lowest-risk, highest-yield wave-12 set is:

| Slot | Item | Why |
|------|------|-----|
| W12-1 | A1 P2 timeout-options honoured | Small, finishes #612's surface; observable improvement |
| W12-2 | A2 Kernel splice(2) | Self-contained perf win; Linux-only fast-path |
| W12-3 | A6 Disk-backed keyvalue atomic-rename | Closes PR #608's documented future-hardening item |
| W12-4 | B1.1 Thread-safe resource tables (with perf measurement) | The pre-req for everything else in B; best to get the perf delta measured early |
| W12-5 | C1 WIT audit refresh | Routine maintenance; drives the wave-12 ❌ → ✅ deltas |
| W12-6 | A5 Tighten wasi-microbench budgets | After CI data has accumulated since #611; flip the gate to required |

Defer A3 (HTTPS handshake — upstream Zig blocked), A4 (investigation — needs interactive debugging), B1.2+ (depend on B1.1), C2-C4 (cadence — wait the quarter).

