Commit 5ac49e6

feat: spec 004 full-implementation — P2P daemon, distributed dispatch, partial mesh coverage
Scaffolding + substantial implementation for #57 sub-issues (#28-#56). 802 tests passing. Full libp2p production NAT-traversal stack, TaskDispatch request-response + real WASM execution, validated in-process via tests/nat_traversal.rs. Closed fully: #31, #32, #35, #36, #44, #45, #46, #47, #48, #49, #50, #55. Partially addressed (open): #28, #29, #30, #33, #34, #37-#43, #51-#54, #56. Critical next issue: #60 (cross-machine firewall traversal). See README / CLAUDE.md / whitepaper v0.4 for honest status.
1 parent cb2f83a commit 5ac49e6

173 files changed

Lines changed: 14332 additions & 680 deletions

.omc/project-memory.json

Lines changed: 244 additions & 134 deletions
Large diffs are not rendered by default.

.omc/state/subagent-tracking.json

Lines changed: 0 additions & 143 deletions
This file was deleted.

.specify/feature.json

Lines changed: 3 additions & 1 deletion
@@ -1 +1,3 @@
-{"feature_directory":"specs/003-stub-replacement"}
+{
+"feature_directory": "specs/004-full-implementation"
+}

CLAUDE.md

Lines changed: 25 additions & 11 deletions
@@ -1,14 +1,16 @@
 # world-compute Development Guidelines

-Last updated: 2026-04-16
+Last updated: 2026-04-18

 ## Project Overview

-World Compute is a decentralized, volunteer-built compute federation. The codebase is a Rust workspace with 94+ source files, 489+ passing tests, and 20 library modules. All 5 CLI command groups are functional (donor, job, cluster, governance, admin). Core modules implemented: WASM sandbox with CID store integration, real Ed25519 signature verification, certificate chain validation (TPM2/SEV-SNP/TDX), BrightID/OAuth2/phone identity verification, Sigstore Rekor transparency logging, OTLP telemetry, STUN-based NAT detection, Raft coordinator consensus, and Firecracker/Apple VF sandbox drivers.
+World Compute is a decentralized, volunteer-built compute federation. The codebase is a Rust workspace with 150+ source files, 802 passing tests, and 20 library modules. All 5 CLI command groups are functional (donor, job, cluster, governance, admin). Production P2P daemon with full libp2p NAT-traversal stack (TCP + QUIC, Noise, mDNS + Kademlia DHT, identify, ping, AutoNAT, Relay v2 server+client, DCUtR) and distributed job dispatch (TaskOffer + TaskDispatch request-response with CBOR + real WASM execution) — validated end-to-end in-process via `tests/nat_traversal.rs`. Core modules implemented: WASM sandbox with CID store integration, real Ed25519 signature verification, certificate chain validation (TPM2/SEV-SNP/TDX), BrightID/OAuth2/phone identity verification, Sigstore Rekor transparency logging, OTLP telemetry, STUN-based NAT detection, Raft coordinator consensus, and Firecracker/Apple VF sandbox drivers.

 ## Active Technologies
 - Rust stable (tested on 1.95.0) + libp2p 0.54, tonic 0.12, ed25519-dalek 2, wasmtime 27, openraft 0.9, opentelemetry 0.27, clap 4 (003-stub-replacement)
 - CID-addressed content store (cid 0.11, multihash 0.19), erasure-coded (reed-solomon-erasure 6) (003-stub-replacement)
+- Rust stable (tested on 1.95.0) + libp2p 0.54, tonic 0.12, ed25519-dalek 2, wasmtime 27, openraft 0.9, opentelemetry 0.27, clap 4, reqwest 0.12, oauth2 4, x509-parser 0.16, reed-solomon-erasure 6, cid 0.11, multihash 0.19 (004-full-implementation)
+- CID-addressed content store (SHA-256), erasure-coded RS(10,18) (004-full-implementation)

 - **Language**: Rust (stable, tested on 1.95.0)
 - **Networking**: rust-libp2p 0.54 (QUIC, TCP, mDNS, Kademlia, gossipsub)
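
Inline note on the `RS(10,18)` line added in this hunk: the diff does not say how the two numbers should be read. The sketch below assumes the common (data, total) reading, that is, 10 data shards out of 18 total with 8 parity shards, and works out the storage overhead and loss tolerance that follow from it. `RsLayout` and its methods are hypothetical names for illustration and are not taken from the repository.

```rust
/// Shape of a Reed-Solomon layout.
/// ASSUMPTION: RS(10,18) is read as 10 data shards out of 18 total shards.
#[derive(Debug, PartialEq)]
struct RsLayout {
    data_shards: u32,
    total_shards: u32,
}

impl RsLayout {
    /// Number of redundant shards added on top of the data shards.
    fn parity_shards(&self) -> u32 {
        self.total_shards - self.data_shards
    }

    /// Bytes stored per byte of payload (total shards / data shards).
    fn overhead(&self) -> f64 {
        f64::from(self.total_shards) / f64::from(self.data_shards)
    }

    /// Any `data_shards` surviving shards can rebuild the payload,
    /// so up to `parity_shards` shards may be lost.
    fn max_lost_shards(&self) -> u32 {
        self.parity_shards()
    }
}
```

Under that reading, `RsLayout { data_shards: 10, total_shards: 18 }` stores 1.8 bytes per payload byte and survives the loss of any 8 shards. If the project instead means 10 data plus 18 parity shards, the same functions apply with `total_shards: 28`.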
@@ -67,7 +69,7 @@ gui/src-tauri/ # Tauri GUI scaffold

 ```sh
 # Build and test
-cargo test # 489+ tests (351+ lib + 138+ integration)
+cargo test # 802 tests (500+ lib + 300+ integration)
 cargo clippy --lib -- -D warnings # Zero warnings enforced

 # Build only
@@ -109,13 +111,25 @@ The project is governed by a ratified constitution at `.specify/memory/constitut
 4. **Efficiency & Self-Improvement** — energy-aware scheduling, mesh LLM
 5. **Direct Testing** — real hardware tests required, no mocks for production

-## Remaining Stubs
-
-Most of the original 76 stubs replaced (issue #7, branch 003-stub-replacement). Remaining:
-- **Egress allowlist**: Endpoint allowlist field in JobManifest (egress is default-deny, correct behavior)
-- **Artifact registry lookup**: Full CID lookup against ApprovedArtifact registry (structural gate in place)
-- **Apple VF helper binary**: Swift helper (`wc-apple-vf-helper`) needs separate macOS compilation
-- **Full Merkle proof verification**: Rekor inclusion proof (format validation in place)
+## Remaining Stubs and Placeholders
+
+Zero TODO comments in src/ and zero `#[ignore]` tests remain. However, several subsystems have scaffolding landed but placeholders in critical paths — these are not production-ready and are tracked in open issues:
+
+- **Mesh LLM** (#27, #54): `src/agent/mesh_llm/expert.rs::load_model()` is a placeholder — no real LLaMA inference. Orchestration (router, aggregator, safety tiers, kill switch) is complete.
+- **AMD / Intel root CA fingerprints** (#28): pinned as `[0u8; 32]` in `src/verification/attestation.rs`. Validators enter permissive bypass mode when fingerprints are zero.
+- **Rekor public key** (#29): pinned as `[0u8; 32]` in `src/ledger/transparency.rs`. Signed tree head verification is skipped when the key is zero.
+- **Agent lifecycle → gossip wiring** (#30): heartbeat/pause/withdraw return payloads but don't broadcast over gossipsub (the daemon event loop does broadcast separately).
+- **Firecracker rootfs** (#33): concatenates layer bytes; does NOT run mkfs.ext4 + OCI tar extraction. A real boot would fail.
+- **Admin `ban()`** (#34): `src/governance/admin_service.rs::ban()` returns `Ok(())` without updating the trust registry.
+- **Platform adapters** (#37, #38, #39): Slurm/K8s/Cloud scaffolds exist but have not been exercised against live systems.
+- **GUI** (#40): never built or run.
+- **Deployment** (#41): Dockerfile and Helm chart exist but have never been built or deployed.
+- **REST gateway** (#43): routing + auth + rate-limit logic exist but no HTTP listener is bound in the daemon.
+- **Churn simulator** (#51): statistical model, not a real kill-rejoin harness.
+- **Apple VF Swift helper** (#52): never built on macOS.
+- **Receipt verification** (`src/verification/receipt.rs`): structural check only; coordinator public key not yet wired.
+- **Daemon `current_load()`** (`src/agent/daemon.rs:500`): stub returning 0.1.
+- **Cross-machine firewall traversal** (#60): production NAT stack validated in-process only. Real WAN operation behind institutional firewalls is unverified.

 ## CI

@@ -125,6 +139,6 @@ Two GitHub Actions workflows:

 ## Recent Changes

+- **004-full-implementation** (2026-04-18): Merged scaffolding + significant implementation for #57 and its sub-issues (#28–#56, and a first pass on #27/#54 mesh LLM). 802 tests passing across Linux/macOS/Windows + Sandbox KVM + swtpm CI. Landed: full production P2P daemon with libp2p NAT-traversal stack (TCP + QUIC + Noise + mDNS + Kademlia + identify + ping + AutoNAT + Relay v2 server/client + DCUtR), AutoRelay reservations, public libp2p bootstrap relays as default rendezvous, TaskOffer + TaskDispatch request-response protocols over CBOR, real WASM execution of dispatched jobs, `worldcompute job submit --executor <multiaddr> --workload <wasm>` CLI command, end-to-end 3-node relay-circuit integration test. Also landed: ~12 sub-issues fully completed (policy engine, GPU passthrough, adversarial tests, test coverage, credit decay, preemption, confidential compute, mTLS, energy metering, storage GC, documentation, scheduler matchmaking); ~16 sub-issues partially addressed with scaffolding (see Remaining Stubs above); #27/#54 mesh LLM orchestration shell complete but real LLaMA inference deferred. Critical open issue #60 tracks cross-machine WAN mesh formation behind firewalls.
 - **003-stub-replacement** (2026-04-16): Replaced all implementation stubs (#7, #8–#26). 77 tasks, 489+ tests. Added reqwest, oauth2, x509-parser, rcgen dependencies. Wired CLI, sandboxes, attestation, identity, transparency, telemetry, consensus, network.
 - **002-safety-hardening** (2026-04-16): Red team review (#4). Policy engine, attestation, governance, incident response, egress, identity hardening. 110 tasks, PR #6.
-- **001-world-compute-core** (2026-04-15): Initial architecture and implementation across 11 phases.
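
Inline note on the zero-pin placeholders listed in the CLAUDE.md changes (#28, #29): the described behavior is that an all-zero 32-byte pin switches the check into a permissive bypass. The following is a minimal sketch of that pattern plus a fail-closed gate that a production build might layer on top. It is not the code in `src/verification/attestation.rs` or `src/ledger/transparency.rs`; every name here (`ZERO_PIN`, `PinCheck`, `check_root_pin`, `accept_in_production`) is hypothetical.

```rust
/// The placeholder value described in the diff: an all-zero 32-byte pin.
const ZERO_PIN: [u8; 32] = [0u8; 32];

/// Outcome of comparing a presented fingerprint against the pinned one.
#[derive(Debug, PartialEq)]
enum PinCheck {
    /// The pin is still the placeholder, so no comparison was made.
    Bypassed,
    /// The pin is real and the presented fingerprint equals it.
    Matched,
    /// The pin is real and the presented fingerprint differs.
    Mismatch,
}

/// Hypothetical pin check mirroring the described bypass behavior.
fn check_root_pin(pinned: &[u8; 32], presented: &[u8; 32]) -> PinCheck {
    if *pinned == ZERO_PIN {
        // Placeholder pin: permissive bypass, nothing is verified.
        return PinCheck::Bypassed;
    }
    if pinned == presented {
        PinCheck::Matched
    } else {
        PinCheck::Mismatch
    }
}

/// Fail-closed gate (an assumption, not in the commit): only a real match
/// is accepted, so a leftover placeholder pin rejects instead of passing.
fn accept_in_production(result: &PinCheck) -> bool {
    matches!(result, PinCheck::Matched)
}
```

Keeping `Bypassed` distinct from `Matched` lets callers log or reject the bypass explicitly rather than treating an unpinned root as verified.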

Cargo.toml

Lines changed: 27 additions & 1 deletion
@@ -36,6 +36,9 @@ libp2p = { version = "0.54", features = [
 "dns",
 "identify",
 "ping",
+"autonat",
+"request-response",
+"cbor",
 "ed25519",
 "macros",
 ] }
@@ -61,6 +64,21 @@ ciborium = "0.2"
 ed25519-dalek = { version = "2", features = ["serde", "rand_core"] }
 sha2 = "0.10"
 rand = "0.8"
+rand_04 = { package = "rand", version = "0.4" }
+rsa = { version = "0.9", features = ["sha2"] }
+p256 = { version = "0.13", features = ["ecdsa"] }
+p384 = { version = "0.13", features = ["ecdsa"] }
+aes-gcm = "0.10"
+x25519-dalek = { version = "2", features = ["static_secrets"] }
+threshold_crypto = "0.2"
+
+# TLS / certificate management
+rcgen = "0.13"
+tokio-rustls = "0.26"
+rustls = "0.23"
+
+# Unix signals (preemption supervisor)
+nix = { version = "0.29", features = ["signal", "process"] }

 # Content addressing
 cid = { version = "0.11", features = ["serde"] }
@@ -101,8 +119,16 @@ uuid = { version = "1", features = ["v4", "serde"] }
 hex = "0.4"
 base64 = "0.22"

+# ML inference (mesh LLM)
+candle-core = "0.8"
+candle-transformers = "0.8"
+tokenizers = "0.20"
+
+# System info (energy metering)
+sysinfo = "0.32"
+
 [dev-dependencies]
-rcgen = "0.13"
+time = "0.3"

 [build-dependencies]
 tonic-build = "0.12"

Dockerfile

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+# Stage 1: Build
+FROM rust:1.95-bookworm AS builder
+WORKDIR /build
+COPY . .
+RUN cargo build --release --bin worldcompute
+
+# Stage 2: Runtime
+FROM gcr.io/distroless/cc-debian12
+COPY --from=builder /build/target/release/worldcompute /usr/local/bin/worldcompute
+ENTRYPOINT ["worldcompute"]

README.md

Lines changed: 35 additions & 18 deletions
@@ -9,27 +9,44 @@

 ---

-> **Honesty notice — please read before going further.**
+> **Status notice (updated 2026-04-18)**
 >
-> This repository contains a ratified governing constitution, a full research package (~28,600 words), detailed feature specifications, and substantial library code (391 tests passing across safety-critical modules). **However, there is no runnable agent, no working CLI, no testnet, and no deployable binary.** The CLI compiles but all commands print "not yet implemented." The library modules (policy engine, attestation verification, governance, incident response, egress enforcement) work as tested Rust code but are not wired into a running daemon.
+> This repository contains a ratified governing constitution, a full research package (~28,600 words), detailed feature specifications, and a substantial implementation with **802 passing tests** across all modules on Linux/macOS/Windows CI. Core systems and the P2P daemon are wired and exercised by unit + integration tests. **However, several subsystems have production scaffolding with placeholder values in critical paths — they are NOT production-ready as shipped.** The open GitHub issues track which pieces remain.
 >
-> **What exists and works (as of 2026-04-16):**
-> - Library crate with 422 passing tests covering safety-critical paths
-> - Deterministic policy engine (10-step evaluation pipeline)
-> - Attestation verification (TPM2/SEV-SNP/TDX — measurement validation and signature binding; full CA certificate-chain validation is pluggable but not yet integrated)
-> - Governance separation of duties, quorum thresholds, time-locks
-> - Network egress blocking (RFC1918, link-local, cloud metadata)
-> - Incident response containment primitives with audit trails
-> - CI on Linux/macOS/Windows via GitHub Actions
+> **What is complete and verified in code:**
+> - P2P daemon: full libp2p NAT-traversal stack (TCP + QUIC + Noise + mDNS + Kademlia + identify + ping + AutoNAT + Relay v2 server/client + DCUtR). Validated end-to-end in-process by `tests/nat_traversal.rs` — a 3-node relay-circuit test that dispatches a real WASM job through the relay in ~5ms.
+> - Distributed job dispatch: TaskOffer and TaskDispatch request-response protocols over CBOR. Real WASM execution on the executor. `worldcompute job submit --executor <multiaddr> --workload <wasm>` CLI command for end-to-end remote dispatch.
+> - All 5 CLI command groups functional
+> - WASM sandbox with CID-store integration and real workload execution (wasmtime)
+> - Deterministic 10-step policy engine with artifact registry + egress allowlist
+> - Preemption supervisor with SIGSTOP via nix (measured and logged)
+> - BrightID / OAuth2 / phone identity verification
+> - Scheduler with ClassAd matchmaking + R=3 disjoint-AS placement
+> - All 8 adversarial test scenarios implemented
+> - Confidential compute: AES-256-GCM + X25519 key wrapping
+> - mTLS certificate lifecycle via rcgen + Ed25519 auth tokens
+> - Credit decay with 45-day half-life + anti-hoarding
+> - Storage GC + acceptable-use filter + shard residency enforcement
+> - Energy metering via Intel RAPL
+> - 802 tests passing on CI (Linux/macOS/Windows + Sandbox KVM + swtpm)
 >
-> **What does NOT exist yet:**
-> - A running agent daemon
-> - Working CLI subcommands (all print "not yet implemented")
-> - P2P networking between nodes
-> - Actual job execution inside sandboxes
-> - Any form of testnet or multi-node deployment
+> **What has scaffolding but placeholder values or missing integration (see issues):**
+> - Mesh LLM (#27, #54): orchestration + router + aggregator + safety + kill switch all exist, but `load_model()` is a placeholder — no real LLaMA inference yet
+> - Attestation root CA fingerprints (#28): AMD ARK / Intel DCAP pinned as `[0u8; 32]` (bypass mode) — need real fingerprints before production
+> - Rekor public key (#29): pinned as `[0u8; 32]` — tree-head signature verification is skipped
+> - Firecracker rootfs (#33): concatenates layer bytes; real mkfs.ext4 + OCI-layer extraction not yet wired
+> - Platform adapters #37/#38/#39 (Slurm, K8s, Cloud): scaffolds + parsers; not exercised against live systems
+> - Tauri GUI (#40): scaffold; never built or run
+> - Docker / Helm deployment (#41): files present; never built or deployed
+> - REST gateway (#43): routing + auth logic present; no HTTP listener bound in daemon
+> - Admin ban (#34): `admin_service::ban()` is an explicit stub returning `Ok(())`
+> - Churn simulator (#51): statistical model; no real kill-rejoin
+> - Apple VF Swift helper (#52): scaffold; never built on macOS
 >
-> If you want to help build it, see [Contributing](#contributing). If you want to be notified when it becomes installable, watch this repository.
+> **Critical open issue:**
+> - #60: cross-machine firewall traversal. The production NAT stack is validated in-process only. Real WAN operation behind institutional / corporate firewalls is unverified, and our attempts from behind Dartmouth's firewall showed libp2p connections not completing. Resolving this is the next milestone.
+>
+> If you want to help build or test it, see [Contributing](#contributing).

 ---

@@ -84,7 +101,7 @@ Five constitutional principles govern every design decision. They are not aspira

 ## Status

-World Compute has completed library-level implementation across core and safety modules. The CLI and agent daemon are scaffolded but not yet functional. Updated 2026-04-16.
+World Compute has substantial implementation with 802 passing tests and a fully-wired P2P daemon. All 5 CLI command groups functional. Several subsystems still have placeholder values in critical paths (see status notice at top of README and open issues #27, #28, #29, #33, #34, #37–#43, #51–#54, #56, #60). Updated 2026-04-18.

 ### Design artifacts (complete)
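
Inline note on the "Credit decay with 45-day half-life" item in the README changes: a half-life implies exponential decay, balance(t) = balance(0) · 0.5^(t / 45) with t in days. The sketch below shows only that arithmetic. The function and constant names are hypothetical, and the anti-hoarding rule mentioned alongside it is not modeled because the diff does not describe it.

```rust
/// Half-life stated in the README, in days.
const HALF_LIFE_DAYS: f64 = 45.0;

/// Balance remaining after `elapsed_days` of exponential decay.
/// Hypothetical helper; the repository's actual credit model may differ.
fn decayed_credits(initial: f64, elapsed_days: f64) -> f64 {
    initial * 0.5_f64.powf(elapsed_days / HALF_LIFE_DAYS)
}
```

For example, a balance of 100 credits falls to 50 after 45 days and to 25 after 90 days under this formula.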

adapters/cloud/Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -8,3 +8,5 @@ license = "Apache-2.0"
 worldcompute = { path = "../.." }
 tokio = { version = "1", features = ["full"] }
 clap = { version = "4", features = ["derive"] }
+serde = { version = "1", features = ["derive"] }
+serde_json = "1"
