Skip to content

test(dst): PR-D — parser/loader fuzz + D5 S3 context (CLI/server backends deferred)#305

Closed
ragnorc wants to merge 5 commits into
dst-harness-hardenfrom
dst-d5-fanout
Closed

test(dst): PR-D — parser/loader fuzz + D5 S3 context (CLI/server backends deferred)#305
ragnorc wants to merge 5 commits into
dst-harness-hardenfrom
dst-d5-fanout

Conversation

@ragnorc

@ragnorc ragnorc commented Jun 25, 2026

Copy link
Copy Markdown
Contributor

Stacked on #303 (base dst-harness-harden), sibling to PR-C #304. The tractable slice of PR-D (D5 fan-out + fuzz). Tests-only.

Done

  • Parser/loader fuzz (dst/fuzz.rs, proptest — no nightly): the #7230 lesson (a parse/merge path only ever fed clean input shipped a latent crash). Fuzzes the GQ query parser, the schema parser, and the JSONL loader with arbitrary + structured + adversarial (dup-@key, reordered/extra/missing fields, wrong types, malformed JSON, orphan edges, enum-out-of-range, huge strings) input; asserts they return a Result — never panic — and the graph stays usable after the abuse. proptest shrinks+persists any panic. All green — no parser panics found.
  • D5 S3 context (s3_battery_holds): the full white-box battery on a real object-store backend (s3://), not just local FS. Env-gated (OMNIGRAPH_S3_TEST_BUCKET — skips locally, runs in CI's rustfs_integration job).

Deferred (with reasons — see dst/MATRIX.md)

  • CLI / long-lived server backends — crate-boundary friction: the harness lives in omnigraph-engine tests, but the omnigraph binary is in omnigraph-cli (CARGO_BIN_EXE_omnigraph is cross-package, unavailable), and the natural home (cli tests) can't reach the harness op/model/invariants modules. Plus subprocess-per-op is slow/flaky. The clean fix is to extract the harness op/model into a shared crate first.
  • cargo-fuzz/libFuzzer — needs a nightly toolchain (not installed); the proptest fuzz above gets the same "parser survives hostile input" coverage on stable.
  • porcupine linearizability — optional (Go subprocess).

Verify

  • cargo test -p omnigraph-engine --test dst16 passed (incl. the 5 fuzz tests; the S3 test skips without the bucket env).

Note

Low Risk
Test-only changes to parsers/loaders and DST/CLI smoke paths; no production behavior changes. S3 battery may not run in CI until the RustFS job invokes --test dst (noted in review).

Overview
PR-D DST coverage (tests only): Adds proptest fuzz for GQ and schema parsers (random + token soup) and an adversarial JSONL loader smoke test that must not panic and must leave the graph usable after bad batches. Wires fuzz into the dst integration binary.

Adds s3_battery_holds: a short seeded op walk on a unique s3:// graph, then the full white-box invariant battery (env-gated via OMNIGRAPH_S3_TEST_BUCKET).

Adds cli_dst_parity: the same multi-step sequence (init → two merge loads → insert → delete → query) on embedded SDK vs CLI subprocess, comparing extracted smoke-* slug sets so CLI transport matches the engine without sharing harness modules.

Updates dst/MATRIX.md and docs/dev/testing.md (ledger rows, run commands, S3 job note). Full DST harness CLI/server parity remains deferred; this PR only adds the CLI sequence smoke.

Reviewed by Cursor Bugbot for commit 0f09ec3. Bugbot is set up for automated code reviews on this repo. Configure here.

Greptile Summary

This PR expands the DST test coverage. The main changes are:

  • Adds CLI subprocess versus embedded SDK parity coverage for a multi-step graph sequence.
  • Adds proptest fuzz coverage for query parsing, schema parsing, and hostile JSONL loading.
  • Adds an env-gated S3 DST battery.
  • Updates the DST matrix and testing docs for the new coverage.

Confidence Score: 4/5

This is close, but the CI wiring should be fixed before merging.

  • The new S3 DST battery is documented as part of RustFS CI, but the inspected workflow still does not invoke it.
  • DST test paths are also missing from the RustFS trigger filter, so DST-only S3 changes can skip the object-store job.

docs/dev/testing.md and .github/workflows/ci.yml

Important Files Changed

Filename Overview
crates/omnigraph-cli/tests/cli_dst_parity.rs Adds a CLI-versus-embedded parity smoke test for a multi-operation local graph sequence.
crates/omnigraph/tests/dst/fuzz.rs Adds parser and loader fuzz coverage using the existing DST schema and helpers.
crates/omnigraph/tests/dst.rs Wires the fuzz module and adds an env-gated S3 battery.
docs/dev/testing.md Documents the new DST run modes, but the S3 CI command is not wired into the workflow.
crates/omnigraph/tests/dst/MATRIX.md Updates the DST coverage ledger for fuzzing, S3, and deferred backend work.

Fix All in Claude Code

Reviews (2): Last reviewed commit: "docs(testing): document the DST harness ..." | Re-trigger Greptile

Greptile also left 1 inline comment on this PR.

Context used:

  • Context used - AGENTS.md (source)
  • Context used - CLAUDE.md (source)

ragnorc added 3 commits June 25, 2026 14:12
The #7230 lesson: a parse/merge path only ever fed clean input shipped a
latent crash. Fuzz the GQ query parser (find_named_query), the schema
parser (parse_schema), and the JSONL loader with arbitrary + structured
+ adversarial (dup-@key, reordered/extra/missing fields, wrong types,
malformed JSON, orphan edges, enum-out-of-range, huge strings) input and
assert they return a Result, never panic — and the graph stays usable
after the abuse. proptest shrinks+persists any panic. cargo-fuzz/libFuzzer
targets deferred (need nightly); this gets the same coverage on stable.
All green — no parser panics found.
Runs a short seeded op sequence + the full white-box battery against a
real object-store backend (s3://), not just local FS — the D5 S3 context
dimension. Gated on OMNIGRAPH_S3_TEST_BUCKET (skips locally; runs in CI's
rustfs_integration job), mirroring tests/s3_storage.rs. Known bugs
allow-listed, novel -> fail.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 68808ac. Configure here.

Comment thread crates/omnigraph/tests/dst.rs

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68808acd15

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +613 to +614
#[tokio::test(flavor = "multi_thread")]
async fn s3_battery_holds() {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wire the S3 DST test into RustFS CI

This new test is bucket-gated, so the normal workspace job always returns early because .github/workflows/ci.yml:193-200 runs cargo test --workspace --locked without OMNIGRAPH_S3_TEST_BUCKET. The only job that sets the bucket, rustfs_integration, runs s3_storage, write_cost_s3, server/cluster/CLI S3 tests, and failpoint S3 tests, but never cargo test -p omnigraph-engine --test dst (ci.yml:373-400), and the PR change classifier does not include crates/omnigraph/tests/dst.rs for RustFS (ci.yml:87-96). As a result this claimed S3 coverage is skipped in CI unless a developer runs it manually.

Useful? React with 👍 / 👎.

Comment on lines +622 to +624
for _ in 0..8 {
let _ = op::step(&db, &mut rng, &mut model).await;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fail on unexpected S3 op errors

In the S3 context, every op::step error is discarded. Since step only updates the model after successful operations, an S3-specific failure in a load, mutation, optimize, or read can leave both the database and model at the previous state, letting run_battery pass even though the backend operation failed. This should classify known open bugs like run_walk does and fail on any other S3 operation error.

Useful? React with 👍 / 👎.

| **CLI / long-lived server backend** | ❌ deferred | crate-boundary friction: harness is in `omnigraph-engine` tests, the `omnigraph` binary is in `omnigraph-cli` (`CARGO_BIN_EXE_omnigraph` is cross-package, unavailable), and the natural home — `omnigraph-cli` tests — can't reach the harness modules. Plus subprocess-per-op is slow/flaky. Needs the harness op/model extracted to a shared crate first. |
| local FS | ✅ | |
| **S3 (RustFS/MinIO)** | ⏸️ | PR-D |
| **S3 (RustFS/MinIO)** | ✅ (PR-D) | `s3_battery_holds` — full battery on `s3://`, env-gated (`OMNIGRAPH_S3_TEST_BUCKET`) |

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 S3 Battery Never Runs

This row marks the S3 context complete via s3_battery_holds, but the RustFS CI path does not run the dst integration test target. When CI provisions OMNIGRAPH_S3_TEST_BUCKET, the new battery is still never executed, so an S3 regression can pass while the matrix reports the coverage as done.

Context Used: AGENTS.md (source)

Fix in Claude Code

ragnorc added 2 commits June 25, 2026 15:43
…arity

A DST-flavored op sequence (init -> 2x load --merge multi-fragment ->
insert -> edge-free delete -> query) run against the embedded Omnigraph
SDK and the omnigraph CLI subprocess on twin local graphs must agree on
the final person-slug set. Fills the gap parity_matrix.rs leaves (it does
single-verb CLI-local-vs-server; this adds the embedded-SDK arm + a
multi-op sequence). Format-tolerant oracle (embedded to_rust_json vs CLI
--json differ in wrapping): compare the set of smoke-* slug strings. The
CLI wraps the same engine, so this guards the CLI transport layer (arg
parse, --json serialization, --store addressing, --as actor) against
divergence from embedded. Reuses support::cli(). Green.
…_dst_parity, the dst feature)

Per the maintenance contract: add the DST harness to the test-coverage
map — engine table rows for dst.rs (generative walks + white-box battery
+ regressions + statemachine shrinking + readshape + fuzz + S3 cell +
replay-equality) and dst_recovery.rs (failpoint recovery / #296 cells);
the cli_dst_parity cross-backend smoke in the omnigraph-cli row; a
'Running the DST harness' section (the feature flags + run commands + the
test-only zero-cost dst determinism seam); and the S3 cell in the rustfs
CI list. Points at dst/MATRIX.md for the coverage ledger.
Comment thread docs/dev/testing.md
- `cargo test -p omnigraph-cluster --test s3_cluster` (full control-plane lifecycle on the bucket)
- `cargo test -p omnigraph-cli --test system_local local_cli_s3_end_to_end_init_load_read_flow`
- `cargo test -p omnigraph-engine --features failpoints --test failpoints s3_` (recovery-sidecar lifecycle on a real bucket)
- `cargo test -p omnigraph-engine --test dst s3_battery_holds` (the DST white-box battery on a real object store)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 S3 CI Still Missing This line now says CI runs cargo test -p omnigraph-engine --test dst s3_battery_holds, but the RustFS workflow still does not invoke that test. The normal workspace test has no S3 bucket env, so s3_battery_holds skips there, and DST-only changes also do not match the RustFS path filter. A regression that only appears on the object-store backend can still pass every PR check while this doc says the coverage is active. Please add the DST command to the RustFS job and include the DST test paths in the RustFS trigger set.

Fix in Claude Code

@ragnorc

ragnorc commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Superseded by #309, which consolidates this (parser/loader fuzz + S3 context) with #303 and the omnigraph-dst crate into a single tests-only PR against main. The S3-in-CI finding raised here is fixed in #309 (a RustFS-job step now runs --test dst s3_battery_holds + the dst paths are in the RustFS change filter).

@ragnorc ragnorc closed this Jun 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant