Improve WikiGraph URI and knowledge graph build by Moskize91 · Pull Request #75 · oomol-lab/spinedigest

Moskize91 · 2026-06-28T00:57:47Z

Summary

redesign WikiGraph archive URIs and update CLI help/search/get/list behavior
require LLM-judged mention links and replace surface screening with protected frequency filtering
improve knowledge graph grounding, queue cancellation, shared SQLite bootstrap, and progress behavior

Verification

pnpm vitest run test/facade/knowledge-graph-build.test.ts test/wikimatch/policy-judge.test.ts test/cli/queue.test.ts
pnpm typecheck
pnpm lint
pnpm run lint:fix

coderabbitai · 2026-06-28T00:58:05Z

Summary by CodeRabbit

New Features
- Added wikigraph list for browsing URI-scoped, paginated object collections (--type, --limit, --cursor, --json/--jsonl).
- Introduced wikigraph help uri with clearer rules and examples for constructing valid Wiki Graph URIs.
Bug Fixes
- Updated command inputs to a consistent single <object-uri>/<archive-or-scope-uri> style across search, get, related, evidence, and pack.
- Refined evidence availability and added pagination options.
Documentation
- Refreshed root help, command help, and examples to match the new URI contract.

Walkthrough

This PR unifies wikigraph archive commands around a single wikigraph:// URI contract, adds list and uri help support, and rewrites archive URI parsing/routing for chapter-scoped object forms and the new chapter-rooted source range format. It also replaces wikimatch surface screening with surface protection and continuation paging, adds wikilink relation discovery, introduces shared-state SQLite initialization, and adds a canceling build-job state with cooperative stop checks.

Possibly Related PRs

oomol-lab/spinedigest#72: Introduced the object-based wikigraph CLI and evidence views that this PR extends with new URI forms, list, and help updates.
oomol-lab/spinedigest#59: Changed chapter-page output and formatting paths that this PR further updates in chapter navigation and URI-shaped next-step commands.
oomol-lab/spinedigest#65: Touches the same build-queue worker and job-state code paths that this PR extends with canceling and stop checks.

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The title is relevant, but it does not use the required `<type>(<scope>): <subject>` format.	Change it to the required format, for example `feat(uri): improve WikiGraph URI and knowledge graph build`.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description check	✅ Passed	The description matches the implemented URI, help, grounding, and build changes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

✨ Simplify code

Create PR with simplified code
Commit simplified code in branch codex/uri-object-api

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.23.0)

test/facade/knowledge-graph-build.test.ts

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.20][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist

test/facade/archive-view.test.ts

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.25][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist

test/wikilink/relation-discovery.test.ts

┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

[00.19][ERROR]: unable to find a config; path .coderabbit-opengrep-fallback.yml does not exist

1 others

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 13

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/wikimatch/policy-judge.ts (1)

260-265: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Reject qid for non-recall decisions or update the prompt contract.

Line 421 says qid must only appear for recall, but skip_this_time / never_recall with an allowed qid still pass validation and are persisted into policyUpdates. Enforce the prompt contract here, or relax the prompt if qid-scoped policy updates are still intended.

Proposed fix if qid is recall-only

       if (decision.decision === "continue") {
         if (decision.qid !== undefined) {
@@
         continue;
       }
 
+      if (decision.qid !== undefined) {
+        issues.push(
+          `Candidate ${candidate.id} uses decision "${decision.decision}" but includes qid. Include qid only when decision is "recall".`,
+        );
+        continue;
+      }
+
-      if (
-        decision.qid !== undefined &&
-        !isAllowedQid(candidate, decision.qid)
-      ) {
-        issues.push(formatIllegalQidIssue(candidate, decision.qid));
-      }

Also applies to: 415-421

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/wikimatch/policy-judge.ts` around lines 260 - 265, The validation in
policy-judge should enforce the prompt contract that qid is only allowed for
recall decisions. Update the decision handling in policy-judge to reject or flag
any `qid` when `decision.type` is not recall, alongside the existing
`isAllowedQid` check, so `skip_this_time` and `never_recall` cannot persist
qid-scoped updates into `policyUpdates`. If qid-scoped updates are intended for
those decision types, then instead relax the prompt contract to match the
behavior in the same decision validation path.

🧹 Nitpick comments (2)

src/facade/build-queue.ts (1)
333-340: 🚀 Performance & Scalability | 🔵 Trivial | 🏗️ Heavy lift

Avoid reopening SQLite for every stop check.

Every throwIfStopped() call opens and closes a fresh readonly queue DB, and every reporter entry point now goes through it before doing any work. That makes hot paths like addOutputCharacters() pay for SQLite/file setup on every progress update. Reusing an existing connection, or polling/caching stop state behind a cheaper barrier, would keep cancellation responsive without turning progress reporting into DB churn.

Also applies to: 1178-1184, 1388-1470, 1484-1500
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/facade/build-queue.ts` around lines 333 - 340, The stop-check path is
reopening and closing the readonly SQLite queue DB on every `throwIfStopped()`
call via `readBuildJobForStopCheck`, which makes hot reporter paths like
`addOutputCharacters()` too expensive. Update the stop-check flow to reuse an
existing database connection or introduce a cheaper cached/polled stop-state
check so reporter entry points do not pay SQLite setup cost on each update;
focus the change around `throwIfStopped`, `readBuildJobForStopCheck`, and the
reporter methods that call it.
test/cli/archive.test.ts (1)
266-485: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add a runArchiveCommand smoke test for list.

This suite now covers search/get/related/evidence/pack, but the new list branch still has no equivalent CLI test. One text or JSON assertion here would protect the createCollectionFindResult() wrapper and URI formatting path from regressing.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/cli/archive.test.ts` around lines 266 - 485, Add a smoke test for the
new list command in this archive CLI suite using runArchiveCommand, similar to
the existing search/get/related/evidence/pack cases. Cover the list branch with
either text or JSON output and assert the expected URI/collection formatting so
the createCollectionFindResult() path is exercised and protected from
regressions. Use the existing runArchiveCommand helper and archiveMockState
assertions to locate the new coverage alongside the other action tests.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@data/help/commands/archive/evidence.jinja`:
- Around line 19-22: The unsupported-evidence note in the archive evidence
template is incomplete because `listArchiveEvidence()` also rejects
`chapter-tree`. Update the Notes block in the `evidence.jinja` template to
explicitly mention `chapter-tree` alongside the existing unsupported cases,
keeping the guidance aligned with `listArchiveEvidence()` and the
`evidence`/`pack` usage distinction.

In `@data/help/commands/root.jinja`:
- Line 12: The root help object-kind list is missing the new chapter-tree URI
type, so update the list in the root help template to include chapter-tree
alongside the existing kinds. Make the change in the root help documentation
where the object kinds are enumerated so users can discover the archive-view
contract’s new first-class type.
- Line 27: The `wikigraph evidence` synopsis is too broad because it advertises
a generic `<object-uri>` even though the facade rejects `chapter`,
`chapter-tree`, and `source` URIs. Update the `root.jinja` help text for
`evidence` to list only the supported URI kinds, matching the validation
behavior in the `evidence` command path so users aren’t directed to invalid
inputs.

In `@src/cli/archive.ts`:
- Around line 1034-1035: The fragment URI mapping in
`listArchiveCollection()`/the `case "fragment"` branch is inventing a synthetic
`#0..0` source range for whole-fragment ids. Update the fragment handling so it
preserves the original fragment URI target instead of rewriting it to a
first-sentence subrange, using the existing `first`/`second` parsing logic only
for real source ranges and not for whole-fragment clicks.

In `@src/cli/args.ts`:
- Around line 1066-1072: The missing-input error in the archive validation
branch is too generic and should be tailored to the current action. Update the
message in the archivePath check so it uses action-specific wording for create,
export, estimate, and index, and reserve “archive URI” only where that is
actually the expected input; use the relevant action variable and the existing
withHelpRoute path to keep the guidance aligned with each workflow.

In `@src/document/shared-state-database.ts`:
- Around line 96-104: The stale-lock cleanup in removeStaleInitLock currently
relies only on the lock file’s mtime, which can wrongly delete a healthy init
lock while Database.initialize() is still running. Update this helper to use the
owner metadata already stored in the lock to verify whether the holder is still
live, or add a heartbeat/refresh in Database.initialize() so the lock mtime
stays current while it is held. Make sure the logic in removeStaleInitLock and
the init-lock acquisition path work together to preserve cross-process
exclusion.

In `@src/facade/archive-view.ts`:
- Around line 541-547: Apply chapter scoping before grouping in the
entity/triple listing path: the current calls in archive-view’s list flow
aggregate all mentions/links first and only then let listEntityCollection and
listTripleCollection pick representatives and dedupe, which can drop
chapter-local results or leak unscoped ones. Update the list logic so the
chapter filter is applied to the underlying mention/link set before it is passed
into listEntityCollection and listTripleCollection, and make sure the same
scoped-first behavior is used in the related list path referenced by the other
affected range.
- Around line 3011-3020: Top-level chunk/entity URI parsing in the archive view
is too permissive and accepts extra path segments. Update the URI switch
handling in the same parser that processes chunk/entity cases so it validates
the full path length before returning, matching the length checks already used
for triples. Make the `chunk` and `entity` branches reject any URI with more
than the expected segments so malformed inputs fail fast instead of resolving to
valid identifiers.

In `@src/facade/knowledge-graph-build.ts`:
- Around line 463-471: The paging logic in `sortCandidateOptionsByRecall`,
`sliceCandidateByOptionBudget`, and the `cursor` handling can drift when
`candidatePages.accept()` changes `recallCounts`, causing repeated or skipped
QIDs. Fix `knowledge-graph-build` by tracking the set of already presented
selectable QIDs per candidate, or by freezing the sorted order for a candidate
before advancing the offset, and apply the same adjustment in the later paging
block that uses the same cursor-based flow.

In `@src/facade/spine-digest.ts`:
- Line 109: The recovery hint in the missing-summary message is stale and still
references the old chapter discovery command. Update the user-facing text in the
spine digest path so it points to the new URI-based discovery flow, using the
same message construction around the completed-summary lookup and `serialId`,
and replace the `wikigraph chapter list` guidance with `wikigraph list
wikigraph://<archive.sdpub> --type chapter` while keeping the follow-up summary
URI guidance consistent.

In `@src/wikilink/relation-discovery.ts`:
- Around line 371-390: The plain text fragments in formatTaggedContext are being
inserted without escaping, so update the window.text.slice(...) pushes to escape
XML text the same way mention bodies are handled. Make sure every non-mention
segment before, between, and after mentions is passed through the existing
escaping helper used for tagged prompt content, while keeping mention attributes
and bodies escaped via escapeXmlAttribute and escapeXmlText.
- Around line 212-219: Reject predicates that normalize to an empty label in
relation discovery. In relation-discovery.ts, update the validation flow around
normalizePredicate(relation.predicate) so that empty strings are treated as
invalid before any relation is added to the discovered set. Use the existing
predicate handling near the “mentions” check in the relation discovery path to
push a validation issue and continue, preventing blank edge labels from reaching
the mention-link pipeline.

In `@src/wikimatch/surface-window.ts`:
- Around line 19-22: The suspicious-surface filtering in the surface-window
logic is using a different key than the aggregation path. Update the
`surface-window.ts` flow so the same sliced text value produced from
`text.slice(...)` is used both when building `suspiciousTexts` and when
filtering `suppressedCandidates`, instead of mixing it with `candidate.surface`.
Keep the change within the return/filtering logic that currently references
`candidate.surface`, and apply the same fix anywhere else in this file that
performs the same aggregation/filtering pattern.

---

Outside diff comments:
In `@src/wikimatch/policy-judge.ts`:
- Around line 260-265: The validation in policy-judge should enforce the prompt
contract that qid is only allowed for recall decisions. Update the decision
handling in policy-judge to reject or flag any `qid` when `decision.type` is not
recall, alongside the existing `isAllowedQid` check, so `skip_this_time` and
`never_recall` cannot persist qid-scoped updates into `policyUpdates`. If
qid-scoped updates are intended for those decision types, then instead relax the
prompt contract to match the behavior in the same decision validation path.

---

Nitpick comments:
In `@src/facade/build-queue.ts`:
- Around line 333-340: The stop-check path is reopening and closing the readonly
SQLite queue DB on every `throwIfStopped()` call via `readBuildJobForStopCheck`,
which makes hot reporter paths like `addOutputCharacters()` too expensive.
Update the stop-check flow to reuse an existing database connection or introduce
a cheaper cached/polled stop-state check so reporter entry points do not pay
SQLite setup cost on each update; focus the change around `throwIfStopped`,
`readBuildJobForStopCheck`, and the reporter methods that call it.

In `@test/cli/archive.test.ts`:
- Around line 266-485: Add a smoke test for the new list command in this archive
CLI suite using runArchiveCommand, similar to the existing
search/get/related/evidence/pack cases. Cover the list branch with either text
or JSON output and assert the expected URI/collection formatting so the
createCollectionFindResult() path is exercised and protected from regressions.
Use the existing runArchiveCommand helper and archiveMockState assertions to
locate the new coverage alongside the other action tests.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8358d5bf-1668-49d8-beca-a758313ef4fa

📥 Commits

Reviewing files that changed from the base of the PR and between 692ec2f and ee35621.

📒 Files selected for processing (55)

data/help/commands/archive/evidence.jinja
data/help/commands/archive/get.jinja
data/help/commands/archive/index.jinja
data/help/commands/archive/list.jinja
data/help/commands/archive/pack.jinja
data/help/commands/archive/related.jinja
data/help/commands/archive/search.jinja
data/help/commands/root.jinja
data/help/topics/ai.jinja
data/help/topics/command.jinja
data/help/topics/overview.jinja
data/help/topics/recipe.jinja
data/help/topics/task.jinja
data/help/topics/uri.jinja
src/cli/archive.ts
src/cli/args.ts
src/cli/help.ts
src/cli/queue.ts
src/cli/stage-runtime.ts
src/document/database.ts
src/document/index.ts
src/document/shared-state-database.ts
src/facade/archive-uri.ts
src/facade/archive-view.ts
src/facade/build-queue.ts
src/facade/index.ts
src/facade/knowledge-graph-build.ts
src/facade/sdpub-coordinator.ts
src/facade/search-cache.ts
src/facade/spine-digest.ts
src/reader/chunk-batch/index.ts
src/serial.ts
src/wikilink/index.ts
src/wikilink/relation-discovery.ts
src/wikilink/types.ts
src/wikimatch/index.ts
src/wikimatch/options.ts
src/wikimatch/policy-judge.ts
src/wikimatch/range-suppression.ts
src/wikimatch/surface-blocklist.ts
src/wikimatch/surface-screening.ts
src/wikimatch/surface-window.ts
src/wikimatch/types.ts
src/wikipage/cache.ts
test/cli/archive.test.ts
test/cli/args.test.ts
test/cli/queue.test.ts
test/facade/archive-view.test.ts
test/facade/build-queue.test.ts
test/facade/knowledge-graph-build.test.ts
test/wikilink/relation-discovery.test.ts
test/wikimatch/policy-judge.test.ts
test/wikimatch/range-suppression.test.ts
test/wikimatch/surface-screening.test.ts
test/wikimatch/surface-window.test.ts

💤 Files with no reviewable changes (1)

src/wikimatch/surface-blocklist.ts

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/wikimatch/surface-window.ts`:
- Around line 19-24: Update the candidate filtering logic so protected stale
surfaces are matched against the same text-derived value used by
surface-window.ts. In screenCandidates in knowledge-graph-build.ts, stop relying
only on protectedSurfaces.has(candidate.surface) and instead compare against the
candidate text slice (or keep candidate.surface in sync with the judged surface
text) so protected items like the stale “北京” case are re-admitted correctly
after judgment.

In `@test/facade/archive-view.test.ts`:
- Around line 742-769: The scoping test is too permissive: the negative
assertion on result.items.map((item) => item.id) only catches a full leak, and
the scopedSecond expectations allow extra chapter-1 rows. Tighten the assertions
around listArchiveCollection and scopedSecond by verifying the exact set of
returned items/ids and asserting every item in scopedSecond has chapter 2, while
preserving the existing expected entity and triple matches.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 804a9f30-ec61-4925-a282-150b00821946

📥 Commits

Reviewing files that changed from the base of the PR and between ee35621 and 862264a.

📒 Files selected for processing (16)

data/help/commands/archive/evidence.jinja
data/help/commands/root.jinja
src/cli/archive.ts
src/cli/args.ts
src/document/shared-state-database.ts
src/facade/archive-view.ts
src/facade/knowledge-graph-build.ts
src/facade/spine-digest.ts
src/wikilink/relation-discovery.ts
src/wikimatch/policy-judge.ts
src/wikimatch/surface-window.ts
test/cli/archive.test.ts
test/facade/archive-view.test.ts
test/facade/knowledge-graph-build.test.ts
test/wikilink/relation-discovery.test.ts
test/wikimatch/surface-window.test.ts

✅ Files skipped from review due to trivial changes (2)

src/facade/spine-digest.ts
data/help/commands/archive/evidence.jinja

🚧 Files skipped from review as they are similar to previous changes (9)

data/help/commands/root.jinja
src/document/shared-state-database.ts
src/wikilink/relation-discovery.ts
src/cli/archive.ts
src/wikimatch/policy-judge.ts
src/facade/knowledge-graph-build.ts
test/cli/archive.test.ts
src/facade/archive-view.ts
src/cli/args.ts

coderabbitai · 2026-06-28T02:40:39Z

  return {
-    baseOffset: expandedRange.start,
-    surfaces,
-    text: text.slice(expandedRange.start, expandedRange.end),
+    candidates: suppressedCandidates.filter(
+      (candidate) =>
+        !suspiciousTexts.has(
+          options.text.slice(candidate.range.start, candidate.range.end),
+        ),


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Protected stale surfaces are still dropped after judgment.

This function now keys suspicious surfaces off text.slice(...), but screenCandidates in src/facade/knowledge-graph-build.ts still re-admits protected candidates via protectedSurfaces.has(candidate.surface). With stale surfaces like the new "stale" test data, the model can protect "北京" and the candidate is still filtered out because its stored surface never matches the protected text.

Suggested follow-up

- protectedSurfaces.has(candidate.surface), + protectedSurfaces.has( + input.text.slice(candidate.range.start, candidate.range.end), + ),

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/wikimatch/surface-window.ts` around lines 19 - 24, Update the candidate filtering logic so protected stale surfaces are matched against the same text-derived value used by surface-window.ts. In screenCandidates in knowledge-graph-build.ts, stop relying only on protectedSurfaces.has(candidate.surface) and instead compare against the candidate text slice (or keep candidate.surface in sync with the judged surface text) so protected items like the stale “北京” case are re-admitted correctly after judgment.

coderabbitai · 2026-06-28T02:40:39Z

+        expect(result.items.map((item) => item.id)).not.toEqual(
+          expect.arrayContaining([
+            "chapter:2",
+            "node:200",
+            "summary:2",
+            "wikigraph://triple/Q1/mentions/Q2",
+          ]),
+        );
+
+        const scopedSecond = await listArchiveCollection(document, {
+          chapters: [2],
+          types: ["entity", "triple"],
+        });
+
+        expect(scopedSecond.items).toEqual(
+          expect.arrayContaining([
+            expect.objectContaining({
+              chapter: 2,
+              id: "wikigraph://entity/Q1",
+              type: "entity",
+            }),
+            expect.objectContaining({
+              chapter: 2,
+              id: "wikigraph://triple/Q1/mentions/Q2",
+              type: "triple",
+            }),
+          ]),
+        );


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Tighten these scoping assertions.

Line 742 only fails if all forbidden IDs leak, and Line 756 still passes if chapter-1 rows slip into the chapter-2 result alongside the expected items. That makes this test too weak to catch partial chapter-filter regressions.

Suggested test hardening

- expect(result.items.map((item) => item.id)).not.toEqual( - expect.arrayContaining([ - "chapter:2", - "node:200", - "summary:2", - "wikigraph://triple/Q1/mentions/Q2", - ]), - ); + const resultIds = result.items.map((item) => item.id); + expect(resultIds).not.toContain("chapter:2"); + expect(resultIds).not.toContain("node:200"); + expect(resultIds).not.toContain("summary:2"); + expect(resultIds).not.toContain("wikigraph://triple/Q1/mentions/Q2"); @@ expect(scopedSecond.items).toEqual( expect.arrayContaining([ expect.objectContaining({ chapter: 2, id: "wikigraph://entity/Q1", type: "entity", }), expect.objectContaining({ chapter: 2, id: "wikigraph://triple/Q1/mentions/Q2", type: "triple", }), ]), ); + expect(scopedSecond.items.every((item) => item.chapter === 2)).toBe( + true, + );

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

expect(result.items.map((item) => item.id)).not.toEqual(

expect.arrayContaining([

"chapter:2",

"node:200",

"summary:2",

"wikigraph://triple/Q1/mentions/Q2",

]),

);

const scopedSecond = await listArchiveCollection(document, {

chapters: [2],

types: ["entity", "triple"],

});

expect(scopedSecond.items).toEqual(

expect.arrayContaining([

expect.objectContaining({

chapter: 2,

id: "wikigraph://entity/Q1",

type: "entity",

}),

expect.objectContaining({

chapter: 2,

id: "wikigraph://triple/Q1/mentions/Q2",

type: "triple",

}),

]),

);

const resultIds = result.items.map((item) => item.id);

expect(resultIds).not.toContain("chapter:2");

expect(resultIds).not.toContain("node:200");

expect(resultIds).not.toContain("summary:2");

expect(resultIds).not.toContain("wikigraph://triple/Q1/mentions/Q2");

const scopedSecond = await listArchiveCollection(document, {

chapters: [2],

types: ["entity", "triple"],

});

expect(scopedSecond.items).toEqual(

expect.arrayContaining([

expect.objectContaining({

chapter: 2,

id: "wikigraph://entity/Q1",

type: "entity",

}),

expect.objectContaining({

chapter: 2,

id: "wikigraph://triple/Q1/mentions/Q2",

type: "triple",

}),

]),

);

expect(scopedSecond.items.every((item) => item.chapter === 2)).toBe(

true,

);

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/facade/archive-view.test.ts` around lines 742 - 769, The scoping test is too permissive: the negative assertion on result.items.map((item) => item.id) only catches a full leak, and the scopedSecond expectations allow extra chapter-1 rows. Tighten the assertions around listArchiveCollection and scopedSecond by verifying the exact set of returned items/ids and asserting every item in scopedSecond has chapter 2, while preserving the existing expected entity and triple matches.

Moskize91 added 11 commits June 27, 2026 18:41

feat: introduce located WikiGraph URIs

4c56b17

fix: require LLM-judged mention links

d4c5c95

fix: replace surface screening with protected frequency filter

a7aa2e7

fix: confirm queued job cancellation

95eb270

refactor: fold candidate narrowing into grounding

c8a0920

fix: separate knowledge graph recall prompt

686da33

fix: serialize queue stop checks

5541bd1

fix: bootstrap shared sqlite databases

ff7a252

fix: ignore empty qids for skipped grounding

d9c7e38

fix: use request concurrency for grounding

1a817df

style: format recent changes

ee35621

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

fix: address wikigraph review findings

862264a

Moskize91 merged commit 1bc597c into main Jun 28, 2026
2 of 3 checks passed

coderabbitai Bot reviewed Jun 28, 2026

View reviewed changes

Moskize91 deleted the codex/uri-object-api branch June 28, 2026 02:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Improve WikiGraph URI and knowledge graph build#75

Improve WikiGraph URI and knowledge graph build#75
Moskize91 merged 12 commits into
mainfrom
codex/uri-object-api

Moskize91 commented Jun 28, 2026

Uh oh!

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Uh oh!

coderabbitai Bot Jun 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Moskize91 commented Jun 28, 2026

Summary

Verification

Uh oh!

coderabbitai Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Possibly Related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading