fix: preserve real prefix when middle ID segment is a reserved word #513
JOhnsonKC201 wants to merge 1 commit into
stripToValidPrefix collapsed any ID whose second segment was a valid prefix, treating e.g. "endpoint:service:x" as a double-prefix and returning "service:x". This dropped the real outer prefix, corrupting the node type and breaking edge references and idempotency. Only collapse a true same-prefix duplicate (e.g. "file:file:...") by requiring the inner segment to equal the outer prefix. A different reserved word in the middle is a legitimate path segment and is kept. Adds regression tests covering the middle-reserved-word case and idempotency.
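The difference between the old and new collapse rules can be sketched as follows. This is a simplified stand-in rather than the code in the repository: `stripSketch`, `CollapseRule`, and the three-entry prefix set are hypothetical, and only the `rest` / `innerColonIdx` / `segment` comparison mirrors the diff in this PR.

```typescript
// Hypothetical subset; the real VALID_PREFIXES set is not shown in this PR.
const VALID_PREFIXES = new Set(["file", "service", "endpoint"]);

type CollapseRule = (outer: string, inner: string) => boolean;

// Old rule: any reserved word in second position counts as a duplicate prefix.
const anyValidPrefix: CollapseRule = (_outer, inner) => VALID_PREFIXES.has(inner);
// New rule: only an exact repeat of the outer prefix counts as a duplicate.
const samePrefixOnly: CollapseRule = (outer, inner) => inner === outer;

// Simplified stand-in for stripToValidPrefix (assumption: returns null when
// the ID does not start with a valid prefix).
function stripSketch(
  id: string,
  shouldCollapse: CollapseRule,
): { prefix: string; path: string } | null {
  const colonIdx = id.indexOf(":");
  if (colonIdx <= 0) return null;
  const segment = id.slice(0, colonIdx);
  if (!VALID_PREFIXES.has(segment)) return null;
  const rest = id.slice(colonIdx + 1);
  const innerColonIdx = rest.indexOf(":");
  if (innerColonIdx > 0 && shouldCollapse(segment, rest.slice(0, innerColonIdx))) {
    // Treat the outer segment as a duplicate and retry on the remainder.
    return stripSketch(rest, shouldCollapse);
  }
  return { prefix: segment, path: rest };
}

// Old rule drops the real prefix: prefix "service", path "x".
console.log(stripSketch("endpoint:service:x", anyValidPrefix));
// New rule keeps it: prefix "endpoint", path "service:x".
console.log(stripSketch("endpoint:service:x", samePrefixOnly));
// A true duplicate still collapses: prefix "file", path "src/foo.ts".
console.log(stripSketch("file:file:src/foo.ts", samePrefixOnly));
```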
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bfe10eed8d
  const rest = remaining.slice(colonIdx + 1);
  const innerColonIdx = rest.indexOf(":");
- if (innerColonIdx > 0 && VALID_PREFIXES.has(rest.slice(0, innerColonIdx))) {
+ if (innerColonIdx > 0 && rest.slice(0, innerColonIdx) === segment) {
Use expected type when collapsing prefixed IDs
When an LLM emits a project-prefixed ID and the project name is also a reserved prefix (for example a file node with service:file:src/foo.ts, or any bad outer valid prefix before the expected prefix), this condition no longer recurses because the inner prefix differs from the outer one. normalizeNodeId then returns service:file:src/foo.ts for a node whose type is file, so the graph no longer uses the canonical type:path ID and edges that reference the canonical file:src/foo.ts form are dropped as dangling. The duplicate-prefix decision needs the expected node prefix to distinguish this case from legitimate middle path segments like endpoint:service:x.
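One possible reading of this suggestion, sketched under assumptions: the helper below takes the expected prefix for the node's type and drops outer segments only while the remainder already starts with that prefix. `normalizeIdForTypeSketch` and its signature are hypothetical and are not part of the PR or of `normalizeNodeId`.

```typescript
// Hypothetical helper: expectedPrefix is assumed to be derived from the
// node's type (e.g. "file" for a file node).
function normalizeIdForTypeSketch(id: string, expectedPrefix: string): string {
  let remaining = id;
  for (;;) {
    const colonIdx = remaining.indexOf(":");
    if (colonIdx <= 0) break;
    const rest = remaining.slice(colonIdx + 1);
    // Drop the outer segment only when what follows already begins with the
    // expected prefix; otherwise the next segment is part of the path.
    if (!rest.startsWith(expectedPrefix + ":")) break;
    remaining = rest;
  }
  return remaining;
}

// Bad outer prefix in front of the expected one is stripped.
console.log(normalizeIdForTypeSketch("service:file:src/foo.ts", "file")); // file:src/foo.ts
// A reserved word in the middle of an endpoint ID is kept.
console.log(normalizeIdForTypeSketch("endpoint:service:x", "endpoint")); // endpoint:service:x
// The same-prefix duplicate still collapses.
console.log(normalizeIdForTypeSketch("file:file:src/foo.ts", "file")); // file:src/foo.ts
```

Note that this sketch drops any outer segment, reserved or not, when it precedes the expected prefix; whether that matches the project's intended handling of project-prefixed IDs is an assumption.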
Problem
stripToValidPrefix in analyzer/normalize-graph.ts collapses any node ID whose second segment happens to be a valid prefix, treating it as a double-prefix duplicate. This corrupts IDs where a reserved word legitimately appears as a middle path segment.
For example, endpoint:service:getUser is parsed as:
- endpoint (valid prefix) ✓
- service (also a valid prefix) → wrongly assumed to be a duplicate prefix
…so the real endpoint prefix is dropped and the function returns { prefix: "service", path: "getUser" }, yielding service:getUser. This corrupts the node type and breaks edge references and idempotency.
Fix
Only collapse a true same-prefix duplicate (e.g. file:file:src/foo.ts) by requiring the inner segment to equal the outer prefix. A different reserved word in the middle is a legitimate path segment and is preserved. The genuine file:file:... double-prefix case still collapses as before.
Tests
Added two regression tests to normalize-graph.test.ts, covering the middle-reserved-word case and idempotency. In particular, endpoint:service:getUser is preserved unchanged (was previously corrupted to service:getUser).
Test run: pnpm --filter @understand-anything/core test, 755 passing (including the existing file:file: double-prefix test, which still passes).
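The properties these tests check can be sketched without the project's test harness. Everything here is illustrative: `normalizeSketch`, `expectEqual`, and the `RESERVED` set are hypothetical stand-ins, and the real assertions in normalize-graph.test.ts are not shown in this PR.

```typescript
// Hypothetical subset of reserved prefixes, for illustration only.
const RESERVED = new Set(["file", "service", "endpoint"]);

// Simplified stand-in for the normalizer, using the same-prefix-only rule.
function normalizeSketch(id: string): string {
  let remaining = id;
  for (;;) {
    const colonIdx = remaining.indexOf(":");
    if (colonIdx <= 0) return remaining;
    const segment = remaining.slice(0, colonIdx);
    if (!RESERVED.has(segment)) return remaining;
    const rest = remaining.slice(colonIdx + 1);
    // Stop unless the next segment repeats the outer prefix exactly.
    if (!rest.startsWith(segment + ":")) return remaining;
    remaining = rest; // drop one duplicated prefix and re-check
  }
}

// Minimal assertion helper standing in for a test framework's matcher.
function expectEqual(actual: string, expected: string): void {
  if (actual !== expected) {
    throw new Error(`expected ${expected}, got ${actual}`);
  }
}

// Middle reserved word is preserved.
expectEqual(normalizeSketch("endpoint:service:getUser"), "endpoint:service:getUser");

// Idempotency: normalizing an already-normalized ID changes nothing.
const once = normalizeSketch("endpoint:service:getUser");
expectEqual(normalizeSketch(once), once);

// The existing same-prefix duplicate case still collapses.
expectEqual(normalizeSketch("file:file:src/foo.ts"), "file:src/foo.ts");
```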