Add optional CodeGraph structural context tool#211
Conversation
There was a problem hiding this comment.
🔍 OpenCodeReview found 6 issue(s) in this PR.
- ✅ 5 posted as inline comment(s)
- 📝 1 posted as summary
📄 internal/tool/definitions.go
Truncating by byte index (out[:codeGraphMaxOutput]) can split a multi-byte UTF-8 character, producing invalid UTF-8 output. Consider truncating at a valid rune boundary instead, e.g.:
if len(out) > codeGraphMaxOutput {
truncated := out[:codeGraphMaxOutput]
// Trim to last valid rune boundary
for len(truncated) > 0 && !utf8.ValidString(truncated) {
truncated = truncated[:len(truncated)-1]
}
out = truncated + "\n\n[truncated: CodeGraph output exceeded tool limit]"
}| } | ||
|
|
||
| // FilterByName returns entries excluding any tool whose name appears in names. | ||
| func FilterByName(entries []ToolConfigEntry, names ...string) []ToolConfigEntry { |
There was a problem hiding this comment.
The function name FilterByName is ambiguous — it could be interpreted as "keep only entries matching these names" (inclusive filter) rather than "exclude entries matching these names" (exclusive filter). Consider renaming to something like ExcludeByName or FilterOutByName to make the exclusion semantics clear from the name alone, reducing the chance of misuse by future callers.
Suggestion:
| func FilterByName(entries []ToolConfigEntry, names ...string) []ToolConfigEntry { | |
| func ExcludeByName(entries []ToolConfigEntry, names ...string) []ToolConfigEntry { |
| FileFind = Tool{name: "file_find"} | ||
| FileReadDiff = Tool{name: "file_read_diff"} | ||
| CodeSearch = Tool{name: "code_search"} | ||
| CodeGraph = Tool{name: "code_graph_context"} |
There was a problem hiding this comment.
Naming inconsistency: All other tool variables match their string names (e.g., CodeSearch → "code_search", FileRead → "file_read"). Here, the variable is CodeGraph but the tool name is "code_graph_context". Consider renaming the variable to CodeGraphContext to maintain consistency and avoid confusion during maintenance.
Suggestion:
| CodeGraph = Tool{name: "code_graph_context"} | |
| CodeGraphContext = Tool{name: "code_graph_context"} |
| if len(out) > codeGraphMaxOutput { | ||
| out = out[:codeGraphMaxOutput] + "\n\n[truncated: CodeGraph output exceeded tool limit]" | ||
| } |
There was a problem hiding this comment.
Bug: UTF-8 unsafe truncation. out[:codeGraphMaxOutput] slices by byte index, which can cut in the middle of a multi-byte UTF-8 character, producing invalid UTF-8 output. This could cause rendering issues or downstream parsing errors.
Use a UTF-8-safe truncation approach, for example:
if len(out) > codeGraphMaxOutput {
truncated := out[:codeGraphMaxOutput]
// Trim back to valid UTF-8 boundary
for len(truncated) > 0 && !utf8.ValidString(truncated) {
truncated = truncated[:len(truncated)-1]
}
out = truncated + "\n\n[truncated: CodeGraph output exceeded tool limit]"
}Or use strings.ToValidUTF8() after slicing.
| if kind := strings.TrimSpace(stringArg(args, "kind")); kind != "" { | ||
| cmdArgs = append(cmdArgs, "-k", kind) | ||
| } | ||
| cmdArgs = append(cmdArgs, query) |
There was a problem hiding this comment.
Potential flag injection via query/kind arguments. User-supplied query and kind values are passed directly as CLI arguments to the external codegraph binary. If query starts with -, it could be interpreted as a flag by the subprocess (e.g., --help, -p, etc.), leading to unexpected behavior or information disclosure.
Consider adding a -- separator before positional arguments to prevent flag injection, e.g.:
cmdArgs = append(cmdArgs, "--", query)This should be applied consistently across all modes where user input is appended as a positional argument.
| err := cmd.Run() | ||
| if ctx.Err() != nil { | ||
| return "code_graph_context timed out. Try using mode=search with a specific symbol, or reduce max_files/limit.", nil | ||
| } |
There was a problem hiding this comment.
Timeout error handling discards partial output and original error. When a timeout occurs (ctx.Err() != nil), this check runs before examining err from cmd.Run(). This means:
- Any partial stdout/stderr that was captured before the timeout is silently discarded, making debugging difficult.
- If both a timeout and another error occur, only the timeout message is returned.
Consider including any partial output in the timeout response to aid debugging, e.g.:
if ctx.Err() != nil {
msg := "code_graph_context timed out. Try using mode=search with a specific symbol, or reduce max_files/limit."
if partial := strings.TrimSpace(stdout.String()); partial != "" {
msg += "\n\nPartial output:\n" + stripANSI(partial)
}
return msg, nil
}There was a problem hiding this comment.
Addressed in 3ece64b: timeout handling now builds the response through codeGraphTimeoutMessage, including any captured stdout and stderr as partial output while keeping the existing timeout guidance. I also added a focused unit test for that message construction.
|
chenyangyang seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Summary
Closes #210.
This adds an optional
code_graph_contextreview tool backed by the external CodeGraph CLI. The tool is meant for structural code-review context — symbol lookup, exploration, callers, callees, and impact checks — while keepingcode_searchas the literal text-search tool.Behavior
The CodeGraph tool is hidden from the model unless all of the following are true:
.codegraph/codegraph.dbexists in the repositorycodegraphis available onPATH1.x)codegraph status <repo>succeedsHEADmatches the review target ref used for file readsIf any check fails, OCR filters
code_graph_contextout of the tool definitions and does not register the provider, so existing users keep the current behavior.CI/CD note
The current GitHub Actions and GitLab CI examples install OpenCodeReview only; they do not install CodeGraph or build a
.codegraphindex. In those default CI/CD jobs, this tool will therefore be hidden automatically.Teams that want structural context in CI need to opt in explicitly, for example by using a job image that includes CodeGraph and running
codegraph init/codegraph syncafter checkout and beforeocr review.Implementation notes
explore,search,callers,callees, andimpactmodes.tools.jsonso the agent uses it for public API, signature, interface, model, route, auth/security, concurrency, lifecycle, or cross-file impact risks rather than simple text lookup.Tests
go test ./...