feat(offload): reasoning offload for codegraph_explore (bring-your-own endpoint)#918

Open
colbymchenry wants to merge 1 commit into main from feat/offload-byo

@colbymchenry

Owner

What

Adds an opt-in reasoning offload to codegraph_explore. When configured, explore runs retrieval locally as usual, then hands the assembled source + your question to a reasoning model you bring — any OpenAI-compatible endpoint (Cerebras, OpenAI, a local vLLM/Ollama) with your own key — and returns that model's tight, cited answer instead of the raw source dump. Your agent's main context gets the answer in far fewer tokens, at the cost of one network round-trip.

Off by default; nothing changes unless you turn it on.

Usage

codegraph offload set-endpoint https://api.cerebras.ai/v1 \
    --model gpt-oss-120b --key-env CEREBRAS_API_KEY
codegraph offload status     # show endpoint / model / key source
codegraph offload disable

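The same settings can also come from environment variables, which override the saved config (useful in CI): CODEGRAPH_OFFLOAD_URL, CODEGRAPH_OFFLOAD_MODEL, CODEGRAPH_OFFLOAD_KEY, CODEGRAPH_OFFLOAD_EFFORT, and CODEGRAPH_OFFLOAD_STYLE.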
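The precedence described above (environment first, saved config second, key looked up by env-var name at call time) can be sketched roughly as follows. This is an illustration, not the code in src/reasoning/config.ts: the `SavedOffloadConfig` field names, the `resolveOffload` helper, and the choice to treat a missing URL as "offload off" are all assumptions made for the example.

```typescript
// Hypothetical shape of the saved config; the real field names may differ.
interface SavedOffloadConfig {
  url?: string;
  model?: string;
  keyEnv?: string; // NAME of an env var, never the key itself
}

interface ResolvedOffload {
  url: string;
  model: string | undefined;
  apiKey: string | undefined;
}

type Env = Record<string, string | undefined>;

// Resolve the effective settings. Env vars win over the saved file.
// Returns null when no endpoint is configured (assumed to mean "off").
function resolveOffload(saved: SavedOffloadConfig, env: Env): ResolvedOffload | null {
  // `||` treats an empty env var the same as an unset one.
  const url = env["CODEGRAPH_OFFLOAD_URL"] || saved.url;
  if (!url) return null;

  const model = env["CODEGRAPH_OFFLOAD_MODEL"] || saved.model;

  // A direct key from the environment wins; otherwise read the env var
  // whose name was saved, so the key itself never has to be stored.
  const fromNamedVar = saved.keyEnv ? env[saved.keyEnv] : undefined;
  const apiKey = env["CODEGRAPH_OFFLOAD_KEY"] || fromNamedVar;

  return { url, model, apiKey };
}
```

In a real process the second argument would be `process.env`; passing it in explicitly keeps the function pure and easy to test.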

Design

  • BYO, privacy-preserving. Point it at any OpenAI-compatible endpoint. Nothing but the assembled context + the question leaves the machine. The API key is never written to disk — the config (~/.codegraph/config.json) stores the name of an env var, and the key is read from it at call time.
  • Correctness-first. The synthesis prompt leads with a Coverage: full / partial / not found verdict and cites file:line for every claim, so answers stay verifiable. Validated against gpt-oss-120b-class models at low temperature; quality tracks the model you choose.
  • Strictly degradable. Any failure — no endpoint, network error, timeout, non-2xx, empty answer — returns null and the call falls back to returning the local source. The offload can never surface an error to the agent.
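To make the "strictly degradable" contract concrete, here is a minimal sketch of a client that maps every failure mode listed above to null. It is not the code in src/reasoning/reasoner.ts. The `HttpPost` transport type, the function names, the prompt wording, and the 30-second default timeout are assumptions for illustration; the request and response shapes follow the OpenAI-style chat completions format.

```typescript
// Hypothetical transport abstraction so the sketch does not depend on a
// particular runtime's fetch typings.
type HttpPost = (
  url: string,
  headers: Record<string, string>,
  body: string,
) => Promise<{ status: number; text: string }>;

// Turn a raw HTTP response into an answer, or null for non-2xx status,
// unparseable JSON, or an empty answer.
function answerFromResponse(status: number, text: string): string | null {
  if (status < 200 || status >= 300) return null;
  try {
    const parsed = JSON.parse(text) as {
      choices?: Array<{ message?: { content?: unknown } }>;
    } | null;
    const content = parsed?.choices?.[0]?.message?.content;
    if (typeof content !== "string") return null;
    const trimmed = content.trim();
    return trimmed.length > 0 ? trimmed : null;
  } catch {
    return null;
  }
}

// Resolve to null if `p` has not settled within `ms` milliseconds.
function withTimeout<T>(p: Promise<T>, ms: number): Promise<T | null> {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => resolve(null), ms);
    p.then(
      (v) => { clearTimeout(timer); resolve(v); },
      (e) => { clearTimeout(timer); reject(e); },
    );
  });
}

// Never throws: network errors, timeouts, and bad responses all yield null.
async function offloadReason(
  cfg: { url: string; model: string; apiKey?: string },
  context: string,
  question: string,
  post: HttpPost,
  timeoutMs = 30_000, // assumed default, not taken from the PR
): Promise<string | null> {
  try {
    const headers: Record<string, string> = { "Content-Type": "application/json" };
    if (cfg.apiKey) headers["Authorization"] = `Bearer ${cfg.apiKey}`;
    const body = JSON.stringify({
      model: cfg.model,
      messages: [
        // Illustrative prompt only; the PR's actual synthesis prompt is not shown.
        { role: "system", content: "Start with 'Coverage: full', 'Coverage: partial', or 'Coverage: not found'. Cite file:line for every claim." },
        { role: "user", content: `${question}\n\n${context}` },
      ],
    });
    const endpoint = `${cfg.url.replace(/\/+$/, "")}/chat/completions`;
    const request = post(endpoint, headers, body).then((r) => answerFromResponse(r.status, r.text));
    return await withTimeout(request, timeoutMs);
  } catch {
    return null;
  }
}
```

Keeping the response parsing in a pure function means each degradation path can be tested without a network, and the single catch in `offloadReason` is what guarantees the caller only ever sees a string or null.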

Layout

  • src/reasoning/config.ts: ~/.codegraph/config.json + CODEGRAPH_OFFLOAD_* resolution (env overrides file).
  • src/reasoning/reasoner.ts: the degradable synthesis client.
  • src/mcp/tools.ts: opt-in hook in handleExplore, after the output is assembled.
  • src/bin/codegraph.ts: codegraph offload set-endpoint / status / disable.
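The hook in handleExplore can be pictured as a thin wrapper around the already-assembled output. The sketch below assumes a hypothetical `Reasoner` callback and an `exploreWithOffload` helper; neither name comes from the PR, and the real hook may be structured differently.

```typescript
// Hypothetical signature for the synthesis client: resolves to an answer,
// or to null when the offload is unavailable or failed.
type Reasoner = (context: string, question: string) => Promise<string | null>;

// Return the reasoning model's answer when there is one, otherwise the
// local source output that explore had already assembled.
async function exploreWithOffload(
  localOutput: string,
  question: string,
  reasoner?: Reasoner, // undefined means the offload is not configured
): Promise<string> {
  if (!reasoner) return localOutput;
  try {
    const answer = await reasoner(localOutput, question);
    return answer ?? localOutput;
  } catch {
    // Defensive: the client is expected to return null rather than throw,
    // but the hook still falls back if it ever does.
    return localOutput;
  }
}
```

Because the fallback value is computed before the offload is attempted, a failed round-trip costs latency but never changes what the agent would have received with the feature off.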

Tests

__tests__/offload.test.ts (12) — config round-trip, key-never-on-disk, env-overrides-file, key resolution, and every degradation path (no endpoint / reject / non-2xx / empty). Full suite green.
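As an illustration of what a key-never-on-disk check can look like (this is not the content of __tests__/offload.test.ts), the sketch below serializes settings by copying only known fields and then scans the output for a secret. `OffloadSettings`, `serializeSettings`, `leaksSecret`, and the `offload` wrapper key are hypothetical names.

```typescript
// Hypothetical persisted settings: only the NAME of the key's env var is kept.
interface OffloadSettings {
  url: string;
  model: string;
  keyEnv: string;
}

// Copy fields explicitly so an accidental extra property (such as a raw
// key on the input object) is never written out.
function serializeSettings(s: OffloadSettings): string {
  return JSON.stringify(
    { offload: { url: s.url, model: s.model, keyEnv: s.keyEnv } },
    null,
    2,
  );
}

// True if the secret value appears anywhere in the serialized text.
function leaksSecret(serialized: string, secret: string): boolean {
  return secret.length > 0 && serialized.includes(secret);
}
```

A test in this style would build settings with a known fake key in the environment, serialize them, and assert that `leaksSecret` is false while the env-var name is still present in the output.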

🤖 Generated with Claude Code

…n endpoint)

codegraph_explore can now hand the source it retrieved to a reasoning model you
point at — any OpenAI-compatible endpoint (Cerebras, OpenAI, a local vLLM/Ollama)
with your own key — and return that model's tight, cited answer instead of the
raw source dump. The agent's main context gets the answer in far fewer tokens, at
the cost of one network round-trip.

Off by default. Configure with `codegraph offload set-endpoint <url> --model <m>
--key-env <ENV>` (or the CODEGRAPH_OFFLOAD_* env vars); status/disable manage it.
The API key is never written to disk — the config stores the NAME of an env var
and the key is read from it at call time. Strictly degradable: any failure
(no endpoint, network, timeout, empty answer) returns null and the call falls
back to the local source, so the offload can never surface an error to the agent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>