feat(#39): structural call-graph reasoning (ATLAS_CALL_GRAPH) #125
Open
yogthos wants to merge 3 commits into
added 3 commits
June 10, 2026 15:55
…ATLAS_CALL_GRAPH)

Adds a precomputed, cached call graph and rewires the structural veto to use it, behind ATLAS_CALL_GRAPH (default off, strictly additive). This is the deeper solver-backed layer that issue itigges22#39 asked for but that V3.1 never landed: there was no stored call graph, no transitive reachability, and the veto was a shape check rather than real resolution.

The graph engine in v3-service/graph/ is a faithful Python port of the chiasmus call-graph engine. It extracts defines, calls, imports, exports, and contains facts from tree-sitter, resolves Python imports across files, and answers callers, callees, reachability, path, impact, cycles, dead-code, and entry-points natively in linear time. There is no solver: like chiasmus, the everyday queries are native graph traversal, and Prolog facts are emitted (facts.py) only so an optional solver layer can be added later without re-plumbing. A per-file content-hash cache makes recompute incremental.

The port is pinned to the reference, not just plausible. Golden tests compare against chiasmus's own extractGraph and native-analyses, captured via npx tsx: extraction of a fixture with classes, methods, decorators, comprehension calls, and aliased and relative imports matches exactly, and the analyses match including path tie-breaks and impact ordering.

Phase 1 deepens the structural veto. graph/resolve_calls.py resolves a candidate's direct-identifier calls against the import graph. A call resolves if its name is bound anywhere in the file, is a builtin, is an imported name, or is supplied by a wildcard import whose module's actual exports are resolved through the graph. The deepening over the shipped veto is that a bare call to a name that merely exists in some unimported project file is now flagged, which catches broken cross-file references. It stays conservative: attribute and method calls are never flagged, an opaque stdlib wildcard keeps it lenient, and the veto never empties the candidate set. It is wired into the pipeline after the existing structural veto and gated by the flag.

Two rounds of review ran against this. The first fixed cross-request cache corruption, set-iteration nondeterminism in the analyses, an escape_atom trailing-newline case, and an unbounded-recursion gap. The second caught the real blockers in the Phase 1 veto: assigned callables and function parameters were being flagged as unresolved because the resolution set only had def and class names. That is fixed by resolving against all names bound anywhere in the file, with regression tests for callbacks, higher-order functions, and loop, with, walrus, and comprehension targets.

The graph package ships in the v3-service image, CI installs the tree-sitter grammars and runs the suite, and the design and phase status live in docs/reports/CALL_GRAPH_REASONING_V3.md.

Tests: 68 under tests/v3-service, all green.

Credit to Dmitri Sotnikov (@yogthos) for chiasmus, which itigges22#39 cites.
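The Phase 1 resolution rule described in this commit can be illustrated with a small sketch. This is not the code in graph/resolve_calls.py: it uses the standard-library `ast` module instead of tree-sitter, the function names are hypothetical, and it treats any wildcard import as opaque (staying lenient) rather than resolving the module's exports through the graph.

```python
import ast
import builtins


def bound_names(tree):
    """Collect every name bound anywhere in the module, regardless of scope."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.arg):
            names.add(node.arg)  # function parameters, e.g. callbacks
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)  # assignment, for, with, walrus, comprehension targets
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name != "*":
                    names.add((alias.asname or alias.name).split(".")[0])
    return names


def has_wildcard_import(tree):
    """True if the module contains any `from x import *`."""
    return any(
        isinstance(node, ast.ImportFrom) and any(a.name == "*" for a in node.names)
        for node in ast.walk(tree)
    )


def unresolved_calls(source):
    """Return sorted bare-identifier call names that nothing in the file binds."""
    tree = ast.parse(source)
    if has_wildcard_import(tree):
        return []  # simplification: a wildcard could supply any name, so stay lenient
    known = bound_names(tree) | set(dir(builtins))
    flagged = set()
    for node in ast.walk(tree):
        # Only direct-identifier calls are checked; attribute and method
        # calls have an ast.Attribute func and are never flagged.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id not in known:
                flagged.add(node.func.id)
    return sorted(flagged)
```

Because parameters and every `Store` target count as bound, a callback such as `cb()` or a loop variable such as `f()` resolves, which is the behaviour the second review round asked for.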
Completes the structural call-graph feature end to end on top of the phase 0/1 substrate, all behind ATLAS_CALL_GRAPH and strictly additive.

Phase 2 deepens the repair context. graph/context.py repair_context builds a real reachability slice for the failing function: the call path from an entry point down to it, its transitive impact set, and its callees, bounded for the token budget. It is wired into the phase-3 repair block, preferring the graph block and falling back to the shipped 1-hop call_chain_context on flag-off or any failure.

Phase 3 deepens context injection. symbol_neighborhood returns a named symbol's callers, callees, impact set, and defining files, attached to the /internal/symbol_index response as an additive graph field so the matched and skipped shape is unchanged for flag-off callers.

Phase 4 adds the optional tier signal. analyses.complexity reports per-node fan-in and fan-out plus the graph maxima, exposed via the complexity analysis. Wiring it into the Go tier classifier is intentionally left out, since point 2 already ships through cyclomatic complexity and the plan gates this on a tier-accuracy measurement that needs the live stack.

Phase 5 adds the solver layer without a heavy dependency. The Prolog facts already emit from phase 0 for an external SWI-Prolog or chiasmus_verify. graph/datalog.py adds a compact in-process Datalog engine, facts plus structured rules evaluated to a bounded fixpoint, shipping the built-in transitive reaches closure and supporting arbitrary in-process rules. Its reaches is cross-checked against the native reachability for all node pairs in the suite, so the solver layer is provably consistent with the native one.

Phase 6 adds JavaScript. extract.py dispatches by extension to a Python or JavaScript walk, with JS covering defines, calls, imports, and contains. build_graph ingests both and the analyses run on the merged graph. The tree_sitter_javascript grammar is added to the image and CI, and JS extraction is golden-tested against chiasmus's own extractGraph: defines, calls, and imports match exactly, including arrow-const functions and new-expressions not being counted as calls.

Tests: 86 under tests/v3-service, all green; ruff clean; Python 3.9 safe. The design doc records the full phase status.

The standing caveat is unchanged: the graph logic is unit-tested and golden-pinned against chiasmus, but the effect inside the live run, repair, and symbol-index paths needs the GPU, llama, and sandbox stack to validate.
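The Phase 5 cross-check between the Datalog `reaches` closure and native reachability can be sketched roughly as follows. This is an illustration, not the contents of graph/datalog.py: the function names, the tuple-set fact representation, and the `max_iter` default are assumptions, and the two rules are hard-coded rather than expressed as structured rules.

```python
from collections import deque


def reaches_fixpoint(calls, max_iter=1000):
    """Naive bottom-up evaluation, to a bounded fixpoint, of:

        reaches(X, Y) :- calls(X, Y).
        reaches(X, Z) :- reaches(X, Y), calls(Y, Z).
    """
    reaches = set(calls)
    for _ in range(max_iter):
        derived = {(x, z) for (x, y) in reaches for (y2, z) in calls if y == y2}
        new = derived - reaches
        if not new:
            return reaches  # nothing new derived: fixpoint reached
        reaches |= new
    raise RuntimeError("fixpoint not reached within max_iter")


def reachable_bfs(calls, start):
    """Native traversal: every node reachable from start via one or more calls."""
    adjacency = {}
    for caller, callee in calls:
        adjacency.setdefault(caller, []).append(callee)
    seen = set()
    queue = deque(adjacency.get(start, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(adjacency.get(node, []))
    return seen


def consistent(calls):
    """Check that the closure and the BFS agree for every ordered node pair."""
    nodes = {n for edge in calls for n in edge}
    closure = reaches_fixpoint(calls)
    return all(
        ((x, y) in closure) == (y in reachable_bfs(calls, x))
        for x in nodes
        for y in nodes
    )
```

Each pass of the naive loop extends derived paths by one edge, which is why the number of passes tracks the longest shortest path in the graph rather than the number of tuples.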
A wiring audit found two real end-to-end gaps and a few correctness and performance issues. All fixed.

The flag never reached the v3-service container, so the whole feature was dead in the Docker deployment, the same bug pattern as ATLAS_RPG_PLANNING. It is now forwarded in docker-compose.yml and documented in .env.example.

Phase 3 was half-wired. v3-service produced the symbol-index "graph" field but the proxy discarded it: the response struct had no Graph field and the consumer only used matched and skipped. The proxy now parses symbolGraphNode, and formatGraphNeighborhood folds each matched symbol's callers and callees into the injected system-note context in agent.go, so the neighborhood actually reaches the model. Added Go tests for parsing and formatting.

On correctness and performance: the symbol-index handler rebuilt the whole project graph once per matched symbol, so it now builds the graph once and neighborhoods each symbol against it. The in-process Datalog engine raised a KeyError on a caller-supplied rule with an unbound head variable; run now skips unsafe derivations instead of crashing. repair_context gated on Python only while build_graph also handles JavaScript, so it now gates on is_supported. And the transitive-impact dedup hoists its set out of the loop.

The datalog max_iter cap was reviewed and left as is: naive fixpoint passes are bounded by the call graph's diameter, not its tuple count, so the cap is not reachable on realistic input and remains a genuine runaway guard. The call_graph endpoint and its closure and complexity analyses stay as deliberate query capabilities with no in-pipeline caller.

Tests: 88 under tests/v3-service plus the proxy Go suite, all green; ruff clean.
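The unbound-head-variable fix mentioned in this commit can be pictured with a minimal sketch. The rule representation here (a `(predicate, args)` tuple with capitalised strings as variables) and the names `is_variable` and `instantiate` are hypothetical, chosen for illustration only; the real engine's internals may differ.

```python
def is_variable(term):
    """Assumed convention: a term is a variable if it starts with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def instantiate(head, binding):
    """Substitute a body match's variable bindings into a rule head.

    Returns None when a head variable has no binding, so the caller can
    skip the unsafe derivation instead of hitting a KeyError.
    """
    predicate, args = head
    grounded = []
    for term in args:
        if is_variable(term):
            if term not in binding:
                return None  # unbound head variable: skip this derivation
            grounded.append(binding[term])
        else:
            grounded.append(term)  # constants pass through unchanged
    return (predicate, tuple(grounded))
```

For example, a head `("bad", ("X", "W"))` matched with only `X` bound yields `None`, whereas a plain `binding[term]` lookup would raise.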
Adds a precomputed, cached call graph and rewires the structural veto to use it, behind ATLAS_CALL_GRAPH (default off, strictly additive). This is the deeper solver-backed layer that issue #39 asked for but that V3.1 never landed: there was no stored call graph, no transitive reachability, and the veto was a shape check rather than real resolution.
The graph engine in v3-service/graph/ is a faithful Python port of the chiasmus call-graph engine. It extracts defines, calls, imports, exports, and contains facts from tree-sitter, resolves Python imports across files, and answers callers, callees, reachability, path, impact, cycles, dead-code, and entry-points natively in linear time.
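As a rough picture of what native traversal means for these queries, the sketch below answers callers, callees, reachability, and impact with plain breadth-first search over adjacency maps. The `CallGraph` class and its method names are hypothetical, not the API of v3-service/graph/, and it assumes that impact means the set of transitive callers of a node.

```python
from collections import defaultdict, deque


class CallGraph:
    """Forward and reverse adjacency over (caller, callee) edges."""

    def __init__(self, edges):
        self.out = defaultdict(set)  # caller -> callees
        self.inc = defaultdict(set)  # callee -> callers
        for caller, callee in edges:
            self.out[caller].add(callee)
            self.inc[callee].add(caller)

    def callees(self, name):
        return sorted(self.out.get(name, ()))

    def callers(self, name):
        return sorted(self.inc.get(name, ()))

    def _walk(self, start, adjacency):
        # Breadth-first search; each node and edge is visited at most once.
        seen = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in sorted(adjacency.get(node, ())):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def reachable(self, start):
        """Everything start can end up calling, directly or transitively."""
        return sorted(self._walk(start, self.out))

    def impact(self, name):
        """Assumed meaning: everything that transitively calls name."""
        return sorted(self._walk(name, self.inc))
```

Results are returned sorted so that output does not depend on set iteration order.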
I omitted the solver since everyday queries are native graph traversal, and Prolog facts are emitted (facts.py) only so an optional solver layer can be added later without re-plumbing if genuinely useful. A per-file content-hash cache makes recompute incremental.
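The per-file content-hash cache can be sketched as below. This is an illustration under assumptions: the `FileFactCache` class, its `facts_for` method, and the choice of SHA-256 are all hypothetical, and the real cache may key or store entries differently.

```python
import hashlib


class FileFactCache:
    """Re-extract facts for a file only when its content hash changes."""

    def __init__(self, extract):
        self._extract = extract  # callable: source text -> facts
        self._entries = {}       # path -> (digest, facts)

    def facts_for(self, path, content):
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]  # unchanged file: reuse the stored facts
        facts = self._extract(content)
        self._entries[path] = (digest, facts)
        return facts
```

With a cache of this shape, rebuilding the graph after an edit only pays the extraction cost for files whose bytes actually changed.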
Golden tests compare against chiasmus's own extractGraph and native-analyses: extraction of a fixture with classes, methods, decorators, comprehension calls, and aliased and relative imports matches exactly, and the analyses match including path tie-breaks and impact ordering.