feat(#39): structural call-graph reasoning (ATLAS_CALL_GRAPH) #125

Open
yogthos wants to merge 3 commits into itigges22:main from yogthos:feat/call-graph-reasoning

Conversation

yogthos commented Jun 10, 2026

Adds a precomputed, cached call graph and rewires the structural veto to use it, behind ATLAS_CALL_GRAPH (default off, strictly additive). This is the deeper solver-backed layer that issue #39 asked for but that V3.1 never landed: there was no stored call graph, no transitive reachability, and the veto was a shape check rather than real resolution.

The graph engine in v3-service/graph/ is a faithful Python port of the chiasmus call-graph engine. It extracts defines, calls, imports, exports, and contains facts from tree-sitter, resolves Python imports across files, and answers callers, callees, reachability, path, impact, cycles, dead-code, and entry-points natively in linear time.
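
To make "native graph traversal" concrete, here is a minimal sketch of callers and reachability over a plain adjacency dict. It is not the code in v3-service/graph/: the `CALLS` data and the function names `callers` and `reachable` are illustrative assumptions, and the real engine works on facts extracted by tree-sitter.

```python
from collections import deque

# Hypothetical call edges (caller -> callees); not taken from the PR's fixtures.
CALLS = {
    "main": ["load", "run"],
    "run": ["step"],
    "step": ["run"],   # run and step form a cycle
    "load": [],
    "unused": ["load"],
}


def callers(calls, target):
    """Direct callers of target, sorted so the output is deterministic."""
    return sorted(src for src, dsts in calls.items() if target in dsts)


def reachable(calls, start):
    """Every node reachable from start by following at least one call edge."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in calls.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


print(callers(CALLS, "load"))
print(reachable(CALLS, "main"))
```

Sorting the results is one simple way to avoid depending on set iteration order, which matters when outputs are compared against golden files.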

I omitted the solver because everyday queries are native graph traversal. Prolog facts are still emitted (facts.py), but only so that an optional solver layer can be added later without re-plumbing, should one prove genuinely useful. A per-file content-hash cache makes recompute incremental.
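
The per-file content-hash cache can be sketched roughly as follows. This is an assumption-laden illustration, not the PR's cache: the class name `FileFactCache`, its `facts_for` method, and the choice of SHA-256 are all hypothetical.

```python
import hashlib


class FileFactCache:
    """Caches per-file extraction results, keyed by a hash of the file content."""

    def __init__(self, extract):
        self._extract = extract   # callable: source text -> extracted facts
        self._entries = {}        # path -> (content digest, facts)
        self.misses = 0           # counts how often extraction actually ran

    def facts_for(self, path, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        cached = self._entries.get(path)
        if cached is not None and cached[0] == digest:
            return cached[1]      # content unchanged: reuse the old facts
        self.misses += 1
        facts = self._extract(source)
        self._entries[path] = (digest, facts)
        return facts


# Stand-in extractor for the demo; a real one would walk a parse tree.
cache = FileFactCache(lambda text: text.split())
cache.facts_for("a.py", "def f(): pass")
cache.facts_for("a.py", "def f(): pass")    # served from the cache
cache.facts_for("a.py", "def g(): pass")    # content changed, re-extracted
print(cache.misses)
```

Keying on content rather than modification time means a file that is touched but not changed does not trigger re-extraction.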

Golden tests compare against chiasmus's own extractGraph and native-analyses on a fixture with classes, methods, decorators, comprehension calls, and aliased and relative imports. The analyses match, including path tie-breaks and impact ordering.

yogthos added 3 commits June 10, 2026 15:55
…ATLAS_CALL_GRAPH)

Adds a precomputed, cached call graph and rewires the structural veto to use it,
behind ATLAS_CALL_GRAPH (default off, strictly additive). This is the deeper
solver-backed layer that issue itigges22#39 asked for but that V3.1 never landed: there
was no stored call graph, no transitive reachability, and the veto was a shape
check rather than real resolution.

The graph engine in v3-service/graph/ is a faithful Python port of the chiasmus
call-graph engine. It extracts defines, calls, imports, exports, and contains
facts from tree-sitter, resolves Python imports across files, and answers
callers, callees, reachability, path, impact, cycles, dead-code, and
entry-points natively in linear time. There is no solver: like chiasmus, the
everyday queries are native graph traversal, and Prolog facts are emitted
(facts.py) only so an optional solver layer can be added later without
re-plumbing. A per-file content-hash cache makes recompute incremental.

The port is pinned to the reference, not just plausible. Golden tests compare
against chiasmus's own extractGraph and native-analyses, captured via npx tsx:
extraction of a fixture with classes, methods, decorators, comprehension calls,
and aliased and relative imports matches exactly, and the analyses match
including path tie-breaks and impact ordering.

Phase 1 deepens the structural veto. graph/resolve_calls.py resolves a
candidate's direct-identifier calls against the import graph. A call resolves if
its name is bound anywhere in the file, is a builtin, is an imported name, or is
supplied by a wildcard import whose module's actual exports are resolved through
the graph. The deepening over the shipped veto is that a bare call to a name
that merely exists in some unimported project file is now flagged, which catches
broken cross-file references. It stays conservative: attribute and method calls
are never flagged, an opaque stdlib wildcard keeps it lenient, and the veto never
empties the candidate set. It is wired into the pipeline after the existing
structural veto and gated by the flag.
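
The resolution rule above can be sketched with the standard-library `ast` module. This swaps in `ast` for the tree-sitter walk the PR uses, and it simplifies wildcard handling: any wildcard import makes the check lenient, whereas the PR resolves project wildcards through the graph. The function name `unresolved_bare_calls` is hypothetical.

```python
import ast
import builtins


def unresolved_bare_calls(source):
    """Names called as bare identifiers that are not bound anywhere in the file."""
    tree = ast.parse(source)
    bound = set(dir(builtins))
    has_wildcard = False
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)              # function parameters
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)               # assignment, loop, with, walrus targets
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name == "*":
                    has_wildcard = True
                else:
                    bound.add(alias.asname or alias.name.split(".")[0])
    if has_wildcard:
        return []                            # simplification: stay lenient
    called = {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    # Attribute and method calls are never collected, so they are never flagged.
    return sorted(called - bound)


SAMPLE = """
import os
from util import helper as h

def run(cb, items):
    f = len
    for g in items:
        g()
    if (w := cb):
        w()
    return f(items), h(), os.getcwd(), missing()
"""
print(unresolved_bare_calls(SAMPLE))
```

In the sample only `missing` is reported: `cb`, `f`, `g`, and `w` are bound by a parameter, an assignment, a loop target, and a walrus, and `os.getcwd()` is an attribute call.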

Two rounds of review ran against this. The first fixed cross-request cache
corruption, set-iteration nondeterminism in the analyses, an escape_atom
trailing-newline case, and an unbounded-recursion gap. The second caught the
real blockers in the Phase 1 veto: assigned callables and function parameters
were being flagged as unresolved because the resolution set only had def and
class names. That is fixed by resolving against all names bound anywhere in the
file, with regression tests for callbacks, higher-order functions, and loop,
with, walrus, and comprehension targets.

The graph package ships in the v3-service image, CI installs the tree-sitter
grammars and runs the suite, and the design and phase status live in
docs/reports/CALL_GRAPH_REASONING_V3.md. Tests: 68 under tests/v3-service, all
green. Credit to Dmitri Sotnikov (@yogthos) for chiasmus, which itigges22#39 cites.

Completes the structural call-graph feature end to end on top of the phase 0/1
substrate, all behind ATLAS_CALL_GRAPH and strictly additive.

Phase 2 deepens the repair context. graph/context.py repair_context builds a
real reachability slice for the failing function: the call path from an entry
point down to it, its transitive impact set, and its callees, bounded for the
token budget. It is wired into the phase-3 repair block, preferring the graph
block and falling back to the shipped 1-hop call_chain_context on flag-off or
any failure.
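
A reachability slice of this shape might look like the sketch below. It assumes that "impact set" means the transitive callers of the failing function, and it uses an item-count `limit` as a stand-in for the token budget. `repair_slice` and its helpers are hypothetical names, not the PR's repair_context.

```python
from collections import deque


def shortest_path(calls, start, goal):
    """BFS call path from start to goal, or None if goal is unreachable."""
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(calls.get(node, [])):   # sorted: stable tie-breaks
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def transitive_callers(calls, target):
    """Everything that calls target directly or indirectly."""
    reverse = {}
    for src, dsts in calls.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)
    seen = set()
    queue = deque([target])
    while queue:
        for src in reverse.get(queue.popleft(), []):
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return sorted(seen)


def repair_slice(calls, entry, failing, limit=5):
    """Path, impact set, and callees for one failing function, size-bounded."""
    return {
        "path": shortest_path(calls, entry, failing),
        "impact": transitive_callers(calls, failing)[:limit],
        "callees": sorted(calls.get(failing, []))[:limit],
    }


GRAPH = {"main": ["parse", "run"], "parse": [], "run": ["step"], "step": ["emit"]}
print(repair_slice(GRAPH, "main", "step"))
```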

Phase 3 deepens context injection. symbol_neighborhood returns a named symbol's
callers, callees, impact set, and defining files, attached to the
/internal/symbol_index response as an additive graph field so the matched and
skipped shape is unchanged for flag-off callers.
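
The "additive field" idea can be illustrated with a small sketch: the graph key is attached only when neighborhoods were computed, so callers that read only matched and skipped see the same shape as before. The function `symbol_index_response` and the neighborhood payload layout are assumptions for illustration.

```python
def symbol_index_response(matched, skipped, neighborhoods=None):
    """Build a symbol-index response; 'graph' is present only when supplied."""
    response = {"matched": matched, "skipped": skipped}
    if neighborhoods is not None:
        response["graph"] = neighborhoods
    return response


# Flag off: no graph key at all.
print(symbol_index_response(["run"], []))

# Flag on: the same base shape plus the extra field.
print(symbol_index_response(
    ["run"], [],
    neighborhoods={"run": {"callers": ["main"], "callees": ["step"]}},
))
```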

Phase 4 adds the optional tier signal. analyses.complexity reports per-node
fan-in and fan-out plus the graph maxima, exposed via the complexity analysis.
Wiring it into the Go tier classifier is intentionally left out, since point 2
already ships through cyclomatic complexity and the plan gates this on a tier-
accuracy measurement that needs the live stack.
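
Per-node fan-in and fan-out plus graph maxima can be computed in one pass over the edges. The sketch below is illustrative only; `fan_metrics` and its output keys are hypothetical and do not mirror analyses.complexity.

```python
def fan_metrics(calls):
    """Per-node fan-in and fan-out plus the graph-wide maxima."""
    nodes = set(calls)
    for dsts in calls.values():
        nodes.update(dsts)
    fan_out = {n: len(set(calls.get(n, []))) for n in nodes}
    fan_in = {n: 0 for n in nodes}
    for dsts in calls.values():
        for dst in set(dsts):     # count each distinct caller -> callee edge once
            fan_in[dst] += 1
    return {
        "nodes": {
            n: {"fan_in": fan_in[n], "fan_out": fan_out[n]} for n in sorted(nodes)
        },
        "max_fan_in": max(fan_in.values(), default=0),
        "max_fan_out": max(fan_out.values(), default=0),
    }


print(fan_metrics({"a": ["b", "c"], "b": ["c"], "c": []}))
```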

Phase 5 adds the solver layer without a heavy dependency. The Prolog facts
already emit from phase 0 for an external SWI-Prolog or chiasmus_verify.
graph/datalog.py adds a compact in-process Datalog engine, facts plus structured
rules evaluated to a bounded fixpoint, shipping the built-in transitive reaches
closure and supporting arbitrary in-process rules. Its reaches is cross-checked
against the native reachability for all node pairs in the suite, so the solver
layer is verified consistent with the native one on those fixtures.
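
As a rough illustration of a bounded naive fixpoint and the cross-check against native traversal, consider the sketch below. It hard-codes the two reaches rules instead of supporting arbitrary rules, and `reaches_fixpoint`, `reaches_native`, and the `max_iter` default are assumptions, not graph/datalog.py.

```python
def reaches_fixpoint(edges, max_iter=1000):
    """Naive evaluation of:
        reaches(X, Y) :- calls(X, Y).
        reaches(X, Z) :- reaches(X, Y), calls(Y, Z).
    """
    facts = set(edges)
    for _ in range(max_iter):
        derived = {(x, z) for (x, y) in facts for (y2, z) in edges if y == y2}
        new = derived - facts
        if not new:
            return facts          # nothing new derived: fixpoint reached
        facts |= new
    raise RuntimeError("fixpoint not reached within max_iter")


def reaches_native(edges):
    """The same relation computed by a depth-first walk from every caller."""
    succ = {}
    for x, y in edges:
        succ.setdefault(x, set()).add(y)
    result = set()
    for start in succ:
        seen, stack = set(), [start]
        while stack:
            for nxt in succ.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result.update((start, node) for node in seen)
    return result


EDGES = {("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")}
print(reaches_fixpoint(EDGES) == reaches_native(EDGES))
```

With this linear rule each pass extends known paths by one edge, so the number of passes tracks the longest shortest path in the graph rather than the number of tuples.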

Phase 6 adds JavaScript. extract.py dispatches by extension to a Python or
JavaScript walk, with JS covering defines, calls, imports, and contains.
build_graph ingests both and the analyses run on the merged graph. The
tree_sitter_javascript grammar is added to the image and CI, and JS extraction
is golden-tested against chiasmus's own extractGraph: defines, calls, and
imports match exactly, including arrow-const functions and new-expressions not
being counted as calls.
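
Dispatch by extension can be sketched as a small lookup table. The walker functions here are placeholders that return empty facts, and the extension list, `WALKERS`, and `extract` signature are assumptions; only the name is_supported appears in the PR, and its real signature is not shown there.

```python
from pathlib import Path


def walk_python(source):
    # Placeholder: a real walker would traverse the tree-sitter parse tree.
    return {"language": "python", "defines": [], "calls": [], "imports": []}


def walk_javascript(source):
    # Placeholder with the same fact shape as the Python walker.
    return {"language": "javascript", "defines": [], "calls": [], "imports": []}


# Hypothetical extension table; the PR does not list the extensions it handles.
WALKERS = {".py": walk_python, ".js": walk_javascript, ".mjs": walk_javascript}


def is_supported(path):
    """True when some walker is registered for the file's extension."""
    return Path(path).suffix.lower() in WALKERS


def extract(path, source):
    """Run the walker for this file type, or return None if unsupported."""
    walker = WALKERS.get(Path(path).suffix.lower())
    return walker(source) if walker is not None else None


print(extract("app/main.py", "")["language"])
print(extract("web/index.js", "")["language"])
print(is_supported("README.md"))
```

Because both walkers return the same fact shape, a graph builder can merge their output without caring which language produced it.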

Tests: 86 under tests/v3-service, all green; ruff clean; Python 3.9 safe. The
design doc records the full phase status. The standing caveat is unchanged: the
graph logic is unit-tested and golden-pinned against chiasmus, but the effect
inside the live run, repair, and symbol-index paths needs the GPU, llama, and
sandbox stack to validate.

A wiring audit found two real end-to-end gaps and a few correctness and
performance issues. All fixed.

The flag never reached the v3-service container, so the whole feature was dead
in the Docker deployment, the same bug pattern as ATLAS_RPG_PLANNING. It is now
forwarded in docker-compose.yml and documented in .env.example.

Phase 3 was half-wired. v3-service produced the symbol-index "graph" field but
the proxy discarded it: the response struct had no Graph field and the consumer
only used matched and skipped. The proxy now parses symbolGraphNode, and
formatGraphNeighborhood folds each matched symbol's callers and callees into the
injected system-note context in agent.go, so the neighborhood actually reaches
the model. Added Go tests for parsing and formatting.

On correctness and performance: the symbol-index handler rebuilt the whole
project graph once per matched symbol, so it now builds the graph once and
neighborhoods each symbol against it. The in-process Datalog engine raised a
KeyError on a caller-supplied rule with an unbound head variable; run now skips
unsafe derivations instead of crashing. repair_context gated on Python only
while build_graph also handles JavaScript, so it now gates on is_supported. And
the transitive-impact dedup hoists its set out of the loop.
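
The unbound-head-variable fix can be illustrated by a head instantiation step that returns nothing instead of raising. This is a sketch under assumptions: variables are strings starting with an uppercase letter (Prolog style), and `is_var` and `instantiate_head` are hypothetical names rather than the engine's own.

```python
def is_var(term):
    """Assumed convention: variables are strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def instantiate_head(head, binding):
    """Ground a rule head from a variable binding.

    Returns None when a head variable has no binding, so the caller can skip
    the derivation instead of hitting a KeyError.
    """
    ground = []
    for term in head:
        if is_var(term):
            if term not in binding:
                return None       # unsafe rule: head variable never bound in body
            ground.append(binding[term])
        else:
            ground.append(term)
    return tuple(ground)


print(instantiate_head(("reaches", "X", "Y"), {"X": "a", "Y": "b"}))
print(instantiate_head(("reaches", "X", "Z"), {"X": "a"}))
```

An alternative design would reject such rules up front when they are registered; skipping at derivation time keeps a single bad caller-supplied rule from aborting the whole run.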

The datalog max_iter cap was reviewed and left as is: naive fixpoint passes are
bounded by the call graph's diameter, not its tuple count, so the cap is not
reachable on realistic input and remains a genuine runaway guard. The
call_graph endpoint and its closure and complexity analyses stay as deliberate
query capabilities with no in-pipeline caller.

Tests: 88 under tests/v3-service plus the proxy Go suite, all green; ruff clean.

yogthos marked this pull request as ready for review June 10, 2026 20:57