Aden: A Dense Referential Context Compiler

Aden transforms codebases into traversable knowledge graphs, making the structure of understanding explicit, machine-readable, and queryable by both humans and AI agents.

The Problem

Large language models are capable of sophisticated reasoning, but they are constrained by a finite context window. When an AI agent is dropped into a codebase of 100,000+ lines, it faces the same problem a human faces: information overload. The agent does not know which 10 files out of 500 are relevant to the task at hand. It does not know that changing Database::connect() will break QueueWorker::drain(). It has no mental map of the system.

What Aden Does

Aden compiles source code, documentation, notes, and plans into a knowledge graph where:

Every function, module, and decision becomes a node
Every relationship (imports, calls, constraints, justifications) becomes a typed edge
You can ask questions like "what depends on this function?" or "what is the blast radius of changing this module?"

Source Code → Aden Pipeline → Knowledge Graph → Context for AI
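
To make "typed edges" and "blast radius" concrete, here is a minimal sketch of the idea, not Aden's implementation. The `Graph` class, the edge kinds, and the `Scheduler::tick` and `docs/queue.adoc` nodes are hypothetical; only `Database::connect` and `QueueWorker::drain` come from the example above. Backlinks are the direct dependents of a node, and the blast radius is everything reachable by following those backlinks transitively.

```python
from collections import defaultdict, deque


class Graph:
    """Tiny typed-edge graph; node names and edge kinds are illustrative only."""

    def __init__(self):
        # Reverse adjacency: target -> list of (source, edge_kind).
        self._incoming = defaultdict(list)

    def add_edge(self, source, kind, target):
        self._incoming[target].append((source, kind))

    def backlinks(self, node):
        """Direct dependents of `node`, with the edge kind that links them."""
        return sorted(self._incoming.get(node, []))

    def blast_radius(self, node):
        """Every node that transitively depends on `node` (BFS over reverse edges)."""
        seen, queue = set(), deque([node])
        while queue:
            current = queue.popleft()
            for source, _kind in self._incoming.get(current, []):
                if source != node and source not in seen:
                    seen.add(source)
                    queue.append(source)
        return sorted(seen)


graph = Graph()
graph.add_edge("QueueWorker::drain", "calls", "Database::connect")
graph.add_edge("Scheduler::tick", "calls", "QueueWorker::drain")
graph.add_edge("docs/queue.adoc", "documents", "QueueWorker::drain")

print(graph.backlinks("Database::connect"))
print(graph.blast_radius("Database::connect"))
```

Storing edges in reverse makes "who depends on this?" a direct lookup, which is the question a refactor usually starts from.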

Where Aden Fits

Aden complements your existing tools: it maps the structure of a codebase, and leaves bug-finding and HTML rendering to the tools built for them.

Instead of / alongside	What Aden adds
Static analysis tools (clippy, Semgrep)	Aden finds semantic relationships and blast radius, not bugs — keep clippy/Semgrep for correctness; use Aden to navigate the graph
Documentation generators (Rustdoc, Javadoc)	Aden produces machine-navigable context for LLMs, not HTML
grep + manual file hunting	Aden lets you query by intent and relationship, with every hit tagged by its enclosing symbol
Scrolling through READMEs	Aden assembles exactly the connected context you need, within a token budget

Quick Start

# Install (builds release, copies to ~/.local/bin, adds to PATH)
./install.sh

# Initialize your project (optional — read commands auto-build the index)
cd your-project
aden init

# Compile the whole codebase into the knowledge graph
aden gen . --auto

# Ask a natural-language question — returns dense, connected context
aden ask "How does login work?"

# Structure-aware search: every match tagged with its enclosing symbol
aden grep "hash_password"

# Find a symbol's definition AND its real aden:// anchor
aden locate --symbol login

# One-shot symbol comprehension (replaces locate + backlinks + impact + asm)
aden understand Database::connect .

# Blast radius before a refactor — who depends on this symbol?
# query/asm take a full aden:// anchor (from locate/grep/list), not a bare name:
aden query --backlinks "aden://module/<crate>/<module-doc>.adoc#code_block_3"

# Assemble a module (or symbol) overview within a token budget
aden asm --from "aden://module/<crate>/<module-doc>.adoc#code_block_3" --depth 1

# Before every commit (fast, aden-only gates)
aden ready .

# Visualise the graph — text (mermaid/dot/json) or an interactive offline browser view
aden viz --mode communities --format mermaid     # text diagram for a PR/README
aden view                                         # whole graph in the browser, with
                                                  # git-history replay (on by default)

# Expose Aden's commands to your AI client as MCP tools
aden mcp install --platform claude   # see docs/mcp-intro.md

The graph is fresh by construction: read commands (ask/asm/query/locate/grep) detect changed source and re-index it automatically, so you rarely need to run gen by hand.
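
How Aden detects changed source is not described here. One common way to get this "re-index only what changed" behaviour is to compare content hashes against the stored index, sketched below under that assumption. The `fingerprint` and `stale_files` helpers and the use of SHA-256 are illustrative choices, not Aden's API.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path):
    """Content hash of a file; any stable digest would do for this sketch."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def stale_files(paths, index):
    """Return paths whose hash differs from the indexed one, updating the index."""
    changed = []
    for path in paths:
        digest = fingerprint(path)
        if index.get(str(path)) != digest:
            changed.append(str(path))
            index[str(path)] = digest  # pretend the file was re-indexed here
    return changed


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "lib.rs"
    source.write_text("fn login() {}\n")
    index = {}
    first = stale_files([source], index)    # never indexed: stale
    second = stale_files([source], index)   # unchanged: nothing to do
    source.write_text("fn login() { audit(); }\n")
    third = stale_files([source], index)    # edited: stale again

print(len(first), len(second), len(third))
```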

Hybrid (dense) search — optional

By default search/ask use BM25 (lexical) ranking over the graph. The optional dense feature adds local semantic embeddings fused with BM25 via Reciprocal Rank Fusion, which improves natural-language queries (it finds code by meaning, not just shared terms). It stays fully offline and deterministic: the embedding model (BAAI/bge-small-en-v1.5, MIT) runs through tract, a pure-Rust ONNX runtime, with no network access at query time.

# One-time: fetch the embedding model into ~/.cache/aden-models (the only step
# that touches the network; aden itself never does). ~127 MB.
scripts/fetch-bge-model.sh

# Build aden with hybrid search enabled
cargo build -p aden-cli --features dense

With the feature off (the default), nothing changes and no extra dependencies are built. Air-gapped? Place model.onnx + tokenizer.json from BAAI/bge-small-en-v1.5 into the cache dir by hand instead of running the script.
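
For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique: each ranking contributes 1 / (k + rank) to a document's score, and the fused list is sorted by the summed score. The file names are made up, and `k = 60` is the constant commonly used in the literature; the value Aden uses is not stated here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists; items ranked high in more lists score higher."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by name so the result is deterministic.
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


bm25_ranking = ["auth.rs", "session.rs", "db.rs"]       # hypothetical lexical hits
dense_ranking = ["login.rs", "auth.rs", "session.rs"]   # hypothetical embedding hits

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)  # ['auth.rs', 'session.rs', 'login.rs', 'db.rs']
```

Because RRF uses only ranks, it needs no calibration between BM25 scores and embedding similarities, which live on different scales.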

Dual-substrate levers (opt-in)

Two retrieval levers are selected by the kind of text being searched: a corpus-derived PPMI rerank for code (MRR 0.216 → 0.289) and grounded OEWN synonym expansion for prose (R@1 1/42 → 41/42; end-to-end 0/15 → 15/15). Auto-gating is off by default, because it is net-neutral to negative on natural multi-word queries over external repos. Opt in with ADEN_LEXICON_ON (routed by query shape and corpus substrate), or force a single lever with ADEN_LEXICON_EXPAND or ADEN_PPMI_RERANK. Once opted in, ADEN_LEXICON_OFF force-disables the levers. Both levers are grounded and corpus-gated, so they no-op where they would not help. See docs/retrieval-levers.adoc.
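
As background on the PPMI statistic itself (how Aden applies it to reranking is covered in docs/retrieval-levers.adoc, not here), the sketch below computes positive pointwise mutual information, max(0, log2(p(a,b) / (p(a) p(b)))), for terms that co-occur in the same document. The toy corpus, the document-level co-occurrence window, and the `ppmi_table` helper are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def ppmi_table(documents):
    """PPMI for term pairs that co-occur within a document."""
    total = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        terms = sorted(set(doc))            # count each term once per document
        term_counts.update(terms)
        pair_counts.update(combinations(terms, 2))
    table = {}
    for (a, b), joint in pair_counts.items():
        expected = (term_counts[a] / total) * (term_counts[b] / total)
        pmi = math.log2((joint / total) / expected)
        table[(a, b)] = max(0.0, pmi)       # clamp negative associations to zero
    return table


corpus = [
    ["hash", "password", "salt"],
    ["hash", "password", "login"],
    ["queue", "worker", "drain"],
    ["queue", "worker", "login"],
]
table = ppmi_table(corpus)
print(table[("hash", "password")])  # 1.0: they co-occur twice as often as chance
print(table[("hash", "login")])     # 0.0: no more often than chance
```

Pairs that never co-occur are simply absent from the table, so a lookup with a default of 0.0 treats them as unrelated.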

Core Commands

Command	Purpose
`aden gen`	Compile source into the knowledge graph (symbols, call edges, docs)
`aden ask`	Natural-language question → dense, graph-traversed context
`aden understand`	One-shot symbol comprehension: definition + callers + impact + context
`aden grep`	Structure-aware search — every hit tagged with its enclosing symbol
`aden asm`	Assemble context from an anchor within a token budget
`aden query`	Graph queries: `--from`, `--backlinks` (callers), `--impact`
`aden locate`	Find symbol definitions with exact line numbers
`aden check`	Validate referential integrity
`aden lint`	Fast, language-agnostic heuristic checks; `--dead-code` for graph-based detection
`aden heal`	Detect drift between code and contracts
`aden ready`	Fast pre-commit gate: gen → lint → check → heal drift → audit
`aden sync`	Reconcile store after merges or file deletions (gen + check + heal with gc)
`aden ci-check`	Full CI gate suite including external tools; use before push
`aden view`	Interactive graph viewer in the browser — offline, with git-history replay
`aden timeline`	Time-travel file viewer: bake every git version of a file into a self-contained offline HTML page with client-side diff
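
The "within a token budget" idea behind `aden asm` can be pictured as a bounded graph walk. The sketch below is one possible reading, not Aden's algorithm: it walks breadth-first from an anchor up to a depth limit and keeps each node only if its estimated cost still fits. The `assemble` function, the node names, and the use of the chars/4 proxy for cost are assumptions.

```python
from collections import deque


def estimate_tokens(text):
    """Rough token estimate using a chars/4 proxy."""
    return max(1, len(text) // 4)


def assemble(graph, texts, start, depth, budget):
    """Breadth-first walk from `start`, keeping node texts that fit the budget."""
    picked, spent = [], 0
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, level = queue.popleft()
        cost = estimate_tokens(texts[node])
        if spent + cost > budget:
            continue  # does not fit; a cheaper node later in the queue still may
        picked.append(node)
        spent += cost
        if level < depth:
            for neighbour in graph.get(node, []):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, level + 1))
    return picked, spent


# Hypothetical call graph and node bodies (placeholder text of a given length).
graph = {"login": ["hash_password", "Session::new"], "hash_password": ["argon2"]}
texts = {
    "login": "x" * 40,          # ~10 tokens
    "hash_password": "x" * 80,  # ~20 tokens
    "Session::new": "x" * 20,   # ~5 tokens
    "argon2": "x" * 40,         # ~10 tokens
}

picked, spent = assemble(graph, texts, "login", depth=1, budget=20)
print(picked, spent)  # ['login', 'Session::new'] 15
```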

Why AsciiDoc?

Human-readable — open any .adoc file and understand it
Machine-parseable — regular grammar, no complex toolchains
Version-control-friendly — diffs cleanly in Git
Referential by default — the <<anchor>> syntax builds the graph naturally
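
As an illustration of the last point, the sketch below pulls anchors and cross-references out of AsciiDoc text with two regular expressions and reports references that resolve to nothing. It handles only the `[[id]]` and `[#id]` anchor forms and the `<<id>>` / `<<id,label>>` reference forms; the file names and the `link_graph` helper are hypothetical, and Aden's own parser is not shown here.

```python
import re

# Anchor definitions: [[id]] or [#id]
ANCHOR = re.compile(r"\[\[([\w:.-]+)\]\]|\[#([\w:.-]+)\]")
# Cross-references: <<id>> or <<id,label text>>
XREF = re.compile(r"<<([^,>\s]+)(?:,[^>]*)?>>")


def link_graph(documents):
    """Collect defined anchors, reference edges, and references with no target."""
    defined, edges = {}, []
    for name, text in documents.items():
        for match in ANCHOR.finditer(text):
            defined[match.group(1) or match.group(2)] = name
        for match in XREF.finditer(text):
            edges.append((name, match.group(1)))
    dangling = [(src, target) for src, target in edges if target not in defined]
    return defined, edges, dangling


docs = {
    "auth.adoc": "[[login]]\n== Login\nCalls <<hash_password,the hasher>>.\n",
    "crypto.adoc": "[#hash_password]\n== Hashing\nUsed by <<login>>; see <<salting>>.\n",
}
defined, edges, dangling = link_graph(docs)
print(defined)
print(dangling)  # [('crypto.adoc', 'salting')]
```

A dangling reference like `salting` is the kind of problem a referential-integrity check is meant to surface.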

Supported Languages

Aden is language-agnostic: aden gen discovers and parses every file type it has a grammar for — not just whichever build manifest happens to be present — and indexes Markdown/AsciiDoc documentation alongside code.

Deep extraction (call graph, signatures, doc comments): Rust, Python, Go, TypeScript/JavaScript, Java, C#, C, Ruby, PHP, Kotlin.
Generic extraction (symbols + structure, no call edges): ~113 further languages wired via ext_to_language_pack_id in crates/aden-parse/src/router.rs (.ps1/.psm1/.psd1 PowerShell included). The bundled pack ships 305+ grammars; add entries to ext_to_language_pack_id to expose more.

Grammars are compiled into the binary at build time (see .cargo/config.toml / TSLP_LANGUAGES), so parsing works fully offline — no runtime downloads.

Performance

Early self-run measurements on Aden's own repository (244 files) and external corpora:

Edge-extraction F1: 0.915 [measured] — micro-precision 0.946, micro-recall 0.886 on a 79-edge polyglot ground-truth fixture.
~10× token savings overall vs a grep-and-read agent [measured, chars/4 proxy]; up to >100× for symbol and structure lookups; ~4–5× for open-ended conceptual questions.
Hybrid retrieval beats BM25 on every corpus tested — R@1 gains of 0.06–0.14 across Go, Rust, C#, Python, TypeScript, and two larger corpora (Linux kernel subset, create-t3-app).
Energy savings vs LLM inference are estimated, not instrumented.

See docs/benchmarks.adoc for full numbers, methodology, and all caveats.
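
As a quick sanity check on the edge-extraction figure above, F1 is the harmonic mean of precision and recall, so the reported micro-precision and micro-recall should reproduce the reported F1. The helper name is ours; the two inputs are the numbers quoted in this section.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.946, 0.886)
print(round(f1, 3))  # 0.915
```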

Documentation

Getting Started — 10-minute intro
Philosophy — Why Aden exists and what it solves
Architecture — Technical deep-dive
AI Integration — Using Aden with AI agents
User Guide — Daily workflow reference

Acknowledgments

Aden's entire premise — that documentation can be a plain-text, regular, referential, scriptable language rather than prose locked in a binary format — rests on the people who invented and stewarded AsciiDoc:

Stuart Rackham, who created AsciiDoc in 2002. The original insight — that a document could be readable text with a regular grammar, cross-references (<<anchor>>), includes, attributes, and conditionals — is exactly what lets Aden treat docs as a queryable graph instead of opaque files. That idea is load-bearing for this whole project.
Dan Allen and the Asciidoctor project (with the AsciiDoc Working Group at the Eclipse Foundation), who carried AsciiDoc forward into a maintained processor and a real language specification.

Aden also stands on the shoulders of the wider open-source Rust ecosystem and the many authors, maintainers, and contributors behind the projects it builds on. Several are load-bearing:

Parsing & search

Max Brunsfeld, the tree-sitter project, and the numerous per-language grammar authors (bundled via tree-sitter-language-pack) whose work makes symbol and call extraction possible across 300+ languages.
Andrew Gallant (BurntSushi) and contributors — the regex (with aho-corasick/memchr) and walkdir crates behind Aden's structure-aware grep, lint, and file discovery.

Storage, graph & data

The fjall project (LSM-tree storage), petgraph (graph data structures), and the serde community (David Tolnay and contributors), along with postcard, serde_json, serde_yaml, toml, blake3, fnv, and uuid.

CLI, async & protocol

The clap, rayon, Tokio, and tower-lsp teams; ctrlc, notify, ureq, dirs, chrono; anyhow/thiserror (David Tolnay and contributors); and rmcp, the Model Context Protocol SDK from Anthropic and the MCP community.

These names are illustrative, not exhaustive, and many of these projects have multiple owners. The complete and authoritative attribution for every one of Aden's 350+ direct and transitive dependencies — each with its license — lives in NOTICE.md (regenerate with aden licenses). If you maintain a project Aden depends on and feel under-credited, that is an oversight we want to correct — please open an issue and we will fix it.

Third-party reference material

The research/ tree contains documentation that Aden parses and queries, not code it compiles or links — e.g. a secure-coding knowledge base summarizing OWASP and MITRE CWE guidance. This material is under its own third-party licenses: OWASP material under CC BY-SA 4.0 and CC BY 3.0; MITRE CWE content under the MITRE CWE Terms of Use (a separate, non-Creative-Commons instrument). Content is kept segregated from Aden's AGPL-3.0 source and never embedded in any binary. Full citations, required notices, and trademark/non-endorsement statements are in research/secure-coding/SOURCES.md and research/README.md. "OWASP" is a trademark of the OWASP Foundation; "CWE" is a trademark of MITRE Corporation. Their use here is nominative and implies no affiliation or endorsement by the OWASP Foundation or MITRE Corporation.

The Name

A Dense Referential Context Compiler — Every token is load-bearing. Every edge is typed. Every anchor resolves.

Aden is designed for the future of software development: hybrid teams of humans and AI agents working together.
