Aden transforms codebases into traversable knowledge graphs, making the structure of understanding explicit, machine-readable, and queryable by both humans and AI agents.
Large language models are capable of sophisticated reasoning, but they are constrained by a finite context window. When an AI agent is dropped into a codebase of 100,000+ lines, it faces the same problem a human faces: information overload. The agent does not know which 10 files out of 500 are relevant to the task at hand. It does not know that changing Database::connect() will break QueueWorker::drain(). It has no mental map of the system.
Aden compiles source code, documentation, notes, and plans into a knowledge graph where:
- Every function, module, and decision becomes a node
- Every relationship (imports, calls, constraints, justifications) becomes a typed edge
- You can ask questions like "what depends on this function?" or "what is the blast radius of changing this module?"
Source Code → Aden Pipeline → Knowledge Graph → Context for AI
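To make the idea of typed edges and "blast radius" concrete, here is a minimal Python sketch. It is not Aden's implementation or storage format: the edge list, the edge type names, and the `blast_radius` helper are all hypothetical, chosen only to show how a reverse traversal over typed edges answers "what depends on this function?".

```python
from collections import defaultdict, deque

# Hypothetical typed edges (source, edge_type, target); illustrative only.
EDGES = [
    ("QueueWorker::drain", "calls", "Database::connect"),
    ("Scheduler::tick", "calls", "QueueWorker::drain"),
    ("Database::connect", "imports", "config"),
    ("docs/queue.adoc", "justifies", "QueueWorker::drain"),
]


def build_reverse_index(edges):
    """Map each target node to the (source, edge_type) pairs pointing at it."""
    reverse = defaultdict(list)
    for source, edge_type, target in edges:
        reverse[target].append((source, edge_type))
    return reverse


def blast_radius(edges, start, edge_types=("calls", "imports")):
    """Return every node that transitively depends on `start`
    through edges of the given types (breadth-first, reverse direction)."""
    reverse = build_reverse_index(edges)
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for source, edge_type in reverse[node]:
            if edge_type in edge_types and source not in seen:
                seen.add(source)
                queue.append(source)
    return seen


print(sorted(blast_radius(EDGES, "Database::connect")))
```

Filtering by edge type is the point of typing the edges: a code change propagates along `calls` and `imports`, while a `justifies` edge from a document is followed only when the question is about rationale.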
Aden complements your existing tools: it maps the structure of a codebase; it does not find bugs or render HTML.
| Instead of / alongside | What Aden adds |
|---|---|
| Static analysis tools (clippy, Semgrep) | Aden finds semantic relationships and blast radius, not bugs — keep clippy/Semgrep for correctness; use Aden to navigate the graph |
| Documentation generators (Rustdoc, Javadoc) | Aden produces machine-navigable context for LLMs, not HTML |
| grep + manual file hunting | Aden lets you query by intent and relationship, with every hit tagged by its enclosing symbol |
| Scrolling through READMEs | Aden assembles exactly the connected context you need, within a token budget |
# Install (builds release, copies to ~/.local/bin, adds to PATH)
./install.sh
# Initialize your project (optional — read commands auto-build the index)
cd your-project
aden init
# Compile the whole codebase into the knowledge graph
aden gen . --auto
# Ask a natural-language question — returns dense, connected context
aden ask "How does login work?"
# Structure-aware search: every match tagged with its enclosing symbol
aden grep "hash_password"
# Find a symbol's definition AND its real aden:// anchor
aden locate --symbol login
# One-shot symbol comprehension (replaces locate + backlinks + impact + asm)
aden understand Database::connect .
# Blast radius before a refactor — who depends on this symbol?
# query/asm take a full aden:// anchor (from locate/grep/list), not a bare name:
aden query --backlinks "aden://module/<crate>/<module-doc>.adoc#code_block_3"
# Assemble a module (or symbol) overview within a token budget
aden asm --from "aden://module/<crate>/<module-doc>.adoc#code_block_3" --depth 1
# Before every commit (fast, aden-only gates)
aden ready .
# Visualise the graph — text (mermaid/dot/json) or an interactive offline browser view
aden viz --mode communities --format mermaid # text diagram for a PR/README
aden view # whole graph in the browser, with
# git-history replay (on by default)
# Expose Aden's commands to your AI client as MCP tools
aden mcp install --platform claude # see docs/mcp-intro.md

The graph is fresh by construction: read commands (ask/asm/query/
locate/grep) detect changed source and re-index it automatically, so you
rarely need to run gen by hand.
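One common way to get this kind of freshness is to compare content digests recorded at index time with the files on disk. The sketch below assumes that approach; the text above does not say how Aden detects changes, and `content_digest` and `stale_files` are hypothetical names used only for illustration.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Short, stable digest of a file's bytes."""
    return hashlib.blake2b(data, digest_size=16).hexdigest()


def stale_files(current, indexed):
    """Compare current file contents against digests stored at index time.

    current: {path: bytes} as read from disk now
    indexed: {path: digest} as recorded by the last indexing run
    """
    changed = sorted(
        path for path, data in current.items()
        if path in indexed and indexed[path] != content_digest(data)
    )
    added = sorted(path for path in current if path not in indexed)
    removed = sorted(path for path in indexed if path not in current)
    return {"changed": changed, "added": added, "removed": removed}


indexed = {"a.rs": content_digest(b"fn a() {}"), "b.rs": content_digest(b"fn b() {}")}
current = {"a.rs": b"fn a() {}", "b.rs": b"fn b() { todo!() }", "c.rs": b""}
print(stale_files(current, indexed))
```

Only the paths reported as changed, added, or removed need re-indexing before a read command answers, which is what keeps an automatic refresh cheap.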
By default search/ask use BM25 (lexical) ranking over the graph. The optional
dense feature adds local semantic embeddings fused with BM25 via Reciprocal
Rank Fusion, which improves natural-language queries (it finds code by meaning,
not just shared terms). It stays fully offline and deterministic — a pure-Rust
ONNX model (tract + BAAI/bge-small-en-v1.5, MIT), no network at query time.
# One-time: fetch the embedding model into ~/.cache/aden-models (the only step
# that touches the network; aden itself never does). ~127 MB.
scripts/fetch-bge-model.sh
# Build aden with hybrid search enabled
cargo build -p aden-cli --features dense

With the feature off (the default), nothing changes and no extra dependencies are
built. Air-gapped? Place model.onnx + tokenizer.json from BAAI/bge-small-en-v1.5
into the cache dir by hand instead of running the script.
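For readers unfamiliar with Reciprocal Rank Fusion, the following sketch shows the general technique in Python. It is not taken from Aden's source: the constant `k = 60` is the value conventionally used for RRF and is an assumption here, and the two ranked lists are made-up examples standing in for a BM25 ranking and an embedding ranking.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties break on the id so the output is
    deterministic.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for one query from two rankers.
bm25_hits = ["hash_password", "login", "logout"]
dense_hits = ["login", "verify_credentials", "hash_password"]

print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
```

Because RRF uses only rank positions, it needs no score normalisation between the lexical and the embedding ranker, and items that both rankers place near the top rise above items that only one of them found.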
Two retrieval levers are chosen by the kind of text being searched: a corpus-derived PPMI rerank for
code (MRR 0.216 → 0.289) and grounded OEWN synonym expansion for prose (R@1
1/42 → 41/42; end-to-end 0/15 → 15/15). Auto-gating is off by default because it is net-neutral to
negative on natural multi-word queries over external repos. Opt in with ADEN_LEXICON_ON
(routed by query shape + corpus substrate), or force a single lever with ADEN_LEXICON_EXPAND /
ADEN_PPMI_RERANK. Once opted in, ADEN_LEXICON_OFF force-disables. Both levers are grounded and
corpus-gated, so they no-op where they would not help. See docs/retrieval-levers.adoc.
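As background on the PPMI lever, here is a small Python sketch of positive pointwise mutual information computed from document-level co-occurrence. How Aden derives its statistics and applies them when reranking is described in docs/retrieval-levers.adoc, not here; the `ppmi_table` function and the toy corpus are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def ppmi_table(documents):
    """Positive PMI for term pairs that co-occur in at least one document.

    documents: list of token lists.
    PMI(a, b) = log2(P(a, b) / (P(a) * P(b))), with probabilities taken as
    document frequencies; pairs with PMI <= 0 are dropped (the "positive" part).
    """
    n_docs = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for tokens in documents:
        unique = sorted(set(tokens))
        term_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    table = {}
    for (a, b), joint in pair_counts.items():
        pmi = math.log2((joint * n_docs) / (term_counts[a] * term_counts[b]))
        if pmi > 0:
            table[(a, b)] = pmi
    return table


# Toy corpus: each inner list stands for the tokens of one symbol or file.
corpus = [
    ["hash", "password"],
    ["hash", "password"],
    ["hash", "config"],
    ["parse", "config"],
]
for pair, value in sorted(ppmi_table(corpus).items()):
    print(pair, round(value, 3))
```

Pairs that co-occur more often than chance would predict get a positive weight, which a reranker can use to favour candidates whose terms are strongly associated with the query terms in this particular corpus.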
| Command | Purpose |
|---|---|
| `aden gen` | Compile source into the knowledge graph (symbols, call edges, docs) |
| `aden ask` | Natural-language question → dense, graph-traversed context |
| `aden understand` | One-shot symbol comprehension: definition + callers + impact + context |
| `aden grep` | Structure-aware search: every hit tagged with its enclosing symbol |
| `aden asm` | Assemble context from an anchor within a token budget |
| `aden query` | Graph queries: --from, --backlinks (callers), --impact |
| `aden locate` | Find symbol definitions with exact line numbers |
| `aden check` | Validate referential integrity |
| `aden lint` | Fast, language-agnostic heuristic checks; --dead-code for graph-based detection |
| `aden heal` | Detect drift between code and contracts |
| `aden ready` | Fast pre-commit gate: gen → lint → check → heal drift → audit |
| `aden sync` | Reconcile store after merges or file deletions (gen + check + heal with gc) |
| `aden ci-check` | Full CI gate suite including external tools; use before push |
| `aden view` | Interactive graph viewer in the browser: offline, with git-history replay |
| `aden timeline` | Time-travel file viewer: bake every git version of a file into a self-contained offline HTML page with client-side diff |
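The phrase "assemble context from an anchor within a token budget" can be pictured as a bounded graph walk. The Python sketch below is one plausible shape for that idea, not a description of what `aden asm` actually does: the node texts, the adjacency map, the chars/4 token estimate, and the skip-if-too-large policy are all assumptions.

```python
from collections import deque


def estimate_tokens(text):
    """Rough token estimate: one token per four characters, rounded up."""
    return (len(text) + 3) // 4


def assemble(nodes, neighbors, start, budget, max_depth=1):
    """Breadth-first walk from `start`, keeping node text while it fits.

    nodes: {node_id: text}; neighbors: {node_id: [node_id, ...]}.
    Returns (picked node ids in visit order, tokens used).
    """
    picked, used = [], 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        cost = estimate_tokens(nodes[node])
        if used + cost > budget:
            continue  # too large for the remaining budget; not expanded either
        picked.append(node)
        used += cost
        if depth < max_depth:
            for nxt in neighbors.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    return picked, used


# Hypothetical graph fragment around a `login` symbol.
nodes = {"login": "x" * 40, "hash_password": "y" * 80, "session": "z" * 20, "audit": "w" * 12}
neighbors = {"login": ["hash_password", "session"], "session": ["audit"]}

print(assemble(nodes, neighbors, "login", budget=16, max_depth=1))
print(assemble(nodes, neighbors, "login", budget=100, max_depth=2))
```

Walking breadth-first means the nodes closest to the anchor claim the budget first, so a tight budget degrades toward "the symbol and its nearest neighbours" instead of an arbitrary slice of the graph.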
- Human-readable: open any `.adoc` file and understand it
- Machine-parseable: regular grammar, no complex toolchains
- Version-control-friendly: diffs cleanly in Git
- Referential by default: the `<<anchor>>` syntax builds the graph naturally
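The last point can be illustrated with a few lines of Python. AsciiDoc defines anchors with `[[id]]` and refers to them with `<<id>>` or `<<id,label>>`; the sketch below extracts both with regular expressions and reports references that do not resolve. It covers only these two forms and is a simplified stand-in for a real parser, not Aden's own; `extract_graph` and `dangling` are hypothetical helper names.

```python
import re

# [[anchor-id]] defines an anchor; <<anchor-id>> or <<anchor-id,label>> refers to one.
ANCHOR_RE = re.compile(r"\[\[([A-Za-z_][\w.-]*)\]\]")
XREF_RE = re.compile(r"<<([^,>\s]+)(?:,[^>]*)?>>")


def extract_graph(docs):
    """docs: {filename: text}.

    Returns (anchors, edges): anchors maps anchor id -> defining file,
    edges is a list of (referencing file, target anchor id).
    """
    anchors, edges = {}, []
    for name, text in docs.items():
        for match in ANCHOR_RE.finditer(text):
            anchors[match.group(1)] = name
        for match in XREF_RE.finditer(text):
            edges.append((name, match.group(1)))
    return anchors, edges


def dangling(anchors, edges):
    """References whose target anchor is not defined anywhere."""
    return [(src, target) for src, target in edges if target not in anchors]


docs = {
    "auth.adoc": "[[login]]\n== Login\nUses <<hash_password,password hashing>>.\n",
    "crypto.adoc": "[[hash_password]]\n== Hashing\nSee <<login>> and <<session_store>>.\n",
}
anchors, edges = extract_graph(docs)
print(edges)
print(dangling(anchors, edges))
```

Because the cross-reference syntax is regular, the reference graph falls out of a linear scan, and an unresolved target such as `session_store` is detectable without rendering the document.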
Aden is language-agnostic: aden gen discovers and parses every file type it
has a grammar for — not just whichever build manifest happens to be present — and
indexes Markdown/AsciiDoc documentation alongside code.
- Deep extraction (call graph, signatures, doc comments): Rust, Python, Go, TypeScript/JavaScript, Java, C#, C, Ruby, PHP, Kotlin.
- Generic extraction (symbols + structure, no call edges): ~113 further languages wired via `ext_to_language_pack_id` in `router.rs` (.ps1/.psm1/.psd1 PowerShell included); 305+ grammars are available in the bundled pack. Add entries to `ext_to_language_pack_id` in `crates/aden-parse/src/router.rs` to expose more.
Grammars are compiled into the binary at build time (see .cargo/config.toml /
TSLP_LANGUAGES), so parsing works fully offline — no runtime downloads.
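The two extraction tiers amount to a lookup from file extension to a language and a level of analysis. The Python sketch below shows that routing idea only. The real table lives in `ext_to_language_pack_id` in `crates/aden-parse/src/router.rs` and is written in Rust; the mapping, the language ids, and the `route` function here are illustrative assumptions, not a copy of it.

```python
from pathlib import PurePosixPath

# Illustrative subsets; the language ids are made up for this sketch.
DEEP = {
    ".rs": "rust", ".py": "python", ".go": "go", ".ts": "typescript",
    ".js": "javascript", ".java": "java", ".cs": "csharp", ".c": "c",
    ".rb": "ruby", ".php": "php", ".kt": "kotlin",
}
GENERIC = {".ps1": "powershell", ".psm1": "powershell", ".psd1": "powershell"}


def route(path):
    """Return (language, tier) for a path, or None when no grammar is wired."""
    ext = PurePosixPath(path).suffix.lower()
    if ext in DEEP:
        return DEEP[ext], "deep"
    if ext in GENERIC:
        return GENERIC[ext], "generic"
    return None


for path in ["crates/aden-parse/src/router.rs", "scripts/Deploy.PS1", "Makefile"]:
    print(path, route(path))
```

Routing per file, not per build manifest, is what lets a polyglot repository be indexed in one pass; files with no matching entry are simply skipped.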
Early self-run measurements on Aden's own repository (244 files) and external corpora:
- Edge-extraction F1: 0.915 [measured] — micro-precision 0.946, micro-recall 0.886 on a 79-edge polyglot ground-truth fixture.
- ~10× token savings overall vs a grep-and-read agent [measured, chars/4 proxy]; up to >100× for symbol and structure lookups; ~4–5× for open-ended conceptual questions.
- Hybrid retrieval beats BM25 on every corpus tested — R@1 gains of 0.06–0.14 across Go, Rust, C#, Python, TypeScript, and two larger corpora (Linux kernel subset, create-t3-app).
- Energy savings vs LLM inference are estimated, not instrumented.
See docs/benchmarks.adoc for full numbers, methodology, and all caveats.
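For readers who want to check the arithmetic behind these metrics, the sketch below gives the standard definitions of F1, mean reciprocal rank (MRR), and recall at 1 (R@1) in Python. The functions use the textbook formulas, not Aden's benchmark harness, and the query data at the end is made up. The F1 line reproduces the reported 0.915 from the stated micro-precision and micro-recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_reciprocal_rank(results, relevant):
    """results: one ranked id list per query; relevant: the expected id per query.

    A query contributes 1/rank of its expected id, or 0 if it is missing.
    """
    total = 0.0
    for ranking, target in zip(results, relevant):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(results)


def recall_at_1(results, relevant):
    """Fraction of queries whose expected id is ranked first."""
    hits = sum(
        1 for ranking, target in zip(results, relevant)
        if ranking and ranking[0] == target
    )
    return hits / len(results)


# Precision and recall as reported for the 79-edge fixture.
print(round(f1_score(0.946, 0.886), 3))

# Made-up rankings for three queries.
results = [["a", "b"], ["b", "a"], ["c"]]
relevant = ["a", "a", "z"]
print(mean_reciprocal_rank(results, relevant), recall_at_1(results, relevant))
```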
- Getting Started — 10-minute intro
- Philosophy — Why Aden exists and what it solves
- Architecture — Technical deep-dive
- AI Integration — Using Aden with AI agents
- User Guide — Daily workflow reference
Aden's entire premise — that documentation can be a plain-text, regular, referential, scriptable language rather than prose locked in a binary format — rests on the people who invented and stewarded AsciiDoc:
- Stuart Rackham, who created AsciiDoc in 2002. The original insight, that a document could be readable text with a regular grammar, cross-references (`<<anchor>>`), includes, attributes, and conditionals, is exactly what lets Aden treat docs as a queryable graph instead of opaque files. That idea is load-bearing for this whole project.
- Dan Allen and the Asciidoctor project (with the AsciiDoc Working Group at the Eclipse Foundation), who carried AsciiDoc forward into a maintained processor and a real language specification.
Aden also stands on the shoulders of the wider open-source Rust ecosystem and the many authors, maintainers, and contributors behind the projects it builds on. Several are load-bearing:
Parsing & search
- Max Brunsfeld, the tree-sitter project, and the numerous per-language grammar authors (bundled via `tree-sitter-language-pack`) whose work makes symbol and call extraction possible across 300+ languages.
- Andrew Gallant (BurntSushi) and contributors: the `regex` (with `aho-corasick`/`memchr`) and `walkdir` crates behind Aden's structure-aware `grep`, lint, and file discovery.
Storage, graph & data
- the fjall project (LSM-tree storage), petgraph (graph data structures), and the serde community (David Tolnay and contributors), with `postcard`, `serde_json`, `serde_yaml`, `toml`, `blake3`, `fnv`, and `uuid`.
CLI, async & protocol
- the clap, rayon, Tokio, and tower-lsp teams; ctrlc, notify, ureq, dirs, chrono; anyhow/thiserror (David Tolnay and contributors); and rmcp — the Model Context Protocol SDK from Anthropic and the MCP community.
These names are illustrative, not exhaustive, and many of these projects have
multiple owners. The complete and authoritative attribution for every one of
Aden's 350+ direct and transitive dependencies — each with its license — lives
in NOTICE.md (regenerate with aden licenses). If you maintain a project
Aden depends on and feel under-credited, that is an oversight we want to
correct — please open an issue and we will fix it.
The research/ tree contains documentation that Aden parses and queries, not
code it compiles or links — e.g. a secure-coding knowledge base summarizing
OWASP and MITRE CWE guidance. This material is under its own third-party
licenses: OWASP material under CC BY-SA 4.0 and CC BY 3.0; MITRE CWE content
under the MITRE CWE Terms of Use (a separate, non-Creative-Commons instrument).
Content is kept segregated from Aden's AGPL-3.0 source and never embedded in
any binary. Full citations, required notices, and trademark/non-endorsement
statements are in research/secure-coding/SOURCES.md and research/README.md.
"OWASP" is a trademark of the OWASP Foundation; "CWE" is a trademark of MITRE
Corporation. Their use here is nominative and implies no affiliation or
endorsement by the OWASP Foundation or MITRE Corporation.
A Dense Referential Context Compiler — Every token is load-bearing. Every edge is typed. Every anchor resolves.
Aden is designed for the future of software development: hybrid teams of humans and AI agents working together.