PLAYi-io/DeRAG

DeRAG — Dynamic Ecological Retrieval-Augmented Generation

A reference implementation of an AI agent that doesn't hallucinate absence.

Standard RAG searches a pre-indexed knowledge base for semantic similarity. DeRAG gives agents a lawful way to inhabit partial access. Powered silos afford coupling. Unpowered silos show signage but the gate won't open. The agent can name the gap.

This is inference, not occupation.

The Question

Among all lawful interactions, which ones are worth Taking?

This question is not asked. This question is not answered. This question stands.

What's Different

Standard chatbots are text generators. They always produce. When they don't know, they interpolate — that's hallucination.

A DeRAG agent is an adjudicator of action in an affordance landscape:

  • The field discloses what's available. Powered silos have their gates open. The agent can enter. Unpowered silos are visible from the street — the agent can read signage and floors, but cannot enter.
  • Access is a first-class condition. Not hidden middleware. The agent knows what it can and cannot see.
  • Yield is a valid response. If the field isn't offering, the answer is not to invent. The answer is to name the gap.
  • The Ecological Interface Law: A primitive may only return what the current field can witness. A function that reaches beyond its arguments is hallucinating.
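
One way to read the Ecological Interface Law is as a constraint on function signatures: everything a primitive returns must be traceable to what it was handed. The sketch below is an illustration under that reading, not DeRAG's actual code. The `Silo` record, its field names, and the `witness` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Silo:
    """Hypothetical silo record; field names are assumptions for illustration."""
    name: str
    signage: str            # visible from the street, powered or not
    interior: str           # readable only when the silo is powered
    powered: bool = False


def witness(field: list, topic: str) -> dict:
    """Return only what the given field can witness about `topic`.

    The function reads nothing but its arguments: no globals, no network,
    no fallback text. If no powered interior mentions the topic, it yields
    and names the dark silos whose signage mentions it.
    """
    needle = topic.lower()
    sources = [s.name for s in field if s.powered and needle in s.interior.lower()]
    if sources:
        return {"status": "answer", "sources": sources}
    dark = [s.name for s in field if not s.powered and needle in s.signage.lower()]
    return {"status": "yield", "dark_silos": dark}


field = [
    Silo("reid-inquiry", "Reid on perception", "Sensation is not perception.", powered=True),
    Silo("hume-treatise", "Hume on causation", "Constant conjunction.", powered=False),
]
print(witness(field, "perception"))  # answered from the powered silo
print(witness(field, "causation"))   # yields and names hume-treatise as dark
```

The interior of an unpowered silo is never consulted, so the only thing the function can say about it is what the signage says.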

The Point

The model isn't doing the work alone. The field is doing half the work.

A model sitting in a DeRAG field with the right material in front of it can do work that's outside its usual weight class. A model without that field has to guess from training data familiarity — and guesses from training data are exactly where hallucinations live.

Put a weaker model in the field and give it direct access to specific source material: it yields honestly on what it can't see, cites specific identifiers that only exist in this corpus, and produces merge-ready edits at exact locations. It doesn't do those things because it's elite. It does them because the architecture lets it read the actual file and gives it permission to say "that silo is dark — I can't answer from here."

Put a stronger model in the same field and the floor rises: deeper reasoning, better synthesis, real refactors instead of surgical one-liners. Same architecture, same silos, different brain — the field lifts whichever one you bring.

This is the practical claim: pick the brain that matches the work. Use a cheap fast model for navigation, honest yielding, and small targeted edits. Use a heavier model for architectural reasoning, creative translation, and work that requires holding many files in mind at once. Either way, the architecture situates the model in the specific material rather than leaving it to hallucinate from its training prior.

DeRAG isn't a brain. It's a field that brains can be situated in. The situation is what produces work that neither the brain nor the field could produce alone.

Run It

Requires: Python 3.10+ and a brain (see Brains below).

# Install
pip install fastapi uvicorn websockets

# Terminal 1 — start the server
python3 derag_server.py

# Terminal 2 — connect a brain
python3 bridge.py                              # claude CLI (default)
DERAG_BRAIN=gemini-api python3 bridge.py       # Gemini API (faster, with caching)

# Browser (open is macOS; use xdg-open on Linux, or visit the URL directly)
open http://localhost:8090

Agent Selection

By default, the agent selects its own silos — it reads the skyline and picks which buildings to power for each query (two-pass). This is the Initiative effectivity: the agent structures the field.

To turn this off and let keyword scoring pick silos (one-pass, original behavior):

DERAG_AGENT_SELECTS=false python3 derag_server.py

With agent selection off, the field controls access. The agent sees what the keywords scored, not what it chose. Yield is still lawful — the agent can still name dark silos. It just can't choose which gates open.
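
As a rough picture of the one-pass path, the sketch below reads the toggle and ranks silos by keyword hits. It is an assumption-laden illustration: `agent_selects` and `score_silos` are hypothetical names, only the value `false` is documented above (the other "off" spellings are a guess), and the real scoring in derag_server.py may differ.

```python
import os
import re


def agent_selects(env=None) -> bool:
    """Read DERAG_AGENT_SELECTS; the default (unset) keeps agent selection on."""
    env = os.environ if env is None else env
    value = env.get("DERAG_AGENT_SELECTS", "true").strip().lower()
    # Only "false" appears in the README; the other spellings are assumptions.
    return value not in {"false", "0", "no", "off"}


def score_silos(query: str, keywords_by_silo: dict, top_n: int = 2) -> list:
    """One-pass selection: count keyword hits per silo and power the top scorers."""
    words = set(re.findall(r"[a-z0-9]+", query.lower()))
    scores = {
        name: sum(1 for kw in keywords if kw.lower() in words)
        for name, keywords in keywords_by_silo.items()
    }
    # Silos with no hits stay dark; ties break alphabetically for determinism.
    ranked = sorted((n for n, s in scores.items() if s > 0), key=lambda n: (-scores[n], n))
    return ranked[:top_n]


silos = {
    "reid-senses": ["smell", "touch", "hardness"],
    "hume-causation": ["cause", "custom"],
    "verne-nautilus": ["submarine", "nemo"],
}
print(score_silos("Why is hardness known by touch?", silos))
```

In this toy field only reid-senses scores, so it is the only gate that opens; the other two remain visible from the street.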

Brains

The bridge is brain-agnostic. Two reference adapters ship with the distro; you can write your own — see BRAINS = {...} near the bottom of bridge.py.

claude-cli (default)

Spawns the claude CLI subprocess for each query. Easiest setup — no API key needed if you already have claude installed and authenticated. Cold subprocess startup adds 1-2s per query and there's no prompt caching, so this gets slow on big corpora.

python3 bridge.py
# Override the CLI command:
CLAUDE_CMD=/usr/local/bin/claude python3 bridge.py

Note on caching: The CLI adapter sends the full corpus on every call. There is no prompt caching. For big corpora this means high latency and token cost per turn. If you're doing serious work on a large corpus, use gemini-api instead — or write an anthropic-api adapter that hits the Anthropic API directly with prompt caching enabled. The adapter interface is simple (see "Writing your own adapter" below).

gemini-api (recommended for large corpora)

Direct HTTPS to Google's Generative Language API with explicit context caching (1-hour TTL). The corpus becomes a CachedContent on Gemini's side; subsequent queries reference it by ID and pay only the dynamic input + output cost. ~80% token reduction after the first query. Much faster on big corpora.

# Get a key at https://aistudio.google.com/app/apikey
echo "YOUR_KEY" > /tmp/.gemini_key
chmod 600 /tmp/.gemini_key

DERAG_BRAIN=gemini-api python3 bridge.py

# Or via env var:
GEMINI_API_KEY=YOUR_KEY DERAG_BRAIN=gemini-api python3 bridge.py

# Override the model (default: gemini-2.5-flash):
GEMINI_MODEL=gemini-2.5-pro DERAG_BRAIN=gemini-api python3 bridge.py

The bridge prints cache status on each turn so you can see whether you got a cache hit and how many tokens were charged vs. cached. Watch for [gemini cache created] on the first query and [gemini cache hit] on subsequent queries within the hour.
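
To see why the first query creates a cache and later queries within the hour hit it, it can help to model the bookkeeping locally. The sketch below is not the bridge's implementation and makes no Gemini API calls. `CorpusCacheIndex` is a hypothetical class, and keying the cache on a hash of the corpus text is an assumption.

```python
import hashlib
import time


class CorpusCacheIndex:
    """Tracks which corpus texts have a live remote cache entry (bookkeeping only)."""

    def __init__(self, ttl_seconds: float = 3600.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # corpus hash -> expiry time

    def lookup(self, corpus: str) -> str:
        """Return 'hit' if this exact corpus was cached within the TTL, else 'created'."""
        key = hashlib.sha256(corpus.encode("utf-8")).hexdigest()
        now = self._clock()
        expiry = self._entries.get(key)
        if expiry is not None and now < expiry:
            return "hit"
        # A real adapter would upload the corpus here and keep the returned cache ID.
        self._entries[key] = now + self.ttl_seconds
        return "created"


# Fake clock so the demo does not have to wait an hour.
fake_now = [0.0]
index = CorpusCacheIndex(ttl_seconds=3600, clock=lambda: fake_now[0])
print(index.lookup("corpus v1"))  # first query
print(index.lookup("corpus v1"))  # same corpus, inside the TTL
fake_now[0] = 3600.0
print(index.lookup("corpus v1"))  # TTL elapsed
```

Under this model, changing which silos are powered changes the corpus text and therefore the key, which would explain a fresh cache creation in the middle of a session.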

Writing your own adapter

Any function (corpus: str, query: str) -> str (sync or async) is a brain. Add it to the BRAINS dict in bridge.py:

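async def call_my_brain(corpus: str, query: str) -> str:
    # Call ollama, llama.cpp, OpenAI, whatever, and return the reply as a string.
    # Placeholder so the function runs as written; replace with a real model call.
    response_text = f"(no model wired up yet) query was: {query}"
    return response_text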

BRAINS = {
    "claude-cli": call_claude_cli,
    "gemini-api": call_gemini_api,
    "my-brain": call_my_brain,
}

Then run DERAG_BRAIN=my-brain python3 bridge.py.
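
Because a brain may be sync or async, whatever calls it has to handle both. The sketch below shows one common way to do that with `inspect.isawaitable`. How bridge.py actually dispatches is not shown in this README, so treat this as an assumption. The echo brains, `DEMO_BRAINS`, and `run_brain` are hypothetical names used only for the demo.

```python
import asyncio
import inspect


def call_echo_sync(corpus: str, query: str) -> str:
    """Hypothetical synchronous brain used only for this demo."""
    return f"sync saw {len(corpus)} chars; query={query}"


async def call_echo_async(corpus: str, query: str) -> str:
    """Hypothetical asynchronous brain used only for this demo."""
    await asyncio.sleep(0)
    return f"async saw {len(corpus)} chars; query={query}"


DEMO_BRAINS = {"echo-sync": call_echo_sync, "echo-async": call_echo_async}


async def run_brain(name: str, corpus: str, query: str) -> str:
    """Call the named brain, awaiting the result only if it is awaitable."""
    try:
        brain = DEMO_BRAINS[name]
    except KeyError:
        raise ValueError(f"unknown brain {name!r}; choose from {sorted(DEMO_BRAINS)}") from None
    result = brain(corpus, query)
    if inspect.isawaitable(result):
        result = await result
    return result


print(asyncio.run(run_brain("echo-sync", "corpus text", "hello")))
print(asyncio.run(run_brain("echo-async", "corpus text", "hello")))
```

A slow synchronous brain called this way blocks the event loop while it runs. Wrapping it with `asyncio.to_thread` is one option if that matters for your setup.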

Patches

Drop markdown files in patches/. Each patch is a silo in the city.

Format:

---
keywords: ["your", "keyword", "list"]
description: "One-line description"
---

[SILO: patch-name]
SIGNAGE: What's visible from the street — name, description, purpose.

FLOORS:
  Topic Area One
  Topic Area Two

[FOYER: patch-name]
Optional handling instructions for the agent. The agent reads this before
entering the interior — it tells the brain how this particular silo wants
to be worked. Quote or paraphrase? Preserve voice or distill? Surface
protocols with or without ground? Foyers are binding, not advice. The user
never sees them. Remove the block entirely if you don't need it.
[/FOYER]

[INTERIOR: patch-name]

## Topic Area One

Your content here. This is only accessible when the silo is powered.

## Topic Area Two

More content.

[/INTERIOR]

A template is included at patches/_template.md — copy it and fill in.

Reload patches by restarting the server, or refreshing /patches in the UI (the server re-reads on each disclosure).
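
The patch format in this section is regular enough to split with a few regular expressions. The sketch below is a minimal parser under stated assumptions: `parse_patch` and `disclose` are hypothetical names, the front matter is treated as one `key: JSON value` pair per line (true of the template, not guaranteed in general), and the real loader in derag_server.py may work differently.

```python
import json
import re
from typing import Optional

SAMPLE_PATCH = '''---
keywords: ["your", "keyword", "list"]
description: "One-line description"
---

[SILO: patch-name]
SIGNAGE: What's visible from the street.

FLOORS:
  Topic Area One
  Topic Area Two

[FOYER: patch-name]
Quote, don't paraphrase.
[/FOYER]

[INTERIOR: patch-name]

## Topic Area One

Your content here.

[/INTERIOR]
'''


def _block(text: str, tag: str) -> Optional[str]:
    """Return the body of a [TAG: name] ... [/TAG] block, or None if absent."""
    match = re.search(rf"\[{tag}:[^\]]*\]\n(.*?)\[/{tag}\]", text, re.DOTALL)
    return match.group(1).strip() if match else None


def parse_patch(text: str) -> dict:
    """Split a patch into front matter, street-level view, foyer, and interior."""
    meta = {}
    front = re.match(r"---\n(.*?)\n---\n", text, re.DOTALL)
    if front:
        for line in front.group(1).splitlines():
            key, _, value = line.partition(":")
            # Assumption: each value is valid JSON, as in the template.
            meta[key.strip()] = json.loads(value.strip())
    name = re.search(r"\[SILO:\s*([^\]]+)\]", text)
    signage = re.search(r"^SIGNAGE:\s*(.+)$", text, re.MULTILINE)
    floors = re.search(r"^FLOORS:\n((?:[ \t]+\S.*\n?)+)", text, re.MULTILINE)
    return {
        "name": name.group(1).strip() if name else None,
        "meta": meta,
        "signage": signage.group(1).strip() if signage else None,
        "floors": [f.strip() for f in floors.group(1).splitlines()] if floors else [],
        "foyer": _block(text, "FOYER"),
        "interior": _block(text, "INTERIOR"),
    }


def disclose(patch: dict, powered: bool) -> dict:
    """Street view for every silo; foyer and interior only when powered."""
    view = {k: patch[k] for k in ("name", "signage", "floors")}
    if powered:
        view["foyer"] = patch["foyer"]
        view["interior"] = patch["interior"]
    return view


patch = parse_patch(SAMPLE_PATCH)
print(disclose(patch, powered=False))
```

Keeping disclosure separate from parsing means an unpowered silo never carries its interior past the loader, so nothing downstream can leak it by accident.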

Architecture

┌─────────┐       WebSocket       ┌────────────────┐
│  User   │ ◄──────────────────── │  DeRAG Server  │
│ Browser │                        │  (Python)      │
└─────────┘                        └────┬───────────┘
                                        │
                                        │ brain_call
                                        │ (corpus + query)
                                        ▼
                                   ┌──────────┐
                                   │  Bridge  │
                                   │  (Python)│
                                   └────┬─────┘
                                        │
                          ┌─────────────┼─────────────┐
                          │             │             │
                          ▼             ▼             ▼
                   ┌──────────┐  ┌──────────┐  ┌──────────┐
                   │ claude   │  │ Gemini   │  │  your    │
                   │  CLI     │  │   API    │  │ adapter  │
                   └──────────┘  └──────────┘  └──────────┘
                    (default)    (with cache)

The server holds the field. The bridge routes the corpus through a brain adapter you choose at startup. No API keys in the server. Your patch files stay on your machine, though the corpus is sent to whichever brain you choose. Pick the brain that fits the conversation: claude-cli for no-key setups (it still needs an installed, authenticated claude CLI), gemini-api for big corpora and snappy turns.

What This Is Not

This is not a retrieval system. It does not use embeddings. It does not do semantic search. Keyword scoring is intentional and minimal — the point is not to find the "right" patch. The point is to give the agent a lawful way to inhabit what the field offers and name what it doesn't.

This is not a production system. It is a reference implementation. A cleaner substrate. A way to show that agents can have relationships with knowledge, not just search through it.

The Full Thing

DeRAG is the open-source interface layer. The deeper architecture runs under it, at PLAYi.io.

Workspaces

DeRAG ships with four parallel workspaces. Each is a complete installation with identical code (derag_server.py, bridge.py, static/index.html) and different corpora in patches/.

| Workspace | Corpus | Silos |
|---|---|---|
| DeRAG/ | Reid + Hume + Descartes + Hobbes | 45 |
| DeRAG-reid-james/ | Reid + William James, Principles of Psychology | 27 |
| DeRAG-verne/ | Jules Verne, 6 novels, act-split | 19 |
| DeRAG-hermes/ | Nous Research Hermes-Agent source code (MIT) | 28 |

To switch workspaces, stop the running server and start from a different directory:

cd ~/DeRAG-hermes        # or whichever workspace
python3 derag_server.py  # serves that workspace's patches

Workspace file structure

DeRAG-{name}/
├── derag_server.py       # server (identical across workspaces)
├── bridge.py             # bridge (identical across workspaces)
├── static/index.html     # UI (identical across workspaces)
├── patches/              # THIS IS WHAT CHANGES per workspace
│   ├── _template.md      # starter template for new patches
│   ├── MANIFEST.md       # silo inventory (skipped by loader)
│   └── *.md              # your silos — one file per silo
├── thread.json           # conversation state (per workspace, gitignored)
├── partner_curations.txt # coupling sediment (per workspace, gitignored)
├── README.md
├── LICENSE
└── requirements.txt

The code lives in DeRAG-core/ — each workspace symlinks to it. Edit the core once, all workspaces see the change. No manual sync needed.

DeRAG-core/derag_server.py    ← single source of truth
DeRAG-hermes/derag_server.py  → ../DeRAG-core/derag_server.py  (symlink)
DeRAG-verne/derag_server.py   → ../DeRAG-core/derag_server.py  (symlink)

The patches are independent — each workspace is its own corpus. thread.json and partner_curations.txt are per-workspace session state and should never be deleted.
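
If you would rather script the symlink layout than build it by hand, something like the following could work. It is a sketch, not a tool that ships with DeRAG: `link_workspace` and `SHARED_FILES` are hypothetical names, static/index.html is left out for brevity, and the demo runs in a temporary directory so it touches nothing real.

```python
import os
import tempfile
from pathlib import Path

SHARED_FILES = ["derag_server.py", "bridge.py"]  # static/index.html omitted for brevity


def link_workspace(root: Path, workspace: str, core: str = "DeRAG-core") -> list:
    """Create `workspace` under `root` with relative symlinks into the core directory."""
    ws = root / workspace
    (ws / "patches").mkdir(parents=True, exist_ok=True)  # the corpus stays per-workspace
    links = []
    for filename in SHARED_FILES:
        link = ws / filename
        if link.is_symlink() or link.exists():
            link.unlink()  # replace a stale copy or link
        # Relative target, so the tree can be moved as a whole.
        link.symlink_to(Path("..") / core / filename)
        links.append(link)
    return links


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "DeRAG-core").mkdir()
    for filename in SHARED_FILES:
        (root / "DeRAG-core" / filename).write_text(f"# {filename}\n")
    links = link_workspace(root, "DeRAG-myproject")
    print(os.readlink(links[0]))       # where the link points
    print(links[0].read_text(), end="")  # content is read from the core file
```

Relative targets match the `../DeRAG-core/...` form shown above, which is also why a workspace copied with `cp -a` into the same parent directory keeps working.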

Building your own workspace

cp -a DeRAG-reid-james DeRAG-myproject   # copy any workspace for the code
cd DeRAG-myproject
mkdir -p patches.old && mv patches/*.md patches.old/  # preserve, don't delete
cp patches.old/_template.md patches/                   # keep the template
# Now add your own .md silos to patches/
python3 derag_server.py

The Default Corpus

The DeRAG/ workspace ships with Thomas Reid's Inquiry Into the Human Mind on the Principles of Common Sense (1764) plus selections from Hume, Descartes, and Hobbes. Reid is the philosophical ground DeRAG sits on: Scottish Common Sense Realism — direct perception, no representational middleman, the world as it is rather than as we compute it.

An agent running DeRAG against Reid is engaging with the philosophy that justifies its own architecture. The loop is clean.

Run it. Ask it about the distinction between sensation and perception. Ask it why hardness is a primary quality and colour is not. Ask it what "natural signs" means. Then ask it something unpowered and watch it name the gap honestly.

License

MIT. The code is yours. The Reid corpus is public domain. Build whatever.

Contact

Ronald D Watson, ron@PLAYi.io, PLAYi.io


Built in an RV in Dade City, FL.
