AexRAG - Self-Hosted Agentic RAG Platform

AexRAG is a production-ready, self-hosted Agentic RAG (Retrieval-Augmented Generation) platform built in Rust. It combines LLMs with Qdrant vector search and a layered memory system, so agents can reason over retrieved knowledge and earlier conversation context.

✨ Features

🤖 Agentic AI: Full ReAct (Reasoning + Acting) loop implementation
📚 RAG (Retrieval-Augmented Generation): Vector search with Qdrant for semantic retrieval
🧠 Advanced Memory System:
- Working memory (recent conversation context)
- Episodic memory (compressed conversation summaries)
- Semantic memory (pgvector-based semantic search)
🎯 Multiple Response Formats: Text, Markdown, or structured JSON with schema validation
🔌 Multi-Provider Support: OpenAI, Anthropic, Ollama (local models)
📊 Beautiful Dashboard: Single-page embedded HTML dashboard (no separate frontend build)
📄 Document Management: Upload and manage PDFs, text files, and markdown documents
🔐 Secure: API key authentication, encrypted provider keys
🐳 Easy Deployment: Single Docker image + docker-compose
⚡ High Performance: Built in Rust with async/await throughout

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                        AexRAG Core                           │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │  Agent Loop  │  │   Retrieval  │  │    Memory    │    │
│  │   (ReAct)    │→ │   (Qdrant)   │→ │  (pgvector)  │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
│         ↓                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │     Tools    │  │  LLM Provider│  │   Formatter  │    │
│  │   Registry   │  │   (Multi)    │  │ (text/json)  │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
└─────────────────────────────────────────────────────────────┘
           │                    │                    │
           ▼                    ▼                    ▼
    ┌───────────┐        ┌───────────┐       ┌───────────┐
    │ Postgres  │        │  Qdrant   │       │  OpenAI   │
    │ (pgvector)│        │  Vector   │       │ Anthropic │
    │           │        │    DB     │       │  Ollama   │
    └───────────┘        └───────────┘       └───────────┘

🚀 Quick Start

Prerequisites

Rust 1.83+ - Install from https://rustup.rs
Docker & Docker Compose - For PostgreSQL and Qdrant

Setup (< 5 minutes)

cd nexus

# Start dependencies
make setup-local

# Run AexRAG
make dev

AexRAG will:

Auto-run database migrations
Generate an API key (printed to console - save this!)
Start on http://localhost:3000

Access the Dashboard

Open http://localhost:3000 and enter your API key.

Alternative: Full Docker Build

See SETUP_INSTRUCTIONS.md for Docker deployment options.

📖 Usage Guide

Creating an LLM Provider

Before creating agents, you need to configure at least one LLM provider:

curl -X POST http://localhost:3000/api/providers \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "openai",
    "display_name": "OpenAI GPT-4",
    "api_key": "sk-...",
    "base_url": "https://api.openai.com",
    "default_model": "gpt-4"
  }'

Supported Providers:

openai - OpenAI API
anthropic - Anthropic Claude API
ollama - Local Ollama instance (set base_url to http://localhost:11434)

Creating a Knowledge Base

curl -X POST http://localhost:3000/api/knowledge-bases \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Product Documentation",
    "description": "All product docs and guides",
    "embedding_provider_id": "PROVIDER_UUID",
    "embedding_model": "text-embedding-3-small",
    "chunk_size": 512,
    "chunk_overlap": 64
  }'
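The `chunk_size` and `chunk_overlap` settings control how a document is split before it is embedded. As a rough illustration only, here is a sliding-window chunker. It assumes sizes are counted in whitespace-separated words, whereas AexRAG may count tokens or characters, and `chunk_words` is a hypothetical name, not part of the AexRAG codebase.

```rust
/// Splits `text` into overlapping chunks of at most `chunk_size` words.
/// Assumption: sizes are measured in whitespace-separated words; the real
/// platform may count tokens or characters instead.
fn chunk_words(text: &str, chunk_size: usize, chunk_overlap: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    assert!(chunk_overlap < chunk_size, "chunk_overlap must be smaller than chunk_size");

    let words: Vec<&str> = text.split_whitespace().collect();
    // Each new window starts this many words after the previous one.
    let step = chunk_size - chunk_overlap;
    let mut chunks = Vec::new();
    let mut start = 0;

    while start < words.len() {
        let end = (start + chunk_size).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break; // the last window reached the end of the text
        }
        start += step;
    }
    chunks
}

fn main() {
    // Tiny sizes so the overlap is easy to see; the example above uses 512 and 64.
    for chunk in chunk_words("a b c d e f g h i j", 4, 1) {
        println!("{chunk}");
    }
}
```

A larger overlap repeats more context across neighbouring chunks, which can help retrieval at chunk boundaries at the cost of storing and embedding more text.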

Uploading Documents

curl -X POST http://localhost:3000/api/knowledge-bases/KB_UUID/documents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@/path/to/document.pdf"

Supported formats: PDF, TXT, MD (more coming soon)

Documents are processed in the background, and a document stays in the "indexing" state until processing finishes. Check its status via the API.

Creating an Agent

curl -X POST http://localhost:3000/api/agents \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Customer Support Agent",
    "description": "Helps answer customer questions",
    "provider_id": "PROVIDER_UUID",
    "model": "gpt-4",
    "system_prompt": "You are a helpful customer support agent. Use the knowledge base to answer questions accurately.",
    "temperature": 0.7,
    "max_tokens": 2048,
    "response_format": "markdown",
    "max_retrieval_chunks": 5,
    "retrieval_threshold": 0.7,
    "max_tool_iterations": 5
  }'
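The `max_tool_iterations` setting bounds the agent's ReAct loop. The following is a minimal sketch of such a bounded loop, not AexRAG's actual implementation: `Step`, `run_agent`, and the `think` and `call_tool` callbacks are hypothetical stand-ins for the LLM call and the tool registry, and the fallback message when the budget runs out is an assumption.

```rust
/// What the model decides to do on each step (hypothetical shape).
enum Step {
    CallTool { name: String, input: String },
    Answer(String),
}

/// Runs a reason/act loop for at most `max_tool_iterations` steps.
/// `think` stands in for the LLM call, `call_tool` for the tool registry.
fn run_agent(
    query: &str,
    max_tool_iterations: usize,
    mut think: impl FnMut(&str, &[String]) -> Step,
    call_tool: impl Fn(&str, &str) -> String,
) -> String {
    let mut observations: Vec<String> = Vec::new();
    for _ in 0..max_tool_iterations {
        match think(query, &observations) {
            // The model produced a final answer: stop immediately.
            Step::Answer(text) => return text,
            // The model asked for a tool: run it and record the observation.
            Step::CallTool { name, input } => observations.push(call_tool(&name, &input)),
        }
    }
    // Assumption: when the budget is exhausted we return a fixed message.
    format!("Stopped after {max_tool_iterations} tool iterations without a final answer.")
}

fn main() {
    // Stub model: search once, then answer with what the tool returned.
    let think = |q: &str, obs: &[String]| {
        if obs.is_empty() {
            Step::CallTool { name: "search".to_string(), input: q.to_string() }
        } else {
            Step::Answer(format!("Based on: {}", obs[0]))
        }
    };
    let call_tool = |name: &str, input: &str| format!("{name} result for '{input}'");
    println!("{}", run_agent("What is our return policy?", 5, think, call_tool));
}
```

The cap keeps a confused model from calling tools indefinitely, which also bounds latency and token cost per query.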

Linking Knowledge Bases to Agents

# Link a knowledge base to an agent
curl -X POST http://localhost:3000/api/agents/AGENT_UUID/knowledge-bases \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "knowledge_base_id": "KB_UUID"
  }'
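Once a knowledge base is linked, the agent's `retrieval_threshold` and `max_retrieval_chunks` settings presumably decide which retrieved chunks reach the prompt. The sketch below shows one plausible reading: it assumes a higher score means a closer match and that the threshold is a minimum score. `ScoredChunk` and `select_chunks` are hypothetical names used only for illustration.

```rust
/// A retrieved chunk with its similarity score (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct ScoredChunk {
    filename: String,
    score: f32,
}

/// Keeps chunks scoring at least `threshold`, best first, capped at `max_chunks`.
/// Assumption: higher scores mean closer matches.
fn select_chunks(mut hits: Vec<ScoredChunk>, threshold: f32, max_chunks: usize) -> Vec<ScoredChunk> {
    hits.retain(|h| h.score >= threshold);
    // total_cmp gives a total order on floats, so the sort never panics.
    hits.sort_by(|a, b| b.score.total_cmp(&a.score));
    hits.truncate(max_chunks);
    hits
}

fn main() {
    let hits = vec![
        ScoredChunk { filename: "returns.pdf".to_string(), score: 0.92 },
        ScoredChunk { filename: "shipping.md".to_string(), score: 0.65 },
        ScoredChunk { filename: "faq.txt".to_string(), score: 0.81 },
    ];
    // With threshold 0.7 and a cap of 5, shipping.md is dropped.
    for chunk in select_chunks(hits, 0.7, 5) {
        println!("{} ({:.2})", chunk.filename, chunk.score);
    }
}
```

Raising the threshold trades recall for precision; lowering `max_retrieval_chunks` shortens the prompt, which is also why the troubleshooting section suggests it for high latency.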

Querying an Agent

curl -X POST http://localhost:3000/api/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "AGENT_UUID",
    "query": "What is our return policy?",
    "session_id": "user-123-session"
  }'

Response:

{
  "response": "Our return policy allows...",
  "sources": [
    {"filename": "returns.pdf", "score": 0.92}
  ],
  "latency_ms": 1234,
  "tokens": {
    "input_tokens": 450,
    "output_tokens": 120
  }
}

Session Management

Sessions maintain conversation context. Use the same session_id for multi-turn conversations:

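# First message
curl -X POST http://localhost:3000/api/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "...", "query": "Hello", "session_id": "session-1"}'

# Follow-up (remembers context)
curl -X POST http://localhost:3000/api/query \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"agent_id": "...", "query": "What did I just ask?", "session_id": "session-1"}'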

View session history:

curl http://localhost:3000/api/sessions/session-1/messages \
  -H "Authorization: Bearer YOUR_API_KEY"

🧪 Advanced Features

JSON Response Format with Schema Validation

Create an agent that returns structured data:

{
  "name": "Structured Data Agent",
  "response_format": "json",
  "json_schema": {
    "type": "object",
    "properties": {
      "answer": {"type": "string"},
      "confidence": {"type": "number"},
      "sources": {"type": "array", "items": {"type": "string"}}
    },
    "required": ["answer", "confidence"]
  }
}

Memory Compression

After 20 conversation turns, AexRAG automatically:

Summarizes the oldest 10 messages using the LLM
Embeds the summary
Stores it as a semantic memory block
Deletes the old messages

These summaries are retrieved on future queries for long-term context.
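The compression steps above can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: it treats one message as one turn, triggers once 20 messages have accumulated, and uses the hypothetical `Memory` struct and `compress_if_needed` function, with `summarize` and `embed` standing in for the LLM and embedding calls.

```rust
const COMPRESS_AFTER: usize = 20; // turns before compression kicks in
const COMPRESS_BATCH: usize = 10; // oldest messages folded into one summary

/// In-memory stand-in for the message store and the summary store.
struct Memory {
    working: Vec<String>,
    summaries: Vec<(String, Vec<f32>)>, // (summary text, its embedding)
}

/// Summarizes, embeds, and stores the oldest messages, then drops them.
/// Returns true if a compression pass ran.
fn compress_if_needed(
    memory: &mut Memory,
    summarize: impl Fn(&[String]) -> String,
    embed: impl Fn(&str) -> Vec<f32>,
) -> bool {
    if memory.working.len() < COMPRESS_AFTER {
        return false;
    }
    // Removing the batch from working memory is the "delete" step.
    let oldest: Vec<String> = memory.working.drain(..COMPRESS_BATCH).collect();
    let summary = summarize(&oldest);
    let embedding = embed(&summary);
    memory.summaries.push((summary, embedding));
    true
}

fn main() {
    let mut memory = Memory {
        working: (0..20).map(|i| format!("message {i}")).collect(),
        summaries: Vec::new(),
    };
    // Stubs: a real system would call the LLM and the embedding model here.
    let ran = compress_if_needed(
        &mut memory,
        |msgs: &[String]| format!("summary of {} messages", msgs.len()),
        |text: &str| vec![text.len() as f32],
    );
    println!("ran={ran}, working={}, summaries={}", memory.working.len(), memory.summaries.len());
}
```

Because the original messages are deleted after compression, the summary is the only long-term record of them, so summary quality directly affects what the agent can recall later.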

📊 API Reference

Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/health` | Health check |
| GET | `/api/stats` | Platform statistics |
| POST | `/api/query` | Query an agent |
| GET | `/api/query-logs` | View query logs |
| GET/POST/PUT/DELETE | `/api/agents` | Manage agents |
| GET/POST/DELETE | `/api/knowledge-bases` | Manage knowledge bases |
| POST | `/api/knowledge-bases/:id/documents` | Upload documents |
| GET/POST | `/api/providers` | Manage LLM providers |
| GET | `/api/tools` | List available tools |
| PUT | `/api/agents/:id/tools/:name` | Configure agent tools |
| GET/DELETE | `/api/sessions` | Manage sessions |
| POST | `/api/keys` | Create new API key |

All protected endpoints require an `Authorization: Bearer YOUR_API_KEY` header.
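For readers writing their own client or proxy in front of AexRAG, the header format can be checked with a few lines of code. This is a generic sketch of parsing a Bearer header, not AexRAG's authentication code. `bearer_token` is a hypothetical helper, and it matches the scheme name case-sensitively for simplicity, although HTTP treats auth scheme names as case-insensitive.

```rust
/// Extracts the token from an `Authorization` header value of the form
/// `Bearer <token>`. Returns None for other schemes or an empty token.
fn bearer_token(header_value: &str) -> Option<&str> {
    let token = header_value.strip_prefix("Bearer ")?.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

fn main() {
    for value in ["Bearer YOUR_API_KEY", "Basic dXNlcjpwYXNz", "Bearer "] {
        match bearer_token(value) {
            Some(token) => println!("{value:?} -> token {token:?}"),
            None => println!("{value:?} -> rejected"),
        }
    }
}
```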

🛠️ Development

Local Development (without Docker Compose)

# Start PostgreSQL with pgvector
docker run -d -p 5432:5432 \
  -e POSTGRES_USER=nexus \
  -e POSTGRES_PASSWORD=nexus \
  -e POSTGRES_DB=nexus \
  pgvector/pgvector:pg16

# Start Qdrant
docker run -d -p 6333:6333 -p 6334:6334 qdrant/qdrant

# Create .env file
cp .env.example .env

# Run migrations and start AexRAG
cargo run

Running Tests

cargo test

Building for Production

cargo build --release
./target/release/nexus

🔒 Security Considerations

Change NEXUS_SECRET in production; it is used to encrypt stored provider API keys
Use HTTPS in production (for example, behind a reverse proxy such as nginx or Traefik)
Rotate API keys regularly
Configure tool restrictions (e.g., allowed domains for http_call)
Review query logs for suspicious activity
Backup your database regularly

📈 Performance Tuning

Database

-- Add indexes for common queries
CREATE INDEX idx_query_logs_session_created ON query_logs(session_id, created_at DESC);
CREATE INDEX idx_memory_embedding_ops ON memory_blocks USING ivfflat (embedding vector_cosine_ops);

Qdrant

Configure collection parameters for better performance:

// In create_collection, adjust:
quantization_config: Some(QuantizationConfig::Scalar(...))
hnsw_config: Some(HnswConfig { m: 16, ef_construct: 100, ... })

Connection Pools

Adjust in db.rs:

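use sqlx::postgres::PgPoolOptions;
use std::time::Duration;

PgPoolOptions::new()
    .max_connections(100)  // Raise for high load; keep it within Postgres's own max_connections
    .acquire_timeout(Duration::from_secs(5))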

🐛 Troubleshooting

"Connection refused" on startup

Ensure PostgreSQL and Qdrant are running:

docker-compose ps

Migrations fail

Reset database (⚠️ destroys data):

docker-compose down -v
docker-compose up -d

Document indexing stuck on "indexing"

Check logs:

docker-compose logs -f nexus

Ensure the embedding provider has a valid API key.

High latency

Check avg_latency_ms in stats dashboard
Review query logs for slow queries
Consider using a faster model
Reduce max_retrieval_chunks

🤝 Contributing

Contributions welcome! Please:

Fork the repository
Create a feature branch
Write tests for new functionality
Submit a pull request

📄 License

MIT License - see LICENSE file for details

🙏 Acknowledgments

Built with:

Axum - Web framework
SQLx - Async SQL
Qdrant - Vector database
pgvector - Postgres vector extension

Made with ⚡ by the AexRAG team
