VoidAxiom/rag-nq

RAG NQ Showcase

Local RAG demo on sentence-transformers/NQ-retrieval with Qdrant retrieval, FastAPI endpoints, and a Vite + React + TypeScript web UI.

Setup

cd rag-nq-showcase
uv sync
cp .env.example .env

Edit .env only if you need to override defaults such as RAG_QDRANT_URL, RAG_QDRANT_COLLECTION, or generation provider settings.
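As a rough illustration of how such overrides are typically consumed, the sketch below reads the two Qdrant variables from the environment and falls back to defaults when they are unset. The `QdrantSettings` class, the `load_qdrant_settings` helper, and both default values are assumptions made for this example; the project's actual settings code and defaults may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class QdrantSettings:
    url: str
    collection: str


def load_qdrant_settings(env=None) -> QdrantSettings:
    """Read Qdrant settings from a mapping, falling back to placeholder defaults."""
    env = os.environ if env is None else env
    return QdrantSettings(
        # 6333 is Qdrant's default HTTP port; the project's real default is an assumption.
        url=env.get("RAG_QDRANT_URL", "http://localhost:6333"),
        # "nq_passages" is a hypothetical collection name used only for illustration.
        collection=env.get("RAG_QDRANT_COLLECTION", "nq_passages"),
    )


# Usage: settings = load_qdrant_settings()  # reads os.environ
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.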

Run Qdrant

docker compose up -d
uv run python -m src.scripts.doctor

To stop Qdrant without deleting stored vectors:

docker compose stop

Build Indexes

Use the all-in-one command for a normal local run:

uv run python -m src.scripts.build_indexes

Or run the stages explicitly:

uv run python -m src.ingestion.ingest_raw
uv run python -m src.scripts.ingest_chunks
uv run python -m src.scripts.index_dense
uv run python -m src.scripts.index_sparse

For a smaller smoke-test corpus, set RAG_MAX_PASSAGES in .env before building indexes.
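How the ingestion code applies this cap is not shown here. One plausible approach, sketched below, is to parse the variable once and truncate the passage stream lazily. The `parse_max_passages` and `limit_passages` helpers are hypothetical names, and treating an unset or empty value as "no cap" is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional


def parse_max_passages(raw: Optional[str]) -> Optional[int]:
    """Return a positive cap, or None when the variable is unset or empty."""
    if raw is None or raw.strip() == "":
        return None
    value = int(raw)
    if value <= 0:
        raise ValueError("RAG_MAX_PASSAGES must be a positive integer")
    return value


def limit_passages(passages: Iterable[str], cap: Optional[int]) -> Iterator[str]:
    """Yield at most `cap` passages; a cap of None yields every passage."""
    return islice(passages, cap)


# Usage (hypothetical): limit_passages(stream, parse_max_passages(os.environ.get("RAG_MAX_PASSAGES")))
```

Because `islice` is lazy, a small cap avoids reading the rest of the corpus at all.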

Run the API

uv run uvicorn app.api.main:app --reload

Open:

  • API docs: http://127.0.0.1:8000/docs
  • Health: http://127.0.0.1:8000/api/health
  • Config: http://127.0.0.1:8000/api/config

Example retrieval request:

curl -s http://127.0.0.1:8000/api/retrieve \
  -H "Content-Type: application/json" \
  -d '{"query":"What is the capital of France?","mode":"hybrid","top_k":5}' \
  | python -m json.tool

Example query request:

curl -s http://127.0.0.1:8000/api/query \
  -H "Content-Type: application/json" \
  -d '{"query":"What is the capital of France?","mode":"hybrid","top_k":5,"generate":true}' \
  | python -m json.tool
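The same two requests can be issued from Python with only the standard library. The sketch below mirrors the JSON bodies of the curl examples above; `build_request` and `post_json` are helper names invented for this example, and the response shape is not assumed beyond it being JSON.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:8000"  # same host and port as the curl examples


def build_request(endpoint, query, mode="hybrid", top_k=5, generate=None):
    """Build a POST request whose JSON body matches the curl examples."""
    payload = {"query": query, "mode": mode, "top_k": top_k}
    if generate is not None:
        # Only /api/query is shown with a "generate" field.
        payload["generate"] = generate
    return urllib.request.Request(
        f"{API_BASE}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(request, timeout=30):
    """Send the request and decode the JSON response (requires the API to be running)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the API running:
# result = post_json(build_request("/api/query", "What is the capital of France?", generate=True))
```

Separating request construction from sending keeps the payload easy to inspect without a live server.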

Run the Web UI

Two modes — pick one.

Dev (with hot reload, Vite proxy → FastAPI):

cd app/web
npm install   # first time only
npm run dev

Then open http://localhost:5173. Vite proxies /api/* to FastAPI on http://127.0.0.1:8000, so start the API in a separate terminal first.

Production (FastAPI serves built bundle):

docker compose up -d qdrant
(cd app/web && npm install && npm run build)
RAG_WEB_DIST_PATH=app/web/dist uv run uvicorn app.api.main:app

Then open http://127.0.0.1:8000. FastAPI serves the SPA at / (with SPA fallback for arbitrary React Router routes) alongside the JSON endpoints under /api/*: /api/health, /api/config, /api/retrieve, /api/query, /api/scoreboard, /api/components, and /api/eval_questions/{benchmark}. The (cd app/web && ...) subshell keeps the outer shell at the repo root so uvicorn's Python import path resolves app.api.main correctly and the RAG_WEB_DIST_PATH relative path resolves to <repo>/app/web/dist.
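The SPA fallback can be pictured as a small routing decision: API paths go to the JSON handlers, paths that match a built file are served as static assets, and everything else returns index.html so React Router can handle the route client-side. The sketch below models only that decision in plain Python. The `resolve_route` function and the example file names are hypothetical, and the project's actual FastAPI wiring may be organized differently.

```python
def resolve_route(path: str, built_files: set) -> str:
    """Classify a request path as 'api', 'static:<file>', or 'index'."""
    # JSON endpoints live under /api/ and are never rewritten to the SPA.
    if path == "/api" or path.startswith("/api/"):
        return "api"
    # Paths that match a file in the build output are served directly.
    relative = path.lstrip("/")
    if relative in built_files:
        return f"static:{relative}"
    # Anything else (e.g. /scoreboard) falls back to index.html.
    return "index"


# Usage with a hypothetical build output:
# resolve_route("/scoreboard", {"index.html", "assets/app.js"})
```

Checking the API prefix first matters: without it, an unknown /api/ path would silently return the HTML shell instead of an API error.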

Pages (more added per phase):

  • / — Home / Ask
  • /scoreboard — master sortable table reading GET /api/scoreboard

Evaluate Retrieval

uv run python -m src.scripts.eval_retrieval \
  --k-values 1,5,10,20 \
  --modes dense,sparse,hybrid \
  --max-queries 200 \
  --output artifacts/retrieval_eval.json
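The metrics the script reports are not listed here. A common choice for a `--k-values` sweep is hit rate at k: the fraction of queries with at least one relevant passage among the top k results. The sketch below computes that under this assumption; `hit_at_k` and `mean_hit_rate` are illustrative names, not the script's API.

```python
def hit_at_k(ranked_ids, relevant_ids, k):
    """Return 1.0 if any of the top-k ranked ids is relevant, else 0.0."""
    return 1.0 if any(doc_id in relevant_ids for doc_id in ranked_ids[:k]) else 0.0


def mean_hit_rate(runs, k_values):
    """Average hit@k over queries.

    runs: list of (ranked_ids, relevant_ids) pairs, one per query.
    """
    return {
        k: sum(hit_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)
        for k in k_values
    }


# Usage with toy data: mean_hit_rate([(["a", "b"], {"b"})], [1, 5])
```

Running the same computation once per mode (dense, sparse, hybrid) would give a per-mode table comparable across k values.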

Run Tests and Lint

uv run pytest
uv run ruff check .

About

Local-only, graph-augmented, iteratively-reasoning, self-corrective RAG showcase on Apple Silicon — pushing toward published SOTA on multi-hop QA (MuSiQue, 2WikiMultiHopQA, HotpotQA) with HippoRAG 2 + Adaptive routing + Search-o1 iterative reasoning + CRAG-style gating + NLI faithfulness verification.
