Local RAG demo on sentence-transformers/NQ-retrieval with Qdrant retrieval,
FastAPI endpoints, and a Vite + React + TypeScript web UI.
cd rag-nq-showcase
uv sync
cp .env.example .env
Edit .env only if you need to override defaults such as RAG_QDRANT_URL,
RAG_QDRANT_COLLECTION, or generation provider settings.
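As a rough illustration of how such overrides are usually consumed, the sketch below reads the RAG_* variables from the process environment and falls back to defaults. The names `Settings`, `load_settings`, and both `ASSUMED_*` constants are hypothetical, the fallback values are placeholders rather than the project's real defaults (those live in .env.example), and loading the .env file itself is left to whatever mechanism the project uses.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Placeholder fallbacks for illustration only; check .env.example for the
# values the project actually ships with.
ASSUMED_QDRANT_URL = "http://localhost:6333"  # Qdrant's default HTTP port
ASSUMED_COLLECTION = "nq_passages"            # made-up collection name


@dataclass(frozen=True)
class Settings:
    qdrant_url: str
    qdrant_collection: str
    max_passages: Optional[int]  # None means "no cap on the corpus size"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read RAG_* variables, falling back to the assumed defaults."""
    env = os.environ if env is None else env
    raw_cap = env.get("RAG_MAX_PASSAGES", "").strip()
    return Settings(
        qdrant_url=env.get("RAG_QDRANT_URL", ASSUMED_QDRANT_URL),
        qdrant_collection=env.get("RAG_QDRANT_COLLECTION", ASSUMED_COLLECTION),
        max_passages=int(raw_cap) if raw_cap else None,
    )
```

Passing a plain dict instead of `os.environ` makes the override behavior easy to check without touching the real environment.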
docker compose up -d
uv run python -m src.scripts.doctor
To stop Qdrant without deleting stored vectors:
docker compose stop
Use the all-in-one command for a normal local run:
uv run python -m src.scripts.build_indexes
Or run the stages explicitly:
uv run python -m src.ingestion.ingest_raw
uv run python -m src.scripts.ingest_chunks
uv run python -m src.scripts.index_dense
uv run python -m src.scripts.index_sparse
For a smaller smoke-test corpus, set RAG_MAX_PASSAGES in .env before
building indexes.
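When running the stages explicitly, it can help to script them so that a failing stage stops the run. The sketch below is one way to do that and is not a description of what build_indexes does internally. The module names are copied from the commands above; `STAGES`, `stage_command`, and `run_stages` are hypothetical helper names.

```python
import subprocess
import sys
from typing import Callable, List, Sequence

# Module names copied from the explicit stage commands, in the order listed.
STAGES: List[str] = [
    "src.ingestion.ingest_raw",
    "src.scripts.ingest_chunks",
    "src.scripts.index_dense",
    "src.scripts.index_sparse",
]


def stage_command(module: str) -> List[str]:
    """Build the `uv run python -m <module>` argument list for one stage."""
    return ["uv", "run", "python", "-m", module]


def run_stages(
    stages: Sequence[str] = STAGES,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> int:
    """Run stages in order; return the first non-zero exit code, else 0."""
    for module in stages:
        print(f"==> {module}", flush=True)
        code = runner(stage_command(module))
        if code != 0:
            print(f"stage {module} failed with exit code {code}", file=sys.stderr)
            return code
    return 0
```

Calling `sys.exit(run_stages())` from the repo root would run all four stages. The `runner` parameter exists only so the ordering logic can be exercised without launching real processes.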
uv run uvicorn app.api.main:app --reload
Open:
- API docs: http://127.0.0.1:8000/docs
- Health: http://127.0.0.1:8000/api/health
- Config: http://127.0.0.1:8000/api/config
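For scripted checks, the health URL above can be polled until the server answers. This is a minimal sketch using only the standard library. It assumes the endpoint returns a 2xx status once the API is ready (the response body is not inspected), and `http_ok` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://127.0.0.1:8000/api/health"  # taken from the list above


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    attempts: int = 20,
    delay_s: float = 0.5,
) -> bool:
    """Call `probe` up to `attempts` times, sleeping between failures."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

A typical call would be `wait_until_healthy(lambda: http_ok(HEALTH_URL))` right after starting uvicorn in another terminal.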
Example retrieval request:
curl -s http://127.0.0.1:8000/api/retrieve \
-H "Content-Type: application/json" \
-d '{"query":"What is the capital of France?","mode":"hybrid","top_k":5}' \
| python -m json.tool
Example query request:
curl -s http://127.0.0.1:8000/api/query \
-H "Content-Type: application/json" \
-d '{"query":"What is the capital of France?","mode":"hybrid","top_k":5,"generate":true}' \
| python -m json.tool
The web UI can be run in two modes; pick one.
Dev (with hot reload, Vite proxy → FastAPI):
cd app/web
npm install # first time only
npm run dev
Then open http://localhost:5173. Vite proxies /api/* to FastAPI on
http://127.0.0.1:8000, so start the API in a separate terminal first.
Production (FastAPI serves built bundle):
docker compose up -d qdrant
(cd app/web && npm install && npm run build)
RAG_WEB_DIST_PATH=app/web/dist uv run uvicorn app.api.main:app
Then open http://127.0.0.1:8000. FastAPI serves the SPA at / (with
SPA fallback for arbitrary React Router routes) alongside the JSON
endpoints under /api/* — /api/health, /api/config, /api/retrieve,
/api/query, /api/scoreboard, /api/components, and
/api/eval_questions/{benchmark}. The (cd app/web && ...) subshell
keeps the outer shell at the repo root so uvicorn's Python import path
resolves app.api.main correctly and the RAG_WEB_DIST_PATH relative
path resolves to <repo>/app/web/dist.
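The SPA fallback described above can be pictured as a small path-resolution rule. The sketch below shows one common way to express it and is not the project's actual implementation: real files inside the built bundle are served as-is, /api/* is left to the JSON endpoints, and every other path gets index.html so React Router can handle it in the browser. `resolve_spa_path` is a hypothetical name. In a FastAPI app this logic would typically sit in a catch-all route that returns the chosen file.

```python
from pathlib import Path
from typing import Optional


def resolve_spa_path(dist_dir: Path, request_path: str) -> Optional[Path]:
    """Pick the file to serve for a GET, or None for /api/* paths.

    Existing files under dist_dir are served directly; anything else falls
    back to index.html so the client-side router can render the route.
    """
    if request_path == "/api" or request_path.startswith("/api/"):
        return None  # handled by the JSON endpoints, not the SPA
    root = dist_dir.resolve()
    index = root / "index.html"
    candidate = (root / request_path.lstrip("/")).resolve()
    # Refuse paths that escape dist_dir, e.g. "/../secret.txt".
    if candidate != root and root not in candidate.parents:
        return index
    if candidate.is_file():
        return candidate
    return index
```

With `dist_dir` set to app/web/dist, a request for /scoreboard resolves to index.html, while a request for a bundled asset resolves to that asset.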
Pages (more added per phase):
- / : Home / Ask
- /scoreboard : master sortable table reading GET /api/scoreboard
uv run python -m src.scripts.eval_retrieval \
--k-values 1,5,10,20 \
--modes dense,sparse,hybrid \
--max-queries 200 \
--output artifacts/retrieval_eval.json
uv run pytest
uv run ruff check .
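For the retrieval evaluation command above, the --k-values flag suggests cutoff-based metrics. Which metrics eval_retrieval actually reports is not stated here, so the sketch below is only an assumption: it computes hit rate at k, the fraction of queries with at least one relevant passage in the top k results. `hit_at_k` and `hit_rate_at_ks` are hypothetical names.

```python
from typing import Dict, Iterable, Sequence, Set, Tuple

# One evaluated query: (ranked passage ids, set of relevant passage ids).
Run = Tuple[Sequence[str], Set[str]]


def hit_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int) -> bool:
    """True if any relevant passage appears among the top-k results."""
    return any(doc_id in relevant_ids for doc_id in ranked_ids[:k])


def hit_rate_at_ks(
    runs: Iterable[Run],
    k_values: Sequence[int] = (1, 5, 10, 20),
) -> Dict[int, float]:
    """Fraction of queries with at least one hit in the top k, per k."""
    runs = list(runs)
    if not runs:
        return {k: 0.0 for k in k_values}
    return {
        k: sum(hit_at_k(ranked, relevant, k) for ranked, relevant in runs) / len(runs)
        for k in k_values
    }
```

Running the same function over the dense, sparse, and hybrid result lists gives directly comparable numbers per mode.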