
Document Decision Engine

Local-first document RAG + decision API: ingest PDF (digital + OCR for scans), DOCX, legacy DOC (antiword), and TXT; chunk + embed into PostgreSQL (pgvector); ask with citations or run a structured decision with a small rule engine—all via FastAPI. Ollama provides Mistral and nomic-embed-text in Docker—no OpenAI, Azure, AWS, or Pinecone in this design. docker compose runs Uvicorn with --reload for a smooth dev loop.


Portfolio snapshot

| What you can show | How this repo supports it |
| --- | --- |
| RAG (retrieve → ground answer → citations) | POST /api/documents/* + POST /api/decisions/ask |
| Guardrails beyond “the model said so” | POST /api/decisions/decide + app/rules/rule_engine.py |
| Local / air-gapped story | Ollama + Postgres in Docker; no cloud model keys |
| Reproducible for reviewers | docker compose, scripts/trial_run.*, GitHub Actions CI |

“Hosting” here means publishing this repository on GitHub and letting others clone + run with Docker on their machine. This project does not ship a production cloud deploy (Kubernetes, serverless, etc.); that is an intentional scope boundary for a portfolio piece.

Monorepo: If this project lives under a parent folder (e.g. ml-demos/doc-decision-engine), either open this folder as the Git root of a dedicated repo, or change GitHub Actions defaults.run.working-directory / paths so CI runs from the correct subdirectory. The included workflow assumes the repository root is the doc-decision-engine directory.

What it does

  • Upload → parse (Tesseract OCR when text is thin; .doc via antiword) → word chunks → embed (nomic-embed-text) → store in pgvector.
  • POST /api/decisions/ask — embed the question, run a similarity search, prompt Mistral with the retrieved excerpts, return citations (a sketch follows this list).
  • POST /api/decisions/decide — same retrieval + JSON-shaped LLM output + deterministic rules (confidence, keywords, sections, chunk count).
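
For intuition, here is a minimal sketch of that ask flow in Python, assuming Ollama's /api/embed and /api/chat endpoints (the ones the app calls via httpx) and a hypothetical chunks(content, embedding) table in pgvector; the repo's real schema, prompts, and citation logic live in app/services/, app/prompts.py, and app/rag_utils.py:

# Minimal sketch of the ask flow; illustrative, not the repo's code.
# Assumes a hypothetical chunks(content text, embedding vector) table.
import httpx
import psycopg2

OLLAMA = "http://localhost:11434"

def embed(text: str) -> list[float]:
    r = httpx.post(f"{OLLAMA}/api/embed",
                   json={"model": "nomic-embed-text", "input": text})
    r.raise_for_status()
    return r.json()["embeddings"][0]

def top_chunks(conn, question: str, k: int = 5) -> list[str]:
    vec = embed(question)
    literal = "[" + ",".join(map(str, vec)) + "]"  # pgvector text format
    with conn.cursor() as cur:
        # <=> is pgvector's cosine-distance operator
        cur.execute("SELECT content FROM chunks "
                    "ORDER BY embedding <=> %s::vector LIMIT %s",
                    (literal, k))
        return [row[0] for row in cur.fetchall()]

def ask(conn, question: str) -> str:
    excerpts = "\n---\n".join(top_chunks(conn, question))
    prompt = (f"Answer using only these excerpts:\n{excerpts}\n\n"
              f"Question: {question}")
    r = httpx.post(f"{OLLAMA}/api/chat", timeout=120.0,
                   json={"model": "mistral", "stream": False,
                         "messages": [{"role": "user", "content": prompt}]})
    r.raise_for_status()
    return r.json()["message"]["content"]

if __name__ == "__main__":
    # Illustrative DSN; the app reads the real one from DATABASE_URL
    conn = psycopg2.connect("postgresql://postgres:postgres@localhost:5433/postgres")
    print(ask(conn, "Does the policy cover signatures?"))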

Tech stack

| Layer | Technology | Notes |
| --- | --- | --- |
| Runtime | Python 3.11 | See Dockerfile |
| API | FastAPI 0.104.x, Uvicorn 0.24.x | OpenAPI at /docs, /redoc |
| Validation / settings | Pydantic v2, pydantic-settings | Env-driven app/config.py |
| Database | PostgreSQL 16 + pgvector, SQLAlchemy 2.0.x, psycopg2 | Vectors in Postgres |
| HTTP client | httpx | Ollama /api/embed, /api/chat |
| Embeddings & LLM | Ollama | Default models: nomic-embed-text, mistral |
| PDF | pdfplumber, pdf2image, pytesseract, Pillow, poppler | OCR optional via env |
| Word | python-docx, antiword (.doc) | antiword in Docker image |
| Tests | pytest 7.4.x | Against real Postgres in CI (see tests/) |

Pinned versions: requirements.txt.

Architecture

System context

flowchart LR
  subgraph Client
    U[HTTP client / browser / scripts]
  end
  subgraph Docker["Docker Compose"]
    A[FastAPI app]
    P[(PostgreSQL + pgvector)]
    O[Ollama]
  end
  U -->|REST| A
  A -->|SQL + vectors| P
  A -->|embed + chat| O

Ingest pipeline (conceptual)

flowchart LR
  UP[Upload API] --> PARSE[Parser PDF/DOCX/DOC/TXT]
  PARSE --> CH[Chunk words]
  CH --> EMB[Embed via Ollama]
  EMB --> VS[(pgvector + rows)]

Ask vs decide

flowchart TB
  Q[Question + doc IDs] --> RET[Top-k similarity]
  RET --> CTX[Build prompt + excerpts]
  CTX --> LLM[Ollama chat]
  LLM --> ASK[Answer + citations]
  CTX --> DEC[Structured JSON + rule engine]
  DEC --> OUT[Decision + flags]

ASCII overview (portable)

+-----------+     +----------+     +-----------+
|  Upload   |---->|  Parse   |---->|  Chunk &  |
|  (API)    |     | PDF/DOCX |     |   Embed   |
+-----------+     +----------+     +-----+-----+
                                         |
                                         v
+-----------+     +----------+     +-----------+
| Decision  |<----|   LLM    |<----|  Vector   |
| Response  |     | (Ollama) |     |  Search   |
| + Rules   |     |          |     | pgvector  |
+-----------+     +----------+     +-----------+

Docker hardening (high level): API container runs as non-root (uid 1000), no-new-privileges, cap_drop: ALL, pids_limit, mem_limit; Postgres/Ollama have no-new-privileges and memory caps. Details: SECURITY.md.

Repository layout

| Path | Purpose |
| --- | --- |
| app/main.py | FastAPI app, lifespan, /health |
| app/config.py | Pydantic settings (env) |
| app/api/documents.py | Upload, process, list, get, delete documents |
| app/api/decisions.py | /ask, /decide, history |
| app/db/ | SQLAlchemy engine, models, init |
| app/services/ | Parser, embeddings, LLM, vector store, decision orchestration |
| app/rules/rule_engine.py | Deterministic checks merged with LLM output |
| app/prompts.py, app/rag_utils.py | Shared prompts and citation helpers |
| app/models/schemas.py | Request/response models |
| tests/ | Pytest (Postgres-backed) |
| sample_docs/ | Demo files (e.g. demo_policy.txt) |
| uploads/ | Runtime-only upload storage (empty in git except .gitignore / .gitkeep) |
| scripts/trial_run.* | End-to-end smoke (health → upload → process → RAG) |
| scripts/smoke_verify.* | Lightweight API checks |
| docker-compose.yml | App + includes infra |
| docker-compose.infra.yml | Postgres, Ollama, network, volumes |
| .github/workflows/ci.yml | Build image, start db, pytest |

HTTP API (summary)

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Liveness + DB connectivity |
| POST | /api/documents/upload | Multipart file upload |
| POST | /api/documents/process/{id} | Parse, chunk, embed, index |
| GET | /api/documents/ | List documents |
| GET | /api/documents/{id} | Get metadata |
| DELETE | /api/documents/{id} | Delete document + vectors |
| POST | /api/decisions/ask | RAG question + citations |
| POST | /api/decisions/decide | Structured decision + rules |
| GET | /api/decisions/history | Decision history (see handler for shape) |

Full schemas: http://localhost:8000/docs when the stack is running.
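
For readers who prefer code to Swagger, a hypothetical end-to-end client in Python (httpx); the /ask request fields ("question", "document_ids") and the "id" field in the upload response are assumptions inferred from the examples in this README, not confirmed schemas:

# Hypothetical httpx walkthrough of the endpoints above; the /ask body
# and the upload response's "id" field are assumed, not confirmed.
import httpx

with httpx.Client(base_url="http://localhost:8000", timeout=120.0) as api:
    # 1. Upload a sample document (multipart)
    with open("sample_docs/demo_policy.txt", "rb") as f:
        up = api.post("/api/documents/upload",
                      files={"file": f},
                      data={"document_type": "policy"})
    up.raise_for_status()
    doc_id = up.json()["id"]  # assumed response field

    # 2. Parse, chunk, embed, index
    api.post(f"/api/documents/process/{doc_id}").raise_for_status()

    # 3. RAG question with citations
    ans = api.post("/api/decisions/ask",
                   json={"question": "What does the policy require?",
                         "document_ids": [doc_id]})
    print(ans.json())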

Prerequisites & quick start

Use the doc-decision-engine directory (where docker-compose.yml lives).

docker compose up -d --build
docker compose exec ollama ollama pull mistral
docker compose exec ollama ollama pull nomic-embed-text

Health:

curl http://localhost:8000/health

PowerShell: curl.exe http://localhost:8000/health or Invoke-RestMethod http://localhost:8000/health

Smoke checks (bash / PowerShell):

chmod +x scripts/smoke_verify.sh && ./scripts/smoke_verify.sh
powershell -ExecutionPolicy Bypass -File scripts/smoke_verify.ps1

Postgres + Ollama only

Infra is split into docker-compose.infra.yml (doc-engine-postgres, doc-engine-ollama, network doc-engine-net, named volumes). The main docker-compose.yml includes that file.

docker compose -f docker-compose.infra.yml up -d
docker compose -f docker-compose.infra.yml exec ollama ollama pull mistral
docker compose -f docker-compose.infra.yml exec ollama ollama pull nomic-embed-text

Automated trial

After models are pulled:

chmod +x scripts/trial_run.sh && ./scripts/trial_run.sh
powershell -ExecutionPolicy Bypass -File scripts/trial_run.ps1

Optional: set ENGINE_URL=http://host:port to point the scripts at a non-default API base URL.

Manual curl demo

curl -s -X POST http://localhost:8000/api/documents/upload \
  -F "file=@sample_docs/demo_policy.txt" \
  -F "document_type=policy"
# Then process/{id} and /api/decisions/ask — see Swagger for schemas.

Structured decision example:

curl -s -X POST http://localhost:8000/api/decisions/decide \
  -H "Content-Type: application/json" \
  -d '{
    "document_ids": [1],
    "decision_type": "compliance",
    "question": "Does this policy include signatures and liability language?",
    "rules": {"required_sections": ["signature", "liability", "date"]}
  }'
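
The deterministic layer is the point of /decide: the LLM's JSON output is merged with rule checks before a decision is returned. A toy sketch of that merge (field names and thresholds are illustrative; the real logic lives in app/rules/rule_engine.py):

# Toy version of the deterministic checks layered over the LLM's JSON
# output; real field names and thresholds are in app/rules/rule_engine.py.
def apply_rules(llm_out: dict, chunks: list[str], rules: dict) -> dict:
    flags = []

    # Confidence floor: distrust low-confidence LLM verdicts
    if llm_out.get("confidence", 0.0) < rules.get("min_confidence", 0.5):
        flags.append("low_confidence")

    # Required sections must appear somewhere in the retrieved text
    text = " ".join(chunks).lower()
    missing = [s for s in rules.get("required_sections", [])
               if s.lower() not in text]
    if missing:
        flags.append("missing_sections:" + ",".join(missing))

    # Too few retrieved chunks to support a decision
    if len(chunks) < rules.get("min_chunks", 1):
        flags.append("insufficient_context")

    decision = llm_out.get("decision", "undetermined")
    return {"decision": "needs_review" if flags else decision,
            "flags": flags}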

Configuration

Copy .env.example to .env and adjust. Compose injects env in docker-compose.yml; the app reads variables via app/config.py (e.g. DATABASE_URL, OLLAMA_BASE_URL, EMBEDDING_MODEL, LLM_MODEL, MAX_UPLOAD_BYTES, chunking and PDF OCR settings, ANTIWORD_BIN).
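
app/config.py follows the standard pydantic-settings pattern; a minimal sketch (defaults shown here are placeholders, the repo defines the real ones):

# Minimal pydantic-settings pattern as in app/config.py; the defaults
# below are placeholders, not the repo's actual values.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    database_url: str = "postgresql://user:pass@localhost:5433/docs"  # placeholder
    ollama_base_url: str = "http://localhost:11434"
    embedding_model: str = "nomic-embed-text"
    llm_model: str = "mistral"
    max_upload_bytes: int = 10 * 1024 * 1024  # placeholder bound

settings = Settings()  # each field is overridable via its uppercase env var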

Never commit .env with real secrets (gitignored).

Services & ports

| Service | URL / port |
| --- | --- |
| API | http://localhost:8000 |
| Postgres | host 5433 → container 5432 |
| Ollama | http://localhost:11434 |

Tests & CI

Locally (tests need Postgres; Ollama not required):

docker compose up -d db
docker compose run --rm --no-deps app pytest -q

With full stack: docker compose exec app pytest -q

GitHub Actions (.github/workflows/ci.yml): builds the app image, starts db, runs pytest via docker compose run --rm --no-deps app (does not pull Ollama).

Troubleshooting

| Symptom | What to check |
| --- | --- |
| /health returns "database": false | docker compose ps; wait for db to report healthy |
| 503 + Ollama message | docker compose logs ollama; ollama pull mistral / nomic-embed-text; ollama list |
| Empty PDF text | PDF_OCR_ENABLED (default true); Tesseract/poppler in image; PDF_OCR_MAX_PAGES |
| .doc fails on host | Use the Docker image (antiword included) |
| chunk_overlap error | chunk_size must exceed chunk_overlap (words); see the sketch below |
| Upload too large | MAX_UPLOAD_BYTES (see app/config.py bounds) |
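
The chunk_overlap constraint exists because the chunker advances by chunk_size − chunk_overlap words per step, and a non-positive stride would never terminate. An illustrative word chunker (not the repo's exact implementation):

# Illustrative word chunking with overlap; the stride is
# chunk_size - chunk_overlap, hence chunk_size must exceed chunk_overlap.
def chunk_words(text: str, chunk_size: int = 200,
                chunk_overlap: int = 50) -> list[str]:
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_size must be greater than chunk_overlap")
    words = text.split()
    stride = chunk_size - chunk_overlap
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), stride)]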

Publishing to GitHub

What this repo already includes for a public portfolio repo:

| Item | Location |
| --- | --- |
| License | LICENSE (MIT) |
| Security notes | SECURITY.md |
| Contributing | CONTRIBUTING.md |
| CI | .github/workflows/ci.yml |
| Env template | .env.example (no secrets) |
| Git ignore | .gitignore — excludes .env, uploads/* (keeps uploads/.gitkeep), caches, venvs |

Before git push: confirm .env is not staged; remove any accidental uploads/ artifacts (runtime files should stay ignored). Initialize git inside doc-decision-engine if this folder is the repo root, or configure CI paths if the repo root is higher up.

Security

See SECURITY.md for container constraints and guidance before exposing the API beyond localhost (TLS, auth, rotating DB credentials, avoiding --reload in production).

Roadmap & future implementations

Current boundaries

  • PDF_OCR_MAX_PAGES caps very large OCR jobs.
  • OCR quality depends on scans; add tesseract-ocr-<lang> in the Dockerfile for non-English.
  • Single Ollama node; first inference can be slow (model load).
  • Citation quality depends on chunking and retrieval; ambiguous questions may miss the best span.
  • Tests target Postgres (no SQLite + pgvector shortcut).

Possible next steps (not implemented; portfolio extensions)

  • Schema migrations (e.g. Alembic) instead of ad-hoc init_db.
  • Authentication / tenancy (API keys, OAuth2 proxy) for shared deployments.
  • Async job queue for long ingest or batch embedding; progress via webhooks or SSE.
  • Hybrid retrieval (BM25 + vector) and optional re-ranking.
  • Observability: structured logging, metrics, tracing (OpenTelemetry).
  • CI: contract tests with mocked Ollama HTTP for faster feedback without dropping Docker build smoke tests.
  • Smaller models documented for CPU-only or low-RAM laptops.
  • Rate limiting at reverse proxy; WAF if internet-facing.

Contributing & license

See CONTRIBUTING.md for contribution guidelines. The project is MIT-licensed (see LICENSE).
