fedacq‑rag‑chatbot

Federal Acquisition Regulation Retrieval‑Augmented Generation (RAG) Chatbot

A production‑ready Retrieval‑Augmented Generation (RAG) system that provides fast, accurate, citation‑backed answers to questions about the Federal Acquisition Regulation (FAR) and Defense Federal Acquisition Regulation Supplement (DFARS). Designed for federal contractors, acquisition professionals, and businesses navigating the federal market.

Background

Federal contracting regulations are complex, distributed across thousands of pages of FAR/DFARS text, and updated frequently. Professionals need fast, reliable, context‑aware answers to support:

Capture strategy
Proposal development
Compliance reviews
Contract administration
Market entry decisions

This project automates that research using a modern RAG pipeline.

User Story

As a federal contractor, federal employee, or business entering the federal market, I need quick, accurate answers to questions about the regulatory landscape so that I can make informed business strategy decisions.

Acceptance Criteria

Natural‑language question interface
Retrieval of relevant FAR/DFARS sections
Accurate, citation‑backed responses
Up‑to‑date regulatory text
Reproducible end‑to‑end pipeline
Deployable locally or via Docker

Technical Approach

Data Source

FAR and DFARS pulled from official .dita XML repositories
Parsed into structured documents
Metadata normalized for retrieval
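
The parsing step above can be sketched with the standard library alone. This is a minimal illustration, not the project's parser_dita.py: the sample topic, its placeholder text, the parse_topic helper, and the metadata field names are all assumptions, and real FAR/DFARS topics may use different elements and ids.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample topic with placeholder text (not actual regulatory text).
SAMPLE_DITA = """<topic id="FAR_15_404_1">
  <title>15.404-1 Example section title.</title>
  <body>
    <p>Placeholder paragraph, not
       actual regulatory text.</p>
    <p>Second placeholder paragraph.</p>
  </body>
</topic>"""


def parse_topic(xml_text: str, regulation: str) -> dict:
    """Turn one DITA topic into a flat record with text and metadata."""
    root = ET.fromstring(xml_text)
    title = " ".join((root.findtext("title") or "").split())
    paragraphs = [
        " ".join("".join(p.itertext()).split())  # collapse internal whitespace
        for p in root.iter("p")
    ]
    return {
        "text": "\n".join(paragraphs),
        "metadata": {
            "regulation": regulation.upper(),
            "topic_id": root.get("id", ""),
            "title": title,
            # Assumed convention: the section number is the first token of the title.
            "section": title.split(" ", 1)[0] if title else "",
        },
    }


doc = parse_topic(SAMPLE_DITA, "far")
print(doc["metadata"])
```

Keeping the regulation name and section number in metadata is what later allows an answer to point back to a specific FAR or DFARS section.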

Embeddings + Vector Store

HuggingFace Embeddings (BGE‑small)
ChromaDB persistent vector store
Chunking via LlamaIndex SentenceSplitter
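
To make the chunking idea concrete, here is a simplified stand-in written with the standard library. It is not LlamaIndex's SentenceSplitter algorithm: it counts words rather than tokens, and the chunk_sentences name, max_words, and overlap_sentences parameters are hypothetical. It only illustrates the general approach of packing whole sentences into chunks that overlap slightly.

```python
import re


def chunk_sentences(text: str, max_words: int = 40, overlap_sentences: int = 1) -> list[str]:
    """Greedy sentence packing: fill a chunk up to max_words, then start a
    new chunk that repeats the last `overlap_sentences` sentences."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        candidate = current + [sentence]
        if current and sum(len(s.split()) for s in candidate) > max_words:
            chunks.append(" ".join(current))
            # Carry the tail of the previous chunk forward as overlap.
            current = current[-overlap_sentences:] if overlap_sentences else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


demo = "One two three. Four five six. Seven eight nine."
for chunk in chunk_sentences(demo, max_words=6):
    print(repr(chunk))
```

The overlap means a sentence near a chunk boundary appears in both neighbours, so a retrieved chunk is less likely to lose the context of the sentence before it.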

Retrieval‑Augmented Generation

LlamaIndex orchestration
Custom ONNX Runtime GenAI LLM (Phi‑4‑mini‑instruct‑onnx)
ChromaDB for retrieval
BGE-small embeddings
Query engine configured with top‑k similarity search
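
Top-k similarity search can be illustrated without any vector database. The sketch below ranks toy vectors by cosine similarity; the chunk ids and the three-dimensional vectors are invented for illustration, and ChromaDB's actual index and distance settings may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, indexed, k=2):
    """Return the k (score, chunk_id) pairs most similar to the query."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in indexed.items()]
    return sorted(scored, reverse=True)[:k]


# Toy 3-dimensional vectors; real embeddings have hundreds of dimensions.
INDEX = {
    "far-15.404-1": [0.9, 0.1, 0.0],
    "far-52.212-4": [0.1, 0.9, 0.0],
    "dfars-215.404": [0.7, 0.3, 0.1],
}

results = top_k([1.0, 0.0, 0.0], INDEX, k=2)
for score, chunk_id in results:
    print(f"{chunk_id}: {score:.3f}")
```

A larger k gives the LLM more context to work with at the cost of a longer prompt, which matters for CPU-only inference.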

Why ONNX Runtime GenAI?

Fast inference on CPU (no GPU required)
No PyTorch dependency
No external API calls (fully local)
Smaller memory footprint
Production-ready kernels optimized by Microsoft
Works well with smaller models like Phi-4-mini-instruct-onnx

Application Layer

Flask application (src.app)
Served as an ASGI app via Hypercorn
/chat_stream endpoint with token streaming (Server‑Sent Events)
Lightweight HTML/JS/CSS UI served from src/app/static/
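
Token streaming over Server-Sent Events comes down to framing each token as a `data:` line followed by a blank line. The generator below is a minimal sketch of that framing; the sse_events name, the JSON payload shape, and the closing "done" event are assumptions rather than the project's actual wire format.

```python
import json
from typing import Iterable, Iterator


def sse_events(tokens: Iterable[str]) -> Iterator[str]:
    """Wrap each token in a Server-Sent Events frame.

    Each frame is a `data:` line terminated by a blank line.
    """
    for token in tokens:
        # JSON-encode so a newline inside a token cannot break the framing.
        payload = json.dumps({"token": token})
        yield f"data: {payload}\n\n"
    # Hypothetical end-of-stream marker so the client knows when to stop.
    yield "event: done\ndata: {}\n\n"


for frame in sse_events(["Price ", "analysis", "\n"]):
    print(repr(frame))
```

In a Flask view, a generator like this could be returned as `Response(sse_events(tokens), mimetype="text/event-stream")` so the browser receives tokens as they are produced.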

Deployment

Local Python environment
Docker container (Hypercorn ASGI server)
GitHub Actions CI pipeline (optional)

Architecture Overview

User → Flask (ASGI via Hypercorn) → Query Engine → LlamaIndex → ChromaDB → FAR/DFARS DITA Source

Pipeline:

Clone FAR/DFARS repos
Parse .dita XML
Chunk + embed
Store in ChromaDB
Serve via Flask API (ASGI)
LLM generates answers with citations
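
The last pipeline step depends on how retrieved chunks are presented to the model. One common approach, sketched here as an assumption rather than the project's actual prompt, is to number each chunk so the model can cite it. The build_prompt helper, the instruction wording, and the chunk field names are hypothetical.

```python
def build_prompt(question: str, retrieved: list[dict]) -> str:
    """Number each retrieved chunk so the model can cite it as [1], [2], ..."""
    context_lines = [
        f"[{i}] ({chunk['citation']}) {chunk['text']}"
        for i, chunk in enumerate(retrieved, start=1)
    ]
    return (
        "Answer using only the numbered sources below. "
        "Cite sources like [1].\n\n"
        + "\n".join(context_lines)
        + f"\n\nQuestion: {question}\nAnswer:"
    )


# Placeholder chunk text, not actual regulatory language.
chunks = [
    {"citation": "FAR 15.404-1", "text": "Placeholder text for the first chunk."},
    {"citation": "DFARS 215.404", "text": "Placeholder text for the second chunk."},
]
print(build_prompt("What does FAR 15.404 say about price analysis?", chunks))
```

Because the citation label travels with each chunk, the application can map a marker such as [1] in the answer back to the section it came from.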

Repository Structure

fedacq-rag-chatbot/
│
├── src/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── api.py
│   │   ├── config.py
│   │   ├── asgi.py
│   │   └── static/
│   │       ├── index.html
│   │       ├── app.js
│   │       └── styles.css
│   │
│   ├── rag/
│   │   ├── __init__.py
│   │   │
│   │   ├── indexing/
│   │   │   ├── __init__.py
│   │   │   ├── builder.py
│   │   │   └── loader.py
│   │   │
│   │   ├── llm/
│   │   │   ├── __init__.py
│   │   │   └── models.py
│   │   │
│   │   └── retrieval/
│   │       ├── __init__.py
│   │       ├── metadata.py
│   │       ├── parser_dita.py
│   │       └── query_engine.py
│   │
│   └── scripts/
│       └── build_index.py
│
├── data/
│   ├── chroma/      # Persistent ChromaDB index (Git LFS)
│   └── regs/        # FAR/DFARS cloned repositories (Git LFS)
│
├── tests/
│   ├── test_indexing.py
│   ├── test_llm.py
│   ├── test_metadata.py
│   ├── test_parser.py
│   └── test_query_engine.py
│
├── docker/
│   ├── docker-compose.yml
│   └── local.env
│
├── .dockerignore
├── .gitattributes
├── .gitignore
├── Dockerfile
├── Makefile
├── pyproject.toml
├── pytest.ini
├── requirements.txt
└── requirements.lock

Setup & Installation

1. Clone the repository

git clone https://github.com/PWDevens/fedacq-rag-chatbot.git
cd fedacq-rag-chatbot

2. Install Git LFS (required for Chroma index)

git lfs install
git lfs pull

3. Create & activate environment

macOS/Linux:

python -m venv .venv
source .venv/bin/activate

Windows (PowerShell):

python -m venv .venv
.\.venv\Scripts\Activate.ps1

4. Install project + dependencies

pip install -e .
pip install -r requirements.txt

Then download the ONNX model files from Hugging Face:

huggingface-cli download microsoft/Phi-4-mini-instruct-onnx \
  --include "cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/*" \
  --local-dir .

This creates cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/ in the repository root, containing the ONNX model files.

5. Verify the prebuilt RAG index

Ensure these folders contain data:

data/chroma/
data/regs/

If empty:

git lfs pull
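
A folder can look populated while still holding only Git LFS pointer stubs. The sketch below counts files and flags pointers by checking for the standard pointer header; the check_data_dir and is_lfs_pointer helpers are hypothetical and are not part of the repository.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """True if the file is an LFS pointer stub rather than real content."""
    with path.open("rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def check_data_dir(directory: Path) -> dict:
    """Count files under a directory and how many are still LFS pointers."""
    files = [p for p in directory.rglob("*") if p.is_file()] if directory.is_dir() else []
    pointers = [p for p in files if is_lfs_pointer(p)]
    return {"files": len(files), "lfs_pointers": len(pointers)}


if __name__ == "__main__":
    for name in ("data/chroma", "data/regs"):
        print(name, check_data_dir(Path(name)))
```

A result with zero files, or with any remaining pointers, suggests that git lfs pull still needs to be run.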

Building the RAG Index (Local Only)

The RAG index is not built in CI due to the size of FAR/DFARS and the cost of embedding.

To rebuild locally:

First, edit .gitignore and comment out the 'data/chroma/chroma.sqlite3' entry so that the rebuilt index can be committed. Then run:

python -m scripts.build_index

This will:

Clone FAR + DFARS into data/regs/
Parse .dita XML files
Chunk and embed text
Write a new ChromaDB index into data/chroma/

After rebuilding, commit the updated index using Git LFS:

git add data/chroma
git commit -m "Rebuild RAG index"
git push

Running the Application

Flask Development Server (local)

python -m flask --app src.app run --host=0.0.0.0 --port=7860

Hypercorn (ASGI, local production-like)

hypercorn --bind 0.0.0.0:7860 src.app.asgi:app

Docker

docker build -t fedacq-rag-chatbot .
docker run -d -p 7860:7860 --name ragbot fedacq-rag-chatbot

The container uses Hypercorn to serve the ASGI‑wrapped Flask app.

Testing

pytest -q

CI/CD Workflow (GitHub Actions)

The CI pipeline performs:

LFS checkout (downloads prebuilt index)
Python environment setup
Dependency installation
Test execution
Docker image build
Docker artifact upload

The CI pipeline does not rebuild the RAG index.
Index building is performed locally and versioned via Git LFS.

Usage Example

Send a POST request:

curl -X POST http://localhost:7860/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What does FAR 15.404 say about price analysis?"}'

Expected response:

Summary of the regulation
Citations
Retrieved sections
LLM‑generated explanation
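
The same request can be made from Python using only the standard library. This is a sketch under assumptions: the default base URL uses port 7860 from the run commands, the /chat path and JSON body mirror the request shown here, and the build_chat_request and ask helpers are hypothetical names.

```python
import json
import urllib.request


def build_chat_request(message: str, base_url: str = "http://localhost:7860") -> urllib.request.Request:
    """Build a JSON POST request for the chat endpoint."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(message: str, base_url: str = "http://localhost:7860", timeout: float = 120.0) -> str:
    """Send the question and return the raw response body as text."""
    req = build_chat_request(message, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = build_chat_request("What does FAR 15.404 say about price analysis?")
print(req.get_method(), req.full_url)
```

With the server running, `print(ask("What does FAR 15.404 say about price analysis?"))` would send the question. The generous timeout allows for CPU-only inference.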
