Testing Guide for Executive Orders RAG Chatbot

This guide provides step-by-step instructions for testing the RAG chatbot locally without requiring Azure resources.

End-to-End Testing Workflow

Follow these steps to test the entire RAG pipeline locally:

1. Set Up Sample Documents

Create a test directory and add some sample executive order documents:

mkdir -p test_docs

You can use executive orders from:

Download a few PDFs or text files of executive orders and save them to the test_docs directory.

2. Process Documents

Process the documents to extract their content and split it into chunks:

python scripts/ingest.py --input test_docs --output data/processed_chunks.json

Expected output:

Log messages showing document loading and processing
A processed_chunks.json file in the data directory
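
Before moving on, it can help to sanity-check the output file. The sketch below assumes that processed_chunks.json holds a JSON array with one element per chunk. The schema written by scripts/ingest.py is not described in this guide, so that assumption and the helper name summarize_chunks are illustrative only.

```python
import json
from pathlib import Path


def summarize_chunks(path):
    """Load a chunks file and return how many entries it holds.

    Assumption: the file is a JSON array with one element per chunk.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, list):
        raise ValueError(f"expected a JSON array in {path}, got {type(data).__name__}")
    if not data:
        raise ValueError(f"{path} contains no chunks")
    return len(data)
```

For example, print(summarize_chunks("data/processed_chunks.json")) reports the chunk count, or raises if the file is empty or not an array.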

3. Generate Embeddings

Generate vector embeddings for the document chunks:

python scripts/embed.py --input data/processed_chunks.json --output data/embedded_chunks.json

Expected output:

Log messages showing embedding generation
An embedded_chunks.json file containing document chunks with embeddings
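
A quick way to catch a partially failed embedding run is to confirm that every chunk carries a vector and that all vectors share one dimension. The following is a minimal sketch, assuming the file is a JSON array of objects and that each vector is stored under an "embedding" key. Both the key name and the helper names are assumptions, so adjust them to whatever scripts/embed.py actually writes.

```python
import json
from pathlib import Path


def embedding_dimensions(chunks, key="embedding"):
    """Return the set of vector lengths found under `key` across all chunks."""
    dims = set()
    for index, chunk in enumerate(chunks):
        vector = chunk.get(key)
        if not vector:
            raise ValueError(f"chunk {index} has no '{key}' field")
        dims.add(len(vector))
    return dims


def check_embeddings_file(path, key="embedding"):
    """Return the single embedding dimension used in the file, or raise."""
    chunks = json.loads(Path(path).read_text(encoding="utf-8"))
    dims = embedding_dimensions(chunks, key)
    if len(dims) != 1:
        raise ValueError(f"inconsistent embedding dimensions: {sorted(dims)}")
    return dims.pop()
```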

4. Create Vector Store Index

Create a vector store index from the embedded chunks:

python scripts/create_index.py --input data/embedded_chunks.json --output data/vector_store.json

Expected output:

A vector_store.json file containing the vector store index
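
Steps 2 to 4 can also be chained from a single Python script so that a failure in one step stops the run. This is a sketch only: the script paths and flags are copied from the commands above, while the function names pipeline_commands and run_pipeline are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def pipeline_commands(docs_dir="test_docs", data_dir="data"):
    """Build the argument lists for the ingest, embed, and index steps."""
    data = Path(data_dir)
    chunks = str(data / "processed_chunks.json")
    embedded = str(data / "embedded_chunks.json")
    index = str(data / "vector_store.json")
    py = sys.executable  # use the same interpreter that runs this script
    return [
        [py, "scripts/ingest.py", "--input", docs_dir, "--output", chunks],
        [py, "scripts/embed.py", "--input", chunks, "--output", embedded],
        [py, "scripts/create_index.py", "--input", embedded, "--output", index],
    ]


def run_pipeline(docs_dir="test_docs", data_dir="data"):
    """Run the three steps in order, stopping at the first failure."""
    Path(data_dir).mkdir(parents=True, exist_ok=True)
    for command in pipeline_commands(docs_dir, data_dir):
        # check=True raises CalledProcessError if a step exits non-zero
        subprocess.run(command, check=True)
```

Calling run_pipeline() from the repository root reproduces the three commands with their default paths.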

5. Test Vector Search

Test the vector search functionality:

python scripts/search.py --index data/vector_store.json --query "climate change initiatives"

Expected output:

A list of relevant document chunks related to climate change
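
To judge whether the returned chunks are plausible, it helps to know how similarity ranking works in general. The sketch below ranks chunks by cosine similarity between a query vector and each chunk vector. Whether scripts/search.py uses cosine similarity is an assumption, and the "embedding" and "text" keys are placeholder names, not the project's confirmed schema.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vector, chunks, k=3):
    """Return the k highest-scoring (score, text) pairs, best first."""
    scored = [
        (cosine_similarity(query_vector, chunk["embedding"]), chunk["text"])
        for chunk in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A chunk whose vector points in the same direction as the query scores close to 1, and an unrelated (orthogonal) chunk scores close to 0.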

6. Test RAG CLI

Test the interactive RAG command-line interface:

python scripts/rag_cli.py --index data/vector_store.json

Expected output:

An interactive CLI where you can enter questions
Retrieved documents related to your questions

7. Test Web Interface

Test the Streamlit web interface:

streamlit run app.py

Expected output:

A web interface running at http://localhost:8501
Ability to load the vector store and search for information

To access the Admin Dashboard:

Run the script: streamlit run scripts/run_admin.py
Enter the admin password when prompted
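Open the dashboard at the URL Streamlit prints on startup. This is http://localhost:8501 by default; if the main app is still running on that port, Streamlit picks the next free port (typically 8502).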
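
If the browser cannot reach the app, a small check can tell whether anything is listening on the expected port at all. This is a generic sketch using only the standard library; port_is_open is a hypothetical helper, and 8501 is Streamlit's default port.

```python
import socket


def port_is_open(host="localhost", port=8501, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors
        return False
```

For example, port_is_open(port=8502) checks the port a second Streamlit process would usually take.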

Testing Specific Components

Document Processing

Test just the document processor component:

python -c "from src.document_processor import DocumentProcessor; processor = DocumentProcessor(); docs = processor.load_document('test_docs/example.pdf'); print(f'Loaded {len(docs)} document parts')"

Embeddings Generation

Test just the embeddings generator component:

python -c "from src.embeddings import EmbeddingsGenerator; generator = EmbeddingsGenerator(); emb = generator.generate_embeddings(['This is a test document']); print(f'Generated embedding with dimension {len(emb[0])}')"

Vector Store

Test just the vector store component:

python -c "from src.vector_store import LocalVectorStore; store = LocalVectorStore(); store.load('data/vector_store.json'); print(f'Loaded {len(store.documents)} documents')"

Troubleshooting

Common Issues

Issue: Missing dependencies
Solution: Run pip install -r requirements.txt to install all required packages

Issue: File not found errors
Solution: Make sure you've created the necessary directories (e.g., data/) and check file paths

Issue: Embedding model download issues
Solution: Check your internet connection; the first run will download the model

Issue: Out of memory errors
Solution: Reduce the number of documents or batch size for processing
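
For the out-of-memory case, one option is to embed texts in small batches rather than all at once. The sketch below is a generic pattern: embed_fn stands in for any callable that takes a list of strings and returns one vector per string (the generate_embeddings call shown under Embeddings Generation has that shape), and batched and embed_in_batches are hypothetical helpers. Whether the project's scripts expose a batch size option is not stated in this guide.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_fn, batch_size=16):
    """Call `embed_fn` on one batch at a time and collect all vectors."""
    vectors = []
    for batch in batched(texts, batch_size):
        # Only one batch of inputs is passed to the model at a time
        vectors.extend(embed_fn(batch))
    return vectors
```

Lowering batch_size trades speed for a smaller peak memory footprint.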

Generating Test Data

If you don't have real executive orders, you can create test documents with:

printf 'Executive Order 12345\n\nTitle: Test Executive Order\n\nJanuary 1, 2025\n\nBy the authority vested in me as President by the Constitution and the laws of the United States of America, it is hereby ordered as follows:\n\nSection 1. Policy. This is a test executive order for the RAG system.\n' > test_docs/test_eo.txt

Performance Testing

To test with a larger dataset, you can use a loop to generate multiple test documents:

for i in {1..10}; do
  printf 'Executive Order %s\n\nTitle: Test Executive Order %s\n\nJanuary %s, 2025\n\nBy the authority vested in me as President by the Constitution and the laws of the United States of America, it is hereby ordered as follows:\n\nSection 1. Policy. This is test executive order number %s for the RAG system.\n' "$i" "$i" "$i" "$i" > "test_docs/test_eo_$i.txt"
done
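
On systems without a Bash-compatible shell, the same files can be produced from Python. This is a sketch that mirrors the loop above; write_test_orders and TEMPLATE are hypothetical names, and the text is the same placeholder content, not a real executive order.

```python
from pathlib import Path

TEMPLATE = (
    "Executive Order {n}\n\n"
    "Title: Test Executive Order {n}\n\n"
    "January {n}, 2025\n\n"
    "By the authority vested in me as President by the Constitution and the laws "
    "of the United States of America, it is hereby ordered as follows:\n\n"
    "Section 1. Policy. This is test executive order number {n} for the RAG system.\n"
)


def write_test_orders(directory, count):
    """Write `count` numbered test documents and return their paths."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for n in range(1, count + 1):
        path = out_dir / f"test_eo_{n}.txt"
        path.write_text(TEMPLATE.format(n=n), encoding="utf-8")
        paths.append(path)
    return paths
```

Calling write_test_orders("test_docs", 10) matches the shell loop. For counts above 31 the date line is no longer a valid calendar date, which is harmless for synthetic test data.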
