This project implements a Retrieval Augmented Generation (RAG) chatbot using Streamlit for the UI, LangChain for orchestration, Pinecone for vector storage, HuggingFace embeddings, and a local Large Language Model (LLM) via llama.cpp.
- Document Ingestion: Load and process PDF documents from the
documents/folder - Vector Storage: Store document embeddings in Pinecone vector database
- Retrieval: Perform similarity search to retrieve relevant context
- Local LLM: Use Mistral 7B Instruct model locally for generating responses
- Chat Interface: Interactive Streamlit app for chatting with the RAG system
- Python 3.11+
- Pinecone account (free tier available)
- Mistral 7B Instruct GGUF model file
-
Clone the repository:
git clone https://github.com/trisita-26/LangChain-Pinecone-RAG.git cd LangChain-Pinecone-RAG -
Create a virtual environment:
python -m venv venv -
Activate the virtual environment:
- Windows:
venv\Scripts\activate - macOS/Linux:
source venv/bin/activate
- Windows:
-
Install dependencies:
pip install -r requirements.txt -
Set up accounts and models:
- Create a free account on Pinecone
- Download the Mistral 7B Instruct GGUF model from Hugging Face (e.g.,
mistral-7b-instruct-v0.1.Q4_K_M.gguf) - Place the model file in a
models/directory
-
Create a
.envfile in the root directory with your API keys:PINECONE_API_KEY=your_pinecone_api_key_here PINECONE_INDEX_NAME=sample-index
Run the ingestion script to process and store your documents:
python ingestion.py
This will:
- Create a Pinecone index (if it doesn't exist)
- Load PDF files from the
documents/folder - Split documents into chunks
- Generate embeddings and upload to Pinecone
Test the retrieval system:
python retrieval.py
Start the Streamlit app:
streamlit run app.py
Open your browser to the provided URL and start chatting!
chatbot_rag.py: Main Streamlit app for the chatbot interfaceingestion.py: Script to ingest documents into Pineconeretrieval.py: Script to test document retrievalrequirements.txt: Python dependenciesdocuments/: Folder for PDF documents to ingestmodels/: Folder for local LLM model files (create this directory)
- Pinecone Index: Configured for 384 dimensions (matching HuggingFace all-MiniLM-L6-v2 embeddings)
- Chunk Size: 500 characters with 50 character overlap
- Retrieval: Top 3 similar chunks with score threshold of 0.5
- LLM: Mistral 7B Instruct with temperature 0.7, max 512 tokens
- Change the LLM model path in
chatbot_rag.py - Adjust retrieval parameters in the retriever setup
- Modify the prompt template for different response styles
- Add more document types by updating the loader in
ingestion.py
- Ensure your Pinecone API key is correct and has sufficient credits
- Check that the model file path matches the one in
app.py - Verify that documents are in PDF format in the
documents/folder - Make sure the virtual environment is activated before running scripts
This project is for educational purposes. Please check the licenses of individual components (LangChain, Pinecone, etc.) for commercial use.
- Rename .env.example to .env
- Add the API keys for Pinecone and OpenAI to the .env file
-
Open a terminal in VS Code
-
Execute the following command:
python sample_ingestion.py
python sample_retrieval.py
python ingestion.py
python retrieval.py
streamlit run chatbot_rag.py