3 changes: 0 additions & 3 deletions 1

This file was deleted.

92 changes: 92 additions & 0 deletions backend/AI_SETUP.md
@@ -0,0 +1,92 @@
# AI Integration Setup

## Overview

Your PrivateGPT backend now includes AI integration capabilities using OpenAI's API. The system can operate in two modes:

1. **Test Mode** (Current): Uses predefined responses for testing
2. **Production Mode**: Uses OpenAI API for real AI responses

## Current Status: Test Mode

The system is currently running in **Test Mode** with intelligent fallback responses. This allows you to test the chat functionality without requiring an OpenAI API key.
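How the backend chooses between the two modes is not spelled out here. The Troubleshooting section suggests that a key starting with `sk-placeholder` keeps the system in Test Mode, so a minimal sketch under that assumption could look like this (the function name `is_test_mode` is hypothetical and not taken from the codebase):

```python
import os
from typing import Optional

# Prefix of the placeholder key shipped in .env (see "Update Environment Variables").
PLACEHOLDER_PREFIX = "sk-placeholder"


def is_test_mode(api_key: Optional[str]) -> bool:
    """Assumed rule: a missing, empty, or placeholder key means Test Mode."""
    if not api_key or not api_key.strip():
        return True
    return api_key.strip().startswith(PLACEHOLDER_PREFIX)


if __name__ == "__main__":
    mode = "test" if is_test_mode(os.environ.get("OPENAI_API_KEY")) else "production"
    print(f"Running in {mode} mode")
```

Treating a missing key the same as the placeholder keeps a fresh checkout usable without any OpenAI account.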

## Setting Up OpenAI API (Production Mode)

To enable real AI responses, follow these steps:

### 1. Get OpenAI API Key

1. Go to [OpenAI's website](https://platform.openai.com/)
2. Sign up or log in to your account
3. Navigate to the API keys section
4. Create a new API key
5. Copy the key (it starts with `sk-`)

### 2. Update Environment Variables

Edit the `.env` file in the backend directory:

```bash
# Replace this line:
OPENAI_API_KEY=sk-placeholder-replace-with-your-real-api-key

# With your actual API key:
OPENAI_API_KEY=sk-your-actual-api-key-here
```

### 3. Configure AI Settings (Optional)

You can customize the AI behavior by modifying these settings in `.env`:

```bash
OPENAI_MODEL=gpt-4o-mini # AI model to use
OPENAI_MAX_TOKENS=1000 # Maximum response length
OPENAI_TEMPERATURE=0.7 # Response creativity (OpenAI accepts 0-2; lower is more deterministic)
```
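For reference, these values could be read and validated at startup roughly as follows. This is a sketch, not the backend's actual loader: the `AISettings` class and `load_ai_settings` function are hypothetical names, and the fallback values simply mirror the example above rather than confirmed defaults. The temperature check uses 0 to 2, the range the OpenAI Chat Completions API accepts.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AISettings:
    model: str
    max_tokens: int
    temperature: float


def load_ai_settings(env: Mapping[str, str] = os.environ) -> AISettings:
    """Read the optional AI settings, falling back to the example values."""
    settings = AISettings(
        model=env.get("OPENAI_MODEL", "gpt-4o-mini"),
        max_tokens=int(env.get("OPENAI_MAX_TOKENS", "1000")),
        temperature=float(env.get("OPENAI_TEMPERATURE", "0.7")),
    )
    # Fail fast on values the API would reject anyway.
    if settings.max_tokens <= 0:
        raise ValueError("OPENAI_MAX_TOKENS must be a positive integer")
    if not 0.0 <= settings.temperature <= 2.0:
        raise ValueError("OPENAI_TEMPERATURE must be between 0 and 2")
    return settings
```

Validating at startup surfaces a typo in `.env` immediately instead of on the first chat request.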

### 4. Restart Backend Server

After updating the API key, restart your backend server:

```bash
# Stop the current server (Ctrl+C) and restart:
cd backend
source venv/bin/activate
uvicorn app.main:app --reload --host 127.0.0.1 --port 8000
```

## Legal Professional Features

The AI integration includes:

- **Legal-focused prompts**: Responses tailored for legal professionals
- **Professional disclaimers**: Responses remind users that AI supplements, not replaces, professional judgment
- **Contextual responses**: Understands legal terminology and concepts
- **Privacy considerations**: Built with confidentiality in mind

## Cost Considerations

- OpenAI API usage is pay-per-token
- The current setting (`OPENAI_MAX_TOKENS=1000`) caps each response at 1000 tokens to control costs
- Monitor your usage at [OpenAI's usage dashboard](https://platform.openai.com/usage)

## Testing

Test your AI integration with these sample queries:

1. "What is a legal contract?"
2. "Hello, I need help with compliance"
3. "What can you help me with?"

## Troubleshooting

- **"Technical difficulties" message**: Check your API key and internet connection
- **Still seeing Test Mode responses**: Make sure `OPENAI_API_KEY` in `.env` no longer starts with `sk-placeholder`, then restart the backend server
- **Server errors**: Check the console logs for detailed error messages

## Security Notes

- Never commit your actual API key to version control
- Keep the `.env` file secure and private
- In production deployments, consider supplying the key through the host's environment variables or a secrets manager instead of a `.env` file
123 changes: 123 additions & 0 deletions backend/CHUNKING_README.md
@@ -0,0 +1,123 @@
# Document Chunking Implementation

## ✅ Completed Features

### 1. Intelligent Document Chunking Service
- **File**: `app/services/chunking_service.py`
- **Features**:
  - Configurable chunk size (default: 800 characters)
  - Overlap between chunks (default: 200 characters) for context preservation
  - Sentence boundary preservation
  - Automatic handling of small and large documents
  - Metadata preservation and enrichment for each chunk
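The contents of `chunking_service.py` are not shown in this document. As an illustration of the technique only, sentence-aware chunking with overlap might be sketched like this; the function name `chunk_text` and the naive regex sentence splitter are assumptions, not the service's actual code:

```python
import re
from typing import List


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> List[str]:
    """Split text into chunks of roughly chunk_size characters.

    Sentences are kept whole. Each new chunk starts with the trailing
    sentences of the previous chunk, up to `overlap` characters.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > chunk_size:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward as the overlap for the next chunk.
            carried: List[str] = []
            while current and len(" ".join([current[-1]] + carried)) <= overlap:
                carried.insert(0, current.pop())
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

In this sketch a single sentence longer than `chunk_size` becomes its own oversized chunk; a real service would also need a hard split for that case, which is presumably what a maximum chunk size setting is for.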

### 2. Enhanced RAG Service Integration
- **File**: `app/services/rag_service.py`
- **Changes**:
  - Integrated chunking service into document ingestion pipeline
  - Enhanced metadata with chunk information (index, total chunks, character positions)
  - Improved context retrieval with chunk-aware search results
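To make the enriched metadata concrete, the sketch below attaches the index, total chunk count, and character positions to a copy of the document's metadata for each chunk. The field names (`chunk_index`, `total_chunks`, `start_char`, `end_char`) and the function name are hypothetical; the real keys in `rag_service.py` may differ.

```python
from typing import Any, Dict, List, Tuple


def enrich_chunk_metadata(
    text: str,
    spans: List[Tuple[int, int]],
    base_metadata: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Build one record per chunk from (start, end) character spans."""
    total = len(spans)
    records = []
    for index, (start, end) in enumerate(spans):
        records.append(
            {
                "text": text[start:end],
                # Copy the document metadata so chunks do not share one dict.
                "metadata": {
                    **base_metadata,
                    "chunk_index": index,
                    "total_chunks": total,
                    "start_char": start,
                    "end_char": end,
                },
            }
        )
    return records
```

Keeping the character positions makes it possible to locate a retrieved chunk in the original document, for example to show surrounding context.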

### 3. API Updates
- **File**: `app/api/documents.py`
- **Changes**:
  - Updated ingestion response to include chunk count
  - System status now reports chunking configuration

## Configuration

The chunking behavior can be configured via environment variables:

```env
# Chunking Configuration
CHUNK_SIZE=800 # Target size for each chunk in characters
CHUNK_OVERLAP=200 # Overlap between consecutive chunks
MAX_CHUNK_SIZE=1000 # Maximum allowed chunk size
MIN_CHUNK_SIZE=100 # Minimum chunk size to create
```
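A sketch of how these four variables might be loaded and sanity-checked follows. The `ChunkingConfig` class, the `load_chunking_config` function, and the validation rules are assumptions made for illustration; the defaults are the values listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ChunkingConfig:
    chunk_size: int = 800
    chunk_overlap: int = 200
    max_chunk_size: int = 1000
    min_chunk_size: int = 100

    def validate(self) -> None:
        # Assumed constraints: sizes must be ordered, overlap must leave room for new text.
        if not 0 < self.min_chunk_size <= self.chunk_size <= self.max_chunk_size:
            raise ValueError("expected 0 < MIN_CHUNK_SIZE <= CHUNK_SIZE <= MAX_CHUNK_SIZE")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")


def load_chunking_config(env: Mapping[str, str] = os.environ) -> ChunkingConfig:
    """Read chunking settings from the environment, using the defaults when unset."""
    config = ChunkingConfig(
        chunk_size=int(env.get("CHUNK_SIZE", "800")),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", "200")),
        max_chunk_size=int(env.get("MAX_CHUNK_SIZE", "1000")),
        min_chunk_size=int(env.get("MIN_CHUNK_SIZE", "100")),
    )
    config.validate()
    return config
```

Rejecting an overlap that is equal to or larger than the chunk size matters because such a setting would prevent the chunker from ever advancing through the document.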

## How It Works

### Document Processing Flow
1. **Document Received**: API receives document with optional metadata
2. **Chunking**: Document is split into overlapping chunks
   - Preserves sentence boundaries when possible
   - Maintains context with overlap between chunks
   - Each chunk inherits original document metadata
3. **Embedding Generation**: Each chunk gets its own embedding
4. **Vector Storage**: Chunks stored in Pinecone with enriched metadata
5. **Retrieval**: Searches return relevant chunks with chunk information

### Example Usage

```python
# Ingesting a large document
import requests

document = "Your large document text here..."
metadata = {"document_type": "policy", "author": "HR Department"}

response = requests.post(
    "http://localhost:8000/api/ingest",
    json={
        "documents": [document],
        "metadata": [metadata],
    },
)

# Response includes chunk information
# {
# "success": true,
# "document_count": 1,
# "chunk_count": 10,
# "message": "Successfully ingested 1 documents as 10 chunks",
# ...
# }
```

## Testing

### Unit Testing
Run the chunking test to verify functionality:
```bash
python test_chunking.py
```

### Integration Testing
Test end-to-end document ingestion with chunking:
```bash
python test_ingest_large.py
```

## Benefits

1. **Better Context Preservation**: Overlap ensures important information isn't lost at chunk boundaries
2. **Improved Retrieval**: More granular chunks mean better semantic search results
3. **Scalability**: Can handle documents of any size
4. **Flexibility**: Configurable chunk sizes for different use cases
5. **Metadata Tracking**: Each chunk maintains reference to source document

## Performance Considerations

- **Chunk Size**: Smaller chunks = better precision but more storage/embeddings
- **Overlap Size**: More overlap = better context but increased storage
- **Processing Time**: Chunking adds minimal overhead (< 100ms for most documents)

## Example Results

For a 6,005 character HR policy document:
- Original documents: 1
- Chunks created: 10
- Average chunk size: ~600-800 characters
- Overlap preserved: ~80-100 characters between chunks

The system successfully maintains context about "Dan Pfeiffer" across multiple chunks, ensuring queries about him retrieve relevant information regardless of which chunk contains the primary reference.
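The chunk count reported above is consistent with a simple sliding-window estimate: after the first chunk, each further chunk contributes `chunk_size - overlap` new characters. The helper below is an illustrative approximation (the name `estimate_chunk_count` is hypothetical), and it ignores the sentence-boundary adjustments the real service makes.

```python
import math


def estimate_chunk_count(doc_length: int, chunk_size: int, overlap: int) -> int:
    """Approximate number of chunks for a fixed-size window with fixed overlap."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    if doc_length <= chunk_size:
        return 1
    # Each chunk after the first adds (chunk_size - overlap) new characters.
    return 1 + math.ceil((doc_length - chunk_size) / (chunk_size - overlap))


# Midpoints of the reported ranges (700-character chunks, 90-character overlap):
print(estimate_chunk_count(6005, 700, 90))  # 10
```

The same function gives a quick feel for the storage trade-off: shrinking the chunk size or growing the overlap raises the number of chunks, and therefore the number of embeddings to store.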

## Next Steps

With chunking complete, the system is ready for:
1. ✅ Handling large enterprise documents
2. ✅ Maintaining context across document sections
3. ✅ Improving search relevance with granular chunks
4. 🔄 API key authentication (next implementation)
5. 🔄 Production deployment configuration