🤖 Telecom AI Voice Assistant

An intelligent AI-powered customer service assistant for Reliance Jio, featuring voice interaction, RAG-based knowledge retrieval, and real-time chat capabilities.

🌟 Features

💬 Chat Interface

Real-time text chat with AI assistant
Streaming responses (token-by-token)
Session management and chat history
WebSocket support for instant messaging

📊 Performance

Metric	Result
End-to-end latency	< 2 seconds
Query resolution rate	95%+
Human escalation reduction	−35%

🎙️ Voice Interface

Speech-to-Text (STT): Faster-Whisper (Whisper base.en model)
Text-to-Speech (TTS): Kokoro-82M (lightweight, high-quality)
Voice Activity Detection (VAD): Silero-VAD (neural network-based)
Real-time voice conversations

🔍 RAG (Retrieval-Augmented Generation)

Hybrid Search: Vector (BGE embeddings) + BM25 keyword search
CRAG: Corrective RAG with relevance grading
Knowledge Base: Comprehensive Jio plans, services, and FAQs
ChromaDB: Vector database for efficient retrieval

📊 Supported Jio Services

✅ Prepaid Mobile Plans
✅ Postpaid Mobile Plans
✅ JioFiber Broadband
✅ JioAirFiber (5G Wireless Broadband)
✅ International Roaming
✅ ISD Calling Rates
✅ Digital Services (JioTV, JioCinema, etc.)

🏗️ Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Frontend (React)                         │
│                    telecom-voice-companion-main                  │
├─────────────────────────────────────────────────────────────────┤
│                         Backend (FastAPI)                        │
├──────────────┬──────────────┬──────────────┬────────────────────┤
│   Chat API   │   Voice API  │   RAG API    │   WebSockets       │
├──────────────┼──────────────┼──────────────┼────────────────────┤
│              │              │              │                    │
│  Ollama LLM  │ Whisper STT  │  ChromaDB    │  Silero VAD        │
│ (llama3.1:8b)│ (base.en)    │  (Vector DB) │                    │
│              │              │              │                    │
│              │ Kokoro TTS   │  BM25 Index  │                    │
│              │ (82M model)  │  (Keyword)   │                    │
└──────────────┴──────────────┴──────────────┴────────────────────┘

🛠️ Tech Stack

Backend

Component	Technology	Description
Framework	FastAPI 0.109	Async Python web framework
LLM	Ollama + llama3.1:8b	Local LLM inference
STT	Faster-Whisper (base.en)	Speech-to-Text using Whisper
TTS	Kokoro-82M	Lightweight 82M param TTS model
VAD	Silero-VAD	Neural network voice activity detection
Embeddings	BGE-base-en-v1.5	State-of-the-art embedding model
Vector DB	ChromaDB	Vector database for RAG
Search	BM25 + Vector	Hybrid search (RRF fusion)
Database	PostgreSQL	Session and chat history storage
Cache	Redis	Response caching

Frontend

Component	Technology	Description
Framework	React + Vite	Modern frontend tooling
Styling	TailwindCSS	Utility-first CSS
UI	shadcn/ui	Beautiful component library
Voice	Web Audio API	Browser audio recording

📋 Prerequisites

Required Software

Python 3.12+ (required for Kokoro TTS compatibility)
Node.js 18+ (for frontend)
Ollama (for local LLM)
PostgreSQL (optional, for persistence)
Redis (optional, for caching)

For GPU Acceleration (Recommended for TTS)

⚡ CUDA is highly recommended for Kokoro TTS - provides 5-10x faster audio generation

# Check if CUDA is available
python -c "import torch; print('CUDA:', torch.cuda.is_available())"

CUDA Setup:

Install NVIDIA CUDA Toolkit 12.x
Install cuDNN
Reinstall PyTorch with CUDA:

pip uninstall torch torchaudio
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

🚀 Quick Start

1. Clone Repository

git clone https://github.com/Vijaykrishna2334/Telecom-ai-assistant.git
cd Telecom-ai-assistant

2. Install Ollama & Pull Model

# Install from https://ollama.ai
ollama pull llama3.1:8b

3. Setup Backend

cd backend

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# For CUDA/GPU support (recommended for Kokoro TTS):
pip uninstall torch torchaudio -y
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

4. Configure Environment

# Copy example env
copy .env.example .env  # Windows
cp .env.example .env    # Linux/Mac

# Edit .env with your settings

Key .env settings:

# LLM
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b

# Voice (Both STT and TTS auto-detect GPU/CUDA)
STT_MODEL=base.en
# Device is auto-detected - uses CUDA if available

# Kokoro TTS (auto-detects GPU)
KOKORO_VOICE=af_heart
KOKORO_LANG_CODE=a

5. Start Backend

cd backend
uvicorn app.main:app --reload --host 0.0.0.0 --port 8080

Backend runs at: http://localhost:8080

6. Setup Frontend

cd telecom-voice-companion-main

# Install dependencies
npm install

# Start development server
npm run dev

Frontend runs at: http://localhost:5173

📁 Project Structure

Telecom-ai-assistant/
├── backend/                    # FastAPI backend
│   ├── app/
│   │   ├── api/               # API routes
│   │   │   ├── routes/        # REST endpoints
│   │   │   └── websockets/    # WebSocket handlers
│   │   ├── core/              # Config, logging, security
│   │   ├── models/            # Database models
│   │   └── services/
│   │       ├── llm/           # LLM (Ollama) integration
│   │       ├── rag/           # RAG pipeline (CRAG)
│   │       │   ├── crag_chain.py      # CRAG orchestrator
│   │       │   ├── hybrid_retriever.py # Vector + BM25
│   │       │   └── ingestion.py       # Document chunking
│   │       └── voice/         # Voice processing
│   │           ├── stt.py     # Faster-Whisper STT
│   │           ├── kokoro_tts.py  # Kokoro TTS
│   │           └── vad.py     # Silero VAD
│   ├── requirements.txt
│   └── Dockerfile
│
├── knowledge/                 # RAG knowledge base
│   ├── plans/                # Jio plan documents
│   ├── faqs/                 # FAQ documents
│   ├── services/             # Service information
│   └── policies/             # Terms, conditions
│
├── telecom-voice-companion-main/  # React frontend
│   ├── src/
│   │   ├── components/       # UI components
│   │   ├── pages/           # Page components
│   │   └── hooks/           # Custom hooks
│   └── package.json
│
├── docker-compose.yml        # Docker orchestration
└── README.md

🎙️ Voice Components

Speech-to-Text (Faster-Whisper)

Model: base.en (English optimized)
Device: Auto-detects CUDA GPU (falls back to CPU)
Compute Type: float16 (GPU) or int8 (CPU)
Features: VAD filtering, beam search, Jio vocabulary hints

# Auto-detection in stt.py
# Uses CUDA GPU if available: RTX 4090 → ~3x faster transcription
# Falls back to CPU with int8 quantization

Text-to-Speech (Kokoro-82M)

Model: Kokoro-82M (82 million parameters)
Voice: af_heart (American female)
Sample Rate: 24kHz
Device: Auto-detects GPU (CUDA) for faster synthesis

# Configuration in config.py
kokoro_voice: str = "af_heart"
kokoro_lang_code: str = "a"  # 'a' = American English

Available Voices:

Voice Code	Description
`af_heart`	American Female (warm)
`af_bella`	American Female (professional)
`am_adam`	American Male
`bf_emma`	British Female
`bm_george`	British Male

Voice Activity Detection (Silero-VAD)

Model: Silero-VAD (neural network)
Threshold: 0.85 (adjustable)
Fallback: Energy-based detection

🔧 Configuration

Environment Variables

Variable	Default	Description
`OLLAMA_BASE_URL`	`http://localhost:11434`	Ollama API URL
`OLLAMA_MODEL`	`llama3.1:8b`	LLM model name
`STT_MODEL`	`base.en`	Whisper model size
`STT_DEVICE`	`cpu`	STT device (cpu/cuda)
`KOKORO_VOICE`	`af_heart`	TTS voice
`VAD_THRESHOLD`	`0.85`	VAD sensitivity
`CRAG_TOP_K`	`10`	Documents to retrieve
`DATABASE_URL`	`postgresql://...`	PostgreSQL connection
`REDIS_URL`	`redis://localhost:6379`	Redis connection

🐳 Docker Deployment

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f backend

# Stop services
docker-compose down

📚 API Endpoints

Chat

Method	Endpoint	Description
POST	`/api/v1/chat`	Send message, get response
POST	`/api/v1/chat/stream`	Streaming response (SSE)
GET	`/api/v1/chat/history/{session_id}`	Get chat history

Voice

Method	Endpoint	Description
WS	`/api/v1/ws/voice`	Real-time voice WebSocket
POST	`/api/v1/voice/transcribe`	Transcribe audio
POST	`/api/v1/voice/synthesize`	Generate speech

Knowledge

Method	Endpoint	Description
GET	`/api/v1/knowledge/search`	Search knowledge base
POST	`/api/v1/knowledge/ingest`	Ingest documents

🧪 Testing

cd backend

# Run all tests
pytest

# Run with coverage
pytest --cov=app

# Run specific test
pytest tests/test_rag.py -v

🔧 Troubleshooting

Ollama Connection Error

Error: Cannot connect to Ollama

Solution: Ensure Ollama is running:

ollama serve
# In another terminal:
ollama pull llama3.1:8b

CUDA/GPU Not Detected

CUDA available: False

Solution: Install PyTorch with CUDA support:

pip uninstall torch torchaudio -y
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu121

Kokoro TTS Installation Error

ModuleNotFoundError: No module named 'kokoro'

Solution:

pip install 'kokoro>=0.9.2' soundfile

ChromaDB Errors

Error: Collection not found

Solution: Knowledge base auto-ingests on startup. If issues persist:

cd backend
python reingest_knowledge.py

Port Already in Use

Error: Address already in use :8080

Solution: Change port in .env:

PORT=8081

FFmpeg Warnings (Can Ignore)

Failed to load FFmpeg extension

This is a non-critical warning from torchaudio. TTS will still work.

🤝 Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing)
Commit changes (git commit -m 'Add amazing feature')
Push to branch (git push origin feature/amazing)
Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Ollama - Local LLM inference
Faster-Whisper - Fast Whisper implementation
Kokoro TTS - Lightweight TTS model
Silero-VAD - Voice activity detection
ChromaDB - Vector database
FastAPI - Modern Python web framework

📞 Support

For issues or questions:

Create a GitHub Issue
Email: vijaykrishna2334@gmail.com

Built with ❤️ for better customer service

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
backend		backend
deploy-backend		deploy-backend
knowledge		knowledge
scripts		scripts
telecom-voice-companion-main		telecom-voice-companion-main
.env.example		.env.example
.gitignore		.gitignore
DEPLOYMENT.md		DEPLOYMENT.md
Makefile		Makefile
README.md		README.md
docker-compose.dev.yml		docker-compose.dev.yml
docker-compose.yml		docker-compose.yml

Folders and files

Latest commit

History

Repository files navigation

🤖 Telecom AI Voice Assistant

🌟 Features

💬 Chat Interface

📊 Performance

🎙️ Voice Interface

🔍 RAG (Retrieval-Augmented Generation)

📊 Supported Jio Services

🏗️ Architecture

🛠️ Tech Stack

Backend

Frontend

📋 Prerequisites

Required Software

For GPU Acceleration (Recommended for TTS)

🚀 Quick Start

1. Clone Repository

2. Install Ollama & Pull Model

3. Setup Backend

4. Configure Environment

5. Start Backend

6. Setup Frontend

📁 Project Structure

🎙️ Voice Components

Speech-to-Text (Faster-Whisper)

Text-to-Speech (Kokoro-82M)

Voice Activity Detection (Silero-VAD)

🔧 Configuration

Environment Variables

🐳 Docker Deployment

📚 API Endpoints

Chat

Voice

Knowledge

🧪 Testing

🔧 Troubleshooting

Ollama Connection Error

CUDA/GPU Not Detected

Kokoro TTS Installation Error

ChromaDB Errors

Port Already in Use

FFmpeg Warnings (Can Ignore)

🤝 Contributing

📄 License

🙏 Acknowledgments

📞 Support

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages