Skip to content

Latest commit

 

History

History
248 lines (181 loc) · 5.61 KB

File metadata and controls

248 lines (181 loc) · 5.61 KB

🤖 Voice AI Bot System

A production-ready, conversational AI voice bot that bridges Exotel's WebSocket streaming with OpenAI's Realtime API for natural, speech-to-speech conversations over phone calls.

🎯 What This Bot Does

  • 🗣️ Natural Conversations: Real-time speech-to-speech using OpenAI's latest Realtime API
  • 📞 Telephony Integration: Seamless integration with Exotel's voice streaming services
  • 🛑 Smart Interruption: Handles conversation interruptions naturally
  • 🔊 Audio Enhancement: Built-in noise suppression and audio optimization for telephony
  • ⚡ Real-time Processing: 200ms audio buffering for smooth conversation flow
  • 🔒 Security First: Environment-based configuration, no hardcoded secrets
  • 🎵 High-Quality Audio: 24kHz PCM16 audio format for superior voice quality

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • OpenAI API key with Realtime API access
  • Exotel account with Voicebot Applet access

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd Agent-Stream
  2. Create a virtual environment:

    python3 -m venv venv
    source venv/bin/activate  # Windows: venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Configure environment variables:

    cp env.example .env
    # Edit .env with your OpenAI API key and other settings

Configuration

Edit your .env file with the following required settings:

# REQUIRED - Get from OpenAI dashboard
OPENAI_API_KEY=your-openai-api-key-here

# SERVER CONFIG
SERVER_HOST=0.0.0.0
SERVER_PORT=5000

# BOT PERSONALITY
COMPANY_NAME=Your Company Name
SALES_BOT_NAME=Sarah

# AUDIO SETTINGS
SAMPLE_RATE=24000
AUDIO_CHUNK_SIZE=200

Running the Bot

# Start the bot
python main.py

# Test the configuration
python main.py --config-check

# Run system tests
python main.py --test

The bot will start a WebSocket server on 0.0.0.0:5000.

📡 Exotel Integration

WebSocket URLs

  • Local: ws://localhost:5000
  • Public: Use ngrok or your server's public IP

Exotel Voicebot Applet Configuration

  1. URL: wss://your-domain.com/?sample-rate=24000
  2. Sample Rate: 24kHz (recommended for high quality)
  3. Audio Format: Raw/slin (16-bit PCM)
  4. Bidirectional Streaming: Enabled

Test Message Format

{
  "event": "connected"
}

🏗️ Project Structure

.
├── .env                 # Environment variables (local)
├── .gitignore           # Git ignore file
├── LICENSE              # Project license
├── README.md            # This README file
├── config.py            # Centralized configuration
├── core/                # Core bot logic and framework
│   ├── __init__.py
│   ├── bot_framework.py
│   └── openai_realtime_sales_bot.py
├── engines/             # AI engine components (STT, TTS, NLP, etc.)
│   ├── __init__.py
│   ├── audio_enhancer.py
│   ├── media_resampler.py
│   ├── nlp_engine.py
│   ├── stt_engine.py
│   └── tts_engine.py
├── env.example          # Example environment variables
├── main.py              # Main entry point for the application
├── requirements.txt     # Python dependencies
└── venv/                # Python virtual environment

🔧 Configuration Options

Audio Settings

  • SAMPLE_RATE: Audio sample rate (8000, 16000, 24000)
  • AUDIO_CHUNK_SIZE: Chunk size in milliseconds (default: 200)
  • BUFFER_SIZE_MS: Audio buffer size (default: 160)

Bot Personality

  • COMPANY_NAME: Your company name
  • SALES_BOT_NAME: Bot's name
  • OPENAI_VOICE: Voice selection (coral, nova, shimmer)

Server Settings

  • SERVER_HOST: Server host (default: 0.0.0.0)
  • SERVER_PORT: Server port (default: 5000)

🚀 Deployment

Option 1: Development (ngrok)

# Install ngrok
./ngrok http 5000
# Use: wss://xxxxx.ngrok-free.app

Option 2: Cloud VPS

# Setup on DigitalOcean/AWS/GCP
sudo ufw allow 5000
python main.py
# Use: wss://your-server-ip:5000

Option 3: Docker

# Build and run
docker build -t voice-bot .
docker run --env-file .env -p 5000:5000 voice-bot

🧪 Testing

Basic Tests

# Test configuration
python main.py --config-check

# Test bot connection
python main.py --test

Manual Testing

# Using wscat
wscat -c ws://localhost:5000

# Send test message
{"event": "connected"}

🔒 Security

  • All sensitive information is stored in environment variables
  • No hardcoded API keys or tokens
  • .env file is gitignored
  • Use HTTPS/WSS in production

📊 Monitoring

Key Metrics

  • Call Duration
  • Bot Response Time
  • Audio Quality Score
  • Conversation Completion Rate
  • Error Rate

Log Analysis

# Monitor key events
grep "NEW EXOTEL CONNECTION" logs/bot.log
grep "CONVERSATION COMPLETED" logs/bot.log
grep "ERROR" logs/bot.log

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

  • Check the troubleshooting section in the README
  • Review GitHub Issues for similar problems
  • Post detailed issues with logs and configuration

🙏 Acknowledgments