Self-hosted inbound + outbound phone agent using LiveKit Agents.
Speaks to customers in English, Hindi, Kannada, and Marathi,
collects order details, and logs everything to Airtable.

    Caller (Twilio SIP)
          │
    LiveKit Cloud ◄──► agent.py (runs locally)
          │                  │
          │         ┌────────┴─────────┐
          │         ▼                  ▼
          │    Groq Whisper       Sarvam AI TTS
          │    (STT + lang        (multilingual
          │     detection)         Indian voices)
          │         │
          │         ▼
          │    Groq LLaMA
          │    (llama-3.3-70b)
          │
    After call ends:
          ├── call_logs → Airtable
          └── orders    → Airtable (Groq extracts structured data)

| Variable | Description | Where to get it |
|---|---|---|
| `LIVEKIT_URL` | WebSocket URL of your LiveKit Cloud project | LiveKit Cloud → Project Settings |
| `LIVEKIT_API_KEY` | LiveKit API key | LiveKit Cloud → Project Settings |
| `LIVEKIT_API_SECRET` | LiveKit API secret | LiveKit Cloud → Project Settings |
| `LIVEKIT_SIP_TRUNK_ID` | SIP trunk ID for outbound PSTN calling | LiveKit Cloud → SIP → Trunks |
| `GROQ_API_KEY` | Groq API key (used for both Whisper STT and LLaMA LLM) | https://console.groq.com |
| `SARVAM_API_KEY` | Sarvam AI API key for multilingual TTS | https://dashboard.sarvam.ai |
| `TWILIO_ACCOUNT_SID` | Twilio account SID | https://console.twilio.com |
| `TWILIO_AUTH_TOKEN` | Twilio auth token | https://console.twilio.com |
| `TWILIO_PHONE_NUMBER` | Your Twilio phone number in E.164 format | https://console.twilio.com |
| `AIRTABLE_PAT` | Airtable Personal Access Token (read+write scopes) | Airtable → Account → Developer Hub |
| `AIRTABLE_BASE_ID` | ID of your Airtable base (starts with `app`) | Airtable base URL |
| `MAX_CALL_DURATION_SECONDS` | Auto-disconnect after this many seconds (default: 300) | Set in `.env` |
| `AGENT_SYSTEM_PROMPT` | Full system prompt for the agent; edit it to change behaviour | Set in `.env` |
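
Because every credential in the table is read from the environment, it can help to fail fast when one is missing. The following is a minimal sketch, not the project's actual startup code: the helper names `missing_env_vars` and `max_call_duration` are hypothetical, and it assumes `AGENT_SYSTEM_PROMPT` is optional because a default prompt exists in the code.

```python
import os
from typing import List, Mapping

# Variables from the table that have no documented default.
REQUIRED_VARS = (
    "LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_SIP_TRUNK_ID",
    "GROQ_API_KEY", "SARVAM_API_KEY",
    "TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN", "TWILIO_PHONE_NUMBER",
    "AIRTABLE_PAT", "AIRTABLE_BASE_ID",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def max_call_duration(env: Mapping[str, str] = os.environ) -> int:
    """Read MAX_CALL_DURATION_SECONDS, falling back to the documented default of 300."""
    return int(env.get("MAX_CALL_DURATION_SECONDS", "300"))


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print(f"Configuration looks complete; call limit is {max_call_duration()} s")
```

Running the file directly before starting the agent reports any variable that still needs a value in `.env` (assuming the values have been exported into the shell or loaded by your tooling).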

    cd /path/to/project
    python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt

    cp .env.example .env
    # Fill in all values in .env

    python agent.py dev

Why local running works: LiveKit Cloud handles all media routing and SIP bridging. Your laptop only needs an outbound internet connection; no ports need to be opened. This setup is sufficient for demos.

- Create an account at https://cloud.livekit.io
- Create a project. Copy Project URL, API Key, and API Secret → `.env`
- Enable SIP in the project settings
- Inbound trunk: LiveKit Cloud → SIP → Trunks → New Inbound Trunk
  - Set allowed IPs to Twilio's SIP signalling IPs
- Outbound trunk: LiveKit Cloud → SIP → Trunks → New Outbound Trunk
  - Use `pstn.twilio.com` as the destination
  - Set auth credentials matching your Twilio SIP credential list
  - Copy the trunk ID (`ST_…`) → `LIVEKIT_SIP_TRUNK_ID` in `.env`

- Create an account at https://console.twilio.com
- Buy a phone number capable of voice → `.env` as `TWILIO_PHONE_NUMBER`
- Inbound SIP: Elastic SIP Trunking → Create trunk
  - Set the Origination SIP URI to your LiveKit SIP inbound endpoint
  - Assign your Twilio number to this trunk
- Outbound SIP credential list: SIP → Credential Lists → Create
  - Add username + password matching what you set in the LiveKit outbound trunk
- Copy Account SID and Auth Token → `.env`
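
`TWILIO_PHONE_NUMBER` has to be in E.164 format, and a malformed number is an easy mistake to make when copying it from the console. As a rough sketch (the helper name `is_e164` is hypothetical and not part of the project), a format check could look like this. It validates shape only: a leading `+`, a non-zero first digit, and at most 15 digits in total.

```python
import re

# E.164: "+" followed by 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string is shaped like an E.164 phone number."""
    return E164_PATTERN.fullmatch(number) is not None


if __name__ == "__main__":
    for candidate in ("+919876543210", "9876543210", "+91 98765 43210"):
        print(candidate, is_e164(candidate))
```

The check does not confirm that the number exists or is voice-capable; it only catches missing `+` prefixes, spaces, and similar formatting slips.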
- Create an account at https://console.groq.com
- Generate an API key → `GROQ_API_KEY` in `.env`
- No separate key needed: one key handles both Whisper STT and LLaMA LLM

- Create an account at https://dashboard.sarvam.ai
- Generate an API key → `SARVAM_API_KEY` in `.env`

- Create a base at https://airtable.com
- Create table `call_logs` with these exact fields:

  | Field | Type |
  |---|---|
  | call_id | Single line text |
  | caller_number | Single line text |
  | duration_seconds | Number |
  | transcript | Long text |
  | language_detected | Single line text |
  | created_at | Date (enable time toggle) |

- Create table `orders` with these exact fields:

  | Field | Type |
  |---|---|
  | call_id | Single line text |
  | customer_name | Single line text |
  | item_ordered | Single line text |
  | quantity | Number |
  | delivery_address | Single line text |
  | order_status | Single line text |
  | created_at | Date (enable time toggle) |

- Go to Account → Developer Hub → Personal Access Tokens
  - Create a token with `data.records:read` and `data.records:write` scopes on your base
  - Copy it → `AIRTABLE_PAT` in `.env`
- Copy the Base ID from the URL (`https://airtable.com/appXXX/…`) → `AIRTABLE_BASE_ID`
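
The field names matter because records are written by name. To illustrate, here is a hedged sketch of how a `call_logs` row could be prepared for Airtable's REST API using only the standard library. It is not the project's `log_call_to_airtable()` implementation, which is not shown here; the function name `build_call_log_request` is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request
from datetime import datetime, timezone
from typing import Optional

AIRTABLE_API_ROOT = "https://api.airtable.com/v0"


def build_call_log_request(
    base_id: str,
    token: str,
    call_id: str,
    caller_number: str,
    duration_seconds: int,
    transcript: str,
    language_detected: str,
    created_at: Optional[datetime] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST that creates one call_logs record."""
    created_at = created_at or datetime.now(timezone.utc)
    fields = {
        "call_id": call_id,
        "caller_number": caller_number,
        "duration_seconds": duration_seconds,
        "transcript": transcript,
        "language_detected": language_detected,
        # Date fields with the time toggle enabled accept ISO 8601 strings.
        "created_at": created_at.isoformat(),
    }
    body = json.dumps({"records": [{"fields": fields}]}).encode("utf-8")
    return urllib.request.Request(
        f"{AIRTABLE_API_ROOT}/{base_id}/call_logs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the write. A mismatch between a key in `fields` and a column name in the table is the most likely cause of a rejected request, which is why the tables above must use the exact field names.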

    import asyncio
    from agent import call_customer
    asyncio.run(call_customer("+919876543210"))

| What | Location in code |
|---|---|
| Language → voice code mapping | LANGUAGE_VOICE_MAP dict (~line 65) |
| VAD sensitivity tuning | build_silero_vad() with inline comments |
| System prompt (default) | DEFAULT_PROMPT constant |
| Airtable table names | log_call_to_airtable() and extract_and_log_order() |
| Sarvam voice speaker | SarvamChunkedStream._run() → "speaker" key |
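
To give a sense of what a language-to-voice mapping might look like, the sketch below maps a detected language to a TTS language code with an English fallback. The contents of the real `LANGUAGE_VOICE_MAP` are not shown in this README, so everything here is an assumption: the dictionary name `EXAMPLE_LANGUAGE_VOICE_MAP`, the ISO 639-1 input keys, and the `xx-IN` style output codes are illustrative only.

```python
from typing import Optional

# Hypothetical mapping: detected language (ISO 639-1) -> TTS language code.
# The project's actual LANGUAGE_VOICE_MAP may use different keys and values.
EXAMPLE_LANGUAGE_VOICE_MAP = {
    "en": "en-IN",  # English
    "hi": "hi-IN",  # Hindi
    "kn": "kn-IN",  # Kannada
    "mr": "mr-IN",  # Marathi
}
DEFAULT_LANGUAGE = "en"


def resolve_tts_language(detected: Optional[str]) -> str:
    """Return the TTS code for a detected language, defaulting to English."""
    key = (detected or "").strip().lower()
    return EXAMPLE_LANGUAGE_VOICE_MAP.get(
        key, EXAMPLE_LANGUAGE_VOICE_MAP[DEFAULT_LANGUAGE]
    )
```

Falling back to a default rather than raising keeps the call going when the STT step reports a language outside the four supported ones.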

- `.env` is git-ignored. Never commit it.
- All credentials come exclusively from environment variables.
- `MAX_CALL_DURATION_SECONDS` limits cost exposure from stuck calls.