From ab2ba959a3aa7daace901d75ec5a75e5f9b3faa1 Mon Sep 17 00:00:00 2001
From: Cong <72737794+robolearning123@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:02:32 -0400
Subject: [PATCH 1/2] feat(build): add make publish target and declare build/twine deps

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 Makefile       | 5 ++++-
 pyproject.toml | 2 +-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 4159f11..2b75954 100644
--- a/Makefile
+++ b/Makefile
@@ -1,4 +1,4 @@
-.PHONY: test test-unit test-bdd test-integration lint format typecheck build check clean security docs docs-build
+.PHONY: test test-unit test-bdd test-integration lint format typecheck build check clean security docs docs-build publish
 
 test:
 	pytest tests/ -v --tb=short --cov=call_use --cov-report=term-missing --cov-fail-under=100
@@ -40,3 +40,6 @@ docs:
 
 docs-build:
 	mkdocs build
+
+publish: check
+	twine upload dist/*
diff --git a/pyproject.toml b/pyproject.toml
index 17a9f82..1e3061e 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -36,7 +36,7 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
-dev = ["pytest", "pytest-asyncio", "pytest-cov", "httpx", "mypy", "hypothesis"]
+dev = ["pytest", "pytest-asyncio", "pytest-cov", "httpx", "mypy", "hypothesis", "build", "twine"]
 docs = [
     "mkdocs-material>=9.0",
     "mkdocstrings[python]>=0.24",
From bd7072ed01bec51f3808c405a2360818b4f44a11 Mon Sep 17 00:00:00 2001
From: Cong <72737794+robolearning123@users.noreply.github.com>
Date: Sat, 21 Mar 2026 23:02:39 -0400
Subject: [PATCH 2/2] docs(readme): reposition as modality primitive with comparison table

Lead with "computer-use, browser-use, call-use" positioning. Add feature
comparison table vs OpenClaw, Pine AI, Bland AI. Restructure with
architecture diagram, state machine, and four-interface documentation
(SDK/CLI/MCP/REST).

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 README.md | 239 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 135 insertions(+), 104 deletions(-)

diff --git a/README.md b/README.md
index 09fb997..ca57566 100644
--- a/README.md
+++ b/README.md
@@ -6,11 +6,11 @@
 [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://pypi.org/project/call-use/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
 
-**Give your AI agent the ability to make real phone calls.**
+**computer-use controls computers. browser-use controls browsers. call-use controls phone calls.**
 
-call-use is an open-source outbound call-control runtime that lets AI agents dial real phones, navigate IVR menus, talk to humans, and return structured results. Think *browser-use*, but for phone calls.
+Give any AI agent the ability to make phone calls -- navigate IVR menus, conduct conversations, and return structured outcomes.
 
-> **Early release (v0.1)** — Core functionality works. API may change before v1.0. [Report issues](https://github.com/agent-next/call-use/issues).
+> **Early release (v0.1)** -- Core functionality works. API may change before v1.0. [Report issues](https://github.com/agent-next/call-use/issues).
 
 
 
@@ -18,6 +18,14 @@ https://github.com/agent-next/call-use/raw/main/docs/assets/demo.mp4
 
 
 
+## Install
+
+```bash
+pip install call-use
+```
+
+## Quick Start
+
 ```python
 from call_use import CallAgent
 
@@ -31,60 +39,71 @@ print(outcome.disposition) # "completed"
 print(outcome.transcript) # [{speaker: "agent", text: "..."}, ...]
 ```
 
-## Features
+## How It Compares
 
-- **Four interfaces** — Python SDK, CLI, MCP server, and REST API. Use whichever fits your stack.
-- **IVR navigation** — Navigate phone menus, press DTMF buttons, handle hold music automatically.
-- **Human takeover** — Pause the AI mid-call, join as a human, then hand control back to the agent.
-- **Approval flow** — Agent pauses and asks permission before taking sensitive actions.
-- **Structured outcomes** — Every call returns typed results: transcript, events, disposition, duration.
-- **Phone validation** — E.164 format enforcement, premium-rate blocking, Caribbean NPA blocking.
-- **Framework integrations** — Works with [LangChain](examples/langchain_tool.py), [CrewAI](examples/crewai_integration.py), [OpenAI Agents](examples/openai_agents.py), and any MCP-compatible client.
-- **Rate limiting** — Built-in per-key sliding window for the REST API.
+| Feature | call-use | OpenClaw Voice Plugin | Pine AI | Bland AI |
+|---------|----------|----------------------|---------|----------|
+| Open source | Yes | Yes | SDKs only | No |
+| Self-hosted | Yes | Yes (within OpenClaw) | No | No |
+| IVR navigation | Yes | No | Yes | Yes |
+| Approval flows | Yes | No | No | Unknown |
+| Human takeover | Yes | No | No | Unknown |
+| Structured outcomes | Yes | Partial (JSONL) | Yes | Yes |
+| Framework-agnostic | Yes (SDK/CLI/API/MCP) | No (OpenClaw only) | Partial (API/SDK/MCP) | No (API only) |
+| MCP support | Yes | No | Yes | No |
 
-## Installation
+## Architecture
 
-```bash
-pip install call-use
+```
+┌──────────────────┐
+│    Your Code     │
+│ SDK / CLI / MCP  │
+└────────┬─────────┘
+         │ gRPC
+         v
+┌──────────────────┐       ┌────────────┐       ┌──────┐
+│  LiveKit Cloud   │──────>│ Twilio SIP │──────>│ PSTN │
+│ Room + Data Ch.  │  SIP  │            │       │      │
+└────────┬─────────┘       └────────────┘       └──────┘
+         │
+         v
+┌──────────────────┐
+│ call-use worker  │
+│                  │
+│  Deepgram STT    │
+│  GPT-4o LLM      │
+│  OpenAI TTS      │
+└──────────────────┘
 ```
 
-## Quick Start
-
-### Prerequisites
-
-call-use connects four external services into a voice AI pipeline:
-
-| Service | Purpose | Sign up |
-|---------|---------|---------|
-| [LiveKit](https://livekit.io/) | Real-time audio transport + agent dispatch | [Cloud](https://cloud.livekit.io/) or self-hosted |
-| [Twilio](https://www.twilio.com/) | SIP trunk for PSTN connectivity | [Console](https://console.twilio.com/) |
-| [Deepgram](https://deepgram.com/) | Speech-to-text | [Console](https://console.deepgram.com/) |
-| [OpenAI](https://openai.com/) | LLM (GPT-4o) + text-to-speech | [Platform](https://platform.openai.com/) |
-
-### Configuration
-
-Set these environment variables (or use a `.env` file):
-
-| Variable | Description |
-|----------|-------------|
-| `LIVEKIT_URL` | LiveKit server URL (`wss://...`) |
-| `LIVEKIT_API_KEY` | LiveKit API key |
-| `LIVEKIT_API_SECRET` | LiveKit API secret |
-| `SIP_TRUNK_ID` | Twilio SIP trunk ID configured in LiveKit |
-| `DEEPGRAM_API_KEY` | Deepgram API key for STT |
-| `OPENAI_API_KEY` | OpenAI API key for LLM + TTS |
+**Two processes:** your code dispatches a call task into a LiveKit room; the worker joins the room, dials via SIP, runs the voice conversation, and publishes the structured outcome.
 
-### Start the worker
+## How It Works
 
-The worker process handles the actual voice pipeline:
+Every call follows a state machine:
 
-```bash
-call-use-worker start
+```
+created -> dialing -> ringing -> connected
+                                     |
+                         ┌───────────┼───────────┐
+                         v           v           v
+                      in_ivr      on_hold in_conversation
+                         |           |           |
+                         v           v           v
+                    (DTMF nav)    (wait)  awaiting_approval
+                         |           |        |        |
+                         └───────────┼────────┘        v
+                                     v          human_takeover
+                                   ended               |
+                                     ^                 |
+                                     └─────────────────┘
 ```
 
-### Make a call
+The agent navigates IVR menus with DTMF, handles hold music, conducts free-form conversation, and can pause for human approval or full human takeover at any point. Every state transition emits a `CallEvent` with timestamp and metadata.
 
-**Python SDK:**
+## Four Interfaces
+
+**Python SDK** -- full async control:
 
 ```python
 import asyncio
@@ -104,7 +123,7 @@ async def main():
 asyncio.run(main())
 ```
 
-**CLI** — any agent that can run shell commands can make calls:
+**CLI** -- any agent that can run shell commands can make calls:
 
 ```bash
 call-use dial "+18001234567" -i "Ask about store hours"
@@ -112,7 +131,7 @@ call-use dial "+18001234567" -i "Ask about store hours"
 
 Events stream to stderr in real-time; structured JSON result goes to stdout.
 
-**MCP Server** — native integration for Claude Code, Codex, and other MCP clients:
+**MCP Server** -- native integration for Claude Code, Codex, and other MCP clients:
 
 ```json
 {
@@ -124,7 +143,7 @@ Events stream to stderr in real-time; structured JSON result goes to stdout.
         "LIVEKIT_API_KEY": "...",
         "LIVEKIT_API_SECRET": "...",
         "SIP_TRUNK_ID": "...",
-        "DEEPGRAM_API_KEY": "your-deepgram-api-key",
+        "DEEPGRAM_API_KEY": "...",
         "OPENAI_API_KEY": "..."
       }
     }
@@ -134,27 +153,26 @@ Events stream to stderr in real-time; structured JSON result goes to stdout.
 
 Exposes four async tools: `dial` (returns immediately), `status`, `cancel`, `result`.
 
-## Architecture
+**REST API** -- for multi-tenant deployments:
 
-```
-┌──────────┐         ┌──────────────┐         ┌────────────┐         ┌──────┐
-│ Your Code│────────▶│ LiveKit Cloud│────────▶│ Twilio SIP │────────▶│ PSTN │
-│ (SDK/CLI)│  gRPC   │              │  agent  │            │   SIP   │      │
-└──────────┘         │ Room + Data  │ dispatch │            │          ☎   │
-                     │ Channels     │         └────────────┘         └──────┘
-                     └──────┬───────┘
-                            │
-                     ┌──────┴───────┐
-                     │ call-use     │
-                     │ worker       │
-                     │              │
-                     │ Deepgram STT │
-                     │ GPT-4o LLM   │
-                     │ OpenAI TTS   │
-                     └──────────────┘
+```python
+from call_use import create_app
+app = create_app(api_key="your-secret-key")
+# uvicorn your_module:app
 ```
 
-**Two processes:** your code dispatches a call task into a LiveKit room; the worker joins the room, dials via SIP, runs the voice conversation, and publishes the structured outcome.
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/calls` | Create outbound call |
+| `GET` | `/calls/{id}` | Get call status |
+| `POST` | `/calls/{id}/inject` | Inject context into active call |
+| `POST` | `/calls/{id}/takeover` | Human takeover |
+| `POST` | `/calls/{id}/resume` | Resume AI agent |
+| `POST` | `/calls/{id}/approve` | Approve pending action |
+| `POST` | `/calls/{id}/reject` | Reject pending action |
+| `POST` | `/calls/{id}/cancel` | Cancel call |
+
+All endpoints require an `X-API-Key` header.
 
 ## Human Takeover
 
@@ -184,30 +202,47 @@ agent = CallAgent(
 )
 ```
 
-## REST API
+## Prerequisites
 
-For multi-tenant deployments:
+call-use connects four external services into a voice AI pipeline:
 
-```python
-from call_use import create_app
-app = create_app(api_key="your-secret-key")
-# uvicorn your_module:app
+| Service | Purpose | Sign up |
+|---------|---------|---------|
+| [LiveKit](https://livekit.io/) | Real-time audio transport + agent dispatch | [Cloud](https://cloud.livekit.io/) or self-hosted |
+| [Twilio](https://www.twilio.com/) | SIP trunk for PSTN connectivity | [Console](https://console.twilio.com/) |
+| [Deepgram](https://deepgram.com/) | Speech-to-text | [Console](https://console.deepgram.com/) |
+| [OpenAI](https://openai.com/) | LLM (GPT-4o) + text-to-speech | [Platform](https://platform.openai.com/) |
+
+### Configuration
+
+Set these environment variables (or use a `.env` file):
+
+| Variable | Description |
+|----------|-------------|
+| `LIVEKIT_URL` | LiveKit server URL (`wss://...`) |
+| `LIVEKIT_API_KEY` | LiveKit API key |
+| `LIVEKIT_API_SECRET` | LiveKit API secret |
+| `SIP_TRUNK_ID` | Twilio SIP trunk ID configured in LiveKit |
+| `DEEPGRAM_API_KEY` | Deepgram API key for STT |
+| `OPENAI_API_KEY` | OpenAI API key for LLM + TTS |
+
+### Start the worker
+
+```bash
+call-use-worker start
 ```
 
-| Method | Path | Description |
-|--------|------|-------------|
-| `POST` | `/calls` | Create outbound call |
-| `GET` | `/calls/{id}` | Get call status |
-| `POST` | `/calls/{id}/inject` | Inject context into active call |
-| `POST` | `/calls/{id}/takeover` | Human takeover |
-| `POST` | `/calls/{id}/resume` | Resume AI agent |
-| `POST` | `/calls/{id}/approve` | Approve pending action |
-| `POST` | `/calls/{id}/reject` | Reject pending action |
-| `POST` | `/calls/{id}/cancel` | Cancel call |
+## Integration Examples
 
-All endpoints require an `X-API-Key` header.
+call-use works with any agent framework. Brief examples below; full code in [examples/](examples/).
+
+**OpenAI Agents SDK** -- see [examples/openai_agents.py](examples/openai_agents.py)
+
+**LangChain** -- see [examples/langchain_tool.py](examples/langchain_tool.py)
 
-## Examples
+**CrewAI** -- see [examples/crewai_integration.py](examples/crewai_integration.py)
+
+**Claude Code (MCP)** -- see [examples/claude_code_setup.md](examples/claude_code_setup.md)
 
 | Example | Description |
 |---------|-------------|
@@ -217,25 +252,10 @@ All endpoints require an `X-API-Key` header.
 | [Subscription cancellation](examples/subscription_cancellation.py) | Handle retention offers via approval flow |
 | [Multi-call workflow](examples/multi_call_workflow.py) | Chain sequential calls |
 | [Webhook integration](examples/webhook_integration.py) | FastAPI + WebSocket events |
-| [LangChain tool](examples/langchain_tool.py) | Use as a LangChain tool |
-| [OpenAI Agents](examples/openai_agents.py) | OpenAI Agents SDK integration |
-| [CrewAI](examples/crewai_integration.py) | PhoneCallTool for CrewAI |
-| [Claude Code MCP](examples/claude_code_setup.md) | MCP server setup guide |
 
 ## Documentation
 
-Full documentation at [docs.call-use.com](https://docs.call-use.com) — getting started, guides, API reference, and architecture deep-dive.
-
-## Contributing
-
-```bash
-git clone https://github.com/agent-next/call-use.git
-cd call-use
-pip install -e ".[dev]"
-make check # lint + typecheck + test (100% coverage) + build
-```
-
-See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
+Full documentation at [docs.call-use.com](https://docs.call-use.com) -- getting started, guides, API reference, and architecture deep-dive.
 
 ## Troubleshooting
 
@@ -249,14 +269,25 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
 
 ## Known Limitations
 
-- **In-memory state** — REST API call state is lost on restart; use LiveKit room metadata for recovery.
-- **Single worker** — Horizontal scaling requires a shared state backend.
-- **US/Canada only** — Outbound PSTN via Twilio SIP; international and inbound calling are planned.
+- **In-memory state** -- REST API call state is lost on restart; use LiveKit room metadata for recovery.
+- **Single worker** -- Horizontal scaling requires a shared state backend.
+- **US/Canada only** -- Outbound PSTN via Twilio SIP; international and inbound calling are planned.
 
 ## Legal Notice
 
 call-use is a developer tool for legitimate business automation. Users are solely responsible for complying with all applicable telecommunications laws including TCPA, FCC regulations on AI-generated voices ([FCC 24-17](https://www.fcc.gov/document/fcc-makes-ai-generated-voices-robocalls-illegal)), Do Not Call registry, and state recording consent laws. See [SECURITY.md](SECURITY.md) for details.
 
+## Contributing
+
+```bash
+git clone https://github.com/agent-next/call-use.git
+cd call-use
+pip install -e ".[dev]"
+make check # lint + typecheck + test (100% coverage) + build
+```
+
+See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
+
 ## License
 
 [MIT](LICENSE)
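
The call state machine that this patch adds to the README can be read as a transition table. The following is a minimal sketch of that reading, not code from call-use: `CallState`, `ALLOWED`, and `is_valid_path` are hypothetical names, and the edges are one literal interpretation of the arrows in the ASCII diagram. The real runtime may well permit more transitions (for example, resuming the agent after a takeover).

```python
from enum import Enum
from typing import Iterable


class CallState(str, Enum):
    """State names copied from the README's state-machine diagram."""
    CREATED = "created"
    DIALING = "dialing"
    RINGING = "ringing"
    CONNECTED = "connected"
    IN_IVR = "in_ivr"
    ON_HOLD = "on_hold"
    IN_CONVERSATION = "in_conversation"
    AWAITING_APPROVAL = "awaiting_approval"
    HUMAN_TAKEOVER = "human_takeover"
    ENDED = "ended"


S = CallState

# Assumed edges: only the arrows drawn in the diagram, nothing more.
ALLOWED = {
    S.CREATED: {S.DIALING},
    S.DIALING: {S.RINGING},
    S.RINGING: {S.CONNECTED},
    S.CONNECTED: {S.IN_IVR, S.ON_HOLD, S.IN_CONVERSATION},
    S.IN_IVR: {S.ENDED},
    S.ON_HOLD: {S.ENDED},
    S.IN_CONVERSATION: {S.AWAITING_APPROVAL},
    S.AWAITING_APPROVAL: {S.HUMAN_TAKEOVER, S.ENDED},
    S.HUMAN_TAKEOVER: {S.ENDED},
    S.ENDED: set(),  # terminal state
}


def is_valid_path(path: Iterable[CallState]) -> bool:
    """Return True if every consecutive pair of states is an allowed edge."""
    states = list(path)
    return all(nxt in ALLOWED[cur] for cur, nxt in zip(states, states[1:]))


# A call that reaches a human, asks for approval, and then ends.
approved_call = [
    S.CREATED, S.DIALING, S.RINGING, S.CONNECTED,
    S.IN_CONVERSATION, S.AWAITING_APPROVAL, S.ENDED,
]
print(is_valid_path(approved_call))
```

A table like this is mainly useful for checking a recorded event stream against the documented flow; it says nothing about how the worker itself stores or emits `CallEvent` data.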