PersonaForge is a reliability testing platform for voice-based conversational AI agents. It runs agents against goal-driven synthetic customer personas to catch hallucinations, missed escalations, and business-policy violations before they reach production.
Before installing PersonaForge, ensure you have the following:
- Python: Version 3.10 or higher.
- Node.js: Version 18 or higher (only required for running the Dashboard/Studio).
- Database: SQLite (built-in) or PostgreSQL.
- API Accounts:
  - ElevenLabs: Access to Conversational AI agents.
  - Gemini (Google AI Studio): For persona decisions and Judge evaluations.
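The local prerequisites above can be checked with a short script before installing. This is a minimal sketch, not part of PersonaForge: the helper names `python_ok` and `node_available` are hypothetical, and the script only confirms that a `node` executable is on the PATH, not that it is version 18 or higher.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version from the prerequisites list


def python_ok(version=None):
    """Return True if the given (or current) interpreter version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def node_available():
    """Return True if a `node` executable is on PATH (only needed for the Dashboard/Studio)."""
    return shutil.which("node") is not None


if __name__ == "__main__":
    print("Python >= 3.10:", python_ok())
    print("Node.js on PATH:", node_available())
```

To confirm the Node.js version itself, run `node --version` and compare the result by hand.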
git clone https://github.com/yourusername/personaforge.git
cd personaforge
It is highly recommended to isolate dependencies inside a virtual environment:
# Create virtual environment
python3 -m venv venv
# Activate virtual environment (macOS/Linux)
source venv/bin/activate
# Activate virtual environment (Windows)
# venv\Scripts\activate
pip install -r requirements.txt
Create a .env file in the root of your project:
# ElevenLabs API Access
ELEVENLABS_API_KEY=your_elevenlabs_api_key
# Google Gemini API Access (used for the Persona & Judge Engines)
GOOGLE_API_KEY=your_gemini_api_key
# Optional: SQLite Database Path (defaults to personaforge.db in root)
DATABASE_URL=sqlite+aiosqlite:///personaforge.db
# Optional: Redis URL (required for async task queue worker mode)
REDIS_URL=redis://localhost:6379
Scaffold the required directories and initial configuration templates by executing:
# Ensure Python is looking in the root directory for imports
export PYTHONPATH=$PYTHONPATH:.
# Initialize project layout
python3 -m personaforge.backend.app.cli.main init
This command creates the following file structure in your directory:
- personas/: Definitions for customer personas (e.g. personas/angry_customer.yaml).
- scenarios/: Test scenarios pairing personas with goals (e.g. scenarios/telecom_refund.yaml).
- policies/: Compliance and business rules (e.g. policies/refund.md).
- tests/: Directory for automated tests.
- reports/: JSON summaries generated by runs.
- artifacts/: Saved raw conversation history, LLM evaluations, and reports.
- personaforge.yaml: Main configuration settings.
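To confirm that scaffolding produced everything in the list above, a small check along these lines may help. It is an illustrative sketch: the directory and file names come from the list, but `missing_layout_entries` is a hypothetical helper and not part of the PersonaForge CLI.

```python
from pathlib import Path

# Names taken from the list above; the check itself is an illustrative helper.
EXPECTED_DIRS = ["personas", "scenarios", "policies", "tests", "reports", "artifacts"]
EXPECTED_FILES = ["personaforge.yaml"]


def missing_layout_entries(root="."):
    """Return the expected directories and files that are absent under root."""
    base = Path(root)
    missing = [f"{d}/" for d in EXPECTED_DIRS if not (base / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (base / f).is_file()]
    return missing


if __name__ == "__main__":
    gaps = missing_layout_entries()
    print("Layout complete" if not gaps else f"Missing: {', '.join(gaps)}")
```

Run it from the project root after the init step; an empty result means every listed entry exists.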
To execute a test run, you can either run a Dry-Run Simulation (mocked voices and providers to test configuration without API charges) or a Real Run (connecting directly to ElevenLabs over WebSockets).
PYTHONPATH=. python3 -m personaforge.backend.app.cli.main run scenarios/telecom_refund.yaml --count 2 --dry-run
- Open personaforge.yaml and set your ElevenLabs agent ID:
agent:
  provider: elevenlabs
  agent_id: your_actual_agent_id_here
- Start the test:
PYTHONPATH=. python3 -m personaforge.backend.app.cli.main run scenarios/telecom_refund.yaml --count 1
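Because a real run calls ElevenLabs and Gemini, it can be worth confirming that the .env file holds real keys before starting one. The sketch below is an assumption-laden helper, not part of PersonaForge: it handles only plain KEY=VALUE lines (no quoting or `export` prefixes) and treats any value starting with `your_` as an unfilled placeholder, mirroring the template shown earlier.

```python
from pathlib import Path

# The two keys the .env template marks as non-optional.
REQUIRED_KEYS = ("ELEVENLABS_API_KEY", "GOOGLE_API_KEY")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text):
    """Return required keys that are absent, empty, or still set to a placeholder."""
    values = parse_env(text)
    return [
        key for key in REQUIRED_KEYS
        if not values.get(key) or values[key].startswith("your_")  # assumed placeholder prefix
    ]


if __name__ == "__main__":
    env_file = Path(".env")
    text = env_file.read_text() if env_file.exists() else ""
    print("Missing or placeholder keys:", missing_keys(text) or "none")
```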
Once the run completes, check the summary via the CLI:
PYTHONPATH=. python3 -m personaforge.backend.app.cli.main report latest
You can also print the turn-by-turn log of a specific conversation (copy the UUID from your report output):
PYTHONPATH=. python3 -m personaforge.backend.app.cli.main replay <CONVERSATION_ID>
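If you want to drive these commands from a script (for example in CI), one option is a thin wrapper around the CLI. This is a hedged sketch: `build_run_command` and `run_cli` are hypothetical helpers, only the subcommands and flags shown above are used, and the exit-code behavior of the CLI is not documented here, so the return value should be treated as informational.

```python
import os
import shlex
import subprocess
import sys

CLI_MODULE = "personaforge.backend.app.cli.main"  # module path from the commands above


def build_run_command(scenario, count=1, dry_run=False):
    """Build the argv list for the `run` subcommand shown above."""
    cmd = [sys.executable, "-m", CLI_MODULE, "run", scenario, "--count", str(count)]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_cli(cmd, project_root="."):
    """Run a CLI command with PYTHONPATH set to the project root; return its exit code."""
    env = dict(os.environ, PYTHONPATH=project_root)
    return subprocess.run(cmd, env=env, check=False).returncode


if __name__ == "__main__":
    # Print the dry-run command instead of executing it, so the sketch is safe to try.
    cmd = build_run_command("scenarios/telecom_refund.yaml", count=2, dry_run=True)
    print(shlex.join(cmd))
```

Calling `run_cli(cmd)` from the project root would then execute the same dry run as the shell command earlier in this guide.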