adrianstier/ai-agent-workflow

AI-Augmented Product Development Workflow

Build software products faster with 27 specialized, battle-tested AI agents


What's New (v1.2)

27 Specialized Agents Including Data Science Suite

New in v1.2:

  • 7 Data Science Agents (Agents 21-27): Complete ML pipeline from EDA to MLOps deployment
    • DS Orchestrator, Data Explorer, Feature Engineer, Model Architect, ML Engineer, Model Evaluator, MLOps Engineer
  • Full coverage for classification, regression, clustering, NLP, CV, and time series
  • Fairness auditing, interpretability (SHAP), and production monitoring

Previous (v1.1):

  • Debug Agents (Agents 10-16): Visual, performance, network, state, errors, memory leaks
  • Review Agents (Agents 17-20): Security, code review, database, design
  • Automated Testing Framework: JSON-based scenarios with weighted scoring (91%+ pass rates)
  • Edge Case Protocols: Vague input handling, security refusal, destructive operation safeguards

Key Improvements:

  • All agents tested with automated scenarios and edge cases
  • Added failure recovery and unrealistic scope detection protocols
  • Strong guardrails for security, destructive operations, and over-engineering

See: testing/ for the automated testing framework
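The framework's JSON scenarios and weighted scoring are only described at a high level here. As a rough sketch of what weighted scoring can look like, the TypeScript below scores one scenario as the weight-normalized share of passed checks. The `Scenario` and `Check` shapes, the field names, and the 0.9 pass threshold are assumptions made for illustration, not the actual schema used by testing/runner.js.

```typescript
// Hypothetical scenario shape; the real files in testing/scenarios/ may differ.
interface Check {
  name: string;
  weight: number; // relative importance of this check
  passed: boolean;
}

interface Scenario {
  id: string;
  checks: Check[];
}

// Weighted score in [0, 1]: weight of passed checks divided by total weight.
function weightedScore(scenario: Scenario): number {
  const total = scenario.checks.reduce((sum, c) => sum + c.weight, 0);
  if (total === 0) return 0; // no checks (or only zero weights): nothing to score
  const earned = scenario.checks
    .filter((c) => c.passed)
    .reduce((sum, c) => sum + c.weight, 0);
  return earned / total;
}

// Assumed threshold, not taken from the repository.
const PASS_THRESHOLD = 0.9;

const example: Scenario = {
  id: "vague-input",
  checks: [
    { name: "asks clarifying questions", weight: 3, passed: true },
    { name: "does not propose a solution yet", weight: 2, passed: true },
    { name: "states assumptions explicitly", weight: 1, passed: false },
  ],
};

const score = weightedScore(example); // 5 of 6 weight points earned
console.log(
  `${example.id}: ${(score * 100).toFixed(1)}%`,
  score >= PASS_THRESHOLD ? "PASS" : "FAIL"
);
```

Weighting lets a missed critical check (for example, a security refusal) cost more than a missed cosmetic one, which a plain pass/fail count would hide.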


What Is This?

This repository contains a complete, production-ready system for building software products using AI agents that emulate:

Core Development (Agents 0-9):

  • Agent 0: Project Orchestrator
  • Agent 1: Problem Framer
  • Agent 2: Competitive Mapper
  • Agent 3: Product Manager (PRD writing)
  • Agent 4: UX Designer
  • Agent 5: System Architect
  • Agent 6: Engineer
  • Agent 7: QA Test Engineer
  • Agent 8: DevOps Deployment
  • Agent 9: Analytics & Growth

Debug Suite (Agents 10-16):

  • Agent 10: Debug Detective (triage)
  • Agent 11: Visual Debug Specialist
  • Agent 12: Performance Profiler
  • Agent 13: Network Inspector
  • Agent 14: State Debugger
  • Agent 15: Error Tracker
  • Agent 16: Memory Leak Hunter

Review & Specialized (Agents 17-20):

  • Agent 17: Security Auditor
  • Agent 18: Code Reviewer
  • Agent 19: Database Engineer
  • Agent 20: Design Reviewer

Data Science Suite (Agents 21-27):

  • Agent 21: DS Orchestrator (coordinates ML projects)
  • Agent 22: Data Explorer (EDA, profiling)
  • Agent 23: Feature Engineer (features, encoding)
  • Agent 24: Model Architect (model selection, architecture)
  • Agent 25: ML Engineer (training, optimization)
  • Agent 26: Model Evaluator (evaluation, fairness, interpretability)
  • Agent 27: MLOps Engineer (deployment, monitoring)

Plus: A full-stack web dashboard to orchestrate everything!


Installation

One-liner (recommended):

curl -fsSL https://raw.githubusercontent.com/yourusername/ai-agent-workflow/main/scripts/install.sh | bash

Then create a project:

agent-init my-project
cd my-project && claude

Other methods: See INSTALL.md for git submodule, npm, and more options.


Quick Links

🎨 Interactive Visualization

Open the Interactive Agent Map - Explore all 27 agents in a visual, interactive constellation map with animated workflows.

📚 For Users (Just Want to Build Products)

| Document | Purpose | Time |
|---|---|---|
| INSTALL.md | Installation options | 5 min |
| docs/CLAUDE_CODE_GUIDE.md | Use with Claude Code (CLI) | 10 min |
| QUICK_START.md | Manual agent usage | 5 min |
| CHEAT_SHEET.md | One-page quick reference | 2 min |
| agents/README.md | How to use each agent | 15 min |

🛠️ For Developers (Want to Build the Dashboard)

| Document | Purpose | Time |
|---|---|---|
| dashboard/GETTING_STARTED.md | Dashboard overview | 10 min |
| dashboard/QUICK_START_DASHBOARD.md | Set up in 10 minutes | 10 min |
| dashboard/ARCHITECTURE.md | System design | 30 min |
| dashboard/IMPLEMENTATION_ROADMAP.md | Week-by-week build guide | 1 hour |

📊 Testing & Quality

| Document | Purpose | Time |
|---|---|---|
| testing/ | Automated testing framework | 15 min |
| AGENT_OPTIMIZATION_SUMMARY.md | What was optimized & why | 15 min |

🔗 Project Integration

| Document | Purpose | Time |
|---|---|---|
| docs/INTEGRATION_GUIDE_REIMBURSEMENT.md | Example integration with ClearConcur | 10 min |
| docs/CLEARCONCUR_QUICK_START.md | Copy-paste prompts for ClearConcur | 5 min |

Project Structure

ai-agent-workflow/
│
├── 📘 Documentation
│   ├── README.md                      # This file
│   ├── QUICK_START.md                 # 5-minute guide
│   ├── CHEAT_SHEET.md                 # Quick reference
│   └── AGENT_OPTIMIZATION_SUMMARY.md  # Optimization details
│
├── 🤖 Agents (27 Ready-to-Use Prompts)
│   ├── agent-0 to agent-9             # Core development agents
│   ├── agent-10 to agent-16           # Debug suite agents
│   ├── agent-17 to agent-20           # Review & specialized agents
│   ├── agent-21 to agent-27           # Data science suite agents
│   ├── README.md                      # Agent usage guide
│   ├── DEBUG-AGENTS-README.md         # Debug agent guide
│   └── DATA-SCIENCE-AGENTS-README.md  # Data science agent guide
│
├── 🚀 Claude Code Integration
│   ├── templates/CLAUDE.md.template   # Project template for Claude Code
│   ├── scripts/init-project.sh        # Initialize new projects
│   ├── scripts/add-to-project.sh      # Add to existing projects
│   ├── scripts/install.sh             # One-liner installer
│   ├── docs/CLAUDE_CODE_GUIDE.md      # Full Claude Code guide
│   └── INSTALL.md                     # All installation options
│
├── 🧪 Testing
│   ├── scenarios/                     # JSON test scenarios
│   ├── runner.js                      # Automated test runner
│   └── README.md                      # Testing guide
│
├── 📁 docs/ (Integration Guides)
│   ├── CLAUDE_CODE_GUIDE.md                # Claude Code setup guide
│   ├── INTEGRATION_GUIDE_REIMBURSEMENT.md  # ClearConcur example
│   ├── CLEARCONCUR_QUICK_START.md          # Copy-paste prompts
│   └── CLEARCONCUR_CLAUDE_ADDITION.md      # CLAUDE.md additions
│
└── 💻 Dashboard (Full-Stack App - Optional)
    ├── backend/                       # Express + Prisma + LangGraph
    ├── frontend/                      # Next.js + TypeScript
    └── docker-compose.yml             # PostgreSQL + Redis

Three Paths to Choose From

Path 1: Claude Code + Orchestrator-Driven Mode (Recommended)

Time: 5 minutes to set up, then seamless workflow

How it works:

  1. Create a project folder with CLAUDE.md from our template
  2. Start Claude Code in your project directory
  3. Tell the Orchestrator what you want to build
  4. Agent 0 drives everything - selecting agents, executing tasks, and only asking you key questions
  5. Artifacts are saved automatically

Best for:

  • Maximum efficiency - minimal context-switching
  • Developers using Claude Code (CLI)
  • Solo builders who want AI to drive the process
  • Projects where you want to focus on decisions, not prompts

Features:

  • Autonomous agent selection and execution
  • Only interrupts for key decisions
  • Automatic artifact management
  • Flow control ("speed up", "slow down", "skip", "go back")

Cost: ~$3-7 in API calls for complete workflow

Start: docs/CLAUDE_CODE_GUIDE.md


Path 2: Use the Agents Manually (Copy-Paste)

Time: 5 minutes to start, 3-4 hours for full v0.1 workflow

How it works:

  1. Open agents/agent-0-orchestrator.md
  2. Copy the prompt to Claude/ChatGPT
  3. Follow the agent's recommendations
  4. Work through all 20+ agents as needed
  5. Save artifacts manually as you go

Best for:

  • Using Claude web UI, ChatGPT, or other LLMs
  • Testing the workflow before committing
  • Projects without coding needs (just planning)
  • Maximum control over each step

Cost: ~$3-4 in API calls for complete workflow


Path 3: Build the Dashboard (Full-Stack)

Time: 4-5 weeks to full implementation

What you get:

  • Beautiful web UI for managing projects
  • Real-time agent execution
  • Automatic artifact management
  • Cost tracking & analytics
  • Multi-project support
  • WebSocket-powered live updates

Best for:

  • Developers who want to customize
  • Building this as a SaaS product
  • Teams who need collaboration features

Cost: ~$35-95/month (infrastructure + API)

Start: dashboard/GETTING_STARTED.md


Agent Performance Summary

Scored Agents (Core Development 0-9, plus Agent 19)

| Agent | Score | Key Strength |
|---|---|---|
| Agent 0 (Orchestrator) | 91% | Failure recovery, scope detection |
| Agent 1 (Problem Framer) | 92% | Vague input handling, solution detection |
| Agent 3 (Product Manager) | 91% | Over-engineering prevention |
| Agent 5 (Architect) | 90% | Monolith-first, boring tech |
| Agent 6 (Engineer) | 91% | Security refusal, conflict detection |
| Agent 7 (QA) | 92% | Comprehensive test strategies |
| Agent 19 (Database) | 91% | Destructive operation safeguards |

Debug Suite (10-16)

| Agent | Purpose |
|---|---|
| Agent 10 | Debug triage and routing |
| Agent 11 | Visual/CSS debugging |
| Agent 12 | Performance profiling |
| Agent 13 | Network/API debugging |
| Agent 14 | State management debugging |
| Agent 15 | Error tracking |
| Agent 16 | Memory leak detection |

Review Agents (17-20)

| Agent | Purpose |
|---|---|
| Agent 17 | Security auditing |
| Agent 18 | Code review |
| Agent 19 | Database migrations & optimization |
| Agent 20 | Design system review |

Data Science Suite (21-27)

| Agent | Purpose |
|---|---|
| Agent 21 | DS Orchestrator - coordinates ML projects |
| Agent 22 | Data Explorer - EDA, profiling, quality |
| Agent 23 | Feature Engineer - features, encoding, selection |
| Agent 24 | Model Architect - model selection, architecture |
| Agent 25 | ML Engineer - training, hyperparameter tuning |
| Agent 26 | Model Evaluator - evaluation, fairness, interpretability |
| Agent 27 | MLOps Engineer - deployment, monitoring, retraining |

Overall: 91%+ pass rate on automated testing


What You'll Build

Example: Literature review app for PhD students

Input (to Agent 1)

"I want to build a tool to help PhD students manage literature reviews"

Constraints:
- Timeline: 4 weeks
- Budget: $0
- Solo builder
- Tech: TypeScript

Output (from all 10 agents)

artifacts/
├── problem-brief-v0.1.md          ✅ Clear problem, personas, JTBD
├── competitive-analysis-v0.1.md   ✅ 5 competitors analyzed, wedge strategy
├── prd-v0.1.md                    ✅ 5 MUST features (not 15!)
├── ux-flows-v0.1.md               ✅ User journeys, wireframes
├── architecture-v0.1.md           ✅ Next.js + PostgreSQL (simple!)
├── code/                          ✅ Implementation guidance
├── test-plan-v0.1.md              ✅ Unit, integration, E2E tests
├── deployment-plan-v0.1.md        ✅ Vercel + Neon setup
└── analytics-plan-v0.1.md         ✅ 5 critical events to track

Result: Complete product specification, ready to build!

Time: 3-4 hours

Cost: $3-4


Key Improvements (v1.0)

🎯 Scope Control (Biggest Win!)

Before:

  • Agent 3 suggested 12-15 MUST features
  • Agent 5 recommended microservices + Redis + caching
  • 6-8 hours of revisions

After:

  • Agent 3 limited to 5-8 MUST features βœ…
  • Agent 5 recommends monoliths + boring tech βœ…
  • 3-4 hours total workflow βœ…

Impact: 50% scope reduction while maintaining value!

πŸ—οΈ Anti-Over-Engineering

Agent 5 (System Architect) now has strong guardrails:

❌ NO microservices for v0.1 ❌ NO Redis/caching for simple CRUD ❌ NO background jobs unless > 30 seconds ❌ NO Elasticsearch (PostgreSQL FTS is fine) ❌ NO custom auth (use managed services)

βœ… YES to boring, proven tech βœ… YES to monoliths βœ… YES to managed services βœ… YES to one-command deploys
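Since the agents are prompts, these guardrails presumably live in Agent 5's instructions rather than in code. To make the rules concrete, the sketch below expresses the same "NO" rules as a small lint over a proposed stack. The `Proposal` shape, its field names, the component labels, and `checkGuardrails` are all hypothetical and are not part of this repository.

```typescript
// Hypothetical description of an architecture proposal for a v0.1 product.
interface Proposal {
  components: string[];       // e.g. ["nextjs", "postgresql", "redis"]
  simpleCrud: boolean;        // true if the app is plain create/read/update/delete
  longestTaskSeconds: number; // longest server-side task the app must run
  customAuth: boolean;        // true if auth is hand-rolled instead of managed
}

// Return one message per guardrail the proposal violates.
function checkGuardrails(p: Proposal): string[] {
  const has = (name: string): boolean => p.components.indexOf(name) !== -1;
  const violations: string[] = [];
  if (has("microservices")) violations.push("No microservices for v0.1");
  if (has("redis") && p.simpleCrud) violations.push("No Redis/caching for simple CRUD");
  if (has("background-jobs") && p.longestTaskSeconds <= 30) {
    violations.push("No background jobs unless a task exceeds 30 seconds");
  }
  if (has("elasticsearch")) violations.push("No Elasticsearch (PostgreSQL FTS is fine)");
  if (p.customAuth) violations.push("No custom auth (use managed services)");
  return violations;
}

const proposal: Proposal = {
  components: ["nextjs", "postgresql", "redis", "elasticsearch"],
  simpleCrud: true,
  longestTaskSeconds: 5,
  customAuth: false,
};

// Flags the Redis and Elasticsearch rules for this proposal.
console.log(checkGuardrails(proposal));
```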

📊 Better Outputs

Improvements across all agents:

  • ✅ Testable acceptance criteria (was vague)
  • ✅ Specific, measurable success metrics
  • ✅ Consistent terminology across agents
  • ✅ Actionable recommendations
  • ✅ Realistic scope for solo builders

Quick Start (Path 2: Manual)

1. Copy Agent 0 Prompt

cat agents/agent-0-orchestrator.md

2. Paste into Claude/ChatGPT

Add your project idea:

I want to build [YOUR IDEA].

Target users: [WHO]
Main problem: [WHAT]
Constraints:
- Timeline: [X weeks]
- Budget: [$X/month]
- Tech: [preferences]

What should I do first?

3. Follow the Agent's Recommendations

Agent 0 will tell you to run Agent 1 (Problem Framer) next.

4. Continue Through Agents as Needed

Core Development Flow:

  • Agent 1 → Problem Brief
  • Agent 2 → Competitive Analysis
  • Agent 3 → PRD
  • Agent 4 → UX Flows
  • Agent 5 → Architecture
  • Agent 6 → Code
  • Agent 7 → Tests
  • Agent 8 → Deployment
  • Agent 9 → Analytics

When Debugging:

  • Agent 10 → Triage → Route to Agents 11-16
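
The triage step can be pictured as a lookup from issue category to specialist. The sketch below is illustrative only: the category labels and the `routeDebugIssue` function are hypothetical, while the agent numbers and names come from the Debug Suite list in this README. Because Agent 10 is a prompt, the real routing is done by the model rather than by code like this.

```typescript
// Category labels are hypothetical; agent numbers follow the Debug Suite list.
type IssueCategory = "visual" | "performance" | "network" | "state" | "error" | "memory";

const DEBUG_ROUTES: Record<IssueCategory, { agent: number; name: string }> = {
  visual: { agent: 11, name: "Visual Debug Specialist" },
  performance: { agent: 12, name: "Performance Profiler" },
  network: { agent: 13, name: "Network Inspector" },
  state: { agent: 14, name: "State Debugger" },
  error: { agent: 15, name: "Error Tracker" },
  memory: { agent: 16, name: "Memory Leak Hunter" },
};

// Return the specialist agent for a category chosen during triage.
function routeDebugIssue(category: IssueCategory): string {
  const route = DEBUG_ROUTES[category];
  return `Agent ${route.agent}: ${route.name}`;
}

console.log(routeDebugIssue("network")); // Agent 13: Network Inspector
```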

For Reviews:

  • Agent 17 → Security Audit
  • Agent 18 → Code Review
  • Agent 19 → Database Changes
  • Agent 20 → Design Review

5. Save Artifacts

mkdir -p my-project/artifacts
# Save each agent's output as you go
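
To keep file names consistent with the example artifacts listing in "What You'll Build", a small helper can derive each path from the agent that produced the output. This is a sketch under assumptions: the agent-to-file pairing is inferred from the order of that listing and the core development flow, and `artifactPath` is a hypothetical helper, not one of the repository's scripts.

```typescript
// Base names come from the example artifacts listing; the pairing with agent
// numbers is inferred. Agent 6 is omitted because its output goes to code/.
const ARTIFACT_NAMES: Record<number, string> = {
  1: "problem-brief",
  2: "competitive-analysis",
  3: "prd",
  4: "ux-flows",
  5: "architecture",
  7: "test-plan",
  8: "deployment-plan",
  9: "analytics-plan",
};

// Build a versioned path such as my-project/artifacts/prd-v0.1.md.
function artifactPath(project: string, agent: number, version: string): string {
  const base = ARTIFACT_NAMES[agent];
  if (base === undefined) {
    throw new Error(`No artifact name known for agent ${agent}`);
  }
  return `${project}/artifacts/${base}-v${version}.md`;
}

console.log(artifactPath("my-project", 3, "0.1")); // my-project/artifacts/prd-v0.1.md
```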

Testing Highlights

Test Scenario

  • Project: Literature review app for PhD students
  • Input: Vague idea + constraints
  • Process: All 10 agents sequentially
  • Result: Complete v0.1 specification

Results

| Metric | Score | Notes |
|---|---|---|
| Clarity | 4.5/5 | Clear, unambiguous outputs |
| Completeness | 4.6/5 | All required sections present |
| Actionability | 4.7/5 | Immediately usable |
| Scope | 4.3/5 | Realistic for solo builder |
| Consistency | 4.5/5 | Terminology aligned |
| Overall | 4.5/5 | ⭐ Production Ready |
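
The overall figure is consistent with an unweighted mean of the five metric scores. The README does not say how it was actually computed, so treat the averaging below as an assumption; `meanScore` is an illustrative helper.

```typescript
// Metric scores copied from the results table (each out of 5).
const metricScores: { metric: string; score: number }[] = [
  { metric: "Clarity", score: 4.5 },
  { metric: "Completeness", score: 4.6 },
  { metric: "Actionability", score: 4.7 },
  { metric: "Scope", score: 4.3 },
  { metric: "Consistency", score: 4.5 },
];

// Unweighted mean (an assumption about how "Overall" was derived).
function meanScore(rows: { metric: string; score: number }[]): number {
  if (rows.length === 0) return 0; // avoid dividing by zero
  return rows.reduce((sum, r) => sum + r.score, 0) / rows.length;
}

const overall = meanScore(metricScores);
console.log(overall.toFixed(2)); // 4.52, which rounds to the reported 4.5
```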

Cost & Time

  • Time: 3-4 hours (down from 6-8)
  • Cost: $3-4 (down from $4-6)
  • Revisions: 1-2 cycles (down from 3-4)

Improvement: ~40% faster, ~25% cheaper


Use Cases

✅ What This System Is Great For

  • SaaS Products: Build and ship web applications
  • Internal Tools: Create tools for your team/lab
  • Research Apps: Specialized domain tools
  • Side Projects: Solo builder projects
  • Hackathons: Rapid prototyping
  • MVPs: Validate ideas quickly
  • Learning: Understand product development

⚠️ What This System Is NOT

  • ❌ Not a no-code builder (you still need to code)
  • ❌ Not AI that writes all your code (Agent 6 guides, you implement)
  • ❌ Not for large teams (optimized for solo builders)
  • ❌ Not for enterprise-scale (optimized for 10-1000 users)

FAQ

Q: Do I need to know how to code?
A: For Agents 1-5 (planning), no. For Agents 6-8 (implementation), yes.

Q: Can I customize the agents?
A: Yes! Edit the markdown files in agents/

Q: How much does it cost?
A: Manual: ~$3-4 per project. Dashboard: ~$35-95/month.

Q: Can I use other LLMs besides Claude?
A: Yes, the prompts work with GPT-4, Gemini, etc.

Q: Is my data private?
A: Yes. If using Claude API directly, your data isn't used for training.

Q: Can multiple people use this?
A: Manual: yes (share artifacts). Dashboard: planned for v1.0.

Q: What if an agent makes a mistake?
A: Edit the artifact manually or re-run with different inputs.

Q: How long does a full workflow take?
A: 3-4 hours for all 10 agents (manual prompting).


Contributing

Improvements welcome! Areas of interest:

  • 🧪 More test scenarios (different domains)
  • 🌍 Translations (agents in other languages)
  • 🔧 Integration code (LangGraph, CrewAI examples)
  • 📱 Mobile app agents
  • 🤖 New specialized agents

To contribute:

  1. Fork the repository
  2. Make your changes
  3. Test with real projects
  4. Submit a pull request

License

MIT License - see LICENSE file


Credits

Created by: Adrian C. Stier

Built with:

Inspiration:

  • Jobs-to-be-Done framework
  • Lean Startup methodology
  • Agile development
  • Shape Up (Basecamp)

What's Next?

Immediate Actions

  1. Try the agents: QUICK_START.md
  2. Read the optimization summary: AGENT_OPTIMIZATION_SUMMARY.md
  3. Build something!

Roadmap

v1.1 (Next):

  • Agent performance monitoring
  • More test scenarios
  • Video tutorials
  • Community examples

v2.0 (Future):

  • Team collaboration features
  • Custom agent marketplace
  • Mobile app support
  • Advanced analytics

Support


Ready to build products with AI agents? 🚀

Start here: QUICK_START.md
