The Image Generation Council is an advanced AI agentic workflow system designed to optimize the creation of AI-generated images. Inspired by the "LLM Council" architecture, this project orchestrates multiple specialized AI agents to collaborate, critique, and refine image generation prompts, simulating a professional creative studio environment.
The system operates as a sequential pipeline of "Agents" or "Stages," where specialized models perform distinct roles to ensure high-quality output. It leverages Large Language Models (LLMs) for reasoning and creativity, and Vision Language Models (VLMs) for visual feedback.
The workflow consists of four distinct stages:
-
Stage 1: Prompt Engineering (The Text Council)
- Role: Expert Prompt Engineers.
- Function: Several LLMs receive the user's high-level concept and independently draft detailed, chemically-optimized prompts (focusing on lighting, style, composition, and mood).
- Goal: To translate a vague user request into a precise technical specification.
-
Stage 2: Image Generation (The Artists)
- Role: Digital Artists.
- Function: State-of-the-art image generation models (e.g., Stable Diffusion XL, DALL-E 3) interpret the prompts from Stage 1 to render visual candidates.
- Goal: To visualize the refined concepts.
-
Stage 3: Vision Critique (The Critics)
- Role: Art Critics and Quality Assurance.
- Function: Vision-enabled models analyze the generated images against the original user request. They evaluate adherence to the prompt, visual fidelity, artifacts, and aesthetic quality.
- Goal: To provide objective, visual feedback on the results.
-
Stage 4: Protocol Synthesis (The Chairman)
- Role: The Chairman / Creative Director.
- Function: A highly capable LLM reviews the entire history: original request, generated images, and critic reviews. It selects the winning candidate, explains the decision, and offers a final recommendation.
- Goal: To provide the user with the best result and actionable insight.
The project is built using a modern full-stack architecture:
- Backend: Python 3.12+, FastAPI (Async Web Framework).
- Frontend: React 19, Vite (Modern Frontend Build Tool).
- AI Inference:
- Primary: Local LLM Support via LM Studio (OpenAPI compatible).
- Secondary: Cloud Support via OpenRouter (Configurable).
- Python 3.12+ (with
uvorpipfor package management). - Node.js 18+ (for frontend).
- LM Studio (for local offline inference).
-
Clone the Repository
git clone https://github.com/yourusername/image-generation-council.git cd image-generation-council -
Backend Setup Navigate to the root directory and install Python dependencies.
# Using uv (Recommended) uv sync # Or using standard pip pip install -r requirements.txt
-
Frontend Setup Navigate to the
frontenddirectory.cd frontend npm install
By default, the application is configured to run locally using LM Studio to ensure privacy and eliminate API costs.
- Install LM Studio: Download and install from the official website.
- Download a Model: Inside LM Studio, search for and download a model (Recommended:
Microsoft Phi-3 MiniorLlama 3 8B). - Load the Model: Go to the "Local Server" tab (
< >icon) and select the loaded model. - Start Server:
- Set Port to
1234. - Enable CORS (Cross-Origin Resource Sharing).
- Click Start Server.
- Set Port to
-
Start the Application We provide a unified launch script for convenience.
# Windows run_app.batAlternatively, run services manually:
- Backend:
uv run python -m backend.main(Port 8002) - Frontend:
npm run dev(Port 5173)
- Backend:
-
Access the Interface Open your browser to
http://localhost:5173. -
Create a Council
- Click New Conversation.
- Enter your image concept (e.g., "A cyberpunk robot drinking wine").
- Watch as the Council deliberates, generates, and critiques your request in real-time.
You can customize the models used by modifying backend/config.py:
CHAIRMAN_MODEL: The model used for final synthesis.TEXT_COUNCIL_MODELS: List of models for prompt engineering.VISION_CRITIC_MODELS: List of vision-capable models for critique.OPENROUTER_API_URL: Point tohttps://openrouter.ai/api/v1to use cloud models instead of localhost.
- "Failed to allocate buffer for kv cache": Your local model is too large for your RAM. In LM Studio, lower the "Context Length" to 2048 or use a smaller model like Phi-3 Mini.
- Connection Error: Ensure LM Studio server is running on port 1234.