Skip to content

lemonsaurus/LLM_orchestrator

Repository files navigation

LLM Orchestrator

A proof-of-concept showing how an LLM orchestrator can intelligently route requests between legacy systems and modern microservices.

❓ What This Demonstrates

When migrating from a monolithic backend to a set of microservices, you typically want to do this piece by piece following something like a Strangler Fig Pattern. This helps mitigate the risk by allowing you to move the old software piece by piece instead of all at once.

This PoC shows how an LLM can sit in the middle and figure out the best way to handle any request, automatically preferring modern services over legacy ones, and using advanced reasoning like conditional intents and multiple intents in a single message.

🏛️ Architecture

System Architecture

🚚 Moving parts

💬 Chat Interface (bot.py)

Simple terminal interface. Type your requests like you're chatting with support.

🤖 LLM Orchestrator (orchestrator.py)

The brain. Figures out what you want, decides which services to call, and when to call them.

🧠 Memory Service (memory.py)

Remembers your conversations. Stored as JSON files, one per user.

🛕 Old Backend (old_backend.py)

The legacy monolith. This will be called if the intent handler exists only here.

🚀 Microservices (microservices/)

The system will prefer to call the intent handlers in these microservices.

How It Actually Works

1. Startup

  • The orchestrator boots up and asks all services (old and new) what they can do.
  • Each service returns a list of "intents" it can handle, along with parameters and descriptions.
  • The orchestrator prefers microservice versions when there's overlap.
  • The bot service asks you to identify yourself by email.

2. You send a message

The bot passes your message to the orchestrator along with your email

3. Context Building

  • The orchestrator grabs your current conversation and recent past conversations from memory.
  • Current context is crucial, past context is just for reference.

4. Intent Classification

The orchestrator makes an LLM call with your full conversation and all available intents. The LLM responds with:

  • A list of intents to execute (in order)
  • All needed parameters for each intent
  • Optional conditions for conditional execution
  • If it's too soon to execute intents (either because we don't know what the intents are yet or because we want to gather more information), the LLM returns a response for the user designed to gather more info. Then we return to step 2.

5. Conditional Execution

  • When we have a list of intents to execute, we iterate through that list.
  • For each intent, if there's a condition attached (like "only if the order is recent"), the orchestrator makes another LLM call to check if it's true before executing, and to attach any necessary context from the previous intent handler responses to the execution of the next.
  • Intent handlers are executed sequentially, and responses are stored in memory.

6. Response Generation

After all intents finish, the orchestrator makes one final LLM call to turn all the results into a natural, helpful response for the user.

File Structure

├── 📂 conversations/             # Stored conversation JSON files
├── 📂 microservices/
│   ├── 📃 user.py                # Modern user service
│   ├── 📃 order.py               # Modern order service
│   └── 📃 shipping.py            # Modern shipping service
├── 📃 bot.py                     # Terminal interface
├── 📃 orchestrator.py            # LLM intelligence layer
├── 📃 memory.py                  # Conversation storage
├── 📃 old_backend.py             # Legacy monolith

Setup

# Install Poetry if you haven't already
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies
poetry install

# Add your OpenAI API key
echo "OPENAI_API_KEY=your_key_here" > .env

# Run the bot
poetry run python bot.py

Demo

LLM Orchestrator in action

The orchestrator automatically routes between legacy and modern services, evaluates conditions, and generates natural responses.

What this proves

LLMs can help circumnavigate complex ongoing system migrations

This PoC shows that LLMs can intelligently route between old and new systems without hardcoded logic. No routing tables, no manual mapping. The orchestrator discovers capabilities and makes smart decisions automatically.

Cross-conversation memory makes for a compelling illusion

By tracking full conversation history, the bot understands follow-up questions and references to previous topics. "What about my orders?" works because the LLM has context about who you are and what you've been discussing. This makes the conversation more engaging and saves the user from having to repeat themselves.

Sophisticated conditional execution is made easy by LLMs

Not every intent should always fire. The LLM can evaluate natural language conditions like "only if the order hasn't shipped yet" and build execution chains where later steps depend on earlier results.

Intent discovery reduces maintenance cost

Services self-describe their capabilities. Add a new microservice and the orchestrator automatically prefers it over legacy equivalents. No central registry to maintain, no deployment coordination needed.

What this does not prove

  • 🔒 This PoC has no guard rails, and will make available any feature it can access to any person. For example, you could ask for the phone number of any other user. In a production environment, this would be a huge security breach and we would need careful security guardrails and permission limitations.
  • 🚀 Making this solution performant in a production environment in a scalable way may be challenging. We need to carefully profile execution time as we build this, and minimize the number of API calls. These must be made asynchronously, and the memory service would need to behave in an idempotent manner and allow a large number of concurrent instances to access it without running into nasty race conditions.
  • 🍼 This PoC simplifies many things, like using function calls instead of API calls, file-based storage instead of a real database, it does not prove that this would work equally well with real-life production components.
  • 💸 Working with real APIs may introduce concerns like rate limiting, out of control costs, this PoC does not prove that these would not be serious concerns.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages