ReAct Agent API

Overview

This project implements a ReAct (Reason + Act) agent using FastAPI and LangChain. Beyond a standard conversational bot, the agent runs a chain of middleware that summarizes long contexts, falls back to a backup model when the primary one fails, and tracks task lists for multi-step requests.

The service is stateful and memory-aware while it runs, but nothing survives a restart. All conversation history and task state are stored exclusively in RAM (via MemorySaver) and are lost when the server restarts, so the application writes no conversation data to disk and every development run starts from a clean state.

Workflow Architecture

The request processing pipeline consists of the following layers:

1. API Entry Point: Receives the user query and session metadata via FastAPI.
2. Middleware Layer:
   - Summarization: Analyzes token usage. If the history exceeds 4000 tokens, it compresses the context using a lightweight model (gpt-oss-20b).
   - Todo List: Extracts and manages sub-tasks for complex queries.
3. Reasoning Node: The primary LLM (llama-3.3-70b) analyzes the context and decides whether to act or answer.
4. Tool Execution Node: Executes external actions (Calculator or Web Search) if requested by the reasoning node.
5. Retry & Fallback Layer:
   - Tool Retry: Automatically retries failed tool calls up to 3 times with backoff.
   - Model Fallback: Switches to a larger model (gpt-oss-120b) if the primary model fails.
6. Response Formatting: Synthesizes the final answer and usage metadata into a structured JSON.
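
The model fallback step in the pipeline above can be illustrated with a small, library-free sketch. This is not the project's middleware code: `call_with_fallback`, `primary`, and `backup` are hypothetical names, and each "model" is simply a callable that maps a prompt to text.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a "model" is any callable that maps a prompt to text.
Model = Callable[[str], str]


def call_with_fallback(prompt: str, models: Sequence[Model]) -> str:
    """Try each model in order and return the first successful answer."""
    last_error: Optional[Exception] = None
    for model in models:
        try:
            return model(prompt)
        except Exception as exc:  # a real service would catch narrower errors
            last_error = exc
    raise RuntimeError("all models failed") from last_error


def primary(prompt: str) -> str:
    # Stand-in for the primary model; always fails so the fallback is exercised.
    raise TimeoutError("primary model unavailable")


def backup(prompt: str) -> str:
    # Stand-in for the backup model.
    return f"backup answer to: {prompt}"


answer = call_with_fallback("What is 2 + 2?", [primary, backup])
print(answer)
```

Ordering the models in a list keeps the policy explicit: the first entry is tried first, and later entries are only reached after a failure.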

Core Features

1. Middleware Orchestration

The agent utilizes a chain of middleware components to enhance reliability and context management:

Summarization Middleware: Automatically prevents context window overflow by summarizing conversation history once it passes a threshold (4000 tokens), retaining only the last 20 messages verbatim.
Model Fallback Middleware: Provides high availability by seamlessly switching to a backup model (openai/gpt-oss-120b) if the primary inference engine encounters errors.
Todo List Middleware: Maintains an internal state of pending tasks, allowing the agent to break down complex user requests into manageable steps.
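
As a rough illustration of the summarization rule described above (summarize past 4000 tokens, keep the last 20 messages verbatim), here is a minimal sketch. It is not the middleware's actual implementation: `compact_history` and `rough_token_count` are hypothetical helpers, the 4-characters-per-token estimate is an assumption, and the `summarize` callable stands in for a call to the lightweight model.

```python
from typing import Callable, List

TOKEN_THRESHOLD = 4000  # from the README: summarize once history passes this
KEEP_LAST = 20          # from the README: messages retained verbatim


def rough_token_count(messages: List[str]) -> int:
    """Crude estimate (about 4 characters per token); not a real tokenizer."""
    return sum(len(m) for m in messages) // 4


def compact_history(
    messages: List[str],
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Return the history unchanged if it is small enough; otherwise
    replace everything but the last KEEP_LAST messages with one summary."""
    if rough_token_count(messages) <= TOKEN_THRESHOLD or len(messages) <= KEEP_LAST:
        return messages
    older, recent = messages[:-KEEP_LAST], messages[-KEEP_LAST:]
    return [summarize(older)] + recent


# 30 long messages push the estimate well past the threshold.
history = [f"message {i}: " + "x" * 1000 for i in range(30)]
compacted = compact_history(
    history,
    lambda older: f"[summary of {len(older)} messages]",  # stand-in summarizer
)
print(len(compacted), compacted[0])
```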

2. Tool Integration & Retry

To bridge the gap between language modeling and factual accuracy, the system integrates specific tools:

Tavily Search: Used for retrieving real-time information, news, and facts from the web.
Numexpr Calculator: Used for precise mathematical evaluations, eliminating LLM arithmetic hallucinations.
Auto-Retry: If a tool fails (e.g., API timeout or syntax error), the ToolRetryMiddleware intercepts the error and retries the operation with exponential backoff.
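
The retry behaviour can be sketched in plain Python. This is an illustration of exponential backoff in general, not the ToolRetryMiddleware source: `retry_with_backoff` is a hypothetical helper, the 0.5 second base delay is an assumed value, and the limit of 3 retries is taken from the pipeline description.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,        # from the README: up to 3 retries
    base_delay: float = 0.5,     # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func; on failure, retry up to max_retries times,
    doubling the wait before each new attempt."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_tool() -> str:
    # Simulated tool that times out twice before succeeding.
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated API timeout")
    return "ok"


delays: List[float] = []
# Record the delays instead of actually sleeping.
result = retry_with_backoff(flaky_tool, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the helper easy to test, since the waits can be recorded rather than performed.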

3. Structured Communication

Unlike simple text streams, the API enforces strict data contracts:

Input: Requires a query and a thread_id for session continuity.
Output: Returns a ResponseFormat object containing the final answer and a list of specific tools used during the generation process.
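
A minimal sketch of how such a contract might look in Pydantic follows. The field names `query`, `thread_id`, `response`, and `tool_usage` come from the API documentation in this README; the class name `ChatRequest` and the empty-list default for `tool_usage` are assumptions made for illustration.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class ChatRequest(BaseModel):  # hypothetical name for the input model
    query: str
    thread_id: str


class ResponseFormat(BaseModel):
    response: str
    # Assumed default: an empty list when no tool was called.
    tool_usage: List[str] = Field(default_factory=list)


request = ChatRequest(query="What is 12 * 7?", thread_id="session_001")
reply = ResponseFormat(response="12 * 7 = 84", tool_usage=["calculator"])

# A body without thread_id is rejected before it reaches the agent.
try:
    ChatRequest(query="missing thread id")
    rejected = False
except ValidationError:
    rejected = True
```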

Technical Stack

API Framework: FastAPI
Orchestration: LangChain, LangGraph
LLM Inference: Groq Cloud (Llama 3.3 70B, GPT-OSS variants)
External Search: Tavily AI Search
Math Engine: Numexpr
Validation: Pydantic

Installation and Setup

Prerequisites

Python 3.10 or higher
API keys for Google, Groq, and Tavily

1. Clone the Repository

git clone https://github.com/your-username/react-agent-api.git
cd react-agent-api

2. Create Virtual Environment

python -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate   # Windows

3. Install Dependencies

pip install -r requirements.txt

4. Environment Configuration

Create a .env file in the project root with the following variables:

GOOGLE_API_KEY=your_google_api_key_here
GROQ_API_KEY=your_groq_api_key_here
TAVILY_API_KEY=your_tavily_api_key_here

# Optional: LangSmith Tracing (useful for debugging the graph)
LANGCHAIN_TRACING_V2=true
LANGSMITH_ENDPOINT=https://eu.api.smith.langchain.com
LANGCHAIN_API_KEY=your_langchain_api_key_here
LANGCHAIN_PROJECT="ReAct-Agent"

5. Run the Server

python main.py

The API will start at http://127.0.0.1:8000. Interactive API documentation (Swagger UI) is available at http://127.0.0.1:8000/docs.

API Documentation

GET /

Simple health check endpoint to verify the service status.

Response: JSON with status, service name, and docs link.

POST /api/chat

The primary interface for interaction. Triggers the Agent workflow with middleware support.

JSON Body:
- query: The user's question.
- thread_id: A unique string identifier for the user session (e.g., "session_001").
Response:
- response: The generated text answer.
- tool_usage: A list of strings indicating which tools were utilized (e.g., ['tavily_search', 'calculator']).
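
A client call might look like the following sketch, which uses only the standard library. The helper names `build_chat_request` and `send` are hypothetical, and the base URL assumes the default address from "Run the Server".

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default address from "Run the Server"


def build_chat_request(query: str, thread_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/chat request with a JSON body."""
    body = json.dumps({"query": query, "thread_id": thread_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (server must be running)."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_chat_request("What is the square root of 1764?", "session_001")

# With the server running:
# reply = send(req)
# print(reply["response"], reply["tool_usage"])
```

Reusing the same thread_id across calls is what lets the agent continue an existing session.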

Configuration: Changing LLM Models

The project is currently configured to use Groq for high-speed inference. It uses different models for Chat, Summarization, and Fallback scenarios.

Where to Modify

Model definitions are located in app/services/agent.py.

How to Switch Models (Groq)

To change the specific Llama or GPT-OSS version, update the model parameter in the get_agent function:

# app/services/agent.py

# 1. Main Chat Model
chat_llm = ChatGroq(
    model="llama-3.3-70b-versatile", # Change to desired model
    temperature=0,
    api_key=settings.GROQ_API_KEY
)

# 2. Summarization Model
summarization_llm = ChatGroq(
    model="mixtral-8x7b-32768", # Example of changing model
    temperature=0,
    api_key=settings.GROQ_API_KEY
)

How to Switch Providers (e.g., to OpenAI)

Since the project uses LangChain, switching providers requires minimal code changes.

Install the provider package:

pip install langchain-openai

Update imports and initialization in app/services/agent.py:

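from langchain_openai import ChatOpenAI

# Replace ChatGroq with ChatOpenAI
chat_llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0,
    # Hypothetical settings field, mirroring settings.GROQ_API_KEY above;
    # add OPENAI_API_KEY to your settings and .env rather than hardcoding it.
    api_key=settings.OPENAI_API_KEY
)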

Note: Ensure you update the .env file with the necessary API keys for the new provider.

Data Privacy and Ephemeral Storage

This application operates in Ephemeral Mode.

In-Memory MemorySaver: The agent's memory (checkpoints) is initialized using MemorySaver(). Conversations exist only in RAM.
Server Restart: Terminating the process or restarting the server (uvicorn) permanently erases all conversation history, todo lists, and session data. The application itself therefore leaves no conversation data on disk.
