Web Search Agent

A modular Python AI agent that:

- Detects user intent (general question, launch/date calculation, latest news)
- Retrieves web context with Tavily
- Builds intent-aware prompts
- Streams responses from Google Gemini
- Extracts dates from snippets and computes day differences for launch/date questions
- Preserves conversational context to resolve follow-up references like "it"

Project Structure

web-search-agent/
  main.py
  README.md
  .env
  requirements.txt
  config/
    settings.py
  agent/
    web_agent.py
  services/
    context_service.py
    intent_service.py
    llm_service.py
    search_service.py
  utils/
    date_utils.py
    formatter.py
  models/
    schemas.py
  prompts/
    prompts.py

Setup Instructions

1. Prerequisites

Python 3.10+
Internet access (for Tavily and Gemini APIs)

2. Open the correct folder

From your workspace root:

cd web-search-agent

3. Install dependencies

pip install -r requirements.txt

4. Configure environment variables

Create/update .env in web-search-agent:

GOOGLE_API_KEY=your_google_api_key
TAVILY_API_KEY=your_tavily_api_key
GOOGLE_MODEL=gemini-2.5-flash

Notes:

GOOGLE_MODEL is optional; default is gemini-2.5-flash.
If the configured model is unavailable, the app falls back to gemini-2.0-flash.
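As an illustration of the variables above, here is a minimal sketch of environment-backed settings loading. The `Settings` dataclass and `load_settings` helper are hypothetical names chosen for this example; the actual code in config/settings.py may be organized differently. In the app, python-dotenv's `load_dotenv()` would copy .env into the process environment before a loader like this runs.

```python
import os
from dataclasses import dataclass

DEFAULT_MODEL = "gemini-2.5-flash"  # documented default for GOOGLE_MODEL
REQUIRED_KEYS = ("GOOGLE_API_KEY", "TAVILY_API_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container (not taken from the project)."""
    google_api_key: str
    tavily_api_key: str
    google_model: str = DEFAULT_MODEL


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping, failing fast on missing required keys."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        google_api_key=env["GOOGLE_API_KEY"],
        tavily_api_key=env["TAVILY_API_KEY"],
        # GOOGLE_MODEL is optional, so fall back to the documented default.
        google_model=env.get("GOOGLE_MODEL") or DEFAULT_MODEL,
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise without touching real credentials.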

How To Run The Project

Run from inside web-search-agent:

python main.py

You will see:

Web Search Agent (type 'exit' to quit)

Then ask multiple questions interactively.

Example:

User Query: nuclear bomb was invented how many days ago?
User Query: can you predict when it can be used in the future?

The second query resolves "it" from the conversation context (here, the nuclear bomb from the first query).

Dependencies Used

From requirements.txt:

google-generativeai>=0.8.0
- Gemini client SDK for text generation and streaming tokens.
python-dotenv>=1.0.1
- Loads .env configuration into process environment.
requests>=2.32.0
- HTTP client used for Tavily API calls.

Architecture Overview

Layered flow

main.py
- CLI loop and interaction lifecycle.
agent/web_agent.py
- Orchestrates the end-to-end pipeline.
services/*
- intent_service.py: rule-based intent detection.
- context_service.py: conversational entity tracking and pronoun resolution.
- search_service.py: Tavily API integration.
- llm_service.py: Gemini streaming generation and model fallback.
utils/*
- formatter.py: builds prompt context and source output format.
- date_utils.py: date regex extraction and day-difference computation.
prompts/prompts.py
- intent-specific prompt templates and instructions.
models/schemas.py
- typed data models shared across modules.
config/settings.py
- strict environment-backed settings loading.

Request lifecycle

1. Read user query.
2. Resolve ambiguous references using previous turn entity (if needed).
3. Detect intent.
4. Search web with Tavily.
5. Build context from source title, URL, snippet.
6. Build intent-aware prompt.
7. Stream Gemini answer token-by-token.
8. For launch/date intent, compute and append day-difference summary.
9. Print formatted sources.
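The lifecycle above can be sketched as a single function that wires the steps together. This is a rough illustration, not the project's `WebAgent`: `run_turn` and its callable parameters are hypothetical, each step is injected so the example runs without network access, and the day-difference step is left out for brevity.

```python
from typing import Callable, Iterable


def run_turn(
    query: str,
    resolve: Callable[[str], str],
    detect_intent: Callable[[str], str],
    search: Callable[[str], list],
    build_prompt: Callable[[str, str, list], str],
    stream: Callable[[str], Iterable[str]],
) -> str:
    """Run one question through the pipeline and return the full answer."""
    resolved = resolve(query)            # replace "it" with the remembered entity
    intent = detect_intent(resolved)     # e.g. general, launch/date, latest news
    results = search(resolved)           # list of dicts with title, url, snippet
    prompt = build_prompt(resolved, intent, results)

    chunks = []
    for chunk in stream(prompt):         # print chunks as they arrive
        print(chunk, end="", flush=True)
        chunks.append(chunk)
    print()

    for index, result in enumerate(results, start=1):
        print(f"[{index}] {result['title']} - {result['url']}")
    return "".join(chunks)
```

Injecting the steps as callables keeps the orchestration easy to test with stubs, at the cost of a longer signature than a class holding its services would need.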

Design Decisions And Trade-offs

1) Rule-based intent detection (chosen) vs LLM intent classifier

Decision: Regex/rule patterns in IntentService.
Why: Fast, deterministic, no extra API cost, easy to debug.
Trade-off: Lower recall on unusual phrasing; requires manual pattern updates.
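A minimal sketch of what rule-based detection can look like. The intent labels and regex patterns below are assumptions made for illustration; the real patterns in IntentService are not shown in this README.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
INTENT_PATTERNS = [
    (
        "launch_date",
        re.compile(
            r"\b(how many days|days (ago|since)|when (was|did|will)"
            r"|launch(ed|es)?|released?)\b",
            re.IGNORECASE,
        ),
    ),
    (
        "latest_news",
        re.compile(r"\b(latest|news|recent|headlines?)\b", re.IGNORECASE),
    ),
]


def detect_intent(query: str) -> str:
    """Return the first matching intent label, or "general" if none match."""
    for intent, pattern in INTENT_PATTERNS:
        if pattern.search(query):
            return intent
    return "general"
```

The ordered list makes precedence explicit: a query that mentions both a launch and the news is treated as a date question here, and changing that is a one-line reorder.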

2) Single orchestrator agent class (chosen) vs distributed workflow engine

Decision: Keep orchestration in WebAgent.
Why: Simpler control flow that is easy to walk through and explain (for example, in an interview).
Trade-off: As features grow, orchestration can become crowded.

3) Stateless external APIs + local conversational memory (chosen)

Decision: Store last extracted entity in memory for pronoun resolution.
Why: Improves follow-up query quality with minimal complexity.
Trade-off: Memory is session-local only; not persisted across restarts.
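A small sketch of session-local memory for pronoun resolution. `ConversationContext` is a hypothetical class, and it only handles the pronoun "it"; how the project extracts the entity from a query is not described here, so the example stores it explicitly.

```python
import re

# Assumption: only the standalone word "it" is treated as a reference.
PRONOUN = re.compile(r"\bit\b", re.IGNORECASE)


class ConversationContext:
    """Remembers the last entity for the lifetime of the process."""

    def __init__(self) -> None:
        self.last_entity = None

    def remember(self, entity: str) -> None:
        self.last_entity = entity

    def resolve(self, query: str) -> str:
        """Replace the first "it" with the remembered entity, if any."""
        if self.last_entity is None:
            return query
        # A function replacement avoids backslash escapes in the entity text.
        return PRONOUN.sub(lambda _match: self.last_entity, query, count=1)


ctx = ConversationContext()
ctx.remember("nuclear bomb")
print(ctx.resolve("can you predict when it can be used in the future?"))
```

Because the entity lives in an instance attribute, it disappears when the process exits, which is the trade-off noted above.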

4) Local date for day-difference calculations (chosen)

Decision: Use system local date (date.today()).
Why: Matches the date a CLI user sees on their own machine and needs no timezone handling.
Trade-off: Results can differ across machines in different timezones.
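To make the calculation concrete, here is a sketch of date extraction plus a day difference. It assumes snippets contain dates written like "July 16, 1945"; the regex, `extract_dates`, and `days_between` are illustrative and may not match date_utils.py.

```python
import re
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# Matches dates such as "July 16, 1945" (one assumed format among many).
DATE_PATTERN = re.compile(
    r"\b(" + "|".join(MONTH_NAMES) + r")\s+(\d{1,2}),\s*(\d{4})\b"
)


def extract_dates(text: str) -> list:
    """Return every valid "Month D, YYYY" date found in the text."""
    found = []
    for match in DATE_PATTERN.finditer(text):
        month = MONTH_NAMES.index(match.group(1)) + 1
        try:
            found.append(date(int(match.group(3)), month, int(match.group(2))))
        except ValueError:
            continue  # skip impossible dates such as "February 30, 2020"
    return found


def days_between(target: date, today=None) -> int:
    """Days from target to today; negative when target is in the future."""
    today = date.today() if today is None else today
    return (today - target).days
```

Accepting `today` as a parameter keeps the default behavior on `date.today()` while letting tests pin the reference date.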

5) Tavily for retrieval + Gemini for generation (chosen)

Decision: Retrieval-augmented generation with source grounding.
Why: Better factuality than pure LLM answers and supports citations.
Trade-off: Accuracy depends on search result quality and snippet completeness.
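A sketch of how search results might be turned into grounded prompt context. It assumes each result is a dictionary with `title`, `url`, and `content` keys, which is how Tavily search results are commonly shaped; `build_context` and the snippet length cap are hypothetical.

```python
def build_context(results: list, max_snippet_chars: int = 500) -> str:
    """Format search results as numbered blocks the model can cite."""
    blocks = []
    for index, result in enumerate(results, start=1):
        # Truncate long snippets so the prompt stays within a modest size.
        snippet = (result.get("content") or "").strip()[:max_snippet_chars]
        blocks.append(
            f"[{index}] {result.get('title', 'Untitled')}\n"
            f"URL: {result.get('url', '')}\n"
            f"Snippet: {snippet}"
        )
    return "\n\n".join(blocks)
```

Numbering the blocks gives the model stable labels such as [1] to refer to, which is what makes citations in the answer possible.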

6) Graceful fallback for model failures (chosen)

Decision: Switch to the fallback model on a NotFound error, and return an extractive summary when the quota is exhausted.
Why: Keeps the app usable under API instability/limits.
Trade-off: Fallback response quality may be lower than normal generation.
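The fallback behavior can be sketched as follows. `ModelNotFound` and `QuotaExhausted` are local stand-ins defined only for this example; with the Gemini SDK the corresponding errors would likely come from the Google API exception classes. The `generate` callable and the three-snippet summary are also assumptions.

```python
class ModelNotFound(Exception):
    """Stand-in for the SDK's model-not-found error (assumption)."""


class QuotaExhausted(Exception):
    """Stand-in for the SDK's quota/rate-limit error (assumption)."""


def generate_with_fallback(
    generate,
    prompt: str,
    snippets: list,
    primary: str = "gemini-2.5-flash",
    fallback: str = "gemini-2.0-flash",
) -> str:
    """Try the primary model, then the fallback, then an extractive summary."""
    try:
        try:
            return generate(primary, prompt)
        except ModelNotFound:
            # The configured model is unavailable, so retry once with the fallback.
            return generate(fallback, prompt)
    except QuotaExhausted:
        # No model call is possible; return the top snippets verbatim.
        lines = "\n".join(f"- {snippet}" for snippet in snippets[:3])
        return "Summary from sources (model unavailable):\n" + lines
```

The outer handler also covers quota errors raised by the fallback model, so the user always receives some answer built from the retrieved sources.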

Troubleshooting

`pip install -r requirements.txt` fails

Ensure your current directory is web-search-agent before running install.
Verify Python/pip point to the same interpreter.

`python main.py` exits with missing env vars

Confirm .env exists in web-search-agent.
Required keys:
- GOOGLE_API_KEY
- TAVILY_API_KEY

Gemini returns model not found

Set GOOGLE_MODEL to gemini-2.5-flash or another valid model name.
The app automatically attempts gemini-2.0-flash fallback.

Search/API errors

Check network connectivity.
Verify Tavily key validity and quota.
