Home
CrawlLama is a high-performance, local AI Research Agent engineered for advanced Open Source Intelligence (OSINT) and complex multi-hop reasoning. It utilizes local Large Language Models (LLMs) via Ollama to provide a secure, private, and highly extensible research environment.
CrawlLama is designed for researchers, developers, and security professionals who require a local-first agent capable of executing deep-dive queries. Unlike standard conversational AI, CrawlLama is built to:
- Execute Multi-Step Reasoning: Deconstruct complex tasks into sequential research phases.
- Perform Specialized OSINT: Utilize dedicated modules for email, phone, IP, and social media analysis.
- Maintain Privacy: Process all LLM logic locally while maintaining high performance.
- Scale Effort Dynamically: Adjust reasoning depth based on task complexity.
The system utilizes a modular, tool-centric architecture:
- Core Engine: Manages state, session persistence, and LLM communication.
- Tool Registry: Orchestrates external integrations including web search providers (DuckDuckGo, Brave, Serper), Wikipedia, and OSINT modules.
- Agent Orchestrator: Powered by LangGraph, managing the iterative flow of Search, Analysis, Critique, and Synthesis.
- Plugin System: Supports dynamic loading of custom functionality without core modification.
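A minimal sketch of how a tool registry with decorator-based registration might look. The names `ToolRegistry`, `register`, `run`, and `echo_search` are hypothetical and chosen for illustration; they are not taken from the CrawlLama codebase.

```python
from typing import Callable, Dict, List

# Hypothetical tool signature: a tool takes a query string and returns text.
ToolFn = Callable[[str], str]


class ToolRegistry:
    """Illustrative registry mapping tool names to callables."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolFn] = {}

    def register(self, name: str) -> Callable[[ToolFn], ToolFn]:
        """Decorator that adds a function to the registry under `name`."""
        def decorator(fn: ToolFn) -> ToolFn:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = fn
            return fn
        return decorator

    def run(self, name: str, query: str) -> str:
        """Look up a tool by name and call it with the query."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](query)

    def names(self) -> List[str]:
        """Return the registered tool names in sorted order."""
        return sorted(self._tools)


registry = ToolRegistry()


@registry.register("echo_search")
def echo_search(query: str) -> str:
    # Stand-in for a real search provider; returns a canned string.
    return f"results for: {query}"
```

With this shape, a plugin only needs to import the registry and decorate a function, which is one way to add functionality without touching the core.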
CrawlLama features an Adaptive Agent Hopping system. It automatically analyzes query intent to select the optimal agent level (Low, Mid, or High complexity). This ensures resource efficiency for simple tasks while providing maximum depth for complex investigations.
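The level selection can be pictured with a simple heuristic. This is a sketch under stated assumptions: the keyword list, the word-count thresholds, and the function name `select_agent_level` are all invented for illustration, and the actual intent analysis may work quite differently.

```python
import re

# Hypothetical hints that a query needs several research hops.
MULTI_HOP_HINTS = ("compare", "relationship", "timeline", "and then", "correlate")


def select_agent_level(query: str) -> str:
    """Return 'low', 'mid', or 'high' from surface features of the query."""
    words = re.findall(r"\w+", query.lower())
    text = " ".join(words)
    # Count how many distinct hints occur in the normalized query text.
    hints = sum(1 for hint in MULTI_HOP_HINTS if hint in text)
    if hints >= 2 or len(words) > 40:
        return "high"
    if hints == 1 or len(words) > 12:
        return "mid"
    return "low"
```

A short factual question falls through to `low`, while a query that asks to compare and correlate several entities is routed to `high`.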
Utilizing LangGraph, the agent performs iterative research cycles:
- Router: Path determination and initial strategy.
- Search: Data acquisition from multiple sources.
- Analyze: Interpretation of findings and entity extraction.
- Follow-Up: Gap identification and secondary research cycles.
- Critique: Self-evaluation for factual accuracy and hallucination prevention.
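The cycle above can be sketched as a plain Python loop. This deliberately does not use the LangGraph API; it only illustrates the control flow, and `ResearchState`, `run_research`, and the callback parameters are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ResearchState:
    question: str
    pending: List[str] = field(default_factory=list)   # queries still to run
    findings: List[str] = field(default_factory=list)  # analyzed results
    cycles: int = 0
    approved: bool = False


def run_research(
    question: str,
    search: Callable[[str], List[str]],
    follow_up: Callable[[ResearchState], List[str]],
    critique: Callable[[ResearchState], bool],
    max_cycles: int = 3,
) -> ResearchState:
    # Router: the initial strategy here is simply to search the question itself.
    state = ResearchState(question=question, pending=[question])
    while state.pending and state.cycles < max_cycles:
        state.cycles += 1
        # Search: run every pending query and collect the raw hits.
        raw = [hit for query in state.pending for hit in search(query)]
        # Analyze: a trivial stand-in that drops blanks and duplicates.
        for hit in raw:
            hit = hit.strip()
            if hit and hit not in state.findings:
                state.findings.append(hit)
        # Follow-Up: identify gaps as new queries for a possible next cycle.
        gaps = follow_up(state)
        # Critique: accept the findings, or go around again with the gap queries.
        state.approved = critique(state)
        state.pending = [] if state.approved else gaps
    return state
```

The `max_cycles` bound is a design choice in this sketch: it guarantees termination even if the critique step never approves the findings.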
CrawlLama provides a robust suite of tools for Open Source Intelligence:
- Email Intelligence: Validation, MX record analysis, and breach detection.
- Phone Intelligence: Carrier identification, normalization, and country detection.
- IP Intelligence: Geolocation, ISP verification, and reputation scoring.
- Social Intelligence: Automated profile discovery across 12+ platforms.
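Before any external lookup, inputs like these can be pre-checked locally. The following sketch uses only the Python standard library; `email_domain` and `classify_ip` are hypothetical helpers, not CrawlLama functions, and the email pattern is intentionally loose.

```python
import ipaddress
import re
from typing import Optional

# Deliberately loose syntax check; it does not prove that a mailbox exists.
EMAIL_RE = re.compile(r"^[^@\s]+@([^@\s]+\.[^@\s]+)$")


def email_domain(address: str) -> Optional[str]:
    """Return the lowercased domain part, or None if the syntax looks wrong."""
    match = EMAIL_RE.match(address.strip())
    return match.group(1).lower() if match else None


def classify_ip(value: str) -> str:
    """Label an address so private or loopback ranges skip external lookups."""
    ip = ipaddress.ip_address(value)  # raises ValueError on malformed input
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    if ip.is_global:
        return "public"
    return "reserved"
```

The extracted domain is what an MX record lookup would be run against, and only addresses classified as `public` are worth sending to a geolocation or reputation service.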
- Advanced Operators: Support for site:, inurl:, filetype:, and other advanced search parameters.
- Hybrid Search: Combines semantic vector search with traditional keyword matching.
- Intelligent Caching: Implements TTL-based caching to optimize performance and reduce API latency.
- Extended Context: Support for 16k+ tokens (hardware dependent) for deep history maintenance.
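As an illustration of the TTL-based caching mentioned above, here is a minimal in-memory sketch. The class name `TTLCache` and its interface are assumptions made for this example; the project's actual cache may differ in storage, eviction, and API.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Illustrative cache whose entries expire a fixed time after being set."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: evict lazily on read and report a miss.
            del self._store[key]
            return default
        return value
```

Keying the cache on the normalized search query means a repeated query within the TTL window is served locally instead of calling the search provider again.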
CrawlLama includes a terminal interface with Markdown rendering, real-time token monitoring, and an interactive settings menu.
Integrate CrawlLama into existing workflows via a robust RESTful interface.
- Key Endpoints: /query, /osint/query, /memory, and /health.
- Security: Integrated API key authentication and configurable rate limiting.
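A hedged sketch of what a client call to the /query endpoint might look like, using only the standard library. The JSON payload shape (`{"query": ...}`), the `X-API-Key` header name, and the base URL are assumptions for illustration; check the API Reference for the actual request format.

```python
import json
import urllib.request


def build_query_request(base_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the /query endpoint."""
    body = json.dumps({"query": question}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/query",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # assumed header name
        },
    )


# Example (assumed host and port); sending requires a running server:
# request = build_query_request("http://localhost:8000", "my-key", "who owns example.org?")
# with urllib.request.urlopen(request, timeout=60) as response:
#     print(json.load(response))
```

Separating request construction from sending keeps the client easy to test and makes it straightforward to add retry or backoff logic when the rate limit is hit.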
A live monitoring interface provides system-wide visibility:
- System Metrics: Real-time tracking of CPU, RAM, and Disk utilization.
- Component Status: Health checks for Ollama, RAG, Cache, and Tools.
- Performance Logs: Response latency tracking and error diagnostics.
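Response latency tracking of the kind listed above can be sketched with a small context manager. `LatencyLog` and its methods are hypothetical names used for illustration and do not describe the project's monitoring code.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List


class LatencyLog:
    """Illustrative recorder for per-request latency and error counts."""

    def __init__(self) -> None:
        self.samples: List[float] = []  # elapsed seconds per tracked block
        self.errors: int = 0

    @contextmanager
    def track(self) -> Iterator[None]:
        start = time.perf_counter()
        try:
            yield
        except Exception:
            # Count the failure, then let the caller see the original error.
            self.errors += 1
            raise
        finally:
            # Record latency for successful and failed requests alike.
            self.samples.append(time.perf_counter() - start)

    def summary(self) -> Dict[str, float]:
        count = len(self.samples)
        average = sum(self.samples) / count if count else 0.0
        return {"count": count, "errors": self.errors, "avg_seconds": average}
```

Wrapping each request handler in `with log.track():` yields the request count, error count, and average latency that such a monitoring view would display.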
- Install Ollama: ollama.com
- Clone and Setup:
    git clone https://github.com/arn-c0de/Crawllama.git
    cd Crawllama
    ./setup.sh   # or setup.bat on Windows
- Run:
    ./run.sh   # or run.bat on Windows
Comprehensive guides for extending and integrating CrawlLama:
- Plugin Development: Learn how to create custom plugins.
- API Reference: Detailed documentation for REST API endpoints.
- Architecture Overview: Deep dive into the project structure and code organization.
You can clone this wiki to your local machine for offline editing or version control:
    git clone https://github.com/arn-c0de/Crawllama.wiki.git
For comprehensive documentation, refer to the Documentation Overview.
CrawlLama Research Agent | GitHub Repository | Project Website
Privacy-focused, local AI intelligence for the modern researcher.