Use the Iru API MCP Server with local Ollama models for 100% private, offline AI-driven device management.
Ollama integration relies on community bridge tools that connect the stdio-based MCP server to local Ollama models. Nothing is sent to a cloud service, so device data stays on your machine.
- ✅ 100% local and offline - No data leaves your machine
- ✅ Zero code changes - Uses existing stdio server
- ✅ No API costs - Completely free
- ✅ Privacy-focused - Perfect for sensitive environments
- ✅ Multiple model options - Qwen, Llama, Mistral, and more
- ✅ Two integration methods - TypeScript bridge or Python TUI
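
To make the bridge idea more concrete, the sketch below shows the kind of translation a bridge has to perform: taking a tool definition as an MCP server reports it (`name`, `description`, `inputSchema`) and wrapping it in the function-tool shape that Ollama's `/api/chat` endpoint accepts in its `tools` field. This is an illustration under assumptions, not the code of either bridge project. The `list_devices` tool and the prompt are hypothetical, and the helper names `mcpToolToOllamaTool` and `buildChatRequest` are made up for this example.

```typescript
// Shape of a tool as reported by an MCP server (tools/list result entry).
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>; // a JSON Schema object
}

// Shape of a tool as accepted in the "tools" field of Ollama's /api/chat.
interface OllamaTool {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Wrap one MCP tool definition in Ollama's function-tool envelope.
function mcpToolToOllamaTool(tool: McpTool): OllamaTool {
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description ?? "",
      parameters: tool.inputSchema,
    },
  };
}

// Build a non-streaming chat request body that advertises the given tools.
function buildChatRequest(model: string, prompt: string, tools: McpTool[]) {
  return {
    model,
    stream: false,
    messages: [{ role: "user", content: prompt }],
    tools: tools.map(mcpToolToOllamaTool),
  };
}

// Hypothetical tool, for illustration only; the real server's tools may differ.
const exampleTool: McpTool = {
  name: "list_devices",
  description: "List managed devices",
  inputSchema: {
    type: "object",
    properties: { limit: { type: "number" } },
    required: [],
  },
};

const request = buildChatRequest(
  "qwen2.5-coder:7b-instruct",
  "How many devices are enrolled?",
  [exampleTool],
);
console.log(JSON.stringify(request, null, 2));
```

A real bridge would additionally POST this body to the local Ollama instance, read any tool calls from the reply, forward them to the MCP server over stdio, and feed the results back to the model. Those steps are omitted here.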
📝 Note: Complete Ollama integration documentation is coming soon as part of Phase 2 of the roadmap.
For now, refer to the ROADMAP - Phase 2: Ollama section for detailed implementation tasks.
- Ollama installed (`brew install ollama` on macOS)
- 8GB+ RAM recommended (16GB+ for 7B models)
- This MCP server built (`npm run build`)

Recommended models:

- `qwen2.5-coder:7b-instruct` - Best for tool calling
- `llama3.2:3b` - Lightweight alternative
- `mistral:7b-instruct` - Good balance
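
The RAM guidance and model names above can be turned into a small preflight check. The sketch below is one possible way to do it, not part of the project. `pickModel` applies the 8GB and 16GB thresholds from the prerequisites; choosing `llama3.2:3b` for machines between the two thresholds is an assumption of this example. `isPulled` checks a response from Ollama's `GET /api/tags` endpoint, which lists locally available models, and `fetchTags` assumes Ollama is listening on its default address `http://localhost:11434`.

```typescript
// Minimal shape of the response from Ollama's GET /api/tags endpoint.
interface TagsResponse {
  models: { name: string }[];
}

// Thresholds come from the prerequisites; the model chosen per tier is an
// assumption made for this example.
function pickModel(totalRamGiB: number): string | null {
  if (totalRamGiB >= 16) return "qwen2.5-coder:7b-instruct";
  if (totalRamGiB >= 8) return "llama3.2:3b";
  return null; // below the recommended minimum
}

// True if the named model already appears in the local model list.
function isPulled(tags: TagsResponse, model: string): boolean {
  return tags.models.some((m) => m.name === model);
}

// Ask a local Ollama instance which models are available.
async function fetchTags(
  baseUrl = "http://localhost:11434",
): Promise<TagsResponse> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);
  return (await res.json()) as TagsResponse;
}

// Offline demonstration using a hand-written response instead of a live call.
const sampleTags: TagsResponse = { models: [{ name: "llama3.2:3b" }] };
const choice = pickModel(8);
console.log(choice, choice !== null && isPulled(sampleTags, choice));
```

On a real machine, the RAM figure could come from Node's `os.totalmem()` divided by `1024 ** 3`, and `sampleTags` would be replaced by `await fetchTags()`. If the chosen model is missing, `ollama pull <model>` downloads it.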
Option A: ollama-mcp-bridge (TypeScript)
- https://github.com/patruff/ollama-mcp-bridge
- Multi-server support
- Smart tool routing
Option B: mcp-client-for-ollama (Python TUI)
- https://github.com/jonigl/mcp-client-for-ollama
- Beautiful terminal interface
- Human-in-the-loop approvals
- Hot-reloading servers
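
Whichever option is used, the bridge has to be told how to launch this stdio server. Many MCP clients describe servers with an `mcpServers` map of `command`, `args`, and `env`. The sketch below generates such an entry. It is a sketch under assumptions: whether each bridge accepts exactly this shape should be confirmed against its README, and the `iru` label, the `dist/index.js` path, and the `IRU_API_TOKEN` variable name are hypothetical placeholders.

```typescript
// One stdio-launched MCP server: the command to run plus its arguments and environment.
interface StdioServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

// Top-level config shape used by many MCP clients.
interface ServersConfig {
  mcpServers: Record<string, StdioServerEntry>;
}

// Build a config that launches the built server with Node.
function buildServersConfig(
  entryPoint: string,
  env: Record<string, string> = {},
): ServersConfig {
  return {
    mcpServers: {
      // "iru" is an arbitrary label chosen for this example.
      iru: { command: "node", args: [entryPoint], env },
    },
  };
}

// Hypothetical path and variable name; substitute the real build output
// and whatever credentials the server expects.
const config = buildServersConfig("/path/to/iru-mcp-server/dist/index.js", {
  IRU_API_TOKEN: "<your-token>",
});
console.log(JSON.stringify(config, null, 2));
```

The printed JSON can be saved to a file and passed to the chosen bridge in whatever way its documentation describes.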
This integration guide will include:
- Step-by-step setup for both bridge options
- Model comparison and benchmarks
- Performance optimization tips
- Troubleshooting guide for out-of-memory (OOM) errors
- Offline workflow examples
- System requirements and recommendations
- ROADMAP - Phase 2 - Complete implementation plan
- Gemini CLI Integration - Alternative cloud-based option
- Main README - Project overview
Target Completion: 2-4 weeks after Phase 1 (Gemini CLI) completion
Estimated Setup Time: 1-2 hours once documented
Last Updated: 2025-10-27