This guide covers deploying IPFS Datasets Python in production environments.
For detailed deployment guides, see:
- Docker Deployment - Containerized deployment
- GraphRAG Production Guide - Production GraphRAG setup
- Runner Setup - GitHub Actions self-hosted runners
- GPU Runner Setup - GPU-enabled deployment
The easiest way to deploy in production:
# Build and run with Docker
docker-compose up -d
See Docker Deployment Guide for full details.
For scalable cloud deployment:
# Deploy to Kubernetes
kubectl apply -f deployments/kubernetes/
See GraphRAG Production Guide for Kubernetes setup.
For direct server installation:
# Install dependencies
pip install "ipfs-datasets-py[all]"
# Configure systemd service
sudo cp ipfs-datasets-mcp.service /etc/systemd/system/
sudo systemctl enable ipfs-datasets-mcp
sudo systemctl start ipfs-datasets-mcp
Before deploying:
- Review Configuration - See Configuration Guide
- Security Setup - See Security & Governance
- Performance Tuning - See Performance Optimization
- Backup Strategy - Plan data backup and recovery
- Monitoring Setup - Configure logging and metrics
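Part of the configuration review can be automated. The following is a minimal sketch, assuming the repository ships a .env.example that lists every variable the application expects. The helper names `env_keys` and `missing_keys` are illustrative and not part of the package.

```python
from pathlib import Path


def env_keys(text: str) -> set[str]:
    """Return the variable names defined in dotenv-style text."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional leading "export ".
        if line.startswith("export "):
            line = line[len("export "):].lstrip()
        name, separator, _value = line.partition("=")
        # Lines without "=" are not assignments and are ignored.
        if separator:
            keys.add(name.strip())
    return keys


def missing_keys(example_path: str = ".env.example", env_path: str = ".env") -> set[str]:
    """Return keys present in the example file but absent from the real one."""
    example = env_keys(Path(example_path).read_text(encoding="utf-8"))
    actual = env_keys(Path(env_path).read_text(encoding="utf-8"))
    return example - actual
```

Calling `missing_keys()` from the project root returns an empty set when .env covers every key in .env.example. It does not check that the values are valid.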
Deployment steps:
- Environment Setup
  # Set production environment
  export ENVIRONMENT=production
  # Configure secrets
  cp .env.example .env
  # Edit .env with production credentials
- Database Setup (if using SQL)
  # Initialize database
  python scripts/setup/init_database.py
- IPFS Node Setup
  # Start IPFS daemon
  ipfs daemon &
  # Or use managed IPFS service
- Deploy Application
  # Using Docker
  docker-compose -f docker-compose.prod.yml up -d
  # Or using systemd
  sudo systemctl start ipfs-datasets-mcp
- Verify Deployment
  # Check service status
  docker-compose ps
  # or
  sudo systemctl status ipfs-datasets-mcp
  # Test endpoints
  curl http://localhost:8899/health
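A service may need a few seconds after startup before it answers, so a single curl call can fail spuriously. The sketch below polls the health endpoint instead. It assumes the endpoint returns HTTP 200 once the service is ready, which this guide does not state explicitly. The function names are illustrative.

```python
import http.client
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, malformed response.
        return 0


def wait_until_healthy(url, attempts=10, delay=3.0, probe=http_status, sleep=time.sleep):
    """Poll `url` until it answers 200 or the attempts run out."""
    for attempt in range(1, attempts + 1):
        if probe(url) == 200:
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts:
            sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:8899/health")` returns True as soon as the endpoint responds with 200, and False after ten failed attempts.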
After deploying:
- Monitor Logs - Check for errors
- Load Testing - Verify performance
- Security Scan - Run security checks
- Backup Verification - Test backup/restore
- Documentation - Update runbooks
For HA deployments:
- Multiple application instances behind load balancer
- Redundant IPFS nodes
- Replicated vector stores (Qdrant cluster)
- Database replication
Scaling strategies:
- Horizontal: Multiple container replicas
- Vertical: Increase resource allocation
- Database: Read replicas, sharding
- IPFS: DHT optimization, gateway caching
Essential monitoring:
- Application Metrics: Response times, error rates
- IPFS Metrics: Peer count, data transfer
- Resource Metrics: CPU, memory, disk, network
- Business Metrics: Queries processed, documents indexed
See Unified Dashboard for monitoring setup.
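To make the application metrics concrete, the sketch below reduces a batch of request records to an error rate and response-time percentiles. The record shape (duration in seconds, HTTP status code) and the choice to count only 5xx responses as errors are assumptions for illustration, not something the project prescribes.

```python
import statistics


def summarize_requests(records):
    """Summarize a list of (duration_seconds, status_code) pairs."""
    if not records:
        raise ValueError("no requests to summarize")
    durations = [duration for duration, _status in records]
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for _duration, status in records if status >= 500)
    if len(durations) >= 2:
        # n=100 yields 99 cut points; index 94 is the 95th percentile.
        # method="inclusive" keeps the estimate within the observed range.
        p95 = statistics.quantiles(durations, n=100, method="inclusive")[94]
    else:
        p95 = durations[0]
    return {
        "count": len(records),
        "error_rate": errors / len(records),
        "median_seconds": statistics.median(durations),
        "p95_seconds": p95,
    }
```

In practice a metrics system would compute these continuously. The function only shows which quantities are worth alerting on.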
Production security essentials:
- Network Security: Firewalls, VPNs, private networks
- Access Control: Authentication, authorization, API keys
- Data Security: Encryption at rest and in transit
- Audit Logging: Track all operations
See Security & Governance Guide for details.
What to back up:
- Configuration files (configs.yaml, .env)
- Vector store indices
- Database dumps
- IPFS pinned content list
Recovery procedures:
- Application Recovery: Redeploy from container images
- Data Recovery: Restore from backups
- IPFS Recovery: Re-pin content from backup list
- Database Recovery: Restore from SQL dumps
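Backing up the pinned content list and re-pinning from it can be scripted around the `ipfs` CLI. This is a minimal sketch, assuming the `ipfs` binary is on the PATH and a local node is running. The function names and the backup file path are illustrative.

```python
import subprocess


def parse_pin_list(output: str) -> list[str]:
    """Extract CIDs from `ipfs pin ls` output (one pin per line, CID first)."""
    cids = []
    for line in output.splitlines():
        fields = line.split()
        if fields:
            cids.append(fields[0])
    return cids


def export_pins(path: str) -> int:
    """Write the recursively pinned CIDs to `path`; return how many were saved."""
    result = subprocess.run(
        ["ipfs", "pin", "ls", "--type=recursive", "--quiet"],
        check=True,
        capture_output=True,
        text=True,
    )
    cids = parse_pin_list(result.stdout)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(cids) + ("\n" if cids else ""))
    return len(cids)


def restore_pins(path: str) -> None:
    """Re-pin every CID listed in `path` on the local node."""
    with open(path, encoding="utf-8") as handle:
        for cid in parse_pin_list(handle.read()):
            # Raises CalledProcessError if a CID cannot be pinned.
            subprocess.run(["ipfs", "pin", "add", cid], check=True)
```

Re-pinning only succeeds if the content is still retrievable from some peer, so the pin list complements a backup of the data itself and does not replace it.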
Production performance tips:
- Enable caching (Redis recommended)
- Use GPU for embeddings if available
- Optimize IPFS settings for your workload
- Tune vector store parameters
See Performance Optimization Guide.
Service won't start:
- Check logs: docker-compose logs (Docker) or journalctl -u ipfs-datasets-mcp (systemd)
- Verify configuration files
- Check port conflicts
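A port conflict can be confirmed by testing whether something already accepts connections on the service port while the service itself is stopped. A minimal sketch, using port 8899 from the health check earlier in this guide (the helper name is illustrative):

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(8899)` returns True before the service has been started, another process holds the port.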
Poor performance:
- Review resource allocation
- Check database query performance
- Monitor IPFS peer connections
- Profile application code
Connection errors:
- Verify network connectivity
- Check firewall rules
- Validate DNS resolution
- Test IPFS API access
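Two of these checks are easy to script: DNS resolution and IPFS API access. The sketch below assumes a Kubo node with its RPC API on the default address 127.0.0.1:5001. Adjust `api_base` if your node listens elsewhere. The function names are illustrative.

```python
import socket
import urllib.error
import urllib.request


def resolves(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        resolver(hostname, None)
        return True
    except socket.gaierror:
        return False


def ipfs_api_version(api_base: str = "http://127.0.0.1:5001", timeout: float = 5.0):
    """Return the raw JSON text from the IPFS RPC version endpoint, or None."""
    # The Kubo RPC API only accepts POST requests.
    request = urllib.request.Request(
        f"{api_base}/api/v0/version", data=b"", method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read().decode("utf-8")
    except (urllib.error.URLError, OSError):
        # Unreachable node, refused connection, timeout, or HTTP error.
        return None
```

A None result from `ipfs_api_version()` means the API is unreachable from this host. That points to the daemon, its API listen address, or a firewall rule, not to the application.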
For production support:
- Documentation: See User Guide
- Issues: Report at GitHub Issues
- Community: Join discussions
Related documentation:
- Configuration Guide - Configure the application
- Docker Deployment - Container deployment
- Security Guide - Security best practices
- Performance Guide - Optimization tips