Welcome to the comprehensive documentation for IPFS Datasets Python - a production-ready decentralized AI data platform.
- Getting Started - Introduction and basic concepts
- Installation Guide - Install and setup instructions
- User Guide - Comprehensive usage guide
- Quick Start Examples - Code examples to get started
- Developer Guide - Development guidelines and patterns
- API Reference - Complete API documentation
- Contributing - How to contribute to the project
🗄️ IPLD Vector Database - Production-ready distributed vector search
- Vector Store Guides - Implementation guides
- 18 MCP tools for vector operations
- 95% test coverage, 150+ tests
📚 Knowledge Graphs Enhanced - Modular extraction and query system
- Knowledge Graph Guides - Complete documentation
- 110+ new tests with comprehensive coverage
- Unified query engine with hybrid search
📖 Documentation Reorganized - Clean, structured hierarchy
- Archive - Historical reports and planning docs
- Guides - 45 organized feature guides
- Reduced clutter: 85% fewer files in docs root
Essential guides for all users:
- index.md - Main documentation portal
- getting_started.md - Introduction to basic concepts
- installation.md - Installation instructions
- user_guide.md - Comprehensive user guide
- developer_guide.md - Guide for developers
- unified_dashboard.md - Dashboard documentation
- FEATURES.md - Complete feature list
- CHANGELOG.md - Version history
guides/ - Feature Guides and How-To Documentation
Organized by feature and component:
- guides/knowledge_graphs/ - Knowledge graph documentation (16 guides)
  - Implementation guides, migration paths, quick references
  - Entity extraction, relationship mapping, query engine
- guides/processors/ - Processor subsystem documentation (29 guides)
  - Architecture, migration guides, quick references
  - File conversion, multimedia processing, data transformation
- guides/deployment/ - Deployment and runner setup guides
- guides/tools/ - Tool-specific documentation (MCP, scrapers, web search)
- guides/infrastructure/ - Infrastructure and CI/CD guides
- guides/security/ - Security, audit logging, and governance
- guides/reference/ - API reference and technical documentation
tutorials/ - Step-by-Step Tutorials
Hands-on tutorials for specific features and use cases.
examples/ - Usage Examples
Code samples and practical examples for common scenarios.
architecture/ - System Architecture
Technical design documents and architecture diagrams.
reports/ - Project Reports
Historical completion reports and project summaries (44+ files).
archive/ - Archived Documentation
Historical documentation and deprecated content organized into:
- archive/completion_reports/ - Phase, session, and task completion reports (44 files)
- archive/knowledge_graphs/ - Historical KG planning and reports (29 files)
- archive/processors/ - Historical processor planning and reports (27 files)
- archive/root_status_reports/ - Root directory status reports (25 files)
- archive/reorganization/ - Documentation reorganization history
- archive/deprecated/ - Deprecated and obsolete documents
Direct links to module documentation:
- Vector Stores - IPLD vector database, FAISS, Qdrant, Elasticsearch
- Embeddings - Embedding generation and management
- Search - Advanced search including RAG and GraphRAG
- Knowledge Graphs - Extraction, query, and storage
- PDF Processing - PDF analysis and processing
- Multimedia - Media processing capabilities
- LLM - Language model integration
- MCP Tools - 200+ tools for AI assistants
- IPLD - InterPlanetary Linked Data
- Audit - Security and audit logging
- Vector Search & Embeddings: See guides/knowledge_graphs/ and ../ipfs_datasets_py/vector_stores/
- Knowledge Graphs: See guides/knowledge_graphs/ for all KG documentation
- File Processing: See guides/processors/ for file conversion and multimedia
- MCP Integration: See guides/tools/ for MCP server and tool documentation
- Deployment: See guides/deployment/ for production deployment guides
- Security: See guides/security/ for audit logging and governance
- Getting Started: Start with getting_started.md and installation.md
- Building AI Applications: See user_guide.md and MCP documentation
- Contributing Code: See developer_guide.md and architecture docs
- Production Deployment: See guides/deployment/ and guides/infrastructure/
- Centralization: All documentation lives in the docs/ directory
- Organization: Follow the established structure for different doc types
- Cross-Referencing: Use relative links between documentation files
- Archive Old Docs: Move completed session/phase reports to archive/
- Update Guides: Keep permanent guides in guides/ up to date
- Index Files: Each subdirectory should include a README.md or index file
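The index-file guideline can be checked mechanically. The following is a minimal sketch, not part of the project: the function name `dirs_missing_index` and the accepted filenames in `INDEX_NAMES` (`README.md`, `index.md`) are assumptions chosen for illustration, since the guideline only says "a README.md or index file".

```python
from __future__ import annotations

from pathlib import Path

# Assumed set of filenames that count as an index file (hypothetical choice).
INDEX_NAMES = ("README.md", "index.md")


def dirs_missing_index(docs_root: Path) -> list[Path]:
    """Return subdirectories under docs_root that contain no index file.

    Paths are returned relative to docs_root, sorted for stable output.
    """
    missing = []
    # rglob("*") walks the tree recursively; keep directories only.
    for directory in sorted(p for p in docs_root.rglob("*") if p.is_dir()):
        has_index = any((directory / name).is_file() for name in INDEX_NAMES)
        if not has_index:
            missing.append(directory.relative_to(docs_root))
    return missing
```

Calling `dirs_missing_index(Path("docs"))` from the repository root would list the directories that still need an index file; an empty list means the guideline is met under the assumptions above.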
This repository now includes a root-level mkdocs.yml configured to publish docs from docs/, including generated API pages in docs/api/.
- Build site locally: mkdocs build
- Serve docs locally: mkdocs serve
If MkDocs is not installed in your environment, install it with pip install mkdocs.
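For scripted use, the two commands above can be wrapped so that a missing MkDocs installation produces a clear message instead of a traceback. This is a sketch under stated assumptions: the helper names `mkdocs_command` and `run_mkdocs` are hypothetical and are not provided by the repository or by MkDocs.

```python
from __future__ import annotations

import shutil
import subprocess
import sys


def mkdocs_command(action: str = "build") -> list[str]:
    """Return the argv for 'mkdocs build' or 'mkdocs serve'."""
    if action not in {"build", "serve"}:
        raise ValueError(f"unsupported MkDocs action: {action!r}")
    return ["mkdocs", action]


def run_mkdocs(action: str = "build") -> int:
    """Run MkDocs from the current directory and return its exit code."""
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("mkdocs") is None:
        print("MkDocs is not installed; install it with: pip install mkdocs",
              file=sys.stderr)
        return 1
    return subprocess.run(mkdocs_command(action), check=False).returncode
```

Run `run_mkdocs("build")` from the repository root, where mkdocs.yml lives, so that MkDocs picks up the configuration.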
The documentation was comprehensively reorganized to improve navigation:
- Archived 100+ files: Session reports, phase completions, and planning docs moved to
archive/ - Created guides structure: 45 permanent guides organized by feature in
guides/ - Reduced clutter: 85% reduction in docs root files (177 → 27 core files)
For details, see DOCS_REORGANIZATION_2026_02_16.md.
- General Questions: See FAQ or User Guide
- Bug Reports: Open an issue on GitHub
- Feature Requests: Check existing issues or open a new one
- Contributing: See CONTRIBUTING.md for guidelines