🏥 Cataract-LMM: Surgical Video Analysis

A comprehensive, production-ready framework for multi-task deep learning in surgical video analysis, featuring instance segmentation, phase recognition, skill assessment, and video processing capabilities.

🎯 Overview

Cataract-LMM is an enterprise-grade AI framework designed for large-scale, multi-center surgical video analysis. Built on modern software engineering principles, this repository provides state-of-the-art deep learning models for comprehensive analysis of cataract surgery videos.

🔬 Research Foundation

This framework implements methodologies from cutting-edge research in computer-assisted surgery, providing validated approaches for:

Surgical Instance Segmentation using YOLO, Mask R-CNN, and SAM architectures
Surgical Phase Recognition with Video Transformers, 3D CNNs, and temporal models
Surgical Skill Assessment through multi-modal analysis and performance metrics
Video Processing with GPU-accelerated pipelines for medical video data

🏆 Key Differentiators

Production-Ready: Enterprise-grade architecture with comprehensive testing and CI/CD
Multi-Task Learning: Unified framework supporting four core surgical analysis tasks
Scalable Design: Microservices-ready architecture with containerization support
Medical Compliance: HIPAA-aware design patterns and secure data handling
Research-to-Production: Seamless transition from research notebooks to production deployment

📋 Table of Contents

🚀 Quick Start
✨ Features
🏗️ Architecture
📦 Installation
🎯 Usage Examples
🛠️ Development
📊 Model Zoo
🔧 Configuration
🧪 Testing
📖 Documentation
🤝 Contributing
📄 License
📣 Citation
📬 Contact & Connect
📞 Support & Community
🚀 Roadmap

🚀 Quick Start

Prerequisites

Python 3.8+
CUDA 11.8+ (for GPU acceleration)
FFmpeg (for video processing)
Docker (optional, for containerized deployment)
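
The prerequisites above can be checked before installation with a short script. This is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, it only tests whether the tools are on the PATH, and it does not verify the CUDA toolkit version.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8)):
    """Return a dict mapping each prerequisite name to True or False."""
    status = {
        "python": sys.version_info >= min_python,
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "docker": shutil.which("docker") is not None,  # optional dependency
    }
    try:
        # PyTorch is only importable once the project dependencies are installed.
        import torch

        status["cuda"] = bool(torch.cuda.is_available())
    except ImportError:
        status["cuda"] = False
    return status


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:<8} {'ok' if ok else 'missing'}")
```

A `False` for `docker` or `cuda` is not fatal: Docker is optional and CUDA is only needed for GPU acceleration.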

Installation

# Clone the repository
git clone https://github.com/MJAHMADEE/Cataract_LMM.git
cd Cataract_LMM

# Install using Poetry (recommended)
cd codes
poetry install

# Activate virtual environment
poetry shell

# Or install using pip
pip install -r requirements.txt

# Validate installation
python setup.py --validate-only

Basic Usage

# Run each command from the codes/ directory.
# The subshells (parentheses) leave the working directory unchanged between steps.

# Video processing
(cd surgical-video-processing && python main.py --input path/to/video.mp4 --output ./results --config configs/default.yaml)

# Instance segmentation
(cd surgical-instance-segmentation && python inference/predictor.py --model yolo --input data/images/)

# Phase recognition
(cd surgical-phase-recognition && python validation/training_framework.py --config configs/default.yaml --mode train)

# Skill assessment
(cd surgical-skill-assessment && python main.py --config configs/comprehensive.yaml --mode evaluate)

✨ Features

🧠 AI/ML Capabilities

Component	Models	Key Features
Instance Segmentation	YOLO v8/11, Mask R-CNN, SAM	Real-time surgical instrument detection and segmentation
Phase Recognition	Video Transformers, 3D CNNs, TeCNO	11-phase surgical workflow analysis
Skill Assessment	Multi-modal CNNs, Attention Models	Objective surgical skill evaluation
Video Processing	GPU-Accelerated Pipelines	Medical-grade video preprocessing and enhancement

🛠️ Engineering Excellence

🏗️ Modular Architecture: Microservices-ready design with clear separation of concerns
🔒 Security First: HIPAA-aware design patterns, secure credential management
📊 Comprehensive Testing: 85%+ test coverage with unit, integration, and E2E tests
🚀 CI/CD Pipeline: Automated testing, security scanning, and deployment workflows
📈 Monitoring & Observability: Structured logging, metrics collection, and health checks
🐳 Containerization: Multi-stage Docker builds with security hardening

🔧 Developer Experience

📚 Rich Documentation: Comprehensive guides, API references, and examples
🎯 Configuration Management: YAML-based configuration with validation
🧪 Development Tools: Pre-commit hooks, linting, formatting, and type checking
📦 Dependency Management: Poetry-based modern Python packaging
🔧 Development Environment: VS Code integration with debugging support

🏗️ Architecture

System Overview

graph TB
    A[Video Input] --> B[Video Processing Pipeline]
    B --> C[Frame Extraction & Preprocessing]
    C --> D[Multi-Task Analysis Engine]
    
    D --> E[Instance Segmentation]
    D --> F[Phase Recognition] 
    D --> G[Skill Assessment]
    
    E --> H[Surgical Instruments]
    F --> I[Surgery Phases]
    G --> J[Skill Metrics]
    
    H --> K[Clinical Decision Support]
    I --> K
    J --> K
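
The fan-out in the diagram, where one stream of preprocessed frames feeds three analysis tasks, can be sketched in a few lines. This is an illustrative sketch only: `AnalysisEngine` and the three stub functions are hypothetical names, not the repository's API, and the stubs return placeholder values instead of running models.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

Frame = List[List[int]]  # placeholder type for one decoded image


@dataclass
class AnalysisEngine:
    """Sends the same batch of frames to every registered task."""

    tasks: Dict[str, Callable[[List[Frame]], Any]] = field(default_factory=dict)

    def register(self, name: str, task: Callable[[List[Frame]], Any]) -> None:
        self.tasks[name] = task

    def run(self, frames: List[Frame]) -> Dict[str, Any]:
        # One result per task, keyed by task name.
        return {name: task(frames) for name, task in self.tasks.items()}


# Hypothetical stubs standing in for the real models.
def segment_instruments(frames: List[Frame]) -> List[list]:
    return [[] for _ in frames]  # no detections per frame


def recognize_phases(frames: List[Frame]) -> List[str]:
    return ["unknown"] * len(frames)  # one phase label per frame


def assess_skill(frames: List[Frame]) -> Dict[str, Any]:
    return {"score": None}  # one result per video


engine = AnalysisEngine()
engine.register("instance_segmentation", segment_instruments)
engine.register("phase_recognition", recognize_phases)
engine.register("skill_assessment", assess_skill)

results = engine.run([[[0]], [[0]]])  # two dummy 1x1 frames
print(sorted(results))
```

Keeping the tasks behind a common callable interface is one way to let each module be developed and tested separately, which matches the per-module directory layout.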

Project Structure

Cataract_LMM/
├── 🏠 README.md                    # Project overview and documentation
├── 📄 LICENSE                     # CC BY-NC-ND 4.0 license
├── 🤝 CONTRIBUTING.md             # Contribution guidelines
├── 🔒 .gitignore                  # Git ignore patterns
├── 📊 codes/                      # Main codebase
│   ├── 🎬 surgical-video-processing/          # Video preprocessing and enhancement
│   │   ├── core/                  # Core processing algorithms
│   │   ├── pipelines/             # Processing pipelines
│   │   ├── metadata/              # Video metadata management
│   │   ├── quality_control/       # Quality assurance tools
│   │   └── configs/               # Configuration files
│   ├── 🎯 surgical-instance-segmentation/     # Instance segmentation models
│   │   ├── models/                # YOLO, Mask R-CNN, SAM implementations
│   │   ├── training/              # Training pipelines
│   │   ├── inference/             # Real-time inference engines
│   │   ├── evaluation/            # Model evaluation tools
│   │   └── data/                  # Dataset utilities
│   ├── 🔄 surgical-phase-recognition/         # Phase classification models
│   │   ├── models/                # Video Transformers, 3D CNNs, TeCNO
│   │   ├── validation/            # Training and validation frameworks
│   │   ├── preprocessing/         # Video preprocessing
│   │   ├── analysis/              # Result analysis tools
│   │   └── configs/               # Model configurations
│   ├── 📊 surgical-skill-assessment/          # Skill evaluation framework
│   │   ├── models/                # Skill assessment models
│   │   ├── engine/                # Training and inference engines
│   │   ├── utils/                 # Analysis utilities
│   │   └── configs/               # Assessment configurations
│   ├── 🧪 tests/                  # Comprehensive test suite
│   ├── 📚 docs/                   # Documentation source
│   ├── 🐳 docker/                 # Docker configurations
│   ├── 📊 reports/                # Analysis reports
│   ├── ⚙️ pyproject.toml         # Python project configuration
│   ├── 🔒 Dockerfile             # Container definition
│   ├── 🚀 Makefile               # Development automation
│   └── 🔧 setup.py               # Project setup script
├── 🤖 .github/                   # GitHub configurations
│   └── workflows/                 # CI/CD pipelines
└── 📓 security_scanning_demo.ipynb  # Security analysis notebook

📦 Installation

System Requirements

Component	Minimum	Recommended
Python	3.8	3.11+
RAM	16GB	32GB+
GPU Memory	8GB	24GB+
Storage	50GB	500GB+
CUDA	11.8	12.0+

Installation Methods

Method 1: Poetry (Recommended)

# Install Poetry
curl -sSL https://install.python-poetry.org | python3 -

# Clone and setup
git clone https://github.com/MJAHMADEE/Cataract_LMM.git
cd Cataract_LMM/codes

# Install dependencies
poetry install --extras "dev docs"

# Activate environment
poetry shell

Method 2: Conda Environment

# Create environment
conda create -n cataract-lmm python=3.11
conda activate cataract-lmm

# Clone and install
git clone https://github.com/MJAHMADEE/Cataract_LMM.git
cd Cataract_LMM/codes
pip install -r requirements.txt

Method 3: Docker Deployment

# Build container
docker build -t cataract-lmm:latest .

# Run interactive container
docker run -it --gpus all -v "$(pwd)/data:/app/data" cataract-lmm:latest

Verification

# Run comprehensive validation
python setup.py --validate-only

# Run tests
pytest tests/ -v

# Check GPU availability
python -c "import torch; print(f'CUDA Available: {torch.cuda.is_available()}')"

🎯 Usage Examples

Video Processing Pipeline

from surgical_video_processing import VideoProcessor, QualityController

# Initialize processor with configuration
processor = VideoProcessor("configs/high_quality.yaml")

# Process surgical video
result = processor.process_video(
    input_path="data/surgery_video.mp4",
    output_dir="outputs/processed/",
    apply_deidentification=True,
    quality_threshold=0.8
)

print(f"Processed {result.frame_count} frames")
print(f"Quality score: {result.average_quality:.3f}")

Instance Segmentation

from surgical_instance_segmentation import SegmentationPredictor

# Load pre-trained model
predictor = SegmentationPredictor(
    model_type="yolo_v8",
    device="cuda"
)

# Segment surgical instruments
results = predictor.predict_batch(
    image_paths=["frame001.jpg", "frame002.jpg"],
    confidence_threshold=0.7,
    save_visualizations=True
)

# Extract detections
for result in results:
    print(f"Detected {len(result.boxes)} instruments")
    print(f"Classes: {result.class_names}")

Phase Recognition

from surgical_phase_recognition import PhaseClassifier

# Initialize phase recognition model
classifier = PhaseClassifier(
    model_name="video_transformer",
    config_path="configs/phase_recognition.yaml"
)

# Classify surgical phases in video sequence
phases = classifier.classify_sequence(
    video_path="data/surgery_complete.mp4",
    sequence_length=16,
    overlap=0.5
)

# Display phase timeline
for phase in phases:
    print(f"Time: {phase.timestamp:.2f}s - Phase: {phase.name}")
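
To make the `sequence_length` and `overlap` parameters concrete, the sketch below computes where each clip would start if `overlap` is read as the fraction of frames shared by consecutive windows. That reading is an assumption, as is the helper name `window_starts`; the classifier's internal windowing may differ. In this sketch, trailing frames that do not fill a whole window are dropped.

```python
from typing import List


def window_starts(num_frames: int, sequence_length: int = 16, overlap: float = 0.5) -> List[int]:
    """Return the first frame index of every full sliding window."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # With 16 frames and 50% overlap, consecutive windows start 8 frames apart.
    stride = max(1, round(sequence_length * (1.0 - overlap)))
    last_start = num_frames - sequence_length
    if last_start < 0:
        return []  # the video is shorter than one window
    return list(range(0, last_start + 1, stride))


print(window_starts(40))  # windows start at frames 0, 8, 16, 24
```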

Skill Assessment

from surgical_skill_assessment import SkillEvaluator

# Initialize skill assessment framework
evaluator = SkillEvaluator("configs/skill_assessment.yaml")

# Assess surgical performance
assessment = evaluator.evaluate_surgery(
    video_path="data/complete_surgery.mp4",
    phase_annotations="data/phases.json",
    surgeon_level="resident"  # resident, fellow, attending
)

# Generate skill report
report = evaluator.generate_report(assessment)
print(f"Overall Score: {report.overall_score}/100")
print(f"Efficiency: {report.efficiency_score}/10")
print(f"Precision: {report.precision_score}/10")

🛠️ Development

Development Setup

# Clone repository
git clone https://github.com/MJAHMADEE/Cataract_LMM.git
cd Cataract_LMM/codes

# Install development dependencies
poetry install --extras "dev"

# Setup pre-commit hooks
pre-commit install

# Run development server
make dev-server

Code Quality Tools

# Format code
make format

# Run linting
make lint

# Type checking
make type-check

# Security scanning
make security-scan

# Run all quality checks
make quality

Testing

# Run unit tests
make test

# Run with coverage
make test-coverage

# Run integration tests
make test-integration

# Run end-to-end tests
make test-e2e

# Generate coverage report
make coverage-report

Available Make Commands

make help              # Show all available commands
make install           # Install dependencies
make clean             # Clean build artifacts
make build             # Build distribution packages
make docker-build      # Build Docker image
make docker-run        # Run Docker container
make docs-build        # Build documentation
make docs-serve        # Serve documentation locally

📊 Model Zoo

Instance Segmentation Models (Task 3: 12-class)

Model	mAP@0.5:0.95
YOLOv11 ⭐	73.9%
YOLOv8	73.8%
SAM	56.0%
SAM2	55.2%
Mask R-CNN	53.7%
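
The mAP@0.5:0.95 metric averages precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, so a predicted mask counts as a match at stricter and stricter overlap levels. The sketch below shows the threshold grid and a mask IoU computed on sets of pixel coordinates. It illustrates the metric's building blocks only; it is not the evaluation code behind the table, and the helper names are hypothetical.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)

# The ten IoU thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS: List[float] = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mask_iou(pred: Set[Pixel], target: Set[Pixel]) -> float:
    """Intersection over union of two masks given as sets of pixels."""
    union = len(pred | target)
    return len(pred & target) / union if union else 0.0


def matched_thresholds(iou: float) -> List[float]:
    """Thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


# Two 4x4 masks, the second shifted one column to the right.
pred = {(r, c) for r in range(4) for c in range(0, 4)}
target = {(r, c) for r in range(4) for c in range(1, 5)}

iou = mask_iou(pred, target)  # 12 shared pixels / 20 total pixels
print(iou, matched_thresholds(iou))
```

A one-pixel shift on a small mask already fails seven of the ten thresholds, which is why mAP@0.5:0.95 is noticeably lower than mAP@0.5 for thin instruments.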

Phase Recognition Models (In-Domain - Farabi Test Set)

Model	Backbone	Accuracy	F1-Score	Precision	Recall
MViT-B ⭐	-	85.7%	77.1%	77.1%	78.5%
Swin-T	-	85.5%	76.2%	77.5%	77.2%
CNN + GRU	EfficientNet-B5	82.1%	71.3%	76.0%	70.4%
CNN + TeCNO	EfficientNet-B5	81.7%	71.2%	75.1%	71.2%
CNN + LSTM	EfficientNet-B5	81.5%	70.0%	76.4%	69.4%

Skill Assessment Models

Model	Accuracy	Precision	Recall	F1-Score
TimeSformer ⭐	82.5%	86.0%	82.0%	83.9%
R3D-18	81.7%	82.4%	84.9%	83.6%
Slow R50	80.0%	81.8%	81.8%	81.8%
X3D-M	80.0%	83.9%	78.8%	81.3%
R(2+1)D-18	72.9%	79.3%	76.7%	78.0%
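
As a sanity check on the skill assessment table, the F1 column can be recomputed from the precision and recall columns using the harmonic mean. This assumes F1 was derived from the reported precision and recall directly, which the table does not state; the values agree to within 0.1 percentage points, the tolerance expected from rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the table above.
reported = {
    "TimeSformer": (86.0, 82.0, 83.9),
    "R3D-18": (82.4, 84.9, 83.6),
    "Slow R50": (81.8, 81.8, 81.8),
    "X3D-M": (83.9, 78.8, 81.3),
    "R(2+1)D-18": (79.3, 76.7, 78.0),
}

for name, (precision, recall, f1) in reported.items():
    recomputed = f1_score(precision, recall)
    print(f"{name:<12} reported={f1:.1f} recomputed={recomputed:.2f}")
```

The same check does not apply to the phase recognition table, where the scores may be averaged per class before being reported.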

🔧 Configuration

Configuration Files

The framework uses YAML-based configuration for all components:

Video Processing (`surgical-video-processing/configs/default.yaml`)

processing:
  target_resolution: [1920, 1080]
  fps: 30
  quality_threshold: 0.75
  
deidentification:
  enabled: true
  blur_faces: true
  remove_text: true
  
output:
  format: "mp4"
  compression: "h264"
  quality: "high"

Instance Segmentation (`surgical-instance-segmentation/configs/yolo_config.yaml`)

model:
  architecture: "yolov8"
  size: "medium"
  pretrained: true

training:
  epochs: 100
  batch_size: 16
  learning_rate: 0.001
  
data:
  classes: ["forceps", "scissors", "needle_holder", "suction"]
  augmentation:
    enabled: true
    rotation: 15
    scaling: [0.8, 1.2]

Environment Variables

# Create .env file
cp .env.example .env

# Edit configuration
CUDA_VISIBLE_DEVICES=0,1
WANDB_PROJECT=cataract-lmm
DATA_ROOT=/path/to/data
OUTPUT_DIR=/path/to/outputs
LOG_LEVEL=INFO
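
The variables above can be read in one place at startup. This is a minimal sketch assuming the variables are already exported into the process environment (for example by a `.env` loader); `RuntimeSettings`, `load_settings`, and the fallback defaults are hypothetical and not taken from the repository.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class RuntimeSettings:
    data_root: Path
    output_dir: Path
    log_level: str
    wandb_project: str
    gpu_ids: Tuple[str, ...]


def load_settings(env: Optional[Mapping[str, str]] = None) -> RuntimeSettings:
    """Read settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    devices = env.get("CUDA_VISIBLE_DEVICES", "")
    return RuntimeSettings(
        data_root=Path(env.get("DATA_ROOT", "data")),  # default is an assumption
        output_dir=Path(env.get("OUTPUT_DIR", "outputs")),  # default is an assumption
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        wandb_project=env.get("WANDB_PROJECT", "cataract-lmm"),
        # Kept as strings because CUDA also accepts device UUIDs here.
        gpu_ids=tuple(d.strip() for d in devices.split(",") if d.strip()),
    )


settings = load_settings(
    {"CUDA_VISIBLE_DEVICES": "0,1", "DATA_ROOT": "/path/to/data", "LOG_LEVEL": "info"}
)
print(settings)
```

Passing the mapping explicitly, as in the last call, makes the function easy to test without touching the real environment.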

🧪 Testing

Test Structure

tests/
├── unit/                   # Unit tests for individual components
├── integration/            # Integration tests for module interactions  
├── e2e/                   # End-to-end workflow tests
├── performance/           # Performance and benchmarking tests
├── security/              # Security and vulnerability tests
├── fixtures/              # Test data and fixtures
└── conftest.py           # Pytest configuration

Running Tests

# Run all tests
pytest

# Run specific test category
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/

# Run with coverage
pytest --cov=. --cov-report=html

# Run performance tests
pytest tests/performance/ --benchmark-only

# Run with specific markers
pytest -m "gpu" --gpu-required
pytest -m "slow" --timeout=300

Test Configuration

# pytest.ini
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
markers =
    unit: Unit tests
    integration: Integration tests
    e2e: End-to-end tests
    gpu: Tests requiring GPU
    slow: Slow running tests
    security: Security tests
addopts = 
    --strict-markers
    --verbose
    --tb=short
    --cov-report=term-missing

📖 Documentation

Documentation Structure

📚 User Guide: Getting started, tutorials, and examples
🔧 API Reference: Comprehensive API documentation
🏗️ Developer Guide: Contributing, architecture, and development setup
📊 Model Documentation: Model architectures, performance metrics, and usage
🔐 Security Guide: Security considerations and best practices

Building Documentation

# Install documentation dependencies
poetry install --extras "docs"

# Build documentation
cd docs
make html

# Serve documentation locally
make serve

# Build PDF documentation
make latexpdf

Online Documentation

Documentation Site: https://cataract-lmm.readthedocs.io
API Reference: https://cataract-lmm.readthedocs.io/api/
Tutorials: https://cataract-lmm.readthedocs.io/tutorials/
Model Zoo: https://cataract-lmm.readthedocs.io/models/

🤝 Contributing

We welcome contributions from the surgical AI community! Please see our CONTRIBUTING.md for detailed guidelines.

Quick Contribution Guide

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit changes (git commit -m 'Add amazing feature')
Push to branch (git push origin feature/amazing-feature)
Open a Pull Request

Development Workflow

# Setup development environment
make dev-setup

# Run pre-commit checks
pre-commit run --all-files

# Run tests before committing
make test-all

# Submit pull request
gh pr create --title "Feature: Add amazing feature"

Code Standards

Python Style: Black formatter
Import Sorting: isort
Linting: Flake8 with medical AI conventions
Type Checking: MyPy for type safety
Documentation: Google style docstrings

📄 License

This repository is governed by specific licensing terms to ensure the proper use of both the software framework and the surgical dataset.

Dataset License (CC BY-NC-ND 4.0)

The Cataract-LMM dataset is proudly open-access and is officially licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

⚠️ Usage Restrictions: This license permits any non-commercial use, sharing, distribution, and reproduction in any medium or format, provided you give appropriate credit to the original authors. You do not have permission to use this material for commercial purposes, nor may you distribute adapted or derivative material based on this dataset.

See DATA_LICENSE.md for comprehensive details.

Framework Code License (CC BY-NC-ND 4.0)

The software framework, scripts, and codebase associated with this project are licensed under the CC BY-NC-ND 4.0 License. See the LICENSE file for details.

📣 Citation

🎉 Update: Our manuscript is officially published!

The comprehensive methodology, algorithmic baselines, and technical validations of this dataset have been published in Scientific Data (Nature Portfolio). Our earlier preprint remains available on arXiv (arXiv:2510.16371), but we kindly request that any research or systems using this dataset cite the final peer-reviewed journal version.


Please use your preferred format from the options below to cite our work:

📌 APA (7th ed.)

Ahmadi, M. J., Gandomi, I., Abdi, P., Mohammadi, S.-F., Taslimi, A., Khodaparast, M., Hashemi, H., Tavakoli, M., & Taghirad, H. D. (2026). Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis. Scientific Data. https://doi.org/10.1038/s41597-026-07464-0

📌 MLA (9th ed.)

Ahmadi, Mohammad Javad, et al. "Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis." Scientific Data, 23 May 2026, https://doi.org/10.1038/s41597-026-07464-0.

📌 Chicago/Turabian

Ahmadi, Mohammad Javad, Iman Gandomi, Parisa Abdi, Seyed-Farzad Mohammadi, Amirhossein Taslimi, Mehdi Khodaparast, Hassan Hashemi, Mahdi Tavakoli, and Hamid D. Taghirad. "Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis." Scientific Data (2026). https://doi.org/10.1038/s41597-026-07464-0.

📌 Harvard

Ahmadi, M.J., Gandomi, I., Abdi, P., Mohammadi, S.F., Taslimi, A., Khodaparast, M., Hashemi, H., Tavakoli, M. and Taghirad, H.D., 2026. Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis. Scientific Data. Available at: https://doi.org/10.1038/s41597-026-07464-0.

📌 Vancouver

Ahmadi MJ, Gandomi I, Abdi P, Mohammadi SF, Taslimi A, Khodaparast M, Hashemi H, Tavakoli M, Taghirad HD. Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis. Scientific Data. 2026 May 23. doi: 10.1038/s41597-026-07464-0.

📌 IEEE

M. J. Ahmadi et al., "Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis," Scientific Data, May 2026, doi: 10.1038/s41597-026-07464-0.

📌 BibTeX

@article{Ahmadi2026CataractLMM,
  title={Cataract-LMM Large-Scale Multi-Source Multi-Task Benchmark for Deep Learning in Surgical Video Analysis},
  author={Ahmadi, Mohammad Javad and Gandomi, Iman and Abdi, Parisa and Mohammadi, Seyed-Farzad and Taslimi, Amirhossein and Khodaparast, Mehdi and Hashemi, Hassan and Tavakoli, Mahdi and Taghirad, Hamid D.},
  journal={Scientific Data},
  year={2026},
  month={May},
  doi={10.1038/s41597-026-07464-0},
  url={https://doi.org/10.1038/s41597-026-07464-0}
}

📬 Contact & Connect

Mohammad Javad Ahmadi

I welcome collaborations, technical inquiries regarding the dataset, and discussions on advancing AI in medical applications. Feel free to connect with me through any of the channels below:

📧 Academic Email: mjahmadi@email.kntu.ac.ir
📧 Personal Email: mjahmadee@gmail.com

📞 Support & Community

Getting Help

📖 Documentation: Refer to individual README files in each module
🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions
📧 Email: mjahmadee@gmail.com

🚀 Roadmap

Current Version (v1.0.0)

✅ Multi-task surgical video analysis framework
✅ Instance segmentation with YOLO/Mask R-CNN/SAM
✅ Phase recognition with Video Transformers
✅ Skill assessment framework
✅ Production-ready CI/CD pipeline

Upcoming Features (v1.1.0)

🔄 Real-time inference optimization
🔄 Multi-GPU distributed training
🔄 Model quantization and pruning
🔄 REST API and web interface
🔄 Advanced analytics dashboard

Future Vision (v2.0.0+)

🔮 Multi-modal learning (video + audio + sensor data)
🔮 Federated learning across institutions
🔮 Real-time surgical guidance system
🔮 Integration with surgical robots
🔮 Multi-language support

🏥 Advancing Surgical AI Through Open Science 🤖

Built with ❤️ by the Surgical AI Research Community
Empowering the next generation of computer-assisted surgery
