CVops - Real-Time Computer Vision Pipeline

A distributed, real-time computer vision pipeline designed for edge-to-cloud video analytics. The system ingests video streams from IoT cameras (ESP32-CAM), processes frames through a series of microservices, performs YOLO object detection, and stores detection data for later analysis.

Sandbox Station

This repository serves as a sandbox station baseline for Edge AI Vision exploration. Attendees can tune inference parameters, sampling strategies, and preprocessing to explore real-time performance/accuracy trade-offs on edge hardware.

Station Overview: SANDBOX_STATION.md
Decision Log: sandbox/LEDGER.md

Documentation

System Architecture - Complete architecture overview, components, and data flow
Quick Start Guide - Get up and running in 5 minutes
GPU Setup - Hardware acceleration configuration
Observability Guide - Prometheus and Grafana setup
Performance Tuning - CPU optimization strategies
V2 Proposals - Future enhancement plans

Quick Start

Prerequisites

Docker & Docker Compose
ESP32-CAM or IP camera with HTTP stream

1. Clone and Configure

git clone https://github.com/Akin-ctrl/CVops.git
cd CVops

# Create .env from the provided template, then set your camera URL
cp .env.example .env
nano .env

2. Build and Start

# Build all services
docker compose build

# Start all services in the background
docker compose up -d

3. Access Web Interfaces

Service	URL	Credentials
Grafana Dashboard	http://localhost:3000	admin / admin
Prometheus	http://localhost:9090	(none)
Preprocessed Frames	http://localhost:5000	(none)
Detection Results	http://localhost:7000	(none)
Kafka Management	http://localhost:19000	(none)
MinIO Console	http://localhost:9001	minioadmin / minioadmin

4. Common Operations

# View logs
docker compose logs -f

# Restart a service
docker compose restart yolo_inference

# Stop all services
docker compose down

# Full cleanup (removes volumes)
docker compose down -v

For detailed setup instructions, see the Quick Start Guide.

Technology Stack

Layer	Technology
Language	Python 3.12
ML Framework	Ultralytics YOLO11
Message Broker	Apache Kafka (Confluent 7.5.0)
Object Storage	MinIO (S3-compatible)
Web Framework	Flask
Computer Vision	OpenCV
Monitoring	Prometheus + Grafana
Containerization	Docker, Docker Compose

Configuration

Core settings in .env:

# Camera
URL=http://192.168.x.x:8080/stream

# Kafka
KAFKA_BROKER=kafka:9092

# YOLO
YOLO_INPUT_SIZE_WIDTH=640
YOLO_INPUT_SIZE_HEIGHT=640
MODEL_WEIGHTS_PATH=yolo11n.pt

# MinIO
MINIO_HOST=minio:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin
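
As an illustration of how a service might consume these settings, here is a minimal sketch that reads the keys listed above into a typed object. The `PipelineConfig` class and `load_config` function are hypothetical names, not part of the repository, and the defaults simply mirror the sample values shown here.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical container for the .env keys listed above."""
    camera_url: str
    kafka_broker: str
    yolo_width: int
    yolo_height: int
    model_weights_path: str
    minio_host: str


def load_config(env=None) -> PipelineConfig:
    """Build a config from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    return PipelineConfig(
        camera_url=env["URL"],  # required: there is no sensible default camera
        kafka_broker=env.get("KAFKA_BROKER", "kafka:9092"),
        yolo_width=int(env.get("YOLO_INPUT_SIZE_WIDTH", "640")),
        yolo_height=int(env.get("YOLO_INPUT_SIZE_HEIGHT", "640")),
        model_weights_path=env.get("MODEL_WEIGHTS_PATH", "yolo11n.pt"),
        minio_host=env.get("MINIO_HOST", "minio:9000"),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to test, and a missing `URL` fails fast with a `KeyError` rather than starting a service with no camera.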

For advanced configuration and tuning, see the guides listed under Additional Resources below.

License

MIT License - See LICENSE file for details.

Additional Resources

System Architecture: doc/architecture/README.md
GPU Acceleration: doc/guides/GPU_SETUP.md
Monitoring Setup: doc/guides/OBSERVABILITY.md
CPU Optimization: doc/guides/PERFORMANCE_TUNING.md
Future Plans: doc/proposals/

Observability & Metrics

Port	Service	Purpose
3000	Grafana	Dashboards & visualization
9090	Prometheus	Metrics database
8000	Producer Metrics	Prometheus scrape endpoint
8001	Preprocessor Metrics	Prometheus scrape endpoint
8002	YOLO Metrics	Prometheus scrape endpoint
8003	MinIO Writer Metrics	Prometheus scrape endpoint
8004	Viewer Metrics	Prometheus scrape endpoint
8005	Detector Viewer Metrics	Prometheus scrape endpoint

Performance Optimizations

The system includes several optimizations for real-time processing:

Frame Skipping: Preprocessor and YOLO inference keep only the latest frame to prevent queue buildup and lag
Background Thread: FrameGrabber daemon decouples Kafka polling from inference processing
Batched Kafka Flush: Producer flushes every 100ms instead of per-frame
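Reduced Input Size: Running YOLO at 320×320 instead of 640×640 gives roughly a 2x speedup; the size is set with YOLO_INPUT_SIZE_WIDTH and YOLO_INPUT_SIZE_HEIGHT (640 in the sample .env above)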
Detection-Only Mode: Object tracking is disabled by default, since plain detection is faster
LZ4 Compression: Kafka producer uses LZ4 for fast message compression
Offline Wheel Files: Pre-downloaded pip wheels for air-gapped deployments
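
The frame-skipping and background-thread items above can be sketched together as a single-slot buffer fed by a daemon thread. This is a simplified illustration and not the repository's FrameGrabber: the class name `LatestFrameGrabber` and the `poll` callable (which stands in for a Kafka consumer poll) are assumptions.

```python
import threading
import time


class LatestFrameGrabber:
    """Keeps only the newest item from a source; unread items are overwritten."""

    def __init__(self, poll):
        # `poll` is a callable returning the next frame, or None when nothing is ready.
        self._poll = poll
        self._latest = None
        self._dropped = 0
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()
        return self

    def _run(self):
        while not self._stop.is_set():
            frame = self._poll()
            if frame is None:
                time.sleep(0.001)  # back off briefly when the source is idle
                continue
            with self._lock:
                if self._latest is not None:
                    self._dropped += 1  # an unread frame is being replaced
                self._latest = frame

    def take(self):
        """Return the newest frame and clear the slot, or None if nothing new arrived."""
        with self._lock:
            frame, self._latest = self._latest, None
            return frame

    def stop(self):
        self._stop.set()
        self._thread.join(timeout=1.0)

    @property
    def dropped(self):
        with self._lock:
            return self._dropped
```

The inference loop would call `take()` whenever it is ready for more work. Because the slot holds one frame, a slow consumer sees the most recent frame instead of working through a growing backlog, and the `dropped` counter shows how many frames were skipped.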

Observability & Monitoring

CoRVision uses Prometheus for metrics collection and Grafana for visualization, giving real-time monitoring of the whole pipeline from ingestion to storage.

What's Included

Prometheus - Metrics collection and time-series database
Grafana - Dashboards, including a pre-configured CoRVision Overview
Real-time Metrics - FPS, latency, detections, errors, and more
Service Health Monitoring - Instant visibility into service status
Automated Setup - Pre-provisioned datasources and dashboards

Quick Access

After starting the system with docker compose up -d, access:

Service	URL	Credentials
Grafana Dashboard	http://localhost:3000	admin / admin
Prometheus	http://localhost:9090	(none)

First-time Grafana setup:

Login with admin / admin
Change password when prompted
Navigate to Dashboards → CoRVision Overview
View real-time metrics across all services

Dashboard Panels

The Overview dashboard provides:

Processing FPS by Service - Real-time frames per second for each microservice
Processing Latency - Gauge showing current latency with color-coded thresholds
Total Detections by Class - Time series of detected objects (person, car, etc.)
Service Health - Status indicators showing which services are up/down
Kafka Message Throughput - Messages consumed/produced per second
Error Rate - Errors per service over time, grouped by error type

Observability Architecture

┌──────────────────────────────────────────────────────────────┐
│                     CoRVision Services                       │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
│  │ Producer │  │Preprocess│  │   YOLO   │  │  MinIO   │      │
│  │  :8000   │  │  :8001   │  │  :8002   │  │ Writer   │      │
│  │          │  │          │  │          │  │  :8003   │      │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘      │
│       │             │             │             │            │
│       └─────────────┴─────────────┴─────────────┘            │
│                         │ Metrics (HTTP)                     │
│                         ▼                                    │
│              ┌─────────────────────┐                         │
│              │    Prometheus       │                         │
│              │      :9090          │                         │
│              └──────────┬──────────┘                         │
│                         │ PromQL                             │
│                         ▼                                    │
│              ┌─────────────────────┐                         │
│              │      Grafana        │                         │
│              │       :3000         │                         │
│              └─────────────────────┘                         │
└──────────────────────────────────────────────────────────────┘

Metrics Collected

System-Wide Metrics

corvision_frames_processed_total - Total frames processed by each service
corvision_processing_latency_ms - Processing latency in milliseconds
corvision_kafka_messages_consumed_total - Messages consumed per topic
corvision_kafka_messages_produced_total - Messages produced per topic
corvision_service_up - Service health status (1=up, 0=down)
corvision_errors_total - Total errors by type and service

YOLO Inference Metrics

corvision_detections_total{class_name} - Total detections per class (person, car, etc.)
corvision_detection_confidence{class_name} - Confidence score histogram
corvision_inference_fps - Real-time inference frames per second

MinIO Writer Metrics

corvision_minio_batches_written_total - Total batches written to storage
corvision_minio_records_written_total - Total records written
corvision_minio_write_duration_seconds - Write operation duration

Verify Metrics Collection

# Check if metrics are being collected
curl http://localhost:8000/metrics  # Producer
curl http://localhost:8001/metrics  # Preprocessor
curl http://localhost:8002/metrics  # YOLO Inference
curl http://localhost:8003/metrics  # MinIO Writer

# Check Prometheus targets (all should show "UP")
# Open http://localhost:9090 → Status → Targets
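
The same check can be scripted. The sketch below fetches each `/metrics` endpoint with the standard library and lists which services respond; the port mapping is taken from the curl commands above, while the function names and the service labels in `ENDPOINTS` are illustrative.

```python
import urllib.request

# Ports taken from the curl commands above; the keys are labels for the report only.
ENDPOINTS = {
    "producer": 8000,
    "preprocessor": 8001,
    "yolo_inference": 8002,
    "minio_writer": 8003,
}


def metric_names(exposition_text):
    """Extract metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line looks like `name{labels} value` or `name value`.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def check_endpoint(port, host="localhost", timeout=3.0):
    """Return the metric names served on one port, or None if unreachable."""
    url = f"http://{host}:{port}/metrics"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return metric_names(resp.read().decode("utf-8"))
    except OSError:  # covers connection errors, timeouts, and HTTP errors
        return None


def report(endpoints=ENDPOINTS):
    """Print one status line per service."""
    for service, port in endpoints.items():
        names = check_endpoint(port)
        status = "unreachable" if names is None else f"{len(names)} metrics"
        print(f"{service:15s} :{port}  {status}")
```

Calling `report()` on the Docker host prints one line per service, which is quicker to scan than four separate curl outputs.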

Useful Prometheus Queries

# Current FPS by service
rate(corvision_frames_processed_total[1m])

# Average processing latency
avg(corvision_processing_latency_ms) by (service)

# Total detections in last hour
increase(corvision_detections_total[1h])

# Service health status
corvision_service_up == 0  # Shows down services

# Top detected classes
topk(5, sum by (class_name) (corvision_detections_total))
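
These queries can also be run from a script through Prometheus's HTTP API. The following is a minimal sketch that uses the instant-query endpoint `/api/v1/query`; the helper names are illustrative, and the base URL assumes the default port mapping shown earlier.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(expr, base="http://localhost:9090"):
    """URL for Prometheus's instant-query endpoint."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def parse_vector(payload):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(payload.get("error", "query failed"))
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample is {"metric": {labels}, "value": [timestamp, "value-as-string"]}.
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


def instant_query(expr, base="http://localhost:9090", timeout=5.0):
    """Run a PromQL expression and return the parsed samples."""
    with urllib.request.urlopen(build_query_url(expr, base), timeout=timeout) as resp:
        return parse_vector(json.load(resp))
```

For example, `instant_query("corvision_service_up == 0")` would return one entry per service currently reporting as down, or an empty list when everything is healthy.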

Metrics Endpoints

Each service exposes metrics at:

Service	Metrics Port	Endpoint
kafka-producer	8000	http://localhost:8000/metrics
preprocessor	8001	http://localhost:8001/metrics
yolo-inference	8002	http://localhost:8002/metrics
minio-writer	8003	http://localhost:8003/metrics
kafka-viewer	8004	http://localhost:8004/metrics
detector-viewer	8005	http://localhost:8005/metrics

Monitoring Best Practices

Key Metrics to Watch:

FPS - Ensures real-time processing capability
Latency - Detects performance degradation early
Error Rate - Catches failures before they cascade
Consumer Lag - Prevents Kafka queue buildup

Recommended Alerts:

Service downtime (immediate notification)
High latency (> 500ms for 5 minutes)
Error spikes (> 1 error/sec for 2 minutes)
Consumer lag (> 5000 messages for 5 minutes)
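
Each of the threshold alerts above has the same shape: a condition must hold continuously for a set duration before it fires. The sketch below illustrates that logic in isolation. The `SustainedThreshold` class is hypothetical, and in practice this behavior would normally be configured in Prometheus alerting rules instead of application code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SustainedThreshold:
    """Fires only after the value has stayed above `limit` for `hold_seconds`."""
    limit: float
    hold_seconds: float
    _breach_started: Optional[float] = None

    def update(self, value: float, now: float) -> bool:
        """Feed one sample taken at time `now` (seconds); return True if the alert fires."""
        if value <= self.limit:
            self._breach_started = None  # condition cleared: reset the timer
            return False
        if self._breach_started is None:
            self._breach_started = now  # first sample over the limit
        return now - self._breach_started >= self.hold_seconds
```

With `SustainedThreshold(limit=500, hold_seconds=300)`, a single latency spike above 500 ms does not fire, while latency that stays above 500 ms for five minutes does. Requiring the condition to persist is what keeps brief spikes from producing notifications.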

Troubleshooting

Services Won't Start

# Check logs
docker compose logs prometheus grafana

# Verify configuration
docker compose config

# Restart observability stack
docker compose restart prometheus grafana

No Data in Grafana

Wait 30 seconds for first Prometheus scrape
Check time range in Grafana (top-right) - set to "Last 5 minutes"
Verify Prometheus datasource: Configuration → Data Sources
Ensure all services are running: docker compose ps

Prometheus Targets Show "DOWN"

# Check if service is exposing metrics
curl http://localhost:8002/metrics

# Verify service is running
docker compose ps yolo_inference

# Check Prometheus configuration
docker compose exec prometheus cat /etc/prometheus/prometheus.yml

Common Issues

Problem: Low FPS

# Check which service is the bottleneck
sum by (service) (rate(corvision_frames_processed_total[1m]))

Problem: High Latency

# Identify slow services
corvision_processing_latency_ms > 500

Problem: Detection Quality Issues

# Check confidence distribution
histogram_quantile(0.5, sum by (le) (rate(corvision_detection_confidence_bucket[5m])))
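
To make the median-confidence query above easier to interpret, here is a simplified sketch of how a quantile is estimated from cumulative histogram buckets by linear interpolation. It illustrates the idea only and is not Prometheus's implementation; the bucket boundaries in the usage note are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate a quantile from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound and ending with (math.inf, total).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be in [0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations, so no quantile
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                return lower_bound  # cannot interpolate inside this bucket
            # Assume observations are spread evenly within the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan
```

Given buckets `[(0.25, 10), (0.5, 30), (0.75, 80), (1.0, 100), (math.inf, 100)]`, the median falls in the 0.5 to 0.75 bucket, so the estimate can be no more precise than that bucket's width. A median that drifts toward the lower buckets suggests the model is becoming less confident in its detections.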

Additional Resources

Full Documentation: Observability Guide - Complete guide with advanced queries, alerting, and best practices
Quick Reference: Quick Start Guide - 5-minute setup guide
