A deep learning project for automatically removing backgrounds from pet images. It includes a web application that supports transparent cutout generation, model comparison, and interactive background swapping.
This is a two-member collaborative project with clearly distinct contributions:
- Focus: Real-time, lightweight matting optimized for speed and efficiency
- Model Architecture: MODNet (Mobile Optimized Deep Network)
- Code Location: `code/modnet/`
- Model Weights: `models/modnet/modnet_pet_matting.keras`
- Benchmark Performance: IoU 0.7237, Dice 0.8290, MAE 0.0807
- Focus: High-quality, detail-oriented matting for complex fur and hair edges
- Model Architecture: FBA (Foreground-Background-Alpha) Matting with U-Net segmentation
- Code Location: `code/fba/`
- Model Weights:
  - FBA Refinement: `models/fba/fba_pet_final.pth`
  - U-Net Segmentation: `models/fba/unet/pet_unet_improved_final.keras`
- Benchmark Performance: IoU 0.7500, Dice 0.8462, MAE 0.0869
- Methodology:
  - Pseudo-Label Generation: Utilized improved U-Net predictions to generate soft alpha mattes for training.
  - Trimap Generation: Dynamic trimap generation with random unknown region width to handle varying uncertainty.
  - FBA Refinement: Fine-tuned FBA model with ResNet-50 backbone using the generated pseudo-labels.
- Novelty:
  - Laplacian Pyramid Loss: Introduced a multi-scale loss function to capture high-frequency fur details.
  - Domain Adaptation: Fine-tuned a general matting model specifically for pet imagery.
  - Self-Training: Leveraged self-generated pseudo-labels to bridge the gap between coarse masks and soft mattes.
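
The Laplacian pyramid loss mentioned above compares prediction and target at several scales, so that errors in fine detail (such as fur strands) are penalized separately from errors in coarse shape. The following is a minimal NumPy sketch of the idea, not the project's training code: the 2x2 average pooling (in place of a Gaussian blur), the nearest-neighbour upsampling, the `2 ** i` level weights, and all function names are assumptions made for illustration.

```python
import numpy as np

def downsample(x):
    """Halve resolution with 2x2 average pooling (assumed stand-in for blur + subsample)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x, shape):
    """Nearest-neighbour upsample, edge-padded to `shape` when a dimension is odd."""
    up = x.repeat(2, axis=0).repeat(2, axis=1)
    pad_h, pad_w = shape[0] - up.shape[0], shape[1] - up.shape[1]
    return np.pad(up, ((0, pad_h), (0, pad_w)), mode="edge")

def laplacian_pyramid(x, levels):
    """Split an image into band-pass detail layers plus a low-frequency residual."""
    pyramid = []
    current = np.asarray(x, dtype=np.float64)
    for _ in range(levels - 1):
        smaller = downsample(current)
        pyramid.append(current - upsample(smaller, current.shape))  # detail layer
        current = smaller
    pyramid.append(current)  # coarsest level
    return pyramid

def laplacian_pyramid_loss(pred, target, levels=4):
    """Weighted sum of per-level mean absolute differences (weights are an assumption)."""
    pred_pyr = laplacian_pyramid(pred, levels)
    target_pyr = laplacian_pyramid(target, levels)
    return float(sum(
        (2 ** i) * np.abs(p - t).mean()
        for i, (p, t) in enumerate(zip(pred_pyr, target_pyr))
    ))
```

The inputs are 2-D alpha mattes whose sides are at least `2 ** (levels - 1)` pixels. In an actual PyTorch training loop the same structure would be written with differentiable tensor operations.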
- 🎨 Transparent Cutout Generator
  - Upload a pet image and generate a high-quality PNG with transparent background
  - Choose between MODNet or FBA model
  - View alpha matte and processing pipeline visualization
  - Download results instantly
- ⚖️ Model Comparison
  - Side-by-side comparison of MODNet and FBA results
  - Visual and quantitative performance analysis
  - Automatic winner determination based on metrics
- 🖼️ Interactive Background Swap
  - Drag-and-drop pet positioning
  - Scroll to resize foreground
  - Real-time preview with transparent checkered background
  - Export high-resolution composites
Main Flask server with three API endpoints:
- `/api/cutout`: Generates transparent cutouts using selected model (MODNet/FBA)
- `/api/compare`: Side-by-side comparison of both models
- `/api/composite`: Creates foreground + background composites
- Handles model loading, image processing, and metric calculation
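
The cutout and composite endpoints both come down to simple per-pixel arithmetic on the predicted alpha matte. The sketch below shows the standard compositing formula, C = alpha * F + (1 - alpha) * B, and how an alpha matte can be attached as a fourth channel to form a transparent cutout. The function names and array layouts are assumptions for illustration and are not taken from the server code.

```python
import numpy as np

def composite(foreground, background, alpha):
    """Blend foreground over background: C = alpha * F + (1 - alpha) * B.

    foreground, background: uint8 arrays of shape (H, W, 3)
    alpha: float array of shape (H, W) with values in [0, 1]
    """
    fg = foreground.astype(np.float32)
    bg = background.astype(np.float32)
    a = np.clip(alpha, 0.0, 1.0).astype(np.float32)[..., None]  # broadcast over RGB
    return np.rint(a * fg + (1.0 - a) * bg).astype(np.uint8)

def to_rgba_cutout(image, alpha):
    """Attach the alpha matte as a fourth channel to get an RGBA cutout."""
    a8 = np.rint(np.clip(alpha, 0.0, 1.0) * 255).astype(np.uint8)
    return np.dstack([image, a8])
```

An RGBA array produced this way can be saved as a transparent PNG with Pillow, for example `Image.fromarray(rgba, mode="RGBA").save("cutout.png")`.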
FBA Pipeline Integration:
- `SimpleFBAPipeline`: Combines U-Net segmentation with FBA refinement
- `load_fba_pipeline()`: Loads model weights and initializes pipeline
- `predict_fba_pipeline()`: Runs inference on input images
- Handles trimap generation and alpha matte refinement
Main UI Template:
- Three-tab interface (Cutout, Compare, Composite)
- Model selector radio buttons for MODNet/FBA
- Drag-and-drop file upload zones
- Interactive composite editor with positioning controls
- Real-time metrics display
Frontend Logic:
- File upload and preview handling
- API calls to backend (`generateCutout()`, `compareModels()`, `startCompositeEditor()`)
- Interactive drag-and-drop for composite editor
- Dynamic metric updates and UI state management
UI Styling:
- Modern dark theme with gradient accents
- Custom radio buttons for model selection
- Animated transitions and hover effects
- Responsive design for mobile compatibility
MODNet Model Definition:
- `MODNet`: MobileNetV2-based encoder-decoder architecture
- `create_modnet()`: Factory function with customizable parameters
- Optimized for real-time inference
Data Processing:
- `prepare_modnet_training_data()`: Prepares Oxford-IIIT Pet dataset
- `create_training_dataset()`: Applies augmentation and batching
- Handles segmentation mask preprocessing
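
As an illustration of the mask preprocessing step: the Oxford-IIIT Pet annotations label each pixel as 1 (pet), 2 (background), or 3 (unclassified border). A minimal sketch of turning such an annotation into a training target is shown below. The function name and the choice to map border pixels to 0.5 are assumptions; the project's own preprocessing may treat the border differently.

```python
import numpy as np

def annotation_to_alpha(annotation, border_value=0.5):
    """Map Oxford-IIIT Pet labels (1 = pet, 2 = background, 3 = border) to [0, 1].

    border_value is an assumed choice for the uncertain border region.
    """
    ann = np.asarray(annotation)
    alpha = np.zeros(ann.shape, dtype=np.float32)  # background stays 0
    alpha[ann == 1] = 1.0
    alpha[ann == 3] = border_value
    return alpha
```

Setting `border_value=1.0` or `0.0` instead yields a hard binary mask that counts the border as pet or as background, respectively.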
Main Training Script:
- Full training pipeline with callbacks and logging
- Custom loss functions (BCE + L1 + Gradient Loss)
- Visualization of training progress
- Saves model checkpoints and final weights
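
To make the combined loss concrete, here is a minimal NumPy sketch of a BCE + L1 + gradient loss on a single alpha matte. It only illustrates the arithmetic: the equal default weights, the finite-difference form of the gradient term, and the function names are assumptions, and the actual training script would express the same terms with TensorFlow operations so they are differentiable.

```python
import numpy as np

def bce_loss(pred, target, eps=1e-7):
    """Binary cross-entropy, with predictions clipped away from 0 and 1."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(-(target * np.log(p) + (1.0 - target) * np.log(1.0 - p)).mean())

def l1_loss(pred, target):
    """Mean absolute difference between prediction and target."""
    return float(np.abs(pred - target).mean())

def gradient_loss(pred, target):
    """L1 distance between vertical and horizontal finite differences."""
    dy = np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0)).mean()
    dx = np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1)).mean()
    return float(dx + dy)

def combined_loss(pred, target, w_bce=1.0, w_l1=1.0, w_grad=1.0):
    """Weighted sum of the three terms (weights are assumed, not the project's)."""
    return (w_bce * bce_loss(pred, target)
            + w_l1 * l1_loss(pred, target)
            + w_grad * gradient_loss(pred, target))
```

The gradient term is what rewards sharp, correctly placed edges: a blurry prediction can have a low L1 error while still differing strongly from the target in its local differences.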
Model Evaluation:
- IoU, Dice, MAE metric calculations
- Batch evaluation on test datasets
- Performance visualization
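
For reference, the three reported metrics can be computed as in the following sketch. It assumes alpha values in [0, 1] and a 0.5 threshold for binarizing masks before computing IoU and Dice; the threshold, the smoothing constant, and the function name are assumptions rather than the evaluation script's actual settings.

```python
import numpy as np

def matting_metrics(pred_alpha, true_alpha, threshold=0.5, eps=1e-7):
    """Return IoU, Dice, and MAE for one predicted / ground-truth alpha pair."""
    pred = np.asarray(pred_alpha, dtype=np.float64)
    true = np.asarray(true_alpha, dtype=np.float64)

    # IoU and Dice are computed on binarized masks.
    pred_bin = pred >= threshold
    true_bin = true >= threshold
    intersection = np.logical_and(pred_bin, true_bin).sum()
    union = np.logical_or(pred_bin, true_bin).sum()

    iou = (intersection + eps) / (union + eps)
    dice = (2 * intersection + eps) / (pred_bin.sum() + true_bin.sum() + eps)

    # MAE is computed on the continuous alpha values; lower is better.
    mae = np.abs(pred - true).mean()
    return {"iou": float(iou), "dice": float(dice), "mae": float(mae)}
```

Note that IoU and Dice are higher-is-better while MAE is lower-is-better, so a model can lead on the overlap metrics and still trail on MAE.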
- `train_modnet_quick.py`: Fast prototyping training script
- `modnet_inference.py`: Standalone inference utilities
- `pipeline.py`: Complete matting pipeline
- `trimap_generation.py`: Trimap creation from masks
- `generate_cutouts.py`: Batch processing script
- `visualize_matting_process.py`: Pipeline visualization
- `demo_cutouts.py`: Interactive demo
Simplified FBA Pipeline:
- Two-stage matting: U-Net → FBA refinement
- Trimap generation utilities
- Handles PyTorch model loading and inference
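
A trimap marks each pixel as definite foreground, definite background, or unknown, and the refinement stage only needs to resolve the unknown band. The sketch below shows one common way to derive a trimap from a binary segmentation mask with a randomly chosen band width, using erosion and dilation from SciPy. The function name, the width range, and the 0/128/255 encoding are assumptions for illustration, not the project's actual utilities.

```python
import numpy as np
from scipy import ndimage

def make_trimap(mask, min_width=5, max_width=20, rng=None):
    """Build a trimap (0 = background, 128 = unknown, 255 = foreground)."""
    rng = np.random.default_rng() if rng is None else rng
    mask = np.asarray(mask) > 0
    # Random band width, inclusive of both bounds.
    width = int(rng.integers(min_width, max_width + 1))

    # Erosion keeps only confident foreground; dilation bounds possible foreground.
    sure_fg = ndimage.binary_erosion(mask, iterations=width)
    maybe_fg = ndimage.binary_dilation(mask, iterations=width)

    trimap = np.zeros(mask.shape, dtype=np.uint8)
    trimap[maybe_fg] = 128   # unknown band around the mask boundary
    trimap[sure_fg] = 255    # definite foreground overrides the band
    return trimap
```

Passing a seeded generator, for example `make_trimap(mask, rng=np.random.default_rng(0))`, makes the band width reproducible.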
FBA Model Architecture:
- `MattingModule`: Main FBA matting network
- `fba_encoder()`: ResNet-based feature encoder
- `fba_decoder()`: Pyramid pooling decoder
- `fba_fusion()`: Foreground-background-alpha fusion
- `networks/resnet_GN_WS.py`: ResNet with Group Normalization and Weight Standardization
- `networks/layers_WS.py`: Custom weight-standardized layers
- `dataloader.py`: Data loading utilities for training
- Install dependencies: `pip install tensorflow torch pillow flask numpy scipy`
- Navigate to webapp directory: `cd webapp`
- Start the Flask server: `python3 app.py`
- Open in browser: http://localhost:4000
Cutout tab:
- Select model (MODNet or FBA)
- Upload a pet image
- Click "Generate Cutout"
- Download the result

Compare tab:
- Upload an image
- Click "Compare Models"
- View side-by-side results

Composite tab:
- Select model (MODNet or FBA)
- Upload pet image and background
- Click "Open Editor"
- Drag to position, scroll to resize
- Download composite
| Model | IoU | Dice | MAE | Strength |
|---|---|---|---|---|
| MODNet | 0.7237 | 0.8290 | 0.0807 | ⚡ Fast |
| FBA | 0.7500 | 0.8462 | 0.0869 | 🎯 Accurate |
Benchmarks on Oxford-IIIT Pet Test Set
Trained on Oxford-IIIT Pet Dataset:
- 37 pet categories
- ~7,000 images with segmentation masks
- Diverse backgrounds and poses
This project is for educational purposes.
- Vikhas: MODNet implementation and training
- Lalitha: FBA integration and pipeline development