Computer Vision Image Segmentation Project

Overview

This project explores various image segmentation techniques on the Oxford-IIIT Pet Dataset. The goal is to implement and compare different segmentation approaches, ranging from traditional methods to deep learning models, while investigating how pre-training and feature extraction strategies impact segmentation performance.

Project Structure

📂 config: Configuration parameters
📂 data: Dataset loaders and augmentation utilities
📂 scripts: Scripts for preprocessing and running experiments
📂 utils: Utility functions for dataset processing
📂 Dataset: Original dataset

Generated Directories (not in version control)

📂 Processed_Dataset: Standardized dataset (generated via scripts)
📂 Augmented_Dataset: Dataset with augmentations (generated via scripts)

Environment Setup

This project requires several dependencies that can be installed using the provided environment file. Follow these steps to set up the environment:

conda env create -f environment.yml
conda activate cv-seg

Reproducibility

For consistent results, all data preprocessing uses a fixed random seed (RANDOM_SEED=42) so that:

Train/validation splits are consistent.
Data augmentations are reproducible.
Models are initialized with the same weights.

All generated datasets are created within the repository directory structure.
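
As a rough illustration of the seeding described above, the sketch below seeds the generators a pipeline like this typically relies on. The helper name set_seed is hypothetical and is not taken from the repository; only the value 42 comes from this README. NumPy and PyTorch are seeded only if they are installed.

```python
import random

RANDOM_SEED = 42  # value stated in this README


def set_seed(seed: int = RANDOM_SEED) -> None:
    """Seed the random number generators that are available (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # legacy global NumPy generator
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)  # affects weight initialization and shuffling
    except ImportError:
        pass


set_seed()
first = random.random()
set_seed()
second = random.random()
print(first == second)  # the same seed reproduces the same draw
```

Calling such a helper once at the start of each script keeps splits and augmentations repeatable between runs. Full determinism on a GPU may need additional framework settings that are outside the scope of this sketch.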

Task 1: Dataset Preprocessing and Augmentation

Goal

Prepare the Oxford-IIIT Pet Dataset for image segmentation by:

Standardizing dimensions.
Implementing data augmentation techniques to improve model training performance.

Dataset Overview

The original dataset contains images of cats and dogs with corresponding segmentation masks. The masks include:

Class 0: Background pixels.
Class 1: Cat pixels.
Class 2: Dog pixels.
Class 3: Boundary/outline pixels.
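
To make the label scheme concrete, here is a small sketch that counts pixels per class in a mask. It assumes masks are integer NumPy arrays that use the four class indices listed above. The function name class_pixel_counts and the toy mask are illustrative only.

```python
import numpy as np

# Class indices as listed in this README.
CLASS_NAMES = {0: "background", 1: "cat", 2: "dog", 3: "boundary"}


def class_pixel_counts(mask: np.ndarray) -> dict:
    """Return the number of pixels per class in an integer mask."""
    counts = np.bincount(mask.ravel(), minlength=len(CLASS_NAMES))
    return {name: int(counts[idx]) for idx, name in CLASS_NAMES.items()}


# Toy 4x4 mask: a 2x2 "cat" region and a single boundary pixel.
toy_mask = np.zeros((4, 4), dtype=np.uint8)
toy_mask[1:3, 1:3] = 1
toy_mask[0, 0] = 3
print(class_pixel_counts(toy_mask))
```

Counts like these are a quick check for class imbalance, for example how few pixels fall in the boundary class compared with the background.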

Implementation

1. Dataset Standardization

We standardized the dataset using the following steps:

Resize Images: All images and masks are resized to 256×256 pixels.
Train/Val/Test Split: The dataset is split into train, validation, and test sets. The training data is divided 80/20:
- Train (80%)
- Validation (20%)
- Test (held out separately)
Mask Processing: Nearest-neighbor interpolation is used to preserve label information during resizing.

The implementation is available in preprocessing.py, which provides:

standardize_dataset(): Handles image/mask resizing and splitting.
get_train_val_split(): Creates training/validation splits using scikit-learn.
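
The two ideas above, label-preserving resizing and a seeded 80/20 split, can be sketched as follows. This is not the code in preprocessing.py. The names resize_nearest and split_train_val are hypothetical, and the split uses a NumPy permutation rather than scikit-learn so that the example has a single dependency.

```python
import numpy as np


def resize_nearest(mask: np.ndarray, size=(256, 256)) -> np.ndarray:
    """Resize a label mask by nearest-neighbor sampling (no new label values)."""
    h, w = mask.shape[:2]
    new_h, new_w = size
    # For every output row/column, pick the source row/column it maps onto.
    rows = (np.arange(new_h) * h) // new_h
    cols = (np.arange(new_w) * w) // new_w
    return mask[rows[:, None], cols]


def split_train_val(items, val_fraction=0.2, seed=42):
    """Shuffle items with a fixed seed and split them into train/validation lists."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(items))
    n_val = int(round(len(items) * val_fraction))
    val = [items[i] for i in order[:n_val]]
    train = [items[i] for i in order[n_val:]]
    return train, val


# Toy mask with background (0) on the left and dog (2) on the right.
small = np.zeros((4, 6), dtype=np.uint8)
small[:, 3:] = 2
resized = resize_nearest(small, size=(8, 12))
print(resized.shape, np.unique(resized))

train_ids, val_ids = split_train_val(list(range(10)))
print(len(train_ids), len(val_ids))
```

Nearest-neighbor sampling matters because masks hold class indices, not intensities. An averaging filter at a background/dog edge (0 and 2) would produce the value 1, which this label scheme reads as "cat".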

2. Data Augmentation

To improve model generalization, we implemented augmentation techniques using the Albumentations library:

Geometric Transformations (applied to both images and masks):
- Random flips (horizontal and vertical).
- Random 90° rotations.
- Elastic transformations.
- Grid distortions.
- Random cropping.
Pixel-Level Transformations (applied to images only):
- Brightness/contrast adjustments.
- Gaussian noise addition.

For each training image, three augmented variants are generated. Together with the original image, this makes the training set four times its original size.

The augmentation functionality is provided in augmentation.py:

get_training_augmentation(): Returns the augmentation pipeline for training data.
get_validation_augmentation(): Provides minimal processing for validation data.
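
The distinction between geometric and pixel-level transformations can be illustrated without any augmentation library. The sketch below is plain NumPy, not the Albumentations pipeline in augmentation.py. It covers only flips, 90° rotations, brightness/contrast, and Gaussian noise, and the names augment_pair and expand_training_set as well as the parameter ranges are assumptions chosen for illustration.

```python
import numpy as np


def augment_pair(image: np.ndarray, mask: np.ndarray, rng: np.random.Generator):
    """Apply one random augmentation to an (H, W, 3) image and its (H, W) mask."""
    # Geometric transformations: applied identically to image and mask.
    if rng.random() < 0.5:  # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:  # vertical flip
        image, mask = image[::-1], mask[::-1]
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image, mask = np.rot90(image, k), np.rot90(mask, k)

    # Pixel-level transformations: applied to the image only.
    gain = rng.uniform(0.8, 1.2)    # contrast factor (assumed range)
    bias = rng.uniform(-20.0, 20.0)  # brightness shift (assumed range)
    noise = rng.normal(0.0, 5.0, size=image.shape)  # Gaussian noise
    image = np.clip(image.astype(np.float32) * gain + bias + noise, 0, 255)
    return image.astype(np.uint8), np.ascontiguousarray(mask)


def expand_training_set(pairs, variants_per_image=3, seed=42):
    """Return the original pairs plus a fixed number of augmented variants each."""
    rng = np.random.default_rng(seed)
    expanded = list(pairs)
    for image, mask in pairs:
        for _ in range(variants_per_image):
            expanded.append(augment_pair(image, mask, rng))
    return expanded
```

With two input pairs and three variants each, expand_training_set returns eight pairs. Because the mask only ever receives flips and rotations, its label counts are unchanged, while the image additionally receives the brightness, contrast, and noise changes.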

Usage

To preprocess the dataset, run the following script:

python scripts/prepare_dataset.py

This pipeline:

Creates a standardized dataset with consistent dimensions.
Generates augmented versions of all training images.
Displays dataset statistics.

Data Loading

For model training, we provide the PetSegmentationDataset class, which integrates seamlessly with PyTorch for efficient data loading.
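
The sketch below shows what a map-style segmentation dataset generally looks like. It is not the PetSegmentationDataset class from this repository: the class name InMemorySegmentationDataset, its constructor arguments, and the in-memory storage are assumptions for illustration. PyTorch's DataLoader accepts objects that implement __len__ and __getitem__, so a class of this shape could be batched by it. A real implementation would usually subclass torch.utils.data.Dataset and read files from disk.

```python
import numpy as np


class InMemorySegmentationDataset:
    """Hypothetical map-style dataset holding images and masks in memory."""

    def __init__(self, images, masks, transform=None):
        if len(images) != len(masks):
            raise ValueError("images and masks must have the same length")
        self.images = images
        self.masks = masks
        self.transform = transform  # optional callable: (image, mask) -> (image, mask)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image, mask = self.images[idx], self.masks[idx]
        if self.transform is not None:
            image, mask = self.transform(image, mask)
        # Channels-first float image in [0, 1]; integer class indices for the mask.
        image = np.ascontiguousarray(image.transpose(2, 0, 1), dtype=np.float32) / 255.0
        mask = np.ascontiguousarray(mask, dtype=np.int64)
        return image, mask


# Three toy 8x8 samples.
images = [np.full((8, 8, 3), 128, dtype=np.uint8) for _ in range(3)]
masks = [np.zeros((8, 8), dtype=np.uint8) for _ in range(3)]
dataset = InMemorySegmentationDataset(images, masks)
image, mask = dataset[0]
print(len(dataset), image.shape, mask.shape)
```

Returning the image as a channels-first float array and the mask as 64-bit integer class indices matches what common PyTorch segmentation losses such as cross-entropy expect after conversion to tensors.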
