A desktop tool that uses OpenAI’s CLIP model to automatically classify and organize images into user-defined categories.
Images are assigned to one and only one category folder based on the highest-scoring CLIP similarity.
Includes a full Tkinter GUI for easy use.
- Organizes only images (.jpg, .jpeg, .png, .gif, .bmp, .tiff, .webp)
- Uses CLIP (openai/clip-vit-base-patch32) for semantic image understanding
- Images go to exactly one category
- Fully custom categories
  - Load from .txt
  - Or enter comma-separated list
- Dry-run mode for safe previews
- Move or Copy modes
- Progress bar + live console output
- Multi-threaded worker keeps UI responsive
- Saves CSV log (when not dry-run)
- Works on Windows / Linux / macOS
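As a rough illustration of the "images only" rule, the sketch below filters files by the extensions listed above. The function name `find_images` is hypothetical and not taken from the tool's source; the actual script may scan differently.

```python
from pathlib import Path

# Extensions copied from the feature list above.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tiff", ".webp"}


def find_images(root):
    """Recursively collect files under root whose extension marks them as images."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The extension check is case-insensitive, so `IMG_001.JPG` is picked up alongside `photo.jpg`.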
git clone https://github.com/<your-user>/<your-repo>.git
cd <your-repo>
python -m venv .venv
.\.venv\Scripts\Activate.ps1
If PowerShell blocks scripts:
Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
python organize_images_clip_gui.py
A window will appear with all available options.
You can define categories using either method:
Example categories file (one category per line):
People
Food
Screenshots
Documents
Pets
Landscapes
Receipts
Load it via Load from File in the GUI.
Alternatively, enter a comma-separated list directly:
People, Food, Screenshots, Pets
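Both input styles can be handled by one parser. The following is a minimal sketch, not the tool's actual code: `parse_categories` and `load_categories` are hypothetical names, and dropping blank entries and case-insensitive duplicates is an assumption about sensible behavior.

```python
def parse_categories(text):
    """Split on newlines and commas, trim whitespace, drop blanks and duplicates."""
    names = []
    seen = set()
    for chunk in text.replace(",", "\n").splitlines():
        name = chunk.strip()
        # Assumption: duplicates are compared case-insensitively.
        if name and name.lower() not in seen:
            seen.add(name.lower())
            names.append(name)
    return names


def load_categories(path):
    """Read a categories file such as categories.txt."""
    with open(path, encoding="utf-8") as handle:
        return parse_categories(handle.read())
```

With this approach, `parse_categories("People, Food, Screenshots, Pets")` and a four-line file containing the same names yield the same list.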
- Recursively scans the chosen root folder for images.
- For each category, CLIP evaluates prompts like: "a photo of people", "a photo of food", "a photo of receipts"
- Picks the highest-probability category if above threshold.
- Moves/Copies the image into: <target>/<CategoryName>/
- If no category scores high enough: <target>/Uncategorized/
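The scoring and selection steps above might look roughly like the sketch below, which uses the Hugging Face `transformers` CLIP classes. All function names are hypothetical, the default threshold of 0.25 is an assumed value, and treating a score equal to the threshold as a match is also an assumption; the real script may differ on each point.

```python
def build_prompts(categories):
    """Turn category names into prompts like "a photo of people"."""
    return [f"a photo of {name.lower()}" for name in categories]


def pick_category(categories, probs, threshold=0.25, fallback="Uncategorized"):
    """Return (category, score) for the top score, or the fallback if it is too low."""
    best = max(range(len(categories)), key=lambda i: probs[i])
    score = float(probs[best])
    # Assumption: a score exactly at the threshold counts as a match.
    if score >= threshold:
        return categories[best], score
    return fallback, score


def clip_probabilities(image_path, categories,
                       model_name="openai/clip-vit-base-patch32"):
    """Score one image against every category prompt (downloads weights on first use)."""
    # Heavy imports stay local so the helpers above work without them installed.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained(model_name)
    processor = CLIPProcessor.from_pretrained(model_name)
    with Image.open(image_path) as img:
        image = img.convert("RGB")
    inputs = processor(text=build_prompts(categories), images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape: (1, num_categories)
    return logits.softmax(dim=-1)[0].tolist()
```

This version reloads the model on every call to keep the example short; a batch run would load the model and processor once and reuse them for every image.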
When not in dry-run mode, saves:
<target>/organize_images_log.csv
Columns include: src, dst, category, score
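A log with these columns could be written as in the sketch below. It uses the standard-library `csv` module for brevity; since `pandas` appears in the requirements, the actual tool may well use that instead. The helper name `write_log` is hypothetical.

```python
import csv
from pathlib import Path

LOG_COLUMNS = ["src", "dst", "category", "score"]


def write_log(target, rows, dry_run=False):
    """Write organize_images_log.csv under target; write nothing in dry-run mode."""
    if dry_run:
        return None
    log_path = Path(target) / "organize_images_log.csv"
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with open(log_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=LOG_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)  # each row is a dict keyed by LOG_COLUMNS
    return log_path
```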
torch>=2.0.0
torchvision>=0.15.0
transformers>=4.30.0
pillow>=9.0.0
tqdm>=4.60.0
pandas>=1.3.0
typing-extensions
If you want GPU acceleration, install PyTorch with CUDA via: https://pytorch.org
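To check whether a CUDA-enabled PyTorch build is actually in use, something like the following sketch can help. `pick_device` is a hypothetical helper, not part of the tool; it falls back to the CPU when PyTorch is missing or no GPU is visible.

```python
def pick_device(preferred="cuda"):
    """Return "cuda" when requested and usable, otherwise "cpu"."""
    if preferred != "cuda":
        return "cpu"
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```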
/
├── organize_images_clip_gui.py
├── requirements.txt
├── categories.txt
└── README.md
- Enable dry-run first to verify.
- Use cuda device for faster sorting (if available).
- Good threshold values: 0.20–0.35.
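To make the dry-run tip concrete, here is a minimal sketch of how a dry-run flag can guard the move/copy step. The name `place_image` and its parameters are hypothetical, and the sketch does not handle file-name collisions in the destination folder, which a real tool would need to address.

```python
import shutil
from pathlib import Path


def place_image(src, target, category, mode="move", dry_run=True):
    """Return the destination path; only touch the disk when dry_run is False."""
    src = Path(src)
    dst = Path(target) / category / src.name
    if dry_run:
        # Preview only: report the planned action and leave files untouched.
        print(f"[dry-run] would {mode} {src} -> {dst}")
        return dst
    dst.parent.mkdir(parents=True, exist_ok=True)
    if mode == "copy":
        shutil.copy2(src, dst)
    else:
        shutil.move(str(src), str(dst))
    return dst
```

Defaulting `dry_run` to `True` in this sketch means nothing changes on disk unless the caller opts in explicitly.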
