Cross-platform digital archive integrity, preservation and deduplication toolkit.
AIPT is a production-quality, preservation-first CLI tool for safely auditing and preserving mixed digital archives on macOS, Linux, and Windows. It prioritizes safety, conservative validation, and minimization of false positives over aggressive deletion.
No files are ever automatically deleted. Instead, problematic files are quarantined with full restoration capabilities.
Digital archives (photos, videos, documents) often suffer from silent corruption, OS-level clutter, and duplication over time. Simple scripts or aggressive deduplication tools can cause data loss through false positives or hasty deletions.
AIPT solves this by providing a highly conservative, multi-tier validation system. It safely audits junk, verifies file integrity (using hardware-accelerated video decoding when available), accurately finds exact duplicates, and performs perceptual best-version resolution.
- Junk/Clutter Audit: Safely quarantines OS clutter (`.DS_Store`, `Thumbs.db`, `__MACOSX`, etc.).
- Conservative Integrity Scanning:
  - Images: Zero-byte, truncation, and Pillow verification.
  - Videos & Audio: 3-tier validation (fast metadata probe → keyframe-only decode → full decode fallback). Benign warnings are classified separately from fatal corruption.
  - Documents: Deep structural validation for PDFs (`pypdf`) and Office/ZIP files (`zipfile`), plus zero-byte text checking.
- Hardware Acceleration: Autodetects and uses `videotoolbox`, `cuda`, `nvdec`, `qsv`, `dxva2`, `d3d11va`, and `vaapi` for video processing.
- Hybrid Exact Deduplication: 4-stage pipeline (Size → Sample BLAKE2b/MD5 → Full hash) to avoid disk thrashing on large archives.
- Perceptual Best Version Resolution: dHash-based grouping to keep the highest quality version of an image and quarantine the rest.
- Empty Folder Cleanup: Safely removes empty directories left behind.
- Safe Quarantine & Restore: Every action is logged to `quarantine_manifest.json`, enabling a robust `aipt restore` workflow.
- Smart Runtime Detection: Automatically configures GPU backends and optimal worker counts to prevent disk thrashing.
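
The staged exact-deduplication idea from the feature list (size, then a sampled hash, then a full hash) can be illustrated with a short Python sketch. This is not AIPT's actual implementation: the function names, the 64 KiB sample size, and the choice to sample the head and tail of each file are all assumptions made for illustration. Each stage only examines files that survived the previous one, so most files are never read in full.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

SAMPLE_BYTES = 64 * 1024  # hypothetical sample size, not taken from AIPT


def sample_hash(path: Path, sample_bytes: int = SAMPLE_BYTES) -> str:
    """Hash only the head and tail of a file (cheap pre-filter)."""
    size = path.stat().st_size
    h = hashlib.blake2b()
    with path.open("rb") as f:
        h.update(f.read(sample_bytes))
        if size > sample_bytes:
            # Seek so the tail sample never overlaps the head sample.
            f.seek(max(size - sample_bytes, sample_bytes))
            h.update(f.read(sample_bytes))
    return h.hexdigest()


def full_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the whole file in 1 MiB chunks to keep memory use flat."""
    h = hashlib.blake2b()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def _refine(groups, key):
    """Split each candidate group by `key` and drop groups of one."""
    refined = []
    for group in groups:
        buckets = defaultdict(list)
        for path in group:
            buckets[key(path)].append(path)
        refined.extend(g for g in buckets.values() if len(g) > 1)
    return refined


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under `root` with identical content."""
    files = sorted(p for p in root.rglob("*") if p.is_file())
    groups = _refine([files], lambda p: p.stat().st_size)  # stage: size
    groups = _refine(groups, sample_hash)                  # stage: sampled hash
    return _refine(groups, full_hash)                      # stage: full hash
```

Only the final full-hash stage decides that two files are duplicates; the earlier stages exist purely to rule candidates out cheaply.
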
- Conservative by default: Unknown conditions result in a warning, not a corruption flag.
- Never delete automatically: Destructive actions are replaced with a robust Quarantine system.
- Restorable: Every quarantined file can be restored to its original path via the manifest.
- Timeout safety: Timeouts (especially on large video files) do not imply corruption.
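
To make the "never delete, always restorable" principles concrete, here is a minimal Python sketch of a manifest-backed quarantine. It is an illustration under stated assumptions, not AIPT's code: the manifest file name comes from this README, but the JSON schema (`original`, `quarantined`, `reason`), the function names, and the UUID-prefixed file naming are hypothetical.

```python
import json
import shutil
import uuid
from pathlib import Path

MANIFEST_NAME = "quarantine_manifest.json"  # file name taken from this README


def _load(manifest: Path) -> list[dict]:
    """Read the manifest, treating a missing file as an empty log."""
    return json.loads(manifest.read_text()) if manifest.exists() else []


def quarantine(path: Path, quarantine_dir: Path, reason: str) -> Path:
    """Move `path` into `quarantine_dir` and log where it came from."""
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    manifest = quarantine_dir / MANIFEST_NAME
    original = path.resolve()
    # A unique prefix avoids collisions between files with the same name.
    dest = quarantine_dir / f"{uuid.uuid4().hex}_{path.name}"
    entries = _load(manifest)
    # Hypothetical schema: one record per quarantined file.
    entries.append(
        {"original": str(original), "quarantined": str(dest), "reason": reason}
    )
    # Log first, then move, so every moved file has a manifest record.
    manifest.write_text(json.dumps(entries, indent=2))
    shutil.move(str(original), str(dest))
    return dest


def restore(quarantine_dir: Path) -> list[Path]:
    """Move quarantined files back; never overwrite an existing file."""
    manifest = quarantine_dir / MANIFEST_NAME
    if not manifest.exists():
        return []
    restored, remaining = [], []
    for entry in _load(manifest):
        src, dst = Path(entry["quarantined"]), Path(entry["original"])
        if src.exists() and not dst.exists():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            restored.append(dst)
        else:
            remaining.append(entry)  # keep the record for manual review
    manifest.write_text(json.dumps(remaining, indent=2))
    return restored
```

Entries that cannot be restored safely (for example, because a new file now occupies the original path) stay in the manifest rather than being dropped, which matches the conservative philosophy described above.
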
Ensure you have FFmpeg installed (required for video scanning):
| OS | Command |
|---|---|
| macOS | brew install ffmpeg |
| Linux | sudo apt install ffmpeg |
| Windows | winget install ffmpeg |
We recommend using `uv` to install AIPT as a standalone tool. This makes the `aipt` command available globally.

    # Clone the repo
    git clone https://github.com/AfshanKhan/aipt.git
    cd aipt
    # Install as a global tool
    uv tool install . --python 3.13

Once installed, you can run `aipt` from any directory.
The easiest way to process an archive is the `run-all` command. It initializes the system and runs every preservation step in the correct order.

    aipt run-all /path/to/your/archive

If you prefer more control, you can run individual stages:
| Command | Description |
|---|---|
| `aipt init` | Initialize system folders |
| `aipt audit` | Quarantine junk & OS clutter |
| `aipt integrity` | Scan for corrupt images, videos, audio, and documents |
| `aipt dupes` | Quarantine exact byte-for-byte duplicates |
| `aipt best-version` | Keep the highest-quality version of perceptually similar images and quarantine the rest |
| `aipt clean` | Remove empty directories |
| `aipt restore` | Move files back from quarantine |
Want to see what happens without moving any files? Just add `--dry-run`:

    aipt run-all /path/to/archive --dry-run

If a file was incorrectly quarantined or you wish to revert an action:

    aipt restore /path/to/archive

AIPT automatically detects your hardware and chooses the fastest available video decoder.
| GPU Brand | Technology Supported | OS |
|---|---|---|
| Apple Silicon | `videotoolbox` | macOS |
| NVIDIA | `cuda`, `nvdec` | Windows, Linux |
| Intel (Arc/UHD) | `qsv` (QuickSync), `vaapi` | Windows, Linux |
| AMD (Radeon) | `amf`, `vaapi`, `d3d11va` | Windows, Linux |
- Drivers: Ensure your GPU drivers are up to date.
- FFmpeg: Ensure your version of FFmpeg was built with support for these decoders (the default versions from `brew`, `apt`, and `winget` usually include them).
- Detection: AIPT will print `HW accel: [name]` at the start of an integrity scan so you can verify it's using your GPU.
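
If you want to check independently which acceleration methods your FFmpeg build reports, the following Python sketch parses the output of `ffmpeg -hwaccels`. It is only an approximation of what a detection step might look like: the function names and the preference order are assumptions, not AIPT internals. Note also that a method listed by FFmpeg is compiled in, which does not guarantee that a working device and driver are present at runtime.

```python
import shutil
import subprocess

# Hypothetical preference order; AIPT's real ordering is not documented here.
PREFERRED = ["videotoolbox", "cuda", "nvdec", "qsv", "d3d11va", "dxva2", "vaapi"]


def parse_hwaccels(output: str) -> list[str]:
    """Extract method names from `ffmpeg -hwaccels` output."""
    lines = (line.strip() for line in output.splitlines())
    # Skip blank lines and the "Hardware acceleration methods:" header.
    return [line for line in lines if line and not line.endswith(":")]


def available_hwaccels() -> list[str]:
    """Ask the local FFmpeg binary which methods it was built with."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return []
    try:
        result = subprocess.run(
            [ffmpeg, "-hide_banner", "-hwaccels"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []  # treat any failure as "no acceleration", not as an error
    return parse_hwaccels(result.stdout)


def pick_hwaccel(available: list[str], preferred: list[str] = PREFERRED):
    """Return the first preferred method that is available, else None."""
    for name in preferred:
        if name in available:
            return name
    return None


if __name__ == "__main__":
    print("HW accel candidate:", pick_hwaccel(available_hwaccels()))
```

Falling back to `None` (software decoding) on any detection failure keeps the check conservative: a missing or misbehaving FFmpeg binary never turns into a false corruption report.
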
Contributions are welcome. Please ensure that PRs adhere to the conservative safety philosophy of the project.
Apache License 2.0 + Commons Clause. See LICENSE for details.