feat(indexing): surface encoder-model download progress in the album UI by lstein · Pull Request #308 · lstein/PhotoMapAI

lstein · 2026-06-07T23:27:24Z

Problem

On a fresh PhotoMapAI install, the very first indexing run must download the CLIP/SigLIP encoder weights (hundreds of MB) before any image can be encoded. Today this happens silently inside the encoder constructors — the album card just sits at "Processing 0/N" with a frozen 0% bar for a minute or more, looking hung.

What this does

Surfaces the download as a distinct "Downloading encoder model…" phase that drives the existing progress bar with real byte progress, then transitions cleanly into the normal scanning → indexing → mapping flow.

When the model is already cached, no download bar fires, so there's no spurious "Downloading…" flash — the card goes straight to scanning/indexing.

How it works

The three encoder backends each render their download through a tqdm byte-bar, but via different module-level references. A new capture_download_progress() context manager temporarily swaps those references for a subclass that forwards byte progress to a callback, restoring the originals on exit:

clip.clip.tqdm — openai-clip URL download (the CPU-fallback default ViT-B/32)
open_clip.pretrained.tqdm — open-clip URL downloads (non-HF tags)
huggingface_hub.utils.tqdm.tqdm — HF Hub fetches (open-clip HF tags + all siglip)

The context manager patches only when a callback is supplied, so the CLI/console path keeps its normal tqdm output. It reports only byte-unit bars, ignoring stray iteration counters. Callback exceptions are swallowed so a UI hiccup can never break a download.
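The swap-and-restore idea can be sketched roughly as follows. This is a minimal illustration, not the code from encoders.py: the function name capture_download_progress_sketch, its targets argument, the (done, total) callback shape, and the FakeBar stand-in (used so the example runs without tqdm installed) are all assumptions. In the real setting the targets would be the three module-level references listed above.

```python
import contextlib
import types


class FakeBar:
    """Hypothetical stand-in for tqdm: tracks n, total and unit only."""

    def __init__(self, total=None, unit="it"):
        self.total = total
        self.unit = unit
        self.n = 0

    def update(self, n=1):
        self.n += n


def make_reporting_bar(base, callback):
    """Build a subclass of `base` that forwards byte progress to `callback`."""

    class ReportingBar(base):
        def update(self, n=1):
            result = super().update(n)
            # Assumption: byte bars are identified by unit "B" or "iB";
            # anything else is treated as an iteration counter and ignored.
            if getattr(self, "unit", None) in ("B", "iB"):
                try:
                    callback(self.n, self.total)
                except Exception:
                    pass  # a failing UI callback must not break the download
            return result

    return ReportingBar


@contextlib.contextmanager
def capture_download_progress_sketch(targets, callback=None):
    """Temporarily replace each (module, attribute) bar class in `targets`."""
    if callback is None:
        yield  # no callback: leave the normal console bars untouched
        return
    originals = []
    try:
        for module, attr in targets:
            original = getattr(module, attr)
            originals.append((module, attr, original))
            setattr(module, attr, make_reporting_bar(original, callback))
        yield
    finally:
        # Restore the originals even if the download raised.
        for module, attr, original in originals:
            setattr(module, attr, original)


# Usage with a fake "backend module" that holds a module-level bar reference.
fake_backend = types.SimpleNamespace(tqdm=FakeBar)
seen = []
with capture_download_progress_sketch(
    [(fake_backend, "tqdm")], lambda done, total: seen.append((done, total))
):
    bar = fake_backend.tqdm(total=100, unit="B")
    bar.update(40)
    bar.update(60)
    fake_backend.tqdm(total=3, unit="it").update(1)  # not a byte bar: ignored
```

Subclassing whatever class is currently bound to the reference, rather than replacing it with an unrelated object, keeps the backend's own rendering and arguments intact while adding the callback as a side channel.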

Changes

encoders.py — capture_download_progress() context manager + reporting tqdm subclass
progress.py — IndexStatus.DOWNLOADING, report_download() (maps bytes onto the existing percentage/ETA fields), begin_indexing() (flips back once encoding starts), DOWNLOADING in is_running()
embeddings.py — threads a download_callback through _process_images_batch (wrapping _build_encoder()); the async wrapper reports bytes and transitions DOWNLOADING→INDEXING on the first image. CLI/sync path passes None.
album-manager.js / album-manager.css — downloading status branch (purple); existing bar/ETA logic reused. No endpoint change needed.
Tests — test_progress.py (tracker), capture tests in test_encoders.py, album-manager-progress.test.js (Jest)
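
To make the tracker changes more concrete, here is a rough sketch of how byte counts might be mapped onto percentage/ETA fields and how the status could flip back once encoding starts. Only the names IndexStatus.DOWNLOADING, report_download(), begin_indexing() and is_running() come from the list above; the ProgressSketch class, its fields, the other status values, the method signatures and the linear ETA estimate are assumptions for illustration.

```python
import enum
import time
from dataclasses import dataclass
from typing import Optional


class IndexStatus(enum.Enum):
    # Only DOWNLOADING and INDEXING are named in the PR; IDLE is a placeholder.
    IDLE = "idle"
    DOWNLOADING = "downloading"
    INDEXING = "indexing"


@dataclass
class ProgressSketch:
    """Hypothetical tracker; field names are illustrative only."""

    status: IndexStatus = IndexStatus.IDLE
    percentage: float = 0.0
    eta_seconds: Optional[float] = None
    _download_started: Optional[float] = None

    def report_download(self, done, total, now=None):
        """Map downloaded bytes onto the percentage and ETA fields."""
        now = time.monotonic() if now is None else now
        if self._download_started is None:
            self._download_started = now
        self.status = IndexStatus.DOWNLOADING
        if not total:
            # Unknown total (e.g. no Content-Length): the bar stays at 0%.
            self.percentage, self.eta_seconds = 0.0, None
            return
        self.percentage = min(100.0, 100.0 * done / total)
        elapsed = now - self._download_started
        if done > 0 and elapsed > 0:
            # Linear estimate: remaining bytes at the average rate so far.
            self.eta_seconds = elapsed * (total - done) / done
        else:
            self.eta_seconds = None

    def begin_indexing(self):
        """Flip back to INDEXING and reset the bar once encoding starts."""
        if self.status is IndexStatus.DOWNLOADING:
            self.status = IndexStatus.INDEXING
            self.percentage = 0.0
            self.eta_seconds = None

    def is_running(self):
        return self.status in (IndexStatus.DOWNLOADING, IndexStatus.INDEXING)


tracker = ProgressSketch()
tracker.report_download(0, 200, now=0.0)
tracker.report_download(50, 200, now=10.0)
snapshot = (tracker.status, tracker.percentage, tracker.eta_seconds)
tracker.begin_indexing()  # e.g. called when the first image is encoded
```

With 50 of 200 bytes received after 10 seconds, the snapshot holds 25% and an ETA of 30 seconds; after begin_indexing() the same fields are free for the per-image progress.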

Known limitation

If a download server omits Content-Length, the total is unknown and the bar reads 0% while bytes stream (the shimmer animation still shows activity and the label still says it's downloading). In practice all three backends provide a total, so this is an edge case.

Verification

make lint — ruff, eslint, prettier clean
make test — 382 backend + 352 frontend, all passing
Manually verified end-to-end: cleared the encoder caches, indexed an album, and confirmed the card shows "Downloading encoder model…" with the bar filling + ETA, then transitions to processing → mapping → complete. Re-indexing skips the download phase.

🤖 Generated with Claude Code

OpenCLIP ViT-L-14 is impractically slow to index/search on CPU-only Linux/Windows hosts. New albums on those hosts now default to the lightweight OpenAI CLIP ViT-B/32 instead, while CUDA hosts and macOS (untested for the lighter path) keep the high-quality ViT-L-14 default.

- encoders.py: add CPU_FALLBACK_ENCODER_SPEC + default_encoder_spec() resolver (CUDA/macOS -> ViT-L-14, CPU Linux/Windows -> ViT-B/32)
- config.py: Album.encoder_spec uses default_factory=default_encoder_spec
- routers/album.py: GET /default_encoder/ exposes the host-resolved default
- album-manager.js: new-album dropdown pre-selects the server default (cached fetch, falls back to recommended option on failure)

Existing albums keep their stored encoder_spec; only the default for newly created albums changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
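
The host-based default described in this commit message could look roughly like the sketch below. The spec strings, the HIGH_QUALITY_ENCODER_SPEC name, the function name and its parameters are hypothetical; the actual resolver presumably detects CUDA itself, whereas here the result is passed in as has_cuda so the example has no third-party dependencies.

```python
import platform
from typing import Optional

# Hypothetical spec strings; the real format used by PhotoMapAI is not shown here.
HIGH_QUALITY_ENCODER_SPEC = "open-clip:ViT-L-14"
CPU_FALLBACK_ENCODER_SPEC = "openai-clip:ViT-B/32"


def default_encoder_spec_sketch(has_cuda: bool, system: Optional[str] = None) -> str:
    """Pick the default encoder for newly created albums on this host."""
    system = platform.system() if system is None else system
    # CUDA hosts and macOS ("Darwin") keep the high-quality default.
    if has_cuda or system == "Darwin":
        return HIGH_QUALITY_ENCODER_SPEC
    # CPU-only Linux/Windows hosts get the lightweight fallback.
    return CPU_FALLBACK_ENCODER_SPEC
```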

On a fresh install the first index run silently downloads the CLIP/SigLIP encoder weights (hundreds of MB), leaving the album card stuck at "Processing 0/N" with a frozen 0% bar. Surface this as a distinct "Downloading encoder model…" phase that drives the existing progress bar with real byte progress.

The three encoder backends each render their download through a tqdm byte-bar via different module-level references. A new capture_download_progress() context manager temporarily swaps clip.clip.tqdm, open_clip.pretrained.tqdm, and huggingface_hub.utils.tqdm.tqdm for a subclass that forwards byte progress to a callback, restoring them on exit. It only patches when a callback is supplied (the CLI/console path is untouched) and only reports byte-unit bars. When the model is already cached no download bar fires, so there is no spurious "Downloading…" flash.

- encoders.py: capture_download_progress() + reporting tqdm subclass
- progress.py: IndexStatus.DOWNLOADING, report_download(), begin_indexing(), DOWNLOADING in is_running()
- embeddings.py: thread download_callback through _process_images_batch; async wrapper reports bytes and transitions DOWNLOADING->INDEXING on first image
- album-manager.js/.css: downloading status branch (existing bar/ETA reused)
- tests: tracker + capture context-manager (backend), downloading branch (Jest)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

lstein

Tested working.

lstein and others added 2 commits June 7, 2026 13:13

lstein commented Jun 7, 2026

Merge branch 'master' into lstein/feature/model-download-progress

73ef74e

lstein merged commit ff21661 into master Jun 8, 2026
10 checks passed

lstein deleted the lstein/feature/model-download-progress branch June 8, 2026 16:07

lstein merged 3 commits into master from lstein/feature/model-download-progress
