feat(indexing): surface encoder-model download progress in the album UI #308
Merged
OpenCLIP ViT-L-14 is impractically slow to index/search on CPU-only Linux/Windows hosts. New albums on those hosts now default to the lightweight OpenAI CLIP ViT-B/32 instead, while CUDA hosts and macOS (untested for the lighter path) keep the high-quality ViT-L-14 default.

- encoders.py: add CPU_FALLBACK_ENCODER_SPEC + default_encoder_spec() resolver (CUDA/macOS -> ViT-L-14, CPU Linux/Windows -> ViT-B/32)
- config.py: Album.encoder_spec uses default_factory=default_encoder_spec
- routers/album.py: GET /default_encoder/ exposes the host-resolved default
- album-manager.js: new-album dropdown pre-selects the server default (cached fetch, falls back to recommended option on failure)

Existing albums keep their stored encoder_spec; only the default for newly created albums changes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On a fresh install the first index run silently downloads the CLIP/SigLIP encoder weights (hundreds of MB), leaving the album card stuck at "Processing 0/N" with a frozen 0% bar. Surface this as a distinct "Downloading encoder model…" phase that drives the existing progress bar with real byte progress.

The three encoder backends each render their download through a tqdm byte-bar via different module-level references. A new capture_download_progress() context manager temporarily swaps clip.clip.tqdm, open_clip.pretrained.tqdm, and huggingface_hub.utils.tqdm.tqdm for a subclass that forwards byte progress to a callback, restoring them on exit. It only patches when a callback is supplied (the CLI/console path is untouched) and only reports byte-unit bars. When the model is already cached no download bar fires, so there is no spurious "Downloading…" flash.

- encoders.py: capture_download_progress() + reporting tqdm subclass
- progress.py: IndexStatus.DOWNLOADING, report_download(), begin_indexing(), DOWNLOADING in is_running()
- embeddings.py: thread download_callback through _process_images_batch; async wrapper reports bytes and transitions DOWNLOADING->INDEXING on first image
- album-manager.js/.css: downloading status branch (existing bar/ETA reused)
- tests: tracker + capture context-manager (backend), downloading branch (Jest)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Problem
On a fresh PhotoMapAI install, the very first indexing run must download the CLIP/SigLIP encoder weights (hundreds of MB) before any image can be encoded. Today this happens silently inside the encoder constructors — the album card just sits at "Processing 0/N" with a frozen 0% bar for a minute or more, looking hung.
What this does
Surfaces the download as a distinct "Downloading encoder model…" phase that drives the existing progress bar with real byte progress, then transitions cleanly into the normal scanning → indexing → mapping flow.
When the model is already cached, no download bar fires, so there's no spurious "Downloading…" flash — the card goes straight to scanning/indexing.
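The phase handling described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in progress.py: the names `IndexStatus.DOWNLOADING`, `report_download()`, `begin_indexing()` and `is_running()` come from this PR, while the `ProgressTracker` class, its fields, the `IDLE` state and the method signatures are hypothetical, and the real enum has more states (scanning, mapping) than shown here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IndexStatus(Enum):
    IDLE = "idle"  # assumed name for the resting state
    DOWNLOADING = "downloading"
    INDEXING = "indexing"


@dataclass
class ProgressTracker:
    """Hypothetical stand-in for the per-album tracker."""

    status: IndexStatus = IndexStatus.IDLE
    percentage: float = 0.0

    def report_download(self, downloaded: int, total: Optional[int]) -> None:
        # Reached only when a download bar actually fires, so a cached
        # model never enters the DOWNLOADING state.
        self.status = IndexStatus.DOWNLOADING
        if total:
            self.percentage = min(100.0, 100.0 * downloaded / total)
        else:
            # Unknown total (e.g. no Content-Length): stay at 0%.
            self.percentage = 0.0

    def begin_indexing(self) -> None:
        # Called once encoding starts; the bar restarts for image progress.
        self.status = IndexStatus.INDEXING
        self.percentage = 0.0

    def is_running(self) -> bool:
        return self.status in (IndexStatus.DOWNLOADING, IndexStatus.INDEXING)
```

Because the download state is set only from inside `report_download()`, the "no flash when cached" behaviour falls out of the design without an explicit cache check.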
How it works
The three encoder backends each render their download through a `tqdm` byte-bar, but via different module-level references. A new `capture_download_progress()` context manager temporarily swaps those references for a subclass that forwards byte progress to a callback, restoring the originals on exit:
- `clip.clip.tqdm`: openai-clip URL download (the CPU-fallback default `ViT-B/32`)
- `open_clip.pretrained.tqdm`: open-clip URL downloads (non-HF tags)
- `huggingface_hub.utils.tqdm.tqdm`: HF Hub fetches (open-clip HF tags + all siglip)
It only patches when a callback is supplied, so the CLI/console path keeps its normal tqdm output, and only reports byte-unit bars (ignoring stray iteration counters). Callback exceptions are swallowed so a UI hiccup can never break a download.
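The swap-and-restore pattern described above might look something like the sketch below. It is a simplified illustration, not the implementation in encoders.py: `BaseBar` is a stand-in for `tqdm` so the example runs without third-party packages, and the `targets` parameter, `make_reporting_bar()` helper and callback signature `(bytes_done, total)` are assumptions made for this example.

```python
import contextlib
from types import SimpleNamespace


class BaseBar:
    """Minimal stand-in for tqdm: tracks n, total and a unit."""

    def __init__(self, total=None, unit="it", **kwargs):
        self.total, self.unit, self.n = total, unit, 0

    def update(self, n=1):
        self.n += n


def make_reporting_bar(base, callback):
    """Build a subclass of `base` that forwards byte progress to `callback`."""

    class ReportingBar(base):
        def update(self, n=1):
            result = super().update(n)
            if getattr(self, "unit", None) == "B":  # skip iteration counters
                try:
                    callback(self.n, self.total)
                except Exception:
                    pass  # a failing UI callback must not break the download
            return result

    return ReportingBar


@contextlib.contextmanager
def capture_download_progress(targets, callback=None):
    """Swap each (owner, attribute) bar class for a reporting subclass."""
    if callback is None:  # console path: leave the bars untouched
        yield
        return
    originals = [(owner, name, getattr(owner, name)) for owner, name in targets]
    try:
        for owner, name, original in originals:
            setattr(owner, name, make_reporting_bar(original, callback))
        yield
    finally:
        for owner, name, original in originals:
            setattr(owner, name, original)  # always restore the originals


# Usage with a fake module standing in for e.g. clip.clip
fake_module = SimpleNamespace(tqdm=BaseBar)
seen = []
with capture_download_progress(
    [(fake_module, "tqdm")], lambda done, total: seen.append((done, total))
):
    bar = fake_module.tqdm(total=100, unit="B")
    bar.update(40)
    bar.update(60)
```

Subclassing the original bar, rather than replacing it outright, keeps whatever rendering the library expects while adding the callback as a side effect.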
Changes
- `encoders.py`: `capture_download_progress()` context manager + reporting tqdm subclass
- `progress.py`: `IndexStatus.DOWNLOADING`, `report_download()` (maps bytes onto the existing percentage/ETA fields), `begin_indexing()` (flips back once encoding starts), `DOWNLOADING` in `is_running()`
- `embeddings.py`: threads a `download_callback` through `_process_images_batch` (wrapping `_build_encoder()`); the async wrapper reports bytes and transitions DOWNLOADING→INDEXING on the first image. CLI/sync path passes `None`.
- `album-manager.js` / `album-manager.css`: `downloading` status branch (purple); existing bar/ETA logic reused. No endpoint change needed.
- `test_progress.py` (tracker), capture tests in `test_encoders.py`, `album-manager-progress.test.js` (Jest)
Known limitation
If a download server omits `Content-Length`, the total is unknown and the bar reads 0% while bytes stream (the shimmer animation still shows activity and the label still says it's downloading). In practice all three backends provide a total, so this is an edge case.
Verification
- `make lint`: ruff, eslint, prettier clean
- `make test`: 382 backend + 352 frontend, all passing
🤖 Generated with Claude Code