Reuse the HuggingFace cache for model assets #46
Open
zboyles wants to merge 1 commit into
Model assets (musiccoca + spectrostream resources, exported .mlxfn models, raw checkpoints) were always downloaded into ~/Documents/Magenta/magenta-rt-v2/ via hf_hub_download(local_dir=...), which never consults the global HF cache. A repo already pulled with `hf download google/magenta-realtime-2` was re-downloaded over the network, and assets were duplicated on disk.

Resolve assets MAGENTA_HOME-first, then the global HF cache:

- paths.py: add resolve_asset() plus resolve_musiccoca_dir / resolve_spectrostream_dir / resolve_model_dir helpers. The local MAGENTA_HOME layout always wins (so local exports, GCS downloads, and user overrides take precedence); otherwise the asset is served from the global HF cache. Load paths reuse the cache only (local_files_only), with no network fetch at load time, so default behavior is unchanged except that an existing `hf download` is now picked up. A missing asset raises a clear FileNotFoundError pointing at both locations.
- Route load sites through the resolver: musiccoca.py, mlx/system.py (mlxfn model dir + checkpoint), jax/system.py (checkpoint).
- mlx/spectrostream/load_weights.py: when encoder.safetensors is not a sibling of the checkpoint (e.g. assets live in the HF cache, where the checkpoint and resources sit in different repo subdirs), fall back to the resolved spectrostream resource dir. Guarded lazy import keeps the module standalone.
- resolve_checkpoint(): build the candidate path directly instead of via checkpoints_dir(), which mkdir's as a side effect. Resolution no longer creates directories, so a cache hit leaves MAGENTA_HOME untouched.
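For reviewers, the resolution order described above can be sketched roughly as follows. This is a minimal illustration, not the code in paths.py: the signature, the `cache_lookup` parameter, and the assumption that an asset's path relative to MAGENTA_HOME matches its filename in the HF repo are all hypothetical. The PR describes a cache-only load via local_files_only; the sketch swaps in `try_to_load_from_cache` from huggingface_hub, which likewise reads the cache without touching the network.

```python
from pathlib import Path
from typing import Callable, Optional


def hf_cache_lookup(repo_id: str, filename: str) -> Optional[Path]:
    """Return the cached path of `filename`, or None. Never uses the network."""
    try:
        from huggingface_hub import try_to_load_from_cache
    except ImportError:  # lazy import keeps the sketch usable without the library
        return None
    hit = try_to_load_from_cache(repo_id, filename)
    # A hit is a str path; a miss is None or a "known missing" sentinel object.
    return Path(hit) if isinstance(hit, str) else None


def resolve_asset(
    relative_path: str,
    magenta_home: Path,
    repo_id: str = "google/magenta-realtime-2",
    cache_lookup: Callable[[str, str], Optional[Path]] = hf_cache_lookup,
) -> Path:
    """Hypothetical resolver: MAGENTA_HOME first, then the global HF cache."""
    # 1. The local layout always wins. The candidate path is built directly,
    #    so resolution never creates directories as a side effect.
    local = magenta_home / relative_path
    if local.exists():
        return local
    # 2. Otherwise reuse whatever is already in the HF cache (no download).
    cached = cache_lookup(repo_id, relative_path)
    if cached is not None and cached.exists():
        return cached
    # 3. Report both locations so the user knows where to put the asset.
    raise FileNotFoundError(
        f"{relative_path!r} not found under {magenta_home} "
        f"or in the HuggingFace cache for {repo_id}"
    )
```

Passing the cache lookup as a parameter is only a convenience for testing the ordering without a populated cache; the actual helpers may be structured differently.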
CLI (opt-in, default behavior unchanged):

- Add `--use-hf-cache` (env: `MAGENTA_RT_USE_HF_CACHE`) to `mrt models init`, `mrt models download`, and `mrt checkpoints download`. When set, downloads populate/reuse the global HF cache (omit local_dir) and the interactive picker's checkmarks consult the cache. The flag is HuggingFace-only; the GCS source and `--download-path` are unaffected.

Docs: note cache reuse and the flag in `README.md` and `docs/models.md`.

Related Issues
`--download-path` readability probe). Both touch the `--download-path` options in `models_commands.py`, so I'll rebase this branch after Fix: don't require --download-path to be readable at parse time #45 merges (adopt the shared `_download_path_option`, keep `--use-hf-cache`).

Local Pytests
Note
This change reroutes where assets load from (MAGENTA_HOME → global HF cache); it does not alter model math or sampling. Validation below focuses on (a) cache reuse and (b) that both load paths still load correctly from the cache. The numeric-parity/bitlevel tests are gated on a generated reference in this environment and skip; they're included to show no regressions. Paths anonymized.
I ran
and observed the following output:
Benchmark Regression Test
N/A — this change only affects asset resolution (which directory weights are loaded from). It does not touch sampling, the model graph, or any performance-sensitive code path, so there is no benchmark surface to regress.