v0.8.2.post1: multi-source pull (HF + Harbor + GitHub) + bare-name #19
Merged
After v0.8.2 shipped, the natural follow-ups were:

1. Drop the `hf://` ceremony for the common case (`owner/name` should just work)
2. Support bare names (`my-dataset` → resolve owner via whoami)
3. Pull from Harbor's registry + GitHub, not just HF Hub

This finishes the CLI cleanup arc that v0.8.2 started.

New URI dispatcher accepts:

- `name` → HF (whoami resolves owner)
- `owner/name` → HF
- `owner/name@<rev>` → HF (specific revision)
- `hf://owner/name[@rev]` → HF, explicit
- `harbor://name[@tag]` → Harbor registry (shells out to `harbor`)
- `gh://owner/repo[@ref]` → GitHub (`git clone --depth 1`)
- `https://github.com/...` → GitHub (full URL accepted)

`cmd_pull` routes to the right backend; new `pull_from_harbor` and `pull_from_github` helpers in `hub.py` flatten the downloaded layout to the standard `<local-dir>/<task-id>/...` so the result is immediately consumable by `repo2rlenv validate` and `harbor run --path`.

`cmd_push` accepts the same parser but only allows HF as target. `push ./local harbor://x` and `push ./local gh://x` get clear redirects pointing at `harbor publish` / `git push` instead of half-implementing those flows.

Pull-specific flag: `--registry-url` for custom Harbor registries.

Pull-specific behavior: `--task <name>` is HF-only (filters `allow_patterns` on `snapshot_download`); ignored for Harbor / GitHub.

Tests: rewrote `tests/test_cli_push_pull.py` to cover all 4 backends + revision pinning + error UX (40 tests). All 485 unit tests pass; ruff + format clean.

Why v0.8.2.post1 (not v0.8.3): framing this as completing the v0.8.2 CLI cleanup, not a new feature release. Sort order is correct: 0.8.2 < 0.8.2.post1 < 0.8.3.
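For reviewers, a minimal sketch of how the URI forms accepted by the dispatcher could be classified by prefix. `PullTarget` and `parse_uri` are hypothetical names used for illustration only, not the parser shipped in this PR, and the sketch assumes that whoami owner resolution for bare names happens later, in the HF backend.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PullTarget:
    """Hypothetical result type; not the structure used in the PR."""
    backend: str                # "hf", "harbor", or "github"
    name: str                   # "name", "owner/name", or "owner/repo"
    rev: Optional[str] = None   # revision, tag, or git ref given after "@"


def _split_rev(rest: str) -> Tuple[str, Optional[str]]:
    """Split 'name@rev' into ('name', 'rev'); rev is None when no '@' is present."""
    name, sep, rev = rest.partition("@")
    return name, (rev if sep else None)


def parse_uri(uri: str) -> PullTarget:
    """Classify a pull URI by its prefix (illustrative only)."""
    github_url = "https://github.com/"
    if uri.startswith(github_url):
        # Full GitHub URL: keep only the first two path segments (owner/repo).
        parts = uri[len(github_url):].strip("/").split("/")[:2]
        path = "/".join(parts)
        if path.endswith(".git"):
            path = path[: -len(".git")]
        return PullTarget("github", path)

    schemes = (("hf://", "hf"), ("harbor://", "harbor"), ("gh://", "github"))
    for scheme, backend in schemes:
        if uri.startswith(scheme):
            name, rev = _split_rev(uri[len(scheme):])
            return PullTarget(backend, name, rev)

    if "://" in uri:
        raise ValueError(f"unsupported URI scheme: {uri!r}")

    # No scheme: bare name or owner/name, both routed to HF.
    name, rev = _split_rev(uri)
    return PullTarget("hf", name, rev)
```

Checking the prefix first keeps the scheme-less forms (`name`, `owner/name`) as the fall-through case, which is what lets the common HF path drop the `hf://` prefix.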
Live e2e against Harbor's public registry (hub.harborframework.com) surfaced that real datasets use the `<org>/<name>` URI form (e.g. `cookbook/test`, `scale-ai/swe-atlas-qna`, `cais/swebenchpro`), not bare names. Original parser rejected slashes inside `harbor://`.

Fix: parser now accepts both:

- `harbor://name` (bare / legacy / convenience)
- `harbor://org/name` (the actual registry form)

Both with optional `@tag` version suffix. 3 new test cases added; full e2e confirmed working with `harbor://cookbook/test` pulling 1 real task from the public registry.
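A rough sketch of the relaxed `harbor://` rule described in this commit: one optional `org/` segment plus an optional `@tag`. `parse_harbor_uri` is a hypothetical helper, and rejecting more than two path segments or an empty tag is an assumption about the desired behavior, not something this commit states.

```python
from typing import Optional, Tuple


def parse_harbor_uri(uri: str) -> Tuple[Optional[str], str, Optional[str]]:
    """Return (org, name, tag) for harbor://name[@tag] or harbor://org/name[@tag]."""
    prefix = "harbor://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a harbor URI: {uri!r}")

    ref, sep, tag = uri[len(prefix):].partition("@")
    if sep and not tag:
        raise ValueError(f"empty tag in {uri!r}")

    parts = ref.split("/")
    # Assumption: at most one slash, and no empty segments.
    if len(parts) > 2 or any(not part for part in parts):
        raise ValueError(
            f"expected harbor://name or harbor://org/name, got {uri!r}"
        )

    org = parts[0] if len(parts) == 2 else None
    return org, parts[-1], (tag if sep else None)
```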
Closes #18.
Live four-route e2e completed against the real HF Hub + GitHub + Harbor public registry — captured in `plans/v0.8.2.post1_e2e.md`.
CLI changes — one verb, four sources
```bash
# HF Hub (default)
repo2rlenv pull click-r2e # bare → whoami auto-resolves owner
repo2rlenv pull AdithyaSK/click-r2e # owner/name
repo2rlenv pull AdithyaSK/click-r2e@v1.0 # @ revision pinning
repo2rlenv pull hf://AdithyaSK/click-r2e # explicit-prefix form
# Harbor registry (NEW)
repo2rlenv pull harbor://cookbook/test # org/name form (verified live)
repo2rlenv pull harbor://swe-bench@lite # with version tag
repo2rlenv pull harbor://x --registry-url # custom registry
# GitHub (NEW)
repo2rlenv pull gh://owner/repo # scheme form
repo2rlenv pull gh://owner/repo@main # with branch/tag/SHA pin
repo2rlenv pull https://github.com/owner/repo # full URL also accepted
```
Push stays HF-only. `push ./local harbor://x` and `push ./local gh://x` emit clear redirects to `harbor publish` / `git push`.
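The redirect behavior can be pictured with a small guard like the one below. This is only a sketch: `check_push_target` is a hypothetical name, the message wording is invented for illustration, and raising `ValueError` stands in for however the real CLI reports the error.

```python
def check_push_target(uri: str) -> str:
    """Return the HF repo id for a push target, or raise with a redirect hint."""
    # Non-HF schemes and the tool the user is pointed at instead.
    redirects = {
        "harbor://": "harbor publish",
        "gh://": "git push",
    }
    for prefix, tool in redirects.items():
        if uri.startswith(prefix):
            raise ValueError(
                f"push only targets HF Hub; use `{tool}` for {uri!r} instead"
            )
    # Strip the optional explicit prefix; scheme-less targets already mean HF.
    if uri.startswith("hf://"):
        return uri[len("hf://"):]
    return uri
```

Failing early with the name of the right tool avoids a partial Harbor or GitHub upload path that would have to be maintained alongside `harbor publish` and `git push`.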
Live e2e evidence
Real finding surfaced + fixed during e2e
Harbor's public registry uses `<org>/<name>` URIs (`cookbook/test`, `scale-ai/swe-atlas-qna`, `cais/swebenchpro`), not bare names. Original parser rejected slashes inside `harbor://`. Fixed: parser now accepts both bare and org/name forms. 3 new test cases.
Why `0.8.2.post1` (not v0.8.3)
Framing this as completing the v0.8.2 CLI cleanup arc, not a new feature release. Sort order: `0.8.2 < 0.8.2.post1 < 0.8.3` — a future genuine v0.8.3 is unaffected.
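The sort-order claim can be checked with a simplified stand-in for PEP 440 ordering. The sketch below only understands `X.Y.Z` and `X.Y.Z.postN` (hypothetical helper `sort_key`); `packaging.version.Version` implements the full rules and is the better choice in real code.

```python
import re
from typing import Tuple

# Only plain releases and post-releases; pre/dev/local versions are out of scope.
_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:\.post(\d+))?$")


def sort_key(version: str) -> Tuple[int, int, int, int]:
    """Map 'X.Y.Z[.postN]' to a tuple that sorts in release order."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch, post = match.groups()
    # A plain release sorts before any of its post-releases, hence -1.
    post_number = int(post) if post is not None else -1
    return int(major), int(minor), int(patch), post_number


ordered = sorted(["0.8.3", "0.8.2.post1", "0.8.2"], key=sort_key)
print(ordered)  # ['0.8.2', '0.8.2.post1', '0.8.3']
```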
Test plan