What I'm seeing
A standard scp-style SSH clone URL, the git@github.com:org/repo.git form, gets accepted as a Git URL by _is_git_url() (it starts with git@ and ends with .git), routed to _clone_git(), and then rejected inside _validate_url_host() because urlparse(...).hostname is empty for that syntax. So the git@ acceptance in _is_git_url() is effectively dead, every scp-style URL it admits is turned away a few lines later, and the message the user gets ("URL has no valid hostname") is misleading for what is a perfectly valid Git URL.
To be clear up front, this one fails closed, so it is not a security hole, it is an internal inconsistency where the resolver advertises a URL form it cannot actually process.
Walking the code (commit 7bc9c0f, src/skillspector/input_handler.py)
- Accepted via the
.git suffix branch in _is_git_url() (input_handler.py:143-155); the git@ prefix clears the first check at :145, and path.endswith(".git") returns True at :153-154.
- Routed to clone in
resolve(...) (input_handler.py:110-111).
- Rejected in
_clone_git(), which calls _validate_url_host(url, ALLOWED_GIT_HOSTS) (input_handler.py:185), and that helper does (input_handler.py:163-181):
parsed = urlparse(url)
host = parsed.hostname or ""
if not host:
raise ValueError(f"URL has no valid hostname: {url}")
urlparse does not populate hostname for the scp-style form (there is no // authority for it to parse), so host is empty and the validation raises.
Reproduction
from urllib.parse import urlparse
print(urlparse("git@github.com:org/repo.git").hostname) # None
skillspector scan git@github.com:org/repo.git --no-llm
# ValueError: URL has no valid hostname: git@github.com:org/repo.git
Why it matters
Users who reach for the SSH clone URL (the default GitHub "clone with SSH" gives exactly this form) cannot scan the repo, and the error points them at "no valid hostname" rather than at the real story, which is that scp-style syntax is not handled. The dead git@ branch in _is_git_url() is also a quiet maintenance trap, since it reads as supported.
Two ways to fix, depending on intent
I do not want to guess whether SSH support was meant to be there, so I will lay out both:
- If scp-style SSH URLs are meant to be supported, parse the
user@host:path form explicitly (pull host out from between @ and :) and run it through the same allowlist and private-IP checks before cloning.
- If they are not meant to be supported, reject them up front in
_is_git_url() with a message that actually helps (something like "use the https:// clone URL"), rather than accepting and then failing inside _validate_url_host().
Happy to implement whichever you prefer; let me know which way you want it to go on the issue.
Related
What I'm seeing
A standard scp-style SSH clone URL, the
git@github.com:org/repo.gitform, gets accepted as a Git URL by_is_git_url()(it starts withgit@and ends with.git), routed to_clone_git(), and then rejected inside_validate_url_host()becauseurlparse(...).hostnameis empty for that syntax. So thegit@acceptance in_is_git_url()is effectively dead, every scp-style URL it admits is turned away a few lines later, and the message the user gets ("URL has no valid hostname") is misleading for what is a perfectly valid Git URL.To be clear up front, this one fails closed, so it is not a security hole, it is an internal inconsistency where the resolver advertises a URL form it cannot actually process.
Walking the code (commit 7bc9c0f,
src/skillspector/input_handler.py).gitsuffix branch in_is_git_url()(input_handler.py:143-155); thegit@prefix clears the first check at :145, andpath.endswith(".git")returnsTrueat :153-154.resolve(...)(input_handler.py:110-111)._clone_git(), which calls_validate_url_host(url, ALLOWED_GIT_HOSTS)(input_handler.py:185), and that helper does (input_handler.py:163-181):urlparsedoes not populatehostnamefor the scp-style form (there is no//authority for it to parse), sohostis empty and the validation raises.Reproduction
skillspector scan git@github.com:org/repo.git --no-llm # ValueError: URL has no valid hostname: git@github.com:org/repo.gitWhy it matters
Users who reach for the SSH clone URL (the default GitHub "clone with SSH" gives exactly this form) cannot scan the repo, and the error points them at "no valid hostname" rather than at the real story, which is that scp-style syntax is not handled. The dead
git@branch in_is_git_url()is also a quiet maintenance trap, since it reads as supported.Two ways to fix, depending on intent
I do not want to guess whether SSH support was meant to be there, so I will lay out both:
user@host:pathform explicitly (pullhostout from between@and:) and run it through the same allowlist and private-IP checks before cloning._is_git_url()with a message that actually helps (something like "use the https:// clone URL"), rather than accepting and then failing inside_validate_url_host().Happy to implement whichever you prefer; let me know which way you want it to go on the issue.
Related
.gitSSRF concern). The current allowlist and host check is the fix for [Security] Zip Slip path traversal and SSRF via permissive Git URL validation in InputHandler #147, and this scp rejection is a side effect of that validation, not something [Security] Zip Slip path traversal and SSRF via permissive Git URL validation in InputHandler #147 describes.