[BUG] scp-style SSH Git URLs (git@host:org/repo.git) are accepted as Git URLs, then rejected with "URL has no valid hostname" #202

Description

@wernerkasselman-au

What I'm seeing

A standard scp-style SSH clone URL, the git@github.com:org/repo.git form, is accepted as a Git URL by _is_git_url() (it starts with git@ and ends with .git), routed to _clone_git(), and then rejected inside _validate_url_host() because urlparse(...).hostname is None for that syntax. The git@ acceptance in _is_git_url() is therefore effectively dead: every scp-style URL it admits is turned away a few lines later. The message the user gets ("URL has no valid hostname") is also misleading, because the input is a perfectly valid Git URL.

To be clear up front: this one fails closed, so it is not a security hole. It is an internal inconsistency in which the resolver advertises a URL form it cannot actually process.

Walking the code (commit 7bc9c0f, src/skillspector/input_handler.py)

  • Accepted via the .git suffix branch in _is_git_url() (input_handler.py:143-155); the git@ prefix clears the first check at :145, and path.endswith(".git") returns True at :153-154.
  • Routed to clone in resolve(...) (input_handler.py:110-111).
  • Rejected in _clone_git(), which calls _validate_url_host(url, ALLOWED_GIT_HOSTS) (input_handler.py:185), and that helper does (input_handler.py:163-181):
    parsed = urlparse(url)
    host = parsed.hostname or ""
    if not host:
        raise ValueError(f"URL has no valid hostname: {url}")

urlparse does not populate hostname for the scp-style form (there is no // authority for it to parse), so parsed.hostname is None, host falls back to the empty string, and the validation raises.
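To make the parsing difference concrete, here is a small standalone comparison of the scp-style form against the equivalent ssh:// URL. It uses only the standard library; the ssh:// URL is an illustrative counterpart I picked for contrast, not something taken from input_handler.py.

```python
from urllib.parse import urlparse

# The scp-style form from this issue, and an ssh:// URL for the same repo
# (the second one is only here for contrast).
scp_style = "git@github.com:org/repo.git"
ssh_url = "ssh://git@github.com/org/repo.git"

for url in (scp_style, ssh_url):
    parsed = urlparse(url)
    # Without a "//" authority, everything ends up in .path and
    # .hostname stays None.
    print(
        f"{url}\n"
        f"  scheme={parsed.scheme!r} netloc={parsed.netloc!r} "
        f"hostname={parsed.hostname!r} path={parsed.path!r}"
    )
```

For the scp-style input the whole string lands in path and hostname is None, whereas the ssh:// form yields hostname "github.com".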

Reproduction

# Python
from urllib.parse import urlparse
print(urlparse("git@github.com:org/repo.git").hostname)   # None

# Shell
skillspector scan git@github.com:org/repo.git --no-llm
# ValueError: URL has no valid hostname: git@github.com:org/repo.git

Why it matters

Users who reach for the SSH clone URL (the default GitHub "clone with SSH" gives exactly this form) cannot scan the repo, and the error points them at "no valid hostname" rather than at the real story, which is that scp-style syntax is not handled. The dead git@ branch in _is_git_url() is also a quiet maintenance trap, since it reads as supported.

Two ways to fix, depending on intent

I do not want to guess whether SSH support was meant to be there, so I will lay out both:

  1. If scp-style SSH URLs are meant to be supported, parse the user@host:path form explicitly (pull host out from between @ and :) and run it through the same allowlist and private-IP checks before cloning.
  2. If they are not meant to be supported, reject them up front in _is_git_url() with a message that actually helps (something like "use the https:// clone URL"), rather than accepting and then failing inside _validate_url_host().
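A rough sketch of what either option could look like, to anchor the discussion. Everything named here is hypothetical: split_scp_style, validate_scp_host, reject_scp_style and EXAMPLE_ALLOWED_HOSTS are names I made up for illustration, the allowlist contents are placeholders (the real ALLOWED_GIT_HOSTS is not shown in this issue), and the private-IP check is left out because I have not reproduced the existing helper's logic.

```python
import re
from typing import Optional, Tuple

# Placeholder allowlist for this sketch only; the real ALLOWED_GIT_HOSTS
# lives in input_handler.py and may differ.
EXAMPLE_ALLOWED_HOSTS = frozenset({"github.com", "gitlab.com"})

# [user@]host:path with no "/" before the first ":" and no "//" after it.
_SCP_STYLE = re.compile(
    r"^(?:(?P<user>[^@/:\s]+)@)?(?P<host>[^@/:\s]+):(?P<path>(?!//)\S+)$"
)


def split_scp_style(url: str) -> Optional[Tuple[str, str]]:
    """Return (host, path) for a user@host:path URL, or None otherwise."""
    if "://" in url:
        return None  # has a scheme, so urlparse can handle it
    match = _SCP_STYLE.match(url)
    if match is None:
        return None
    return match.group("host").lower(), match.group("path")


def validate_scp_host(url: str, allowed_hosts=EXAMPLE_ALLOWED_HOSTS) -> str:
    """Option 1: extract the host and apply the allowlist before cloning."""
    parts = split_scp_style(url)
    if parts is None:
        raise ValueError(f"Not an scp-style Git URL: {url}")
    host, _path = parts
    if host not in allowed_hosts:
        raise ValueError(f"Host not in allowlist: {host}")
    # A real implementation would also run the existing private-IP checks here.
    return host


def reject_scp_style(url: str) -> None:
    """Option 2: refuse scp-style URLs early with an actionable message."""
    parts = split_scp_style(url)
    if parts is not None:
        host, path = parts
        raise ValueError(
            f"scp-style SSH URLs are not supported: {url}. "
            f"Use the https:// clone URL instead, e.g. https://{host}/{path}"
        )


print(validate_scp_host("git@github.com:org/repo.git"))
```

Under option 1, validate_scp_host would be called in place of the urlparse-based host extraction for scp-style input. Under option 2, reject_scp_style would run before the URL is ever routed to the clone path.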

Happy to implement whichever you prefer; let me know on the issue which way you want to go.

Related
