Skip to content

feat(trainer): suggest valid HuggingFace model IDs on validation failure#44

Open
taeyang0505 wants to merge 2 commits into
kubeflow:mainfrom
taeyang0505:feat/hf-model-id-suggestions
Open

feat(trainer): suggest valid HuggingFace model IDs on validation failure#44
taeyang0505 wants to merge 2 commits into
kubeflow:mainfrom
taeyang0505:feat/hf-model-id-suggestions

Conversation

@taeyang0505

Copy link
Copy Markdown

What this does

When pre_flight() / estimate_resources() is called with a malformed
HuggingFace model ID (e.g. Ollama-style qwen3:8b, or a typo like
meta-lama/Llama-3), the response now includes a suggestions list of valid
model IDs instead of a bare format error, so users and agents can self-correct
without wasted round-trips.

Changes

  • Add _suggest_hf_model_ids() in planning.py: normalises the raw input
    (hf:// prefix, Ollama :tag, / separators) and queries
    huggingface_hub.list_models(search=..., limit=3). Best-effort — returns
    [] on any lookup error so it never masks the original error.
  • Attach suggestions to the validation-failure error in
    _get_model_info_from_hf().
  • Surface suggestions through the public ToolError.details in
    estimate_resources() so they reach pre_flight() callers.
  • Add unit tests (planning_test.py) covering normalisation, lookup-failure
    fallback, and the error payload shape (HF Hub mocked).

Testing

make verify and make test-python pass locally (150 passed).

Closes #40

Signed-off-by: TaeYang Hong <hongtaeyang1231@gmail.com>
Copilot AI review requested due to automatic review settings June 24, 2026 13:31
@google-oss-prow

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kramaranya for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@github-actions

Copy link
Copy Markdown

🎉 Welcome to the Kubeflow MCP Server! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards
  • Our team will review your PR soon! cc @kubeflow/kubeflow-sdk-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds best-effort HuggingFace Hub-based suggestions to help callers recover from invalid model identifiers during trainer planning flows (notably estimate_resources() and pre_flight()), reducing trial-and-error retries when model ID validation fails.

Changes:

  • Added _suggest_hf_model_ids() to normalize common malformed inputs and query the Hub for close matches.
  • Attached suggestions to the validation error produced by _get_model_info_from_hf(), and surfaced those suggestions through ToolError.details in estimate_resources().
  • Added unit tests for normalization, lookup-failure fallback, and _get_model_info_from_hf() error payload shape (Hub mocked).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
kubeflow_mcp/trainer/api/planning.py Adds Hub-backed suggestion helper and includes suggestions in error details returned by estimate_resources().
kubeflow_mcp/trainer/api/planning_test.py Adds tests for normalization and suggestion behavior on format-validation failures.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +503 to +507
details: dict[str, Any] = {
"hint": "Ensure model path is correct (e.g., 'meta-llama/Llama-3.2-1B')"
}
if hf_info and hf_info.get("suggestions"):
details["suggestions"] = hf_info["suggestions"]
Comment on lines +66 to +84
def test_invalid_model_id_includes_suggestions():
"""An invalid model ID returns the format error plus suggestions when available."""
with patch("huggingface_hub.list_models") as mock_list:
mock_list.return_value = _fake_models("Qwen/Qwen3-8B")
result = _get_model_info_from_hf("qwen3:8b")

assert result is not None
assert "Invalid HuggingFace model ID format" in result["error"]
assert result["suggestions"] == ["Qwen/Qwen3-8B"]


def test_invalid_model_id_omits_suggestions_when_none_found():
"""When the Hub returns nothing, no empty 'suggestions' key is added."""
with patch("huggingface_hub.list_models") as mock_list:
mock_list.return_value = []
result = _get_model_info_from_hf("qwen3:8b")

assert result is not None
assert "suggestions" not in result
…ub lookup

Signed-off-by: TaeYang Hong <hongtaeyang1231@gmail.com>
@taeyang0505

Copy link
Copy Markdown
Author

Thanks for the thorough review! Addressed both points:

_get_model_info_from_hf now also attaches best-effort suggestions when a well-formed but nonexistent ID fails the Hub lookup (e.g. meta-lama/Llama-3), not just on format-validation failure — this covers the second example from the issue.
Added tests asserting estimate_resources() surfaces details["suggestions"] (which pre_flight() inherits), including the well-formed-but-missing case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pre_flight should suggest valid HuggingFace model IDs on format validation failure

2 participants