fix: suggest valid HuggingFace model IDs when the format check fails by AhmedDlshad007 · Pull Request #43 · kubeflow/mcp-server

AhmedDlshad007 · 2026-06-24T13:31:55Z

Description

pre_flight and estimate_resources, which share _get_model_info_from_hf, returned a bare "Invalid HuggingFace model ID format" error for a malformed model reference, with no guidance. Users and AI agents then retried with similarly wrong IDs, wasting round-trips. This PR attaches best-effort suggestions of real Hub model IDs to that error, following the approach proposed in the issue.

Type of Change

feat: New feature
fix: Bug fix
revert: Revert a change
chore: Maintenance / tooling

Checklist

Tests pass locally (make test-python)
Linting passes for the changed files (ruff check + ruff format, under both ruff 0.14.14 and 0.15.12); see the CI note below
Documentation updated (if applicable)
Commit messages follow conventional format

Related Issues

Fixes #40

Problem

With an invalid model ID (e.g. an Ollama-style tag such as qwen3:8b, or an hf://-prefixed value), _get_model_info_from_hf fails the _HF_MODEL_ID_RE check and returns only:

{"error": "Invalid HuggingFace model ID format: 'qwen3:8b'"}
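To illustrate why both kinds of input are rejected, here is a minimal sketch of a format check. The pattern below is a hypothetical stand-in: the actual _HF_MODEL_ID_RE is not shown in this PR, and check_model_id is an illustrative name, not a function from the repository.

```python
import re
from typing import Dict, Optional

# Hypothetical stand-in for _HF_MODEL_ID_RE: "name" or "org/name", where each
# part uses only letters, digits, "_", "." and "-". The real pattern may differ.
HF_MODEL_ID_RE = re.compile(r"[A-Za-z0-9][\w.\-]*(/[A-Za-z0-9][\w.\-]*)?")


def check_model_id(model: str) -> Optional[Dict[str, str]]:
    """Return the bare error dict for a malformed ID, or None if it looks valid."""
    if HF_MODEL_ID_RE.fullmatch(model) is None:
        return {"error": f"Invalid HuggingFace model ID format: '{model}'"}
    return None
```

Under this assumed pattern, the colon in qwen3:8b and the hf:// scheme both fall outside the allowed characters, so both are rejected before any Hub lookup happens.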

Fix

Add a small _suggest_hf_model_ids(model) helper that:

Normalizes the input into a search term (drops an hf:// prefix and any Ollama-style :tag suffix).
Queries huggingface_hub.list_models(search=..., limit=3, full=False) for close matches.
Attaches them as a suggestions list on the error response.

It returns no suggestions (and adds no suggestions key) when the Hub lookup errors or finds nothing, so the validation path stays robust offline and never depends on network access. huggingface_hub is already a dependency.
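The steps above can be sketched roughly as follows. This is not the code from the PR: the function names, the injectable lister parameter (added here so the sketch can be exercised without network access), and the exact normalization rules are assumptions. Only the list_models(search=..., limit=3, full=False) call is taken from the description.

```python
from typing import Any, Callable, Iterable, List, Optional

HF_PREFIX = "hf://"


def normalize_search_term(model: str) -> str:
    """Turn a malformed model reference into a plain Hub search term."""
    term = model.strip()
    # Drop the hf:// prefix first, because the prefix itself contains a colon.
    if term.lower().startswith(HF_PREFIX):
        term = term[len(HF_PREFIX):]
    # Drop an Ollama-style ":tag" suffix, e.g. "qwen3:8b" -> "qwen3".
    term = term.split(":", 1)[0]
    return term.strip()


def suggest_hf_model_ids(
    model: str,
    lister: Optional[Callable[..., Iterable[Any]]] = None,  # assumption: test hook
    limit: int = 3,
) -> List[str]:
    """Best-effort suggestions; returns [] on blank input or any lookup failure."""
    term = normalize_search_term(model)
    if not term:
        return []  # nothing left to search for, so skip the Hub call
    try:
        if lister is None:
            # Imported lazily so a missing package also degrades to [].
            from huggingface_hub import list_models as lister
        return [m.id for m in lister(search=term, limit=limit, full=False)]
    except Exception:
        # Offline, rate-limited, etc.: validation must not depend on the Hub.
        return []
```

Catching every exception is deliberate in this sketch: the suggestions are a convenience on an error path, so a failed lookup should leave the original error untouched rather than replace it.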

Result now:

{"error": "Invalid HuggingFace model ID format: 'qwen3:8b'", "suggestions": ["Qwen/Qwen3-8B", "Qwen/Qwen2.5-7B-Instruct"]}

Tests Added

New tests/unit/trainer/test_planning.py (mocks list_models, no network):

Ollama :tag and hf:// prefix are normalized before searching the Hub.
Suggestions attached on invalid format; the key is omitted when none are found or the Hub errors.
Blank-after-normalization input makes no Hub call.

Full suite passes locally: 151 passed (144 existing + 7 new).

Note on CI

main is currently red independently of this change: the pre-commit job's ruff-format hook reports drift on files this PR does not touch, and security-audit (pip-audit) flags a transitive CVE. Because test is gated on pre-commit, the test matrix is skipped repo-wide until that is resolved. The change here is clean under both the pre-commit ruff pin (0.14.14) and the make verify ruff (0.15.12), and all unit tests pass locally. Happy to rebase once main is green.

/kind bug

google-oss-prow · 2026-06-24T13:32:03Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign abhijeet-dhumal for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

github-actions · 2026-06-24T13:32:06Z

🎉 Welcome to the Kubeflow MCP Server! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards
Our team will review your PR soon! cc @kubeflow/kubeflow-sdk-team

Join the community:

Slack: Join our #kubeflow-ml-experience Slack channel
Meetings: Attend the Kubeflow SDK and ML Experience bi-weekly meetings

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

Copilot

Pull request overview

This PR aims to improve HuggingFace model ID validation feedback by adding best-effort “did you mean” suggestions when the model ID fails the format regex check, reducing unproductive retries for malformed IDs.

Changes:

Added _suggest_hf_model_ids() to normalize malformed inputs and query huggingface_hub.list_models() for up to 3 close matches.
Updated _get_model_info_from_hf() to attach a suggestions list to the invalid-format error response when available.
Added unit tests covering normalization, hub error handling, and suggestion attachment/omission behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `kubeflow_mcp/trainer/api/planning.py` | Add suggestion helper and include suggestions in invalid-format HF validation errors. |
| `tests/unit/trainer/test_planning.py` | Add unit tests for normalization and suggestion attachment logic. |


+            result: dict[str, Any] = {"error": f"Invalid HuggingFace model ID format: '{model}'"}
+            suggestions = _suggest_hf_model_ids(model)
+            if suggestions:
+                result["suggestions"] = suggestions
+            return result


+def test_invalid_format_attaches_suggestions():
+    with patch("huggingface_hub.list_models") as mock_list:
+        mock_list.return_value = _fake_models("Qwen/Qwen3-8B", "Qwen/Qwen2.5-7B-Instruct")
+        result = _get_model_info_from_hf("qwen3:8b")
+
+    assert result["error"] == "Invalid HuggingFace model ID format: 'qwen3:8b'"
+    assert result["suggestions"] == ["Qwen/Qwen3-8B", "Qwen/Qwen2.5-7B-Instruct"]
+
+
+def test_invalid_format_omits_suggestions_when_none_found():
+    with patch("huggingface_hub.list_models") as mock_list:
+        mock_list.return_value = _fake_models()
+        result = _get_model_info_from_hf("qwen3:8b")
+
+    assert result["error"] == "Invalid HuggingFace model ID format: 'qwen3:8b'"
+    assert "suggestions" not in result


When pre_flight or estimate_resources received a malformed model reference (an Ollama-style tag like qwen3:8b, or an hf:// prefixed value), the response was a bare "Invalid HuggingFace model ID format" error with no guidance, so users and agents retried with similarly wrong IDs.

Add a best-effort _suggest_hf_model_ids helper that normalizes the input (drops an hf:// prefix and any Ollama-style :tag suffix) and queries huggingface_hub.list_models for close matches, attaching them as a "suggestions" list. estimate_resources re-wraps the helper error into a ToolError, so propagate the suggestions into the error details there too, which also covers pre_flight (it delegates to estimate_resources).

The lookup degrades to no suggestions when it errors or finds nothing, so validation never depends on network access. huggingface_hub is already a dependency.

Covered by unit tests (mocking list_models) at both the helper and the estimate_resources tool boundary.

Fixes kubeflow#40

Signed-off-by: Ahmed <ahmed.dlshad.m@gmail.com>

AhmedDlshad007 · 2026-06-24T13:58:50Z

Thanks for the review, good catch on the boundary. _suggest_hf_model_ids attached suggestions to the _get_model_info_from_hf return, but estimate_resources (and pre_flight, which delegates to it) re-wrapped that into a fresh ToolError and dropped the extra key, so callers never actually saw the suggestions.

Fixed: estimate_resources now propagates the suggestions into the ToolError details when present, so both estimate_resources and pre_flight surface them. Added tool-boundary tests asserting the suggestions survive into the response (and are omitted when none are found). Full suite is green locally (153 passed). I also squashed to a single commit and normalized the new test file to LF.
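The propagation described here can be sketched as below. The ToolError class is a hypothetical stand-in (the repository's real class and its field names are not shown in this thread), and wrap_model_info_error is an illustrative name rather than a function from the PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ToolError:
    """Hypothetical stand-in for the project's ToolError type."""

    error: str
    details: Dict[str, Any] = field(default_factory=dict)


def wrap_model_info_error(info: Dict[str, Any]) -> ToolError:
    """Re-wrap a helper error dict without dropping its suggestions."""
    details: Dict[str, Any] = {}
    suggestions = info.get("suggestions")
    if suggestions:
        # Copy the list so later mutation of `info` cannot change the error.
        details["suggestions"] = list(suggestions)
    return ToolError(error=info["error"], details=details)
```

The key stays absent when the helper found nothing, which matches the omission behavior the tool-boundary tests are said to assert.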

Copilot AI review requested due to automatic review settings June 24, 2026 13:31

google-oss-prow Bot added the kind/bug Something isn't working label Jun 24, 2026

google-oss-prow Bot requested review from Electronic-Waste, kramaranya and szaher June 24, 2026 13:32

google-oss-prow Bot added the size/L label Jun 24, 2026

Copilot started reviewing on behalf of AhmedDlshad007 June 24, 2026 13:32 View session

Copilot AI reviewed Jun 24, 2026

View reviewed changes

AhmedDlshad007 force-pushed the fix/suggest-hf-model-ids-on-invalid-format branch from fe98afe to 0ff34a6 Compare June 24, 2026 13:58

AhmedDlshad007 wants to merge 1 commit into kubeflow:main from AhmedDlshad007:fix/suggest-hf-model-ids-on-invalid-format