fix(trainer): detect OpenShift pip permission error and document work…#46
Open
priyank766 wants to merge 1 commit into
…around Signed-off-by: priyank <priyank8445@gmail.com>
fix(trainer): detect OpenShift pip permission error and document workaround
Problem
Fixes #41
On OpenShift clusters with a restricted Security Context Constraint (SCC), the root filesystem is read-only. Running `run_custom_training(packages=[...])` executes `pip install --user` (writing to `/.local`), which fails with `PermissionError: [Errno 13] Permission denied: '/.local'` because user-defined volumes are not mounted during the pre-script installation step.
Changes
- `kubeflow_mcp/trainer/api/monitoring.py`: Added an OpenShift-specific pattern to `_FAILURE_PATTERNS` to catch permission errors on `/.local` and return the workaround suggestion.
- `kubeflow_mcp/trainer/__init__.py`: Added prompt guidance under `INSTRUCTION_SECTIONS["training"]` advising AI agents to avoid the `packages` parameter on OpenShift.
- `kubeflow_mcp/trainer/resources/platform-fixes.md`: Documented the issue and the target-directory workaround.
- `tests/unit/trainer/test_monitoring.py`: Added a new test suite to verify that the parser correctly matches the OpenShift error and prioritizes it over generic permission errors.
Workaround
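A minimal sketch of the target-directory idea, for illustration only: it installs packages from inside the training script into a directory the pod can write to, then makes that directory importable. The helper names (`build_pip_command`, `install_to_target`) and the `runner` parameter are hypothetical and not part of this PR; the choice of writable directory is also an assumption and depends on the cluster (for example, a mounted volume).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_pip_command(packages, target_dir):
    """Build a pip invocation that installs into target_dir rather than ~/.local."""
    return [sys.executable, "-m", "pip", "install",
            "--target", str(target_dir), *packages]


def install_to_target(packages, target_dir=None, runner=subprocess.check_call):
    """Install packages into a writable directory and add it to sys.path.

    Assumption: target_dir (or the system temp directory used as a fallback)
    is writable inside the pod. `runner` is injectable so the sketch can be
    exercised without network access.
    """
    if target_dir is None:
        target_dir = Path(tempfile.gettempdir()) / "pip-packages"
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    # Run pip with the same interpreter that runs the training script.
    runner(build_pip_command(packages, target))

    # Make the freshly installed packages importable in this process.
    if str(target) not in sys.path:
        sys.path.insert(0, str(target))
    return target
```

Calling `install_to_target(["some-package"], target_dir="/path/to/writable/dir")` at the top of the training function, before the corresponding imports, keeps the install out of the pre-script step where `/.local` is not writable.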
Do not use the `packages` parameter on OpenShift. Instead, run the installation from within the training script, installing into a target directory (the workaround documented in `platform-fixes.md`).
Verification
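The behavior the new tests are described as checking (the OpenShift-specific pattern matching first, ahead of a generic permission pattern) can be sketched as below. This is an illustrative stand-in, not the contents of `_FAILURE_PATTERNS`: the table name `EXAMPLE_PATTERNS`, the labels, the suggestion strings, and `classify_failure` are all hypothetical.

```python
import re

# Hypothetical pattern table, ordered most-specific first so that the
# OpenShift case wins over the generic permission case.
EXAMPLE_PATTERNS = [
    (
        re.compile(r"Permission denied: '/\.local'"),
        "openshift_pip_permission",
        "Avoid the packages parameter; install inside the training script.",
    ),
    (
        re.compile(r"Permission denied"),
        "generic_permission",
        "Check file and directory permissions.",
    ),
]


def classify_failure(log_text):
    """Return (label, suggestion) for the first matching pattern, else None."""
    for pattern, label, suggestion in EXAMPLE_PATTERNS:
        if pattern.search(log_text):
            return label, suggestion
    return None


# The specific error from the Problem section matches the OpenShift entry,
# even though the generic pattern would also match it.
openshift_log = "PermissionError: [Errno 13] Permission denied: '/.local'"
assert classify_failure(openshift_log)[0] == "openshift_pip_permission"
assert classify_failure("Permission denied: '/data'")[0] == "generic_permission"
```

Ordering the table from most to least specific is one simple way to get the prioritization; the actual parser may implement it differently.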