Skip to content

[monitor-config] fix: config optim#92

Open
mengchengTang wants to merge 1 commit into
verl-project:mainfrom
mengchengTang:master067
Open

[monitor-config] fix: config optim#92
mengchengTang wants to merge 1 commit into
verl-project:mainfrom
mengchengTang:master067

Conversation

@mengchengTang

@mengchengTang mengchengTang commented Jun 27, 2026

Copy link
Copy Markdown
Collaborator

What does this PR do?

This PR cleans the bundled Prometheus config by removing accidentally committed runtime scrape targets. It also stages bundled Grafana dashboards into the runtime directory and provisions Grafana from that runtime copy, while preserving user-added runtime dashboard files.

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include:
      • monitor-api, monitor-cli, monitor-collector, monitor-server, monitor-config for online monitor changes
      • recipe for offline analysis changes, including pipeline, parser, visualizer, data checker, recipe config, examples, and sample data
      • doc, ci, perf, deployment, misc for cross-cutting changes
    • If this PR involves multiple modules, separate them with , like [recipe, ci] or [monitor-server, monitor-config]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][mstx, torch_profile] feat: support timeline parsing

Test

image

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the documentation and runtime logic for Grafana dashboards in RL-Insight. Specifically, it introduces a staging step (_stage_grafana_dashboards) that copies bundled dashboard JSON files from the package directory to a runtime directory (~/.rl-insight/runtime/dashboards) at startup, and configures Grafana to provision from there. It also cleans up the default Prometheus scrape configurations. The review feedback highlights several edge cases in the new staging function that could lead to runtime errors, such as handling missing configuration values, safely removing existing files or symlinks at the target path, and ensuring the source path is a directory before iterating over it.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines +463 to +467
source = Path(str(OmegaConf.select(conf, "grafana.dashboards_dir"))).resolve()
target = (runtime_dir / "dashboards").resolve()
if source == target:
target.mkdir(parents=True, exist_ok=True)
return target

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If grafana.dashboards_dir is not configured or is set to None, OmegaConf.select will return None. Converting None to a string results in "None", which then resolves to a path named "None" in the current working directory. We should safely handle the case where grafana.dashboards_dir is missing or None before resolving it.

Suggested change
source = Path(str(OmegaConf.select(conf, "grafana.dashboards_dir"))).resolve()
target = (runtime_dir / "dashboards").resolve()
if source == target:
target.mkdir(parents=True, exist_ok=True)
return target
target = (runtime_dir / "dashboards").resolve()
dashboards_dir = OmegaConf.select(conf, "grafana.dashboards_dir")
if not dashboards_dir:
target.mkdir(parents=True, exist_ok=True)
return target
source = Path(str(dashboards_dir)).resolve()
if source == target:
target.mkdir(parents=True, exist_ok=True)
return target

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix. the code removed

Comment thread rl_insight/server/runtime.py Outdated
Comment on lines +469 to +471
if target.exists():
shutil.rmtree(target)
target.mkdir(parents=True, exist_ok=True)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If target exists but is a file or a symlink, calling shutil.rmtree(target) will raise an exception (e.g., NotADirectoryError). It is safer to check if target is a directory and not a symlink before calling shutil.rmtree, and otherwise use target.unlink() to remove it.

Suggested change
if target.exists():
shutil.rmtree(target)
target.mkdir(parents=True, exist_ok=True)
if target.exists() or target.is_symlink():
if target.is_dir() and not target.is_symlink():
shutil.rmtree(target)
else:
target.unlink()
target.mkdir(parents=True, exist_ok=True)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix. the code removed

Comment on lines +473 to +474
if not source.exists():
return target

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If source exists but is a file instead of a directory, calling source.iterdir() on the next line will raise a NotADirectoryError. We should ensure that source is a directory before attempting to iterate over its contents.

Suggested change
if not source.exists():
return target
if not source.exists() or not source.is_dir():
return target

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix. the code removed

@mengchengTang mengchengTang force-pushed the master067 branch 2 times, most recently from 25741bd to 18a3c17 Compare June 27, 2026 10:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant