feat: add GitHub Action to score PRs and suggest splits by vitali87 · Pull Request #39 · vitali87/pr-split

vitali87 · 2026-03-28T19:50:44Z

Summary

Add a composite GitHub Action (action.yml) that scores every PR and posts a split plan comment when the PR is too large
Uses the graph backend by default — no API key needed
Includes scripts/score_pr.py which runs pr-split in dry-run mode, computes metrics, and generates a markdown comment
Creates or updates the comment idempotently via a hidden HTML marker, so re-runs update the existing comment instead of posting a new one
Add example workflow at .github/workflows/split-score.yml
Document the action in README with inputs/outputs tables
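
The idempotent comment behaviour in the summary comes down to one decision: look for the hidden marker among existing comments, then update or create. A minimal Python sketch of that decision follows. The action itself posts via github-script, so this is only an illustration; the function name `plan_comment_action` and the comment dict shape are hypothetical, and the marker string is assumed to be the one shown in the review's sequence diagram.

```python
MARKER = "<!-- pr-split-score -->"  # hidden HTML marker (assumed value)


def plan_comment_action(existing: list[dict], body: str) -> dict:
    """Decide whether to update an existing marked comment or create a new one.

    `existing` is a list of dicts with "id" and "body" keys (hypothetical shape).
    """
    # Make sure the body carries the marker so the next run can find it.
    marked_body = body if MARKER in body else f"{MARKER}\n{body}"
    for comment in existing:
        if MARKER in comment.get("body", ""):
            return {"op": "update", "comment_id": comment["id"], "body": marked_body}
    return {"op": "create", "body": marked_body}
```

Because the marker is an HTML comment, it stays invisible in the rendered PR comment while still being searchable in the raw body.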

How it works

Installs pr-split via uv tool install
Computes diff stats — if LOC <= max-loc, skips with "no split needed"
Runs pr-split split --dry-run with the configured backend
Parses the saved plan JSON and computes metrics (overflow, scatter, group count)
If groups >= threshold, posts a PR comment with the full split plan table
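
The steps above can be sketched as a small decision function. This is a sketch under stated assumptions, not the code in scripts/score_pr.py: `Decision`, `decide`, and the `count_groups` callback (standing in for the dry-run plus plan parsing) are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    ran_split: bool     # whether the dry-run was invoked at all
    should_split: bool  # whether a split-plan comment is warranted
    reason: str


def decide(total_loc: int, max_loc: int, threshold: int,
           count_groups: Callable[[], int]) -> Decision:
    """Mirror the flow: a cheap LOC check first, the dry-run only when needed."""
    if total_loc <= max_loc:
        # Small PR: skip without ever invoking the (slower) dry-run.
        return Decision(False, False, "no split needed")
    groups = count_groups()  # hypothetical stand-in for `pr-split split --dry-run`
    if groups >= threshold:
        return Decision(True, True, f"{groups} groups suggested")
    return Decision(True, False, "plan below group threshold")
```

Passing the group count as a callback keeps the early exit honest: for a PR under the LOC limit the dry-run is never executed.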

Action outputs

total-loc, total-groups, objective, should-split — usable in downstream workflow steps
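
For context, a step in GitHub Actions publishes outputs by appending `name=value` lines to the file named by the `GITHUB_OUTPUT` environment variable. A minimal sketch of such a helper follows; the name `set_output` and the local-run fallback are assumptions, and the script's own helper may differ.

```python
import os


def set_output(name: str, value: str) -> None:
    """Append a single-line name=value pair to the GITHUB_OUTPUT file."""
    if "\n" in value:
        # Multi-line values need the delimiter syntax, which this sketch omits.
        raise ValueError("set_output only supports single-line values")
    line = f"{name}={value}\n"
    path = os.environ.get("GITHUB_OUTPUT")
    if path:
        with open(path, "a", encoding="utf-8") as fh:
            fh.write(line)
    else:
        # Outside Actions (for example a local run), echo so the value is visible.
        print(line, end="")
```

A downstream step could then read a value such as `should-split` through the `steps.<id>.outputs` context.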

Test plan

Script syntax verified, lint clean
Full test suite passes (410 tests)
Manual test on a real PR after merge

gemini-code-assist

Code Review

This pull request introduces a GitHub Action for pr-split, allowing users to automatically score PR complexity and receive split suggestions as CI comments. The changes include a new composite action definition, a Python script for scoring logic, and updated documentation. The review feedback highlights several improvement opportunities: implementing robust error handling for integer environment variables, documenting the missing python-version input, pinning action versions for better security, and replacing hardcoded file paths with dynamic temporary paths to ensure cross-platform compatibility.

gemini-code-assist · 2026-03-28T19:52:46Z

+
+
+def main() -> None:
+    max_loc = os.environ.get("MAX_LOC", "400")


The max_loc variable is read as a string here, but converted to an integer in multiple places later (e.g., line 53, 100). This can raise a ValueError if a non-integer string is provided by the user, causing the action to fail. It would be more robust to parse it to an integer here, handle a potential ValueError with a clear error message, and then use the integer variable throughout the script (e.g., if total_loc <= max_loc:).

Suggested change

-    max_loc = os.environ.get("MAX_LOC", "400")
+    try:
+        max_loc = int(os.environ.get("MAX_LOC", "400"))
+    except ValueError:
+        print("Error: MAX_LOC must be an integer.", file=sys.stderr)
+        sys.exit(1)

gemini-code-assist · 2026-03-28T19:52:46Z

+    min_loc = os.environ.get("MIN_LOC", "")
+    strategy = os.environ.get("PARTITION_STRATEGY", "graph")
+    priority = os.environ.get("PRIORITY", "orthogonal")
+    threshold = int(os.environ.get("THRESHOLD_GROUPS", "2"))


The conversion int(...) can raise a ValueError if the environment variable THRESHOLD_GROUPS is not a valid integer. This will crash the script. It's safer to wrap this in a try-except block to handle potential errors gracefully and provide a more informative error message.

Suggested change

-    threshold = int(os.environ.get("THRESHOLD_GROUPS", "2"))
+    try:
+        threshold = int(os.environ.get("THRESHOLD_GROUPS", "2"))
+    except ValueError:
+        print("Error: THRESHOLD_GROUPS must be an integer.", file=sys.stderr)
+        sys.exit(1)
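
Since the same try/except is suggested for both MAX_LOC and THRESHOLD_GROUPS, one shared helper could cover both. The sketch below is hypothetical: the name `parse_int_env`, its signature, and the choice to treat an empty value as "use the default" are assumptions, not the script's actual code.

```python
import os
import sys


def parse_int_env(name: str, default: int) -> int:
    """Read an integer from the environment, exiting with a clear error if invalid."""
    raw = os.environ.get(name, "").strip()
    if not raw:
        # Assumption: an unset or empty input falls back to the default.
        return default
    try:
        return int(raw)
    except ValueError:
        print(f"Error: {name} must be an integer, got {raw!r}.", file=sys.stderr)
        sys.exit(1)


# Usage: parse once at the top of main(), then compare integers everywhere.
# max_loc = parse_int_env("MAX_LOC", 400)
# threshold = parse_int_env("THRESHOLD_GROUPS", 2)
```

Parsing once up front also removes the repeated `int(max_loc)` conversions the first comment points out.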

gemini-code-assist · 2026-03-28T19:52:46Z

+        with:
+          fetch-depth: 0
+
+      - uses: vitali87/pr-split@main


It is a security and stability best practice to pin actions to a specific version (a tag or a commit SHA) instead of a branch like main. Using @main can introduce breaking changes into your workflow unexpectedly. Please consider updating this to use a specific version tag once you create a release.

For example:

- uses: vitali87/pr-split@v1.0.0
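
To make the distinction concrete, here is a small heuristic that classifies the ref in a `uses:` value. It is an illustrative sketch only: `classify_ref` is a hypothetical name, and the tag pattern is a guess at common `vN` / `vN.N.N` naming rather than any rule GitHub enforces.

```python
import re

_SHA_RE = re.compile(r"[0-9a-f]{40}")    # full commit SHA (immutable)
_TAG_RE = re.compile(r"v?\d+(\.\d+)*")   # looks like a version tag (movable)


def classify_ref(uses: str) -> str:
    """Return 'local', 'sha', 'tag', 'branch', or 'unknown' for a `uses:` value."""
    if uses.startswith("./"):
        return "local"
    _, _, ref = uses.partition("@")
    if not ref:
        return "unknown"
    if _SHA_RE.fullmatch(ref):
        return "sha"
    if _TAG_RE.fullmatch(ref):
        return "tag"
    # Anything else (main, master, feature names) is treated as a branch.
    return "branch"
```

Under this heuristic `vitali87/pr-split@main` is classified as a branch and `vitali87/pr-split@v1.0.0` as a tag; only a full commit SHA is immutable.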

gemini-code-assist · 2026-03-28T19:52:46Z

+| `partition-strategy` | `graph` | Backend for partitioning (`graph` or `cp_sat`) |
+| `priority` | `orthogonal` | Grouping priority (`orthogonal` or `logical`) |
+| `threshold-groups` | `2` | Minimum suggested groups before posting the split plan |
+| `post-comment` | `true` | Whether to post a PR comment with the results |

Suggested change

-| `post-comment` | `true` | Whether to post a PR comment with the results |
+| `post-comment` | `true` | Whether to post a PR comment with the results |
+| `python-version` | `3.12` | Python version to use |


gemini-code-assist · 2026-03-28T19:52:46Z

+      with:
+        script: |
+          const fs = require('fs');
+          const body = fs.readFileSync('/tmp/pr-split-comment.md', 'utf8');


The path /tmp/pr-split-comment.md is hardcoded, which can be brittle and may not work on all runner operating systems. This path should be received as an output from the score step. I've left a corresponding comment on scripts/score_pr.py with the required changes for that file. Once the python script outputs a comment_path, this line would become:

const body = fs.readFileSync('${{ steps.score.outputs.comment_path }}', 'utf8');

gemini-code-assist · 2026-03-28T19:52:46Z

+        lines.append("This PR is within acceptable size limits.")
+
+    comment = "\n".join(lines)
+    with open("/tmp/pr-split-comment.md", "w") as f:


Using a hardcoded path /tmp/pr-split-comment.md can be brittle and might not work on all runner environments (e.g., Windows). It's better to use Python's tempfile module to create a temporary file and then pass its path as an output to the next step. This would also require a change in action.yml to read from this output.

For example, you could replace lines 158-159 with:

    comment_path = os.path.join(tempfile.gettempdir(), "pr-split-comment.md")
    with open(comment_path, "w") as f:
        f.write(comment)
    _set_output("comment_path", comment_path)

Remember to import tempfile and import os at the top of the file. I've left a related comment on action.yml.
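
A variant worth considering prefers the runner's own temp directory when it is available. GitHub Actions sets the `RUNNER_TEMP` environment variable on its runners; the sketch below falls back to `tempfile.gettempdir()` elsewhere. The function name `write_comment` is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def write_comment(comment: str, filename: str = "pr-split-comment.md") -> Path:
    """Write the comment markdown to a portable temp location and return its path."""
    # RUNNER_TEMP exists on Actions runners; gettempdir() covers local runs.
    base = os.environ.get("RUNNER_TEMP") or tempfile.gettempdir()
    path = Path(base) / filename
    path.write_text(comment, encoding="utf-8")
    return path
```

The returned path would then be published as a step output so the posting step never needs to know where the file lives.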

greptile-apps · 2026-03-28T20:01:11Z

Greptile Summary

This PR introduces a composite GitHub Action (action.yml) and supporting Python script (scripts/score_pr.py) that automatically scores every PR for size and posts a markdown split-plan comment when it exceeds a configurable LOC threshold. It also adds an example workflow and documents the action in the README.

The implementation is well-structured — previous review concerns (shell injection via python-version, fork PR fetch via refs/pull/{n}/head, markdown table pipe escaping, and /tmp path collisions) have all been addressed. A few minor clean-up items remain:

depends_on group IDs in the markdown split-plan table are not passed through _md_escape, unlike every other dynamic field (title, id, file_path).
min_loc_raw is forwarded to pr-split without integer validation, inconsistent with the validated MAX_LOC and THRESHOLD_GROUPS inputs — a bad value produces a misleading "pr-split failed" message.
astral-sh/setup-uv and actions/github-script are pinned to floating version tags (@v4, @v7) rather than immutable commit SHAs, which is a supply-chain concern for a publicly-reusable action.
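
For illustration, a helper in the spirit of the `_md_escape` mentioned above might look like this minimal sketch. The real implementation is not shown in this thread, so the escaping rules (backslash-escape pipes, collapse newlines) and the names `md_escape` and `deps_cell` are assumptions.

```python
def md_escape(value: object) -> str:
    """Make a value safe to place in a single markdown table cell."""
    text = str(value).replace("|", r"\|")  # a bare pipe would end the cell
    return " ".join(text.splitlines())     # a newline would end the row


def deps_cell(group: dict) -> str:
    """Render the depends_on column with every ID escaped."""
    return ", ".join(md_escape(d) for d in group.get("depends_on", []))
```

Routing `depends_on` through the same helper as the other columns means a change in ID format cannot break the table in only one place.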

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style/hardening suggestions with no impact on correctness or runtime behaviour.

All P0/P1 issues raised in previous review threads have been addressed. The three remaining comments are P2: a minor markdown-escaping gap in depends_on, an inconsistent integer-validation path for min_loc_raw, and unpinned action SHAs. None of these affect the primary user path. Per the confidence guidance, a PR with only P2 findings scores 5/5.

scripts/score_pr.py — minor escaping and validation gaps; action.yml — unpinned third-party action tags.

Important Files Changed

| Filename | Overview |
| --- | --- |
| scripts/score_pr.py | Core scoring script — fork-PR fetch handled correctly via `refs/pull/{n}/head`; minor gaps: `depends_on` values not escaped in markdown table, `min_loc_raw` not validated as integer |
| action.yml | Composite action wiring is solid; `python-version` injection addressed via env indirection; unpinned `@v4`/`@v7` action tags remain a supply-chain concern |
| .github/workflows/split-score.yml | Example workflow triggering on `pull_request` — intentionally uses local `./` reference for self-testing; `pull-requests: write` permission declared correctly |
| README.md | Documentation update adding GitHub Action section with inputs/outputs tables and usage example — accurate and consistent with action.yml |

Sequence Diagram

sequenceDiagram
    participant GH as GitHub Event
    participant WF as split-score.yml
    participant AC as action.yml (composite)
    participant PY as score_pr.py
    participant GS as github-script

    GH->>WF: pull_request opened/synced
    WF->>AC: uses: ./
    AC->>AC: Install uv + Python
    AC->>AC: uv tool install pr-split
    AC->>PY: python score_pr.py
    PY->>PY: git fetch origin base_branch
    PY->>PY: git fetch refs/pull/{n}/head (fork-safe)
    PY->>PY: git diff --numstat → total_loc
    alt total_loc <= max_loc
        PY-->>AC: should_split=false (skip)
    else total_loc > max_loc
        PY->>PY: pr-split split --dry-run
        PY->>PY: Parse .pr-split/plan.json
        PY->>PY: Compute overflow / scatter / objective
        PY->>PY: Write comment markdown to RUNNER_TEMP
        PY-->>AC: outputs: should_split, total_loc, total_groups, objective, comment_path
    end
    AC->>GS: Post comment (if should_split==true)
    GS->>GS: listComments → find <!-- pr-split-score -->
    alt comment exists
        GS->>GH: updateComment (idempotent)
    else no comment
        GS->>GH: createComment
    end
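
The `git diff --numstat → total_loc` step in the diagram can be sketched as below. `git diff --numstat` prints one `added<TAB>deleted<TAB>path` line per file and uses `-` for both counts on binary files. Whether the script sums additions and deletions exactly this way is an assumption, and `total_loc_from_numstat` is a hypothetical name.

```python
def total_loc_from_numstat(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        if len(parts) < 3:
            continue  # blank or malformed line
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary file: git reports no line counts
        total += int(added) + int(deleted)
    return total
```

Skipping binary entries keeps a large image or archive from inflating the LOC score.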

Prompt To Fix All With AI

This is a comment left during a code review.
Path: scripts/score_pr.py
Line: 170

Comment:
**`depends_on` IDs not escaped like other table fields**

`deps` is the only column in the split-plan table whose values aren't passed through `_md_escape`. `depends_on` entries are group IDs — the same values that are escaped with `_md_escape(g["id"])` on line 175. If pr-split ever generates IDs containing `|` (or changes its ID format), this column would break the markdown table while all other columns are already protected.

```suggestion
    deps = ", ".join(_md_escape(d) for d in g.get("depends_on", [])) or "—"
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: scripts/score_pr.py
Line: 102-103

Comment:
**`min-loc` bypasses integer validation**

`MAX_LOC` and `THRESHOLD_GROUPS` are validated through `_parse_int_env` (which exits with a clear message if the value isn't an integer), but `min_loc_raw` is passed straight to `pr-split` without any validation. If someone passes a non-integer string for `min-loc`, pr-split will fail and the script will silently fall through to `_skip("pr-split failed to generate a plan.")` — a misleading error that hides the real cause.

```suggestion
    if min_loc_raw:
        try:
            int(min_loc_raw)
        except ValueError:
            print(f"Error: MIN_LOC must be an integer, got '{min_loc_raw}'.", file=sys.stderr)
            sys.exit(1)
        cmd.extend(["--min-loc", min_loc_raw])
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: action.yml
Line: 47-48

Comment:
**Unpinned action tags are a supply-chain risk**

Both `astral-sh/setup-uv@v4` (line 48) and `actions/github-script@v7` (line 76) reference floating version tags rather than immutable commit SHAs. Any repo that uses this action will silently pick up whatever those tags point to if a tag is force-pushed. For a reusable action published to the Marketplace, pinning to SHA is the recommended practice:

```yaml
- uses: astral-sh/setup-uv@f0ec1fc3b38f5e7cd731bb6ce540c5af426746bb  # v4.x
```
```yaml
- uses: actions/github-script@60a0d83039c74a4aee543508d2ffcb1c3799cdea  # v7.x
```

Pin both to the exact commit SHA that corresponds to the version you trust.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (2): Last reviewed commit: "fix: prevent shell injection, support fo..."

feat: add GitHub Action to score PRs and suggest splits

3339e65

gemini-code-assist Bot reviewed Mar 28, 2026

vitali87 added 3 commits March 28, 2026 20:54

fix: use local action reference for in-repo workflow

e170daa

fix: install pr-split from local checkout instead of PyPI

2fa3fbc

fix: create local branch refs for GitHub Actions checkout compatibility

59970db

greptile-apps Bot reviewed Mar 28, 2026

Comment thread scripts/score_pr.py Outdated

Comment thread action.yml Outdated

Comment thread action.yml

Comment thread scripts/score_pr.py Outdated

Comment thread scripts/score_pr.py Outdated

fix: prevent shell injection, support fork PRs, and use portable temp paths

ee03398

vitali87 force-pushed the feat/split-score-action branch from 12d823d to ee03398 on March 28, 2026 20:04

vitali87 merged commit 3f9577b into main Mar 28, 2026
5 checks passed

