Details: The build-validator job in the evaluation workflow failed on main at 2026-06-25T14:02Z for the PR #818 evaluation ("Bump GitHub.Copilot.SDK, MessagePack and Nerdbank.MessagePack"). The Build skill-validator step failed with multiple CS0234 errors: The type or namespace name 'SDK' does not exist in the namespace 'GitHub.Copilot'. Affected files: AgentRunner.cs, Judge.cs, LlmSession.cs, LocalSessionFsHandler.cs, PairwiseJudge.cs. The failure occurred against CLI 1.0.65. Note: Today's scheduled run (at 00:55 UTC) succeeded on the latest main commit, suggesting the incompatibility was fixed by a subsequent commit (PR #832, "Require environment input to pat_pool shared workflow.", merged 2026-06-25T22:35).
Details: Of 4 evaluation runs in the last 24h, 1 failed, 2 succeeded, and 1 was cancelled. Failure rate (excl. cancelled) = 33.3%, above the 30% Critical threshold. The single failure was run #4823 (Evaluate PR #818 @ fda2336, build-validator job, build error; same as P1 above). The failure rate is driven entirely by this one build-error run; the scheduled run and the PR #833 evaluation ("Bump the github-actions-dependencies group across 1 directory with 5 updates") both succeeded.
Suggested action: Correlated with P1 above: fixing the SDK build compatibility issue should resolve this metric.
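The failure-rate arithmetic above can be reproduced with a small sketch. The function name and the constant are illustrative assumptions, not part of the health-check workflow; only the run counts and the 30% threshold come from the finding.

```python
def failure_rate(failed: int, succeeded: int, cancelled: int = 0) -> float:
    """Return the failure rate as a percentage, excluding cancelled runs."""
    # Cancelled runs are accepted for bookkeeping but left out of the denominator.
    completed = failed + succeeded
    if completed == 0:
        return 0.0
    return 100.0 * failed / completed


# Threshold quoted in the finding; the constant name is hypothetical.
CRITICAL_THRESHOLD_PCT = 30.0

rate = failure_rate(failed=1, succeeded=2, cancelled=1)
status = "critical" if rate > CRITICAL_THRESHOLD_PCT else "ok"
print(f"{rate:.1f}% -> {status}")  # prints "33.3% -> critical"
```

With only three completed runs, a single failure is enough to cross the threshold, which is why the metric tracks P1 so closely.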
🟡 [P2] Evaluation cancelled after 102 min on main (issue_comment trigger)
Fingerprint: pipeline:evaluation:evaluate:timeout
First seen: 2026-06-26
Details: An issue_comment-triggered evaluation run ("Add eval coverage for dotnet-test/filter-syntax") was cancelled on main at 2026-06-25T08:34Z after running for ~102 minutes. No job failed; the run was cancelled externally (likely by a concurrency group eviction when a newer run started). This is consistent with the chronic eval duration issue (runs taking 99–130 min).
Suggested action: The 102-min runtime is consistent with the chronic duration issue (P3/resource:eval-duration:critical). Reducing eval runtime (parallelizing dotnet-test plugin evaluation) would also reduce concurrency-group cancellations.
Investigation Results
Deep investigations are dispatched for new critical/warning findings.
The grooming workflow links results ~3 hours after this run.
No findings resolved since the last health check (2026-06-25).
Existing Findings (3)
These have been present since before today. Sorted by severity then age.
🔴 Critical - Evaluation average duration critical (~99–130 min avg, threshold: 55 min) · first seen 2026-06-03 · 12 occurrences
Fingerprint: resource:eval-duration:critical
First seen: 2026-06-03 · Occurrences: 12 (chronic, 3+ weeks)
Details: The 14-day average for substantial evaluation schedule runs remains well above the 55-min critical threshold. Today's scheduled run took 99 minutes (down from the 121–130 min recent average), but that is still 1.8× the threshold. The previous run (yesterday's) was 85 min, suggesting some variance.
7-day summary (schedule, main): Today: 99 min, success; yesterday: 85 min ✅; Jun 23–24: 2 failures. Avg ~118 min (est.)
Suggested action: Parallelize eval scenarios in evaluate (dotnet-test): split 175 sequential scenarios across parallel jobs, or introduce a fast-path for PR-triggered evaluations vs. scheduled full runs.
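One way to picture the suggested split is a plain contiguous sharding of the scenario list, sketched below. The shard count of 8 and the scenario names are placeholders chosen for illustration; the finding only states that there are 175 sequential scenarios, and the real evaluation harness may group them differently.

```python
def shard(items, shard_count):
    """Split items into shard_count contiguous slices of near-equal size."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    base, extra = divmod(len(items), shard_count)
    shards, start = [], 0
    for index in range(shard_count):
        # The first `extra` shards take one additional item each.
        size = base + (1 if index < extra else 0)
        shards.append(items[start:start + size])
        start += size
    return shards


# Placeholder scenario names; only the count (175) comes from the report.
scenarios = [f"scenario-{i:03d}" for i in range(175)]
shards = shard(scenarios, 8)  # 8 parallel jobs is a hypothetical choice
print([len(s) for s in shards])  # prints [22, 22, 22, 22, 22, 22, 22, 21]
```

Equal-count shards only balance wall-clock time if scenarios take roughly the same time to run; if durations vary widely, sharding by measured duration would likely work better.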
🟡 Warning - Orphan plugin: dotnet-experimental not listed in marketplace.json · first seen 2026-05-14 · 31 occurrences
First seen: 2026-05-14 · Occurrences: 31 (chronic, 6+ weeks)
Details: plugins/dotnet-experimental/ exists on disk with a valid plugin.json and skills (exp-mock-usage-analysis, exp-test-maintainability, exp-simd-vectorization), but has no entry in .github/plugin/marketplace.json (which lists 14 plugins). The plugin is not discoverable by consumers.
Suggested action: Either add { "name": "dotnet-experimental", "source": "./plugins/dotnet-experimental", "description": "..." } to marketplace.json when ready to publish, or remove the directory if not intended for publication.
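A check of this kind can be sketched as a set difference between plugin directories on disk and names listed in the manifest. This is a rough illustration, not the health check's actual implementation: the top-level "plugins" key in marketplace.json and the use of any nested plugin.json as the marker of a plugin directory are assumptions; only the two paths and the "name" field appear in the finding.

```python
import json
from pathlib import Path


def find_orphan_plugins(repo_root, manifest_key="plugins"):
    """Return plugin directory names that have no entry in marketplace.json.

    Assumes the manifest looks like {"plugins": [{"name": ...}, ...]};
    the top-level key is a guess and can be overridden via manifest_key.
    """
    root = Path(repo_root)
    manifest_path = root / ".github" / "plugin" / "marketplace.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    listed = {entry["name"] for entry in manifest.get(manifest_key, [])}

    on_disk = set()
    for candidate in (root / "plugins").iterdir():
        # Treat a directory as a plugin if it contains a plugin.json anywhere.
        if candidate.is_dir() and next(candidate.rglob("plugin.json"), None):
            on_disk.add(candidate.name)

    return sorted(on_disk - listed)
```

Called as `find_orphan_plugins(".")` from the repository root, this would be expected to report dotnet-experimental for the state described above, provided the manifest layout matches the assumption.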
🔵 Info - evaluation.yml uses --verdict-warn-only mode · first seen 2026-05-16 · 29 occurrences
Fingerprint: infra:verdict-warn-only
First seen: 2026-05-16 · Occurrences: 29 (intentional configuration)
Details: evaluation.yml passes --verdict-warn-only to the skill-validator, treating skill validation failures as warnings rather than hard failures. This is intentional.
⚠️ Eval pipeline degraded today: 1 evaluation build failure + 1 cancellation on main in 24h.
✅ Scheduled eval succeeded (99 min, latest main commit). The build failure was on an older PR commit and appears resolved.
⚠️ P1 + P5 are correlated: the SDK build error in PR #818's skill-validator code drove both findings. The fix (PR #832, merged 22:35 UTC) appears to have resolved the incompatibility.
⚠️ Eval duration remains a chronic concern (118 min est. 7d avg). Concurrency cancellations will continue until runtime is reduced.
⚠️ Skipped I5 check (Pages deployment): GitHub Pages API not accessible via available tools.
ℹ️ I3 (validate-skills): No validate-skills.yml workflow found; check not applicable.
ℹ️ I6 (unpinned actions): All workflow action references use first-party (actions/*) or SHA-pinned third-party actions; no unpinned third-party actions detected.
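The I6 rule can be approximated with a line-based scan of workflow text, sketched below. This is an assumption about how such a check might work, not the agent's actual logic: it treats only actions/* as first-party (as the report does), accepts a 40-character hexadecimal ref as SHA-pinned, and skips local and docker:// references.

```python
import re

# Matches "uses: owner/repo@ref" on a step line, with or without a leading "-".
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)")
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_third_party(workflow_text):
    """Return third-party action references that are not pinned to a full SHA."""
    findings = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1).strip("'\"")
        # Local actions/workflows and container images are out of scope here.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        # First-party actions are allowed to use tags, per the report's rule.
        if ref.startswith("actions/"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.fullmatch(version):
            findings.append(ref)
    return findings


sample = """\
steps:
  - uses: actions/checkout@v4
  - uses: some-org/some-action@v2
  - uses: other-org/tool@0123456789abcdef0123456789abcdef01234567
"""
print(unpinned_third_party(sample))  # prints ['some-org/some-action@v2']
```

The action names in the sample are made up. A regex scan is a shortcut; parsing the YAML and walking jobs.*.steps[*].uses would be more robust against unusual formatting.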
🤖 Generated by DevOps Health Check agentic workflow · Run #28217288553 · 2026-06-26T04:34 UTC
🏥 Daily Health Check - 2026-06-26
Status: 🔴 3 critical · 🟡 2 warnings · 🔵 1 info
Since yesterday: 3 new · 0 resolved · 3 unchanged
New Findings (3)
🔴 [P1] Evaluation build-validator job failed: GitHub.Copilot.SDK build error
Fingerprint: pipeline:evaluation:build-validator:build-skill-validator:failure
Details: The build-validator job in the evaluation workflow failed on main at 2026-06-25T14:02Z for the PR #818 evaluation ("Bump GitHub.Copilot.SDK, MessagePack and Nerdbank.MessagePack"). The Build skill-validator step failed with multiple CS0234 errors: The type or namespace name 'SDK' does not exist in the namespace 'GitHub.Copilot'. Affected files: AgentRunner.cs, Judge.cs, LlmSession.cs, LocalSessionFsHandler.cs, PairwiseJudge.cs. The failure occurred against CLI 1.0.65. Note: Today's scheduled run (at 00:55 UTC) succeeded on the latest main commit, suggesting the incompatibility was fixed by a subsequent commit (PR #832, "Require environment input to pat_pool shared workflow.", merged 2026-06-25T22:35).
Suggested action: Verify that the GitHub.Copilot.SDK namespace usage in AgentRunner.cs, Judge.cs, LlmSession.cs is compatible with the pinned CLI version. Check if PR #803 ("skill-validator: restore 15K aggregate cap as the real Copilot CLI skill-menu budget") introduced a reference to SDK APIs not yet available in CLI 1.0.65.
🔴 [P5] Evaluation failure rate critical: ~33% across all branches in last 24h
Fingerprint: pipeline:evaluation:failure-rate:critical
Details: Of 4 evaluation runs in the last 24h, 1 failed, 2 succeeded, and 1 was cancelled. Failure rate (excl. cancelled) = 33.3%, above the 30% Critical threshold. The single failure was run #4823 (Evaluate PR #818 @ fda2336, build-validator job, build error; same as P1 above). The failure rate is driven entirely by this one build-error run; the scheduled run and the PR #833 evaluation ("Bump the github-actions-dependencies group across 1 directory with 5 updates") both succeeded.
🟡 [P2] Evaluation cancelled after 102 min on main (issue_comment trigger)
Fingerprint: pipeline:evaluation:evaluate:timeout
Details: An issue_comment-triggered evaluation run ("Add eval coverage for dotnet-test/filter-syntax") was cancelled on main at 2026-06-25T08:34Z after running for ~102 minutes. No job failed; the run was cancelled externally (likely by a concurrency group eviction when a newer run started). This is consistent with the chronic eval duration issue (runs taking 99–130 min).
Suggested action: The 102-min runtime is consistent with the chronic duration issue (P3/resource:eval-duration:critical). Reducing eval runtime (parallelizing dotnet-test plugin evaluation) would also reduce concurrency-group cancellations.
Investigation Results
The evaluate (dotnet-test) job is the sole critical bottleneck at 100 minutes, driven by 175 sequential eval scenarios; the regression traces directly to PR #707 (merged June 1), which added polyglot scenarios to three skills.
Resolved Since Yesterday (0)
Existing Findings (3)
🔴 Critical - Evaluation average duration critical (~99–130 min avg, threshold: 55 min) · first seen 2026-06-03 · 12 occurrences
Fingerprint: resource:eval-duration:critical
Details: The evaluate (dotnet-test) job is the bottleneck, driven by 175 sequential eval scenarios added by PR #707 ("Make dotnet-test analysis skills and auditor agent polyglot", merged 2026-06-01).
Suggested action: Parallelize eval scenarios in evaluate (dotnet-test): split 175 sequential scenarios across parallel jobs, or introduce a fast-path for PR-triggered evaluations vs. scheduled full runs.
🟡 Warning - Orphan plugin: dotnet-experimental not listed in marketplace.json · first seen 2026-05-14 · 31 occurrences
Fingerprint: infra:orphan-plugin:dotnet-experimental
Details: plugins/dotnet-experimental/ exists on disk with a valid plugin.json and skills (exp-mock-usage-analysis, exp-test-maintainability, exp-simd-vectorization), but has no entry in .github/plugin/marketplace.json (which lists 14 plugins). The plugin is not discoverable by consumers.
Suggested action: Either add { "name": "dotnet-experimental", "source": "./plugins/dotnet-experimental", "description": "..." } to marketplace.json when ready to publish, or remove the directory if not intended for publication.
🔵 Info - evaluation.yml uses --verdict-warn-only mode · first seen 2026-05-16 · 29 occurrences
Fingerprint: infra:verdict-warn-only
Details: evaluation.yml passes --verdict-warn-only to the skill-validator, treating skill validation failures as warnings rather than hard failures. This is intentional.
Trends (7-day)