Create a KPI dashboard for quality numbers by ahmed0mousa · Pull Request #448 · eclipse-score/communication

ahmed0mousa · 2026-05-18T14:18:09Z

Add a nightly CI pipeline that runs three quality jobs in parallel (coverage, CodeQL, and clang-tidy) and publishes all results to GitHub Pages under latest/quality/. A Jinja2-based dashboard aggregates the findings into a single page with KPI trend tracking across runs. The Sphinx documentation is extended with a dedicated quality reports page and a version switcher navbar, and on every push to main the docs automatically pull the latest nightly KPI numbers so they stay current without waiting for another nightly run.

Issue: SWP-262453

castler

Not yet fully through

castler · 2026-05-19T06:01:14Z

 env:
  ANDROID_HOME: ""
  ANDROID_SDK_ROOT: ""
+  FORCE_JAVASCRIPT_ACTIONS_TO_NODE24: true


We should not have things different in the release workflow than in the others. So either we add this everywhere or nowhere.

Can you please also state in the commit message why this change is needed?

Node 20 will be deprecated on GitHub Actions runners next month. I can add this to the commit message:
https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/

castler · 2026-05-19T06:03:23Z

-            os.system(
-                f"{code_ql_path} database analyze -j=0 {database_location} --format=sarifv2.1.0 --output={output_base}/codeql.sarif")
-            os.system(
-                f"{code_ql_path} database analyze -j=0 {database_location} --format=csv --output={output_base}/codeql.csv")


We should keep the CSV output for direct human readability.

castler · 2026-05-19T06:05:18Z

-                f"{code_ql_path} database analyze -j=0 {database_location} --format=csv --output={output_base}/codeql.csv")
+
+            # Analyze: run MISRA/AUTOSAR queries and produce SARIF.
+            # --ram:     cap at 5 GB to prevent swap thrashing on GitHub runners (7 GB total)


We should not make this the default. Instead, we should add an extra option for GitHub runners and enable it only in the CI, where these parameters need to change.
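A minimal sketch of what such an opt-in could look like, assuming the CodeQL wrapper is a Python script that parses its own arguments. The flag name `--github-runner-limits` and both function names are hypothetical; the resource values are copied from this PR's commit message.

```python
import argparse


def build_analyze_flags(limit_for_github_runner: bool) -> list[str]:
    """Return the extra flags passed to `codeql database analyze`."""
    if not limit_for_github_runner:
        # Default behaviour stays as before: let CodeQL use all cores.
        return ["-j=0"]
    # Values copied from the commit message of this PR; tune as needed.
    return ["--ram=5000", "--timeout=20", "-j=2"]


def parse_args(argv: list[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="CodeQL lint wrapper (sketch)")
    # Hypothetical flag name; off by default so local runs are unaffected.
    parser.add_argument("--github-runner-limits", action="store_true")
    return parser.parse_args(argv)
```

Only the CI workflow would pass the flag, so local runs keep the previous defaults.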

castler · 2026-05-19T06:08:52Z

+      - name: Set conclusion
+        id: set-conclusion
+        run: |
+          if [[ "${{ steps.run-coverage.outcome }}" == "success" ]]; then
+            echo "conclusion=success" >> $GITHUB_OUTPUT
+          else
+            echo "conclusion=failure" >> $GITHUB_OUTPUT
+          fi
+


Why is this needed? Can we try to remove it again, please?

Because the coverage step has continue-on-error: true, the job does not stop when bazel coverage fails; it keeps running. Without the "Set conclusion" step, the caller nightly_quality.yml has no way to know whether coverage actually passed or failed: it only sees the job as successful, because continue-on-error suppresses the failure. In short, continue-on-error hides the failure from GitHub's job status, and "Set conclusion" un-hides it for the dashboard. For a nightly quality pipeline, partial data is actually useful: if coverage fails overnight, you want something to look at the next morning rather than an empty artifact.
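The mapping described above can be sketched in Python for illustration. This is not the workflow's actual implementation (that is the bash step in the diff); the function names are hypothetical. It relies only on the documented behaviour that a step writes `name=value` lines to the file named by `GITHUB_OUTPUT`.

```python
def conclusion_from_outcome(step_outcome: str) -> str:
    """Collapse a step outcome into the success/failure value the caller reads."""
    # Step outcomes can be success, failure, cancelled, or skipped;
    # everything except success is reported as failure.
    return "success" if step_outcome == "success" else "failure"


def write_output(name: str, value: str, output_path: str) -> None:
    """Append name=value to the file that GitHub Actions exposes as GITHUB_OUTPUT."""
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")
```

In a workflow, `output_path` would be `os.environ["GITHUB_OUTPUT"]` and the outcome would come from `steps.run-coverage.outcome`.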

castler · 2026-05-19T06:10:13Z

@@ -0,0 +1,179 @@
+# *******************************************************************************


Can we first take care of just the code coverage, please, to reduce the scope of the PR?

During the discussion we always talked about three jobs, not only coverage, and I have already implemented that.
If you insist, I can add a commit on top that removes them, and then apply it in reverse in another PR.

castler · 2026-05-19T06:11:20Z

+
+      # Restore KPI history from the previous gh-pages deployment so the
+      # dashboard can show delta badges and trend sparklines across runs.
+      - name: Restore KPI history


I thought we agreed that we do not want history at the moment?

castler · 2026-05-19T06:13:32Z

+      # Deploy to GitHub Pages
+      # ------------------------------------------------------------------
+      - name: Deploy quality reports to GitHub Pages
+        uses: peaceiris/actions-gh-pages@v4


This way it is not integrated into our Sphinx build. Maybe you can talk with Jochen about that.

Yes, that step deployed the quality reports directly to gh-pages as a separate, uncoordinated publish. I changed nightly_quality.yml to upload the quality reports as an artifact instead of deploying them; docs.yml now triggers on its completion and deploys everything in one shot.

hoe-jo · 2026-05-19T14:43:35Z

Instead of this Jinja template, maybe generate an RST file which we can include in the Sphinx build?

That is technically possible, but we would lose the visual dashboard (we agreed during the discussions to have one). All we would get is summary tables like the ones here: https://github.com/ahmed0mousa/communication/actions/runs/26016646593
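For comparison, a minimal sketch of the RST alternative, assuming the KPIs are available as simple name/value pairs. The function name and the example KPI names are hypothetical; the output uses the standard docutils `list-table` directive.

```python
def render_kpi_rst(title: str, kpis: dict[str, str]) -> str:
    """Render KPI name/value pairs as an RST section containing a list-table."""
    lines = [
        title,
        "=" * len(title),
        "",
        ".. list-table::",
        "   :header-rows: 1",
        "",
        "   * - KPI",
        "     - Value",
    ]
    for name, value in kpis.items():
        lines.append(f"   * - {name}")
        lines.append(f"     - {value}")
    return "\n".join(lines) + "\n"
```

The generated file could then be pulled into a Sphinx page with an `include` directive. As noted above, this yields tables only, not the interactive dashboard.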

hoe-jo · 2026-05-19T14:44:58Z

+                f"{code_ql_path} database analyze"
+                f"{_analyze_flags}"
+                f" {database_location}"
+                f" --format=sarifv2.1.0 --output={output_base}/codeql.sarif",


Should we maybe upload the SARIF file if we have it already? It could be used for local development.
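As an illustration of what a downloaded SARIF file allows locally, here is a small sketch that counts results per severity level. It relies only on the SARIF 2.1.0 layout (`runs[].results[].level`); the function name is hypothetical.

```python
import json
from collections import Counter


def count_sarif_levels(sarif_text: str) -> Counter:
    """Count results per level across all runs of a SARIF 2.1.0 log."""
    log = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in log.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a missing level is counted as "warning". The
            # spec first consults the rule's default configuration, which
            # this sketch does not do.
            counts[result.get("level", "warning")] += 1
    return counts
```

For a file on disk, pass `Path("codeql.sarif").read_text(encoding="utf-8")` to the function.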

hoe-jo · 2026-05-19T14:45:45Z

+        env:
+          DOCS_VERSION: ${{ steps.vars.outputs.version }}
+          DOCS_BASE_URL: ${{ steps.vars.outputs.base_url }}
+        run: sphinx-build docs/sphinx _sphinx_output


Is there a reason why you run without bazel?

Yes, the sphinx_doc Bazel target is marked testonly = True, meaning it's for CI testing (lint, link-checking), not for producing the deploy artifact. There are also two practical reasons for running sphinx-build directly:

Environment variables: conf.py reads DOCS_VERSION and DOCS_BASE_URL to configure the version switcher navbar. Passing env vars into Bazel's sandbox requires --action_env flags and is fragile in CI.

Output location: Bazel writes HTML under bazel-out/…/sphinx_doc/… making it awkward to reference in the subsequent deploy step. sphinx-build … _sphinx_output puts it exactly where we need it.

The two-phase approach in the workflow is intentional: Bazel handles the complex artifact generation (Doxygen XML, generated RST files), then sphinx-build is called directly for the final HTML output.
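A hedged sketch of how conf.py might consume the two variables. The fallback values, the helper name `switcher_options`, and the location of versions.json directly under the base URL are assumptions. The `switcher` keys (`json_url`, `version_match`) and the `version-switcher` navbar component are pydata-sphinx-theme options.

```python
import os


def switcher_options(environ=os.environ) -> dict:
    """Build pydata-sphinx-theme options from DOCS_VERSION and DOCS_BASE_URL."""
    # Fallbacks are assumptions so that a local build works without the workflow.
    version = environ.get("DOCS_VERSION", "latest")
    base_url = environ.get("DOCS_BASE_URL", "").rstrip("/")
    return {
        "switcher": {
            # Assumed location of versions.json relative to the base URL.
            "json_url": f"{base_url}/versions.json",
            "version_match": version,
        },
        "navbar_end": ["version-switcher"],
    }


html_theme = "pydata_sphinx_theme"
html_theme_options = switcher_options()
```

Because the values are read from the process environment, calling sphinx-build directly picks them up without any `--action_env` plumbing.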

hoe-jo · 2026-05-19T14:48:15Z

+          touch _deploy/.nojekyll
+
+          # Generate versions.json for the pydata-sphinx-theme version switcher
+          python3 docs/sphinx/utils/update_versions_json.py \


This again mixes Python and Bazel.

There's no Bazel target for update_versions_json.py; it's a plain utility script with no BUILD rule, so python3 is the correct call here. Also, Bazel's sandbox blocks all network access by default, so this can only run outside Bazel. It also needs GH_TOKEN injected from the runner environment, which Bazel would strip.
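The real script is not shown in this PR excerpt. The sketch below illustrates only the file format that the pydata-sphinx-theme version switcher reads: a JSON list of objects with `version` and `url`, plus an optional `name`. The function name and the one-directory-per-version URL layout are assumptions, and the network and GH_TOKEN parts are left out.

```python
import json


def upsert_version(entries: list[dict], version: str, base_url: str) -> list[dict]:
    """Return a copy of entries with exactly one entry for `version`."""
    # Assumed layout: each version is deployed under <base_url>/<version>/.
    url = f"{base_url.rstrip('/')}/{version}/"
    kept = [entry for entry in entries if entry.get("version") != version]
    kept.append({"name": version, "version": version, "url": url})
    return kept


def dump_versions(entries: list[dict]) -> str:
    """Serialize the entries in the form expected in versions.json."""
    return json.dumps(entries, indent=2) + "\n"
```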

… only
- Use subprocess.run instead of os.system for all CodeQL commands so errors are properly captured and logged.
- Add --ram 5000 --timeout 20 -j 2 to database analyze to prevent OOM and hung queries on GitHub runners.
- Remove CSV output; SARIF is sufficient for the CI quality report.
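A minimal sketch of the subprocess.run pattern this commit message refers to, not the PR's actual code. The helper name `run_logged` and the logger name are hypothetical.

```python
import logging
import subprocess

logger = logging.getLogger("codeql_lint")  # hypothetical logger name


def run_logged(cmd: list[str]) -> int:
    """Run cmd without a shell, log stderr on failure, and return the exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        logger.error(
            "command failed with exit code %d: %s\n%s",
            result.returncode,
            " ".join(cmd),
            result.stderr,
        )
    return result.returncode
```

Unlike os.system, the command is passed as an argument list, so no shell quoting is involved, and stderr is available for logging instead of being lost.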

- Add FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 env var.
- Add 'conclusion' output (success/failure) to match the interface of clang_tidy.yml and codeql.yml.
- Add id and continue-on-error to the bazel coverage step so the job can report a conclusion even on test failures.
- Gate genhtml and archive steps on run-coverage outcome so they are skipped cleanly when coverage fails.
- Fix cache-save condition to also fire on scheduled (nightly) runs.
- Include raw LCOV .dat file in the artifact so the quality dashboard can read coverage percentages without re-running genhtml.
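Reading a coverage percentage from the raw LCOV data could look roughly like this. It is a sketch that assumes the .dat file is a standard LCOV tracefile with `LF:` (lines found) and `LH:` (lines hit) records; the function name is hypothetical.

```python
def line_coverage_percent(lcov_text: str) -> float:
    """Compute overall line coverage from the LF and LH records of an LCOV tracefile."""
    found = 0
    hit = 0
    for line in lcov_text.splitlines():
        if line.startswith("LF:"):
            found += int(line[3:])
        elif line.startswith("LH:"):
            hit += int(line[3:])
    # Report 0.0 for an empty tracefile to avoid dividing by zero.
    return 100.0 * hit / found if found else 0.0
```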

Both workflows are triggered only via workflow_call from nightly_quality.yml. Each exposes artifact-name and conclusion outputs so the caller can conditionally download reports and build a unified dashboard.

clang_tidy.yml:
- Runs 'bazel test --config=clang-tidy //...' with continue-on-error.
- Collects per-target *.AspectRulesLintClangTidy.out files and generates an HTML summary with error/warning counts and a findings table.

codeql.yml:
- Runs 'bazel run --config=codeql //quality/static_analysis:codeql_lint'.
- Collects SARIF output and generates an HTML summary from it.
- Sets a 180-minute job timeout to guard against hung analyses.
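The error/warning counts could be derived along these lines. This sketch assumes the .out files contain clang-tidy's usual `path:line:col: severity: message [check]` diagnostics, which the PR excerpt does not confirm. The names are hypothetical.

```python
import re
from collections import Counter

# Matches lines such as "src/a.cpp:10:5: warning: ... [check-name]".
# Notes and source-snippet lines are deliberately not matched.
DIAGNOSTIC = re.compile(r"^[^:\s][^:]*:\d+:\d+: (warning|error): ")


def count_severities(output: str) -> Counter:
    """Count warning and error diagnostics in clang-tidy text output."""
    counts: Counter = Counter()
    for line in output.splitlines():
        match = DIAGNOSTIC.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```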

quality/dashboard/generate_dashboard.py:
- Parses CodeQL SARIF files, clang-tidy *.AspectRulesLintClangTidy.out files, and LCOV .dat data into a single Jinja2-rendered HTML page.
- Maintains a quality_history.json for KPI trend tracking across runs.
- Writes a GitHub Actions step summary with markdown KPI tables.

quality/dashboard/dashboard.html.j2:
- Dark-themed single-page dashboard with tabbed panels for CodeQL, Clang-Tidy and Coverage.
- Sortable/filterable findings tables, coverage progress bars, and a run-history table with trend indicators.
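The schema of quality_history.json is not shown in this PR excerpt. The sketch below assumes a plain list of per-run dictionaries with numeric KPI values, and shows how a delta between the last two runs might be computed. All names, keys, and the retention limit are hypothetical.

```python
from typing import Optional


def append_run(history: list[dict], run: dict, max_runs: int = 30) -> list[dict]:
    """Return a new history with `run` appended, keeping only the newest max_runs entries."""
    return (history + [run])[-max_runs:]


def kpi_delta(history: list[dict], key: str) -> Optional[float]:
    """Return the change of `key` between the last two runs, or None if unavailable."""
    if len(history) < 2:
        return None
    return history[-1][key] - history[-2][key]
```

A negative delta for a findings count and a positive delta for coverage would both be improvements, so the dashboard needs to know the desired direction for each KPI.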

ahmed0mousa force-pushed the ahmo_add_nightly_quality_kpi branch from a6a2efd to 7380bb8 Compare May 18, 2026 14:20

castler reviewed May 19, 2026

ahmed0mousa force-pushed the ahmo_add_nightly_quality_kpi branch 4 times, most recently from bff61c4 to bb988f3 Compare May 19, 2026 14:36

hoe-jo reviewed May 19, 2026

ahmed0mousa added 5 commits May 19, 2026 22:03

ci: add FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 to automated_release workflow

fd1d2f0

Use Node24 as Node20 will be deprecated

ci: add nightly quality orchestrator workflow

476de63

Runs every night at midnight UTC (and on workflow_dispatch). Executes coverage, codeql, and clang-tidy in parallel as reusable workflow calls, then deploys all reports plus a unified KPI dashboard to GitHub Pages.

ahmed0mousa force-pushed the ahmo_add_nightly_quality_kpi branch from bb988f3 to 3b8e61b Compare May 19, 2026 21:37

ahmed0mousa added 3 commits May 20, 2026 10:55

ci: add Sphinx docs deploy workflow with version switcher

3cd60b5

docs: add quality reports page to Sphinx documentation

b7f8297

ahmed0mousa force-pushed the ahmo_add_nightly_quality_kpi branch from 3b8e61b to b7f8297 Compare May 20, 2026 08:56

3 participants