From 8b5e57b7aeb54a6e591c75b40ba05cd3bff1819d Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Fri, 10 Apr 2026 16:22:01 -0700
Subject: [PATCH 1/9] HDDS-11026. Continuous Integration With GitHub Actions.

---
 .../03-test/04-continuous-integration.md | 159 +++++++++++++++++-
 1 file changed, 152 insertions(+), 7 deletions(-)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index 0f784dfc5c..1f24b71880 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -1,15 +1,160 @@
 ---
-draft: true
 sidebar_label: Continuous Integration
 ---
 
 # Continuous Integration With GitHub Actions
 
-**TODO:** File a subtask under [HDDS-9861](https://issues.apache.org/jira/browse/HDDS-9861) and complete this page or section.
+If you are new to the project, **you do not need to understand every job** below on day one. The goal of this page is to help you get a green **`build-branch`** run on your fork, know where to look when something fails, and find deeper detail when you need it.
 
-Aggregate content from our various GitHub actions guides, including
+Apache Ozone uses [GitHub Actions](https://docs.github.com/en/actions) to build and test every meaningful change. Workflow files live in [`.github/workflows`](https://github.com/apache/ozone/tree/master/.github/workflows) in [`apache/ozone`](https://github.com/apache/ozone). A longer, file-by-file reference lives in [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md).
 
-- [ci.md](https://github.com/apache/ozone/blob/master/.github/ci.md)
-- [CONTRIBUTING.md](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#check-your-contribution)
-- https://cwiki.apache.org/confluence/display/OZONE/Ozone+CI+with+Github+Actions
-- https://cwiki.apache.org/confluence/display/OZONE/Github+Actions+tips+and+tricks.
+:::info Use the right repository
+
+This page is about **[`apache/ozone`](https://github.com/apache/ozone)** (the Ozone product source code). The documentation site you are reading comes from **[`apache/ozone-site`](https://github.com/apache/ozone-site)** and has its **own** CI. For website-only edits, use the [ozone-site contributing guide](https://github.com/apache/ozone-site/blob/master/CONTRIBUTING.md).
+
+:::
+
+## Start here: your first code contribution
+
+Follow these steps once; after that, pushing to your branch is the usual loop.
+
+1. **Fork and clone** [`apache/ozone`](https://github.com/apache/ozone) to **your** GitHub account, then clone **your fork** locally. You will push branches to `origin` on the fork, then open a PR to `apache/ozone`.
+2. **Turn on Actions** on the fork so workflows actually run ([how to enable them](#enable-github-actions-on-your-fork)).
+3. **Jira** — Create or choose an issue in [HDDS](https://issues.apache.org/jira/projects/HDDS/) (the Ozone Jira project; the name is historical). Need an account? Use the ASF [Jira self-service](https://selfserve.apache.org/jira-account.html?project=ozone) form.
+4. **Branch** — Work on a branch, often named after the issue (for example `HDDS-1234`).
+5. **Push** — When you push, GitHub should show a **`build-branch`** workflow run under the **Actions** tab on your fork. Wait for it to finish and fix any failures you can reproduce.
+6. **Open the PR** — Use the [pull request template](https://github.com/apache/ozone/blob/master/.github/pull_request_template.md). When the change is ready for review, set the Jira to **Patch Available** so committers know to look.
+
+The full narrative (reviews, merging, Jira etiquette) is in the [Ozone contributing guide](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#contribute-your-modifications).
+
+:::tip You can lean on CI first
+
+Many contributors fix quick issues by reading the failing log on GitHub, then pushing a small follow-up commit. Running every check locally is **optional but helpful** for faster feedback; see [Run checks on your machine](#run-checks-on-your-machine).
+
+:::
+
+## Enable GitHub Actions on your fork
+
+New forks sometimes have workflows off until you allow them.
+
+1. Open **your fork** on GitHub → **Settings** → **Actions** → **General**.
+2. Under **Actions permissions**, pick a policy that allows workflows to run (many people use **Allow all actions and reusable workflows** on personal forks).
+3. Open the **Actions** tab. If GitHub asks to enable workflows, confirm so **`build-branch`** runs when you push.
+
+More detail: [Enabling or disabling GitHub Actions](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/enabling-or-disabling-github-actions-for-a-repository).
+
+## What you see in GitHub: `build-branch`
+
+Two names show up in docs; both mean “the main CI pipeline”:
+
+| What | Meaning |
+| --- | --- |
+| **`build-branch`** | The **name** of the workflow in the Actions tab. It comes from the `name:` field in [`post-commit.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/post-commit.yml). |
+| **`ci.yml`** | Where most **jobs** (compile, tests, and so on) are defined. `post-commit.yml` calls this file as a [reusable workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows). |
+
+So: **`post-commit.yml`** = front door; **`ci.yml`** = where the heavy lifting is described.
+
+When you push new commits to an open pull request, **newer runs can cancel older ones** still in progress ([concurrency](https://docs.github.com/en/actions/using-jobs/using-concurrency)). That is normal and saves time.
+
+## Run checks on your machine
+
+Running scripts locally catches problems before you push. You need a working dev environment first—see [Build with Maven](../../developer-guide/build/maven) and [Building from source](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#building-from-source) in `CONTRIBUTING.md`.
+
+From the **root of your clone** (the folder that contains `hadoop-ozone/`):
+
+```bash
+./hadoop-ozone/dev-support/checks/build.sh
+```
+
+Most checks live in [`hadoop-ozone/dev-support/checks`](https://github.com/apache/ozone/tree/master/hadoop-ozone/dev-support/checks). The [Check your contribution](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#check-your-contribution) section groups them by rough duration:
+
+| Rough time | Scripts | What they do |
+| --- | --- | --- |
+| Build step | `build.sh` | Compile Ozone |
+| Minutes | `author.sh`, `bats.sh`, `rat.sh`, `docs.sh`, `dependency.sh`, `checkstyle.sh`, `pmd.sh` | Style, license headers, docs, dependency list |
+| ~10 minutes | `findbugs.sh`, `kubernetes.sh` | SpotBugs, small Kubernetes-related checks |
+| An hour or more | `unit.sh`, `integration.sh`, `acceptance.sh` | Unit tests, mini-cluster tests, Docker Compose acceptance tests |
+
+More on test styles: [Unit tests](./unit-tests), [Integration tests](./integration-tests), [Acceptance tests](./acceptance-tests).
+
+`integration.sh` and `acceptance.sh` can take extra arguments to run a subset; open the scripts to see options. Output usually lands under `target/` (for example `target/docs`).
+
+## Why did CI skip some jobs?
+
+Not every pull request runs every job. A step called **build-info** runs [`selective_ci_checks.sh`](https://github.com/apache/ozone/blob/master/dev-support/ci/selective_ci_checks.sh) and only enables jobs that match the files you changed—unless:
+
+- the run is **not** from a PR, or
+- the PR has the **`full tests needed`** [label](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels).
+
+So a focused change might show fewer checks than a large refactor. **That is expected.** Reviewers can add **`full tests needed`** when the full matrix is required. If you think the wrong jobs were skipped, **ask on the PR**; reviewers are used to that question.
+
+## What the main CI jobs do (overview)
+
+The list below matches [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md). Treat it as a map when reading logs, not something to memorize.
+
+- **build-info** — Decides which other jobs run (selective CI).
+- **compile** — Builds with Java 8 and 11 via [`build.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/build.sh); later jobs typically use the Java 8 build.
+- **basic** — Checks like author tags, BATS, Checkstyle, Hugo for docs, SpotBugs, PMD, RAT—depending on what was selected.
+- **unit** — [`unit.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/unit.sh) and [`native.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/native.sh) for RocksDB-native tests.
+- **dependency** — Compares JARs to [`jar-report.txt`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dist/src/main/license/jar-report.txt).
+- **acceptance** — [`acceptance.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/acceptance.sh) (Robot Framework + Docker Compose; variants like secure / unsecure / misc).
+- **kubernetes** — [`kubernetes.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/kubernetes.sh).
+- **integration** — [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) (mini-cluster style tests, often sharded in CI).
+- **coverage** — [`coverage.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/coverage.sh) merges coverage when earlier jobs produced it.
+
+Some jobs use a **matrix** (for example multiple Java versions) with **fail-fast**: if one matrix leg fails, the others in that matrix stop. Unrelated jobs can still run until they finish or fail.
+
+## Other workflows
+
+The [workflows directory](https://github.com/apache/ozone/tree/master/.github/workflows) also contains jobs for caches, labeling, Ratis builds, repeating tests, generated docs, and more. The folder on `master` is the up-to-date list.
+
+### Stale pull requests
+
+[`close-stale-prs.yaml`](https://github.com/apache/ozone/blob/master/.github/workflows/close-stale-prs.yaml) runs on a timer and uses [actions/stale](https://github.com/actions/stale) to nudge and eventually close very inactive PRs. Exact timings are in that file.
+
+## If something fails
+
+:::note Green CI is a team norm
+
+A red check does not mean you did something wrong—it means the run found something to fix. It happens to everyone.
+
+:::
+
+1. Open the failed **`build-branch`** run → click the red job → read the **log** from the bottom upward for the first error.
+2. If the job uploaded **Artifacts**, download them from the run summary (they expire after a short time).
+3. Try the same **check script** locally if you have the environment set up ([Run checks on your machine](#run-checks-on-your-machine)).
+4. To re-run without new code, use **Re-run all jobs** or **Re-run failed jobs** on the run page, or:
+
+```bash
+git commit --allow-empty -m 'trigger new CI check'
+```
+
+Failures on the **main** repo’s default branch sometimes leave extra artifacts on [ozone build results](https://elek.github.io/ozone-build-results/) (community mirror).
+
+### Get help
+
+- Ask on your **pull request**—reviewers can interpret unfamiliar failures quickly.
+- **Email** [dev@ozone.apache.org](mailto:dev@ozone.apache.org) for broader questions.
+- **[GitHub Discussions](https://github.com/apache/ozone/discussions)** and the [#ozone](http://s.apache.org/slack-invite) Slack channel (ASF Slack) are listed in [`CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#who-to-contact).
+
+### Local and wiki testing notes
+
+- Wiki: [Running Ozone smoke tests and unit tests](https://cwiki.apache.org/confluence/display/OZONE/Running+Ozone+Smoke+Tests+and+Unit+Tests).
+
+## Advanced: flaky tests and debugging on a fork
+
+These patterns are for **repeat failures** or **environment-only** bugs. They usually live on a **personal fork**, not in `apache/ozone`.
+
+- Wiki: [GitHub Actions tips and tricks](https://cwiki.apache.org/confluence/display/OZONE/Github+Actions+tips+and+tricks) — running one test many times, extra logging, optional [tmate](https://github.com/tmate-io/tmate)-style access (unsafe on public repos; never with secrets exposed).
+- Prefer current runner images (for example `ubuntu-latest`) when copying older examples.
+
+## Deprecated workflows
+
+Old workflows can still appear on the [Actions](https://github.com/apache/ozone/actions) tab. An outdated workflow also named **build-branch** is tied to [`chaos.yml`](https://github.com/apache/ozone/actions/workflows/chaos.yml), not the current `post-commit.yml` pipeline—compare URLs. Full list: [`.github/ci.md` — Old/Deprecated Workflows](https://github.com/apache/ozone/blob/master/.github/ci.md#olddeprecated-workflows).
+
+## See also
+
+- [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md) in `apache/ozone`
+- [Contributing guide — Check your contribution](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#check-your-contribution)
+- Wiki: [Ozone CI with GitHub Actions](https://cwiki.apache.org/confluence/display/OZONE/Ozone+CI+with+Github+Actions) (older context; prefer the repo for current names)
+- Wiki: [GitHub Actions tips and tricks](https://cwiki.apache.org/confluence/display/OZONE/Github+Actions+tips+and+tricks)

From 7f4519975adcb7a85a408f62c41942fdc9a7acf1 Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Fri, 10 Apr 2026 16:49:06 -0700
Subject: [PATCH 2/9] Fix website build error

---
 docs/08-developer-guide/03-test/04-continuous-integration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index 1f24b71880..a6f183f216 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -75,7 +75,7 @@ Most checks live in [`hadoop-ozone/dev-support/checks`](https://github.com/apach
 | ~10 minutes | `findbugs.sh`, `kubernetes.sh` | SpotBugs, small Kubernetes-related checks |
 | An hour or more | `unit.sh`, `integration.sh`, `acceptance.sh` | Unit tests, mini-cluster tests, Docker Compose acceptance tests |
 
-More on test styles: [Unit tests](./unit-tests), [Integration tests](./integration-tests), [Acceptance tests](./acceptance-tests).
+More on test styles: [Acceptance tests](./acceptance-tests) on this site. For unit and integration testing (and running checks locally), see [Running Ozone smoke tests and unit tests](https://cwiki.apache.org/confluence/display/OZONE/Running+Ozone+Smoke+Tests+and+Unit+Tests) on the wiki until the dedicated **Unit tests** and **Integration tests** pages here are published.
 
 `integration.sh` and `acceptance.sh` can take extra arguments to run a subset; open the scripts to see options. Output usually lands under `target/` (for example `target/docs`).
 

From d16fe6e4351fa5d39d77c597cf359e0235cf3bcb Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Mon, 13 Apr 2026 16:36:59 -0700
Subject: [PATCH 3/9] HDDS-11026. Address CI doc review from adoroszlai.

- Document build vs compile; drop obsolete unit job; clarify integration and native (HDDS-9242, HDDS-12734).
- Replace fail-fast wording with HDDS-6464 behavior and per-job re-run.
- Rephrase dependency check around LICENSE.txt; move build.sh example after table.
- Update flaky/transient re-run guidance; remove deprecated workflow and outdated wiki links.

Made-with: Cursor
---
 .../03-test/04-continuous-integration.md | 46 +++++++------------
 1 file changed, 17 insertions(+), 29 deletions(-)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index a6f183f216..bed6020cf3 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -60,12 +60,6 @@ When you push new commits to an open pull request, **newer runs can cancel older
 
 Running scripts locally catches problems before you push. You need a working dev environment first—see [Build with Maven](../../developer-guide/build/maven) and [Building from source](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#building-from-source) in `CONTRIBUTING.md`.
 
-From the **root of your clone** (the folder that contains `hadoop-ozone/`):
-
-```bash
-./hadoop-ozone/dev-support/checks/build.sh
-```
-
 Most checks live in [`hadoop-ozone/dev-support/checks`](https://github.com/apache/ozone/tree/master/hadoop-ozone/dev-support/checks). The [Check your contribution](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#check-your-contribution) section groups them by rough duration:
 
 | Rough time | Scripts | What they do |
@@ -73,9 +67,15 @@ Most checks live in [`hadoop-ozone/dev-support/checks`](https://github.com/apach
 | Build step | `build.sh` | Compile Ozone |
 | Minutes | `author.sh`, `bats.sh`, `rat.sh`, `docs.sh`, `dependency.sh`, `checkstyle.sh`, `pmd.sh` | Style, license headers, docs, dependency list |
 | ~10 minutes | `findbugs.sh`, `kubernetes.sh` | SpotBugs, small Kubernetes-related checks |
-| An hour or more | `unit.sh`, `integration.sh`, `acceptance.sh` | Unit tests, mini-cluster tests, Docker Compose acceptance tests |
+| An hour or more | `integration.sh`, `acceptance.sh` | JUnit tests (via integration), mini-cluster style tests, Docker Compose acceptance tests |
+
+The command below is **only an example** of running one script from the **root of your clone** (the folder that contains `hadoop-ozone/`):
+
+```bash
+./hadoop-ozone/dev-support/checks/build.sh
+```
 
-More on test styles: [Acceptance tests](./acceptance-tests) on this site. For unit and integration testing (and running checks locally), see [Running Ozone smoke tests and unit tests](https://cwiki.apache.org/confluence/display/OZONE/Running+Ozone+Smoke+Tests+and+Unit+Tests) on the wiki until the dedicated **Unit tests** and **Integration tests** pages here are published.
+More on test styles: [Acceptance tests](./acceptance-tests) on this site.
 
 `integration.sh` and `acceptance.sh` can take extra arguments to run a subset; open the scripts to see options. Output usually lands under `target/` (for example `target/docs`).
 
@@ -93,16 +93,18 @@ So a focused change might show fewer checks than a large refactor. **That is exp
 The list below matches [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md). Treat it as a map when reading logs, not something to memorize.
 
 - **build-info** — Decides which other jobs run (selective CI).
-- **compile** — Builds with Java 8 and 11 via [`build.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/build.sh); later jobs typically use the Java 8 build.
+- **build** — Performs a full build; its output is reused by later jobs.
+- **compile** — Re-builds with various Java versions. It consumes the **source tarball** (release artifact) produced by **build**, not a fresh checkout of the git repository.
 - **basic** — Checks like author tags, BATS, Checkstyle, Hugo for docs, SpotBugs, PMD, RAT—depending on what was selected.
-- **unit** — [`unit.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/unit.sh) and [`native.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/native.sh) for RocksDB-native tests.
-- **dependency** — Compares JARs to [`jar-report.txt`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dist/src/main/license/jar-report.txt).
+- **dependency** — Detects whether dependencies were added or removed, as a reminder to update `LICENSE.txt` (how this is implemented—for example comparisons against `jar-report.txt`—is an internal detail).
 - **acceptance** — [`acceptance.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/acceptance.sh) (Robot Framework + Docker Compose; variants like secure / unsecure / misc).
 - **kubernetes** — [`kubernetes.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/kubernetes.sh).
-- **integration** — [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) (mini-cluster style tests, often sharded in CI).
+- **integration** — [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) runs **all** JUnit tests, regardless of which submodule they live in ([HDDS-9242](https://issues.apache.org/jira/browse/HDDS-9242); often sharded in CI). Older CI had a separate **unit** job; it was removed in favor of this.
 - **coverage** — [`coverage.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/coverage.sh) merges coverage when earlier jobs produced it.
 
-Some jobs use a **matrix** (for example multiple Java versions) with **fail-fast**: if one matrix leg fails, the others in that matrix stop. Unrelated jobs can still run until they finish or fail.
+The RocksDB native library is built and used as part of the normal workflow across jobs ([HDDS-12734](https://issues.apache.org/jira/browse/HDDS-12734)); you do not need a separate “native-only” check to exercise it.
+
+Matrix jobs (for example multiple Java versions) are configured **without fail-fast** ([HDDS-6464](https://issues.apache.org/jira/browse/HDDS-6464)) so that other matrix legs keep running and failed legs can be **re-run individually** in the GitHub UI.
 
 ## Other workflows
 
@@ -123,11 +125,7 @@ A red check does not mean you did something wrong—it means the run found somet
 1. Open the failed **`build-branch`** run → click the red job → read the **log** from the bottom upward for the first error.
 2. If the job uploaded **Artifacts**, download them from the run summary (they expire after a short time).
 3. Try the same **check script** locally if you have the environment set up ([Run checks on your machine](#run-checks-on-your-machine)).
-4. To re-run without new code, use **Re-run all jobs** or **Re-run failed jobs** on the run page, or:
-
-```bash
-git commit --allow-empty -m 'trigger new CI check'
-```
+4. For **transient** failures or **flaky** tests only, re-trigger what failed: **committers** can use GitHub’s **Re-run failed jobs** on the workflow run. **Other contributors** should wait for a committer to do that, or ask on the PR if it does not happen within a reasonable time (which varies with time of day, weekends, holidays, and so on). Avoid empty commits or re-running the entire workflow when only a subset failed. Some artifacts expire quickly (for example within a day); re-run failed jobs before those artifacts disappear.
 
 Failures on the **main** repo’s default branch sometimes leave extra artifacts on [ozone build results](https://elek.github.io/ozone-build-results/) (community mirror).
 
@@ -137,24 +135,14 @@ Failures on the **main** repo’s default branch sometimes leave extra artifacts
 - **Email** [dev@ozone.apache.org](mailto:dev@ozone.apache.org) for broader questions.
 - **[GitHub Discussions](https://github.com/apache/ozone/discussions)** and the [#ozone](http://s.apache.org/slack-invite) Slack channel (ASF Slack) are listed in [`CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#who-to-contact).
 
-### Local and wiki testing notes
-
-- Wiki: [Running Ozone smoke tests and unit tests](https://cwiki.apache.org/confluence/display/OZONE/Running+Ozone+Smoke+Tests+and+Unit+Tests).
-
 ## Advanced: flaky tests and debugging on a fork
 
 These patterns are for **repeat failures** or **environment-only** bugs. They usually live on a **personal fork**, not in `apache/ozone`.
 
-- Wiki: [GitHub Actions tips and tricks](https://cwiki.apache.org/confluence/display/OZONE/Github+Actions+tips+and+tricks) — running one test many times, extra logging, optional [tmate](https://github.com/tmate-io/tmate)-style access (unsafe on public repos; never with secrets exposed).
+- Running a single test or suite in a loop, enabling extra logging, or attaching an interactive debugger session (for example via [tmate](https://github.com/tmate-io/tmate)) can help isolate flakiness—use care on **public** forks and **never** expose secrets.
 - Prefer current runner images (for example `ubuntu-latest`) when copying older examples.
 
-## Deprecated workflows
-
-Old workflows can still appear on the [Actions](https://github.com/apache/ozone/actions) tab. An outdated workflow also named **build-branch** is tied to [`chaos.yml`](https://github.com/apache/ozone/actions/workflows/chaos.yml), not the current `post-commit.yml` pipeline—compare URLs. Full list: [`.github/ci.md` — Old/Deprecated Workflows](https://github.com/apache/ozone/blob/master/.github/ci.md#olddeprecated-workflows).
-
 ## See also
 
 - [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md) in `apache/ozone`
 - [Contributing guide — Check your contribution](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#check-your-contribution)
-- Wiki: [Ozone CI with GitHub Actions](https://cwiki.apache.org/confluence/display/OZONE/Ozone+CI+with+Github+Actions) (older context; prefer the repo for current names)
-- Wiki: [GitHub Actions tips and tricks](https://cwiki.apache.org/confluence/display/OZONE/Github+Actions+tips+and+tricks)

From f30c4b32046e5010ce166b1143ed777313f200e2 Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Mon, 13 Apr 2026 16:51:06 -0700
Subject: [PATCH 4/9] HDDS-11026. More CI doc updates per adoroszlai review.

- Add local reproduction guidance by job type (basic, dependency, integration, acceptance, kubernetes).
- Point flaky debugging to flaky-test-check and repeat-acceptance-test workflows; keep IDE/tmate as optional.
- Replace stale build-results mirror link; mention elek.github.io is unmaintained.
- ci.yml row: include build; simplify artifact/re-run wording (no retention lecture).

Made-with: Cursor
---
 .../03-test/04-continuous-integration.md | 27 ++++++++++++++-----
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index bed6020cf3..1b8f8f3226 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -50,7 +50,7 @@ Two names show up in docs; both mean “the main CI pipeline”:
 | What | Meaning |
 | --- | --- |
 | **`build-branch`** | The **name** of the workflow in the Actions tab. It comes from the `name:` field in [`post-commit.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/post-commit.yml). |
-| **`ci.yml`** | Where most **jobs** (compile, tests, and so on) are defined. `post-commit.yml` calls this file as a [reusable workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows). |
+| **`ci.yml`** | Where most **jobs** (build, compile, tests, and so on) are defined. `post-commit.yml` calls this file as a [reusable workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows). |
 
 So: **`post-commit.yml`** = front door; **`ci.yml`** = where the heavy lifting is described.
 
@@ -77,6 +77,17 @@ The command below is **only an example** of running one script from the **root o
 
 More on test styles: [Acceptance tests](./acceptance-tests) on this site.
 
+### Reproducing failures locally (by check type)
+
+What you need depends on which job failed:
+
+1. **basic** — Safe to run locally without a full prior build; scripts are quick (author tags, BATS, RAT, docs, Checkstyle, PMD, SpotBugs, and similar—whatever `basic` selected for that run).
+2. **dependency / license** — Quick, but expects a **build** already (the dependency check compares against built outputs / the dependency list).
+3. **Checks that reproduce compiler or packaging issues** — Run the same **`build.sh`** (or narrower Maven command) after a normal dev build; align with the log.
+4. **integration** (JUnit) — After a build, narrow work to one test with Maven, for example `-Dtest='YourTestClass'`, or run the same class or method from your IDE. [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) wraps the full suite; open it for flags and defaults.
+5. **acceptance** — Needs a **build** (often a dist build). Prefer re-running only the failing shell driver the log names (for example a line like `ERROR: Test execution of ozone/test-legacy-bucket.sh is FAILED` points at `test-legacy-bucket.sh`) instead of the whole suite.
+6. **kubernetes** — The `kubernetes.sh` check is aimed at **Linux** environments; macOS or Windows may not match CI.
+
 `integration.sh` and `acceptance.sh` can take extra arguments to run a subset; open the scripts to see options. Output usually lands under `target/` (for example `target/docs`).
 
 ## Why did CI skip some jobs?
@@ -123,11 +134,11 @@ A red check does not mean you did something wrong—it means the run found somet
 :::
 
 1. Open the failed **`build-branch`** run → click the red job → read the **log** from the bottom upward for the first error.
-2. If the job uploaded **Artifacts**, download them from the run summary (they expire after a short time).
+2. If the job uploaded **Artifacts**, download them from the run summary while they are still available.
 3. Try the same **check script** locally if you have the environment set up ([Run checks on your machine](#run-checks-on-your-machine)).
-4. For **transient** failures or **flaky** tests only, re-trigger what failed: **committers** can use GitHub’s **Re-run failed jobs** on the workflow run. **Other contributors** should wait for a committer to do that, or ask on the PR if it does not happen within a reasonable time (which varies with time of day, weekends, holidays, and so on). Avoid empty commits or re-running the entire workflow when only a subset failed. Some artifacts expire quickly (for example within a day); re-run failed jobs before those artifacts disappear.
+4. For **transient** failures or **flaky** tests only: **committers** can use GitHub’s **Re-run failed jobs** on the workflow run. **Other contributors** should wait for a committer to do that, or ask on the PR if it does not happen within a reasonable time (which varies with time of day, weekends, holidays, and so on). Avoid empty commits or re-running the entire workflow when only a subset failed.
 
-Failures on the **main** repo’s default branch sometimes leave extra artifacts on [ozone build results](https://elek.github.io/ozone-build-results/) (community mirror).
+A maintained mirror of build results from `apache/ozone` default-branch runs is [adoroszlai/ozone-build-results](https://github.com/adoroszlai/ozone-build-results/) (the older `elek.github.io` archive is no longer updated).
 
 ### Get help
 
@@ -137,10 +148,12 @@ Failures on the **main** repo’s default branch sometimes leave extra artifacts
 
 ## Advanced: flaky tests and debugging on a fork
 
-These patterns are for **repeat failures** or **environment-only** bugs. They usually live on a **personal fork**, not in `apache/ozone`.
+For **repeat failures** or **environment-only** bugs, use the dedicated workflows on **your fork** (enable Actions, then run them manually from the **Actions** tab via **workflow_dispatch**):
+
+- **`flaky-test-check`** — defined in [`intermittent-test-check.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/intermittent-test-check.yml); runs a chosen JUnit class or method many times across parallel splits.
+- **`repeat-acceptance-test`** — defined in [`repeat-acceptance.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/repeat-acceptance.yml); repeats acceptance tests concurrently (suite or filter).
 
-- Running a single test or suite in a loop, enabling extra logging, or attaching an interactive debugger session (for example via [tmate](https://github.com/tmate-io/tmate)) can help isolate flakiness—use care on **public** forks and **never** expose secrets.
-- Prefer current runner images (for example `ubuntu-latest`) when copying older examples.
+Those replace ad-hoc loops from older wiki-style tips. You can still use an IDE, extra logging, or interactive debugging (for example [tmate](https://github.com/tmate-io/tmate)) on a fork if you accept the risk on **public** repos and **never** expose secrets.
 
 ## See also
 

From 2da8b687850992df2eab1d8ba96db29961f52f40 Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Mon, 13 Apr 2026 16:54:36 -0700
Subject: [PATCH 5/9] remove additional details.

---
 docs/08-developer-guide/03-test/04-continuous-integration.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index 1b8f8f3226..4fe7745698 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -138,7 +138,7 @@ A red check does not mean you did something wrong—it means the run found somet
 3. Try the same **check script** locally if you have the environment set up ([Run checks on your machine](#run-checks-on-your-machine)).
 4. For **transient** failures or **flaky** tests only: **committers** can use GitHub’s **Re-run failed jobs** on the workflow run. **Other contributors** should wait for a committer to do that, or ask on the PR if it does not happen within a reasonable time (which varies with time of day, weekends, holidays, and so on). Avoid empty commits or re-running the entire workflow when only a subset failed.
 
-A maintained mirror of build results from `apache/ozone` default-branch runs is [adoroszlai/ozone-build-results](https://github.com/adoroszlai/ozone-build-results/) (the older `elek.github.io` archive is no longer updated).
+A maintained mirror of build results from `apache/ozone` default-branch runs is [adoroszlai/ozone-build-results](https://github.com/adoroszlai/ozone-build-results/).
 
 ### Get help
 
@@ -153,7 +153,7 @@ For **repeat failures** or **environment-only** bugs, use the dedicated workflow
 - **`flaky-test-check`** — defined in [`intermittent-test-check.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/intermittent-test-check.yml); runs a chosen JUnit class or method many times across parallel splits.
 - **`repeat-acceptance-test`** — defined in [`repeat-acceptance.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/repeat-acceptance.yml); repeats acceptance tests concurrently (suite or filter).
 
-Those replace ad-hoc loops from older wiki-style tips. You can still use an IDE, extra logging, or interactive debugging (for example [tmate](https://github.com/tmate-io/tmate)) on a fork if you accept the risk on **public** repos and **never** expose secrets.
+You can still use an IDE, extra logging, or interactive debugging (for example [tmate](https://github.com/tmate-io/tmate)) on a fork if you accept the risk on **public** repos and **never** expose secrets.
 ## See also

From 040f8917a52e3b3c4695f44dee8b153427694b5b Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Mon, 13 Apr 2026 16:57:38 -0700
Subject: [PATCH 6/9] Fix spelling

---
 cspell.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/cspell.yaml b/cspell.yaml
index 62d88fdb4a..7a47361779 100644
--- a/cspell.yaml
+++ b/cspell.yaml
@@ -213,6 +213,7 @@ words:
 - Namenodes
 # Apache Ozone community member names
 - Sumit
+- adoroszlai
 # Company names for "Who Uses Ozone" page
 - Didi
 - Shopee

From ea1cde70f16608f44ca5d1821be5543212b076e3 Mon Sep 17 00:00:00 2001
From: Wei-Chiu Chuang
Date: Wed, 15 Apr 2026 16:26:59 -0700
Subject: [PATCH 7/9] Update docs/08-developer-guide/03-test/04-continuous-integration.md

Co-authored-by: Sarveksha Yeshavantha Raju <79865743+sarvekshayr@users.noreply.github.com>
---
 docs/08-developer-guide/03-test/04-continuous-integration.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index 4fe7745698..13185dbbde 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -23,7 +23,7 @@ Follow these steps once; after that, pushing to your branch is the usual loop.
 3. **Jira** — Create or choose an issue in [HDDS](https://issues.apache.org/jira/projects/HDDS/) (the Ozone Jira project; the name is historical). Need an account? Use the ASF [Jira self-service](https://selfserve.apache.org/jira-account.html?project=ozone) form.
 4. **Branch** — Work on a branch, often named after the issue (for example `HDDS-1234`).
 5. **Push** — When you push, GitHub should show a **`build-branch`** workflow run under the **Actions** tab on your fork. Wait for it to finish and fix any failures you can reproduce.
-6.
**Open the PR** — Use the [pull request template](https://github.com/apache/ozone/blob/master/.github/pull_request_template.md). When the change is ready for review, set the Jira to **Patch Available** so committers know to look. +6. **Open the PR** — Use the [pull request template](https://github.com/apache/ozone/blob/master/.github/pull_request_template.md) to describe your work and raise the PR. Once submitted, change the Jira status to **Patch Available**. The full narrative (reviews, merging, Jira etiquette) is in the [Ozone contributing guide](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#contribute-your-modifications). From 0563d9c43930cdd655ce610a516a2be83128df56 Mon Sep 17 00:00:00 2001 From: Wei-Chiu Chuang Date: Wed, 15 Apr 2026 19:44:14 -0700 Subject: [PATCH 8/9] HDDS-11026. CI doc: address PR review (structure, links, wording) - Move selective-CI section before local checks; fix GitHub Actions settings URL - Link Maven and full-tests label to site/issues; trim duplicate CONTRIBUTING pointers - Clarify local repro (FQCN for -Dtest, acceptance log example, build table times) - Soften emphasis; merge Patch Available + contributor review guidance for step 6 Made-with: Cursor --- .../03-test/04-continuous-integration.md | 118 +++++++++--------- 1 file changed, 59 insertions(+), 59 deletions(-) diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md index 13185dbbde..a59d7e5ae7 100644 --- a/docs/08-developer-guide/03-test/04-continuous-integration.md +++ b/docs/08-developer-guide/03-test/04-continuous-integration.md @@ -4,13 +4,13 @@ sidebar_label: Continuous Integration # Continuous Integration With GitHub Actions -If you are new to the project, **you do not need to understand every job** below on day one. 
The goal of this page is to help you get a green **`build-branch`** run on your fork, know where to look when something fails, and find deeper detail when you need it. +If you are new to the project, you do not need to understand every job below on day one. The goal of this page is to help you get a green `build-branch` run on your fork, know where to look when something fails, and find deeper detail when you need it. Apache Ozone uses [GitHub Actions](https://docs.github.com/en/actions) to build and test every meaningful change. Workflow files live in [`.github/workflows`](https://github.com/apache/ozone/tree/master/.github/workflows) in [`apache/ozone`](https://github.com/apache/ozone). A longer, file-by-file reference lives in [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md). :::info Use the right repository -This page is about **[`apache/ozone`](https://github.com/apache/ozone)** (the Ozone product source code). The documentation site you are reading comes from **[`apache/ozone-site`](https://github.com/apache/ozone-site)** and has its **own** CI. For website-only edits, use the [ozone-site contributing guide](https://github.com/apache/ozone-site/blob/master/CONTRIBUTING.md). +This page is about [`apache/ozone`](https://github.com/apache/ozone) (the Ozone product source code). The documentation site you are reading comes from [`apache/ozone-site`](https://github.com/apache/ozone-site) and has its own CI. For website-only edits, use the [ozone-site contributing guide](https://github.com/apache/ozone-site/blob/master/CONTRIBUTING.md). ::: @@ -18,18 +18,18 @@ This page is about **[`apache/ozone`](https://github.com/apache/ozone)** (the Oz Follow these steps once; after that, pushing to your branch is the usual loop. -1. **Fork and clone** [`apache/ozone`](https://github.com/apache/ozone) to **your** GitHub account, then clone **your fork** locally. You will push branches to `origin` on the fork, then open a PR to `apache/ozone`. -2. 
**Turn on Actions** on the fork so workflows actually run ([how to enable them](#enable-github-actions-on-your-fork)). -3. **Jira** — Create or choose an issue in [HDDS](https://issues.apache.org/jira/projects/HDDS/) (the Ozone Jira project; the name is historical). Need an account? Use the ASF [Jira self-service](https://selfserve.apache.org/jira-account.html?project=ozone) form. -4. **Branch** — Work on a branch, often named after the issue (for example `HDDS-1234`). -5. **Push** — When you push, GitHub should show a **`build-branch`** workflow run under the **Actions** tab on your fork. Wait for it to finish and fix any failures you can reproduce. -6. **Open the PR** — Use the [pull request template](https://github.com/apache/ozone/blob/master/.github/pull_request_template.md) to describe your work and raise the PR. Once submitted, change the Jira status to **Patch Available**. +1. Fork and clone [`apache/ozone`](https://github.com/apache/ozone) to your GitHub account, then clone your fork locally. You will push branches to `origin` on the fork, then open a PR to `apache/ozone`. +2. Turn on Actions on the fork so workflows actually run ([how to enable them](#enable-github-actions-on-your-fork)). +3. Jira — Create or choose an issue in [HDDS](https://issues.apache.org/jira/projects/HDDS/) (the Ozone Jira project; the name is historical). Need an account? Use the ASF [Jira self-service](https://selfserve.apache.org/jira-account.html?project=ozone) form. +4. Branch — Work on a branch, often named after the issue (for example `HDDS-1234`). +5. Push — When you push, GitHub should show a `build-branch` workflow run under the Actions tab on your fork. Wait for it to finish and fix any failures you can reproduce. +6. Open the PR — Use the [pull request template](https://github.com/apache/ozone/blob/master/.github/pull_request_template.md) to describe your work and raise the PR. 
When the change is ready for review, set the Jira to Patch Available so committers know to look. Anyone can review pull requests, not just committers; reviews from other contributors are welcome. The full narrative (reviews, merging, Jira etiquette) is in the [Ozone contributing guide](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#contribute-your-modifications). :::tip You can lean on CI first -Many contributors fix quick issues by reading the failing log on GitHub, then pushing a small follow-up commit. Running every check locally is **optional but helpful** for faster feedback; see [Run checks on your machine](#run-checks-on-your-machine). +Many contributors fix quick issues by reading the failing log on GitHub, then pushing a small follow-up commit. Running every check locally is optional but helpful for faster feedback; see [Run checks on your machine](#run-checks-on-your-machine). ::: @@ -37,11 +37,11 @@ Many contributors fix quick issues by reading the failing log on GitHub, then pu New forks sometimes have workflows off until you allow them. -1. Open **your fork** on GitHub → **Settings** → **Actions** → **General**. -2. Under **Actions permissions**, pick a policy that allows workflows to run (many people use **Allow all actions and reusable workflows** on personal forks). -3. Open the **Actions** tab. If GitHub asks to enable workflows, confirm so **`build-branch`** runs when you push. +1. Open your fork on GitHub → Settings → Actions → General. +2. Under Actions permissions, pick a policy that allows workflows to run (many people use Allow all actions and reusable workflows on personal forks). +3. Open the Actions tab. If GitHub asks to enable workflows, confirm so `build-branch` runs when you push. -More detail: [Enabling or disabling GitHub Actions](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/enabling-or-disabling-github-actions-for-a-repository). 
+More detail: [Managing GitHub Actions settings for a repository](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/enabling-features-for-your-repository/managing-github-actions-settings-for-a-repository). ## What you see in GitHub: `build-branch` @@ -49,73 +49,73 @@ Two names show up in docs; both mean “the main CI pipeline”: | What | Meaning | | --- | --- | -| **`build-branch`** | The **name** of the workflow in the Actions tab. It comes from the `name:` field in [`post-commit.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/post-commit.yml). | -| **`ci.yml`** | Where most **jobs** (build, compile, tests, and so on) are defined. `post-commit.yml` calls this file as a [reusable workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows). | +| `build-branch` | The name of the workflow in the Actions tab. It comes from the `name:` field in [`post-commit.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/post-commit.yml). | +| `ci.yml` | Where most jobs (build, compile, tests, and so on) are defined. `post-commit.yml` calls this file as a [reusable workflow](https://docs.github.com/en/actions/using-workflows/reusing-workflows). | -So: **`post-commit.yml`** = front door; **`ci.yml`** = where the heavy lifting is described. +So: `post-commit.yml` is the front door; `ci.yml` is where the heavy lifting is described. -When you push new commits to an open pull request, **newer runs can cancel older ones** still in progress ([concurrency](https://docs.github.com/en/actions/using-jobs/using-concurrency)). That is normal and saves time. +When you push new commits to an open pull request, newer runs can cancel older ones still in progress ([concurrency](https://docs.github.com/en/actions/using-jobs/using-concurrency)). That is normal and saves time. + +## Why did CI skip some jobs? + +Not every pull request runs every job. 
A step called `build-info` runs [`selective_ci_checks.sh`](https://github.com/apache/ozone/blob/master/dev-support/ci/selective_ci_checks.sh) and only enables jobs that match the files you changed—unless: + +- the run is not from a PR, or +- the PR has the `full tests needed` label ([examples and discussion using that label](https://github.com/apache/ozone/issues?q=label%3A%22full%20tests%20needed%22)). + +A focused change might show fewer checks than a large refactor. That is expected. Reviewers can add `full tests needed` when the full matrix is required. If you think the wrong jobs were skipped, ask on the PR; reviewers are used to that question. ## Run checks on your machine -Running scripts locally catches problems before you push. You need a working dev environment first—see [Build with Maven](../../developer-guide/build/maven) and [Building from source](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#building-from-source) in `CONTRIBUTING.md`. +Running scripts locally catches problems before you push. You need a working dev environment first; see [Building Ozone With Maven](../build/maven). -Most checks live in [`hadoop-ozone/dev-support/checks`](https://github.com/apache/ozone/tree/master/hadoop-ozone/dev-support/checks). The [Check your contribution](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#check-your-contribution) section groups them by rough duration: +Most checks live in [`hadoop-ozone/dev-support/checks`](https://github.com/apache/ozone/tree/master/hadoop-ozone/dev-support/checks). 
The table below groups them by rough duration (similar to the contributing guide, without duplicating its long narrative): | Rough time | Scripts | What they do | | --- | --- | --- | -| Build step | `build.sh` | Compile Ozone | -| Minutes | `author.sh`, `bats.sh`, `rat.sh`, `docs.sh`, `dependency.sh`, `checkstyle.sh`, `pmd.sh` | Style, license headers, docs, dependency list | +| Varies (full compile) | `build.sh` | Compile Ozone | +| A few minutes | `author.sh`, `bats.sh`, `rat.sh`, `docs.sh`, `dependency.sh`, `checkstyle.sh`, `pmd.sh` | Style, license headers, docs, dependency list | | ~10 minutes | `findbugs.sh`, `kubernetes.sh` | SpotBugs, small Kubernetes-related checks | | An hour or more | `integration.sh`, `acceptance.sh` | JUnit tests (via integration), mini-cluster style tests, Docker Compose acceptance tests | -The command below is **only an example** of running one script from the **root of your clone** (the folder that contains `hadoop-ozone/`): +This is an example command to run the `build` check locally from the root of your clone (the directory that contains `hadoop-ozone/`): ```bash ./hadoop-ozone/dev-support/checks/build.sh ``` -More on test styles: [Acceptance tests](./acceptance-tests) on this site. +Other test scripts in `hadoop-ozone/dev-support/checks/` can be run similarly. To run individual acceptance tests, see [Acceptance tests](./acceptance-tests#running-tests-locally-using-docker-compose). ### Reproducing failures locally (by check type) What you need depends on which job failed: -1. **basic** — Safe to run locally without a full prior build; scripts are quick (author tags, BATS, RAT, docs, Checkstyle, PMD, SpotBugs, and similar—whatever `basic` selected for that run). -2. **dependency / license** — Quick, but expects a **build** already (the dependency check compares against built outputs / the dependency list). -3. 
**Checks that reproduce compiler or packaging issues** — Run the same **`build.sh`** (or narrower Maven command) after a normal dev build; align with the log. -4. **integration** (JUnit) — After a build, narrow work to one test with Maven, for example `-Dtest='YourTestClass'`, or run the same class or method from your IDE. [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) wraps the full suite; open it for flags and defaults. -5. **acceptance** — Needs a **build** (often a dist build). Prefer re-running only the failing shell driver the log names (for example a line like `ERROR: Test execution of ozone/test-legacy-bucket.sh is FAILED` points at `test-legacy-bucket.sh`) instead of the whole suite. -6. **kubernetes** — The `kubernetes.sh` check is aimed at **Linux** environments; macOS or Windows may not match CI. +1. `basic` — Safe to run locally without a full prior build; scripts are quick (author tags, BATS, RAT, docs, Checkstyle, PMD, SpotBugs, and similar—whatever `basic` selected for that run). +2. `dependency` / license — Quick, but expects a build already (the dependency check compares against built outputs / the dependency list). +3. Compiler or packaging issues — Re-run the same `build.sh` (or the narrower Maven command) shown in the failing job log until it matches CI. +4. `integration` (JUnit) — After a build, narrow work with Maven’s `-Dtest=...`. Prefer the fully qualified class name (package + class) if a short class name is ambiguous. You can also run the same class or method from your IDE. [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) wraps the full suite; open it for flags and defaults. +5. `acceptance` — Needs a build (often one that produces a distribution layout the scripts expect—see the log and `acceptance.sh` for what it invokes). Prefer re-running only the failing shell driver the log names instead of the whole suite. 
For example, a line like `ERROR: Test execution of ozone/test-legacy-bucket.sh is FAILED` points at `test-legacy-bucket.sh`. +6. `kubernetes` — The `kubernetes.sh` check is aimed at Linux environments; macOS or Windows may not match CI. `integration.sh` and `acceptance.sh` can take extra arguments to run a subset; open the scripts to see options. Output usually lands under `target/` (for example `target/docs`). -## Why did CI skip some jobs? - -Not every pull request runs every job. A step called **build-info** runs [`selective_ci_checks.sh`](https://github.com/apache/ozone/blob/master/dev-support/ci/selective_ci_checks.sh) and only enables jobs that match the files you changed—unless: - -- the run is **not** from a PR, or -- the PR has the **`full tests needed`** [label](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels). - -So a focused change might show fewer checks than a large refactor. **That is expected.** Reviewers can add **`full tests needed`** when the full matrix is required. If you think the wrong jobs were skipped, **ask on the PR**; reviewers are used to that question. - ## What the main CI jobs do (overview) The list below matches [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md). Treat it as a map when reading logs, not something to memorize. -- **build-info** — Decides which other jobs run (selective CI). -- **build** — Performs a full build; its output is reused by later jobs. -- **compile** — Re-builds with various Java versions. It consumes the **source tarball** (release artifact) produced by **build**, not a fresh checkout of the git repository. -- **basic** — Checks like author tags, BATS, Checkstyle, Hugo for docs, SpotBugs, PMD, RAT—depending on what was selected. -- **dependency** — Detects whether dependencies were added or removed, as a reminder to update `LICENSE.txt` (how this is implemented—for example comparisons against `jar-report.txt`—is an internal detail). 
-- **acceptance** — [`acceptance.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/acceptance.sh) (Robot Framework + Docker Compose; variants like secure / unsecure / misc). -- **kubernetes** — [`kubernetes.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/kubernetes.sh). -- **integration** — [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) runs **all** JUnit tests, regardless of which submodule they live in ([HDDS-9242](https://issues.apache.org/jira/browse/HDDS-9242); often sharded in CI). Older CI had a separate **unit** job; it was removed in favor of this. -- **coverage** — [`coverage.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/coverage.sh) merges coverage when earlier jobs produced it. +- `build-info` — Decides which other jobs run (selective CI). +- `build` — Performs a full build; its output is reused by later jobs. +- `compile` — Re-builds with various Java versions. It consumes the source tarball (release artifact) produced by `build`, not a fresh checkout of the git repository. +- `basic` — Checks like author tags, BATS, Checkstyle, docs, SpotBugs, PMD, RAT—depending on what was selected. +- `dependency` — Detects whether dependencies were added or removed, as a reminder to update `LICENSE.txt`. +- `acceptance` — [`acceptance.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/acceptance.sh) (Robot Framework + Docker Compose; variants like secure / unsecure / misc). +- `kubernetes` — [`kubernetes.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/kubernetes.sh). +- `integration` — [`integration.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/integration.sh) runs all JUnit tests, regardless of which submodule they live in ([HDDS-9242](https://issues.apache.org/jira/browse/HDDS-9242); often sharded in CI). 
Older CI had a separate `unit` job; it was removed in favor of this. +- `coverage` — [`coverage.sh`](https://github.com/apache/ozone/blob/master/hadoop-ozone/dev-support/checks/coverage.sh) merges code coverage results from earlier jobs. The RocksDB native library is built and used as part of the normal workflow across jobs ([HDDS-12734](https://issues.apache.org/jira/browse/HDDS-12734)); you do not need a separate “native-only” check to exercise it. -Matrix jobs (for example multiple Java versions) are configured **without fail-fast** ([HDDS-6464](https://issues.apache.org/jira/browse/HDDS-6464)) so that other matrix legs keep running and failed legs can be **re-run individually** in the GitHub UI. +Matrix jobs (for example multiple Java versions) are configured without fail-fast ([HDDS-6464](https://issues.apache.org/jira/browse/HDDS-6464)) so that other matrix legs keep running and failed legs can be re-run individually in the GitHub UI. ## Other workflows @@ -133,27 +133,27 @@ A red check does not mean you did something wrong—it means the run found somet ::: -1. Open the failed **`build-branch`** run → click the red job → read the **log** from the bottom upward for the first error. -2. If the job uploaded **Artifacts**, download them from the run summary while they are still available. -3. Try the same **check script** locally if you have the environment set up ([Run checks on your machine](#run-checks-on-your-machine)). -4. For **transient** failures or **flaky** tests only: **committers** can use GitHub’s **Re-run failed jobs** on the workflow run. **Other contributors** should wait for a committer to do that, or ask on the PR if it does not happen within a reasonable time (which varies with time of day, weekends, holidays, and so on). Avoid empty commits or re-running the entire workflow when only a subset failed. +1. Open the failed `build-branch` run → click the red job → read the log from the bottom upward for the first error. +2. 
If the job uploaded artifacts, download them from the run summary while they are still available. +3. Try the same check script locally if you have the environment set up ([Run checks on your machine](#run-checks-on-your-machine)). +4. For transient failures or flaky tests only: committers can use GitHub’s Re-run failed jobs on the workflow run. Other contributors should wait for a committer to do that, or ask on the PR if it does not happen within a reasonable time (which varies with time of day, weekends, holidays, and so on). Avoid empty commits or re-running the entire workflow when only a subset failed. A maintained mirror of build results from `apache/ozone` default-branch runs is [adoroszlai/ozone-build-results](https://github.com/adoroszlai/ozone-build-results/). ### Get help -- Ask on your **pull request**—reviewers can interpret unfamiliar failures quickly. -- **Email** [dev@ozone.apache.org](mailto:dev@ozone.apache.org) for broader questions. -- **[GitHub Discussions](https://github.com/apache/ozone/discussions)** and the [#ozone](http://s.apache.org/slack-invite) Slack channel (ASF Slack) are listed in [`CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#who-to-contact). +- Ask on your pull request—reviewers can interpret unfamiliar failures quickly. +- Email [dev@ozone.apache.org](mailto:dev@ozone.apache.org) for broader questions. +- [GitHub Discussions](https://github.com/apache/ozone/discussions) and the [#ozone](http://s.apache.org/slack-invite) Slack channel (ASF Slack) are listed in [`CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md#who-to-contact). 
## Advanced: flaky tests and debugging on a fork -For **repeat failures** or **environment-only** bugs, use the dedicated workflows on **your fork** (enable Actions, then run them manually from the **Actions** tab via **workflow_dispatch**): +For repeat failures or environment-only bugs, use the dedicated workflows on your fork (enable Actions, then run them manually from the Actions tab via workflow_dispatch): -- **`flaky-test-check`** — defined in [`intermittent-test-check.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/intermittent-test-check.yml); runs a chosen JUnit class or method many times across parallel splits. -- **`repeat-acceptance-test`** — defined in [`repeat-acceptance.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/repeat-acceptance.yml); repeats acceptance tests concurrently (suite or filter). +- `flaky-test-check` — defined in [`intermittent-test-check.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/intermittent-test-check.yml); runs a chosen JUnit class or method many times across parallel splits. +- `repeat-acceptance-test` — defined in [`repeat-acceptance.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/repeat-acceptance.yml); repeats acceptance tests concurrently (suite or filter). -You can still use an IDE, extra logging, or interactive debugging (for example [tmate](https://github.com/tmate-io/tmate)) on a fork if you accept the risk on **public** repos and **never** expose secrets. +You can still use an IDE, extra logging, or interactive debugging (for example [tmate](https://github.com/tmate-io/tmate)) on a fork if you accept the risk on public repos and never expose secrets. ## See also From 9f655dad0ae30a5c2981fc9a80180d0222a454c2 Mon Sep 17 00:00:00 2001 From: Wei-Chiu Chuang Date: Thu, 7 May 2026 12:22:15 -0700 Subject: [PATCH 9/9] HDDS-11026. 
Document ci-with-ratis workflow in CI guide

Describe the manual workflow_dispatch job that runs full CI against a chosen Apache Ratis branch for validating Ozone after Ratis-side fixes.

Co-authored-by: Cursor
---
 docs/08-developer-guide/03-test/04-continuous-integration.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/08-developer-guide/03-test/04-continuous-integration.md b/docs/08-developer-guide/03-test/04-continuous-integration.md
index a59d7e5ae7..9a021a3ab4 100644
--- a/docs/08-developer-guide/03-test/04-continuous-integration.md
+++ b/docs/08-developer-guide/03-test/04-continuous-integration.md
@@ -152,6 +152,7 @@ For repeat failures or environment-only bugs, use the dedicated workflows on you
 - `flaky-test-check` — defined in [`intermittent-test-check.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/intermittent-test-check.yml); runs a chosen JUnit class or method many times across parallel splits.
 - `repeat-acceptance-test` — defined in [`repeat-acceptance.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/repeat-acceptance.yml); repeats acceptance tests concurrently (suite or filter).
+- `ci-with-ratis` — defined in [`ci-with-ratis.yml`](https://github.com/apache/ozone/blob/master/.github/workflows/ci-with-ratis.yml); runs full CI while building against a selected Apache Ratis branch. Because Ozone is tightly integrated with Ratis, some failures trace to the Ratis layer; this workflow is meant to validate Ozone quickly after Ratis-side fixes.
 
 You can still use an IDE, extra logging, or interactive debugging (for example [tmate](https://github.com/tmate-io/tmate)) on a fork if you accept the risk on public repos and never expose secrets.
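
The three manually triggered workflows listed above can also be started from a terminal instead of the Actions tab. The sketch below is not part of the patch; it assumes the GitHub CLI (`gh`) is installed and authenticated, and the fork name `your-user/ozone` and branch `HDDS-1234` are placeholders. It only prints the `gh workflow run` commands, so nothing is dispatched until you run a printed command yourself.

```bash
#!/usr/bin/env bash
# Sketch: compose `gh workflow run` commands for the manually triggered
# workflows on a fork. The fork and branch names below are placeholders.
set -euo pipefail

# Print the command that would dispatch a workflow file on a fork at a ref.
# Arguments: <owner/repo of the fork> <workflow file name> <branch or tag>
dispatch_cmd() {
  local fork="$1" workflow_file="$2" ref="$3"
  printf 'gh workflow run %s --repo %s --ref %s\n' "$workflow_file" "$fork" "$ref"
}

# Hypothetical fork and branch; replace them with your own values.
dispatch_cmd "your-user/ozone" "intermittent-test-check.yml" "HDDS-1234"
dispatch_cmd "your-user/ozone" "repeat-acceptance.yml" "HDDS-1234"
dispatch_cmd "your-user/ozone" "ci-with-ratis.yml" "HDDS-1234"
```

Each workflow defines its own inputs under `workflow_dispatch` in its YAML file, so check the file before dispatching. Inputs can be passed with `-f name=value`, or you can run the printed command in an interactive terminal and let `gh` prompt for them.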