From c751dfab0c8bf702c1f0245d62c6b02e1e1fbb3b Mon Sep 17 00:00:00 2001 From: Ethan Rose Date: Mon, 15 Jun 2026 19:38:24 -0400 Subject: [PATCH 1/3] Rework git section, import branch merge checklist --- .../04-project/01-branches-and-tags.md | 74 ------ .../04-project/01-git/01-overview.md | 28 +++ .../04-project/01-git/02-release-branches.md | 30 +++ .../01-git/03-feature-branches/01-overview.md | 42 ++++ .../03-feature-branches/02-merge-checklist.md | 97 ++++++++ .../01-hdds-2939-filesystem-optimizations.md | 109 +++++++++ .../02-hdds-3630-rocksdb-datanode.md | 78 ++++++ .../03-hdds-3698-non-rolling-upgrade.md | 227 ++++++++++++++++++ .../04-hdds-3816-erasure-coding.md | 91 +++++++ .../05-hdds-4440-s3g-grpc-connections.md | 109 +++++++++ .../06-hdds-4454-streaming-write-pipeline.md | 79 ++++++ .../07-hdds-4944-s3-multi-tenancy.md | 121 ++++++++++ .../03-merged-branches/08-hdds-4948-scm-ha.md | 117 +++++++++ .../03-merged-branches/09-hdds-5447-httpfs.md | 74 ++++++ .../10-hdds-5713-disk-balancer.md | 80 ++++++ .../11-hdds-6517-snapshots.md | 59 +++++ .../12-hdds-7593-hsync-lease-recovery.md | 68 ++++++ .../13-hdds-7733-symmetric-key-tokens.md | 58 +++++ .../14-hdds-10239-container-reconciliation.md | 84 +++++++ .../15-hdds-10656-atomic-key-overwrite.md | 54 +++++ .../03-merged-branches/README.mdx | 15 ++ .../01-git/03-feature-branches/README.mdx | 7 + .../04-project/01-git/README.mdx | 12 + 23 files changed, 1639 insertions(+), 74 deletions(-) delete mode 100644 docs/08-developer-guide/04-project/01-branches-and-tags.md create mode 100644 docs/08-developer-guide/04-project/01-git/01-overview.md create mode 100644 docs/08-developer-guide/04-project/01-git/02-release-branches.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/02-merge-checklist.md create mode 100644 
docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/02-hdds-3630-rocksdb-datanode.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/08-hdds-4948-scm-ha.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/14-hdds-10239-container-reconciliation.md create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md create 
mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/README.mdx create mode 100644 docs/08-developer-guide/04-project/01-git/03-feature-branches/README.mdx create mode 100644 docs/08-developer-guide/04-project/01-git/README.mdx diff --git a/docs/08-developer-guide/04-project/01-branches-and-tags.md b/docs/08-developer-guide/04-project/01-branches-and-tags.md deleted file mode 100644 index 24421b8b2b..0000000000 --- a/docs/08-developer-guide/04-project/01-branches-and-tags.md +++ /dev/null @@ -1,74 +0,0 @@ ---- -sidebar_label: Branches and Tags ---- - -# Git Branches and Tags - -The [Apache Ozone](https://github.com/apache/ozone) codebase on GitHub uses a small number of **long-lived branch patterns**, many **short-lived Jira-named branches**, and **signed tags** for releases. This page summarizes how they fit together for contributors and release managers. - -:::info Product repo versus this website - -Branch and tag names below refer to **[`apache/ozone`](https://github.com/apache/ozone)**. The documentation site you are reading is **[`apache/ozone-site`](https://github.com/apache/ozone-site)** and follows its own branching (for example `master` and `asf-site` for publishing). - -::: - -## The `master` branch - -Day-to-day development targets **`master`** in [`apache/ozone`](https://github.com/apache/ozone). Pull requests from forks are usually opened against `master`, and GitHub Actions CI ([`build-branch`](https://github.com/apache/ozone/blob/master/.github/workflows/post-commit.yml)) validates proposed changes. - -The community tries to keep **`master` in a releasable state**: it should build, tests should be in good shape, and risky work should not land without review and appropriate checks. That expectation is why large or disruptive efforts are often done on **feature branches** (see below) and merged only after broader validation. 
- -For routine contribution steps (fork, branch, PR, Jira), see [Ozone `CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md). - -## Release branches - -When the project prepares a **minor** (or major) release, maintainers cut a **release branch** from an agreed point on `master`. The naming pattern is: - -`ozone-.` - -Examples: `ozone-2.0`, `ozone-2.1`. **Patch** releases for that line (for example 2.0.1) are typically produced from the same branch. Details—`proto.lock` updates, bumping the SNAPSHOT on `master`, tagging—are in the [Apache release manager guide](./release-guide). - -Release branches are **not** where general feature development happens; fixes that belong in the release are cherry-picked or landed on the release branch as described in that guide. - -## Tags - -Ozone uses **annotated Git tags** on [`apache/ozone`](https://github.com/apache/ozone) to mark release candidates and final releases. You will see names such as: - -- **`ozone-2.1.0`**, **`ozone-2.0.0`** — final release tags -- **`ozone-2.1.0-RC0`**, **`ozone-2.1.0-RC1`**, … — release candidate tags - -Tags appear on the [Tags](https://github.com/apache/ozone/tags) page and drive [GitHub Releases](https://github.com/apache/ozone/releases). Creating and pushing the final tag is part of the release process in the [release manager guide](./release-guide). - -## Feature branches - -### When to use a feature branch - -Feature branches are used for **larger or longer-running work** that would be hard to land incrementally on `master` without destabilizing it—subsystems that need many coordinated changes, long QA cycles, or broad community testing before merge. - -Smaller changes should continue to use **topic branches on a fork** and normal pull requests to `master` (see [`CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md)). 
- -In the upstream repo, feature lines often show up as branches named after the Jira, for example `HDDS-3816-ec` or `HDDS-10611-mpu` (see [Branches](https://github.com/apache/ozone/branches)). - -### Merge process and community vote - -Merging a shared feature branch into `master` is a **formal step**: the community typically discusses and **votes on the dev mailing list** so people can test the line and agree it is ready. Expectations are described on the wiki page **[Merging branches](https://cwiki.apache.org/confluence/display/OZONE/Merging+branches)**. - -Important points from that process (see the wiki for the full text): - -- **Do not use GitHub “Squash and merge” or rebase** to land the feature branch onto `master`. Use a **regular `git merge`** so history, Jira links, and PR discussions stay aligned with the original commit IDs. -- Maintainers run **full CI multiple times** on the revision proposed for merge and investigate failures. -- **Documentation** and **design** updates should live in the versioned tree under `hadoop-hdds/docs` (not only on the wiki), where applicable. -- The wiki also calls out areas such as S3 behavior, Docker Compose / acceptance tests, Kubernetes examples, SonarCloud, upgrade compatibility, licensing, and performance—so reviewers know what to validate. - -### Checklist (wiki and PR, not duplicated here) - -The **[Merging branches](https://cwiki.apache.org/confluence/display/OZONE/Merging+branches)** page includes a detailed **checklist** and a **copy-paste template**. That list is maintained on Confluence so it can evolve with the project. - -**You do not need to mirror the full checklist on this website.** For a given merge, attach or link the completed checklist in the **GitHub merge pull request** and in the **mailing-list thread** (for example `dev@ozone.apache.org`) so reviewers and voters have one place to read it. 
- -## See also - -- [Apache release manager guide](./release-guide) — release branches, RC tags, and publishing -- [Merging branches](https://cwiki.apache.org/confluence/display/OZONE/Merging+branches) — feature-branch merge expectations and checklist -- [Ozone `CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md) — everyday PR workflow to `master` -- [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md) — what CI runs on branches and PRs diff --git a/docs/08-developer-guide/04-project/01-git/01-overview.md b/docs/08-developer-guide/04-project/01-git/01-overview.md new file mode 100644 index 0000000000..a0b7570db5 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/01-overview.md @@ -0,0 +1,28 @@ +--- +sidebar_label: Overview +--- + +# Git Branches and Tags + +The [Apache Ozone](https://github.com/apache/ozone) codebase on GitHub uses a small number of **long-lived branch patterns**, many **short-lived Jira-named branches**, and **signed tags** for releases. This section covers how they fit together for contributors and release managers. + +:::info Product repo versus this website + +Branch and tag names below refer to **[`apache/ozone`](https://github.com/apache/ozone)**. The documentation site you are reading is **[`apache/ozone-site`](https://github.com/apache/ozone-site)** and follows its own branching (for example `master` and `asf-site` for publishing). + +::: + +## The `master` branch + +Day-to-day development targets **`master`** in [`apache/ozone`](https://github.com/apache/ozone). Pull requests from forks are usually opened against `master`, and GitHub Actions CI ([`build-branch`](https://github.com/apache/ozone/blob/master/.github/workflows/post-commit.yml)) validates proposed changes. + +The community tries to keep **`master` in a releasable state**: it should build, tests should be in good shape, and risky work should not land without review and appropriate checks. 
That expectation is why large or disruptive efforts are often done on **feature branches** and merged only after broader validation. + +For routine contribution steps (fork, branch, PR, Jira), see [Ozone `CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md). + +## See also + +- [Ozone `CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md) — everyday PR workflow to `master` +- [`.github/ci.md`](https://github.com/apache/ozone/blob/master/.github/ci.md) — what CI runs on branches and PRs +- [Release Manager Guide](../release-guide) — step-by-step release process + diff --git a/docs/08-developer-guide/04-project/01-git/02-release-branches.md b/docs/08-developer-guide/04-project/01-git/02-release-branches.md new file mode 100644 index 0000000000..36d20e7ade --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/02-release-branches.md @@ -0,0 +1,30 @@ +--- +sidebar_label: Release Branches +--- + +# Release Branches and Tags + +## Release branches + +When the project prepares a **minor** (or major) release, maintainers cut a **release branch** from an agreed point on `master`. The naming pattern is: + +`ozone-.` + +Examples: `ozone-2.0`, `ozone-2.1`. **Patch** releases for that line (for example 2.0.1) are typically produced from the same branch. Details — `proto.lock` updates, bumping the SNAPSHOT on `master`, tagging — are in the [release manager guide](../release-guide). + +Release branches are **not** where general feature development happens; fixes that belong in the release are cherry-picked or landed on the release branch as described in that guide. + +## Tags + +Ozone uses **annotated Git tags** on [`apache/ozone`](https://github.com/apache/ozone) to mark release candidates and final releases. 
You will see names such as: + +- **`ozone-2.1.0`**, **`ozone-2.0.0`** — final release tags +- **`ozone-2.1.0-RC0`**, **`ozone-2.1.0-RC1`**, … — release candidate tags + +Tags appear on the [Tags](https://github.com/apache/ozone/tags) page and drive [GitHub Releases](https://github.com/apache/ozone/releases). Creating and pushing the final tag is part of the release process in the [release manager guide](../release-guide). + +## See also + +- [Release Manager Guide](../release-guide) — step-by-step release process, RC tags, and publishing +- [Feature branches](./feature-branches) — feature branch lifecycle and merge process +- [Ozone `CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md) — everyday PR workflow to `master` diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md new file mode 100644 index 0000000000..a2c33b5ca4 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md @@ -0,0 +1,42 @@ +--- +sidebar_label: Overview +--- + +# Feature Branches + +Feature branches are used for larger or longer-running work that would be hard to land incrementally on `master` without destabilizing it. Feature branches are often named after the Jira epic tracking the work and an abbreviated feature name, for example `HDDS-3816-ec` or `HDDS-10611-mpu`. You can see all the Ozone repo's branches [here](https://github.com/apache/ozone/branches). Most incremental changes do not require feature branches and can go directly to `master` as a pull request as documented in [`CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md). + +## When to use a feature branch + +Use a feature branch for changes that: +- Make iterative changes to core code paths. +- Require broader community testing. +- Cannot be easily gated with a feature flag. 
+ - This covers changes that migrate existing code paths instead of adding completely new ones. +- Would have issues if a release was cut in the middle of their development. + - A release branch can be cut from `master` at any time and feature development should not block this. + - If a feature has upgrade compatibility concerns that will not be addressed right away, it should be developed on a feature branch. + - Note that protobuf messages and wire protocols become locked into compatibility guarantees once they are released. + - If a feature is making changes in this area and it wants to keep the structure flexible while it is being finalized, it should be done on a feature branch. + + +## Merge Process + +Complete the following steps when a feature branch is ready to be merged into `master`: + +1. Complete the [branch merge checklist](./merge-checklist) for your feature branch and raise it as a pull request to the [ozone-site](https://github.com/apache/ozone-site) repo's `master` branch. + + - Feature branch merge checklists will be committed [to the website](./merged-branches) once the branch merge is approved. + +2. Start a mail thread on the `dev@ozone.apache.org` mailing list for committers and PMC members to vote on the branch merge. Include a link to the branch merge checklist PR. + +3. If the vote passes, changes from the feature branch will be incorporated into `master`, and development can continue on the `master` branch. + +**Do not use GitHub "Squash and merge" or rebase** to land the feature branch onto `master`. Use a **regular `git merge`** so history, Jira links, and PR discussions stay aligned with the original commit IDs. 
+ +## See also + +- [Release branches](../release-branches) — release branch lifecycle and tags +- [Release Manager Guide](../../release-guide) — RC tags and publishing +- [Ozone `CONTRIBUTING.md`](https://github.com/apache/ozone/blob/master/CONTRIBUTING.md) — everyday PR workflow to `master` + diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/02-merge-checklist.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/02-merge-checklist.md new file mode 100644 index 0000000000..389410544f --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/02-merge-checklist.md @@ -0,0 +1,97 @@ +# Feature Branch Merge Checklist + +This section collects generic questions which can be checked for each feature branch. Some of them are obvious for some branches (for example: the decommissioning feature didn't change the s3 interface) but it's good to go through them for each of the merges. Answering these questions will also help the community to test the branch. If you have any new idea about what should be checked, submit a pull request to the [ozone-site](https://github.com/apache/ozone-site) repo to update this page. + +## Summary + +Include a brief summary of the feature, and fill in the following information: + +- Branch name: `` +- Jira: `` +- Commit: `` + +## Stable Builds + +To keep the master branch stable, it's recommended to run the full CI build 3-5 times on the latest commit proposed to be merged. Any failure should be analyzed and checked against previous failures on master to see if it is related to the branch. + +All CI runs for the feature branch can be checked at `github.com/apache/ozone/actions?query=`. + +## User Documentation + +Adding user documentation before the merge will help the community test and evaluate the feature. User documentation should be added to this website with a pull request to the [ozone-site](https://github.com/apache/ozone-site) repo. 
Note that the documentation will be listed under the `next` version section until it is included in a release. + +## Developer Documentation + +If the feature required a design document, ensure its PR is approved and it has been committed under the main Ozone repo's `hadoop-hdds/docs/content/design/` directory so it is preserved in the source history. + +## S3 Compatibility + +Explain how any S3 API behavior changes with the addition of this feature. + +## Docker Compose and Acceptance Tests + +Verify that Docker Compose clusters still start correctly. Add or update Robot Framework acceptance tests in `hadoop-ozone/dist/src/main/compose` to cover the new feature's behavior and allow other developers to test it in a local environment. + +## Containers and Kubernetes + +If the feature affects cluster topology, configuration, or runtime behavior, update the Kubernetes example manifests under `hadoop-ozone/dist/src/main/k8s`. + +## Code Coverage and Quality + +Review the [SonarCloud](https://sonarcloud.io/project/overview?id=apache_ozone) metrics for the branch and compare them against `master`. Regressions in coverage, duplications, or code smells should be addressed before merge. + +## Build Time + +The ASF uses a shared pool of GitHub Actions CI minutes for all projects, so we should avoid significantly increasing the build time unless it is strictly necessary. Compare the end-to-end CI build duration of the feature branch against `master`. Flag significant regressions so the community can decide whether they are acceptable. + +All GitHub Actions runs for a branch can be found at `github.com/apache/ozone/actions?query=`. + +## Incompatible Changes + +Ozone currently supports non-rolling upgrades and downgrades even when backwards incompatible features are present. Backwards incompatible features should be added to the versioning framework so that they are not used until the Ozone upgrade is finalized, after which downgrading is not possible. 
+ +Client cross compatibility should also be maintained as much as possible, with sensible error messages provided when this is not possible. An old client should be able to talk to the new Ozone instance, and a new client should be able to talk to the old Ozone instance. + +## Third Party Dependencies or License Changes + +Diff the distribution tar files to identify any new or updated third-party libraries. For each new library, update `LICENSE` and `NOTICE` in the appropriate module so the release artifacts remain license-compliant. + +The easiest way to check this is by building Ozone from the feature branch and from master, then comparing the files between them: + +```shell +git checkout origin/master +mvn clean install -DskipTests +# List the files in a subshell so the working directory stays at the repo root +(cd hadoop-ozone/dist/target/ozone-*/ && find . -type f | sort > /tmp/master) + +git checkout "origin/$BRANCH_NAME" +mvn clean install -DskipTests +# Same listing for the feature branch build +(cd hadoop-ozone/dist/target/ozone-*/ && find . -type f | sort > "/tmp/$BRANCH_NAME") + +git diff --no-index "/tmp/$BRANCH_NAME" /tmp/master +``` + +## Performance + +Share `ozone freon` benchmark results that demonstrate the performance impact of the feature. Include baseline numbers from `master` so the community can evaluate the delta. Feature branches should not introduce performance regressions, but in some cases they may improve performance. + +## Security Considerations + +Document any new attack surfaces, privilege changes, or cryptography decisions. Pay particular attention to input validation for features that add network-accessible endpoints or handle user-supplied data. + +## Configuration Changes + +Document any new configuration keys added or existing configurations whose defaults were changed by this feature and what their impact will be. + +## Dangling TODO Comments + +`TODO` comments are often left in the code when working on a feature branch for ideas that need to be worked out later or were split into smaller PRs. 
If there are new `TODO` comments introduced by the branch, ensure that the comment also contains the ID of an open Jira which will resolve it and document this existing gap. For example: + +```java +// TODO HDDS-1234: Finalize the return type of this method. +``` + +There should be no new `TODO`s that do not contain references to open Jiras. + +To see all TODOs unique to your branch from `master`, check out your feature branch and run `git diff master... | grep -i todo`. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md new file mode 100644 index 0000000000..d0b43b350d --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md @@ -0,0 +1,109 @@ +# HDDS-2939: FileSystem Optimizations + +Presently, a rename/delete operation can become prohibitively expensive for directories that have large sub-trees/sub-paths. Ozone renames/deletes each and every sub-file and sub-dir under the given directory via multiple RPC calls to OM, which makes it very expensive. Also, rename and delete do not guarantee atomicity. + +The prefix based FileSystem optimization allows performing rename and delete of any directory atomically in deterministic/constant time. With it, Ozone performs rename/delete operations in a single RPC call by sending only the given directory to OM, finishing them with O(1) complexity. It also makes it possible to support atomic rename/delete of any directory at any level in the namespace. + +## Git Branch + +This is implemented in a separate feature branch, [HDDS-2939](https://github.com/apache/ozone/tree/HDDS-2939). Thanks to all contributors/reviewers. 
+ +## How to enable prefix based optimization feature + +The following configurations need to be set in 'ozone-default.xml' to enable this feature. By default, the feature will be turned OFF. + +An example of ozone-site.xml + +```xml +<configuration> + <property> + <name>ozone.om.enable.filesystem.paths</name> + <value>true</value> + </property> + + <property> + <name>ozone.om.metadata.layout</name> + <value>prefix</value> + </property> +</configuration> +``` + +## Related documents + +- [Performance Test Report](https://issues.apache.org/jira/secure/attachment/13023395/Performance%20Comparison%20Between%20%20Master%20and%20HDDS-2939%20branch-Report-001.pdf) + +## 1. builds/intermittent test failures + +There are no intermittent failures specific to the HDDS-2939 branch as of now. During development, it was ensured that all the CI checks were clean prior to every commit merge. The plan is to run repeated CI checks on the merge commit to master. + +## 2. documentation + +The feature is described on the Apache Ozone page via HDDS-5067. + +- *[hadoop-hdds/docs/content/feature/PrefixFSO.md](https://github.com/apache/ozone/blob/HDDS-2939/hadoop-hdds/docs/content/feature/PrefixFSO.md)* has the feature details and related configurations. + +## 3. design, attached the docs + +The following design docs are linked from the documentation present in the HDDS-2939 Jira + +- [Design Document](https://issues.apache.org/jira/secure/attachment/12991926/Ozone%20FS%20Namespace%20Proposal%20v1.0.docx) +- [Internal design - metadata format](https://issues.apache.org/jira/secure/attachment/13023399/OzoneFS%20Optimizations_DesignOverview_%20HDDS-2939.pdf) +- *[hadoop-hdds/docs/content/feature/PrefixFSO.md](https://github.com/apache/ozone/blob/HDDS-2939/hadoop-hdds/docs/content/feature/PrefixFSO.md)* has the feature details and related configurations. + +## 4. S3 compatibility + +There are no incompatibilities with respect to S3. This feature can be enabled only together with *ozone.om.enable.filesystem.paths*. When file system-style path handling is enabled, 100% S3 compatibility cannot be guaranteed. 
FS compatible S3 key names are expected to work well, but non-FS-compatible key names (like 'a/../b1' or a real file with the name \`key1/\`) might be normalized or rejected by the implementation of *ozone.om.enable.filesystem.paths*. + +Note: Added an S3 acceptance test with the feature turned on - PREFIX layout. + +## 5. Docker-compose / acceptance tests + +The \`compose/ozone\` cluster is modified to run \`ozonefs/ozonefs.robot\` with and without the new feature turned on. (Both ofs and o3fs, and both linked and unlinked buckets, are tested.) + +## 6. support of containers / Kubernetes + +NA. Deployment model for OzoneManager remains as earlier. + +Example files are committed with HDDS-5018 + +## 7. coverage/code quality + +**[Sonar master branch](https://sonarcloud.io/dashboard?branch=master&id=hadoop-ozone) +[Sonar HDDS-2939 branch](https://sonarcloud.io/dashboard?branch=HDDS-2939&id=hadoop-ozone).** + +The branch has better coverage than master (73.5% vs 742.2%) but two new Sonar bugs are introduced (169 vs 171) + +## 8. build time + +There is no significant difference in local build time. + +[**Recent master build**](https://github.com/apache/ozone/pull/2132/checks) + +--- + +[**Recent HDDS-2939 branch build**](https://github.com/apache/ozone/pull/2151/checks) + +--- + +- **test time of acceptance unsecure is increased by ~3 minutes** +- **integration test is increased by ~4 mins** + +## 9. possible incompatible changes/used feature flag + +For using this feature, "ozone.om.metadata.layout" config needs to be set to be true in *ozone-site.xml* + +The new metadata layout is supported only in a fresh cluster and the layout detail is stored per bucket. Presently, old and new metadata layout buckets can't co-exist in the same cluster. A user can't start OM in the new layout (prefix) if there are existing old layout buckets (simple) and vice-versa. 
Work is in progress to make the existing old buckets available in the new layout; this will be supported in the next development phase. + +## 10. third party dependencies/licence changes + +No new dependencies are added. + +## 11. performance + +Testing was done to evaluate the performance of delete and rename operations in the feature branch vs the master code base. Charts of the directory delete and rename execution times show that the feature branch has a very significant performance gain compared to master. + +Ran the freon *'dtsg' dfs tree generator benchmark test* in a single node cluster. V0 represents master code (simple) and V1 represents the feature branch (prefix). Please refer to the [Jira document](https://issues.apache.org/jira/secure/attachment/13023395/Performance+Comparison+Between++Master+and+HDDS-2939+branch-Report-001.pdf) for more details. + +## 12. security considerations + +Everything works as earlier and there are no security implications because of the feature. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/02-hdds-3630-rocksdb-datanode.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/02-hdds-3630-rocksdb-datanode.md new file mode 100644 index 0000000000..f459c17d15 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/02-hdds-3630-rocksdb-datanode.md @@ -0,0 +1,78 @@ +# HDDS-3630: RocksDB in Datanode + +Git branch: https://github.com/apache/ozone/tree/HDDS-3630 + +Currently there is one RocksDB instance for each Container on a Datanode, which leads to hundreds of thousands of RocksDB instances on one Datanode. It's very challenging to manage this number of RocksDB instances in one JVM. Please refer to the "problem statement" section of the design document\[1\] for challenge details. Unlike the current approach, the Datanode RocksDB merge feature will use only one RocksDB for each data volume. 
With far fewer RocksDB instances to manage, the write path performance and DN stability are improved. Refer to the Micro Benchmark Data section of the design document. + +To enable the feature, the following configs need to be added to Ozone Manager's `ozone-site.xml`. + +```xml +<property> + <name>hdds.datanode.container.schema.v3.enabled</name> + <value>false</value> +</property> +<property> + <name>hdds.datanode.container.db.dir</name> + <description>Determines where the per-disk rocksdb instances will be + stored. This setting is optional. If unspecified, then rocksdb instances are stored on the same disk as HDDS data. + The directories should be tagged with corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for storage policies. + The default storage type will be DISK if the directory does not have a storage type tagged explicitly. + Ideally, this should be mapped to a fast disk like an SSD.</description> +</property> +<property> + <name>hdds.datanode.failed.db.volumes.tolerated</name> + <value>-1</value> + <description>The number of db volumes that are allowed to fail before a datanode stops offering service. + Default -1 means unlimited, but we should have at least one good volume left.</description> +</property> +``` + +## 1. Builds/intermittent test failures + +No additional flaky tests have been introduced by the feature branch. + +## 2. Documentation + +Documentation is being added by [HDDS-6790](https://issues.apache.org/jira/browse/HDDS-6790) + +## 3. Design, attached the docs + +The design docs can be found under the Attachments section in the umbrella Jira: [HDDS-3630](https://issues.apache.org/jira/browse/HDDS-3630) + +## 4. Compatibility + +The Merge RocksDB in Datanode feature does not change any existing Datanode API. All container data in an existing Ozone cluster will remain in its current format and will still be accessible after the Datanode upgrade. + +## 5. Docker-compose / acceptance tests + +A new acceptance test is being added by Jira: [HDDS-6791](https://issues.apache.org/jira/browse/HDDS-6791) + +## 6. Support of containers / Kubernetes + +No addition. + +## 7. 
Coverage/code quality + +[Current feature branch coverage](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-3630) is **85.0%** (vs [82.3% of master branch](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=master)) + +## 8. Build time + +No significant build time difference has been observed. + +master branch succeeded 3 days ago in 9m 9s: + +Feature branch succeeded 7 days ago in 8m 42s: + +## 9. Possible incompatible changes/used feature flag + +There should not be any incompatible changes introduced with this feature. + +A global enable/disable switch for this feature is added in [HDDS-6541](https://issues.apache.org/jira/browse/HDDS-6541). + +## 10. Third party dependencies/license changes + +No third party dependencies are introduced by this feature. + +## 11. Performance + +We have tested the major Datanode activities that require RocksDB operations, including container create, close and delete, and block put and get. Container delete performance drops because container metadata KVs need to be deleted from RocksDB; the other four major activities all show improved performance. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md new file mode 100644 index 0000000000..a4203f3406 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md @@ -0,0 +1,227 @@ +# HDDS-3698: Non-Rolling Upgrade + +## 1. stable builds/intermittent test failures + +- The HDDS-3698-nonrolling-upgrade branch has no intermittent test failures. +- Most intermittent failures specific to the HDDS-3698-nonrolling-upgrade branch were tracked and resolved in [HDDS-4833](https://issues.apache.org/jira/browse/HDDS-4833).
+- Other intermittent failures have been resolved in [HDDS-5109](https://issues.apache.org/jira/browse/HDDS-5109) and [HDDS-5336](https://issues.apache.org/jira/browse/HDDS-5336). + +## 2. documentation + +- *hadoop-hdds/docs/feature/design/how-to-do-a-nonrolling-upgrade.md* contains instructions for users to upgrade a cluster using the framework. + +- Documentation will be refined in coming weeks, before it is needed in the 1.2.0 release. + +## 3. design, attached the docs + +- *hadoop-hdds/docs/content/design/upgrade-dev-primer.md* contains instructions for developers who need to add a feature using the upgrade framework. +- *hadoop-hdds/docs/content/design/nonrolling-upgrade.md* contains links to the main design document and presentation. +- *hadoop-hdds/docs/content/design/omprepare.md* contains links to and a summary of the OM preparation design document. + +## 4. S3 compatibility + +- There are no S3 incompatibilities. This code was not changed. + +## 5. Docker-compose / acceptance tests + +- Upgrades can be done in Docker-compose to/from any Ozone version which has a Docker image, or to the current code. +- See *hadoop-ozone/dist/src/main/compose/upgrade/README.md* for more information. + +## 6. support of containers / Kubernetes + +- The upgrade framework will not affect Kubernetes deployments, since it does not require additional configuration for startup. + +## 7. coverage/code quality + +- [Sonar for master](https://sonarcloud.io/dashboard?id=hadoop-ozone) + + - Code coverage: 74.8% + +- [Sonar for upgrades](https://sonarcloud.io/dashboard?branch=HDDS-3698-nonrolling-upgrade&id=hadoop-ozone) + + - Code coverage: 73.3% + +- OM Prepare and general upgrade flow are tested through new acceptance tests which are not reflected in the code coverage. + +## 8.
build time + +- Recent passing runs from master: + - https://github.com/apache/ozone/actions/runs/927339353 (2:09) + - https://github.com/apache/ozone/actions/runs/923891108 (1:49) + - https://github.com/apache/ozone/actions/runs/923951010 (2:02) + +- Recent passing runs from upgrade branch: + - https://github.com/apache/ozone/actions/runs/923010777 (2:02) + - https://github.com/apache/ozone/actions/runs/941546330 (1:58) + +- Although lots of upgrade specific tests were added, the following are the most time consuming: + +1. New upgrade acceptance tests, which walk through a cluster upgrade from the last release's Docker image to the current build, in order to catch backwards incompatible changes. + +```text + +- Although we previously had upgrade tests on master, the added tests use the upgrade framework which uses more commands and tests, and therefore takes longer to run. + +``` + +- New upgrade integration tests in TestHDDSUpgrade, which test various failure combinations that could occur during the upgrade. + +```text + +- Only a subset of these tests are currently being run to save time. + +``` + +- New integration tests in TestOzoneManagerPrepare, which tests OM preparations under various failure and request combinations. + +## 9. possible incompatible changes/used feature flag + +There are no incompatible changes and no feature flags. The upgrade framework will only be used during an upgrade. + +## 10. third party dependencies/license changes + +- The following dependencies have been added: + + - aspectjrt-1.8.9.jar + - aspectjweaver-1.8.9.jar + - reflections-0.9.12.jar + +- All new libraries have compatible licenses. (License file update: HDDS-5137) + +## 11. performance + +- The majority of the upgrade framework is not used when an upgrade is not being performed, so it is not expected to have a performance impact. 
+ +- `LayoutVersionInstanceFactory` was added to potentially handle client requests within `OzoneManagerRatisUtils`, although this is not being used yet to accommodate HDDS-2939 (filesystem optimizations). + +- Changes do not seem to affect performance based on simple freon testing (3 runs per branch): + +`ozone freon rk --num-of-keys=100 --num-of-buckets=10 --num-of-volumes=1 --replication-type=RATIS --factor=THREE` + +HDDS-3698-nonrolling-upgrade: + +```text + +*************************************************** +Status: Success +Git Base Revision: 7a3bc90b05f257c8ace2f76d74264906f0f7a932 +Number of Volumes created: 1 +Number of Buckets created: 10 +Number of Keys added: 1000 +Ratis replication factor: THREE +Ratis replication type: RATIS +Average Time spent in volume creation: 00:00:00,013 +Average Time spent in bucket creation: 00:00:00,050 +Average Time spent in key creation: 00:00:02,493 +Average Time spent in key write: 00:00:01,163 +Total bytes written: 10240000 +Total Execution time: 00:00:20,679 +*************************************************** + +``` + +```text + +*************************************************** +Status: Success +Git Base Revision: 7a3bc90b05f257c8ace2f76d74264906f0f7a932 +Number of Volumes created: 1 +Number of Buckets created: 10 +Number of Keys added: 1000 +Ratis replication factor: THREE +Ratis replication type: RATIS +Average Time spent in volume creation: 00:00:00,007 +Average Time spent in bucket creation: 00:00:00,065 +Average Time spent in key creation: 00:00:02,014 +Average Time spent in key write: 00:00:01,144 +Total bytes written: 10240000 +Total Execution time: 00:00:23,661 +*************************************************** + +``` + +```text + +*************************************************** +Status: Success +Git Base Revision: 7a3bc90b05f257c8ace2f76d74264906f0f7a932 +Number of Volumes created: 1 +Number of Buckets created: 10 +Number of Keys added: 1000 +Ratis replication factor: THREE +Ratis replication type: 
RATIS +Average Time spent in volume creation: 00:00:00,006 +Average Time spent in bucket creation: 00:00:00,089 +Average Time spent in key creation: 00:00:02,442 +Average Time spent in key write: 00:00:01,076 +Total bytes written: 10240000 +Total Execution time: 00:00:23,655 +*************************************************** + +``` + +master: + +```text + +*************************************************** +Status: Success +Git Base Revision: 7a3bc90b05f257c8ace2f76d74264906f0f7a932 +Number of Volumes created: 1 +Number of Buckets created: 10 +Number of Keys added: 1000 +Ratis replication factor: THREE +Ratis replication type: RATIS +Average Time spent in volume creation: 00:00:00,007 +Average Time spent in bucket creation: 00:00:00,046 +Average Time spent in key creation: 00:00:02,422 +Average Time spent in key write: 00:00:01,411 +Total bytes written: 10240000 +Total Execution time: 00:00:25,699 +*************************************************** + +``` + +```text + +*************************************************** +Status: Success +Git Base Revision: 7a3bc90b05f257c8ace2f76d74264906f0f7a932 +Number of Volumes created: 1 +Number of Buckets created: 10 +Number of Keys added: 1000 +Ratis replication factor: THREE +Ratis replication type: RATIS +Average Time spent in volume creation: 00:00:00,006 +Average Time spent in bucket creation: 00:00:00,039 +Average Time spent in key creation: 00:00:02,762 +Average Time spent in key write: 00:00:01,466 +Total bytes written: 10240000 +Total Execution time: 00:00:23,617 +*************************************************** + +``` + +```text + +*************************************************** +Status: Success +Git Base Revision: 7a3bc90b05f257c8ace2f76d74264906f0f7a932 +Number of Volumes created: 1 +Number of Buckets created: 10 +Number of Keys added: 1000 +Ratis replication factor: THREE +Ratis replication type: RATIS +Average Time spent in volume creation: 00:00:00,006 +Average Time spent in bucket creation: 
00:00:00,029 +Average Time spent in key creation: 00:00:01,975 +Average Time spent in key write: 00:00:01,040 +Total bytes written: 10240000 +Total Execution time: 00:00:23,677 +*************************************************** + +``` + +## 12. security considerations + +The branch introduced the RPC methods in the ScmAdminProtocol.proto to initialize (and finalize) the upgrade process. They are available only to the admins (HDDS-5138). diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md new file mode 100644 index 0000000000..eeb3b3071a --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md @@ -0,0 +1,91 @@ +# HDDS-3816: Erasure Coding Phase 1 + +Distributed systems basic expectation is to provide the data durability. +To provide the higher data durability, many popular storage systems use replication +approach which is expensive. The Apache Ozone supports \`RATIS/THREE\` replication scheme. +The Ozone default replication scheme \`RATIS/THREE\` has 200% overhead in storage +space and other resources (e.g., network bandwidth). +However, for warm and cold datasets with relatively low I/O activities, additional +block replicas rarely accessed during normal operations, but still consume the same +amount of resources as the first replica. + +Therefore, a natural improvement is to use Erasure Coding (EC) in place of replication, +which provides the same level of fault-tolerance with much less storage space. +In typical EC setups, the storage overhead is no more than 50%. The replication factor of an EC file is meaningless. +Instead of replication factor, we introduced ReplicationConfig interface to specify the required type of replication, +either \`RATIS/THREE\` or \`EC\`. 
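The overhead figures quoted above (200% for `RATIS/THREE`, no more than 50% for typical EC setups) follow from simple arithmetic, sketched below. The helper names are hypothetical and not part of Ozone, and the 6 data plus 3 parity layout is assumed purely as an illustration.

```python
def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of the original data, for n-way replication."""
    # Every replica beyond the first is pure overhead.
    return float(replicas - 1)


def ec_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Extra storage, as a fraction of the original data, for an EC layout."""
    # Only the parity blocks are overhead; data blocks hold the payload itself.
    return parity_blocks / data_blocks


# Three-way replication (assumed to correspond to RATIS/THREE).
print(f"3x replication overhead: {replication_overhead(3):.0%}")
# Hypothetical EC layout with 6 data blocks and 3 parity blocks.
print(f"EC 6+3 overhead: {ec_overhead(6, 3):.0%}")
```

Under these assumptions, the two calls give 2.0 and 0.5, which match the 200% and 50% figures in the text.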
+ +Integrating EC with Ozone can improve storage efficiency while still providing +data durability similar to traditional replication-based Ozone deployments. +As an example, a 3x replicated file with 6 blocks will consume 6\*3 = \`18\` blocks of disk space. +But with an EC (6 data, 3 parity) deployment, it will only consume \`9\` blocks of disk space. + +## Git Branch + +## 1. builds/intermittent test failures + +There are no intermittent failures specific to the HDDS-3816-EC branch as of now. + +During development, all CI checks were confirmed clean prior to every commit merge. The plan is to run repeated CI checks on the merge commit to master. + +## 2. documentation + +The feature is described on the Apache Ozone page via [HDDS-6172](https://issues.apache.org/jira/browse/HDDS-6172). + +- *[hadoop-hdds/docs/content/feature/ErasureCoding.md](https://github.com/apache/ozone/blob/HDDS-3816-ec/hadoop-hdds/docs/content/feature/ErasureCoding.md)* has the feature details and related configurations. + +## 3. design, attached the docs + +The following design docs are linked from the documentation in the [HDDS-6172](https://issues.apache.org/jira/browse/HDDS-6172) Jira: + +- [Design Document](https://issues.apache.org/jira/secure/attachment/13021096/Ozone%20EC%20v3.pdf) +- *[hadoop-hdds/docs/content/feature/ErasureCoding.md](https://github.com/apache/ozone/blob/HDDS-3816-ec/hadoop-hdds/docs/content/feature/ErasureCoding.md)* has the feature details and related configurations. + +## 4. S3 compatibility + +The EC feature does not break any existing S3 compatibility. Note that S3 support for EC is not ready yet, but this should not be a blocker for the merge. + +## 5. Docker-compose / acceptance tests + +Acceptance tests were added in [HDDS-6231](https://issues.apache.org/jira/browse/HDDS-6231). + +## 6. support of containers / Kubernetes + +NA. Deployment model for OzoneManager remains as earlier. + +## 7.
coverage/code quality + +**[Sonar master branch](https://sonarcloud.io/dashboard?branch=master&id=hadoop-ozone) +[Sonar HDDS-3816-EC branch](https://sonarcloud.io/dashboard?branch=HDDS-3816-ec&id=hadoop-ozone).** + +The branch has better coverage than master (72% vs 68%). + +## 8. build time + +There is no significant difference in local build time. + +[**Recent master build**](https://github.com/apache/ozone/actions/runs/1807694482) + +[**Recent HDDS-3816-EC branch build**](https://github.com/apache/ozone/actions/runs/1826490688) + +## 9. possible incompatible changes/used feature flag + +To use this feature, users create a bucket with an EC replication config, so that the keys created in that bucket will be written in EC mode. + +Upgrade: Before finalization, we would like to reject EC related requests. Although the EC feature has not introduced any new APIs, request parameters carry different values to indicate EC options, so special handling is needed to check the parameters and validate the requests. Related upgrade JIRAs being worked on are: [HDDS-6213](https://issues.apache.org/jira/browse/HDDS-6213) and [HDDS-5909](https://issues.apache.org/jira/browse/HDDS-5909) + +Compatibility Changes: Currently forward compatibility is broken due to the introduction of server side defaults and the removal of client side default configurations. This is also being worked on in [HDDS-6209](https://issues.apache.org/jira/browse/HDDS-6209) + +We are tracking down the above issues before the merge. + +## 10. third party dependencies/licence changes + +No new dependencies are added. + +## 11. performance + +There should not be any performance impact for non-EC flows. For EC files, a basic benchmark was performed in [HDDS-6194](https://issues.apache.org/jira/browse/HDDS-6194). + +## 12. security considerations + +Everything works as before and the feature has no security implications.
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md new file mode 100644 index 0000000000..3b10e8e4b3 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md @@ -0,0 +1,109 @@ +# HDDS-4440: S3G gRPC Connections + +## Checklist + +### 1. stable builds/intermittent test failures + +There are no intermittent test failures specific to the **HDDS-4440-S3-performance** feature branch. During development, all CI workflow tests were confirmed to pass on each commit prior to the commit merge. To keep the master branch stable, the full CI workflow is run multiple times on the latest commit prior to the final merge. + +### 2. documentation + +Documented in the Apache Ozone Design docs as HDDS-4440 Proposed persistent OM connection for S3 Gateway. + +- [hadoop-hdds/docs/content/design/S3-performance.md](https://ci-hadoop.apache.org/view/Hadoop%20Ozone/job/ozone-doc-master/lastSuccessfulBuild/artifact/hadoop-hdds/docs/public/design/s3-performance.html) + +### 3. design, attached the docs + +The design can be found in Jira HDDS-4440 and the supporting related Jiras HDDS-5881 and HDDS-5630. The ASF feature branch slack channel is \#ozone-s3g-gRPC. + +### 4. S3 compatibility + +This feature tries to provide 100% S3 compatibility when `ozone.om.enable.filesystem.paths=false`. This feature branch provides an enhancement to S3 Gateway behavior for handling and relaying S3 errors to the client. + +Whereas the existing gateway implementation returns an ambiguous 500 HTTP response code with INTERNAL_ERROR in the event of client connection failures, this feature returns standard S3 errors for most errors.
+ +This holds true except when the initial persistent connection cannot be established between the S3 Gateway and the Ozone Manager. In this case the 500 HTTP response is returned to the caller. + +### 5. Docker-compose / acceptance tests + +The Ozone Manager gRPC server service was enabled in each docker-config for the development clusters: ozone, ozonesecure, ozone-ha and ozonesecure-ha. + +To test the S3 Gateway persistent connection gRPC feature with docker-compose / acceptance tests, add the following configuration key settings to the docker-compose.yaml: for the OM container, `OZONE-SITE.XML_ozone.om.s3.grpc.server_enabled: "true"`; for the s3g container, `OZONE-SITE.XML_ozone.om.transport.class: "org.apache.hadoop.ozone.om.protocolPB.GrpcOmTransportFactory"`. + +Then run the acceptance tests (test.sh) for each configured development cluster, including ozone, ozonesecure, ozone-ha and ozonesecure-ha. + +With a development cluster configured for s3g gRPC, the S3 Gateway can also be load tested using the endpoint localhost:9878. Load testers used include freon and warp. + +### 6. support of containers / Kubernetes + +NA. Deployment model for OzoneManager remains as earlier. + +### 7. coverage/code quality + +[Sonar master branch code coverage.](https://sonarcloud.io/component_measures?metric=coverage&view=list&id=hadoop-ozone) + +[Sonar HDDS-4440-S3-performance feature branch code coverage.](https://sonarcloud.io/component_measures?metric=coverage&view=list&branch=HDDS-4440-s3-performance&id=hadoop-ozone) + +Code coverage is nearly unchanged from master to feature branch, from 76.5% to 75.7%. In addition, the feature branch has neither Duplications nor Security vulnerabilities. Further, the feature branch has a better Maintainability number than master, 26 vs 207 (lower is better). + +### 8.
build time + +**Recent Master branch build time:** + +**Current HDDS-4440-S3-performance feature branch build time:** + +Building on a local machine with Ubuntu Linux, a six-core i5 Coffee Lake and 64 GB RAM (\$ mvn clean install -DskipTests): + +| | | +|----------------------------------------------------------|---------------| +| **Master branch build time:** | 06:22 min | +| **Feature branch HDDS-4440-S3-performance build time:** | **06:16 min** | + +### 9. possible incompatible changes + +The S3g gRPC Persistent Connections feature is enabled through two s3g gRPC specific configuration keys. One configuration key enables the gRPC server service on the Ozone Manager (OM), and the other enables the gRPC client on the S3 Gateway (s3g). By default the S3 Gateway gRPC client is off and communication between the s3g and OM goes through the existing Hadoop RPC. + +To enable this feature, set: + +1. `ozone.om.s3.grpc.server_enabled` to true in *ozone-site.xml*. (enable service on OM) +2. `ozone.om.transport.class` to `org.apache.hadoop.ozone.om.protocolPB.GrpcOmTransportFactory` in *ozone-site.xml*. (enable gRPC on s3g client) + +With these two configuration keys disabled, the S3 Gateway to Ozone Manager channel operates in legacy mode with the existing Hadoop RPC. This can be used during the upgrade period to turn off the feature if it is unstable and to operate in legacy mode (Hadoop RPC communication). + +### 10. third party dependencies/license changes + +For the S3-performance gRPC feature, network transport related jars are added to support native encryption on the wire, TLS: + +| | +|-----------------------------------------------| +| Added to License.txt | +| \+ io.netty:netty-tcnative-boringssl-static | +| \+ io.netty:netty-tcnative | + +### 11.
performance + +We compare the performance of the S3 Gateway using the gRPC persistent connection with TLS to the existing Hadoop RPC (hRPC) connections with encryption on the wire for metadata requests. In load testing, the S3 performance feature branch with gRPC and encryption on the wire outperforms the existing hRPC connection ***both*** encrypted and in plaintext. This is particularly evident in the comparison of gRPC with TLS to encrypted wire Hadoop RPC, where the increase is greater than 2X. + +| # | s3g Transport Type | Description | Load Test Performance for Metadata throughput, Objects / sec (objs/sec) | +|---|---|---|---| +| 1 | gRPC TLS (feature branch) | s3g ↔︎ Ozone Manager connection over gRPC with encryption on the wire, TLS. Persistent connection. | 9026.12 | +| 2 | hRPC plaintext (current) | s3g ↔︎ Ozone Manager connection over Hadoop RPC plaintext. Persistent connection (HDDS-5881). | 6508.85 | +| 3 | hRPC encrypted wire (current) | s3g ↔︎ Ozone Manager connection over Hadoop RPC with encryption on the wire (privacy configuration). Persistent connection (HDDS-5881). | 3989.35 | + +Load test used: minio Warp S3 benchmarking tool. + +```bash +./warp stat --host=\ --duration=1m --bucket bucket1 --concurrent=64 --noclear --obj.size=1KiB --access-key=$AWS_ACCESS_KEY --secret-key=$AWS_SECRET_ACCESS_KEY +``` + +The test cluster is a native, bare-metal Ozone deployment, with OM and SCM on one node and the S3 Gateway on a separate node. + +### 12. security considerations + +This feature branch supports encrypted gRPC channel communication between the S3 Gateway and Ozone Manager through TLS. Encryption on the wire for the gRPC channel is configured by the ozone-site.xml key, + +1. `hdds.grpc.tls.enabled` set to `true` + + A new security model is introduced for S3 Gateway persistent connections and was implemented in the supporting Jira master branch patch, HDDS-5881. This branch uses the same security model for S3 user authentication on a per request basis.
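As a rough cross-check of the throughput table in section 11, the ratios between the three transports can be computed directly from the reported figures. This is only a sketch: the dictionary keys are shorthand labels invented here, and the numbers are copied from the table rather than measured again.

```python
# Metadata throughput in objects/sec, copied from the load test table in section 11.
throughput = {
    "grpc_tls": 9026.12,
    "hrpc_plaintext": 6508.85,
    "hrpc_encrypted": 3989.35,
}

# Ratio of the gRPC TLS throughput to each Hadoop RPC variant.
ratios = {
    name: throughput["grpc_tls"] / value
    for name, value in throughput.items()
    if name != "grpc_tls"
}

for name, ratio in ratios.items():
    print(f"gRPC TLS vs {name}: {ratio:.2f}x")
```

The ratio against encrypted Hadoop RPC comes out above 2, in line with the "greater than 2X" statement in the performance section.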
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md new file mode 100644 index 0000000000..d8b2434bc3 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md @@ -0,0 +1,79 @@ +# HDDS-4454: Streaming Write Pipeline + +Git branch: + +Currently, the Ozone write pattern is bursty and involves multiple buffer copies as well as multiple Ratis log syncs in a block write. The idea of the Jira is to use zero buffer copy based Ratis streaming ([RATIS-979](https://issues.apache.org/jira/browse/RATIS-979)) in the Ozone write path for better performance and resource utilization. + +To enable the feature, the following configuration properties need to be added to Ozone Manager's `ozone-site.xml`. + +```xml +<property> + <name>dfs.container.ratis.datastream.enable</name> + <value>true</value> + <tag>OZONE, CONTAINER, RATIS, DATASTREAM</tag> + <description>If enable datastream ipc of container.</description> +</property> + +<property> + <name>ozone.fs.datastream.enabled</name> + <value>true</value> + <tag>OZONE, DATANODE</tag> + <description> + To enable/disable filesystem write via ratis streaming. + </description> +</property> +``` + +## 1. Builds/intermittent test failures + +No additional flaky tests have been introduced by the feature branch. + +## 2. Documentation + +Documentation was added by [HDDS-7425](https://issues.apache.org/jira/browse/HDDS-7425). + +## 3. Design, attached the docs + +The design docs can be found under the Attachments section in the umbrella Jira: [HDDS-4454](https://issues.apache.org/jira/browse/HDDS-4454) + +## 4. Compatibility + +The Ozone Streaming Write Pipeline feature does not change any existing APIs. + +## 5. Docker-compose / acceptance tests + +A new acceptance test was added by Jira: [HDDS-7426](https://issues.apache.org/jira/browse/HDDS-7426) + +## 6. Support of containers / Kubernetes + +No addition. + +## 7.
Coverage/code quality + +[HDDS-4454 feature branch coverage](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=HDDS-4454) (65.4%) vs [master branch coverage](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=master) (65.9%) + +## 8. Build time + +No significant build time difference has been observed. + +master branch succeeded (1h 28m 9s): + +HDDS-4454 branch succeeded (1h 22m 11s): + +## 9. Possible incompatible changes/used feature flag + +There should not be any incompatible changes introduced with this feature. + +A global enable/disable switch for this feature is added in [HDDS-5480](https://issues.apache.org/jira/browse/HDDS-5480). + +## 10. Third party dependencies/license changes + +No third party dependencies are introduced by this feature. + +## 11. Performance + +The throughput of the new Streaming Pipeline is roughly 2x and 3x of the throughput of the existing Write Pipeline for the single client case and the 3-client case, respectively. + +Single client case: [20220702_Single Client - Benchmarks for Ozone Streaming.pdf](https://issues.apache.org/jira/secure/attachment/13046179/20220702_Single%20Client%20-%20Benchmarks%20for%20Ozone%20Streaming.pdf) + +3-client case: [20220702_3 Clients - Benchmarks for Ozone Streaming.pdf](https://issues.apache.org/jira/secure/attachment/13046180/20220702_3%20Clients%20-%20Benchmarks%20for%20Ozone%20Streaming.pdf) diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md new file mode 100644 index 0000000000..5faaa79b31 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md @@ -0,0 +1,121 @@ +# HDDS-4944: S3 Multi-Tenancy + +Feature branch HDDS-4944 was merged to master on May 29.
+ +Git branch: https://github.com/apache/ozone/tree/HDDS-4944 + +Compare: https://github.com/apache/ozone/compare/master...HDDS-4944 + +For a quick intro to the S3 multi-tenancy feature, here is an excerpt from the documentation: + +> Before Ozone multi-tenancy, all S3 access to Ozone (via S3 Gateway) are +> confined to a single designated S3 volume (that is volume \`s3v\`, by default). +> +> Ozone multi-tenancy allows multiple S3-accessible volumes to be created. +> Each volume can be managed separately by their own tenant admins via CLI for user operations, and via Apache Ranger for access control. + +For more, please check out the [full documentation](https://github.com/apache/ozone/blob/HDDS-4944/hadoop-hdds/docs/content/feature/S3-Multi-Tenancy.md?plain=1#L26). The doc has [feature overview](https://github.com/apache/ozone/blob/HDDS-4944/hadoop-hdds/docs/content/feature/S3-Multi-Tenancy.md), [setup guide](https://github.com/apache/ozone/blob/HDDS-4944/hadoop-hdds/docs/content/feature/S3-Multi-Tenancy-Setup.md), [CLI guide](https://github.com/apache/ozone/blob/HDDS-4944/hadoop-hdds/docs/content/feature/S3-Tenant-Commands.md) and [access control guide](https://github.com/apache/ozone/blob/HDDS-4944/hadoop-hdds/docs/content/feature/S3-Multi-Tenancy-Access-Control.md) (best viewed locally rendered using `hugo serve` command under `./hadoop-hdds/docs/` , as it is not published to the website yet). + +Requirements to enable S3 multi-tenancy: + +1. [Use Apache Ranger](https://ozone.apache.org/docs/1.2.1/security/securitywithranger.html) +2. 
[Enable Ozone security and use Kerberos](https://ozone.apache.org/docs/1.2.1/security/secureozone.html) authentication + +To enable multi-tenancy (with Ranger Basic HTTP authentication), in addition to the requirements above, the following configs need to be added to Ozone Manager's `ozone-site.xml`, as also documented [here](https://github.com/apache/ozone/blob/HDDS-4944/hadoop-hdds/docs/content/feature/S3-Multi-Tenancy-Setup.md?plain=1#L40) in the doc: + +```xml + +<property> + <name>ozone.om.multitenancy.enabled</name> + <value>true</value> +</property> +<property> + <name>ozone.om.ranger.https-address</name> + <value>https://RANGER_HOST:6182</value> +</property> +<property> + <name>ozone.om.ranger.https.admin.api.user</name> + <value>RANGER_ADMIN_USERNAME</value> +</property> +<property> + <name>ozone.om.ranger.https.admin.api.passwd</name> + <value>RANGER_ADMIN_PASSWORD</value> +</property> +``` + +To enable multi-tenancy with the Ranger Java client ([HDDS-5836](https://issues.apache.org/jira/browse/HDDS-5836)), the clear text Ranger admin user name and password will no longer be necessary. Instead, the Ranger Java client (re)uses the existing OM Kerberos principal and keytab config when Ozone security is enabled with Kerberos auth. Therefore, only two extra config keys are necessary to enable the feature: + +```xml +<property> + <name>ozone.om.multitenancy.enabled</name> + <value>true</value> +</property> +<property> + <name>ozone.om.ranger.https-address</name> + <value>https://RANGER_HOST:6182</value> +</property> +``` + +`ozone.om.kerberos.principal` and `ozone.om.kerberos.keytab.file` should have been [configured](https://ozone.apache.org/docs/1.2.1/security/secureozone.html#:~:text=ozone.om.kerberos.principal) already. + +NOTE: The Ranger Java client patch is merged, but the authorizer implementation switch hasn't happened yet, partially because Ranger 2.3.0 hasn't been released. Therefore, as of now only the Ranger Basic HTTP authentication approach can be used. A further patch will complete the switch. + +## 1. builds/intermittent test failures + +No additional flaky tests have been introduced by the feature branch.
+ +But there is one flaky upgrade/SCM acceptance test worth mentioning here that comes from the master branch: [HDDS-6546](https://issues.apache.org/jira/browse/HDDS-6546). It is frequently observed while running CI on the S3 multi-tenancy branch. Quite a few hours were spent on it, but the root cause has not been found yet (see Jira comments). Note: the S3 multi-tenancy feature branch does not change SCM code at all. + +## 2. documentation + +Documentation has been added since [HDDS-6275](https://issues.apache.org/jira/browse/HDDS-6275) and is under constant revision. + +The doc (S3-Multi-Tenancy.md, S3-Tenant-Commands.md and so on) can be found under + +## 3. design, attached the docs + +The design docs can be found under the Attachments section in the umbrella Jira: [HDDS-4944](https://issues.apache.org/jira/browse/HDDS-4944) + +## 4. S3 compatibility + +The S3 multi-tenancy feature does not break any existing S3 API compatibility. All S3 secret key pairs generated with the existing `ozone s3 getsecret` command can still be used the same way (still confined to the default s3v volume) after the OM upgrade. + +## 5. Docker-compose / acceptance tests + +Acceptance test cases can be found in: + +## 6. support of containers / Kubernetes + +No addition. There is a plan to bring Apache Ranger to a new set of cluster configs in order to better test the multi-tenancy functionality, but it is not done yet. This would not be a merge blocker. + +## 7. coverage/code quality + +[Current feature branch coverage](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-4944) is **82.0%** (vs [62.8% of master branch](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=master)) + +## 8. build time + +No significant build time difference has been observed. + +master branch succeeded 20 days ago in 9m 45s: + +Feature branch succeeded 19 days ago in 9m 28s: + +## 9.
possible incompatible changes/used feature flag + +There should not be any incompatible changes introduced with this feature. + +A global enable/disable switch for the S3 multi-tenancy feature is to be added in [HDDS-6612](https://issues.apache.org/jira/browse/HDDS-6612). + +## 10. third party dependencies/licence changes + +[HDDS-5836](https://issues.apache.org/jira/browse/HDDS-5836) Ranger Java client would include the new dependency `org.apache.ranger.ranger-intg` + +## 11. performance + +S3 Gateway performance is still to be tested. Performance has been considered during development. For example, in order for the client (S3 Gateway) to select the correct decryption key based on the actual user principal (S3 Gateway) and without introducing an extra round trip, the user principal is piggy-backed in `RpcClient#getS3Volume`. + +Ozone Java RPC client performance should not be affected. + +## 12. security considerations + +For the tenant user assign CLI, there was a discussion on whether to allow admins to specify the access ID (rather than using the auto-generated form `tenantName$userName`) to be assigned to the user. But out of security concern (possible key pair leak), it has been disabled on the server side. Additional input validation will need to be added in the future to allow it to be safely enabled again. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/08-hdds-4948-scm-ha.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/08-hdds-4948-scm-ha.md new file mode 100644 index 0000000000..adadc9b369 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/08-hdds-4948-scm-ha.md @@ -0,0 +1,117 @@ +# HDDS-4948: SCM HA + +## 1. stable builds/intermittent test failures + +There are no intermittent failures specific to the HDDS-2823 branch as of now. 
During development, it was ensured that all the CI checks were clean prior to every commit merge. The plan is to run repeated CI checks on the merge commit to master. + +## 2. documentation + +Initial doc has been added by HDDS-4948. + +## 3. design, attached the docs + +All the design docs are linked from the documentation as part of HDDS-4948. + +## 4. S3 compatibility + +There are no incompatibilities with respect to S3. S3 is not changed at all (except that the OmUtils rename affected it). + +## 5. support of containers / kubernetes + +Tested in Kubernetes and everything worked well. + +Example files are committed with HDDS-4950. + +## 6. coverage/code quality + +[Sonar master branch](https://sonarcloud.io/dashboard?branch=master&id=hadoop-ozone). +[Sonar SCM-HA branch](https://sonarcloud.io/dashboard?branch=HDDS-2823&id=hadoop-ozone) +SCM-HA has 20 more issues but 2.3 better code coverage. + +## 7. build time + +Recent master build: + +Recent SCM-HA build: + +SCM-HA branch didn't introduce any significant slowness (2-3 minutes added to the existing integration tests and acceptance tests, which are already close to 1h runtime). + +There is no significant difference in local build time. 
On a Linux machine with the configuration below, + +```text +hadoop@9 ~/glengeng$ lscpu +Architecture: x86_64 +CPU op-mode(s): 32-bit, 64-bit +Byte Order: Little Endian +CPU(s): 16 +On-line CPU(s) list: 0-15 +Thread(s) per core: 1 +Core(s) per socket: 16 +Socket(s): 1 +NUMA node(s): 1 +Vendor ID: GenuineIntel +CPU family: 6 +Model: 94 +Model name: Intel(R) Xeon(R) Gold 61xx CPU +Stepping: 3 +CPU MHz: 2494.138 +BogoMIPS: 4988.27 +Hypervisor vendor: KVM +Virtualization type: full +L1d cache: 32K +L1i cache: 32K +L2 cache: 4096K +NUMA node0 CPU(s): 0-15 +Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap xsaveopt xsavec xgetbv1 arat +hadoop@9 ~/glengeng$ free -g +total used free shared buff/cache available +Mem: 125 4 98 0 22 120 +Swap: 0 0 0 + +#data for master +INFO BUILD SUCCESS +INFO --------------------------------------------- +INFO Total time: 08:10 min +INFO Finished at: 2021-03-10T16:54:42+08:00 +INFO --------------------------------------------- + +#data for HDDS-2823 +INFO BUILD SUCCESS +INFO -------------------------------------------- +INFO Total time: 07:09 min +INFO Finished at: 2021-03-10T17:06:42+08:00 +``` + +## 8. possible incompatible changes/used feature flag + +To use the SCM HA feature, the `ozone.scm.ratis.enable` config needs to be set to `true` in `ozone-site.xml`. + +## 9. third party dependencies/licence changes + +Checking the content of the two branches (find -type f \| sort \> ... 
+ diff) the only jar differences are due to a recent version bump: + +```text + ./share/ozone/lib/jackson-annotations-2.12.1.jar +> ./share/ozone/lib/jackson-core-2.12.1.jar +> ./share/ozone/lib/jackson-databind-2.12.1.jar +> ./share/ozone/lib/jackson-dataformat-cbor-2.12.1.jar +> ./share/ozone/lib/jackson-dataformat-xml-2.12.1.jar +> ./share/ozone/lib/jackson-datatype-jsr310-2.12.1.jar +> ./share/ozone/lib/jackson-module-jaxb-annotations-2.12.1.jar +``` + +No new dependencies are added. + +## 10. performance + +A performance comparison between master and the SCM-HA branch (without turning on Ratis) is shared [here](https://docs.google.com/document/d/1XYgwM3zOKeZUWsrkxaWO_12zpokTFOSoA4X7Eu-Ed50/edit) + +> We use the default configuration for master and 2823. +> +> The write throughput seems to be constrained by hardware, e.g. DC network, which we haven’t dug further. +> +> According to the slight differences between 2823 and master, the performance of the SCM HA bypass Ratis is close to that of pure in-mem SCM + +## 11. security considerations + +Security is not ready on the branch yet, therefore this feature is not production-ready. SCM-HA is disabled for secure clusters to avoid any security issues. (See HDDS-4978.) diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md new file mode 100644 index 0000000000..e54307a518 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md @@ -0,0 +1,74 @@ +# HDDS-5447: HttpFS + +Ozone HttpFS is a WebHDFS-compatible interface implementation; as a separate role, it provides easy integration with Ozone. + +HttpFS support in Ozone epic Jira: [HDDS-5447](https://issues.apache.org/jira/browse/HDDS-5447) + +## 1. builds/intermittent test failures + +no additional flaky tests have been introduced by the feature branch. 
+ +## 2. documentation + +the documentation for HttpFS was added by [HDDS-5966](https://issues.apache.org/jira/browse/HDDS-5966); it can be found at *[hadoop-hdds/docs/content/interface/HttpFS.md](https://github.com/apache/ozone/blob/HDDS-5447-httpfs/hadoop-hdds/docs/content/interface/HttpFS.md)* + +## 3. design, attached the docs + +- [design document](https://issues.apache.org/jira/secure/attachment/13031822/HTTPFS%20interface%20for%20Ozone.pdf) (can be found under the epic Jira) +- attached docs can be found at *[hadoop-hdds/docs/content/interface/HttpFS.md](https://github.com/apache/ozone/blob/HDDS-5447-httpfs/hadoop-hdds/docs/content/interface/HttpFS.md)* + +## 4. S3 compatibility + +the HttpFS gateway doesn't break anything related to S3. + +## 5. Docker-compose / acceptance tests + +acceptance tests were added in the following tasks: + +- [HDDS-5615](https://issues.apache.org/jira/browse/HDDS-5615) +- [HDDS-5698](https://issues.apache.org/jira/browse/HDDS-5698) +- [HDDS-7719](https://issues.apache.org/jira/browse/HDDS-7719) + +they can be found at [/hadoop-ozone/dist/src/main/smoketest/httpfs](https://github.com/apache/ozone/tree/HDDS-5447-httpfs/hadoop-ozone/dist/src/main/smoketest/httpfs). + +## 6. support of containers / Kubernetes + +no addition yet. + +## 7. coverage/code quality + +[current](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=master) and [HDDS-5447-httpfs](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-5447-httpfs) + +## 8. build time + +no significant change + +- current master: **[1h 41m 17s](https://github.com/apache/ozone/actions/runs/4103741189)** +- last merged commit on branch: [**1h 33m 58s**](https://github.com/apache/ozone/actions/runs/4005079797) (will update with a green build-branch from the HDDS-5447-httpfs branch) + +## 9. possible incompatible changes/used feature flag + +there should not be any incompatible changes introduced with this feature. + +## 10. 
third party dependencies/licence changes + +these dependencies were added: + +- curator-client + +- curator-framework + +- json-simple + +- zookeeper + +the ZooKeeper and Curator dependencies are added because the delegation tokens and token details are stored in ZooKeeper; this can be relevant if there is a load balancer and more than one HttpFS gateway is serving the requests. +json-simple is used in the module, with objects like JSONObject, JSONArray, JSONParser, etc. + +## 11. performance + +this feature won't affect the performance of Ozone. + +## 12. security considerations + +there are no security implications because of this feature. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md new file mode 100644 index 0000000000..443dcd16ad --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md @@ -0,0 +1,80 @@ +# HDDS-5713: Disk Balancer + +DiskBalancer for Datanode Epic : HDDS-5713. DiskBalancer for Datanode +Git Branch : https://github.com/apache/ozone/tree/HDDS-5713 + +## 1. Builds/intermittent test failures + +There are no intermittent failures specific to the HDDS-5713 branch as of now. During development, it was ensured that all the CI checks were clean prior to every commit merge. + +The plan is to run repeated CI checks on the merge commit to master. + +## 2. Documentation + +[User Documentation](https://github.com/apache/ozone/blob/HDDS-5713/hadoop-hdds/docs/content/feature/DiskBalancer.md) for DiskBalancer has been added. + +## 3. Design, attached the docs + +Markdown design document for Ozone-site can be found here : [DiskBalancer Design Doc](https://github.com/apache/ozone/blob/HDDS-5713/hadoop-hdds/docs/content/design/diskbalancer.md). 
+ +Google design doc : [Design Doc](https://docs.google.com/document/d/1G3fcXqmyiB7MNs7eq0nc4zGmGuLsY_IaFOMpTT5eiq4/edit?pli=1&tab=t.0) . + +## 4. S3 compatibility + +N/A, S3 compatibility remains the same. DiskBalancer only affects the Data Volumes within a Datanode. + +## 5. Docker-compose / Acceptance tests + +New robot test [testdiskbalancer.robot](https://github.com/apache/ozone/blob/HDDS-5713/hadoop-ozone/dist/src/main/smoketest/diskbalancer/testdiskbalancer.robot) is being added. + +New acceptance tests were added; they mainly test the CLI for DiskBalancer and do not test fault injection. + +More comprehensive tests with fault injection were added as part of unit tests. + +## 6. Support of containers / Kubernetes + +No addition. No change in existing support. + +## 7. Coverage / Code quality + +[New Code Coverage](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-5713) for DiskBalancer (HDDS-5713) is 83.83 and [Overall Code Coverage](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=HDDS-5713) is 78.4. +[Overall Code Coverage](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=master) for master is 75.6. + +## 8. Build time + +[Build time for the latest commit](https://github.com/apache/ozone/actions/runs/16464370250) from DiskBalancer Branch is 1hr 11m 54s. +[Build time for the latest commit](https://github.com/apache/ozone/actions/runs/16580591540) from the master branch is 1hr 13m 36s. + +## 9. Possible incompatible changes/used feature flag + +There should not be any incompatible changes introduced with this feature. + +A global enable/disable switch for the DiskBalancer feature is to be added in [HDDS-13497. 
\[DiskBalancer\] Add new property "hdds.datanode.disk.balancer.enabled"](https://issues.apache.org/jira/browse/HDDS-13497) + +To enable the feature, the following config needs to be added to the Datanode's `ozone-site.xml` with its value set to `true`: + +| Property | Default | Tags | Description | +|---|---|---|---| +| `hdds.datanode.disk.balancer.enabled` | `false` | OZONE, DATANODE, DISKBALANCER | If this property is set to true, then the Disk Balancer service is enabled on Datanodes, and users can use this service. By default, this is disabled. | + +## 10. Third-party dependencies/License changes + +There are no third party dependencies introduced by this feature. + +## 11. Performance + +The main workflow of DiskBalancer on a Datanode is to first select a pair of data volumes (an over-utilized source volume and an under-utilized destination volume), then select one container from the source volume and move it to the destination volume. End-to-end performance is dominated by the time spent moving the container between volumes, which in turn depends mainly on the physical medium of the volumes. + +Apart from the container move, we ran microbenchmarks for volume pair selection (VolumeChoosingPolicy) and container selection (ContainerChoosingPolicy). VolumeChoosingPolicy chooses a pair of volumes to act as the source volume (most used) and destination volume (least used). ContainerChoosingPolicy decides which container to move from an over-utilized disk to the least-utilized one to help balance storage across volumes. + +Performance test for ContainerChoosingPolicy is done by [HDDS-13055. Optimise ContainerChoosingPolicy Performance](https://issues.apache.org/jira/browse/HDDS-13055) . The test shows it takes approx 0.02ms to pick one container. + +Performance test for VolumeChoosingPolicy is done by [HDDS-13291. 
Add Performance test for VolumeChoosingPolicy](https://issues.apache.org/jira/browse/HDDS-13291) . It shows it takes approx 0.12ms to pick one pair of volumes. + +The container move process consumes Datanode resources, especially disk IO, so DiskBalancer is stopped by default on Datanodes. Users can start the disk balancer only when needed, with IO bandwidth throttling supported. + +## 12. Security considerations + +The CLI to start/stop/update DiskBalancer is only accessible to the admins. [There is a specific robot test to verify this](https://github.com/apache/ozone/blob/HDDS-5713/hadoop-ozone/dist/src/main/smoketest/diskbalancer/testdiskbalancer.robot). + +Besides this, there are no other security implications of this feature. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md new file mode 100644 index 0000000000..7eaf06fe8e --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md @@ -0,0 +1,59 @@ +# HDDS-6517: Snapshots + +01 Feb 2023: Snapshot feature branch `HDDS-6517-Snapshot` has been merged to master. All future snapshot development work should continue on the master branch. + +Snapshot feature umbrella Jira: [HDDS-6517](https://issues.apache.org/jira/browse/HDDS-6517) + +## 1. builds/intermittent test failures + +No additional flaky tests have been introduced by the feature branch. + +## 2. documentation + +Documentation for this feature is tracked through [HDDS-7745](https://issues.apache.org/jira/browse/HDDS-7745) . + +## 3. design, attached the docs + +The design docs can be found under the Attachments section in the umbrella Jira: [HDDS-6517](https://issues.apache.org/jira/browse/HDDS-6517) + +## 4. Compatibility + +The feature doesn't break compatibility for any of the existing features. 
+ +## 5. Docker-compose / acceptance tests + +We have some basic Snapshot acceptance tests. This will remain an ongoing activity throughout Snapshot development and is tracked through [HDDS-7768](https://issues.apache.org/jira/browse/HDDS-7768) . + +## 6. support of containers / Kubernetes + +No addition so far. This should not be a merge blocker. + +## 7. coverage/code quality + +[Current feature branch coverage](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-6517-Snapshot) + +## 8. build time + +No significant build time difference has been observed. + +master branch succeeded yesterday in 13m 30s: + +Feature branch succeeded in 15m 31s: + +## 9. possible incompatible changes/used feature flag + +There should not be any incompatible changes introduced with this feature. + +The Snapshot feature will be a layout upgrade in new releases and can be used after upgrade finalization. This will be tracked through Jira [HDDS-7772](https://issues.apache.org/jira/browse/HDDS-7772) + +## 10. third party dependencies/licence changes + +NA + +## 11. performance + +The feature won't affect performance if the Snapshot feature is not in use. When snapshots are used, performance can be impacted in proportion to the number of snapshots that are actively read from and the number of concurrent snapdiff operations. + +## 12. security considerations + +Security considerations associated with the Snapshot feature are tracked through [HDDS-6851](https://issues.apache.org/jira/browse/HDDS-6851) . The authorization model for Snapshots is attached to this Jira ticket. 
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md new file mode 100644 index 0000000000..6e939486a8 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md @@ -0,0 +1,68 @@ +# HDDS-7593: HSync and Lease Recovery + +Epic: HDDS-7593 +Feature branch: https://github.com/apache/ozone/tree/HDDS-7593 + +## 1. builds/intermittent test failures + +Ran full CI for the [latest commit](https://github.com/apache/ozone/actions/runs/10081026756) on the feature branch twice. All tests passed in both runs. + +## 2. documentation + +The new API is for developers to build upon; it is not intended for end users or administrators. + +## 3. design, attached the docs + +The design and architecture span multiple design docs: + +[Design doc: Supporting Hflush and lease recovery](https://docs.google.com/document/d/1KcB9qjIe6vEg7iRu4rFsHE5kTj6A1JCaJ-Q2PKWLpGw/edit?usp=sharing) + +[Ozone File Lease Recovery Protocol Detail Design](https://docs.google.com/document/d/1wS0dVL3huManP8OrKl-sBjxE5vFVyeW4XEFdHHIctO4/edit?usp=sharing) + +[Support Incremental ChunkList in PutBlock requests](https://docs.google.com/document/d/1Q7skR1xndQ1W3qzkz5rdwjN_2ZFA1zD63bZtKHciY28/edit?usp=sharing) + +## 4. S3 compatibility + +S3 behaviour was not changed. + +The new APIs (hsync, recoverLease, ...) are Hadoop file system APIs and are not supported by S3. + +## 5. Docker-compose / acceptance tests + +Use cases are covered by integration tests. + +## 6. support of containers / Kubernetes + +N/A, the new request does not affect support of containers. + +## 7. 
coverage/code quality + +Current coverage is ~46.8% for [master](https://sonarcloud.io/project/activity?custom_metrics=coverage&graph=custom&id=hadoop-ozone) and ~44.4% for the [feature branch](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-7593). + +## 8. build time + +Build time in CI of the latest commit on the [feature branch (https://github.com/apache/ozone/commit/f7610c0012cd5e83769ff22eebeba092d3893d14)](https://github.com/apache/ozone/actions/runs/9847179268) is similar to that of [master (https://github.com/apache/ozone/actions/runs/9847590592)](https://github.com/apache/ozone/actions/runs/9847590592): 1hr15m vs. 1hr15m. + +## 9. possible incompatible changes/used feature flag + +A number of feature flags were introduced: + +```text +ozone.client.incremental.chunk.list +ozone.client.stream.putblock.piggybacking +ozone.incremental.chunk.list +``` + +Additionally, a new Datanode layout version `HBASE_SUPPORT` was added. A Datanode wire protocol version `COMBINED_PUTBLOCK_WRITECHUNK_RPC` was added too. + +## 10. third party dependencies/license changes + +N/A, no new dependencies were introduced. + +## 11. performance + +The changes do not impact performance. If the new API is used, an additional parameter is passed and checked on the server side. If the feature is not used, the code path is unchanged. No new locks or expensive checks have been added to facilitate the new feature. + +## 12. security considerations + +N/A. A new method was added to the existing OM client API. 
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md new file mode 100644 index 0000000000..4fa6cab277 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md @@ -0,0 +1,58 @@ +# HDDS-7733: Symmetric Key Tokens + +In secure mode, Ozone issues tokens to authorize each block (and container) access. Each token is signed by Ozone (OM or SCM) using its RSA private keys and verified by Datanodes using a public key and certificate. With the RSA private key size of 2048, the sign operation is very costly and contributes more than 80% to the latency of read/write operations in Ozone Manager. + +This feature branch contains the implementation to replace RSA token signing with symmetric keys and thus greatly boosts OM performance. + +Epic Jira: [HDDS-7733](https://issues.apache.org/jira/browse/HDDS-7733) + +## 1. builds/intermittent test failures + +The feature branch has introduced no additional flaky tests. + +## 2. documentation + +The documentation for this change is added by [HDDS-8631](https://issues.apache.org/jira/browse/HDDS-8631) . + +## 3. design, attached the docs + +Design document can be found [here](https://issues.apache.org/jira/secure/attachment/13055974/Symmetric%20Key%20For%20Ozone%20Token%20Signatures-1.pdf) (can be found in the epic Jira). + +## 4. S3 compatibility + +This feature branch doesn't break any S3 feature. + +## 5. Docker-compose / acceptance tests + +Block/container tokens are already tested well by the existing acceptance tests. + +## 6. support of containers / Kubernetes + +No addition yet. + +## 7. 
coverage/code quality + +**[Master](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=master) and [HDDS-7733-Symmetric-Tokens](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-7733-Symmetric-Tokens)** + +## 8. build time + +No significant change + +- current master: **[1h 43m 9s](https://github.com/apache/ozone/actions/runs/4996624349)** +- last merged commit on branch: **[1h 40m 17s](https://github.com/apache/ozone/actions/runs/4963516058)** + +## 9. possible incompatible changes/used feature flag + +There is no incompatible change introduced with this feature. + +## 10. third party dependencies/licence changes + +No new dependency is added. + +## 11. performance + +Performance improvements are documented by [HDDS-8574](https://issues.apache.org/jira/browse/HDDS-8574) + +## 12. security considerations + +- This feature branch adds a couple of new APIs to allow OM and Datanode to access secret keys in OM. Those APIs are protected over Hadoop RPC secure line, with Privacy enabled. Also, the APIs are authorized to only allow Datanode and OM to access, ref: [HDDS-8164](https://issues.apache.org/jira/browse/HDDS-8164) . diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/14-hdds-10239-container-reconciliation.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/14-hdds-10239-container-reconciliation.md new file mode 100644 index 0000000000..1c9088d605 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/14-hdds-10239-container-reconciliation.md @@ -0,0 +1,84 @@ +# HDDS-10239: Container Reconciliation + +Storage Container Reconciliation Epic: [HDDS-10239](https://issues.apache.org/jira/browse/HDDS-10239) +Git Branch: https://github.com/apache/ozone/tree/HDDS-10239-container-reconciliation + +## 1. Builds/intermittent test failures + +All the CI checks need to pass before merging any commit. 
No additional flaky tests have been introduced by the feature branch. + +## 2. Documentation + +Documentation is a WIP, which is tracked here: [HDDS-13039](https://issues.apache.org/jira/browse/HDDS-13039) + +## 3. Design, attached the docs + +The design document was added to [master via pull request](https://github.com/apache/ozone/pull/6121). + +Design Document: [hadoop-hdds/docs/content/design/container-reconciliation.md](https://github.com/apache/ozone/blob/HDDS-10239-container-reconciliation/hadoop-hdds/docs/content/design/container-reconciliation.md) + +## 4. S3 compatibility + +N/A, S3 compatibility remains the same. Container Reconciliation only affects the block storage layer. + +## 5. Docker-compose / Acceptance tests + +New acceptance tests were added to test the container reconciliation process ([HDDS-10372](https://issues.apache.org/jira/browse/HDDS-10372)). + +The acceptance test mainly tests the SCM CLI for container reconciliation. It does not test fault injection. + +More comprehensive tests with fault injection were added as part of unit tests and integration tests. + +## 6. Support of containers / Kubernetes + +No addition. No change in existing support. + +## 7. Coverage / Code quality + +[New code coverage](https://sonarcloud.io/summary/new_code?id=hadoop-ozone&branch=HDDS-10239-container-reconciliation) for Storage Container Reconciliation (HDDS-10239) is 89.2, and [Overall code coverage](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=HDDS-10239-container-reconciliation) is 78.5. + +[Overall code coverage](https://sonarcloud.io/summary/overall?id=hadoop-ozone&branch=master) for master is 77.9. + +## 8. 
Build time + +[Build time of the latest commit](https://github.com/apache/ozone/actions/runs/15448452385/job/43484284706) from the reconciliation branch is 11m 51s + +[Build time of the latest commit](https://github.com/apache/ozone/actions/runs/15461566472/job/43523882533) from the master branch is 11m 37s + +The build time is similar to that of master. + +## 9. Possible incompatible changes/used feature flag + +There is no feature flag for reconciliation. There are no modifications to existing container data or metadata unless an admin manually runs the `ozone admin container reconcile` command. Container checksums and merkle trees will be generated and persisted by the container scanner and reported to SCM, but without an admin command the system will not act on this information. + +There is a possibility that some orphan blocks may appear when containers created before this feature are reconciled. Reconciliation tracks the blocks that were deleted from each container so that the data checksum of the container does not change after it is closed. Containers that deleted blocks before the reconciliation changes were present did not register these deletions in the persisted merkle tree. The following order of events is possible, where software v1 does not have the reconciliation feature and version v2 does: + +1. Block 1 is deleted from container replica 1 in v1. +2. Cluster is upgraded to v2. +3. Container replicas 1 and 2 will report different data checksums until replica 2 also deletes block 1 +4. Container replicas 1 and 2 are reconciled, and replica 2 has still not yet deleted block 1. + - During reconciliation replica 1 will see that it does not have block 1 that replica 2 has, and this block is not tracked in either replica's deleted block list. + - Replica 1 will re-ingest block 1 from replica 2, since it has no record of the block being deleted. +5. Replica 1 now contains block 1 again and deletion will not be retried from SCM. +6. 
Replica 2 then deletes block 1 as part of the normal block deletion flow and persists the result to the merkle tree. Only replica 1 has the orphan block. + +In the above example, if replicas 1 and 2 are reconciled again replica 2 will ignore block 1 from replica 1 since it knows the block was deleted. Currently replica 1 will not take action to remove block 1 once it learns replica 2 deleted it. This will be handled by HDDS-11765, such that a subsequent reconciliation cleans up the orphan block. + +Integrity of existing data is a bigger problem than presence of orphan data. Therefore we feel that the reconciliation branch should still be merged with this case outstanding so admins have the option to repair containers even with this caveat. Orphan data arising from this issue at any time can be fixed by HDDS-11765. + +If the `ozone admin container reconcile` command is never used then this issue will not happen. + +## 10. Third-party dependencies/License changes + +No new dependencies are added. + +## 11. Performance + +This feature might impact the startup time of the Datanode, as we are reading a Merkle tree file for each container during the startup. This will be resolved by [HDDS-12824](https://issues.apache.org/jira/browse/HDDS-12824) . All other reconciliation operations are asynchronous and designed to run at the same time as other container operations. If parallel operations cause containers to diverge (replication of a replica while it is being reconciled, for example), another round of reconciliation will bring the replica up to date with its peers. + +## 12. Security considerations + +- The SCM CLI, which is used to trigger reconciliation, is only accessible to the admins. [There is a specific robot test to verify this](https://github.com/apache/ozone/blob/47c1eaa1eb903175c1729186da48b7bffda85bd8/hadoop-ozone/dist/src/main/smoketest/admincli/container.robot#L172). +- An API on the Datanode was added to read the merkle tree from peer Datanodes. 
It uses a container token for authorization. [There are specific integration tests for container token verification](https://github.com/apache/ozone/blob/47c1eaa1eb903175c1729186da48b7bffda85bd8/hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/ozoneimpl/TestOzoneContainerWithTLS.java#L233). +- All the other APIs used in this feature are existing secure APIs. +- All the RPC calls within the cluster in this feature are secure and use secure container/block tokens for each call. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md new file mode 100644 index 0000000000..2bcd3f517e --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md @@ -0,0 +1,54 @@ +# HDDS-10656: Atomic Key Overwrite + +Epic: [HDDS-10656](https://issues.apache.org/jira/browse/HDDS-10656) +Feature branch: https://github.com/apache/ozone/tree/HDDS-10656-atomic-key-overwrite + +## 1. builds/intermittent test failures + +Intermittent test failures observed on the branch are the same as on master. Currently most of these are due to [HDDS-10750](https://issues.apache.org/jira/browse/HDDS-10750), which will be fixed by Ratis 3.1.0. + +Ran full [CI for the latest commit](https://github.com/apache/ozone/actions/runs/9624466946) on the feature branch 4 times. The first run had an unrelated failure, all tests passed in the following 3 runs. + +## 2. documentation + +The new API is for developers to build upon. Not intended for end-users or administrators. + +## 3. design, attached the docs + +Design doc added on the master branch ([HDDS-10657](https://issues.apache.org/jira/browse/HDDS-10657), [pull request](https://github.com/apache/ozone/pull/6482)). + +## 4. 
S3 compatibility + +N/A, S3 behavior was not changed. + +## 5. Docker-compose / acceptance tests + +Robot test was added in [HDDS-10947](https://issues.apache.org/jira/browse/HDDS-10947) to cover rewrite of key created via multipart upload. Other use cases are covered by integration tests. + +## 6. support of containers / Kubernetes + +N/A, the new request does not affect support of containers. + +## 7. coverage/code quality + +Current coverage is ~58% for both [master](https://sonarcloud.io/project/activity?custom_metrics=coverage&graph=custom&id=hadoop-ozone) and the [feature branch](https://sonarcloud.io/project/activity?id=hadoop-ozone&graph=custom&custom_metrics=coverage&branch=HDDS-10656-atomic-key-overwrite). + +## 8. build time + +Build time in CI of the latest commit on the [feature branch (54f151946cc349087bf73de04aa85a5d128f4584)](https://github.com/apache/ozone/actions/runs/9624466946/job/26551824083) is similar to that of [master (9f1f7ed23801f219a41d9dd9283cc6fdf57381c8)](https://github.com/apache/ozone/actions/runs/9608208083/job/26500618875): 19:02 vs. 18:36. + +## 9. possible incompatible changes/used feature flag + +A new OM version number was introduced to prevent new client sending atomic key overwrite request to old OM which does not support this feature. + +## 10. third party dependencies/licence changes + +N/A, no new dependencies were introduced. + +## 11. performance + +The changes do not impact performance. If the new API is used, an additional parameter is passed and checked on the server side. If the feature is not used, the code path is unchanged. No new locks or expensive checks have been added to facilitate the new feature. + +## 12. security considerations + +N/A. New method was added to the existing OM client API. 
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/README.mdx b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/README.mdx new file mode 100644 index 0000000000..6003afba4d --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/README.mdx @@ -0,0 +1,15 @@ +--- +sidebar_label: Merged Feature Branches +--- + +# Merged Feature Branches + +import DocCardList from '@theme/DocCardList'; + +This section contains completed merge checklists for past Apache Ozone feature branches that are now merged into `master`. + +:::info +These checklists are accurate only up to when each feature branch was merged. They are not updated retroactively as feature development may continue on the `master` branch. +::: + + diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/README.mdx b/docs/08-developer-guide/04-project/01-git/03-feature-branches/README.mdx new file mode 100644 index 0000000000..2adfaa38e2 --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/README.mdx @@ -0,0 +1,7 @@ +# Feature Branches + +import DocCardList from '@theme/DocCardList'; + +This section documents Ozone's usage of git feature branches for development off of the `master` branch. + + diff --git a/docs/08-developer-guide/04-project/01-git/README.mdx b/docs/08-developer-guide/04-project/01-git/README.mdx new file mode 100644 index 0000000000..f99952108e --- /dev/null +++ b/docs/08-developer-guide/04-project/01-git/README.mdx @@ -0,0 +1,12 @@ +--- +sidebar_label: Git +--- + +# Git Usage + +import DocCardList from '@theme/DocCardList'; + +This section documents usage of git branches and tags within the Apache Ozone repo. 
+ + + From b491ba6749200fb9de07a3d88d175828cfaba3a8 Mon Sep 17 00:00:00 2001 From: Ethan Rose Date: Mon, 15 Jun 2026 19:57:36 -0400 Subject: [PATCH 2/3] Fix whitespace from linter --- docs/08-developer-guide/04-project/01-git/README.mdx | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/08-developer-guide/04-project/01-git/README.mdx b/docs/08-developer-guide/04-project/01-git/README.mdx index f99952108e..ac7feb2963 100644 --- a/docs/08-developer-guide/04-project/01-git/README.mdx +++ b/docs/08-developer-guide/04-project/01-git/README.mdx @@ -9,4 +9,3 @@ import DocCardList from '@theme/DocCardList'; This section documents usage of git branches and tags within the Apache Ozone repo. - From 1731b512efade46e25a8e652dd8818769bf9713b Mon Sep 17 00:00:00 2001 From: Ethan Rose Date: Mon, 15 Jun 2026 20:25:47 -0400 Subject: [PATCH 3/3] Fix spelling issues from imported content --- cspell.yaml | 3 +++ .../01-git/03-feature-branches/01-overview.md | 4 ++-- .../01-hdds-2939-filesystem-optimizations.md | 4 ++-- .../03-hdds-3698-non-rolling-upgrade.md | 18 +++++++++--------- .../04-hdds-3816-erasure-coding.md | 2 +- .../05-hdds-4440-s3g-grpc-connections.md | 12 ++++++------ .../06-hdds-4454-streaming-write-pipeline.md | 1 + .../07-hdds-4944-s3-multi-tenancy.md | 4 ++-- .../03-merged-branches/09-hdds-5447-httpfs.md | 2 +- .../10-hdds-5713-disk-balancer.md | 6 +++--- .../11-hdds-6517-snapshots.md | 4 ++-- .../12-hdds-7593-hsync-lease-recovery.md | 6 +++--- .../13-hdds-7733-symmetric-key-tokens.md | 2 +- .../15-hdds-10656-atomic-key-overwrite.md | 2 +- 14 files changed, 37 insertions(+), 33 deletions(-) diff --git a/cspell.yaml b/cspell.yaml index 6e0fbeb65e..f5fc66af16 100644 --- a/cspell.yaml +++ b/cspell.yaml @@ -217,6 +217,9 @@ words: - minimising - tarballed - Namenodes +- WIP +- smoketest +- microbenchmark # Apache Ozone community member names - Sumit # Company names for "Who Uses Ozone" page diff --git 
a/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md index a2c33b5ca4..6b5b051998 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/01-overview.md @@ -12,11 +12,11 @@ Use a feature branch for changes that: - Make iterative changes to core code paths. - Require broader community testing. - Cannot be easily gated with a feature flag. - - The covers changaes that migrate existing code paths instead of adding completely new ones. + - This covers changes that migrate existing code paths instead of adding completely new ones. - Would have issues if a release was cut in the middle of their development. - A release branch can be cut from `master` at any time and feature development should not block this. - If a feature has upgrade compatibility concerns that will not be addressed right away, it should be developed on a feature branch. - - Note that protobuf messages and wire protocols become locked into compatability guarantees once they are released. + - Note that protobuf messages and wire protocols become locked into compatibility guarantees once they are released. - If a feature is making changes in this area and it wants to keep the structure flexible while it is being finalized, it should be done on a feature branch. 
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md index d0b43b350d..3d4a8da9a6 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/01-hdds-2939-filesystem-optimizations.md @@ -94,7 +94,7 @@ For using this feature, "ozone.OM.metadata.layout" config needs to be set to be The new metadata layout is supported only in a fresh cluster and the layout detail is stored in per-bucket. Presently, both old and new metadata layout buckets can't co-exists in the same cluster. User can't start OM in new layout(prefix) if there are existing old layout buckets(simple) and vice-versa. Work is in progress to support the existing old buckets to be available in new layout, this will be supported in the next development phase. -## 10. third party dependencies/licence changes +## 10. third party dependencies/license changes No new dependencies are added. @@ -102,7 +102,7 @@ No new dependencies are added. Done testing to evaluate the performance of delete, rename operations in feature branch vs master code base. Following charts capturing the directory delete and rename operations execution time shows that, feature branch has a very significant performance gain compared to the master. -Ran freon '*dtsg' dfs tree generator benchmark test* in a single node cluster. V0 represents master code(simple) and V1 represents feature branch(prefix). Please refer to the [Jira document](https://issues.apache.org/jira/secure/attachment/13023395/Performance+Comparison+Between++Master+and+HDDS-2939+branch-Report-001.pdf) for more details. +Ran `freon dtsg` dfs tree generator benchmark test in a single node cluster. 
V0 represents master code(simple) and V1 represents feature branch(prefix). Please refer to the [Jira document](https://issues.apache.org/jira/secure/attachment/13023395/Performance+Comparison+Between++Master+and+HDDS-2939+branch-Report-001.pdf) for more details. ## 12. security considerations diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md index a4203f3406..3e4359b76a 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/03-hdds-3698-non-rolling-upgrade.md @@ -2,21 +2,21 @@ ## 1. stable builds/intermittent test failures -- The HDDS-3698-nonrolling-upgrade branch has no intermittent test failures. -- Most intermittent failures specific to the HDDS-3698-nonrolling-upgrade branch were tracked and resolved in [HDDS-4833](https://issues.apache.org/jira/browse/HDDS-4833). +- The `HDDS-3698-nonrolling-upgrade` branch has no intermittent test failures. +- Most intermittent failures specific to the `HDDS-3698-nonrolling-upgrade` branch were tracked and resolved in [HDDS-4833](https://issues.apache.org/jira/browse/HDDS-4833). - Other intermittent failures have been resolved in [HDDS-5109](https://issues.apache.org/jira/browse/HDDS-5109) and [HDDS-5336](https://issues.apache.org/jira/browse/HDDS-5336). ## 2. documentation -- *Hadoop-HDDs/docs/feature/design/how-to-do-a-nonrolling-upgrade.md* contains instructions for users to upgrade a cluster using the framework. +- `hadoop-hdds/docs/feature/design/how-to-do-a-nonrolling-upgrade.md` contains instructions for users to upgrade a cluster using the framework. - Documentation will be refined in coming weeks, before it is needed in the 1.2.0 release. ## 3. 
design, attached the docs -- *Hadoop-HDDs/docs/content/design/upgrade-dev-primer.md* contains instructions for developers who need to add a feature using the upgrade framework. -- *Hadoop-HDDs/docs/content/design/nonrolling-upgrade.md* contains links to the main design document and presentation. -- *Hadoop-HDDs/docs/content/design/omprepare.md* contains links and summary for OM preparation design document. +- `hadoop-hdds/docs/content/design/upgrade-dev-primer.md` contains instructions for developers who need to add a feature using the upgrade framework. +- `hadoop-hdds/docs/content/design/nonrolling-upgrade.md` contains links to the main design document and presentation. +- `hadoop-hdds/docs/content/design/omprepare.md` contains links and summary for OM preparation design document. ## 4. S3 compatibility @@ -82,8 +82,8 @@ There are no incompatible changes and no feature flags. The upgrade framework wi - The following dependencies have been added: - - aspectjrt-1.8.9.jar - - aspectjweaver-1.8.9.jar + - `aspectjrt-1.8.9.jar` + - `aspectjweaver-1.8.9.jar` - reflections-0.9.12.jar - All new libraries have compatible licenses. (License file update: HDDS-5137) @@ -98,7 +98,7 @@ There are no incompatible changes and no feature flags. 
The upgrade framework wi `ozone freon rk --num-of-keys=100 --num-of-buckets=10 --num-of-volumes=1 --replication-type=RATIS --factor=THREE` -HDDS-3698-nonrolling-upgrade: +`HDDS-3698-nonrolling-upgrade`: ```text diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md index eeb3b3071a..41e752ffd4 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/04-hdds-3816-erasure-coding.md @@ -78,7 +78,7 @@ Compatibility Changes: Currently forward compatibility broken due to the introdu We are tracking down the above issues before the merge. -## 10. third party dependencies/licence changes +## 10. third party dependencies/license changes No new dependencies are added. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md index 3b10e8e4b3..01d9998160 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/05-hdds-4440-s3g-grpc-connections.md @@ -26,11 +26,11 @@ This holds true except in the event the initial persistent connection cannot be ### 5. Docker-compose / acceptance tests -Added enabling Ozone Manager gRPC server service to each Docker-config for the development clusters: Ozone, ozonesecure, Ozone-ha and ozonesecure-ha. +Added enabling Ozone Manager gRPC server service to each Docker-config for the development clusters: `ozone`, `ozonesecure`, `ozone-ha` and `ozonesecure-ha`. 
To test the S3 Gateway performance persistent connection gRPC feature with Docker-compose / acceptance tests. Add the following configuration key settings to the Docker-compose.yaml : OM container - [OZONE-SITE.XML_ozone.OM](http://OZONE-SITE.XML_ozone.om).S3.gRPC.server_enabled: "true" & s3g container - [OZONE-SITE.XML_ozone.](http://OZONE-SITE.XML_ozone.om)OM.transport.class : "[org.apache.Hadoop.ozone.OM](http://org.apache.hadoop.ozone.om).protocolPB.GrpcOmTransportFactory". -Then run acceptance tests, test.sh, for development cluster configured including Ozone, ozonesecure, Ozone-ha and ozonesecure-ha. +Then run acceptance tests, test.sh, for development cluster configured including `ozone`, `ozonesecure`, `ozone-ha` and `ozonesecure-ha`. Also, with development cluster configured for s3g gRPC can load test the S3 Gateway using the endpoint, localhost:9878. Load testers used include freon and warp. @@ -52,7 +52,7 @@ Code coverage is nearly unchanged from master to feature branch, from 76.5% to 7 **Current HDDS-4440-S3-performance feature branch build time:** . -Building on a local machine with ubuntu linux six-core i5 Coffee Lake and 64Gb Ram (\$ mvn clean install -DskipTests): +Building on a local machine with ubuntu linux six-core i5 Coffee Lake and 64Gb Ram (`mvn clean install -DskipTests`): | | | |----------------------------------------------------------|---------------| @@ -79,14 +79,14 @@ For the S3-performance gRPC feature, network transport related jars are added to | | |-----------------------------------------------| | Added to License.txt | -| \+ io.Netty:netty-tcnative-boringssl-static | -| \+ io.Netty:netty-tcnative | +| `io.netty:netty-tcnative-boringssl-static` | +| `io.netty:netty-tcnative` | ### 11. performance We compare the performance of the S3 Gateway using the gRPC persistent connection with TLS to the existing Hadoop RPC, hRPC connections with encryption on the wire for metadata requests. 
We find that in load testing the S3 performance feature branch with gRPC and encryption on the wire outperforms the existing hRPC connection ***both*** encrypted and in plaintext. This is particularly evident in the comparison of gRPC with TLS to encrypted wire Hadoop RPC where the increase is greater than 2X. -| # | s3g Transport Type | Description | Load Test Performance for Metadata throughput, Objects / sec (objs/sec) | +| # | s3g Transport Type | Description | Load Test Performance for Metadata throughput, Objects / sec (objects/sec) | |---|---|---|---| | 1 | gRPC TLS (feature branch) | s3g ↔︎ Ozone Manager connection over gRPC with encryption on the wire, TLS. Persistent connection. | 9026.12 | | 2 | hRPC plaintext (current) | s3g ↔︎ Ozone Manager connection over Hadoop RPC plaintext. Persistent connection (HDDS-5881). | 6508.85 | diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md index d8b2434bc3..ffb2a56b96 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/06-hdds-4454-streaming-write-pipeline.md @@ -1,3 +1,4 @@ + # HDDS-4454: Streaming Write Pipeline Git branch: diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md index 5faaa79b31..a3cf267026 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/07-hdds-4944-s3-multi-tenancy.md @@ -78,7 
+78,7 @@ The design docs can be found under the Attachments section in the umbrella Jira: ## 4. S3 compatibility -S3 multi-tenancy feature does not break any existing S3 API compatiblity. And all S3 secret key pairs generated with the existing `ozone s3 getsecret` command can still be used the same way (still confined to default s3v volume) after the OM upgrade. +S3 multi-tenancy feature does not break any existing S3 API compatibility. And all S3 secret key pairs generated with the existing `ozone s3 getsecret` command can still be used the same way (still confined to default s3v volume) after the OM upgrade. ## 5. Docker-compose / acceptance tests @@ -106,7 +106,7 @@ There should not be any incompatible changes introduced with this feature. A global enable/disable switch for the S3 multi-tenancy feature is to be added in [HDDS-6612](https://issues.apache.org/jira/browse/HDDS-6612) . -## 10. third party dependencies/licence changes +## 10. third party dependencies/license changes [HDDS-5836](https://issues.apache.org/jira/browse/HDDS-5836) Ranger Java client would include new dependency `org.apache.ranger.ranger-intg` diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md index e54307a518..bc79125167 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/09-hdds-5447-httpfs.md @@ -50,7 +50,7 @@ no significant change there should not be any incompatible changes introduced with this feature. -## 10. third party dependencies/licence changes +## 10. 
third party dependencies/license changes these dependencies were added: diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md index 443dcd16ad..7ebb592ab9 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/10-hdds-5713-disk-balancer.md @@ -25,7 +25,7 @@ N/A, S3 compatibility remains the same. DiskBalancer only affects the Data Volum ## 5. Docker-compose / Acceptance tests -New robot test [testdiskbalancer.robot](https://github.com/apache/ozone/blob/HDDS-5713/hadoop-ozone/dist/src/main/smoketest/diskbalancer/testdiskbalancer.robot) is being added. +New robot test [`testdiskbalancer.robot`](https://github.com/apache/ozone/blob/HDDS-5713/hadoop-ozone/dist/src/main/smoketest/diskbalancer/testdiskbalancer.robot) is being added. New acceptance test are added, mainly tests the CLI for DiskBalancer. It does not test fault injection. @@ -55,7 +55,7 @@ To enable the feature, the following configs need to be added to DN Ozone-site.x | Property | Default | Tags | Description | |---|---|---|---| -| `hdds.datanode.disk.balancer.enabled` | `false` | OZONE, Datanode, DISKBALANCER | If this property is set to true, then the Disk Balancer service is enabled on Datanodes, and users can use this service. By default, this is disabled. | +| `hdds.datanode.disk.balancer.enabled` | `false` | `OZONE, Datanode, DISKBALANCER` | If this property is set to true, then the Disk Balancer service is enabled on Datanodes, and users can use this service. By default, this is disabled. | ## 10. 
Third-party dependencies/License changes @@ -67,7 +67,7 @@ The major work flow of DiskBalancer on Datanode, is first select a pair of data Except the move container part, we did the microbenchmark performance testing for volume pair choosing(VolumeChoosingPolicy), and container choosing(ContainerChoosingPolicy) . VolumeChoosingPolicy, it chooses a pair of volumes which will act as source volume(most used) and destination volume(least used). ContainerChoosingPolicy, it decides which container to move from an over-utilized disk to least-utilized to help balance storage across volumes. -Performance test for ContainerChoosingPolicy is done by [HDDS-13055. Optimise ContainerChoosingPolicy Performance](https://issues.apache.org/jira/browse/HDDS-13055) . The test shows it takes approx 0.02ms to pick one container. +Performance test for ContainerChoosingPolicy is done by [HDDS-13055. Optimize ContainerChoosingPolicy Performance](https://issues.apache.org/jira/browse/HDDS-13055) . The test shows it takes approx 0.02ms to pick one container. Performance test for VolumeChoosingPolicy is done by [HDDS-13291. Add Performance test for VolumeChoosingPolicy](https://issues.apache.org/jira/browse/HDDS-13291) . It shows it takes approx 0.12ms to pick one pair of volume. diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md index 7eaf06fe8e..ff2710c5ec 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/11-hdds-6517-snapshots.md @@ -46,13 +46,13 @@ There should not be any incompatible changes introduced with this feature. Snapshot feature will be a layout upgrade in new releases and can be used after upgrade finalization. 
This will be tracked through Jira [HDDS-7772](https://issues.apache.org/jira/browse/HDDS-7772) -## 10. third party dependencies/licence changes +## 10. third party dependencies/license changes NA ## 11. performance -The feature won't affect performance if the Snapshot feature is not in use. When snapshots are used, the performance can get impacted proportionate to the number of snapshots that are actively read from and the number of concurrent snapdiff operations, +The feature won't affect performance if the Snapshot feature is not in use. When snapshots are used, the performance can get impacted proportionate to the number of snapshots that are actively read from and the number of concurrent `snapdiff` operations, ## 12. security considerations diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md index 6e939486a8..bcee2145f4 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/12-hdds-7593-hsync-lease-recovery.md @@ -15,7 +15,7 @@ The new API is for developers to build upon. Not intended for end-users or admin The design and architecture spans across multiple design docs -[Design doc: Supporting Hflush and lease recovery](https://docs.google.com/document/d/1KcB9qjIe6vEg7iRu4rFsHE5kTj6A1JCaJ-Q2PKWLpGw/edit?usp=sharing) +[Design doc: Supporting `Hflush` and lease recovery](https://docs.google.com/document/d/1KcB9qjIe6vEg7iRu4rFsHE5kTj6A1JCaJ-Q2PKWLpGw/edit?usp=sharing) [Ozone File Lease Recovery Protocol Detail Design](https://docs.google.com/document/d/1wS0dVL3huManP8OrKl-sBjxE5vFVyeW4XEFdHHIctO4/edit?usp=sharing) @@ -23,7 +23,7 @@ The design and architecture spans across multiple design docs ## 4. 
S3 compatibility -S3 behaviour was not changed. +S3 behavior was not changed. The new APIs (hsync, recoverLease, ...) are Hadoop file system APIs and are not supported by S3. @@ -53,7 +53,7 @@ ozone.client.stream.putblock.piggybacking ozone.incremental.chunk.list ``` -Additionally, new Datanode layout version "HBASE_SUPPORT" was added. A Datanode wire protocol version COMBINED_PUTBLOCK_WRITECHUNK_RPC was added too. +Additionally, new Datanode layout version "HBASE_SUPPORT" was added. A Datanode wire protocol version `COMBINED_PUTBLOCK_WRITECHUNK_RPC` was added too. ## 10. third party dependencies/license changes diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md index 4fa6cab277..a365fc1c14 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/13-hdds-7733-symmetric-key-tokens.md @@ -45,7 +45,7 @@ No significant change There is no incompatible change introduced with this feature. -## 10. third party dependencies/licence changes +## 10. third party dependencies/license changes No new dependency is added. 
diff --git a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md index 2bcd3f517e..2656a4df6a 100644 --- a/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md +++ b/docs/08-developer-guide/04-project/01-git/03-feature-branches/03-merged-branches/15-hdds-10656-atomic-key-overwrite.md @@ -41,7 +41,7 @@ Build time in CI of the latest commit on the [feature branch (54f151946cc349087b A new OM version number was introduced to prevent new client sending atomic key overwrite request to old OM which does not support this feature. -## 10. third party dependencies/licence changes +## 10. third party dependencies/license changes N/A, no new dependencies were introduced.