Skip to content

fix: port wave CI release flow with non-enterprise jibRepo fix SEC-1395#1046

Merged
ShahzebMahmood merged 13 commits into
masterfrom
sm/fix-wave-internal-ecr-push-SEC-1395
May 27, 2026
Merged

fix: port wave CI release flow with non-enterprise jibRepo fix SEC-1395#1046
ShahzebMahmood merged 13 commits into
masterfrom
sm/fix-wave-internal-ecr-push-SEC-1395

Conversation

@ShahzebMahmood

@ShahzebMahmood ShahzebMahmood commented May 19, 2026

Copy link
Copy Markdown
Contributor

re-applies #1039 (reverted in 7ad68e9) with the bug fix that caused the revert.

what broke: the non-enterprise gradle call was missing -PjibRepo, so JIB tagged the image as wave/app:$version (no v prefix) but the script then tried docker tag wave/app:$TAG server:$TAG where $TAG is v$version. mismatch, release run 25995019680 died.

fix: per @bebosudo's suggestions, build the image once with a generic name (wave/server:$TAG) and let the workflow handle ECR retagging/pushing. dropped the internal ECR push since wave doesn't need internal images.

gating uses paolo's detect-release output from ff8c7af.

SEC-1395

@ShahzebMahmood ShahzebMahmood changed the title fix: port wave CI to internal ECR with non-enterprise jibRepo fix + dev snapshot push SEC-1395 fix: port wave CI to internal ECR with non-enterprise SEC-1395 May 19, 2026
@ShahzebMahmood ShahzebMahmood self-assigned this May 19, 2026
@ShahzebMahmood

Copy link
Copy Markdown
Contributor Author

@claude

@claude

claude Bot commented May 19, 2026

Copy link
Copy Markdown

Claude finished @ShahzebMahmood's task —— View job


PR Review Analysis

Todo List:

  • Read and analyze workflow file changes
  • Review shell script modifications
  • Validate JIB tagging fix
  • Check development push implementation
  • Provide comprehensive review feedback

Code Analysis

Files Modified:

  1. .github/workflows/build.yml - Added development push workflow (+128 lines)
  2. tag-and-push.sh - Fixed JIB repo parameter (+9/-4 lines)

Review Findings

✅ Critical Bug Fix - JIB Tagging (tag-and-push.sh:48)

Problem: Missing -PjibRepo parameter caused version tag mismatch

# Before (broken):
./gradlew jibDockerBuild  # Created: wave/app:1.33.5
# After (fixed):  
./gradlew -PjibRepo=195996028523.dkr.ecr.eu-west-1.amazonaws.com/wave/app:$TAG jibDockerBuild  # Creates: wave/app:v1.33.5

Analysis: This correctly fixes the root cause where JIB was tagging without the v prefix, breaking the subsequent docker tag commands that expected v$version format.

✅ Development Push Implementation (lines 195-245)

New Feature: Automatic development snapshots for every master commit

  • Target: internal/seqeralabs/wave/server/development:<short-sha>
  • Trigger Logic: github.event_name == 'push' && steps.detect-release.outputs.is_release != 'true'
  • Tagging: Uses ${GITHUB_SHA::7} for short SHA tags
  • Security: Follows same AWS credential chain pattern as releases

Security & Best Practices Assessment

✅ Strengths

  1. Proper Conditionals: Development push only runs on push events, not PRs
  2. Release Exclusion: Uses existing detect-release logic to avoid conflicts
  3. Credential Management: Reuses established AWS role chaining pattern
  4. Action Pinning: All actions pinned to specific SHAs with ratchet comments
  5. Separation of Concerns: Development and release workflows don't interfere

✅ Technical Implementation

  1. JIB Configuration: Correctly passes -PjibRepo parameter to match enterprise pattern
  2. Docker Tagging: Proper intermediate tagging before ECR push
  3. Session Names: Unique session names for AWS roles (-dev suffix)
  4. Error Handling: Uses shell: bash consistently

Recommendation: APPROVE

This PR correctly addresses the critical CI failure and adds valuable development snapshot functionality. The implementation follows established patterns and security practices. The fix for the JIB tagging issue is minimal and targeted, reducing risk of introducing new bugs.

Key Benefits:

  • Fixes broken release pipeline
  • Enables platform team to access development snapshots
  • Maintains security boundaries with proper role separation
  • Follows existing architectural patterns

@ShahzebMahmood ShahzebMahmood marked this pull request as ready for review May 19, 2026 10:50

@bebosudo bebosudo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions

Comment thread tag-and-push.sh Outdated
Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
@ShahzebMahmood ShahzebMahmood changed the title fix: port wave CI to internal ECR with non-enterprise SEC-1395 fix: port wave CI release flow with non-enterprise jibRepo fix SEC-1395 May 19, 2026

@bebosudo bebosudo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Could you check with the team if we could stop pushing to the wave/app legacy repo, since it should contain the same exact images as the enterprise repo?
Internally we're using it for our installations, but we could point our apps to the enterprise repo. We could then remove that in a follow up PR

@pditommaso pditommaso left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rationale of the tag-and-push.sh script is making possible to test the logic without outside Github action.

In this form makes no sense any more: either 1) remove it completely, 2) implement the full logic into the bash script

@ShahzebMahmood

Copy link
Copy Markdown
Contributor Author

The rationale of the tag-and-push.sh script is making possible to test the logic without outside Github action.

In this form makes no sense any more: either 1) remove it completely, 2) implement the full logic into the bash script

@pditommaso sounds good, in c2279c6. moved the docker tag/push for both ECRs out of build.yml and back into tag-and-push.sh so the full publish flow lives in the script and can be run locally with docker login to both registries.

@pditommaso

Copy link
Copy Markdown
Collaborator

Nice, have tried to validate locally, so we are sure it's not breaking next release?

@ShahzebMahmood

ShahzebMahmood commented May 22, 2026

Copy link
Copy Markdown
Contributor Author

ran it locally, jibDockerBuild with -PjibRepo=wave/server:vTEST builds and loads the image into local docker as wave/server:vTEST docker tag for all 3 ECR targets legacy wave/app, legacy nf-tower-enterprise/wave, central wave/server) all resolve to the same digest. skipped the actual docker push to keep ECR clean.

the GHA-only parts OIDC + AssumeRole + unset AWS_SESSION_TOKEN before publish.sh can't be exercised outside CI so those will only get verified on the next release run.

image

@pditommaso

Copy link
Copy Markdown
Collaborator

Let's make a fake release changing the VERSION to 1.33.6-TEST and pushing the commit with [release] to validate the flow end to end

https://github.com/seqeralabs/wave/blob/master/VERSION

@bebosudo bebosudo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to move away from the tag-and-push script, and only keep the build and push logic inside GHA

jobs:
build:
name: Build Wave
if: "github.event == 'push' || github.repository != github.event.pull_request.head.repo.full_name"

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if: "github.event_name == 'push' || github.repository != github.event.pull_request.head.repo.full_name"

github.event is an object so it'll never evaluate to the string push, so the first condition of the check will never pass.
The second condition of the check is also misleading, but I'll open a separate PR for that

Comment thread .github/workflows/build.yml Outdated

- name: Configure AWS credentials for platform pusher
if: "steps.detect-release.outputs.is_release == 'true'"
uses: aws-actions/configure-aws-credentials@ec61189d14ec14c8efccab744f656cffd0e33f37 # ratchet:aws-actions/configure-aws-credentials@v6.1.0

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you give it another pass with ratchet? there have been a few newer releases

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see release v6.1.2 is out with a few fixes

@ShahzebMahmood

Copy link
Copy Markdown
Contributor Author

Hey @bebosudo, did what you said. Pulled tag-and-push into the workflow and kept publish.sh for maven. Ran ratchet over the workflow too, all action pins are clean.

Two bugs on the original PR:

  1. Role ARN was platform-ecr-pusher which only trusts platform+groundswell, not wave. Swapped to wave-ecr-pusher.
  2. wave-ecr-pusher only trusts master so I temp added the branch to its trust to test.

v1.33.6-TEST3 just went green. Same digest pushed to wave/app + nf-tower-enterprise/wave in legacy ECR and wave/server in central.

VERSION needs bumping back before merge.

@bebosudo bebosudo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I think ratchet upgrade should find some newer releases

Comment thread .github/workflows/build.yml Outdated

- name: Configure AWS credentials for platform pusher
if: "steps.detect-release.outputs.is_release == 'true'"
uses: aws-actions/configure-aws-credentials@ec61189d14ec14c8efccab744f656cffd0e33f37 # ratchet:aws-actions/configure-aws-credentials@v6.1.0

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see release v6.1.2 is out with a few fixes

Comment thread VERSION Outdated
ShahzebMahmood and others added 2 commits May 27, 2026 10:11
Co-authored-by: Alberto Chiusole <1922124+bebosudo@users.noreply.github.com>
@ShahzebMahmood ShahzebMahmood requested a review from pditommaso May 27, 2026 14:58
@ShahzebMahmood

Copy link
Copy Markdown
Contributor Author

Hey @pditommaso, ready for another look after your feedback.

Worked through this with Alberto. The OIDC role for the new central ECR can only be assumed in CI, not locally, so we moved build + push into the workflow instead of keeping them in tag-and-push.sh. Tested end-to-end with v1.33.6-TEST3, same digest landed in wave/app, nf-tower-enterprise/wave (legacy) and wave/server (central). VERSION back to 1.33.6, ratchet upgrade on the action pins too.

@ShahzebMahmood ShahzebMahmood merged commit 1a570db into master May 27, 2026
4 checks passed
@ShahzebMahmood ShahzebMahmood deleted the sm/fix-wave-internal-ecr-push-SEC-1395 branch May 27, 2026 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants