Enable Karpenter interruption handling #65

Merged
esentuna merged 2 commits into main from codex/karpenter-interruption-hardening on Jun 16, 2026

@alan-walsh

Contributor

Summary

  • Configure the cluster-level chart to load Karpenter values from this GitOps repo.
  • Enable Karpenter interruption handling with the existing ardac1prd-ardac1prd SQS queue.
  • Exclude medium instances from the default Karpenter NodePool in addition to nano, micro, and small.
  • Scale ambassador, aws-es-proxy, fence, portal, and revproxy to 3 replicas.
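
The interruption-handling change in the second bullet can be sketched as Karpenter Helm values. This is a minimal illustration under stated assumptions, not the contents of ardac1prd/cluster-values/karpenter.yaml: the `clusterName` value is assumed to match the cluster name, and the role name in the IRSA annotation is a placeholder that does not appear in this PR.

```yaml
# Hypothetical sketch of Karpenter Helm values (chart 1.0.8, per the Validation section).
serviceAccount:
  annotations:
    # IRSA role ARN; the role name is a placeholder, not taken from this PR.
    eks.amazonaws.com/role-arn: "arn:aws:iam::742378880825:role/<karpenter-role-name>"
settings:
  # Assumed to match the cluster name.
  clusterName: ardac1prd
  # Name of the existing SQS queue that Karpenter polls for interruption events.
  interruptionQueue: ardac1prd-ardac1prd
```

With `settings.interruptionQueue` set, Karpenter reads EC2 interruption events from the queue and can drain affected nodes before they are reclaimed.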

AWS change already applied

  • Updated IAM policy arn:aws:iam::742378880825:policy/ardac1prd-karpenter-sa-policy to default version v2.
  • The Karpenter2 SQS statement now grants receive/delete/get permissions on arn:aws:sqs:us-east-1:742378880825:ardac1prd-ardac1prd.
  • Previous policy document backup was saved locally at %TEMP%\ardac1prd-karpenter-sa-policy-v1-backup.json.
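
For reference, the SQS statement described above might look roughly like the following. This is a readable YAML rendering only (IAM stores the policy document as JSON), and the exact action list is an assumption inferred from "receive/delete/get"; the real v2 policy may differ.

```yaml
# Assumed shape of the SQS statement in ardac1prd-karpenter-sa-policy (v2).
Sid: Karpenter2            # statement name as referenced above
Effect: Allow
Action:
  - sqs:ReceiveMessage
  - sqs:DeleteMessage
  - sqs:GetQueueUrl        # assumption: "get" may instead or also mean sqs:GetQueueAttributes
Resource: arn:aws:sqs:us-east-1:742378880825:ardac1prd-ardac1prd
```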

Validation

  • helm template cluster-level-resources gen3/cluster-level-resources -f ardac1prd\cluster-level-resources\cluster-values.yaml
  • helm template gen3 gen3/gen3 -f ardac1prd\portal.ardac.org\values.yaml
  • helm template karpenter oci://public.ecr.aws/karpenter/karpenter --version 1.0.8 -f ardac1prd\cluster-values\karpenter.yaml
  • git diff --check

Copilot AI review requested due to automatic review settings June 12, 2026 13:36

Copilot AI left a comment

Contributor

Pull request overview

Enables Karpenter interruption handling for the ardac1prd cluster by adding Karpenter Helm values to this GitOps repo and wiring the cluster-level resources to load them. It also excludes medium instance sizes from the default Karpenter NodePool and scales key Gen3 services to 3 replicas.

Changes:

  • Scale ambassador, aws-es-proxy, fence, portal, and revproxy to replicaCount: 3 in portal.ardac.org values.
  • Add ardac1prd/cluster-values/karpenter.yaml with Karpenter settings including interruptionQueue.
  • Update cluster-level resources values to enable Karpenter configuration loading and exclude medium instance sizes in the default NodePool requirements.
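
The instance-size exclusion in the last bullet corresponds to a Karpenter NodePool requirement along these lines. This is a sketch assuming the standard `karpenter.k8s.aws/instance-size` label; the keys that wrap this requirement inside cluster-values.yaml are not shown on this page and are omitted here.

```yaml
# Hypothetical NodePool requirement; parent keys in cluster-values.yaml are not shown.
requirements:
  - key: karpenter.k8s.aws/instance-size
    operator: NotIn
    values:
      - nano
      - micro
      - small
      - medium   # newly excluded by this PR
```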

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| ardac1prd/portal.ardac.org/values.yaml | Increases replica counts for several services to improve availability/capacity. |
| ardac1prd/cluster-values/karpenter.yaml | Adds Karpenter Helm values including IRSA annotation, cluster settings, and interruption queue. |
| ardac1prd/cluster-level-resources/cluster-values.yaml | Enables Karpenter configuration loading from this repo and updates default instance-size exclusions. |
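
As an illustration of the replica change in ardac1prd/portal.ardac.org/values.yaml, the relevant values might look like the following. The layout (one top-level key per service, each with `replicaCount`) is an assumption about the gen3 chart; only the service names and the value 3 come from this PR.

```yaml
# Assumed layout: one top-level key per Gen3 service.
ambassador:
  replicaCount: 3
aws-es-proxy:
  replicaCount: 3
fence:
  replicaCount: 3
portal:
  replicaCount: 3
revproxy:
  replicaCount: 3
```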

Comment thread ardac1prd/cluster-values/karpenter.yaml
Comment thread ardac1prd/cluster-level-resources/cluster-values.yaml
@alan-walsh alan-walsh marked this pull request as draft June 12, 2026 13:45
@alan-walsh alan-walsh marked this pull request as ready for review June 12, 2026 19:37
@esentuna esentuna merged commit e2bf3cf into main Jun 16, 2026
1 check passed
3 participants