FDE-39: Add Argo Rollouts OTel Collector monitoring example #124

Open
prathamesh-sonpatki wants to merge 5 commits into main from worktree-prancy-exploring-otter


@prathamesh-sonpatki prathamesh-sonpatki commented Feb 27, 2026

Summary

  • Logs pipeline: adds filelog receiver + k8sattributes processor to collect pod logs enriched with rollouts-pod-template-hash, enabling canary vs stable pod log filtering in Last9
  • APM correlation: adds transform/apm_labels + groupbyattrs/service processors to map rollout name label → service.name, so rollout metrics correlate with APM traces and metrics for the same service
  • Namespace: changed from monitoring to last9
  • DaemonSet: collector workload changed from Deployment to DaemonSet, so that a collector on each node can read that node's /var/log/pods host volume
  • Bug fixes: filelog regex now handles both UUID and hex-hash pod UIDs (static pods like etcd); if guards on move operators prevent failures when regex doesn't match
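The filelog regex fix in the last bullet can be illustrated with a minimal Python sketch. This is not the regex from this PR: the pattern, the group names, and the sample paths are assumptions for illustration, based on the kubelet layout /var/log/pods/<namespace>_<pod>_<uid>/<container>/<restart>.log and on static pod UIDs being 32-character hex strings. Returning None on a non-match stands in for the if guards on the move operators.

```python
import re

# Hypothetical pattern (not the PR's actual regex). It accepts two UID shapes:
#   - a dashed UUID (regular pods)
#   - a 32-character hex hash (assumed shape for static pods such as etcd)
POD_LOG_PATH = re.compile(
    r"^/var/log/pods/"
    r"(?P<namespace>[^_/]+)_(?P<pod_name>[^_/]+)_"
    r"(?P<uid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}|[0-9a-f]{32})/"
    r"(?P<container_name>[^/]+)/(?P<restart_count>\d+)\.log$"
)


def parse_pod_log_path(path):
    """Return the captured fields as a dict, or None when the path does not match.

    Callers should check for None before using the fields, which mirrors
    guarding a downstream operator so it only runs after a successful parse.
    """
    match = POD_LOG_PATH.match(path)
    return match.groupdict() if match else None


# Made-up sample paths for illustration only.
regular = parse_pod_log_path(
    "/var/log/pods/default_test-rollout-6b9f7c8d5-abcde_"
    "0f1e2d3c-4b5a-6978-8a9b-0c1d2e3f4a5b/app/0.log"
)
static = parse_pod_log_path(
    "/var/log/pods/kube-system_etcd-minikube_"
    "2a4f6c8e0b1d3f5a7c9e1b3d5f7a9c0e/etcd/3.log"
)
unmatched = parse_pod_log_path("/var/log/pods/default_app_not-a-uid/app/0.log")
```

Both UID shapes yield the same set of fields, so later enrichment steps do not need to know which kind of pod produced the log.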

Test plan

  • Deployed collector to minikube — metrics pipeline sending to Last9 (sent_metric_points=564, send_failed=0)
  • Logs pipeline sending to Last9 (sent_log_records=115, receiver_accepted=155)
  • service.name=test-rollout confirmed on rollout metrics in debug output
  • rollouts-pod-template-hash label extraction verified via k8sattributes

🤖 Generated with Claude Code

prathamesh-sonpatki and others added 5 commits February 27, 2026 18:02
Adds production-ready example for monitoring Argo Rollouts canary
deployments via OTel Collector prometheus receiver, shipping metrics
to Last9. Includes K8s manifests with RBAC, canary vs stable pod
comparison via kube-state-metrics, and dashboard guidance.

Closes FDE-39

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ast9

Adds AnalysisTemplates that use Last9's Prometheus-compatible query
endpoint to drive automated canary promotion and rollback in Argo
Rollouts. Includes error rate and p99 latency templates, a reference
Rollout spec wiring both templates across 10/25/50/100% steps, and
README docs explaining the automated gating flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The read endpoint host is dynamic per org — replaced split placeholders
with a single <your-last9-prometheus-read-endpoint> placeholder. Also
removed stale sigv4 authentication block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace basicAuth (unsupported by Argo Rollouts Prometheus provider)
  with args.valueFrom.secretKeyRef + Authorization header pattern
- Add note: use printf + tr -d '\n' when base64-encoding credentials
  to avoid invalid header field value errors at runtime
- Fix metrics reference: remove canary_weight (not a label in current
  Argo Rollouts), add rollout_phase and rollout_events_total, correct
  label names from rollout/namespace keys to name/namespace
- Update dashboard panel queries to use verified metric names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
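
The base64 note in the commit above comes down to keeping newlines out of the Authorization header value. A minimal Python sketch of the same idea follows, as an analogue of the printf + tr -d '\n' approach rather than the commands used in this PR. The function names and the example credentials are hypothetical.

```python
import base64


def basic_auth_value(username: str, password: str) -> str:
    """Build a Basic auth header value with no embedded newline."""
    raw = f"{username}:{password}".encode("utf-8")
    # b64encode neither wraps long output nor appends a trailing newline.
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


def is_safe_header_value(value: str) -> bool:
    """A header field value must not contain CR or LF characters."""
    return "\n" not in value and "\r" not in value


# Placeholder credentials for illustration only.
good = basic_auth_value("example-user", "example-password")

# encodebytes appends a newline (and wraps long input), similar to what a
# shell pipeline produces when the output is not stripped.
wrapped = "Basic " + base64.encodebytes(b"example-user:example-password").decode("ascii")

print(is_safe_header_value(good))     # True
print(is_safe_header_value(wrapped))  # False
```

Checking the value before it is stored in the Secret catches the problem earlier than the runtime error the commit describes.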
…mespace

- Add filelog receiver to collect pod logs enriched with rollouts-pod-template-hash,
  enabling canary vs stable pod log filtering in Last9
- Add transform/apm_labels + groupbyattrs/service processors to map rollout name
  to service.name for APM correlation with traces and metrics
- Add k8sattributes processor to enrich logs with pod labels from K8s API
- Change namespace from monitoring to last9
- Change Deployment to DaemonSet (required for /var/log/pods host access)
- Fix filelog regex to handle both UUID and hex-hash pod UIDs (static pods)
- Add if guards on move operators to handle parse failures gracefully

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>