Add build-monitor-service-hook-deliveries alert to source control #6610

Open
missymessa wants to merge 1 commit into main from fix/build-monitor-service-hook-alert

@missymessa

Member

Summary

Adds the `build-monitor-service-hook-deliveries` Grafana alert rule to source control (Production + Staging).

What changed

The existing Grafana alert was querying for `ServiceHookNotificationStatus` custom events that no longer exist in the codebase. The telemetry was removed at some point, but the alert was never updated. It only fired via `noDataState: Alerting` (a false positive when no data flows), not from actual failure detection.
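
To illustrate the false positive described above, here is a minimal Python sketch of how a no-data setting can turn an empty query result into a firing alert. The function `evaluate_alert`, its threshold check, and the `"Normal"` state string are simplified assumptions for illustration; this is not Grafana's actual evaluation code.

```python
from typing import Sequence


def evaluate_alert(
    values: Sequence[float],
    threshold: float,
    no_data_state: str,
) -> str:
    """Return a simplified alert state for one evaluation.

    Hypothetical model: an empty result maps straight to `no_data_state`;
    otherwise the alert fires when the latest value exceeds `threshold`.
    """
    if not values:
        # No rows came back, so the configured no-data state decides the outcome.
        return no_data_state
    return "Alerting" if values[-1] > threshold else "Normal"


# A query for telemetry that no longer exists always returns nothing.
empty_result: list = []
print(evaluate_alert(empty_result, threshold=0, no_data_state="Alerting"))
print(evaluate_alert(empty_result, threshold=0, no_data_state="OK"))
```

Under this model the first call reports `Alerting` even though nothing failed, while the second reports `OK`, which matches the behavior change this PR describes.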

Updated the alert to:

  • Query actual HTTP request telemetry (`requests` table) from DotNetEng-Status App Insights for the `POST AzurePipelines/BuildComplete` endpoint
  • Change `noDataState` from `Alerting` to `OK`
  • Use explicit `15m` time bins instead of \\
  • Use parameterized resource paths for staging/prod
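
The changes listed above could be sketched roughly as follows. This is an illustrative Python sketch, not the rule definition from this PR: the helper names `build_request_query` and `build_rule_fields`, the `deliveries` column alias, the `query` and `resource` dictionary keys, and the choice to count requests per bin are all assumptions. The resource path is passed in by the caller, since the real staging and production paths are not shown here.

```python
def build_request_query(operation_name: str, bin_size: str = "15m") -> str:
    """Build a KQL query that counts requests to one endpoint per time bin.

    Assumes the App Insights `requests` table with its `name` and
    `timestamp` columns; what is aggregated is an illustrative choice.
    """
    return (
        "requests\n"
        f'| where name == "{operation_name}"\n'
        f"| summarize deliveries = count() by bin(timestamp, {bin_size})\n"
        "| order by timestamp asc"
    )


def build_rule_fields(resource_path: str) -> dict:
    """Assemble a simplified stand-in for the alert rule's key fields.

    `resource_path` is supplied per environment (staging or production),
    so the same definition can be reused without hard-coding either one.
    """
    return {
        "uid": "build-monitor-service-hook-deliveries",
        "noDataState": "OK",  # an empty result no longer fires the alert
        "resource": resource_path,  # hypothetical key, parameterized per environment
        "query": build_request_query("POST AzurePipelines/BuildComplete"),
    }


if __name__ == "__main__":
    # Placeholder path for demonstration only.
    fields = build_rule_fields("<resource-path-for-environment>")
    print(fields["query"])
```

Passing the resource path as an argument keeps the environment-specific value out of the shared definition, which is the intent behind the parameterization described in the last bullet.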

Production

The alert has already been updated in production Grafana via the API. This PR tracks the change in source control so it's deployed consistently.

Fixes: https://dev.azure.com/dnceng/internal/_workitems/edit/10629

The existing Grafana alert (UID: build-monitor-service-hook-deliveries) was
querying for 'ServiceHookNotificationStatus' custom events that no longer
exist in the codebase. This caused false-positive alerts via noDataState.

Updated the alert to query actual HTTP request telemetry from the
DotNetEng-Status App Insights instance for the POST AzurePipelines/BuildComplete
endpoint, which is the real webhook handler for AzDO service hook deliveries.

Changes:
- Query: customEvents/ServiceHookNotificationStatus -> requests/BuildComplete
- noDataState: Alerting -> OK
- Time bins: \ -> explicit 15m
- Data source: dotnet-eng -> DotNetEng-Status-Prod (parameterized)

Fixes: https://dev.azure.com/dnceng/internal/_workitems/edit/10629

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>