From 9bdc843f061257b1e660b8473639e1d297172ff5 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon Date: Fri, 15 May 2026 16:46:12 -0400 Subject: [PATCH 01/14] Add experimental alerting features workflows and notifications pages Adds six pages under kibana-alerting-experimental/: - workflows-alerting.md: workflow overview and runtime execution order - notifications.md: action policy overview with dispatcher behavior - notifications/create-configure-action-policy.md: creation guide - notifications/action-policy-reference.md: full field reference - notifications/manage-action-policies.md: management operations - notifications/notification-gating.md: acknowledge/snooze/deactivate gating Incorporates additions from PR #6395 including the new notification-gating page. Co-Authored-By: Claude Sonnet 4.6 --- .../notifications.md | 77 +++++++++++++++ .../notifications/action-policy-reference.md | 99 +++++++++++++++++++ .../create-configure-action-policy.md | 67 +++++++++++++ .../notifications/manage-action-policies.md | 34 +++++++ .../notifications/notification-gating.md | 56 +++++++++++ .../workflows-alerting.md | 37 +++++++ explore-analyze/toc.yml | 7 ++ 7 files changed, 377 insertions(+) create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md new 
file mode 100644 index 0000000000..039166eb0c --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md @@ -0,0 +1,77 @@ +--- +navigation_title: Notifications +applies_to: + stack: unavailable + serverless: preview +products: + - id: kibana +description: "How experimental alerting features action policies route alert episodes to notifications: matchers, grouping, frequency, and workflow destinations." +--- + +# experimental alerting features notifications + +After a rule produces alert episodes, action policies decide what to do about them: who gets notified, how often, and through which channel. + +This page explains how action policies work. For creating and configuring them step by step, refer to [Create and configure an action policy](notifications/create-configure-action-policy.md). + +## What is an action policy [action-policies] + + +An action policy is a saved object in your space that controls notification routing. It's not attached to a rule. It's global within the space, so when an episode is produced, the system evaluates all enabled policies in the space that are not snoozed and decides which ones apply. + +Each policy has four controls: + +| Control | What it does | +| --- | --- | +| Matcher (optional KQL) | Filters which episodes this policy applies to. An empty matcher matches all episodes in the space. | +| Dispatch per (grouping) | Controls how episodes batch into notifications: one per episode, one per notification group (Dispatch per **Group**), or one digest for all. | +| Frequency | Controls how often the policy can notify for the same notification group. | +| Destinations | One or more workflows to invoke when all conditions are met. | + +## How policies apply to rules + +Action policies don't reference rules directly. You scope a policy using KQL over episode and rule fields, for example `rule.labels: "checkout"` or `data.severity: "critical"`. 
A policy applies to every matching episode in the space, from any rule. + +Multiple policies can match the same episode, and each runs independently. There's no precedence or merging between them. If no policy matches an episode, no notification is sent. This is intentional. + +## How action policies are evaluated [how-action-policies-evaluated] + + +{{kib}} runs a background process called the dispatcher that checks for eligible episodes on a short interval (around 10 seconds) and evaluates action policies against them. The dispatcher is separate from the rule schedule. Rules write events on their own cadence, and the dispatcher picks them up asynchronously. + +For each enabled policy that is not snoozed, the dispatcher works through the following steps: + +1. **Gating:** Is the episode acknowledged, snoozed, or deactivated? If so, skip dispatch. Refer to [Notification gating](notifications/notification-gating.md) to learn more. +2. **Matcher:** Does the episode match the policy's KQL? If not, skip this policy. +3. **Grouping:** How should matching episodes batch into notification groups? +4. **Frequency:** Has a notification already gone out for this notification group recently? If so, wait. +5. **Destinations:** Send to the policy's workflow destinations. + +### Dispatch timing [dispatch-timing] +Because the dispatcher runs on its own short interval, notifications don't arrive on the exact rule schedule. They follow the dispatcher's cycle. + +### Possible outcomes [possible-outcomes] + +Each notification attempt results in one of the following outcomes. + +| Outcome | What it means | +| --- | --- | +| `dispatched` | A notification was sent. | +| `throttled` | Dispatch was suppressed because the **frequency** interval had not elapsed. | +| `suppressed` | The episode was suppressed before dispatch (acknowledged, snoozed, or deactivated). | +| `unmatched` | No policy matched this episode; no workflow ran. | +| `error` | Processing failed. 
Check {{kib}} logs. | + +You can query these outcomes in Discover through the `.alert-actions` data stream. + +## Why policies are separate from rules + +Rules don't own policies. A rule can't say "notify team X when it fires." Instead, team X creates an action policy that uses a KQL matcher to pick up matching episodes. + +This design means: +- One policy can cover episodes from many rules. +- You can update routing without touching any rule. +- Rules can be created without any notification policy, which is useful for testing. + +When you're ready to route notifications, go to [Create and configure an action policy](notifications/create-configure-action-policy.md). + diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md new file mode 100644 index 0000000000..fe14625da9 --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md @@ -0,0 +1,99 @@ +--- +navigation_title: Action policy reference +applies_to: + stack: unavailable + serverless: preview +products: + - id: kibana +description: "Grouping modes, frequency options, dispatch outcomes, and matcher field reference for action policies in the experimental alerting features." +--- + +# Action policy reference [action-policy-reference] + + +Action policies are part of the experimental alerting features in Kibana. Use this page when building action policies. Below, you will find details about valid matcher fields, grouping modes and frequency options, dispatch outcome, and more. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). + +## Matcher fields [matcher-fields] + +Use these fields in the **Matcher** expression to filter which episodes a policy applies to. 
Combine them with standard KQL operators, for example `data.severity: "critical" AND episode_status: "active"`. + +| Field | Description | Example | +|---|---|---| +| `episode_status` | Current lifecycle status of the episode. Accepted values: `active`, `inactive`, `pending`, `recovering`. | `episode_status: "active"` | +| `data.*` | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. | `data.severity: "critical"` or `data.host.name: "web-01"` | +| `rule.id` | Unique identifier for the rule that generated the episode. | `rule.id: "rule-001"` | +| `rule.name` | Display name of the rule. | `rule.name: "High CPU"` | +| `rule.labels` | Key-value labels attached to the rule. Use dot notation to target a specific label key. | `rule.labels.env: "production"` | + + + +## Dispatch per options [notification-grouping] + +Controls how the policy batches matching episodes before sending a notification. + +| Option | Description | When to use | +|---|---|---| +| Episode | Each episode triggers its own notification independently. Default selection. | You need per-issue visibility and want to handle each problem separately. | +| Group | The policy bundles episodes that share the same value for a specified `data.*` field into one notification per unique value (a **notification group**). | A rule produces many related episodes (for example, one per service or host) and you want to reduce noise by batching them into shared notifications. | +| Digest | The policy combines all matching episodes into a single notification, regardless of what they have in common. | You want a single periodic summary of everything that matched, rather than individual alerts. | + +## Frequency [throttle-strategies] + +**Frequency** controls how often the policy fires for a given episode or notification group. The available options depend on the **Dispatch per** setting. Not all options are valid for all modes. 
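As a rough illustration of that dependency, the following Python sketch encodes which **Frequency** options each **Dispatch per** mode accepts, following the per-mode tables on this page. The identifiers (`VALID_FREQUENCIES`, `validate_frequency`, and the snake_case option names) are hypothetical and are not part of any {{kib}} API.

```python
# Hypothetical names for illustration only; not a Kibana API.
# Maps each Dispatch per mode to the Frequency options listed for it.
VALID_FREQUENCIES = {
    "episode": {"on_status_change", "on_status_change_repeat", "every_evaluation"},
    "group": {"at_most_once_every", "every_evaluation"},
    "digest": {"every_evaluation"},
}


def validate_frequency(dispatch_per: str, frequency: str) -> bool:
    """Return True if the frequency option is offered for the dispatch mode."""
    if dispatch_per not in VALID_FREQUENCIES:
        raise ValueError(f"unknown Dispatch per mode: {dispatch_per!r}")
    return frequency in VALID_FREQUENCIES[dispatch_per]
```

Keeping the mapping as data rather than branching logic makes it straightforward to compare against the tables if the available options change.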
+ +| Option | Description | When to use | +|---|---|---| +| On status change | Notifies when the episode status changes (for example, active → recovering). One notification per transition. | You only need to know when something breaks and when it's resolved. No reminders needed. | +| On status change + repeat at interval | Notifies on status change, then resends notifications at a regular interval while the episode remains in the same status. | You want status change alerts plus periodic notifications that a problem is still unresolved, in case it has been missed or pushed aside. | +| At most once every… | Caps notifications at one per episode or notification group within the chosen interval, regardless of rule frequency. | You want to limit alert volume for noisy rules without missing new or ongoing issues. | +| Every evaluation | Notifies on every rule evaluation. Can be noisy. Use sparingly and only with infrequent rule schedules. | You need a full audit trail of every evaluation, or the rule runs infrequently enough that noise isn't a concern. | + + + +### Frequency options for Episode [frequency-when-episode-per_episode] + +Available frequency options when you set **Dispatch per** to **Episode**. + +| Option | Description | Example | +|---|---|---| +| On status change | Notifies once when the episode opens and once when it recovers. No repeat notifications while it remains active. Best for when you trust your ticketing or incident workflow to track ongoing issues | A host goes down at 9:00am → one notification. Recovers at 11:00am → one notification. No notifications between them. | +| On status change + repeat at interval | Same as On status change, but also sends a reminder at a set interval while the episode is still active. | A host goes down at 9:00am → notification. With a 1h repeat: reminder at 10:00am, 11:00am. Recovers at 11:30am → notification. | +| Every evaluation | Fires on every rule evaluation, regardless of status. 
Can be noisy on frequent rule schedules. Avoid in production. | A rule running every 5 minutes with one active episode produces up to 288 notifications per day. | + +### Frequency options for Group + +Available frequency options when you set **Dispatch per** to **Group**. + +| Option | Description | Example | +|---|---|---| +| At most once every… | Limits how often each notification group can notify, regardless of how many episodes match or how often the rule runs. | 10 episodes share `data.host.name: "web-01"`. With a 1h limit, you get at most one notification per hour for that notification group. | +| Every evaluation | Fires on every rule evaluation for each unique value in the group-by field. Still noisy on frequent rule schedules. | A rule running every 10 minutes with 5 unique host values produces up to 6 notifications per host per hour. | + +### Frequency options for Digest + +Available frequency options when you set **Dispatch per** to **Digest**. + +| Option | Description | Example | +|---|---|---| +| Every evaluation | The only option for Digest. Fires on every rule run, bundling all matching episodes into one message. Pair with a longer rule schedule to avoid frequent summary messages. | A rule running every 30 minutes with 20 matching episodes produces one summary notification every 30 minutes containing all 20. | + +## Dispatch outcomes + +The system records each notification attempt with one of the following outcomes. To investigate delivery issues, query the `.alert-actions` data stream in Discover and filter by the `outcome` field. + +| Outcome | What happened | +|---|---| +| `dispatched` | The system sent the notification successfully. | +| `throttled` | The system skipped delivery because the **frequency** interval had not elapsed. This is expected behavior, not an error. | +| `suppressed` | Dispatch was blocked before the notification went out because the episode was acknowledged, snoozed, or deactivated. 
| +| `unmatched` | No action policy matched this episode, so no workflow ran. | +| `error` | An error occurred during processing. Check {{kib}} logs to identify the cause. | \ No newline at end of file diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md new file mode 100644 index 0000000000..aa30993ecb --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md @@ -0,0 +1,67 @@ +--- +navigation_title: Create an action policy +applies_to: + stack: unavailable + serverless: preview +products: + - id: kibana +description: "Create action policies in the experimental alerting features, configure matchers, Dispatch per, Frequency, and workflow destinations." +--- + +# Create and configure an action policy [create-manage-action-policies] + + +Action policies are part of the experimental alerting features in Kibana. Rules define what counts as a problem. Action policies define what happens when a problem is detected. They determine which episodes generate notifications, how episodes batch for dispatch, **frequency** limits on notifications, and where they're routed. + +Because policies are separate from rules and global within a space, you can update notification behavior across many rules at once without touching detection logic, and you can route the same alerts differently depending on severity or source. You create and manage policies from the **Action policies** page, not from the rule form. + +For matcher fields, grouping modes, frequency options, and dispatch outcomes, refer to [Action policy reference](action-policy-reference.md). + + + +## Policy fields [policy-fields] + +### Matcher [matcher] + + +An optional KQL expression that filters which episodes this policy applies to. An empty matcher matches every episode in the space. 
+ +Use matchers to route different episodes to different policies, for example, one policy for `data.severity: "critical"` episodes routed to PagerDuty and another for warnings routed to Slack. For available fields and examples, refer to [Matcher fields](action-policy-reference.md#matcher-fields). + + + +### Grouping and frequency [reduce-noise-grouping] + + +**Dispatch per** controls how episodes batch into notifications. **Frequency** controls how often the policy can notify for each batch. + +:::{table} +:widths: 4-4-4 + +| Dispatch per | What it does | Available Frequency options | +|---|---|---| +| Episode | One notification per episode. | - On status change
- On status change + repeat at interval
- Every evaluation | +| Group | Bundle episodes that share a field value. Specify **Group by** (for example `data.service.name` or `data.host.name`). | - At most once every…
- Every evaluation | +| Digest | One notification for all matching episodes combined. | Every evaluation | + +::: + +For detailed descriptions, frequency options, and examples for each mode, refer to [Dispatch per options](action-policy-reference.md#notification-grouping). + +### Frequency [throttle] + + +**Frequency** limits how often the policy can fire for a given notification group. The interval resets from the last time the policy fired, so successive notifications stay at least `interval` apart. Set a duration such as `1h` or `30m`. For available options by **Dispatch per** mode, refer to [Frequency](action-policy-reference.md#throttle-strategies). + +### Destinations + +One or more workflows to invoke when the policy matches. Use the search field to find and attach workflows. + +### Snooze + +An optional time window during which the policy doesn't dispatch. Useful for planned maintenance or quiet periods without disabling the policy entirely. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md new file mode 100644 index 0000000000..0c09fd1231 --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md @@ -0,0 +1,34 @@ +--- +navigation_title: Manage action policies +applies_to: + stack: unavailable + serverless: preview +products: + - id: kibana +description: "Enable, disable, snooze, maintenance windows, bulk actions, and API key rotation for action policies in the experimental alerting features." +--- + +# Manage action policies + +Managing action policies is part of the experimental alerting features in Kibana. After you create an action policy, you can control when it runs, pause it temporarily, and keep its credentials current. This page covers those management tasks. 
+ +## Enable, snooze, and maintenance + +You can disable a policy so it is not evaluated for new episodes. You can snooze a policy for a defined window so that it does not dispatch notifications during that period. Policies that are not enabled or are snoozed are skipped when the dispatcher evaluates policies. + +### Maintenance windows [maintenance-windows] + + +Maintenance windows are scheduled periods during which a policy does not dispatch notifications. They are configured on the action policy alongside snooze and other policy controls, not on the rule. Rule evaluation continues and alert episodes can still be recorded in `.rule-events`. Only dispatch through that policy pauses. Use maintenance windows for planned deployments, infrastructure changes, or recurring quiet periods. + +## Update API keys + +You can rotate the API key used to run a policy's workflows without changing matchers or destinations. Use the **Update API key** action on one policy or for multiple selected policies. + +::::{important} Production considerations +When you update or delete an action policy, previous API keys used for execution are marked for invalidation and removed on a schedule managed by {{kib}}. Allow for a short delay before new keys are used for dispatch. +:::: + +## Bulk actions + +On the action policies list, select one or more policies to enable, disable, snooze, and do more in bulk. **Select all** selects every policy on the current page of results. Clear the selection before changing filters if you need a different set. 
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md new file mode 100644 index 0000000000..eb0f4c3c0b --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md @@ -0,0 +1,56 @@ +--- +navigation_title: Notification gating +applies_to: + stack: unavailable + serverless: preview +products: + - id: kibana +description: "How experimental alerting features gates notifications: which mechanisms suppress dispatch before action policies run, their scope, and when to use each." +--- + +# Notification gating [notification-gating] + + +Notification gating controls whether a matched alert episode triggers a notification. When an episode is gated, the dispatcher stops processing it before any action policy matcher, grouping, throttle, or destination runs. No notification is sent. + +## How gating fits in the dispatcher [gating-in-dispatcher] + + +When an episode is eligible for dispatch, the dispatcher evaluates each enabled action policy in order: + +1. **Gating:** Is the episode acknowledged, snoozed, or deactivated? If so, stop — no notification is sent. +2. **Matcher:** Does the episode match the policy's KQL? If not, skip this policy. +3. **Grouping:** How should matching episodes batch into notification groups? +4. **Throttle:** Has a notification already gone out for this group recently? If so, wait. +5. **Destinations:** Send to the policy's workflow destinations. + +Gating is the first step. An episode that is acknowledged, snoozed, or deactivated never reaches routing. 
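The evaluation order above can be sketched in Python. This is a simplified model under stated assumptions, not the dispatcher's actual implementation: the `Episode` and `Policy` classes, the `dispatch` function, the flattened `data` keys, and the single `interval_s` throttle are all hypothetical. Only the step order and the outcome names (`suppressed`, `unmatched`, `throttled`, `dispatched`) come from the documentation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Episode:
    id: str                      # hypothetical episode identifier
    data: Dict[str, str]         # flattened stand-in for data.* fields
    acknowledged: bool = False
    snoozed: bool = False
    deactivated: bool = False


@dataclass
class Policy:
    name: str
    matcher: Optional[Callable[[Episode], bool]]  # None models an empty matcher
    group_by: Optional[str]      # data key to group on, or None for per-episode
    interval_s: float            # assumed minimum seconds between notifications
    destinations: List[str]      # workflow IDs
    enabled: bool = True
    last_sent: Dict[Tuple[str, str], float] = field(default_factory=dict)


def dispatch(episode: Episode, policies: List[Policy], now: float) -> List[Tuple[str, str]]:
    """Return (policy name, outcome) pairs for one episode; "-" means no policy."""
    # 1. Gating: a gated episode never reaches any policy.
    if episode.acknowledged or episode.snoozed or episode.deactivated:
        return [("-", "suppressed")]
    results = []
    for policy in policies:
        if not policy.enabled:
            continue
        # 2. Matcher: an empty matcher matches every episode.
        if policy.matcher is not None and not policy.matcher(episode):
            continue
        # 3. Grouping: derive the notification group key.
        if policy.group_by is None:
            key = ("episode", episode.id)
        else:
            key = ("group", episode.data.get(policy.group_by, ""))
        # 4. Throttle: wait if this group was notified too recently.
        last = policy.last_sent.get(key)
        if last is not None and now - last < policy.interval_s:
            results.append((policy.name, "throttled"))
            continue
        # 5. Destinations: a real dispatcher would invoke the workflows here.
        policy.last_sent[key] = now
        results.append((policy.name, "dispatched"))
    return results or [("-", "unmatched")]
```

In this model each policy keeps its own throttle state, which mirrors the statement that multiple matching policies run independently with no precedence or merging between them.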
+ +## Gating mechanisms [gating-mechanisms] + + +Three mechanisms let you gate notifications, each at a different scope: + +| Mechanism | Scope | When to use | +|---|---|---| +| Acknowledge | Per episode | You're actively investigating a breach and want to silence notifications for it without closing the episode. Clear the acknowledgement when you're done to restore notifications. | +| Snooze | Per series (group) | You want to quiet an entire alert series for a defined period — for example, during a known noisy window for a specific host. Snooze expires automatically at the end of the duration. | +| Deactivate | Per episode | You want to manually close an episode that hasn't recovered automatically. Deactivating marks the episode as inactive and stops notifications for it. Unlike acknowledge, this closes the episode rather than silencing it while leaving it active. | + +Each mechanism is stored as a separate document in `.alert-actions`, so the full gating history for an episode is queryable in Discover. + +### Snooze scope + +Snooze applies at the group level (by `group_hash`), not per individual episode. When you snooze one episode, every episode sharing the same group — all rows with the same `rule_id` and `group_hash` — is silenced for the duration. Snoozing one row in the alerts table silences the entire series for that rule. + + + +## Related pages + + +- **[Notifications](../notifications.md):** Set up action policies that control routing, grouping, and throttle after gating. 
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md new file mode 100644 index 0000000000..f28ead6b58 --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -0,0 +1,37 @@ +--- +navigation_title: Workflows +applies_to: + stack: ga 9.4, preview 9.3 + serverless: ga +products: + - id: kibana +description: "How workflows connect to the experimental alerting features action policies and rule automation, and where to configure them." +--- + +# Workflows for the experimental alerting features [workflows] + + +Workflows are part of the experimental alerting features in Kibana. Without a workflow, an action policy has nowhere to send notifications. [Workflows](../../workflows.md) are the delivery layer. They define the actual steps that run when a policy matches an episode: sending a message, calling a webhook, triggering automation, or any combination. Setting up a workflow is what connects the experimental alerting features to the tools your team already uses for incident response. + +Before creating an action policy, make sure the workflows you want to use already exist in your space. Policies store references to workflow IDs, so a destination workflow must exist before you can select it. + +::::{note} +Only manual triggers are supported for workflows used with action policies. +:::: + +## Runtime execution order [runtime-execution-order] + +After a rule produces or updates alert episodes, processing follows this sequence: + +``` +Rule → Alert → Action Policy → Workflow → Notification +``` + +1. The rule runs its {{esql}} evaluation and writes to `.rule-events`. +2. In Alert mode, alert documents and episodes represent the ongoing issue. +3. Action policies in the same space are evaluated against episodes (matcher, suppression, grouping, frequency). +4. 
For each dispatch, the policy invokes its configured workflows. +5. Notifications are the outcome: Email, chat, webhook, and so on. + +The policy evaluates matchers and **frequency** limits before any workflow step runs, even though you created the workflow before the policy. That's why configuration order (workflow first, then policy, then rule) is the reverse of runtime order. + diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml index 2a57f93a92..d50be4be30 100644 --- a/explore-analyze/toc.yml +++ b/explore-analyze/toc.yml @@ -380,6 +380,13 @@ toc: - file: report-and-share/reporting-troubleshooting-pdf.md - file: alerting.md children: + - file: alerting/kibana-alerting-experimental/workflows-alerting.md + - file: alerting/kibana-alerting-experimental/notifications.md + children: + - file: alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md + - file: alerting/kibana-alerting-experimental/notifications/action-policy-reference.md + - file: alerting/kibana-alerting-experimental/notifications/manage-action-policies.md + - file: alerting/kibana-alerting-experimental/notifications/notification-gating.md - file: alerting/alerts.md children: - file: alerting/alerts/alerting-getting-started.md From 7cb32e944372ea0e94db60ed44f124b793369475 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon Date: Fri, 15 May 2026 17:12:17 -0400 Subject: [PATCH 02/14] Add alerting-v2 variables and replace phrase throughout Co-Authored-By: Claude Sonnet 4.6 --- docset.yml | 2 ++ .../alerting/kibana-alerting-experimental/notifications.md | 4 ++-- .../notifications/action-policy-reference.md | 4 ++-- .../notifications/create-configure-action-policy.md | 4 ++-- .../notifications/manage-action-policies.md | 4 ++-- .../notifications/notification-gating.md | 4 ++-- .../kibana-alerting-experimental/workflows-alerting.md | 6 +++--- 7 files changed, 15 insertions(+), 13 deletions(-) diff --git a/docset.yml b/docset.yml index 5ecf676ed5..a43cb177d5 100644 --- 
a/docset.yml +++ b/docset.yml @@ -123,6 +123,8 @@ subs: ls-pipelines-app: "Logstash Pipelines" maint-windows-app: "Maintenance Windows" maint-windows-cap: "Maintenance windows" + alerting-v2: "experimental alerting features" + alerting-v2-cap: "Experimental alerting features" custom-roles-app: "Custom Roles" data-source: "data view" data-sources: "data views" diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md index 039166eb0c..788c71f568 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md @@ -5,10 +5,10 @@ applies_to: serverless: preview products: - id: kibana -description: "How experimental alerting features action policies route alert episodes to notifications: matchers, grouping, frequency, and workflow destinations." +description: "How {{alerting-v2}} action policies route alert episodes to notifications: matchers, grouping, frequency, and workflow destinations." --- -# experimental alerting features notifications +# {{alerting-v2-cap}} notifications After a rule produces alert episodes, action policies decide what to do about them: who gets notified, how often, and through which channel. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md index fe14625da9..b663032c9d 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md @@ -5,13 +5,13 @@ applies_to: serverless: preview products: - id: kibana -description: "Grouping modes, frequency options, dispatch outcomes, and matcher field reference for action policies in the experimental alerting features." 
+description: "Grouping modes, frequency options, dispatch outcomes, and matcher field reference for action policies in the {{alerting-v2}}." --- # Action policy reference [action-policy-reference] -Action policies are part of the experimental alerting features in Kibana. Use this page when building action policies. Below, you will find details about valid matcher fields, grouping modes and frequency options, dispatch outcome, and more. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). +Action policies are part of the {{alerting-v2}} in Kibana. Use this page when building action policies. Below, you will find details about valid matcher fields, grouping modes and frequency options, dispatch outcome, and more. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). ## Matcher fields [matcher-fields] diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md index aa30993ecb..6fbb394469 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md @@ -5,13 +5,13 @@ applies_to: serverless: preview products: - id: kibana -description: "Create action policies in the experimental alerting features, configure matchers, Dispatch per, Frequency, and workflow destinations." +description: "Create action policies in the {{alerting-v2}}, configure matchers, Dispatch per, Frequency, and workflow destinations." --- # Create and configure an action policy [create-manage-action-policies] -Action policies are part of the experimental alerting features in Kibana. Rules define what counts as a problem. 
Action policies define what happens when a problem is detected. They determine which episodes generate notifications, how episodes batch for dispatch, **frequency** limits on notifications, and where they're routed. +Action policies are part of the {{alerting-v2}} in Kibana. Rules define what counts as a problem. Action policies define what happens when a problem is detected. They determine which episodes generate notifications, how episodes batch for dispatch, **frequency** limits on notifications, and where they're routed. Because policies are separate from rules and global within a space, you can update notification behavior across many rules at once without touching detection logic, and you can route the same alerts differently depending on severity or source. You create and manage policies from the **Action policies** page, not from the rule form. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md index 0c09fd1231..13840dd62e 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md @@ -5,12 +5,12 @@ applies_to: serverless: preview products: - id: kibana -description: "Enable, disable, snooze, maintenance windows, bulk actions, and API key rotation for action policies in the experimental alerting features." +description: "Enable, disable, snooze, maintenance windows, bulk actions, and API key rotation for action policies in the {{alerting-v2}}." --- # Manage action policies -Managing action policies is part of the experimental alerting features in Kibana. After you create an action policy, you can control when it runs, pause it temporarily, and keep its credentials current. This page covers those management tasks. 
+Managing action policies is part of the {{alerting-v2}} in Kibana. After you create an action policy, you can control when it runs, pause it temporarily, and keep its credentials current. This page covers those management tasks. ## Enable, snooze, and maintenance diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md index eb0f4c3c0b..dd7ee31e1c 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md @@ -5,7 +5,7 @@ applies_to: serverless: preview products: - id: kibana -description: "How experimental alerting features gates notifications: which mechanisms suppress dispatch before action policies run, their scope, and when to use each." +description: "How {{alerting-v2}} gates notifications: which mechanisms suppress dispatch before action policies run, their scope, and when to use each." --- # Notification gating [notification-gating] @@ -51,6 +51,6 @@ For instructions on snoozing and unsnoozing single or multiple episodes, refer t - **[Notifications](../notifications.md):** Set up action policies that control routing, grouping, and throttle after gating. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md index f28ead6b58..3640186e56 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -5,13 +5,13 @@ applies_to: serverless: ga products: - id: kibana -description: "How workflows connect to the experimental alerting features action policies and rule automation, and where to configure them." 
+description: "How workflows connect to the {{alerting-v2}} action policies and rule automation, and where to configure them." --- -# Workflows for the experimental alerting features [workflows] +# Workflows for the {{alerting-v2}} [workflows] -Workflows are part of the experimental alerting features in Kibana. Without a workflow, an action policy has nowhere to send notifications. [Workflows](../../workflows.md) are the delivery layer. They define the actual steps that run when a policy matches an episode: sending a message, calling a webhook, triggering automation, or any combination. Setting up a workflow is what connects the experimental alerting features to the tools your team already uses for incident response. +Workflows are part of the {{alerting-v2}} in Kibana. Without a workflow, an action policy has nowhere to send notifications. [Workflows](../../workflows.md) are the delivery layer. They define the actual steps that run when a policy matches an episode: sending a message, calling a webhook, triggering automation, or any combination. Setting up a workflow is what connects the {{alerting-v2}} to the tools your team already uses for incident response. Before creating an action policy, make sure the workflows you want to use already exist in your space. Policies store references to workflow IDs, so a destination workflow must exist before you can select it. 
From 528009164048e3e128037b622283607fecb86e88 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon Date: Wed, 20 May 2026 23:27:38 -0400 Subject: [PATCH 03/14] fixes to naming --- .../notifications.md | 6 +++--- .../notifications/action-policy-reference.md | 18 ++++++++--------- .../create-configure-action-policy.md | 20 +++++++++---------- .../notifications/manage-action-policies.md | 2 +- .../notifications/notification-gating.md | 8 ++++---- 5 files changed, 27 insertions(+), 27 deletions(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md index 788c71f568..8b8d82b793 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md @@ -8,7 +8,7 @@ products: description: "How {{alerting-v2}} action policies route alert episodes to notifications: matchers, grouping, frequency, and workflow destinations." --- -# {{alerting-v2-cap}} notifications +# Notification routing in {{alerting-v2}} After a rule produces alert episodes, action policies decide what to do about them: who gets notified, how often, and through which channel. @@ -23,8 +23,8 @@ Each policy has four controls: | Control | What it does | | --- | --- | -| Matcher (optional KQL) | Filters which episodes this policy applies to. An empty matcher matches all episodes in the space. | -| Dispatch per (grouping) | Controls how episodes batch into notifications: one per episode, one per notification group (Dispatch per **Group**), or one digest for all. | +| Match conditions (optional KQL) | Filters which episodes this policy applies to. An empty match condition matches all episodes in the space. | +| Notify per (grouping) | Controls how episodes batch into notifications: one per episode, one per notification group (Notify per **Group**), or one digest for all. 
| | Frequency | Controls how often the policy can notify for the same notification group. | | Destinations | One or more workflows to invoke when all conditions are met. | diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md index b663032c9d..82baa4b8ab 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md @@ -5,17 +5,17 @@ applies_to: serverless: preview products: - id: kibana -description: "Grouping modes, frequency options, dispatch outcomes, and matcher field reference for action policies in the {{alerting-v2}}." +description: "Grouping modes, frequency options, dispatch outcomes, and match conditions field reference for action policies in the {{alerting-v2}}." --- # Action policy reference [action-policy-reference] -Action policies are part of the {{alerting-v2}} in Kibana. Use this page when building action policies. Below, you will find details about valid matcher fields, grouping modes and frequency options, dispatch outcome, and more. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). +Action policies are part of the {{alerting-v2}} in {{kib}}. This page is a reference for match conditions fields, grouping modes, frequency options, and dispatch outcomes. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). -## Matcher fields [matcher-fields] +## Match conditions fields [matcher-fields] -Use these fields in the **Matcher** expression to filter which episodes a policy applies to. Combine them with standard KQL operators, for example `data.severity: "critical" AND episode_status: "active"`. 
+Use these fields in the **Match conditions** expression to filter which episodes a policy applies to. Combine them with standard KQL operators, for example `data.severity: "critical" AND episode_status: "active"`. | Field | Description | Example | |---|---|---| @@ -35,7 +35,7 @@ Add both fields to this table with examples. Update the introductory sentence to There is also an open M2 question about whether a severity change mid-episode (de-escalation or escalation) triggers policy re-evaluation. If it does, document the re-evaluation behavior in the frequency options section below, since it interacts with frequency limits.] --> -## Dispatch per options [notification-grouping] +## Notify per options [notification-grouping] Controls how the policy batches matching episodes before sending a notification. @@ -47,7 +47,7 @@ Controls how the policy batches matching episodes before sending a notification. ## Frequency [throttle-strategies] -**Frequency** controls how often the policy fires for a given episode or notification group. The available options depend on the **Dispatch per** setting. Not all options are valid for all modes. +**Frequency** controls how often the policy fires for a given episode or notification group. The available options depend on the **Notify per** setting. Not all options are valid for all modes. | Option | Description | When to use | |---|---|---| @@ -61,7 +61,7 @@ Controls how the policy batches matching episodes before sending a notification. ### Frequency options for Episode [frequency-when-episode-per_episode] -Available frequency options when you set **Dispatch per** to **Episode**. +Available frequency options when you set **Notify per** to **Episode**. | Option | Description | Example | |---|---|---| @@ -71,7 +71,7 @@ Available frequency options when you set **Dispatch per** to **Episode**. ### Frequency options for Group -Available frequency options when you set **Dispatch per** to **Group**. 
+Available frequency options when you set **Notify per** to **Group**. | Option | Description | Example | |---|---|---| @@ -80,7 +80,7 @@ Available frequency options when you set **Dispatch per** to **Group**. ### Frequency options for Digest -Available frequency options when you set **Dispatch per** to **Digest**. +Available frequency options when you set **Notify per** to **Digest**. | Option | Description | Example | |---|---|---| diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md index 6fbb394469..483c067123 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md @@ -5,17 +5,17 @@ applies_to: serverless: preview products: - id: kibana -description: "Create action policies in the {{alerting-v2}}, configure matchers, Dispatch per, Frequency, and workflow destinations." +description: "Create action policies in the {{alerting-v2}}, configure match conditions, Notify per, Frequency, and workflow destinations." --- # Create and configure an action policy [create-manage-action-policies] -Action policies are part of the {{alerting-v2}} in Kibana. Rules define what counts as a problem. Action policies define what happens when a problem is detected. They determine which episodes generate notifications, how episodes batch for dispatch, **frequency** limits on notifications, and where they're routed. +Action policies are part of the {{alerting-v2}} in {{kib}}. This page covers how to configure match conditions, set grouping and frequency, and attach workflow destinations. 
Where rules define what counts as a problem, action policies define what happens when one is detected — which episodes generate notifications, how they batch for dispatch, and where they're routed. Because policies are separate from rules and global within a space, you can update notification behavior across many rules at once without touching detection logic, and you can route the same alerts differently depending on severity or source. You create and manage policies from the **Action policies** page, not from the rule form. -For matcher fields, grouping modes, frequency options, and dispatch outcomes, refer to [Action policy reference](action-policy-reference.md). +For match conditions fields, grouping modes, frequency options, and dispatch outcomes, refer to [Action policy reference](action-policy-reference.md). @@ -38,12 +38,12 @@ Use matchers to route different episodes to different policies, for example, one ### Grouping and frequency [reduce-noise-grouping] -**Dispatch per** controls how episodes batch into notifications. **Frequency** controls how often the policy can notify for each batch. +**Notify per** controls how episodes batch into notifications. **Frequency** controls how often the policy can notify for each batch. :::{table} :widths: 4-4-4 -| Dispatch per | What it does | Available Frequency options | +| Notify per | What it does | Available Frequency options | |---|---|---| | Episode | One notification per episode. | - On status change
- On status change + repeat at interval
- Every evaluation | | Group | Bundle episodes that share a field value. Specify **Group by** (for example `data.service.name` or `data.host.name`). | - At most once every…
- Every evaluation | @@ -51,12 +51,12 @@ Use matchers to route different episodes to different policies, for example, one ::: -For detailed descriptions, frequency options, and examples for each mode, refer to [Dispatch per options](action-policy-reference.md#notification-grouping). +For detailed descriptions, frequency options, and examples for each mode, refer to [Notify per options](action-policy-reference.md#notification-grouping). ### Frequency [throttle] -**Frequency** limits how often the policy can fire for a given notification group. The interval resets from the last time the policy fired, so successive notifications stay at least `interval` apart. Set a duration such as `1h` or `30m`. For available options by **Dispatch per** mode, refer to [Frequency](action-policy-reference.md#throttle-strategies). +**Frequency** limits how often the policy can fire for a given notification group. The interval resets from the last time the policy fired, so successive notifications stay at least `interval` apart. Set a duration such as `1h` or `30m`. For available options by **Notify per** mode, refer to [Frequency](action-policy-reference.md#throttle-strategies). ### Destinations diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md index 13840dd62e..2380f0db3e 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md @@ -10,7 +10,7 @@ description: "Enable, disable, snooze, maintenance windows, bulk actions, and AP # Manage action policies -Managing action policies is part of the {{alerting-v2}} in Kibana. After you create an action policy, you can control when it runs, pause it temporarily, and keep its credentials current. This page covers those management tasks. 
+Action policies are part of the {{alerting-v2}} in {{kib}}. This page covers how to enable and disable policies, snooze them during planned outages, and rotate their API keys. ## Enable, snooze, and maintenance diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md index dd7ee31e1c..c37e671af5 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md @@ -11,14 +11,14 @@ description: "How {{alerting-v2}} gates notifications: which mechanisms suppress # Notification gating [notification-gating] -Notification gating controls whether a matched alert episode triggers a notification. When an episode is gated, the dispatcher stops processing it before any action policy matcher, grouping, throttle, or destination runs. No notification is sent. +Notification gating is part of the {{alerting-v2}} in {{kib}}. Gating controls whether a matched episode triggers a notification. When an episode is gated, the dispatcher stops processing it before any action policy matcher, grouping, or destination runs. This page covers how gating fits in the dispatch cycle and when to use each mechanism: acknowledge, snooze, and deactivate. ## How gating fits in the dispatcher [gating-in-dispatcher] When an episode is eligible for dispatch, the dispatcher evaluates each enabled action policy in order: -1. **Gating:** Is the episode acknowledged, snoozed, or deactivated? If so, stop — no notification is sent. +1. **Gating:** Is the episode acknowledged, snoozed, or deactivated? If so, stop. No notification is sent. 2. **Matcher:** Does the episode match the policy's KQL? If not, skip this policy. 3. **Grouping:** How should matching episodes batch into notification groups? 4. 
**Throttle:** Has a notification already gone out for this group recently? If so, wait. @@ -34,14 +34,14 @@ Three mechanisms let you gate notifications, each at a different scope: | Mechanism | Scope | When to use | |---|---|---| | Acknowledge | Per episode | You're actively investigating a breach and want to silence notifications for it without closing the episode. Clear the acknowledgement when you're done to restore notifications. | -| Snooze | Per series (group) | You want to quiet an entire alert series for a defined period — for example, during a known noisy window for a specific host. Snooze expires automatically at the end of the duration. | +| Snooze | Per series (group) | You want to quiet an entire alert series for a defined period, for example, during a known noisy window for a specific host. Snooze expires automatically at the end of the duration. | | Deactivate | Per episode | You want to manually close an episode that hasn't recovered automatically. Deactivating marks the episode as inactive and stops notifications for it. Unlike acknowledge, this closes the episode rather than silencing it while leaving it active. | Each mechanism is stored as a separate document in `.alert-actions`, so the full gating history for an episode is queryable in Discover. ### Snooze scope -Snooze applies at the group level (by `group_hash`), not per individual episode. When you snooze one episode, every episode sharing the same group — all rows with the same `rule_id` and `group_hash` — is silenced for the duration. Snoozing one row in the alerts table silences the entire series for that rule. +Snooze applies at the group level (by `group_hash`), not per individual episode. When you snooze one episode, every episode sharing the same group (all rows with the same `rule_id` and `group_hash`) is silenced for the duration. Snoozing one row in the alerts table silences the entire series for that rule. 
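
The gating scopes described in this file can be sketched in a few lines. This is a minimal illustration, not the dispatcher's actual code: the `Episode` and `GateState` types and their method names are hypothetical. Only the scoping rule comes from the text above: acknowledge and deactivate apply per episode, while snooze applies to every episode that shares a `rule_id` and `group_hash`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Episode:
    # Hypothetical minimal episode shape; real episodes carry more fields.
    episode_id: str
    rule_id: str
    group_hash: str


@dataclass
class GateState:
    # Per-episode gates, keyed by episode ID.
    acknowledged: set[str] = field(default_factory=set)
    deactivated: set[str] = field(default_factory=set)
    # Per-series gate: (rule_id, group_hash) -> time at which the snooze expires.
    snoozed_until: dict[tuple[str, str], datetime] = field(default_factory=dict)

    def snooze(self, episode: Episode, duration: timedelta, now: datetime) -> None:
        """Snoozing one episode silences its whole series for the duration."""
        self.snoozed_until[(episode.rule_id, episode.group_hash)] = now + duration

    def is_gated(self, episode: Episode, now: datetime) -> bool:
        """Return True if the dispatcher should stop before evaluating policies."""
        if episode.episode_id in self.acknowledged:
            return True
        if episode.episode_id in self.deactivated:
            return True
        expiry = self.snoozed_until.get((episode.rule_id, episode.group_hash))
        # Snooze expires automatically once its window has passed.
        return expiry is not None and now < expiry
```

In this sketch, snoozing `ep-1` also gates any other episode with the same `rule_id` and `group_hash`, while an episode from a different group stays eligible for dispatch.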
## Summary Updates the action policy docs for the experimental alerting features with content from eight doc issues. Following the tech preview principle of accuracy over comprehensiveness, changes focus on stable concepts and system behavior rather than UI details that are subject to change. ### Per-rule action policies ([#6613](https://github.com/elastic/docs-content/issues/6613)) The most significant conceptual change in this batch. The previous docs described all action policies as global within a space. PR #268006 introduced a `global` / `single_rule` type discriminator, making the existing description factually incorrect. **`notifications.md`** — Updated "What is an action policy" to define both policy types, their scoping rules, and when to use each. Updated "How policies apply to rules" to distinguish global and per-rule scoping. Updated "Why policies are separate from rules" to reflect that per-rule policies exist for rule-specific routing. **`create-configure-action-policy.md`** — Added a **Policy type** field section explaining the global/per-rule distinction, the immutability constraint, and when to choose each type. Updated the opening paragraph and the match conditions scope description to reflect both types. ### Policy tags ([#6616](https://github.com/elastic/docs-content/issues/6616)) PR #261008 added a `tags` field to notification policies across the full stack. **`create-configure-action-policy.md`** — Added a **Tags** field section explaining that policy tags are organizational labels distinct from rule tags and do not affect routing behavior. ### Quick filter KQL fields ([#6608](https://github.com/elastic/docs-content/issues/6608)) PR #261601 added Rule, Status, and Tags quick filter pickers to the match conditions form. The pickers generate `rule.id`, `episode_status`, and `rule.tags` KQL respectively. **`action-policy-reference.md`** — Added `rule.tags` to the match conditions field reference table. 
This field was missing but is now surfaced by the quick filters and used in examples across the docs. **`create-configure-action-policy.md`** — Updated the match conditions example from `data.severity` to `rule.tags: "payment-service"` to reflect the fields that quick filters generate. ### Maintenance window suppression ([#6416](https://github.com/elastic/docs-content/issues/6619)) PR #267771 added backend suppression logic for maintenance windows in the dispatcher. **`action-policy-reference.md`** and **`notifications.md`** — Updated the `suppressed` dispatch outcome description to include maintenance windows as a cause alongside acknowledge, snooze, and deactivate. ### View policy details ([#6419](https://github.com/elastic/docs-content/issues/6619)) PR #264497 added a details flyout to the Action policies list page. **`manage-action-policies.md`** — Added a "View policy details" section describing that a policy's full configuration and all per-policy actions are accessible from the list page. UI mechanics (flyout vs. dedicated page) are omitted as the IA is not yet stable. ### Policy execution history ([#6501](https://github.com/elastic/docs-content/issues/6501)) PR #266775 added a Policy Execution History UI and API. **`manage-action-policies.md`** — Added an "Execution history" section describing how to query dispatch outcomes per policy using the `.alert-actions` data stream. The section points to the existing dispatch outcomes reference rather than documenting the UI, which is subject to change. ### Out of scope for this PR - **[#6498](https://github.com/elastic/docs-content/issues/6498)** (optional `matcher` query parameter on the data fields suggestions endpoint): API reference detail with no conceptual home yet. - **[#6617](https://github.com/elastic/docs-content/issues/6617)** (grouping modes and frequency strategies redesign): The existing reference and create-configure pages already cover this content accurately. 
No new stable concepts to add at the tech preview stage. ## Generative AI disclosure 1. Did you use a generative AI (GenAI) tool to assist in creating this contribution? - [x] Yes - Cursor + Claude - [ ] No --- .../notifications.md | 27 +++++++++++-------- .../notifications/action-policy-reference.md | 7 ++--- .../create-configure-action-policy.md | 27 ++++++++++++------- .../notifications/manage-action-policies.md | 17 ++++++++---- .../workflows-alerting.md | 6 +---- 5 files changed, 51 insertions(+), 33 deletions(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md index 8b8d82b793..ee345ad611 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md @@ -17,20 +17,25 @@ This page explains how action policies work. For creating and configuring them s ## What is an action policy [action-policies] -An action policy is a saved object in your space that controls notification routing. It's not attached to a rule. It's global within the space, so when an episode is produced, the system evaluates all enabled policies in the space that are not snoozed and decides which ones apply. +An action policy is a saved object in your space that controls notification routing. Policies can be **global** or **per-rule**: + +- **Global** policies apply to all episodes in the space, from any rule. When an episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. +- **Per-rule** policies are scoped to a single rule. They apply only to episodes produced by that specific rule. Use a per-rule policy when routing is specific to one rule and you don't want it to affect other rules in the space. The rule association is set at creation and cannot be changed. 
Each policy has four controls: | Control | What it does | | --- | --- | -| Match conditions (optional KQL) | Filters which episodes this policy applies to. An empty match condition matches all episodes in the space. | -| Notify per (grouping) | Controls how episodes batch into notifications: one per episode, one per notification group (Notify per **Group**), or one digest for all. | +| Match conditions (optional KQL) | Filters which episodes this policy applies to. An empty match condition matches all episodes covered by the policy's scope. | +| Notify per | Controls how episodes batch into notifications: one per episode, one per notification group using **Group** mode, or one digest for all. | | Frequency | Controls how often the policy can notify for the same notification group. | | Destinations | One or more workflows to invoke when all conditions are met. | ## How policies apply to rules -Action policies don't reference rules directly. You scope a policy using KQL over episode and rule fields, for example `rule.labels: "checkout"` or `data.severity: "critical"`. A policy applies to every matching episode in the space, from any rule. +**Global policies** don't reference rules directly. You scope them using KQL over episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching episode in the space, from any rule. + +**Per-rule policies** are bound to a specific rule at creation. They apply only to episodes from that rule, and you can still use match conditions to filter further within that rule's episodes. Multiple policies can match the same episode, and each runs independently. There's no precedence or merging between them. If no policy matches an episode, no notification is sent. This is intentional. @@ -58,20 +63,20 @@ Each notification attempt results in one of the following outcomes. | --- | --- | | `dispatched` | A notification was sent. 
| | `throttled` | Dispatch was suppressed because the **frequency** interval had not elapsed. | -| `suppressed` | The episode was suppressed before dispatch (acknowledged, snoozed, or deactivated). | -| `unmatched` | No policy matched this episode; no workflow ran. | +| `suppressed` | Dispatch was blocked before the notification went out. The episode was acknowledged, snoozed, or deactivated, or the space is currently in a maintenance window. | +| `unmatched` | No policy matched this episode and no workflow ran. | | `error` | Processing failed. Check {{kib}} logs. | You can query these outcomes in Discover through the `.alert-actions` data stream. ## Why policies are separate from rules -Rules don't own policies. A rule can't say "notify team X when it fires." Instead, team X creates an action policy that uses a KQL matcher to pick up matching episodes. - -This design means: -- One policy can cover episodes from many rules. +Policies are independent of rules, which means: +- One global policy can cover episodes from many rules. For example, a policy matching `data.severity: "critical"` applies regardless of which rule produced the episode. - You can update routing without touching any rule. -- Rules can be created without any notification policy, which is useful for testing. +- Rules can be created without any action policy, which is useful for testing. + +When you do need routing that's specific to one rule, create a per-rule policy and bind it to that rule at creation. When you're ready to route notifications, go to [Create and configure an action policy](notifications/create-configure-action-policy.md). 
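
The dispatch outcomes in the `notifications.md` table can be illustrated with a small model. This is a hedged sketch under stated assumptions, not {{kib}}'s implementation: `Policy`, `evaluate`, and `parse_interval` are hypothetical names, a plain Python callable stands in for the KQL match conditions, the `<int><s|m|h|d>` duration format is assumed from examples such as `1h` and `30m`, and the `error` outcome is not modeled.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_interval(text: str) -> timedelta:
    """Parse a duration such as '30m' or '1h' (assumed format, see lead-in)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported interval: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


@dataclass
class Policy:
    policy_id: str
    matches: Callable[[dict], bool]  # stand-in for KQL match conditions
    interval: Optional[str] = None   # None means no frequency limit
    last_fired: dict = field(default_factory=dict)  # group key -> datetime


def evaluate(episode: dict, policies: list, gated: bool, now: datetime) -> dict:
    """Return {policy_id: outcome}; the key '*' means no policy was reached."""
    if gated:
        # Acknowledged, snoozed, deactivated, or inside a maintenance window.
        return {"*": "suppressed"}
    outcomes = {}
    for policy in policies:
        if not policy.matches(episode):
            continue  # each policy is evaluated independently
        key = episode["group"]
        last = policy.last_fired.get(key)
        if policy.interval and last is not None and now - last < parse_interval(policy.interval):
            outcomes[policy.policy_id] = "throttled"
        else:
            policy.last_fired[key] = now  # the interval restarts from this dispatch
            outcomes[policy.policy_id] = "dispatched"
    return outcomes or {"*": "unmatched"}
```

Because each policy keeps its own `last_fired` state, two policies that match the same episode can produce different outcomes in the same cycle, which mirrors the statement that there is no precedence or merging between policies.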
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md index 82baa4b8ab..7272d36c94 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md @@ -23,6 +23,7 @@ Use these fields in the **Match conditions** expression to filter which episodes | `data.*` | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. | `data.severity: "critical"` or `data.host.name: "web-01"` | | `rule.id` | Unique identifier for the rule that generated the episode. | `rule.id: "rule-001"` | | `rule.name` | Display name of the rule. | `rule.name: "High CPU"` | +| `rule.tags` | Tags attached to the rule. Use to match episodes from rules with a specific tag. | `rule.tags: "payment-service"` | | `rule.labels` | Key-value labels attached to the rule. Use dot notation to target a specific label key. | `rule.labels.env: "production"` | @@ -46,7 +59,7 @@ Use match conditions to route different episodes to different policies, for exam | Notify per | What it does | Available Frequency options | |---|---|---| | Episode | One notification per episode. | - On status change
- On status change + repeat at interval
- Every evaluation | -| Group | Bundle episodes that share a field value. Specify **Group by** (for example `data.service.name` or `data.host.name`). | - At most once every…
- Every evaluation | +| Group | Bundle episodes that share a field value. Specify a **Group by** field such as `data.service.name` or `data.host.name`. | - At most once every…
- Every evaluation | | Digest | One notification for all matching episodes combined. | Every evaluation | ::: @@ -61,7 +74,3 @@ For detailed descriptions, frequency options, and examples for each mode, refer ### Destinations One or more workflows to invoke when the policy matches. Use the search field to find and attach workflows. - -### Snooze - -An optional time window during which the policy doesn't dispatch. Useful for planned maintenance or quiet periods without disabling the policy entirely. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md index 2380f0db3e..4039405c58 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md @@ -5,21 +5,28 @@ applies_to: serverless: preview products: - id: kibana -description: "Enable, disable, snooze, maintenance windows, bulk actions, and API key rotation for action policies in the {{alerting-v2}}." +description: "View policy details, enable, disable, snooze, review execution history, and rotate API keys for action policies in the {{alerting-v2}}." --- # Manage action policies -Action policies are part of the {{alerting-v2}} in {{kib}}. This page covers how to enable and disable policies, snooze them during planned outages, and rotate their API keys. +Action policies are part of the {{alerting-v2}} in {{kib}}. This page covers how to view policy details, enable and disable policies, snooze them during planned outages, rotate their API keys, and review execution history. -## Enable, snooze, and maintenance +## View policy details + +From the **Action policies** list, you can open a policy to see its full configuration, including match conditions, grouping mode, frequency, and destinations. 
You can also edit, clone, delete, enable, disable, snooze, or update its API key without leaving the list page. + +## Execution history + +The dispatcher records the outcome of every notification attempt for each policy. To investigate delivery issues or audit which policies ran for an episode, query the `.alert-actions` data stream in Discover and filter by `outcome` or `policy_id`. For a description of each outcome, refer to [Dispatch outcomes](action-policy-reference.md#dispatch-outcomes). + +## Enable and snooze You can disable a policy so it is not evaluated for new episodes. You can snooze a policy for a defined window so that it does not dispatch notifications during that period. Policies that are not enabled or are snoozed are skipped when the dispatcher evaluates policies. ### Maintenance windows [maintenance-windows] - -Maintenance windows are scheduled periods during which a policy does not dispatch notifications. They are configured on the action policy alongside snooze and other policy controls, not on the rule. Rule evaluation continues and alert episodes can still be recorded in `.rule-events`. Only dispatch through that policy pauses. Use maintenance windows for planned deployments, infrastructure changes, or recurring quiet periods. +During a [maintenance window](../../alerts/maintenance-windows.md), action policies stop dispatching notifications automatically. No policy configuration is required. Rule evaluation continues and alert episodes are still recorded in `.rule-events`. Maintenance windows are configured separately, not on the action policy. 
## Update API keys diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md index 3640186e56..f1e1141f2e 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -13,11 +13,7 @@ description: "How workflows connect to the {{alerting-v2}} action policies and r Workflows are part of the {{alerting-v2}} in Kibana. Without a workflow, an action policy has nowhere to send notifications. [Workflows](../../workflows.md) are the delivery layer. They define the actual steps that run when a policy matches an episode: sending a message, calling a webhook, triggering automation, or any combination. Setting up a workflow is what connects the {{alerting-v2}} to the tools your team already uses for incident response. -Before creating an action policy, make sure the workflows you want to use already exist in your space. Policies store references to workflow IDs, so a destination workflow must exist before you can select it. - -::::{note} -Only manual triggers are supported for workflows used with action policies. -:::: +Before creating an action policy, make sure the workflows you want to use already exist in your space. Policies store references to workflow IDs, so a destination workflow must exist before you can select it. 
## Runtime execution order [runtime-execution-order] From c4ce673f86668d866a76b43403c36b7a367b71fe Mon Sep 17 00:00:00 2001 From: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com> Date: Wed, 27 May 2026 11:58:41 -0400 Subject: [PATCH 05/14] Update explore-analyze/alerting/kibana-alerting-experimental/notifications.md Co-authored-by: Kevin Delemme --- .../alerting/kibana-alerting-experimental/notifications.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md index ee345ad611..cfc786dc5d 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md @@ -42,7 +42,7 @@ Multiple policies can match the same episode, and each runs independently. There ## How action policies are evaluated [how-action-policies-evaluated] -{{kib}} runs a background process called the dispatcher that checks for eligible episodes on a short interval (around 10 seconds) and evaluates action policies against them. The dispatcher is separate from the rule schedule. Rules write events on their own cadence, and the dispatcher picks them up asynchronously. +{{kib}} runs a background process called the dispatcher that checks for eligible episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher is separate from the rule schedule. Rules write events on their own cadence, and the dispatcher picks them up asynchronously. 
For each enabled policy that is not snoozed, the dispatcher works through the following steps: From 6392ee1c39cba8c00853018f1fa835f5d8a12321 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon Date: Thu, 28 May 2026 16:23:43 -0400 Subject: [PATCH 06/14] Significant revisions --- docset.yml | 4 +- .../action-policy-reference.md | 109 ++++++++++++++++++ .../create-configure-action-policy.md | 80 +++++++++++++ .../action-policies/manage-action-policies.md | 56 +++++++++ .../reduce-notification-noise.md | 43 +++++++ .../notifications-actions.md | 81 +++++++++++++ .../notifications.md | 91 ++++++++------- .../notifications/action-policy-reference.md | 100 ---------------- .../create-configure-action-policy.md | 76 ------------ .../notifications/manage-action-policies.md | 41 ------- .../notifications/notification-gating.md | 56 --------- .../workflows-alerting.md | 37 +++--- explore-analyze/alerting/watcher/actions.md | 2 +- explore-analyze/toc.yml | 10 +- 14 files changed, 443 insertions(+), 343 deletions(-) create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md delete mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md delete mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md delete mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md delete mode 100644 
explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md diff --git a/docset.yml b/docset.yml index 338395a798..c4b5aa4f90 100644 --- a/docset.yml +++ b/docset.yml @@ -124,8 +124,8 @@ subs: ls-pipelines-app: "Logstash Pipelines" maint-windows-app: "Maintenance Windows" maint-windows-cap: "Maintenance windows" - alerting-v2: "experimental alerting features" - alerting-v2-cap: "Experimental alerting features" + alerting-v2-system: "experimental alerting system" + alerting-v2-system-cap: "Experimental alerting system" custom-roles-app: "Custom Roles" data-source: "data view" data-sources: "data views" diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md new file mode 100644 index 0000000000..1801f4f47e --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md @@ -0,0 +1,109 @@ +--- +navigation_title: Action policy reference +applies_to: + stack: preview + serverless: preview +products: + - id: kibana +description: "Grouping modes, frequency options, dispatch outcomes, and match conditions field reference for action policies in the {{alerting-v2-system}}." +--- + +# Action policy reference for the {{alerting-v2-system}} [action-policy-reference] + +Action policies are part of the {{alerting-v2-system}} in {{kib}}. This page is a reference for match conditions fields, grouping modes, frequency options, and dispatch outcomes. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). + +## Match conditions fields [matcher-fields] + +Use these fields in the **Match conditions** expression to filter which alert episodes a policy applies to. 
Combine them with standard [KQL](../../../query-filter/languages/kql.md) operators, for example `data.severity: "critical" AND episode_status: "active"`. + +| Field | Type | Description | Accepted values | Example | +|---|---|---|---|---| +| `episode_id` | string | Unique identifier of the alert episode. | Any string | `episode_id: "ep-001"` | +| `episode_status` | string | Current lifecycle status of the alert episode. | `inactive`, `pending`, `active`, `recovering` | `episode_status: "active"` | +| `group_hash` | string | Stable hash identifying the alert series the alert episode belongs to. | Any string | `group_hash: "abc123"` | +| `last_event_timestamp` | string | ISO 8601 timestamp of the most recent event recorded for the alert episode. | ISO 8601 timestamp | `last_event_timestamp > "2026-01-01"` | +| `rule.id` | string | Unique identifier of the rule that generated the alert episode. | Any string | `rule.id: "rule-001"` | +| `rule.name` | string | Display name of the rule. | Any string | `rule.name: "High CPU"` | +| `rule.description` | string | Description of the rule. | Any string | `rule.description: *checkout*` | +| `rule.tags` | string[] | Tags attached to the rule. Use to match alert episodes from rules with a specific tag. | Any string | `rule.tags: "payment-service"` | +| `rule.enabled` | boolean | Whether the rule is currently enabled. | `true`, `false` | `rule.enabled: true` | +| `data.*` | object | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. | Depends on rule type | `data.severity: "critical"` | + + + +## Notify per options [notification-grouping] + +Controls how the policy batches matching episodes before sending a notification. + +| Option | Description | When to use | +|---|---|---| +| Episode | The policy sends one notification per alert episode, independently of other episodes. Default selection. | You need per-issue visibility and want to handle each problem separately. 
| +| Group | The policy bundles alert episodes that share the same value for a specified `data.*` field into one notification per unique value. Each unique value forms a **notification group**. | A rule produces many related alert episodes, such as one per service or host, and you want to reduce noise by batching them into shared notifications. | +| Digest | The policy combines all matching alert episodes into a single notification, regardless of what they have in common. | You want a single periodic summary of everything that matched, rather than individual alert episodes. | + +## Frequency [throttle-strategies] + +Frequency controls how often the policy fires for a given alert episode or notification group. The available options depend on the **Notify per** setting. Not all options are valid for all modes. + +| Option | Description | When to use | +|---|---|---| +| On status change | Notifies when the alert episode status changes, for example from active to recovering. One notification per transition. | You only need to know when something breaks and when it's resolved. Use this when you trust your ticketing or incident workflow to track ongoing issues. | +| On status change + repeat at interval | Notifies on status change, then resends notifications at a regular interval while the alert episode remains in the same status. | You want status change notifications plus periodic reminders that a problem is still unresolved, in case it has been missed or pushed aside. | +| At most once every… | Caps notifications at one per alert episode or notification group within the chosen interval, regardless of rule frequency. | You want to limit notification volume for noisy rules without missing new or ongoing issues. | +| Every evaluation | Notifies on every rule evaluation. Can be noisy. Use sparingly and only with infrequent rule schedules. | You need a full audit trail of every evaluation, or the rule runs infrequently enough that noise isn't a concern. 
| + + + +### Frequency options for Episode [frequency-when-episode-per_episode] + +Available frequency options when you set **Notify per** to **Episode**. + +| Option | Description | Example | +|---|---|---| +| On status change | Notifies once when the alert episode opens and once when it recovers. No repeat notifications while it remains active. | A host goes down at 9:00am → one notification. Recovers at 11:00am → one notification. No notifications between them. | +| On status change + repeat at interval | Same as On status change, but also sends a reminder at a set interval while the alert episode is still active. | A host goes down at 9:00am → notification. With a 1h repeat: reminder at 10:00am, 11:00am. Recovers at 11:30am → notification. | +| Every evaluation | Fires on every rule evaluation, regardless of status. Can be noisy on frequent rule schedules. Avoid in production. | A rule running every 5 minutes with one active alert episode produces up to 288 notifications per day. | + +### Frequency options for Group + +Available frequency options when you set **Notify per** to **Group**. + +| Option | Description | Example | +|---|---|---| +| At most once every… | Limits how often each notification group can notify, regardless of how many alert episodes match or how often the rule runs. | 10 alert episodes share `data.host.name: "web-01"`. With a 1h limit, you get at most one notification per hour for that notification group. | +| Every evaluation | Fires on every rule evaluation for each unique value in the group-by field. Still noisy on frequent rule schedules. | A rule running every 10 minutes with 5 unique host values produces up to 6 notifications per host per hour. | + +### Frequency options for Digest + +Available frequency options when you set **Notify per** to **Digest**. 
+ +| Option | Description | Example | +|---|---|---| +| At most once every… (default) | Caps digest delivery to at most one bundled summary within the chosen interval, regardless of how often the rule runs. | A rule running every 5 minutes with a 1h digest interval sends one bundled summary per hour containing all matching alert episodes from that period. | +| Every evaluation | Fires on every rule run, bundling all matching alert episodes into one message. Can be noisy on frequent rule schedules. | A rule running every 30 minutes with 20 matching alert episodes produces one summary every 30 minutes containing all 20. | + +## Dispatch outcomes + +The dispatcher records each run with one of the following outcomes. To investigate delivery issues, open Discover, query the `.alert-actions` index, and filter by the `action_type` field. + +| Outcome | What happened | +|---|---| +| `dispatched` | The dispatcher invoked a workflow for the alert episode. | +| `throttled` | The alert episode matched a policy but was rate-limited by the frequency setting. No workflow ran. This is expected behavior, not an error. | +| `suppressed` | Dispatch was blocked. The alert episode was acknowledged, snoozed, or deactivated, or the space is currently in a [maintenance window](../../alerts/maintenance-windows.md). | +| `unmatched` | No action policy matched the alert episode. No workflow ran. | + +## Related pages + +- [Create and configure an action policy](create-configure-action-policy.md) to apply these fields and options when setting up a policy. +- [Manage action policies in {{alerting-v2-system}}](manage-action-policies.md) to enable, disable, snooze, or audit your policies. +- [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md) to understand how action policies evaluate and gate alert episodes. 
\ No newline at end of file diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md new file mode 100644 index 0000000000..082558e2a6 --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md @@ -0,0 +1,80 @@ +--- +navigation_title: Create an action policy +applies_to: + stack: preview + serverless: preview +products: + - id: kibana +description: "Create action policies in the {{alerting-v2-system}}, configure match conditions, Notify per, Frequency, and workflow destinations." +--- + +# Create an action policy for the {{alerting-v2-system}} [create-manage-action-policies] + + +Action policies are part of the {{alerting-v2-system}} in {{kib}}. This page covers how to configure policy type, match conditions, grouping, frequency, and workflow destinations. Where rules define what counts as a problem, action policies define what happens when one is detected: which alert episodes generate notifications, how they batch for dispatch, and where they're routed. + +Because policies are separate from rules, you can update notification behavior across many rules at once without touching detection logic, and you can route the same alert episodes differently depending on severity or source. You create and manage policies from the **Action policies** page, not from the rule form. + +For match conditions fields, grouping modes, frequency options, and dispatch outcomes, refer to [Action policy reference](action-policy-reference.md). + + + +## Policy type [policy-type] + +An action policy can be global or per-rule: + +- **Global**: Global policies apply to any alert episode in the space. Use a global policy when you want to route alert episodes from multiple rules. 
For example, a policy matching `rule.tags: "checkout"` applies to every rule with that tag. +- **Per-rule**: Per-rule policies are scoped to a single rule. Use a per-rule policy when notification routing is specific to one rule and you don't want it to affect other rules in the space. + +The policy type is set at creation and cannot be changed. If you need a different type, create a new policy. + +## Tags [policy-tags] + +Optional string labels you assign to a policy to categorize it or filter it on the Action policies list. Unlike rule tags, policy tags describe the policy itself rather than the alerts it matches. You can add, edit, or remove tags at any time without affecting routing behavior. + +## Match conditions [matcher] + + +An optional [KQL](../../../query-filter/languages/kql.md) expression that filters which alert episodes this policy applies to. An empty match condition matches every alert episode covered by the policy's scope. For a global policy, that means all alert episodes in the space. For a per-rule policy, it means all alert episodes from the associated rule. + +Use match conditions to route different alert episodes to different policies, for example, one policy for `rule.tags: "payment-service"` alert episodes routed to PagerDuty and another for warnings routed to Slack. For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). + + + +## Grouping and frequency [reduce-noise-grouping] + + +**Notify per** controls how alert episodes batch into notifications. **Frequency** controls how often the policy can notify for each batch. + +:::{table} +:widths: 4-4-4 + +| Notify per | What it does | Available Frequency options | +|---|---|---| +| Episode | One notification per alert episode. | - On status change
- On status change + repeat at interval
- Every evaluation | +| Group | Bundle alert episodes that share a field value. Specify a **Group by** field such as `data.service.name` or `data.host.name`. | - At most once every…
- Every evaluation | +| Digest | One notification for all matching alert episodes combined. | - At most once every… (default)
- Every evaluation | + +::: + +For detailed descriptions, frequency options, and examples for each mode, refer to [Notify per options](action-policy-reference.md#notification-grouping). + +## Frequency [throttle] + + +**Frequency** limits how often the policy can fire for a given notification group. The interval resets from the last time the policy fired, so successive notifications stay at least `interval` apart. Set a duration such as `1h` or `30m`. For available options by **Notify per** mode, refer to [Frequency](action-policy-reference.md#throttle-strategies). + +## Destinations + +One or more workflows to invoke when the policy matches. Use the search field to find and attach workflows. + +## Related pages + +- [Manage action policies in {{alerting-v2-system}}](manage-action-policies.md) to view, enable, disable, or snooze the policies you create. +- [Action policy reference in {{alerting-v2-system}}](action-policy-reference.md) to look up match condition fields, grouping modes, and frequency options. +- [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md) to understand how action policies evaluate and gate alert episodes. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md new file mode 100644 index 0000000000..fc47ce612b --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md @@ -0,0 +1,56 @@ +--- +navigation_title: Manage action policies +applies_to: + stack: preview + serverless: preview +products: + - id: kibana +description: "View policy details, enable, disable, snooze, review execution history, and rotate API keys for action policies in the {{alerting-v2-system}}." +--- + +# Manage action policies for the {{alerting-v2-system}} + +Action policies are part of the {{alerting-v2-system}} in {{kib}}. 
This page covers how to view policy details, enable and disable policies, snooze them during planned outages, rotate their API keys, and review execution history. + +## View policy details + +From the **Action policies** list, you can open a policy to see its full configuration, including match conditions, grouping mode, frequency, and destinations. You can also edit, clone, delete, enable, disable, snooze, or update its API key without leaving the list page. + +## Execution history + +After each dispatcher run, {{kib}} records the outcome in the `.alert-actions` index. These records let you audit whether workflows were invoked, skipped, or had no matching policy for each alert episode. + +| Outcome | What it means | +|---|---| +| `dispatched` | The dispatcher invoked a workflow for the alert episode. | +| `throttled` | The alert episode matched a policy but was rate-limited by the frequency setting. No workflow ran. | +| `suppressed` | Dispatch was blocked. The alert episode was acknowledged, snoozed, or deactivated, or the space is currently in a [maintenance window](../../alerts/maintenance-windows.md). | +| `unmatched` | No action policy matched the alert episode. No workflow ran. | + +To investigate delivery issues or audit which policies ran for an alert episode, open Discover and query the `.alert-actions` index. Filter by `action_type` to narrow by outcome, or by `policy_id` to filter by policy. + +## Enable and snooze + +You can disable a policy so it is not evaluated for new alert episodes. You can snooze a policy for a defined window so that it does not dispatch notifications during that period. Policies that are not enabled or are snoozed are skipped when the dispatcher evaluates policies. + +### Maintenance windows [maintenance-windows] + +During a [maintenance window](../../alerts/maintenance-windows.md), action policies stop dispatching notifications automatically. No policy configuration is required. 
Rule evaluation continues and alert episodes are still recorded in `.rule-events`. Maintenance windows are configured separately, not on the action policy. + +## Update API keys + +You can rotate the API key used to run a policy's workflows without changing matchers or destinations. Use the **Update API key** action on one policy or for multiple selected policies. + +::::{important} Production considerations +When you update or delete an action policy, previous API keys used for execution are marked for invalidation and removed on a schedule managed by {{kib}}. Allow for a short delay before new keys are used for dispatch. +:::: + +## Bulk actions + +On the action policies list, select one or more policies to enable, disable, or snooze them in bulk, or to update their API keys. **Select all** selects every policy on the current page of results. Clear the selection before changing filters if you need a different set. + +## Related pages + +- [Reduce notification noise in {{alerting-v2-system}}](reduce-notification-noise.md) to silence individual alert episodes using acknowledge, snooze, or deactivate. +- [Action policy reference in {{alerting-v2-system}}](action-policy-reference.md) to look up match condition fields, grouping modes, and frequency options. +- [Create and configure an action policy](create-configure-action-policy.md) to set up or update the policies you manage here. 
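To make the audit described under "Execution history" more concrete, the following Python sketch tallies outcomes per policy from records that have already been exported from `.alert-actions`, for example from Discover. The record shape is an assumption: only the `action_type` and `policy_id` field names come from this page, while the sample values, the placeholder key, and the `tally_outcomes` helper are hypothetical and exist only for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def tally_outcomes(records: Iterable[Mapping[str, str]]) -> Dict[str, Counter]:
    """Count outcomes (the `action_type` field) per `policy_id`."""
    per_policy: Dict[str, Counter] = defaultdict(Counter)
    for record in records:
        # Assumption: some records (such as `unmatched`) may carry no policy_id,
        # so they are grouped under a placeholder key.
        policy = record.get("policy_id", "(no policy)")
        per_policy[policy][record["action_type"]] += 1
    return dict(per_policy)


# Hypothetical sample records; real documents will carry more fields.
sample = [
    {"policy_id": "policy-a", "action_type": "dispatched"},
    {"policy_id": "policy-a", "action_type": "throttled"},
    {"policy_id": "policy-a", "action_type": "throttled"},
    {"action_type": "unmatched"},
]

summary = tally_outcomes(sample)
for policy_id, counts in summary.items():
    print(policy_id, dict(counts))
```

A high `throttled` count for a policy is expected behavior under a frequency limit, so a tally like this is mainly useful for spotting policies that never reach `dispatched`.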
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md new file mode 100644 index 0000000000..a79b333dab --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md @@ -0,0 +1,43 @@ +--- +navigation_title: Reduce notification noise +applies_to: + stack: preview + serverless: preview +products: + - id: kibana +description: "How to reduce notification noise in the {{alerting-v2-system}} using acknowledge, snooze, and deactivate to silence alert episodes." +--- + +# Reduce notification noise for the {{alerting-v2-system}} [reduce-notification-noise] + +Acknowledge, snooze, and deactivate are part of the {{alerting-v2-system}} in {{kib}}. Each one silences notifications for an alert episode at a different scope. When an alert episode is silenced, the dispatcher stops processing it before any action policy matching, grouping, or frequency evaluation runs. For an overview of where this fits in the full dispatch cycle, refer to [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md). + +## Silencing mechanisms [silencing-mechanisms] + +Three mechanisms let you silence notifications, each at a different scope: + +| Mechanism | Scope | When to use | +|---|---|---| +| Acknowledge | Per alert episode | You're actively investigating a breach and want to silence notifications for it without closing the alert episode. Clear the acknowledgment when you're done to restore notifications. | +| Snooze | Per series (group) | You want to quiet an entire alert series for a defined period, for example, during a known noisy window for a specific host. Snooze expires automatically at the end of the duration. | +| Deactivate | Per alert episode | You want to manually close an alert episode that hasn't recovered automatically. 
Deactivating marks the alert episode as inactive and stops notifications for it. Unlike acknowledge, this closes the alert episode rather than silencing it while leaving it active. | + +Each mechanism is stored as a separate document in `.alert-actions`, so the full gating history for an episode is queryable in Discover. + +### Snooze scope + +Snooze applies at the group level (by `group_hash`), not per individual alert episode. When you snooze one alert episode, every alert episode sharing the same group (all rows with the same `rule_id` and `group_hash`) is silenced for the duration. Snoozing one row in the alerts table silences the entire series for that rule. + + + +## Related pages + + +- [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md) to learn how action policies route and throttle alert episodes after silencing. +- [Create and configure an action policy](create-configure-action-policy.md) to set up the policies that run after gating checks pass. +- [Action policy reference](action-policy-reference.md) to look up match condition fields, grouping modes, and frequency options. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md new file mode 100644 index 0000000000..1c3cd98838 --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md @@ -0,0 +1,81 @@ +--- +navigation_title: Notifications and actions +applies_to: + stack: preview + serverless: preview +products: + - id: kibana +description: "How {{alerting-v2-system}} action policies route alert episodes to notifications and actions." +--- + +# Notifications and actions for the {{alerting-v2-system}} + +Action policies are part of the {{alerting-v2-system}} in {{kib}}. After a rule produces alert episodes, action policies decide whether and when to invoke workflows. 
Workflows are what actually send the notification or run the automation. + +This page explains how action policies work. For creating and configuring them step by step, refer to [Create and configure an action policy](action-policies/create-configure-action-policy.md). + +## What is an action policy [action-policies] + +An action policy is the gating layer between an alert episode and a workflow. It decides whether and when to invoke a workflow by running the alert episode through a sequence of gates. A workflow runs only if the alert episode clears each gate in sequence. + +The three gates are suppression, match conditions, and frequency: + +* **Suppression**: Suppression checks whether the alert episode should be silenced. Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](action-policies/reduce-notification-noise.md). +* **Match conditions**: Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. +* **Frequency**: Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. If a workflow was already invoked within the cooldown period, the episode waits. + +If any gate stops the episode, the workflow is not invoked for that policy. + +:::{note} +Because each action policy evaluates alert episodes independently, an episode that is blocked by one policy can still trigger a workflow through a second policy with different conditions. +::: + +## Policy types [policy-types] + +Policies can be global or per-rule. 
Global policies apply across all rules in a space and suit most use cases. Per-rule policies apply to a single rule and give you precise control over routing for that rule without affecting anything else in the space. + +### Global policies + +A global policy applies to all alert episodes in the space, from any rule. When an alert episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. + +### Per-rule policies + +A per-rule policy is scoped to a single rule. It applies only to alert episodes produced by that specific rule. Use a per-rule policy when routing is specific to one rule and you do not want it to affect other rules in the space. The rule association is set at creation and cannot be changed. + +## How policies apply to rules + +How a policy applies depends on whether it is global or per-rule. Multiple policies can match the same alert episode, and each runs independently. There is no precedence or merging between them. If no policy matches an alert episode, no workflow is invoked and no notification is sent. + +### Global policy application + +Global policies don't reference rules directly. You scope them using KQL over alert episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching alert episode in the space, from any rule. + +### Per-rule policy application + +Per-rule policies are bound to a specific rule at creation. They apply only to alert episodes from that rule, and you can still use match conditions to filter further within that rule's alert episodes. + +## How action policies are evaluated [how-action-policies-evaluated] + +{{kib}} runs a background process called the dispatcher that checks for eligible alert episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher runs on its own cycle, separate from the rule schedule. 
+ +For each enabled policy that is not snoozed, the dispatcher works through the following steps. + +1. **Gating:** Is the alert episode acknowledged, snoozed, or deactivated? If so, skip. Refer to [Reduce notification noise](action-policies/reduce-notification-noise.md) to learn more. +2. **Matcher:** Does the alert episode match the policy's KQL? If not, skip this policy. +3. **Grouping:** How should matching alert episodes batch into notification groups? +4. **Frequency:** Has a workflow already been invoked for this notification group recently? If so, wait. +5. **Destinations:** Invoke the configured workflows. + +Workflow invocations may not happen immediately after a rule evaluates. + +## Why policies are separate from rules + +Policies are independent of rules. A single global policy can cover alert episodes from many rules, so a policy matching `data.severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. + +When you do need routing that is specific to one rule, create a per-rule policy and bind it to that rule at creation. + +## Next steps + +- [Create and configure an action policy](action-policies/create-configure-action-policy.md) to set up policy type, match conditions, grouping, frequency, and workflow destinations. +- [Manage action policies](action-policies/manage-action-policies.md) to enable, disable, snooze, edit, or delete the policies in your space. +- [Action policy reference](action-policies/action-policy-reference.md) to look up available match condition fields, grouping modes, and frequency options. 
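The evaluation order described under "How action policies are evaluated" can be sketched in Python. This is a minimal illustration under stated assumptions, not the {{kib}} implementation: the `Episode` and `Policy` classes, the `matches` callable standing in for the KQL matcher, the in-memory `last_fired` map, and the `skipped` label are all hypothetical. Grouping is left out (each episode is its own notification group) to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Episode:
    episode_id: str
    data: Dict[str, str]
    silenced: bool = False  # acknowledged, snoozed, or deactivated


@dataclass
class Policy:
    policy_id: str
    matches: Callable[[Episode], bool]  # hypothetical stand-in for the KQL matcher
    interval_s: float                   # "At most once every…" interval, in seconds
    workflows: List[str]                # destination workflow IDs
    enabled: bool = True
    snoozed: bool = False
    last_fired: Dict[str, float] = field(default_factory=dict)


def evaluate(policy: Policy, episode: Episode, now: float) -> str:
    """Return an illustrative outcome label for one policy/episode pair."""
    # Disabled or snoozed policies are not evaluated at all.
    if not policy.enabled or policy.snoozed:
        return "skipped"
    # 1. Gating: silenced episodes stop before any matching runs.
    if episode.silenced:
        return "suppressed"
    # 2. Matcher. Simplification: the docs record `unmatched` only when
    #    no policy at all matches the episode; here it is per policy.
    if not policy.matches(episode):
        return "unmatched"
    # 3. Grouping is omitted: one episode is one notification group.
    # 4. Frequency: successive notifications stay at least interval_s apart.
    last = policy.last_fired.get(episode.episode_id)
    if last is not None and now - last < policy.interval_s:
        return "throttled"
    # 5. Destinations: the configured workflows would be invoked here.
    policy.last_fired[episode.episode_id] = now
    return "dispatched"


policy = Policy(
    policy_id="p1",
    matches=lambda e: e.data.get("severity") == "critical",
    interval_s=3600.0,
    workflows=["wf-1"],
)
episode = Episode("ep-001", {"severity": "critical"})
print(evaluate(policy, episode, now=0.0))
print(evaluate(policy, episode, now=60.0))
```

Because each policy keeps its own `last_fired` state, an episode throttled by one policy can still be dispatched by another, which mirrors the independent evaluation described in the note above.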
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md index cfc786dc5d..c8fe435703 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md @@ -1,82 +1,81 @@ --- -navigation_title: Notifications +navigation_title: Notifications and actions applies_to: - stack: unavailable + stack: preview serverless: preview products: - id: kibana -description: "How {{alerting-v2}} action policies route alert episodes to notifications: matchers, grouping, frequency, and workflow destinations." +description: "How {{alerting-v2-system}} action policies route alert episodes to notifications and actions." --- -# Notification routing in {{alerting-v2}} +# {{alerting-v2-system-cap}} notifications and actions -After a rule produces alert episodes, action policies decide what to do about them: who gets notified, how often, and through which channel. +Action policies are part of the {{alerting-v2-system}} in {{kib}}. After a rule produces alert episodes, action policies decide whether and when to invoke workflows. Workflows are what actually send the notification or run the automation. -This page explains how action policies work. For creating and configuring them step by step, refer to [Create and configure an action policy](notifications/create-configure-action-policy.md). +This page explains how action policies work. For creating and configuring them step by step, refer to [Create and configure an action policy](action-policies/create-configure-action-policy.md). ## What is an action policy [action-policies] +An action policy is the gating layer between an alert episode and a workflow. It decides whether and when to invoke a workflow by running the alert episode through a sequence of gates. A workflow runs only if the alert episode clears each gate in sequence. 
-An action policy is a saved object in your space that controls notification routing. Policies can be **global** or **per-rule**: +The three gates are match conditions, suppression, and frequency: -- **Global** policies apply to all episodes in the space, from any rule. When an episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. -- **Per-rule** policies are scoped to a single rule. They apply only to episodes produced by that specific rule. Use a per-rule policy when routing is specific to one rule and you don't want it to affect other rules in the space. The rule association is set at creation and cannot be changed. +* **Match conditions**: Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. +* **Suppression**: Suppression checks whether the alert episode should be silenced. Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](action-policies/reduce-notification-noise.md). +* **Frequency**: Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. If a workflow was already invoked within the cooldown period, the episode waits. -Each policy has four controls: +If any gate stops the episode, the workflow is not invoked for that policy. -| Control | What it does | -| --- | --- | -| Match conditions (optional KQL) | Filters which episodes this policy applies to. An empty match condition matches all episodes covered by the policy's scope. 
| -| Notify per | Controls how episodes batch into notifications: one per episode, one per notification group using **Group** mode, or one digest for all. | -| Frequency | Controls how often the policy can notify for the same notification group. | -| Destinations | One or more workflows to invoke when all conditions are met. | +:::{note} +Because each action policy evaluates alert episodes independently, an episode that is blocked by one policy can still trigger a workflow through a second policy with different conditions. +::: -## How policies apply to rules +## Policy types [policy-types] -**Global policies** don't reference rules directly. You scope them using KQL over episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching episode in the space, from any rule. +Policies can be global or per-rule. Global policies apply across all rules in a space and suit most use cases. Per-rule policies apply to a single rule and give you precise control over routing for that rule without affecting anything else in the space. -**Per-rule policies** are bound to a specific rule at creation. They apply only to episodes from that rule, and you can still use match conditions to filter further within that rule's episodes. +### Global policies -Multiple policies can match the same episode, and each runs independently. There's no precedence or merging between them. If no policy matches an episode, no notification is sent. This is intentional. +A global policy applies to all alert episodes in the space, from any rule. When an alert episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. -## How action policies are evaluated [how-action-policies-evaluated] +### Per-rule policies +A per-rule policy is scoped to a single rule. It applies only to alert episodes produced by that specific rule. 
Use a per-rule policy when routing is specific to one rule and you do not want it to affect other rules in the space. The rule association is set at creation and cannot be changed. + +## How policies apply to rules -{{kib}} runs a background process called the dispatcher that checks for eligible episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher is separate from the rule schedule. Rules write events on their own cadence, and the dispatcher picks them up asynchronously. +How a policy applies depends on whether it is global or per-rule. Multiple policies can match the same alert episode, and each runs independently. There is no precedence or merging between them. If no policy matches an alert episode, no workflow is invoked and no notification is sent. -For each enabled policy that is not snoozed, the dispatcher works through the following steps: +### Global policy application -1. **Gating:** Is the episode acknowledged, snoozed, or deactivated? If so, skip dispatch. Refer to [Notification gating](notifications/notification-gating.md) to learn more. -2. **Matcher:** Does the episode match the policy's KQL? If not, skip this policy. -3. **Grouping:** How should matching episodes batch into notification groups? -4. **Frequency:** Has a notification already gone out for this notification group recently? If so, wait. -5. **Destinations:** Send to the policy's workflow destinations. +Global policies don't reference rules directly. You scope them using KQL over alert episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching alert episode in the space, from any rule. -### Notification dispatch outcomes [possible-outcomes] -The dispatcher runs on a short interval (around 5 seconds). Notifications don't arrive on the exact rule schedule. They follow the dispatcher's own cycle. 
+### Per-rule policy application + +Per-rule policies are bound to a specific rule at creation. They apply only to alert episodes from that rule, and you can still use match conditions to filter further within that rule's alert episodes. + +## How action policies are evaluated [how-action-policies-evaluated] -### Possible outcomes [possible-outcomes] +{{kib}} runs a background process called the dispatcher that checks for eligible alert episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher runs on its own cycle, separate from the rule schedule. -Each notification attempt results in one of the following outcomes. +For each enabled policy that is not snoozed, the dispatcher works through the following steps. -| Outcome | What it means | -| --- | --- | -| `dispatched` | A notification was sent. | -| `throttled` | Dispatch was suppressed because the **frequency** interval had not elapsed. | -| `suppressed` | Dispatch was blocked before the notification went out. The episode was acknowledged, snoozed, or deactivated, or the space is currently in a maintenance window. | -| `unmatched` | No policy matched this episode and no workflow ran. | -| `error` | Processing failed. Check {{kib}} logs. | +1. **Gating:** Is the alert episode acknowledged, snoozed, or deactivated? If so, skip. Refer to [Reduce notification noise](action-policies/reduce-notification-noise.md) to learn more. +2. **Matcher:** Does the alert episode match the policy's KQL? If not, skip this policy. +3. **Grouping:** How should matching alert episodes batch into notification groups? +4. **Frequency:** Has a workflow already been invoked for this notification group recently? If so, wait. +5. **Destinations:** Invoke the configured workflows. -You can query these outcomes in Discover through the `.alert-actions` data stream. +Workflow invocations may not happen immediately after a rule evaluates. 
## Why policies are separate from rules -Policies are independent of rules, which means: -- One global policy can cover episodes from many rules. For example, a policy matching `data.severity: "critical"` applies regardless of which rule produced the episode. -- You can update routing without touching any rule. -- Rules can be created without any action policy, which is useful for testing. +Policies are independent of rules. A single global policy can cover alert episodes from many rules, so a policy matching `data.severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. -When you do need routing that's specific to one rule, create a per-rule policy and bind it to that rule at creation. +When you do need routing that is specific to one rule, create a per-rule policy and bind it to that rule at creation. -When you're ready to route notifications, go to [Create and configure an action policy](notifications/create-configure-action-policy.md). +## Next steps +- [Create and configure an action policy](action-policies/create-configure-action-policy.md) to set up policy type, match conditions, grouping, frequency, and workflow destinations. +- [Manage action policies](action-policies/manage-action-policies.md) to enable, disable, snooze, edit, or delete the policies in your space. +- [Action policy reference](action-policies/action-policy-reference.md) to look up available match condition fields, grouping modes, and frequency options. 
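The three batching options mentioned for action policies (one notification per alert episode, one per notification group, or one digest for all matching episodes) can be illustrated with a small sketch. The function name `batch_episodes`, the `notify_per` values, and the flat dictionary shape used for episodes are assumptions made for illustration, not the product's data model.

```python
from collections import defaultdict


def batch_episodes(episodes, notify_per, group_by=None):
    """Split matching episodes into batches; each batch maps to one notification."""
    if notify_per == "episode":
        # One batch per alert episode.
        return [[ep] for ep in episodes]
    if notify_per == "group":
        if group_by is None:
            raise ValueError("group mode needs a group_by field")
        # One batch per unique value of the group-by field.
        groups = defaultdict(list)
        for ep in episodes:
            groups[ep.get(group_by)].append(ep)
        return list(groups.values())
    if notify_per == "digest":
        # One batch containing every matching episode.
        return [list(episodes)] if episodes else []
    raise ValueError(f"unknown notify_per mode: {notify_per}")
```

For example, three episodes where two share `data.host.name: "web-01"` would produce three batches in episode mode, two in group mode (grouped by `data.host.name`), and one in digest mode.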
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md deleted file mode 100644 index 7272d36c94..0000000000 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/action-policy-reference.md +++ /dev/null @@ -1,100 +0,0 @@ ---- -navigation_title: Action policy reference -applies_to: - stack: unavailable - serverless: preview -products: - - id: kibana -description: "Grouping modes, frequency options, dispatch outcomes, and match conditions field reference for action policies in the {{alerting-v2}}." ---- - -# Action policy reference [action-policy-reference] - - -Action policies are part of the {{alerting-v2}} in {{kib}}. This page is a reference for match conditions fields, grouping modes, frequency options, and dispatch outcomes. For step-by-step guidance, refer to [Create and configure an action policy](create-configure-action-policy.md). - -## Match conditions fields [matcher-fields] - -Use these fields in the **Match conditions** expression to filter which episodes a policy applies to. Combine them with standard KQL operators, for example `data.severity: "critical" AND episode_status: "active"`. - -| Field | Description | Example | -|---|---|---| -| `episode_status` | Current lifecycle status of the episode. Accepted values: `active`, `inactive`, `pending`, `recovering`. | `episode_status: "active"` | -| `data.*` | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. | `data.severity: "critical"` or `data.host.name: "web-01"` | -| `rule.id` | Unique identifier for the rule that generated the episode. | `rule.id: "rule-001"` | -| `rule.name` | Display name of the rule. | `rule.name: "High CPU"` | -| `rule.tags` | Tags attached to the rule. Use to match episodes from rules with a specific tag. 
| `rule.tags: "payment-service"` | -| `rule.labels` | Key-value labels attached to the rule. Use dot notation to target a specific label key. | `rule.labels.env: "production"` | - - - -## Notify per options [notification-grouping] - -Controls how the policy batches matching episodes before sending a notification. - -| Option | Description | When to use | -|---|---|---| -| Episode | Each episode triggers its own notification independently. Default selection. | You need per-issue visibility and want to handle each problem separately. | -| Group | The policy bundles episodes that share the same value for a specified `data.*` field into one notification per unique value. Each unique value forms a **notification group**. | A rule produces many related episodes, such as one per service or host, and you want to reduce noise by batching them into shared notifications. | -| Digest | The policy combines all matching episodes into a single notification, regardless of what they have in common. | You want a single periodic summary of everything that matched, rather than individual alerts. | - -## Frequency [throttle-strategies] - -**Frequency** controls how often the policy fires for a given episode or notification group. The available options depend on the **Notify per** setting. Not all options are valid for all modes. - -| Option | Description | When to use | -|---|---|---| -| On status change | Notifies when the episode status changes, for example from active to recovering. One notification per transition. | You only need to know when something breaks and when it's resolved. No reminders needed. | -| On status change + repeat at interval | Notifies on status change, then resends notifications at a regular interval while the episode remains in the same status. | You want status change alerts plus periodic notifications that a problem is still unresolved, in case it has been missed or pushed aside. 
| -| At most once every… | Caps notifications at one per episode or notification group within the chosen interval, regardless of rule frequency. | You want to limit alert volume for noisy rules without missing new or ongoing issues. | -| Every evaluation | Notifies on every rule evaluation. Can be noisy. Use sparingly and only with infrequent rule schedules. | You need a full audit trail of every evaluation, or the rule runs infrequently enough that noise isn't a concern. | - - - -### Frequency options for Episode [frequency-when-episode-per_episode] - -Available frequency options when you set **Notify per** to **Episode**. - -| Option | Description | Example | -|---|---|---| -| On status change | Notifies once when the episode opens and once when it recovers. No repeat notifications while it remains active. Best for when you trust your ticketing or incident workflow to track ongoing issues | A host goes down at 9:00am → one notification. Recovers at 11:00am → one notification. No notifications between them. | -| On status change + repeat at interval | Same as On status change, but also sends a reminder at a set interval while the episode is still active. | A host goes down at 9:00am → notification. With a 1h repeat: reminder at 10:00am, 11:00am. Recovers at 11:30am → notification. | -| Every evaluation | Fires on every rule evaluation, regardless of status. Can be noisy on frequent rule schedules. Avoid in production. | A rule running every 5 minutes with one active episode produces up to 288 notifications per day. | - -### Frequency options for Group - -Available frequency options when you set **Notify per** to **Group**. - -| Option | Description | Example | -|---|---|---| -| At most once every… | Limits how often each notification group can notify, regardless of how many episodes match or how often the rule runs. | 10 episodes share `data.host.name: "web-01"`. With a 1h limit, you get at most one notification per hour for that notification group. 
| -| Every evaluation | Fires on every rule evaluation for each unique value in the group-by field. Still noisy on frequent rule schedules. | A rule running every 10 minutes with 5 unique host values produces up to 6 notifications per host per hour. | - -### Frequency options for Digest - -Available frequency options when you set **Notify per** to **Digest**. - -| Option | Description | Example | -|---|---|---| -| Every evaluation | The only option for Digest. Fires on every rule run, bundling all matching episodes into one message. Pair with a longer rule schedule to avoid frequent summary messages. | A rule running every 30 minutes with 20 matching episodes produces one summary notification every 30 minutes containing all 20. | - -## Dispatch outcomes - -The system records each notification attempt with one of the following outcomes. To investigate delivery issues, query the `.alert-actions` data stream in Discover and filter by the `outcome` field. - -| Outcome | What happened | -|---|---| -| `dispatched` | The system sent the notification successfully. | -| `throttled` | The system skipped delivery because the **frequency** interval had not elapsed. This is expected behavior, not an error. | -| `suppressed` | Dispatch was blocked before the notification went out. The episode was acknowledged, snoozed, or deactivated, or the space is currently in a [maintenance window](../../alerts/maintenance-windows.md). | -| `unmatched` | No action policy matched this episode, so no workflow ran. | -| `error` | An error occurred during processing. Check {{kib}} logs to identify the cause. 
| \ No newline at end of file diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md deleted file mode 100644 index 728b2b5f69..0000000000 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md +++ /dev/null @@ -1,76 +0,0 @@ ---- -navigation_title: Create an action policy -applies_to: - stack: unavailable - serverless: preview -products: - - id: kibana -description: "Create action policies in the {{alerting-v2}}, configure match conditions, Notify per, Frequency, and workflow destinations." ---- - -# Create and configure an action policy [create-manage-action-policies] - - -Action policies are part of the {{alerting-v2}} in {{kib}}. This page covers how to configure policy type, match conditions, grouping, frequency, and workflow destinations. Where rules define what counts as a problem, action policies define what happens when one is detected: which episodes generate notifications, how they batch for dispatch, and where they're routed. - -Because policies are separate from rules, you can update notification behavior across many rules at once without touching detection logic, and you can route the same alerts differently depending on severity or source. You create and manage policies from the **Action policies** page, not from the rule form. - -For match conditions fields, grouping modes, frequency options, and dispatch outcomes, refer to [Action policy reference](action-policy-reference.md). - - - -## Policy fields [policy-fields] - -### Policy type [policy-type] - -An action policy can be **global** or **per-rule**: - -- **Global** policies apply to any episode in the space. Use a global policy when you want to route episodes from multiple rules. For example, a policy matching `rule.tags: "checkout"` applies to every rule with that tag. 
-- **Per-rule** policies are scoped to a single rule. Use a per-rule policy when notification routing is specific to one rule and you don't want it to affect other rules in the space. - -The policy type is set at creation and cannot be changed. If you need a different type, create a new policy. - -### Tags [policy-tags] - -Optional string labels you assign to a policy to categorize it or filter it on the Action policies list. Unlike rule tags, policy tags describe the policy itself rather than the alerts it matches. You can add, edit, or remove tags at any time without affecting routing behavior. - -### Match conditions [matcher] - - -An optional KQL expression that filters which episodes this policy applies to. An empty match condition matches every episode covered by the policy's scope. For a global policy, that means all episodes in the space. For a per-rule policy, it means all episodes from the associated rule. - -Use match conditions to route different episodes to different policies, for example, one policy for `rule.tags: "payment-service"` episodes routed to PagerDuty and another for warnings routed to Slack. For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). - - - -### Grouping and frequency [reduce-noise-grouping] - - -**Notify per** controls how episodes batch into notifications. **Frequency** controls how often the policy can notify for each batch. - -:::{table} -:widths: 4-4-4 - -| Notify per | What it does | Available Frequency options | -|---|---|---| -| Episode | One notification per episode. | - On status change
- On status change + repeat at interval
- Every evaluation | -| Group | Bundle episodes that share a field value. Specify a **Group by** field such as `data.service.name` or `data.host.name`. | - At most once every…
- Every evaluation | -| Digest | One notification for all matching episodes combined. | Every evaluation | - -::: - -For detailed descriptions, frequency options, and examples for each mode, refer to [Notify per options](action-policy-reference.md#notification-grouping). - -### Frequency [throttle] - - -**Frequency** limits how often the policy can fire for a given notification group. The interval resets from the last time the policy fired, so successive notifications stay at least `interval` apart. Set a duration such as `1h` or `30m`. For available options by **Notify per** mode, refer to [Frequency](action-policy-reference.md#throttle-strategies). - -### Destinations - -One or more workflows to invoke when the policy matches. Use the search field to find and attach workflows. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md deleted file mode 100644 index 4039405c58..0000000000 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/manage-action-policies.md +++ /dev/null @@ -1,41 +0,0 @@ ---- -navigation_title: Manage action policies -applies_to: - stack: unavailable - serverless: preview -products: - - id: kibana -description: "View policy details, enable, disable, snooze, review execution history, and rotate API keys for action policies in the {{alerting-v2}}." ---- - -# Manage action policies - -Action policies are part of the {{alerting-v2}} in {{kib}}. This page covers how to view policy details, enable and disable policies, snooze them during planned outages, rotate their API keys, and review execution history. - -## View policy details - -From the **Action policies** list, you can open a policy to see its full configuration, including match conditions, grouping mode, frequency, and destinations. 
You can also edit, clone, delete, enable, disable, snooze, or update its API key without leaving the list page. - -## Execution history - -The dispatcher records the outcome of every notification attempt for each policy. To investigate delivery issues or audit which policies ran for an episode, query the `.alert-actions` data stream in Discover and filter by `outcome` or `policy_id`. For a description of each outcome, refer to [Dispatch outcomes](action-policy-reference.md#dispatch-outcomes). - -## Enable and snooze - -You can disable a policy so it is not evaluated for new episodes. You can snooze a policy for a defined window so that it does not dispatch notifications during that period. Policies that are not enabled or are snoozed are skipped when the dispatcher evaluates policies. - -### Maintenance windows [maintenance-windows] - -During a [maintenance window](../../alerts/maintenance-windows.md), action policies stop dispatching notifications automatically. No policy configuration is required. Rule evaluation continues and alert episodes are still recorded in `.rule-events`. Maintenance windows are configured separately, not on the action policy. - -## Update API keys - -You can rotate the API key used to run a policy's workflows without changing matchers or destinations. Use the **Update API key** action on one policy or for multiple selected policies. - -::::{important} Production considerations -When you update or delete an action policy, previous API keys used for execution are marked for invalidation and removed on a schedule managed by {{kib}}. Allow for a short delay before new keys are used for dispatch. -:::: - -## Bulk actions - -On the action policies list, select one or more policies to enable, disable, snooze, and do more in bulk. **Select all** selects every policy on the current page of results. Clear the selection before changing filters if you need a different set. 
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md deleted file mode 100644 index c37e671af5..0000000000 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications/notification-gating.md +++ /dev/null @@ -1,56 +0,0 @@ ---- -navigation_title: Notification gating -applies_to: - stack: unavailable - serverless: preview -products: - - id: kibana -description: "How {{alerting-v2}} gates notifications: which mechanisms suppress dispatch before action policies run, their scope, and when to use each." ---- - -# Notification gating [notification-gating] - - -Notification gating is part of the {{alerting-v2}} in {{kib}}. Gating controls whether a matched episode triggers a notification. When an episode is gated, the dispatcher stops processing it before any action policy matcher, grouping, or destination runs. This page covers how gating fits in the dispatch cycle and when to use each mechanism: acknowledge, snooze, and deactivate. - -## How gating fits in the dispatcher [gating-in-dispatcher] - - -When an episode is eligible for dispatch, the dispatcher evaluates each enabled action policy in order: - -1. **Gating:** Is the episode acknowledged, snoozed, or deactivated? If so, stop. No notification is sent. -2. **Matcher:** Does the episode match the policy's KQL? If not, skip this policy. -3. **Grouping:** How should matching episodes batch into notification groups? -4. **Throttle:** Has a notification already gone out for this group recently? If so, wait. -5. **Destinations:** Send to the policy's workflow destinations. - -Gating is the first step. An episode that is acknowledged, snoozed, or deactivated never reaches routing. 
- -## Gating mechanisms [gating-mechanisms] - - -Three mechanisms let you gate notifications, each at a different scope: - -| Mechanism | Scope | When to use | -|---|---|---| -| Acknowledge | Per episode | You're actively investigating a breach and want to silence notifications for it without closing the episode. Clear the acknowledgement when you're done to restore notifications. | -| Snooze | Per series (group) | You want to quiet an entire alert series for a defined period, for example, during a known noisy window for a specific host. Snooze expires automatically at the end of the duration. | -| Deactivate | Per episode | You want to manually close an episode that hasn't recovered automatically. Deactivating marks the episode as inactive and stops notifications for it. Unlike acknowledge, this closes the episode rather than silencing it while leaving it active. | - -Each mechanism is stored as a separate document in `.alert-actions`, so the full gating history for an episode is queryable in Discover. - -### Snooze scope - -Snooze applies at the group level (by `group_hash`), not per individual episode. When you snooze one episode, every episode sharing the same group (all rows with the same `rule_id` and `group_hash`) is silenced for the duration. Snoozing one row in the alerts table silences the entire series for that rule. - - - -## Related pages - - -- **[Notifications](../notifications.md):** Set up action policies that control routing, grouping, and throttle after gating. 
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md index f1e1141f2e..d78845d323 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -1,33 +1,38 @@ --- -navigation_title: Workflows +navigation_title: Connect workflows applies_to: - stack: ga 9.4, preview 9.3 - serverless: ga + stack: preview + serverless: preview products: - id: kibana -description: "How workflows connect to the {{alerting-v2}} action policies and rule automation, and where to configure them." +description: "How workflows connect to the {{alerting-v2-system}} action policies and rule automation, and where to configure them." --- -# Workflows for the {{alerting-v2}} [workflows] +# Connect workflows to the {{alerting-v2-system}} [connect-workflows-experimental-alerting-system] +Workflows are part of the {{alerting-v2-system}} in {{kib}}. Without a workflow attached, an action policy cannot act on an alert episode. [Workflows](../../workflows.md) are the delivery layer. They define what happens when an action policy decides to act, such as sending a message, calling a webhook, or triggering automation. Setting up a workflow is what connects the {{alerting-v2-system}} to the tools your team uses for incident response. -Workflows are part of the {{alerting-v2}} in Kibana. Without a workflow, an action policy has nowhere to send notifications. [Workflows](../../workflows.md) are the delivery layer. They define the actual steps that run when a policy matches an episode: sending a message, calling a webhook, triggering automation, or any combination. Setting up a workflow is what connects the {{alerting-v2}} to the tools your team already uses for incident response. - -Before creating an action policy, make sure the workflows you want to use already exist in your space. 
Policies store references to workflow IDs, so a destination workflow must exist before you can select it. +:::{important} +Before creating an action policy, make sure the workflows you want to use already exist in your space. For information on creating a workflow, refer to [Build your first workflow](../../workflows/get-started/build-your-first-workflow.md). +::: ## Runtime execution order [runtime-execution-order] -After a rule produces or updates alert episodes, processing follows this sequence: +After a rule runs, processing follows this sequence: ``` -Rule → Alert → Action Policy → Workflow → Notification +Rule → Alert episode → [Dispatcher] → Action policy → Workflow → Notification ``` -1. The rule runs its {{esql}} evaluation and writes to `.rule-events`. -2. In Alert mode, alert documents and episodes represent the ongoing issue. -3. Action policies in the same space are evaluated against episodes (matcher, suppression, grouping, frequency). -4. For each dispatch, the policy invokes its configured workflows. -5. Notifications are the outcome: Email, chat, webhook, and so on. +1. A rule evaluates data on a schedule and writes a rule event. +2. In Alert mode, the rule event opens or updates an alert episode. +3. The dispatcher runs on a short interval, independently of the rule schedule, and picks up active alert episodes. +4. For each active alert episode, the dispatcher evaluates all enabled action policies. Each policy runs the episode through a sequence of gates: suppression, match conditions, grouping, and frequency. +5. For policies where the episode clears all gates, the dispatcher invokes the configured workflows. +6. Workflows deliver the notification or run the automation. + +## Next steps -The policy evaluates matchers and **frequency** limits before any workflow step runs, even though you created the workflow before the policy. That's why configuration order (workflow first, then policy, then rule) is the reverse of runtime order. 
+- [Create and configure an action policy](action-policies/create-configure-action-policy.md) to start routing alert episodes to workflows. +- [Notifications and actions in {{alerting-v2-system}}](notifications-actions.md) to learn how action policies evaluate and gate alert episodes before invoking a workflow. diff --git a/explore-analyze/alerting/watcher/actions.md b/explore-analyze/alerting/watcher/actions.md index 0181302180..b3d6189a5f 100644 --- a/explore-analyze/alerting/watcher/actions.md +++ b/explore-analyze/alerting/watcher/actions.md @@ -147,7 +147,7 @@ If you do not define a throttle period at the action or watch level, the global xpack.watcher.execution.default_throttle_period: 15m ``` -{{watcher}} also supports acknowledgement-based throttling. You can acknowledge a watch using the [ack watch API]({{es-apis}}operation/operation-watcher-ack-watch) to prevent the watch actions from being executed again while the watch condition remains `true`. This essentially tells {{watcher}} "I received the notification and I’m handling it, do not notify me about this error again". An acknowledged watch action remains in the `acked` state until the watch’s condition evaluates to `false`. When that happens, the action’s state changes to `awaits_successful_execution`. +{{watcher}} also supports acknowledgment-based throttling. You can acknowledge a watch using the [ack watch API]({{es-apis}}operation/operation-watcher-ack-watch) to prevent the watch actions from being executed again while the watch condition remains `true`. This essentially tells {{watcher}} "I received the notification and I’m handling it, do not notify me about this error again". An acknowledged watch action remains in the `acked` state until the watch’s condition evaluates to `false`. When that happens, the action’s state changes to `awaits_successful_execution`. 
To acknowledge an action, you use the [ack watch API]({{es-apis}}operation/operation-watcher-ack-watch): diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml index d50be4be30..d4400f81e4 100644 --- a/explore-analyze/toc.yml +++ b/explore-analyze/toc.yml @@ -381,12 +381,12 @@ toc: - file: alerting.md children: - file: alerting/kibana-alerting-experimental/workflows-alerting.md - - file: alerting/kibana-alerting-experimental/notifications.md + - file: alerting/kibana-alerting-experimental/notifications-actions.md children: - - file: alerting/kibana-alerting-experimental/notifications/create-configure-action-policy.md - - file: alerting/kibana-alerting-experimental/notifications/action-policy-reference.md - - file: alerting/kibana-alerting-experimental/notifications/manage-action-policies.md - - file: alerting/kibana-alerting-experimental/notifications/notification-gating.md + - file: alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md + - file: alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md + - file: alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md + - file: alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md - file: alerting/alerts.md children: - file: alerting/alerts/alerting-getting-started.md From e3bda0f66034ac95bef62a74b366190dea9ee693 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon Date: Thu, 28 May 2026 16:31:17 -0400 Subject: [PATCH 07/14] fix toc issue --- .../notifications.md | 81 ------------------- 1 file changed, 81 deletions(-) delete mode 100644 explore-analyze/alerting/kibana-alerting-experimental/notifications.md diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications.md deleted file mode 100644 index c8fe435703..0000000000 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications.md +++ /dev/null 
@@ -1,81 +0,0 @@ ---- -navigation_title: Notifications and actions -applies_to: - stack: preview - serverless: preview -products: - - id: kibana -description: "How {{alerting-v2-system}} action policies route alert episodes to notifications and actions." ---- - -# {{alerting-v2-system-cap}} notifications and actions - -Action policies are part of the {{alerting-v2-system}} in {{kib}}. After a rule produces alert episodes, action policies decide whether and when to invoke workflows. Workflows are what actually send the notification or run the automation. - -This page explains how action policies work. For creating and configuring them step by step, refer to [Create and configure an action policy](action-policies/create-configure-action-policy.md). - -## What is an action policy [action-policies] - -An action policy is the gating layer between an alert episode and a workflow. It decides whether and when to invoke a workflow by running the alert episode through a sequence of gates. A workflow runs only if the alert episode clears each gate in sequence. - -The three gates are match conditions, suppression, and frequency: - -* **Match conditions**: Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. -* **Suppression**: Suppression checks whether the alert episode should be silenced. Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](action-policies/reduce-notification-noise.md). -* **Frequency**: Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. 
If a workflow was already invoked within the cooldown period, the episode waits. - -If any gate stops the episode, the workflow is not invoked for that policy. - -:::{note} -Because each action policy evaluates alert episodes independently, an episode that is blocked by one policy can still trigger a workflow through a second policy with different conditions. -::: - -## Policy types [policy-types] - -Policies can be global or per-rule. Global policies apply across all rules in a space and suit most use cases. Per-rule policies apply to a single rule and give you precise control over routing for that rule without affecting anything else in the space. - -### Global policies - -A global policy applies to all alert episodes in the space, from any rule. When an alert episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. - -### Per-rule policies - -A per-rule policy is scoped to a single rule. It applies only to alert episodes produced by that specific rule. Use a per-rule policy when routing is specific to one rule and you do not want it to affect other rules in the space. The rule association is set at creation and cannot be changed. - -## How policies apply to rules - -How a policy applies depends on whether it is global or per-rule. Multiple policies can match the same alert episode, and each runs independently. There is no precedence or merging between them. If no policy matches an alert episode, no workflow is invoked and no notification is sent. - -### Global policy application - -Global policies don't reference rules directly. You scope them using KQL over alert episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching alert episode in the space, from any rule. - -### Per-rule policy application - -Per-rule policies are bound to a specific rule at creation. 
They apply only to alert episodes from that rule, and you can still use match conditions to filter further within that rule's alert episodes. - -## How action policies are evaluated [how-action-policies-evaluated] - -{{kib}} runs a background process called the dispatcher that checks for eligible alert episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher runs on its own cycle, separate from the rule schedule. - -For each enabled policy that is not snoozed, the dispatcher works through the following steps. - -1. **Gating:** Is the alert episode acknowledged, snoozed, or deactivated? If so, skip. Refer to [Reduce notification noise](action-policies/reduce-notification-noise.md) to learn more. -2. **Matcher:** Does the alert episode match the policy's KQL? If not, skip this policy. -3. **Grouping:** How should matching alert episodes batch into notification groups? -4. **Frequency:** Has a workflow already been invoked for this notification group recently? If so, wait. -5. **Destinations:** Invoke the configured workflows. - -Workflow invocations may not happen immediately after a rule evaluates. - -## Why policies are separate from rules - -Policies are independent of rules. A single global policy can cover alert episodes from many rules, so a policy matching `data.severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. - -When you do need routing that is specific to one rule, create a per-rule policy and bind it to that rule at creation. - -## Next steps - -- [Create and configure an action policy](action-policies/create-configure-action-policy.md) to set up policy type, match conditions, grouping, frequency, and workflow destinations. 
-- [Manage action policies](action-policies/manage-action-policies.md) to enable, disable, snooze, edit, or delete the policies in your space. -- [Action policy reference](action-policies/action-policy-reference.md) to look up available match condition fields, grouping modes, and frequency options. From 62c48501bc7ff3080c6a1c7bc36bfbab80958294 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com> Date: Tue, 2 Jun 2026 16:51:07 -0400 Subject: [PATCH 08/14] Update explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md Co-authored-by: Aleksandra Spilkowska <96738481+alexandra5000@users.noreply.github.com> --- .../action-policies/manage-action-policies.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md index fc47ce612b..0aef4b1fdc 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md @@ -41,7 +41,10 @@ During a [maintenance window](../../alerts/maintenance-windows.md), action polic You can rotate the API key used to run a policy's workflows without changing matchers or destinations. Use the **Update API key** action on one policy or for multiple selected policies. -::::{important} Production considerations +::::{important} + +**Production considerations** + When you update or delete an action policy, previous API keys used for execution are marked for invalidation and removed on a schedule managed by {{kib}}. Allow for a short delay before new keys are used for dispatch. 
:::: From a5cb3ec9500083783910f49c97ce92a756950b25 Mon Sep 17 00:00:00 2001 From: Nastasha Solomon Date: Tue, 2 Jun 2026 17:06:01 -0400 Subject: [PATCH 09/14] Editorial feedback --- .../action-policies/action-policy-reference.md | 4 ++-- .../action-policies/create-configure-action-policy.md | 4 ++-- .../action-policies/manage-action-policies.md | 8 ++++++-- .../action-policies/reduce-notification-noise.md | 8 ++++++-- .../kibana-alerting-experimental/notifications-actions.md | 4 ++-- .../kibana-alerting-experimental/workflows-alerting.md | 4 ++-- 6 files changed, 20 insertions(+), 12 deletions(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md index 1801f4f47e..2bb2c0d73c 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md @@ -1,8 +1,8 @@ --- navigation_title: Action policy reference applies_to: - stack: preview - serverless: preview + stack: experimental 9.5+ + serverless: experimental products: - id: kibana description: "Grouping modes, frequency options, dispatch outcomes, and match conditions field reference for action policies in the {{alerting-v2-system}}." 
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md index 082558e2a6..e53ffbcba7 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md @@ -1,8 +1,8 @@ --- navigation_title: Create an action policy applies_to: - stack: preview - serverless: preview + stack: experimental 9.5+ + serverless: experimental products: - id: kibana description: "Create action policies in the {{alerting-v2-system}}, configure match conditions, Notify per, Frequency, and workflow destinations." diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md index 0aef4b1fdc..4f575fb2ee 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md @@ -1,8 +1,8 @@ --- navigation_title: Manage action policies applies_to: - stack: preview - serverless: preview + stack: experimental 9.5+ + serverless: experimental products: - id: kibana description: "View policy details, enable, disable, snooze, review execution history, and rotate API keys for action policies in the {{alerting-v2-system}}." @@ -33,6 +33,10 @@ To investigate delivery issues or audit which policies ran for an alert episode, You can disable a policy so it is not evaluated for new alert episodes. You can snooze a policy for a defined window so that it does not dispatch notifications during that period. Policies that are not enabled or are snoozed are skipped when the dispatcher evaluates policies. 
+:::{note} +Snoozing a policy differs from [snoozing an alert episode](reduce-notification-noise.md#snooze-scope). When you snooze a policy, the dispatch mechanism is paused and every series the policy would process is silenced. When you snooze an alert episode, you target one specific series before policy matching runs, silencing it regardless of which policy would have handled it. Use alert snooze when you want to quiet a specific recurring alert without affecting other series handled by the same policy. +::: + ### Maintenance windows [maintenance-windows] During a [maintenance window](../../alerts/maintenance-windows.md), action policies stop dispatching notifications automatically. No policy configuration is required. Rule evaluation continues and alert episodes are still recorded in `.rule-events`. Maintenance windows are configured separately, not on the action policy. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md index a79b333dab..2b0727e4b6 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md @@ -1,8 +1,8 @@ --- navigation_title: Reduce notification noise applies_to: - stack: preview - serverless: preview + stack: experimental 9.5+ + serverless: experimental products: - id: kibana description: "How to reduce notification noise in the {{alerting-v2-system}} using acknowledge, snooze, and deactivate to silence alert episodes." @@ -32,6 +32,10 @@ Snooze applies at the group level (by `group_hash`), not per individual alert ep For instructions on snoozing and unsnoozing single or multiple episodes, refer to [View and manage alerts](../alerts/view-and-manage-alerts.md#snooze-episode). 
--> +:::{note} +Snoozing an alert episode differs from [snoozing an action policy](manage-action-policies.md#enable-and-snooze). When you snooze a policy, the dispatch mechanism is paused and every series the policy would process is silenced. When you snooze an alert episode, you target one specific series before policy matching runs, silencing it regardless of which policy would have handled it. Use policy snooze when you want to pause all notifications from a given policy, for example, during planned maintenance on a destination system. +::: + ## Related pages ## Summary Updates the action policy, reference, and workflow docs for the experimental alerting features with content from five M2 doc issues. Following the tech preview principle of accuracy over comprehensiveness, content additions focus on confirmed stable behavior. UI details subject to change and API schema changes with no existing reference docs are deferred. ### `episode.severity` as a first-class match condition field (#6834, #6689) Issues #6834 and #6689 tracked the same underlying feature from two angles. #6689 documented the rule-side contract: severity is populated when a rule's ES|QL query includes a `severity` column with a recognized value (`info`, `low`, `medium`, `high`, `critical`); unrecognized values are silently ignored; the field is absent on recovery events. #6834 documented the dispatcher side: `episode.severity` is now available in the matcher context, making `data.severity` — previously the only way to match on severity — the legacy approach. **`action-policy-reference.md`** — Added `episode.severity` to the match conditions field reference table with its full behavioral contract: populated from the rule's ES|QL `severity` column, case-insensitive, unrecognized values silently ignored, not set during recovery. Updated the introductory KQL example from `data.severity: "critical"` to `episode.severity: "critical"`. 
Updated the `data.*` row to clarify it covers rule-specific payload fields not available as standard episode fields, removing the misleading `data.severity: "critical"` example. Narrowed the content-needed comments to three remaining open questions: whether `episode.severity_max` is also available, whether a mid-episode severity change triggers re-evaluation, and a placeholder for a future cross-link to rule authoring docs once they exist. **`create-configure-action-policy.md`** — Updated the match conditions example to lead with `episode.severity: "critical"` for PagerDuty routing, with `rule.tags` as a secondary scoping example. Narrowed the content-needed comment to the backward-compatibility question of whether `data.severity` on legacy rules should be documented as an alternative or removed from guidance. **`notifications-actions.md`** — Updated the "Why policies are separate from rules" example from `data.severity: "critical"` to `episode.severity: "critical"`. This was the one remaining reference to `data.severity` as a severity matcher outside the reference table. The full ES|QL severity authoring contract (column name, case-insensitivity, silent-ignore behavior) belongs in rule authoring docs. Those docs don't exist yet (#6689 notes this). A comment in the `episode.severity` table row flags this as a future cross-link target. ### Matcher context `rule.*` fields reduced to id, name, and tags (#6836) PR #271768 simplified `MatcherContextRule` to expose only `id`, `name`, and `tags`. The fields `rule.description`, `rule.enabled`, `rule.createdAt`, and `rule.updatedAt` were removed from the matcher context. The same PR added `rule.name`, `rule.tags`, and `rule.description` as payload variables in single-step workflow templates. **`action-policy-reference.md`** — Removed `rule.description` and `rule.enabled` rows from the match conditions field reference table. Both were previously listed as available matcher fields; both were removed in this PR. 
`rule.id`, `rule.name`, and `rule.tags` remain. The workflow payload variable additions (`{{rule.name}}`, `{{rule.tags}}`, `{{rule.description}}`) are not documented here — no workflow payload template docs exist yet. When those docs are written, they should document these three variables with example template snippets such as `Alert from rule: {{rule.name}}`. ### Notification policy form: Workflows enablement prerequisite and UI labels (#6843) PR #260796 introduced a warning callout in the notification policy form when `workflows:ui:enabled` is disabled on Stack, directing users to Advanced Settings to enable it. In 9.4, workflows was enabled by default, so this does not need to be documented as a prerequisite. **`workflows-alerting.md`** — Added a note inside the existing important callout explaining that on Stack, workflows must be enabled before they can be used as destinations in action policies. Includes the path (**Stack Management > Advanced Settings**) and the setting key (`workflows:ui:enabled`). UI label changes from this PR ("Policy scope" section header, "Create a workflow" link in the destination field) are omitted — these are unstable IA details that will be documented once the UI stabilizes post-preview. ### Creator and updater display names in the policy list (#6845) PR #268426 standardized Alerting v2 on user profile UIDs for `createdBy` and `updatedBy` fields, resolving them to display names on read via the Kibana user profiles API. The policy list page and details flyout now show the full display name of the creator or last updater. **`manage-action-policies.md`** — Added a sentence noting that the policy list shows the display name of the user who created the policy and the user who last updated it. The API schema breaking change (`createdByUsername`/`updatedByUsername` fields removed in 9.5.0) is not documented here — no public API reference exists yet for the Alerting v2 action policy API, so there is nothing to update.
When that reference is written, it should document `createdBy`/`updatedBy` as user profile UIDs and note the removal as a breaking change for pre-release integrations. ## Generative AI disclosure 1. Did you use a generative AI (GenAI) tool to assist in creating this contribution? - [x] Yes - Cursor + Claude - [ ] No --- .../action-policy-reference.md | 21 +++++++------------ .../create-configure-action-policy.md | 4 ++-- .../action-policies/manage-action-policies.md | 2 +- .../notifications-actions.md | 2 +- .../workflows-alerting.md | 1 + 5 files changed, 13 insertions(+), 17 deletions(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md index 2bb2c0d73c..617bb26ce6 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md @@ -14,29 +14,24 @@ Action policies are part of the {{alerting-v2-system}} in {{kib}}. This page is ## Match conditions fields [matcher-fields] -Use these fields in the **Match conditions** expression to filter which alert episodes a policy applies to. Combine them with standard [KQL](../../../query-filter/languages/kql.md) operators, for example `data.severity: "critical" AND episode_status: "active"`. +Use these fields in the **Match conditions** expression to filter which alert episodes a policy applies to. Combine them with standard [KQL](../../../query-filter/languages/kql.md) operators, for example `episode.severity: "critical" AND episode_status: "active"`. | Field | Type | Description | Accepted values | Example | |---|---|---|---|---| | `episode_id` | string | Unique identifier of the alert episode. | Any string | `episode_id: "ep-001"` | | `episode_status` | string | Current lifecycle status of the alert episode. 
| `inactive`, `pending`, `active`, `recovering` | `episode_status: "active"` | +| `episode.severity` | string | Current severity of the alert episode. Populated when the rule's ES\|QL query includes a `severity` column whose value matches a supported level (case-insensitive). Unrecognized values are silently ignored and the field is absent. Not set during recovery. Use to route high-severity episodes to dedicated workflows. | `info`, `low`, `medium`, `high`, `critical` | `episode.severity: "critical" OR episode.severity: "high"` | | `group_hash` | string | Stable hash identifying the alert series the alert episode belongs to. | Any string | `group_hash: "abc123"` | | `last_event_timestamp` | string | ISO 8601 timestamp of the most recent event recorded for the alert episode. | ISO 8601 timestamp | `last_event_timestamp > "2026-01-01"` | | `rule.id` | string | Unique identifier of the rule that generated the alert episode. | Any string | `rule.id: "rule-001"` | | `rule.name` | string | Display name of the rule. | Any string | `rule.name: "High CPU"` | -| `rule.description` | string | Description of the rule. | Any string | `rule.description: *checkout*` | | `rule.tags` | string[] | Tags attached to the rule. Use to match alert episodes from rules with a specific tag. | Any string | `rule.tags: "payment-service"` | -| `rule.enabled` | boolean | Whether the rule is currently enabled. | `true`, `false` | `rule.enabled: true` | -| `data.*` | object | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. | Depends on rule type | `data.severity: "critical"` | +| `data.*` | object | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. Use for rule-specific fields not covered by the standard episode fields above. 
| Depends on rule type | `data.host.name: "web-01"` | - ## Notify per options [notification-grouping] @@ -60,7 +55,7 @@ Frequency controls how often the policy fires for a given alert episode or notif | At most once every… | Caps notifications at one per alert episode or notification group within the chosen interval, regardless of rule frequency. | You want to limit notification volume for noisy rules without missing new or ongoing issues. | | Every evaluation | Notifies on every rule evaluation. Can be noisy. Use sparingly and only with infrequent rule schedules. | You need a full audit trail of every evaluation, or the rule runs infrequently enough that noise isn't a concern. | - ### Frequency options for Episode [frequency-when-episode-per_episode] diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md index e53ffbcba7..b4cac2a949 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md @@ -41,9 +41,9 @@ Optional string labels you assign to a policy to categorize it or filter it on t An optional [KQL](../../../query-filter/languages/kql.md) expression that filters which alert episodes this policy applies to. An empty match condition matches every alert episode covered by the policy's scope. For a global policy, that means all alert episodes in the space. For a per-rule policy, it means all alert episodes from the associated rule. -Use match conditions to route different alert episodes to different policies, for example, one policy for `rule.tags: "payment-service"` alert episodes routed to PagerDuty and another for warnings routed to Slack. 
For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). +Use match conditions to route different alert episodes to different policies, for example, one policy for `episode.severity: "critical"` alert episodes routed to PagerDuty and another for lower-severity episodes routed to Slack. You can also scope by rule, such as `rule.tags: "payment-service"`, to apply a policy only to alert episodes from a set of related rules. For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). - ## Grouping and frequency [reduce-noise-grouping] diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md index 4f575fb2ee..caa34ba73e 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md @@ -14,7 +14,7 @@ Action policies are part of the {{alerting-v2-system}} in {{kib}}. This page cov ## View policy details -From the **Action policies** list, you can open a policy to see its full configuration, including match conditions, grouping mode, frequency, and destinations. You can also edit, clone, delete, enable, disable, snooze, or update its API key without leaving the list page. +From the **Action policies** list, you can open a policy to see its full configuration, including match conditions, grouping mode, frequency, and destinations. The list also shows the display name of the user who created the policy and the user who last updated it. You can also edit, clone, delete, enable, disable, snooze, or update its API key without leaving the list page. 
## Execution history diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md index 813e461497..9c773a6158 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md @@ -32,7 +32,7 @@ Because each action policy evaluates alert episodes independently, an episode th ## Why policies are separate from rules -Policies are independent of rules. A single global policy can cover alert episodes from many rules, so a policy matching `data.severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. +Policies are independent of rules. A single global policy can cover alert episodes from many rules, so a policy matching `episode.severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. When you do need routing that is specific to one rule, create a per-rule policy and bind it to that rule at creation. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md index bd1c2931b7..041eff0df0 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -14,6 +14,7 @@ Workflows are part of the {{alerting-v2-system}} in {{kib}}. 
Without a workflow :::{important} Before creating an action policy, make sure the workflows you want to use already exist in your space. For information on creating a workflow, refer to [Build your first workflow](../../workflows/get-started/build-your-first-workflow.md). + ::: ## Runtime execution order [runtime-execution-order] From d8c233e7bcde3ca36e5cffbf462123991239d38e Mon Sep 17 00:00:00 2001 From: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com> Date: Tue, 23 Jun 2026 21:11:58 -0400 Subject: [PATCH 13/14] [Alerting V2][Serverless & 9.5][M2] Action policy and workflow changes from June 22, 2026 (#7079) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary Updates action policy, workflow, and notification docs for the experimental alerting system with content from five M2 doc issues. Following the tech preview principle of accuracy over comprehensiveness, additions focus on stable concepts — architectural constraints, when to choose features, and how the system works. UI step-by-step procedures, unstable IA details, and unconfirmed API identifiers are deferred to post-preview docs. ### Action policies target only Alert mode rules (#6877) PR #272302 fixes a bug where rule quick filters in the action policy form surfaced all rule kinds instead of alert-kind rules only. **`notifications-actions.md`** — Integrated the Alert mode constraint into the opening paragraph so readers immediately understand the scope of action policies. The specific UI detail about quick filters is omitted; no existing docs described the incorrect behavior. **`action-policies/create-configure-action-policy.md`** — Added the Alert mode constraint as the opening sentence of the Policy type section so it reads as a foundational rule before the global/per-rule choice, not as an afterthought. 
### Action policy scope uses matcher expressions only (#7012) PR #272196 removes `type` and `ruleId` fields from action policy scope and replaces them with a matcher-based pattern. **`action-policies/create-configure-action-policy.md`** — Added a sentence to the Match conditions section clarifying that the matcher expression is the sole mechanism for scoping a policy. There are no separate rule type or rule ID selector fields. The existing docs already reflected the KQL-based approach; this change makes the constraint explicit for users who may have encountered older API references. ### Execution history search and outcome filter (#7011) PR #272914 adds a search bar and outcome filter to the Execution history Policies tab. **`action-policies/manage-action-policies.md`** — Added a conceptual sentence noting that execution history supports search by policy name, rule name, or saved-object ID, and filtering by outcome. Explains that the new-events count reflects active filters. UI control details are omitted. Also updated the section heading from "Enable and snooze" to "Enable, disable, and snooze" so the section is findable by scanning. ### Alert episode lifecycle triggers (#6873, #7006) PR #268915 registers an `episodeAssigned` workflow trigger. PR #272504 adds triggers for eight alert episode lifecycle events including activation, deactivation, snooze, acknowledge, assign, unassign, and tag. **`workflows-alerting.md`** — Restructured the page to document both integration pathways: action policy-driven workflows and event-driven lifecycle triggers. Added a new Alert episode lifecycle triggers section with a table of all eight trigger IDs, the common event payload fields (`episodeId`, `ruleId`, `spaceId`), an example condition, and guidance on when to use triggers vs action policies. Marked the section as Stack only from 9.5.0. 
A TODO comment flags a discrepancy between trigger ID prefixes (`alertingV2.*` in issue #6873 vs `alerting.*` in issue #7006) for engineering confirmation before publishing. ### Description field cleanup (all pages) Replaced `{{alerting-v2-system}}` with the literal string "experimental alerting system" in the `description` frontmatter field across all six pages in the set. The variable is not valid in that field. ### Scope statement improvements (all pages) Added or sharpened "This page covers..." statements on `notifications-actions.md`, `workflows-alerting.md`, and `action-policies/reduce-notification-noise.md` to explicitly tell readers what they will understand or be able to do after reading. --- ### Issues confirmed as already covered or out of scope No issues from this batch were already covered or out of scope. The trigger ID prefix discrepancy between issues #6873 and #7006 is flagged with a TODO comment in `workflows-alerting.md` and is not blocking publication of the stable concepts documented here. ## Generative AI disclosure 1. Did you use a generative AI (GenAI) tool to assist in creating this contribution? 
- [x] Yes - Cursor + Claude - [ ] No --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> --- .../action-policy-reference.md | 2 +- .../create-configure-action-policy.md | 10 ++-- .../action-policies/manage-action-policies.md | 8 +-- .../reduce-notification-noise.md | 8 +-- .../notifications-actions.md | 14 ++--- .../workflows-alerting.md | 40 +++++++++++--- .../triggers/event-driven-triggers.md | 53 ++++++++++++++++++- 7 files changed, 108 insertions(+), 27 deletions(-) diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md index 617bb26ce6..2e2075aa71 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md @@ -5,7 +5,7 @@ applies_to: serverless: experimental products: - id: kibana -description: "Grouping modes, frequency options, dispatch outcomes, and match conditions field reference for action policies in the {{alerting-v2-system}}." +description: "Grouping modes, frequency options, dispatch outcomes, and match conditions field reference for action policies in the experimental alerting system." 
--- # Action policy reference for the {{alerting-v2-system}} [action-policy-reference] diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md index b4cac2a949..d6c19a78c7 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md @@ -5,7 +5,7 @@ applies_to: serverless: experimental products: - id: kibana -description: "Create action policies in the {{alerting-v2-system}}, configure match conditions, Notify per, Frequency, and workflow destinations." +description: "Create action policies in the experimental alerting system, configure match conditions, Notify per, Frequency, and workflow destinations." --- # Create an action policy for the {{alerting-v2-system}} [create-manage-action-policies] @@ -25,10 +25,12 @@ For match conditions fields, grouping modes, frequency options, and dispatch out ## Policy type [policy-type] +Action policies only process alert episodes from rules running in Alert mode. Signals produced by rules running in Detect mode are not eligible for action policy evaluation. + An action policy can be global or per-rule: -- **Global**: Global policies apply to any alert episode in the space. Use a global policy when you want to route alert episodes from multiple rules. For example, a policy matching `rule.tags: "checkout"` applies to every rule with that tag. -- **Per-rule**: Per-rule policies are scoped to a single rule. Use a per-rule policy when notification routing is specific to one rule and you don't want it to affect other rules in the space. +- **Global** - Global policies apply to any alert episode in the space. Use a global policy when you want to route alert episodes from multiple rules. 
For example, a policy matching `rule.tags: "checkout"` applies to every rule with that tag. +- **Per-rule** - Per-rule policies are scoped to a single rule. Use a per-rule policy when notification routing is specific to one rule and you don't want it to affect other rules in the space. The policy type is set at creation and cannot be changed. If you need a different type, create a new policy. @@ -41,6 +43,8 @@ Optional string labels you assign to a policy to categorize it or filter it on t An optional [KQL](../../../query-filter/languages/kql.md) expression that filters which alert episodes this policy applies to. An empty match condition matches every alert episode covered by the policy's scope. For a global policy, that means all alert episodes in the space. For a per-rule policy, it means all alert episodes from the associated rule. +The match condition is the sole mechanism for scoping a policy beyond its base type. There are no separate rule type or rule ID selector fields. All scoping is done through this expression. For a global policy that should target a specific rule, use `rule.id: "my-rule-id"` or `rule.tags: "my-tag"` in the match condition. + Use match conditions to route different alert episodes to different policies, for example, one policy for `episode.severity: "critical"` alert episodes routed to PagerDuty and another for lower-severity episodes routed to Slack. You can also scope by rule, such as `rule.tags: "payment-service"`, to apply a policy only to alert episodes from a set of related rules. For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). :::{note} -Snoozing an alert episode differs from [snoozing an action policy](manage-action-policies.md#enable-and-snooze). When you snooze a policy, the dispatch mechanism is paused and every series the policy would process is silenced. 
When you snooze an alert episode, you target one specific series before policy matching runs, silencing it regardless of which policy would have handled it. Use policy snooze when you want to pause all notifications from a given policy, for example, during planned maintenance on a destination system. +Snoozing an alert episode differs from [snoozing an action policy](manage-action-policies.md#enable-disable-and-snooze). When you snooze a policy, the dispatch mechanism is paused and every series the policy would process is silenced. When you snooze an alert episode, you target one specific series before policy matching runs, silencing it regardless of which policy would have handled it. Use policy snooze when you want to pause all notifications from a given policy, for example, during planned maintenance on a destination system. ::: ## Related pages diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md index 9c773a6158..64e9ce2b0f 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md @@ -5,14 +5,14 @@ applies_to: serverless: experimental products: - id: kibana -description: "How {{alerting-v2-system}} action policies route alert episodes to notifications and actions." +description: "How experimental alerting system action policies route alert episodes to notifications and actions." --- # Notifications and actions for the {{alerting-v2-system}} -Action policies are part of the {{alerting-v2-system}} in {{kib}}. After a rule produces alert episodes, action policies decide whether and when to invoke workflows. Workflows are what actually send the notification or run the automation. +Action policies are part of the {{alerting-v2-system}} in {{kib}}. 
After a rule produces alert episodes in Alert mode, action policies decide whether and when to invoke workflows. Workflows are what actually send the notification or run the automation. Rules running in Detect mode produce signals, which are not processed by action policies. -This page explains how action policies work. For creating and configuring them step by step, refer to [Create and configure an action policy](action-policies/create-configure-action-policy.md). +This page covers how action policies gate alert episodes before invoking a workflow, the difference between global and per-rule policies, and how the dispatcher evaluates them on a continuous cycle. For creating and configuring them step by step, refer to [Create and configure an action policy](action-policies/create-configure-action-policy.md). ## What is an action policy [action-policies] @@ -20,9 +20,9 @@ An action policy is the gating layer between an alert episode and a workflow. It The three gates are suppression, match conditions, and frequency: -* **Suppression**: Suppression checks whether the alert episode should be silenced. Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](action-policies/reduce-notification-noise.md). -* **Match conditions**: Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. -* **Frequency**: Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. If a workflow was already invoked within the cooldown period, the episode waits. 
+* **Suppression** - Suppression checks whether the alert episode should be silenced. Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](action-policies/reduce-notification-noise.md). +* **Match conditions** - Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. +* **Frequency** - Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. If a workflow was already invoked within the cooldown period, the episode waits. If any gate stops the episode, the workflow is not invoked for that policy. @@ -64,7 +64,7 @@ Per-rule policies are bound to a specific rule at creation. They apply only to a {{kib}} runs a background process called the dispatcher that checks for eligible alert episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher runs on its own cycle, separate from the rule schedule. -For each enabled policy that is not snoozed, the dispatcher works through the following steps. +For each enabled policy that is not snoozed, the dispatcher works through the following steps: 1. **Gating:** Is the alert episode acknowledged, snoozed, or deactivated? If so, skip. Refer to [Reduce notification noise](action-policies/reduce-notification-noise.md) to learn more. 2. **Matcher:** Does the alert episode match the policy's KQL? If not, skip this policy. 
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md index 041eff0df0..156b6abd74 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -5,21 +5,27 @@ applies_to: serverless: experimental products: - id: kibana -description: "How workflows connect to the {{alerting-v2-system}} action policies and rule automation, and where to configure them." +description: "How workflows connect to the experimental alerting system through action policies and alert episode lifecycle triggers, and when to use each." --- # Connect workflows to the {{alerting-v2-system}} [connect-workflows-experimental-alerting-system] -Workflows are part of the {{alerting-v2-system}} in {{kib}}. Without a workflow attached, an action policy cannot act on an alert episode. [Workflows](../../workflows.md) are the delivery layer. They define what happens when an action policy decides to act, such as sending a message, calling a webhook, or triggering automation. Setting up a workflow is what connects the {{alerting-v2-system}} to the tools your team uses for incident response. +Workflows are part of the {{alerting-v2-system}} in {{kib}}. [Workflows](../../workflows.md) are the delivery layer. They define what happens when the {{alerting-v2-system}} decides to act, such as sending a message, calling a webhook, or triggering an automation. Setting up a workflow is what connects the {{alerting-v2-system}} to the tools your team uses for incident response. -:::{important} -Before creating an action policy, make sure the workflows you want to use already exist in your space. For information on creating a workflow, refer to [Build your first workflow](../../workflows/get-started/build-your-first-workflow.md). 
+This page covers how action policies drive workflow invocations at runtime, the available alert episode lifecycle triggers, and when to use each pathway. -::: +The {{alerting-v2-system}} connects to workflows through two pathways. -## Runtime execution order [runtime-execution-order] +- **Action policies** - Action policies evaluate active alert episodes on a continuous schedule and invoke workflows based on match conditions and frequency settings. +- **Alert episode lifecycle triggers** - Workflows are invoked when a specific state change occurs on an alert episode, such as when the alert episode is activated, assigned, or resolved. -After a rule runs, processing follows this sequence: +## How action policies invoke workflows [action-policy-driven-workflows] + +:::{important} +Action policies need a workflow to act on alert episodes. Without one, the policy has nowhere to send notifications or run automations. If you haven't created a workflow yet, [build your first workflow](../../workflows/get-started/build-your-first-workflow.md) before continuing. +::: + +After a rule runs, the system routes alert episodes to workflows through the following steps. ``` Rule → Alert episode → [Dispatcher] → Action policy → Workflow → Notification @@ -28,10 +34,28 @@ Rule → Alert episode → [Dispatcher] → Action policy → Workflow → Notif 1. A rule evaluates data on a schedule and writes a rule event. 2. In Alert mode, the rule event opens or updates an alert episode. 3. The dispatcher runs on a short interval, independently of the rule schedule, and picks up active alert episodes. -4. For each active alert episode, the dispatcher evaluates all enabled action policies. Each policy runs the episode through a sequence of gates: suppression, match conditions, grouping, and frequency. +4. For each active alert episode, the dispatcher evaluates all enabled action policies. Each policy runs the episode through suppression, match conditions, grouping, and frequency gates. 5. 
For policies where the episode clears all gates, the dispatcher invokes the configured workflows. 6. Workflows deliver the notification or run the automation. +## Alert episode lifecycle triggers [alert-episode-lifecycle-triggers] + +Alert episode lifecycle triggers are a type of [event-driven trigger](../../workflows/triggers/event-driven-triggers.md) that start a workflow automatically when a specific event occurs. + +The {{alerting-v2-system}} emits a trigger event each time an alert episode changes state (for example, when it's activated, assigned to a user, acknowledged, or snoozed) and any workflow attached to that trigger type runs immediately in response. + +For a list of available triggers and event payload fields, refer to [Alert episode lifecycle triggers](../../workflows/triggers/event-driven-triggers.md). + +## Choosing between lifecycle triggers and action policies [choosing-lifecycle-triggers-action-policies] + +If you're unsure whether to use lifecycle triggers or action policies, the following table compares when each option is a good fit. Both can run different workflows simultaneously and coexist without conflict. + +| | Action policies | Lifecycle triggers | +|---|---|---| +| **How they run** | Evaluate alert episodes on a continuous schedule | React immediately to a specific state change | +| **Frequency control** | Apply suppression, grouping, and frequency gates | Fire exactly once per state change, no gates to configure | +| **Best for** | Recurring notifications and escalation logic that runs as long as a problem persists | One-shot automations, such as opening a ticket when an episode is assigned or posting a message when it resolves | + ## Next steps - [Create and configure an action policy](action-policies/create-configure-action-policy.md) to start routing alert episodes to workflows. 
diff --git a/explore-analyze/workflows/triggers/event-driven-triggers.md b/explore-analyze/workflows/triggers/event-driven-triggers.md index 9ef2ec27e3..9897e4a4bf 100644 --- a/explore-analyze/workflows/triggers/event-driven-triggers.md +++ b/explore-analyze/workflows/triggers/event-driven-triggers.md @@ -3,7 +3,7 @@ navigation_title: Event-driven triggers applies_to: stack: preview 9.4+ serverless: preview -description: Run a workflow in response to a platform event. Includes workflows.failed and the cases trigger family. +description: Run a workflow in response to a platform event. Includes workflows.failed, the cases trigger family, and alert episode lifecycle triggers. products: - id: kibana - id: cloud-serverless @@ -15,10 +15,11 @@ products: # Event-driven triggers [workflows-event-driven-triggers] -Event-driven triggers let workflows react to events elsewhere in {{kib}}. Two trigger families are available: +Event-driven triggers let workflows react to events elsewhere in {{kib}}. Three trigger families are available: - **`workflows.failed`** — Fires when another workflow's execution fails. {applies_to}`stack: preview 9.4+` {applies_to}`serverless: preview` - **Cases triggers** — Fire when cases change (created, updated, status changed, attachments added, comments added). {applies_to}`stack: preview 9.5+` {applies_to}`serverless: preview` +- **Alert episode lifecycle triggers** — Fire when an alert episode changes state in the experimental {{alerting-v2-system}}, such as when it is activated, assigned, acknowledged, or snoozed. {applies_to}`stack: experimental 9.5+` {applies_to}`serverless: experimental` :::{warning} The event-driven trigger system is in technical preview, including the triggers documented on this page. The schema and semantics can change in future releases. 
@@ -340,8 +341,55 @@ triggers: condition: 'event.owner: "securitySolution"' ``` +## Alert episode lifecycle triggers [alert-episode-lifecycle-triggers-event-driven] + +```{applies_to} +stack: experimental 9.5+ +serverless: experimental +``` + +Alert episode lifecycle triggers fire when an alert episode changes state in the experimental {{alerting-v2-system}}. Unlike `workflows.failed` and cases triggers, they are not configured through a `triggers` block in your workflow YAML. They are emitted by the alerting system and automatically invoke any workflow attached to the matching trigger type. Each trigger fires exactly once per state change. There is no polling interval or frequency gate. + +### Available triggers [alert-episode-lifecycle-triggers-available] + +| Trigger ID | When it fires | +|---|---| +| `alerting.episodeActivated` | An alert episode transitions to the active state. | +| `alerting.episodeDeactivated` | An alert episode is manually deactivated or recovers. | +| `alerting.episodeSnoozed` | An alert episode is snoozed. | +| `alerting.episodeUnsnoozed` | An alert episode is unsnoozed. | +| `alerting.episodeAcked` | An alert episode is acknowledged. | +| `alerting.episodeUnacked` | An alert episode acknowledgment is removed. | +| `alerting.episodeAssigned` | An alert episode is assigned to a user. | +| `alerting.episodeUnassigned` | An alert episode assignment is removed. | +| `alerting.episodeTagged` | A tag is applied to an alert episode. | + +### Event payload [alert-episode-lifecycle-triggers-event] + +All lifecycle triggers include these common fields in the event payload. + +| `event.*` field | Contains | +|---|---| +| `event.episodeId` | Unique identifier of the alert episode. | +| `event.ruleId` | ID of the rule that produced the alert episode. | +| `event.spaceId` | ID of the {{kib}} space where the event occurred. 
| + +Reference these fields with Liquid templating in workflow steps: + +```yaml +- name: log + type: console + with: + message: | + Episode {{ event.episodeId }} from rule {{ event.ruleId }} changed state. +``` + +Use these fields to write workflow conditions that scope the automation to specific rules or episodes. For example, use `event.ruleId: "my-rule-id"` to scope the workflow to alert episodes from a specific rule. + ## Prevent cascading handler loops +This section applies to `workflows.failed` handlers. Alert episode lifecycle triggers fire once per state change and do not re-trigger on workflow failure. + If a handler workflow itself fails, it can re-trigger itself. Two safeguards help you avoid infinite loops: - Every event includes `event.workflow.isErrorHandler`, which is `true` when the failing workflow is itself a handler. Filter on this in your handler's logic to skip handling your own failures. @@ -354,3 +402,4 @@ In practice, keep handler workflows simpler than the workflows they monitor. A h - [Triggers overview](/explore-analyze/workflows/triggers.md): All trigger types. - [Pass data and handle errors](/explore-analyze/workflows/authoring-techniques/pass-data-handle-errors.md): Per-step `on-failure` strategies complement event-driven handlers. - [Cases steps](/explore-analyze/workflows/steps/cases.md): Open cases from your handler. +- [Connect workflows to the {{alerting-v2-system}}](../../alerting/kibana-alerting-experimental/workflows-alerting.md): How action policies and alert episode lifecycle triggers connect workflows to the alerting system, and when to use lifecycle triggers versus action policies. 
From 661f27340e162e75ba44dbf13605b482deb2dfbc Mon Sep 17 00:00:00 2001 From: Nastasha Solomon <79124755+nastasha-solomon@users.noreply.github.com> Date: Wed, 24 Jun 2026 18:22:23 -0400 Subject: [PATCH 14/14] [Alerting V2][Serverless & 9.5][M2] Add action policies conceptual page, common scenarios, and nav restructure (#7105) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ## Summary This PR restructures the experimental alerting (alerting v2) notification and action policies documentation. The main goals are to separate conceptual content from how-to content, improve navigation, and add new pages that address common user questions about severity-based routing and notification behavior. ### Structural changes - **Splits `notifications-actions.md`** into a brief section landing page and a new dedicated conceptual page (`about-action-policies.md`), so that the landing page orients users and the conceptual content has its own focused home. - **Moves `workflows-alerting.md`** from a top-level sibling of `notifications-actions.md` to a child, grouping all notification-related content under one section in the sidebar. - **Adds `common-action-policy-scenarios.md`** — a new page with annotated, table-backed examples for three common scenarios: routing by severity, managing notifications across severity changes, and re-notifying for persistently active episodes. ### Content changes - **`notifications-actions.md`** — Rewritten as a brief landing page explaining the two-layer model (workflows + action policies) with a get-started flow and an in-section index. - **`about-action-policies.md`** (new) — Consolidates conceptual content: the three gates (suppression, match conditions, frequency), policy types (global vs per-rule), how the dispatcher evaluates policies, and a tip callout explaining how severity changes interact with policy matching. 
- **`workflows-alerting.md`** — Restructured with a new "How the alerting system connects to workflows" section containing the two pathways as subsections, with intro sentences added to each. - **`create-configure-action-policy.md`** — Adds a note in the Frequency section explaining the `on_status_change` + severity change interaction and when to use a time-based throttle instead. - **`action-policy-reference.md`**, **`reduce-notification-noise.md`**, **`create-configure-action-policy.md`** — Cross-links updated to reflect the new page structure. - `episode.severity` replaced with `severity` throughout the action policies pages. ### Navigation (`toc.yml`) ``` Notifications and actions ← section landing page ├── Connect workflows ├── About action policies ← new ├── Common scenarios ← new ├── Create an action policy ├── Action policy reference ├── Manage action policies └── Reduce notification noise ``` ## Generative AI disclosure 1. Did you use a generative AI (GenAI) tool to assist in creating this contribution? 
- [x] Yes - Cursor + Claude - [ ] No --- .../action-policies/about-action-policies.md | 81 ++++++++++++++ .../action-policy-reference.md | 13 +-- .../common-action-policy-scenarios.md | 104 ++++++++++++++++++ .../create-configure-action-policy.md | 21 ++-- .../reduce-notification-noise.md | 4 +- .../notifications-actions.md | 79 +++---------- .../workflows-alerting.md | 20 ++-- explore-analyze/toc.yml | 4 +- 8 files changed, 229 insertions(+), 97 deletions(-) create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/action-policies/about-action-policies.md create mode 100644 explore-analyze/alerting/kibana-alerting-experimental/action-policies/common-action-policy-scenarios.md diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/about-action-policies.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/about-action-policies.md new file mode 100644 index 0000000000..5c2ff530cb --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/about-action-policies.md @@ -0,0 +1,81 @@ +--- +navigation_title: About action policies +applies_to: + stack: experimental 9.5+ + serverless: experimental +products: + - id: kibana +description: "How action policies gate alert episodes through suppression, match conditions, and frequency before invoking workflows in the experimental alerting system." +--- + +# About action policies [about-action-policies] + +Action policies are part of the {{alerting-v2-system}} in {{kib}}. An action policy is the gating layer between an alert episode and a workflow. It decides whether and when to invoke a workflow by running the alert episode through a sequence of gates. A workflow runs only if the alert episode clears each gate in sequence. + +## Why policies are separate from rules [policies-separate-from-rules] + +Policies are independent of rules. 
A single global policy can cover alert episodes from many rules, so a policy matching `severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. + +When you do need routing that is specific to one rule, create a per-rule policy and bind it to that rule at creation. + +## How alert episodes are gated [action-policy-gates] + +The three gates are suppression, match conditions, and frequency: + +* **Suppression** - Suppression checks whether the alert episode should be silenced. Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](reduce-notification-noise.md). +* **Match conditions** - Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. +* **Frequency** - Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. If a workflow was already invoked within the cooldown period, the episode waits. + +If any gate stops the episode, the workflow is not invoked for that policy. + +:::{note} +Because each action policy evaluates alert episodes independently, an episode that is blocked by one policy can still trigger a workflow through a second policy with different conditions. +::: + +## How policy types differ [policy-types] + +Policies can be global or per-rule. Global policies apply across all rules in a space and suit most use cases. 
Per-rule policies apply to a single rule and give you precise control over routing for that rule without affecting anything else in the space. + +### Global policies + +A global policy applies to all alert episodes in the space, from any rule. When an alert episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. + +### Per-rule policies + +A per-rule policy is scoped to a single rule. It applies only to alert episodes produced by that specific rule. Use a per-rule policy when routing is specific to one rule and you do not want it to affect other rules in the space. The rule association is set at creation and cannot be changed. + +## How policies apply to rules [how-policies-apply] + +How a policy applies depends on whether it is global or per-rule. Multiple policies can match the same alert episode, and each runs independently. There is no precedence or merging between them. If no policy matches an alert episode, no workflow is invoked and no notification is sent. + +### Global policies + +Global policies don't reference rules directly. You scope them using KQL over alert episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching alert episode in the space, from any rule. + +### Per-rule policies + +Per-rule policies are bound to a specific rule at creation. They apply only to alert episodes from that rule, and you can still use match conditions to filter further within that rule's alert episodes. + +## How action policies are evaluated [how-action-policies-evaluated] + +{{kib}} runs a background process called the dispatcher that checks for eligible alert episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher runs on its own cycle, separate from the rule schedule. 
+ +For each enabled policy that is not snoozed, the dispatcher works through the following steps: + +1. **Gating:** Is the alert episode acknowledged, snoozed, or deactivated? If so, skip. Refer to [Reduce notification noise](reduce-notification-noise.md) to learn more. +2. **Matcher:** Does the alert episode match the policy's KQL? If not, skip this policy. +3. **Grouping:** How should matching alert episodes batch into notification groups? +4. **Frequency:** Has a workflow already been invoked for this notification group recently? If so, wait. +5. **Destinations:** Invoke the configured workflows. + +Workflow invocations may not happen immediately after a rule evaluates. + +:::{tip} +Severity changes can cause a policy to match an episode for the first time, which fires a notification even if the episode is not new. For example, if a policy is scoped to `severity: "critical"` and an episode escalates from `low` to `critical`, the policy fires because it has no prior notification record for that episode. However, a severity change alone does not re-trigger a policy that already matched the episode. Only a status change or the expiry of a time-based throttle can do that. For details and examples, refer to [Severity escalation and de-escalation](common-action-policy-scenarios.md#severity-escalation). +::: + +## Next steps + +- [Create and configure an action policy](create-configure-action-policy.md) to set up policy type, match conditions, grouping, frequency, and workflow destinations. +- [Manage action policies](manage-action-policies.md) to enable, disable, snooze, edit, or delete the policies in your space. +- [Action policy reference](action-policy-reference.md) to look up available match condition fields, grouping modes, and frequency options. 
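The dispatcher steps listed under "How action policies are evaluated" can be sketched as a small model. This is a minimal illustration, not {{kib}}'s implementation. It assumes **Notify per** is set to Episode and **Frequency** is set to On status change. Every name in it (`Episode`, `Policy`, `evaluate`, and the outcome strings) is hypothetical, and a plain Python predicate stands in for the KQL match condition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Episode:
    """Hypothetical stand-in for an alert episode."""
    episode_id: str
    status: str                      # for example "active" or "recovering"
    severity: Optional[str] = None
    silenced: bool = False           # acknowledged, snoozed, or deactivated


@dataclass
class Policy:
    """Hypothetical stand-in for an action policy."""
    name: str
    matches: Callable[[Episode], bool]   # stands in for the KQL match condition
    enabled: bool = True
    snoozed: bool = False
    # Last status this policy notified for, keyed by episode ID.
    last_notified_status: Dict[str, str] = field(default_factory=dict)


def evaluate(policy: Policy, episode: Episode) -> str:
    """Run one episode through one policy and return an illustrative outcome."""
    # Only enabled policies that are not snoozed are evaluated.
    if not policy.enabled or policy.snoozed:
        return "policy_skipped"
    # Step 1, gating: silenced episodes stop before any matching runs.
    if episode.silenced:
        return "gated"
    # Step 2, matcher: skip the policy if the episode is out of scope.
    if not policy.matches(episode):
        # Assumption: forgetting the record models a later re-match
        # firing as if it were the first match.
        policy.last_notified_status.pop(episode.episode_id, None)
        return "no_match"
    # Step 3, grouping: one notification group per episode in this sketch.
    # Step 4, frequency: with "On status change", wait unless the status
    # differs from the one this policy last notified for.
    if policy.last_notified_status.get(episode.episode_id) == episode.status:
        return "throttled"
    # Step 5, destinations: record the notification; a real dispatcher
    # would invoke the configured workflows here.
    policy.last_notified_status[episode.episode_id] = episode.status
    return "dispatched"


critical_only = Policy("page-on-critical", lambda e: e.severity == "critical")
episode = Episode("ep-001", status="active", severity="low")

print(evaluate(critical_only, episode))   # severity is low, so out of scope
episode.severity = "critical"
print(evaluate(critical_only, episode))   # first match for this policy
print(evaluate(critical_only, episode))   # same status, nothing new to report
```

Under these assumptions, the three calls return `no_match`, `dispatched`, and `throttled` in that order. This mirrors the tip above: escalating into a policy's scope fires a notification, and a repeat evaluation with no status change does not.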
diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md index 2e2075aa71..7f9141c6fb 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md @@ -14,13 +14,13 @@ Action policies are part of the {{alerting-v2-system}} in {{kib}}. This page is ## Match conditions fields [matcher-fields] -Use these fields in the **Match conditions** expression to filter which alert episodes a policy applies to. Combine them with standard [KQL](../../../query-filter/languages/kql.md) operators, for example `episode.severity: "critical" AND episode_status: "active"`. +Use these fields in the **Match conditions** expression to filter which alert episodes a policy applies to. Combine them with standard [KQL](../../../query-filter/languages/kql.md) operators, for example `severity: "critical" AND episode_status: "active"`. | Field | Type | Description | Accepted values | Example | |---|---|---|---|---| | `episode_id` | string | Unique identifier of the alert episode. | Any string | `episode_id: "ep-001"` | | `episode_status` | string | Current lifecycle status of the alert episode. | `inactive`, `pending`, `active`, `recovering` | `episode_status: "active"` | -| `episode.severity` | string | Current severity of the alert episode. Populated when the rule's ES\|QL query includes a `severity` column whose value matches a supported level (case-insensitive). Unrecognized values are silently ignored and the field is absent. Not set during recovery. Use to route high-severity episodes to dedicated workflows. | `info`, `low`, `medium`, `high`, `critical` | `episode.severity: "critical" OR episode.severity: "high"` | +| `severity` | string | Current severity of the alert episode. 
Populated when the rule's ES\|QL query includes a `severity` column whose value matches a supported level (case-insensitive). Unrecognized values are silently ignored and the field is absent. Not set during recovery. Use to route high-severity episodes to dedicated workflows. | `info`, `low`, `medium`, `high`, `critical` | `severity: "critical" OR severity: "high"` | | `group_hash` | string | Stable hash identifying the alert series the alert episode belongs to. | Any string | `group_hash: "abc123"` | | `last_event_timestamp` | string | ISO 8601 timestamp of the most recent event recorded for the alert episode. | ISO 8601 timestamp | `last_event_timestamp > "2026-01-01"` | | `rule.id` | string | Unique identifier of the rule that generated the alert episode. | Any string | `rule.id: "rule-001"` | @@ -29,9 +29,7 @@ Use these fields in the **Match conditions** expression to filter which alert ep | `data.*` | object | Dynamic payload fields sent by the rule. Available fields depend on the rule type and configuration. Use for rule-specific fields not covered by the standard episode fields above. | Depends on rule type | `data.host.name: "web-01"` | ## Notify per options [notification-grouping] @@ -55,9 +53,6 @@ Frequency controls how often the policy fires for a given alert episode or notif | At most once every… | Caps notifications at one per alert episode or notification group within the chosen interval, regardless of rule frequency. | You want to limit notification volume for noisy rules without missing new or ongoing issues. | | Every evaluation | Notifies on every rule evaluation. Can be noisy. Use sparingly and only with infrequent rule schedules. | You need a full audit trail of every evaluation, or the rule runs infrequently enough that noise isn't a concern. | - - ### Frequency options for Episode [frequency-when-episode-per_episode] Available frequency options when you set **Notify per** to **Episode**. 
@@ -101,4 +96,4 @@ The dispatcher records each run with one of the following outcomes. To investiga - [Create and configure an action policy](create-configure-action-policy.md) to apply these fields and options when setting up a policy. - [Manage action policies in {{alerting-v2-system}}](manage-action-policies.md) to enable, disable, snooze, or audit your policies. -- [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md) to understand how action policies evaluate and gate alert episodes. \ No newline at end of file +- [About action policies](about-action-policies.md) to understand how action policies evaluate and gate alert episodes. \ No newline at end of file diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/common-action-policy-scenarios.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/common-action-policy-scenarios.md new file mode 100644 index 0000000000..f18a724986 --- /dev/null +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/common-action-policy-scenarios.md @@ -0,0 +1,104 @@ +--- +navigation_title: Common scenarios +applies_to: + stack: experimental 9.5+ + serverless: experimental +products: + - id: kibana +description: "Common action policy scenarios for the experimental alerting system, including routing by severity, handling severity escalation, and controlling re-notification." +--- + +# Common action policy scenarios [common-action-policy-scenarios] + +Action policies are part of the {{alerting-v2-system}} in {{kib}}. This page covers common situations you're likely to encounter when setting up action policies, and explains how to configure them to get the behavior you expect. + +## Route alert episodes to different workflows by severity [routing-by-severity] + +When your rules produce alert episodes at different severity levels, you'll often want to route them to different workflows. 
For example, you might page an on-call team for critical episodes while sending lower-severity episodes to a Slack channel for async review. + +To do this, create separate policies scoped to specific severity values using match conditions: + +| Field | Policy A | Policy B | +|---|---|---| +| **Policy type** | Global | Global | +| **Match conditions** | `severity: "critical"` | `severity: "low" OR severity: "medium" OR severity: "high"` | +| **Notify per** | Episode | Episode | +| **Frequency** | On status change | On status change | +| **Destinations** | PagerDuty workflow | Slack workflow | + +Each policy evaluates alert episodes independently: + +- An episode with `severity: "critical"` matches Policy A but not Policy B. +- An episode with `severity: "high"` matches Policy B but not Policy A. +- If an episode's severity changes mid-lifecycle, the policies that match it change accordingly. For example, if an episode escalates from `high` to `critical`, Policy A starts matching and Policy B stops matching. Policy A fires because it has no prior notification record for that episode. + +## Manage notifications across severity changes [severity-escalation] + +To manage notifications effectively when episode severity changes, you need to understand how policies match and re-match episodes as severity shifts. Whether a severity change fires a notification depends on whether the policy has matched the episode before and what frequency option is set. + +### Notify when an episode escalates into a new severity threshold + +Scope a policy to the severity level you want to be notified about. When an episode escalates into that severity level for the first time, the policy fires because it has no prior notification record for the episode. + +**Example:** Policy B is scoped to `severity: "critical"`. An episode starts at `low` severity, so Policy B does not match. 
When the episode escalates to `critical`, Policy B now matches and fires (regardless of the frequency setting) because it has never notified for this episode before. + +| Field | Value | +|---|---| +| **Policy type** | Global | +| **Match conditions** | `severity: "critical"` | +| **Notify per** | Episode | +| **Frequency** | On status change | +| **Destinations** | PagerDuty workflow | + +### Prevent duplicate notifications when severity changes within an existing match + +If a policy already matched an episode at a lower severity and the episode escalates, the policy does not automatically re-notify. With `On status change` frequency, a severity change alone does not count as a status change. + +**Example:** Policy A matches all episodes regardless of severity. It notified when the episode was `low`. The episode escalates to `critical`, but Policy A still matches and the status has not changed, only the severity has. The throttle blocks re-notification. To re-notify on escalation, use a time-based throttle or create separate policies per severity level as described in [Route alert episodes to different workflows by severity](#routing-by-severity). + +| Field | Value | +|---|---| +| **Policy type** | Global | +| **Match conditions** | (None, matches all episodes) | +| **Notify per** | Episode | +| **Frequency** | On status change | +| **Destinations** | Slack workflow | + +### Stop notifications when an episode de-escalates below a policy's threshold + +If an episode drops below a policy's severity threshold, the policy stops matching and sends no further notifications. If the episode later escalates back above the threshold, the policy fires again as if it were the first match. + +**Example:** Policy B is scoped to `severity: "critical"`. An episode de-escalates from `critical` to `high`. Policy B no longer matches and stops sending notifications. If the episode later escalates back to `critical`, Policy B fires again. 
+ +| Field | Value | +|---|---| +| **Policy type** | Global | +| **Match conditions** | `severity: "critical"` | +| **Notify per** | Episode | +| **Frequency** | On status change | +| **Destinations** | PagerDuty workflow | + +## Re-notify for persistently active episodes [controlling-re-notification] + +The `On status change` frequency option notifies once per status transition (for example, when an episode activates or resolves). This is efficient for reducing noise, but it means that a persistently active episode that only changes in severity won't re-trigger a notification. + +If you want re-notification for episodes that stay active without a status change, use a time-based throttle instead: + +- **`At most once every…`:** Re-notifies after the configured interval regardless of whether severity or status changed. For example, `1h` sends a follow-up notification every hour while the episode remains active and matched. +- **`On status change + repeat at interval`:** Notifies on status change and then repeats at the configured interval while the episode stays in the same status. + +**Example:** You want to be re-paged if a critical episode stays open for more than an hour. Set the policy frequency to `At most once every 1h`. The policy fires when the episode first matches and then again each hour until the episode resolves or no longer matches. + +| Field | Value | +|---|---| +| **Policy type** | Global | +| **Match conditions** | `severity: "critical"` | +| **Notify per** | Episode | +| **Frequency** | At most once every 1 hour | +| **Destinations** | PagerDuty workflow | + +## Related pages + +- [Action policy reference](action-policy-reference.md) to find descriptions of match condition fields, grouping modes, and frequency options. +- [About action policies](about-action-policies.md) to understand how action policies evaluate and gate alert episodes. 
+- [Create and configure an action policy](create-configure-action-policy.md) to learn how to set up policy type, match conditions, grouping, frequency, and workflow destinations. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md index d6c19a78c7..4220566194 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md @@ -17,12 +17,6 @@ Because policies are separate from rules, you can update notification behavior a For match conditions fields, grouping modes, frequency options, and dispatch outcomes, refer to [Action policy reference](action-policy-reference.md). - - ## Policy type [policy-type] Action policies only process alert episodes from rules running in Alert mode. Signals produced by rules running in Detect mode are not eligible for action policy evaluation. @@ -45,10 +39,7 @@ An optional [KQL](../../../query-filter/languages/kql.md) expression that filter The match condition is the sole mechanism for scoping a policy beyond its base type. There are no separate rule type or rule ID selector fields. All scoping is done through this expression. For a global policy that should target a specific rule, use `rule.id: "my-rule-id"` or `rule.tags: "my-tag"` in the match condition. -Use match conditions to route different alert episodes to different policies, for example, one policy for `episode.severity: "critical"` alert episodes routed to PagerDuty and another for lower-severity episodes routed to Slack. You can also scope by rule, such as `rule.tags: "payment-service"`, to apply a policy only to alert episodes from a set of related rules. 
For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). - - +Use match conditions to route different alert episodes to different policies, for example, one policy for `severity: "critical"` alert episodes routed to PagerDuty and another for lower-severity episodes routed to Slack. You can also scope by rule, such as `rule.tags: "payment-service"`, to apply a policy only to alert episodes from a set of related rules. For available fields and examples, refer to [Match conditions fields](action-policy-reference.md#matcher-fields). ## Grouping and frequency [reduce-noise-grouping] @@ -73,12 +64,18 @@ For detailed descriptions, frequency options, and examples for each mode, refer **Frequency** limits how often the policy can fire for a given notification group. The interval resets from the last time the policy fired, so successive notifications stay at least `interval` apart. Set a duration such as `1h` or `30m`. For available options by **Notify per** mode, refer to [Frequency](action-policy-reference.md#throttle-strategies). +:::{note} +`On status change` only re-notifies when the alert episode's status changes, not when its severity changes. If an episode escalates from `low` to `critical` but the policy already matched it and the status hasn't changed, the throttle blocks re-notification. + +To receive escalation notifications, either create separate policies scoped to specific severity levels, or use a time-based throttle such as `At most once every 1h` so the policy re-notifies after the interval regardless of severity or status changes. For examples, refer to [Controlling re-notification](common-action-policy-scenarios.md#controlling-re-notification). +::: + ## Destinations -One or more workflows to invoke when the policy matches. Use the search field to find and attach workflows. +One or more [workflows](../../../workflows.md) to invoke when the policy matches. 
Use the search field to find and attach workflows. ## Related pages - [Manage action policies in {{alerting-v2-system}}](manage-action-policies.md) to view, enable, disable, or snooze the policies you create. - [Action policy reference in {{alerting-v2-system}}](action-policy-reference.md) to look up match condition fields, grouping modes, and frequency options. -- [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md) to understand how action policies evaluate and gate alert episodes. +- [About action policies](about-action-policies.md) to understand how action policies evaluate and gate alert episodes. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md index b9861e7872..c68478bd1e 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/action-policies/reduce-notification-noise.md @@ -12,7 +12,7 @@ description: "How to reduce notification noise in the experimental alerting syst Acknowledge, snooze, and deactivate are part of the {{alerting-v2-system}} in {{kib}}. Each one silences notifications for an alert episode at a different scope. When an alert episode is silenced, the dispatcher stops processing it before any action policy matching, grouping, or frequency evaluation runs. -This page covers when to use each silencing mechanism and how the scope of an alert episode snooze differs from the scope of a policy snooze. For an overview of where this fits in the full dispatch cycle, refer to [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md). +This page covers when to use each silencing mechanism and how the scope of an alert episode snooze differs from the scope of a policy snooze. 
For an overview of where this fits in the full dispatch cycle, refer to [About action policies](about-action-policies.md). ## Silencing mechanisms [silencing-mechanisms] @@ -44,6 +44,6 @@ Snoozing an alert episode differs from [snoozing an action policy](manage-action - [View and manage alerts](../alerts/view-and-manage-alerts.md) to apply gating actions from the alerts table or episode detail page. - [{{alerting-v2-system}} alerts](../alerts.md) to understand alert episode lifecycle, series, and where alert data is stored. --> -- [Notifications and actions in {{alerting-v2-system}}](../notifications-actions.md) to learn how action policies route and throttle alert episodes after silencing. +- [About action policies](about-action-policies.md) to learn how action policies route and throttle alert episodes after silencing. - [Create and configure an action policy](create-configure-action-policy.md) to set up the policies that run after gating checks pass. - [Action policy reference](action-policy-reference.md) to look up match condition fields, grouping modes, and frequency options. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md index 64e9ce2b0f..a5e265e6c5 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/notifications-actions.md @@ -5,77 +5,32 @@ applies_to: serverless: experimental products: - id: kibana -description: "How experimental alerting system action policies route alert episodes to notifications and actions." +description: "How to set up notifications and actions for rules in the experimental alerting system using workflows and action policies." --- # Notifications and actions for the {{alerting-v2-system}} -Action policies are part of the {{alerting-v2-system}} in {{kib}}. 
After a rule produces alert episodes in Alert mode, action policies decide whether and when to invoke workflows. Workflows are what actually send the notification or run the automation. Rules running in Detect mode produce signals, which are not processed by action policies. +Rules in the {{alerting-v2-system}} don't send notifications directly. Instead, they produce alert episodes, and you use workflows and action policies to decide what happens next. -This page covers how action policies gate alert episodes before invoking a workflow, the difference between global and per-rule policies, and how the dispatcher evaluates them on a continuous cycle. For creating and configuring them step by step, refer to [Create and configure an action policy](action-policies/create-configure-action-policy.md). +- **Workflows** are the delivery layer. They define what happens when the system decides to act, such as sending a message, calling a webhook, or triggering an automation. +- **Action policies** are the gating layer. They evaluate active alert episodes on a continuous schedule and invoke workflows based on match conditions, grouping, and frequency settings. -## What is an action policy [action-policies] +## Get started -An action policy is the gating layer between an alert episode and a workflow. It decides whether and when to invoke a workflow by running the alert episode through a sequence of gates. A workflow runs only if the alert episode clears each gate in sequence. +To send notifications or run actions from an alerting v2 rule: -The three gates are suppression, match conditions, and frequency: +1. [Build a workflow](../../workflows/get-started/build-your-first-workflow.md) that defines the action to take. +2. [Create an action policy](action-policies/create-configure-action-policy.md) that routes alert episodes to that workflow. -* **Suppression** - Suppression checks whether the alert episode should be silenced. 
Episodes that are acknowledged, snoozed, or inside a maintenance window are stopped here and no workflow is invoked. For details on each mechanism and its scope, refer to [Reduce notification noise](action-policies/reduce-notification-noise.md). -* **Match conditions** - Match conditions filter which alert episodes the policy applies to. You define them using [KQL](../../query-filter/languages/kql.md). An empty match condition applies to all alert episodes within the policy's scope. -* **Frequency** - Frequency controls how often the policy can invoke its workflows for the same group of episodes, and how episodes batch before a workflow is invoked. Options are one notification per alert episode, one per notification group, or one digest for all matching episodes. If a workflow was already invoked within the cooldown period, the episode waits. - -If any gate stops the episode, the workflow is not invoked for that policy. - -:::{note} -Because each action policy evaluates alert episodes independently, an episode that is blocked by one policy can still trigger a workflow through a second policy with different conditions. +:::{tip} +If you need an action to fire exactly once in response to a specific alert episode state change (such as opening a ticket when an episode is assigned), use an alert episode lifecycle trigger instead of an action policy. Refer to [Connect workflows](workflows-alerting.md) for a comparison of both approaches. ::: -## Why policies are separate from rules - -Policies are independent of rules. A single global policy can cover alert episodes from many rules, so a policy matching `episode.severity: "critical"` applies regardless of which rule produced the alert episode. You can also update notification routing without touching any rule, and you can create rules without any action policy, which is useful for testing detection logic before wiring up notifications. 
- -When you do need routing that is specific to one rule, create a per-rule policy and bind it to that rule at creation. - -## Policy types [policy-types] - -Policies can be global or per-rule. Global policies apply across all rules in a space and suit most use cases. Per-rule policies apply to a single rule and give you precise control over routing for that rule without affecting anything else in the space. - -### Global policies - -A global policy applies to all alert episodes in the space, from any rule. When an alert episode is produced, the dispatcher evaluates all enabled global policies that are not snoozed. Global is the default type and suits most use cases. - -### Per-rule policies - -A per-rule policy is scoped to a single rule. It applies only to alert episodes produced by that specific rule. Use a per-rule policy when routing is specific to one rule and you do not want it to affect other rules in the space. The rule association is set at creation and cannot be changed. - -## How policies apply to rules - -How a policy applies depends on whether it is global or per-rule. Multiple policies can match the same alert episode, and each runs independently. There is no precedence or merging between them. If no policy matches an alert episode, no workflow is invoked and no notification is sent. - -### Global policy application - -Global policies don't reference rules directly. You scope them using KQL over alert episode and rule fields, for example `rule.tags: "checkout"` or `data.severity: "critical"`. A global policy applies to every matching alert episode in the space, from any rule. - -### Per-rule policy application - -Per-rule policies are bound to a specific rule at creation. They apply only to alert episodes from that rule, and you can still use match conditions to filter further within that rule's alert episodes. 
- -## How action policies are evaluated [how-action-policies-evaluated] - -{{kib}} runs a background process called the dispatcher that checks for eligible alert episodes on a short interval (around 5 seconds) and evaluates action policies against them. The dispatcher runs on its own cycle, separate from the rule schedule. - -For each enabled policy that is not snoozed, the dispatcher works through the following steps: - -1. **Gating:** Is the alert episode acknowledged, snoozed, or deactivated? If so, skip. Refer to [Reduce notification noise](action-policies/reduce-notification-noise.md) to learn more. -2. **Matcher:** Does the alert episode match the policy's KQL? If not, skip this policy. -3. **Grouping:** How should matching alert episodes batch into notification groups? -4. **Frequency:** Has a workflow already been invoked for this notification group recently? If so, wait. -5. **Destinations:** Invoke the configured workflows. - -Workflow invocations may not happen immediately after a rule evaluates. - -## Next steps +## In this section -- [Create and configure an action policy](action-policies/create-configure-action-policy.md) to set up policy type, match conditions, grouping, frequency, and workflow destinations. -- [Manage action policies](action-policies/manage-action-policies.md) to enable, disable, snooze, edit, or delete the policies in your space. -- [Action policy reference](action-policies/action-policy-reference.md) to look up available match condition fields, grouping modes, and frequency options. +- [Connect workflows](workflows-alerting.md) - How action policies and lifecycle triggers invoke workflows at runtime. +- [About action policies](action-policies/about-action-policies.md) - How action policies evaluate and gate alert episodes. +- [Create an action policy](action-policies/create-configure-action-policy.md) - Configure policy type, match conditions, grouping, frequency, and destinations. 
+- [Action policy reference](action-policies/action-policy-reference.md) - Available match condition fields, grouping modes, and frequency options. +- [Manage action policies](action-policies/manage-action-policies.md) - Enable, disable, snooze, edit, or delete policies. +- [Reduce notification noise](action-policies/reduce-notification-noise.md) - Suppress alerts using acknowledgment, snooze, and maintenance windows. diff --git a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md index 156b6abd74..4e11e40fa9 100644 --- a/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md +++ b/explore-analyze/alerting/kibana-alerting-experimental/workflows-alerting.md @@ -10,22 +10,20 @@ description: "How workflows connect to the experimental alerting system through # Connect workflows to the {{alerting-v2-system}} [connect-workflows-experimental-alerting-system] -Workflows are part of the {{alerting-v2-system}} in {{kib}}. [Workflows](../../workflows.md) are the delivery layer. They define what happens when the {{alerting-v2-system}} decides to act, such as sending a message, calling a webhook, or triggering an automation. Setting up a workflow is what connects the {{alerting-v2-system}} to the tools your team uses for incident response. +[Workflows](../../workflows.md) are part of the {{alerting-v2-system}} in {{kib}}. They are the delivery layer that defines what happens when the {{alerting-v2-system}} takes an action, such as sending a message, calling a webhook, or triggering an automation. Workflows are what allow your team's incident response tools to connect with the {{alerting-v2-system}}. This page covers how action policies drive workflow invocations at runtime, the available alert episode lifecycle triggers, and when to use each pathway. 
+## How the alerting system connects to workflows [connection-pathways]
+
 The {{alerting-v2-system}} connects to workflows through two pathways.
 
 - **Action policies** - Action policies evaluate active alert episodes on a continuous schedule and invoke workflows based on match conditions and frequency settings.
 - **Alert episode lifecycle triggers** - Workflows are invoked when a specific state change occurs on an alert episode, such as when the alert episode is activated, assigned, or resolved.
 
-## How action policies invoke workflows [action-policy-driven-workflows]
-
-:::{important}
-Action policies need a workflow to act on alert episodes. Without one, the policy has nowhere to send notifications or run automations. If you haven't created a workflow yet, [build your first workflow](../../workflows/get-started/build-your-first-workflow.md) before continuing.
-:::
+### Action policies [action-policy-driven-workflows]
 
-After a rule runs, the system routes alert episodes to workflows through the following steps.
+Action policies evaluate alert episodes on a continuous schedule and invoke workflows when an episode meets the configured conditions. After a rule runs, the system routes alert episodes to workflows through the following steps.
 
 ```
 Rule → Alert episode → [Dispatcher] → Action policy → Workflow → Notification
@@ -38,15 +36,15 @@ Rule → Alert episode → [Dispatcher] → Action policy → Workflow → Notif
 5. For policies where the episode clears all gates, the dispatcher invokes the configured workflows.
 6. Workflows deliver the notification or run the automation.
 
-## Alert episode lifecycle triggers [alert-episode-lifecycle-triggers]
+### Alert episode lifecycle triggers [alert-episode-lifecycle-triggers]
 
-Alert episode lifecycle triggers are a type of [event-driven trigger](../../workflows/triggers/event-driven-triggers.md) that start a workflow automatically when a specific event occurs.
+Lifecycle triggers start a workflow immediately in response to a specific state change on an alert episode, without any scheduling or gating. Alert episode lifecycle triggers are a type of [event-driven trigger](../../workflows/triggers/event-driven-triggers.md) that start a workflow automatically when a specific event occurs.
 
 The {{alerting-v2-system}} emits a trigger event each time an alert episode changes state (for example, when it's activated, assigned to a user, acknowledged, or snoozed) and any workflow attached to that trigger type runs immediately in response.
 
 For a list of available triggers and event payload fields, refer to [Alert episode lifecycle triggers](../../workflows/triggers/event-driven-triggers.md).
 
-## Choosing between lifecycle triggers and action policies [choosing-lifecycle-triggers-action-policies]
+### When to use action policies or lifecycle triggers [when-to-use-action-policies-lifecycle-triggers]
 
 If you're unsure whether to use lifecycle triggers or action policies, the following table compares when each option is a good fit. Both can run different workflows simultaneously and coexist without conflict.
 
@@ -59,5 +57,5 @@ If you're unsure whether to use lifecycle triggers or action policies, the follo
 ## Next steps
 
 - [Create and configure an action policy](action-policies/create-configure-action-policy.md) to start routing alert episodes to workflows.
-- [Notifications and actions in {{alerting-v2-system}}](notifications-actions.md) to learn how action policies evaluate and gate alert episodes before invoking a workflow.
+- [About action policies](action-policies/about-action-policies.md) to learn how action policies evaluate and gate alert episodes before invoking a workflow.
 
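The dispatcher evaluation order described in this patch (gating, matcher, grouping, frequency, destinations) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the {{kib}} implementation: `Episode`, `Policy`, `dispatch`, `group_key`, and `min_interval_s` are hypothetical names, a plain Python predicate stands in for the policy's KQL matcher, and grouping is reduced to a single key per episode.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Episode:
    """Hypothetical alert episode; field names are illustrative only."""
    id: str
    group_key: str                 # assumed grouping attribute
    fields: Dict[str, str]
    acknowledged: bool = False
    snoozed: bool = False
    deactivated: bool = False


@dataclass
class Policy:
    """Hypothetical action policy; the matcher is a stand-in for KQL."""
    name: str
    matcher: Callable[[Episode], bool]
    min_interval_s: float          # assumed frequency setting
    workflows: List[str]
    enabled: bool = True
    snoozed: bool = False
    last_invoked: Dict[str, float] = field(default_factory=dict)


def dispatch(policies: List[Policy], episodes: List[Episode],
             now: float) -> List[Tuple[str, str, List[str]]]:
    """Run one dispatcher cycle and return (workflow, group, episode ids)."""
    invocations = []
    for policy in policies:
        # Only enabled policies that are not snoozed are evaluated.
        if not policy.enabled or policy.snoozed:
            continue
        groups: Dict[str, List[str]] = {}
        for ep in episodes:
            # 1. Gating: skip acknowledged, snoozed, or deactivated episodes.
            if ep.acknowledged or ep.snoozed or ep.deactivated:
                continue
            # 2. Matcher: skip episodes the policy does not match.
            if not policy.matcher(ep):
                continue
            # 3. Grouping: batch matching episodes into notification groups.
            groups.setdefault(ep.group_key, []).append(ep.id)
        for key, ids in groups.items():
            # 4. Frequency: wait if this group was invoked too recently.
            last = policy.last_invoked.get(key)
            if last is not None and now - last < policy.min_interval_s:
                continue
            # 5. Destinations: invoke every configured workflow.
            policy.last_invoked[key] = now
            for workflow in policy.workflows:
                invocations.append((workflow, key, ids))
    return invocations
```

Calling `dispatch` repeatedly with an increasing `now` mimics the dispatcher running on its own cycle, separate from the rule schedule. A group that was notified less than `min_interval_s` seconds ago is skipped until a later cycle, which is one reason a workflow invocation may not immediately follow a rule evaluation.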
diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml
index b01edf6032..6bacb804b8 100644
--- a/explore-analyze/toc.yml
+++ b/explore-analyze/toc.yml
@@ -382,9 +382,11 @@ toc:
           - file: report-and-share/reporting-troubleshooting-pdf.md
   - file: alerting.md
     children:
-      - file: alerting/kibana-alerting-experimental/workflows-alerting.md
       - file: alerting/kibana-alerting-experimental/notifications-actions.md
         children:
+          - file: alerting/kibana-alerting-experimental/workflows-alerting.md
+          - file: alerting/kibana-alerting-experimental/action-policies/about-action-policies.md
+          - file: alerting/kibana-alerting-experimental/action-policies/common-action-policy-scenarios.md
           - file: alerting/kibana-alerting-experimental/action-policies/create-configure-action-policy.md
           - file: alerting/kibana-alerting-experimental/action-policies/action-policy-reference.md
           - file: alerting/kibana-alerting-experimental/action-policies/manage-action-policies.md