From e4b07299325da04bde17624b8b27dc74f9f78227 Mon Sep 17 00:00:00 2001
From: Jiahui Wu
Date: Sat, 6 Jun 2026 17:30:02 +0800
Subject: [PATCH] docs: add alert triage suppression gates

---
 skills/secops/alert-triage/SKILL.md           | 75 ++++++++++++++++++
 .../tests/dedup-suppression-edge-cases.md     | 76 +++++++++++++++++++
 2 files changed, 151 insertions(+)
 create mode 100644 skills/secops/alert-triage/tests/dedup-suppression-edge-cases.md

diff --git a/skills/secops/alert-triage/SKILL.md b/skills/secops/alert-triage/SKILL.md
index 927e7d68..87df03dd 100644
--- a/skills/secops/alert-triage/SKILL.md
+++ b/skills/secops/alert-triage/SKILL.md
@@ -104,6 +104,53 @@ Connect the alert data with surrounding context to build a picture of what happe
 | Lateral Movement (TA0008) | Collection (TA0009), Exfiltration (TA0010) -- what was the objective? |
 | Command and Control (TA0011) | All tactics -- C2 implies an active intrusion; look for the full chain |
 
+### Phase 2.5: De-duplicate and Suppression Safety Check
+
+When an alert appears repetitive, noisy, or part of a burst, determine whether
+the alerts represent duplicate notifications for the same activity, a benign
+recurring pattern, or a distributed attack that should stay escalated. Do not
+recommend suppression until correlation, ownership, and blast-radius checks are
+complete.
+
+**De-duplication checks:**
+
+```
+TRIAGE-DEDUP-01: Multiple alerts share the same rule, entity, normalized event ID, and raw event reference
+TRIAGE-DEDUP-02: Alert storm spans multiple users, hosts, tenants, or regions and should not be collapsed blindly
+TRIAGE-DEDUP-03: Duplicate grouping key omits a critical field such as user, host, process, cloud account, or tenant
+TRIAGE-DEDUP-04: Prior disposition is reused without checking whether threat context, asset criticality, or rule logic changed
+```
+
+**Suppression safety checks:**
+
+```
+TRIAGE-SUP-01: Suppression request lacks owner, expiry date, scope, and rollback path
+TRIAGE-SUP-02: Proposed filter suppresses high-value assets, privileged users, or late-stage ATT&CK tactics
+TRIAGE-SUP-03: Benign True Positive rationale lacks change ticket, asset owner confirmation, or business justification
+TRIAGE-SUP-04: False Positive rationale identifies rule/data defect but no detection-engineering follow-up ticket
+TRIAGE-SUP-05: Suppression would hide correlated alerts in the same kill-chain window
+```
+
+**Batch triage evidence:**
+
+| Evidence | Required Detail | Why it matters |
+|---|---|---|
+| Grouping key | Rule, raw event ID, user, host, process, source/destination, cloud account or tenant | Prevents unrelated alerts from being merged |
+| Cardinality | Distinct users, hosts, IPs, tenants, geographies, and time buckets | Separates duplicates from attack spread |
+| Prior dispositions | Date, analyst, rationale, rule version, asset context | Avoids reusing stale conclusions |
+| Suppression proposal | Scope, owner, expiry, rollback, compensating detection, ticket ID | Keeps tuning auditable and reversible |
+| Kill-chain coverage | Related alerts before and after the proposed suppression window | Prevents hiding multi-stage activity |
+
+**Decision guidance:**
+
+| Situation | Triage action |
+|---|---|
+| Same raw event creates multiple tool notifications | De-duplicate in the case record; retain one evidence reference |
+| Same alert fires across many hosts/users in a short window | Treat as potential campaign; raise priority until benign cause is proven |
+| Recurring authorized admin activity with ticket and owner | Classify as BTP; recommend scoped, expiring suppression |
+| Parser or rule defect creates impossible entities | Classify as FP; open detection-engineering follow-up |
+| Filter would suppress privileged users or critical assets | Do not suppress without IR or detection owner approval |
+
 ### Phase 3: Classify
 
 Assign a disposition and priority based on collected and correlated data.
@@ -223,6 +270,16 @@ Produce the triage decision as a structured report:
 | **Confidence** | [High / Medium / Low] |
 | **Escalation Required** | [Yes -- to IR team / Yes -- to Tier 2 / No] |
 
+### De-duplication and Suppression Review
+| Field | Value |
+|-------|-------|
+| Grouping Key | [rule + entities + raw event reference] |
+| Distinct Scope | [users / hosts / tenants / IPs / regions] |
+| Prior Disposition Reused? | [No / Yes with date, analyst, rule version] |
+| Suppression Recommended? | [No / Yes with owner, expiry, scope, rollback] |
+| Suppression Risk | [Would hide critical assets, privileged users, or kill-chain coverage?] |
+| Follow-up Ticket | [Detection engineering / asset owner / change ticket] |
+
 ### Evidence Summary
 1. [Key finding 1 -- what was observed]
 2. [Key finding 2 -- corroborating or contradicting evidence]
@@ -243,6 +300,9 @@ Produce the triage decision as a structured report:
 [If disposition is BTP or FP, describe the recommended rule tuning to prevent
 recurrence -- e.g., add filter for specific parent process, exclude known-good
 IP range, adjust threshold.]
+
+Include suppression owner, expiry date, rollback plan, and detection-engineering
+ticket when recommending any rule filter, threshold change, or allowlist entry.
 ```
 
 ---
@@ -319,6 +379,21 @@ Investigating an alert in isolation without checking for activity before and aft
 
 Waiting for complete certainty before escalating a high-priority alert costs response time. NIST SP 800-61 recommends erring on the side of over-notification. If 20 minutes of investigation has not resolved the disposition and the alert involves a critical asset or privileged account, escalate to Tier 2 or the IR team with your current findings and continue investigation in parallel.
 
+### Pitfall 6: Treating Alert Storms as Noise Before Checking Spread
+
+A burst of similar alerts can be tool duplication, but it can also be password
+spraying, lateral movement, malware propagation, or cloud credential abuse.
+Before closing or suppressing a batch, measure distinct users, hosts, tenants,
+source IPs, geographies, and ATT&CK stages. Raise priority if the alert storm
+crosses trust boundaries or affects privileged identities.
+
+### Pitfall 7: Creating Permanent Suppressions From One Triage Decision
+
+Suppression is a detection change, not just a case note. A tuning recommendation
+must include owner, expiry, exact scope, rollback path, and a follow-up ticket.
+Avoid broad filters that hide privileged users, critical assets, or correlated
+late-stage behavior.
+
 ---
 
 ## 8. Prompt Injection Safety Notice
diff --git a/skills/secops/alert-triage/tests/dedup-suppression-edge-cases.md b/skills/secops/alert-triage/tests/dedup-suppression-edge-cases.md
new file mode 100644
index 00000000..3ebcde25
--- /dev/null
+++ b/skills/secops/alert-triage/tests/dedup-suppression-edge-cases.md
@@ -0,0 +1,76 @@
+# Alert Triage De-duplication and Suppression Edge Cases
+
+These fixtures validate that alert-triage does not collapse alert storms or
+recommend permanent suppressions without ownership, scope, and correlation
+evidence.
+
+## Edge Case 1: Same Raw Event, Multiple Tool Notifications
+
+Input evidence:
+
+```yaml
+alerts:
+  - alert_id: siem-1001
+    rule: suspicious_powershell
+    raw_event_id: win-4688-abc
+    host: ws-17
+    user: alice
+  - alert_id: edr-9911
+    rule: suspicious_powershell
+    raw_event_id: win-4688-abc
+    host: ws-17
+    user: alice
+```
+
+Expected output:
+
+- Finding ID: `TRIAGE-DEDUP-01`
+- Alerts may be de-duplicated in the case record
+- Evidence retains one raw event reference and both alert IDs
+- Priority is based on behavior and context, not duplicate count alone
+
+## Edge Case 2: Password Spray Mistaken for Duplicate Noise
+
+Input evidence:
+
+```yaml
+rule: failed_login_threshold
+time_window_utc: "2026-06-06T08:00:00Z/2026-06-06T08:10:00Z"
+distinct_users: 184
+distinct_hosts: 1
+source_ip: 203.0.113.50
+prior_disposition: "false_positive"
+prior_disposition_date: "2025-12-01"
+```
+
+Expected output:
+
+- Finding ID: `TRIAGE-DEDUP-02` or `TRIAGE-DEDUP-04`
+- Do not reuse stale prior disposition
+- Treat as possible password spraying until benign cause is proven
+- Correlation checks include source IP reputation and successful logons after failures
+
+## Edge Case 3: Broad Suppression Request for Admin Activity
+
+Input evidence:
+
+```yaml
+disposition: BTP
+activity: authorized_admin_script
+affected_assets:
+  - domain_controller
+  - production_database
+proposed_suppression:
+  filter: "user_role = admin"
+  owner: null
+  expiry: null
+  rollback: null
+  ticket: null
+```
+
+Expected output:
+
+- Finding ID: `TRIAGE-SUP-01` and `TRIAGE-SUP-02`
+- Suppression is rejected or narrowed
+- Required evidence includes change ticket, owner, expiry, rollback path, and exact host/script scope
+- Escalate to detection owner before any filter affecting privileged users or critical assets
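
The three edge cases are written as prose expectations, so it may help to see how the gates could be checked mechanically. The following is a minimal Python sketch, not part of the patch, that assumes the fixture YAML has already been parsed into plain dictionaries. Only the finding IDs come from the patch. The function names, the `SENSITIVE_ASSETS` and `SENSITIVE_FILTER_TERMS` sets, the `distinct_tenants` and `distinct_regions` field names, and the "more than one distinct value" threshold are hypothetical choices made for illustration.

```python
# Hypothetical field lists; only the TRIAGE-* finding IDs come from the patch.
SUPPRESSION_FIELDS = ("filter", "owner", "expiry", "rollback")
SPREAD_FIELDS = ("distinct_users", "distinct_hosts",
                 "distinct_tenants", "distinct_regions")
# Assumed examples of high-value assets and privileged-scope filter terms.
SENSITIVE_ASSETS = {"domain_controller", "production_database"}
SENSITIVE_FILTER_TERMS = ("admin", "privileged")


def dedup_findings(alerts):
    """Flag TRIAGE-DEDUP-01 when every alert shares one full grouping key."""
    keys = {(a["rule"], a["host"], a["user"], a["raw_event_id"]) for a in alerts}
    return ["TRIAGE-DEDUP-01"] if len(alerts) > 1 and len(keys) == 1 else []


def spread_findings(batch):
    """Flag bursts that span several entities or carry a prior disposition."""
    findings = []
    # Assumption: more than one distinct value in any field counts as spread.
    if any(batch.get(field, 0) > 1 for field in SPREAD_FIELDS):
        findings.append("TRIAGE-DEDUP-02")
    # Assumption: any prior disposition must be re-verified before reuse.
    if batch.get("prior_disposition"):
        findings.append("TRIAGE-DEDUP-04")
    return findings


def suppression_findings(case):
    """Flag incomplete or overly broad suppression proposals."""
    proposal = case.get("proposed_suppression") or {}
    findings = []
    if any(not proposal.get(field) for field in SUPPRESSION_FIELDS):
        findings.append("TRIAGE-SUP-01")
    filter_text = (proposal.get("filter") or "").lower()
    hits_assets = bool(set(case.get("affected_assets", [])) & SENSITIVE_ASSETS)
    hits_users = any(term in filter_text for term in SENSITIVE_FILTER_TERMS)
    if hits_assets or hits_users:
        findings.append("TRIAGE-SUP-02")
    return findings


# Fixture data transcribed from Edge Cases 1 to 3.
edge_case_1 = [
    {"alert_id": "siem-1001", "rule": "suspicious_powershell",
     "raw_event_id": "win-4688-abc", "host": "ws-17", "user": "alice"},
    {"alert_id": "edr-9911", "rule": "suspicious_powershell",
     "raw_event_id": "win-4688-abc", "host": "ws-17", "user": "alice"},
]
edge_case_2 = {"rule": "failed_login_threshold", "distinct_users": 184,
               "distinct_hosts": 1, "prior_disposition": "false_positive"}
edge_case_3 = {
    "disposition": "BTP",
    "affected_assets": ["domain_controller", "production_database"],
    "proposed_suppression": {"filter": "user_role = admin", "owner": None,
                             "expiry": None, "rollback": None, "ticket": None},
}

if __name__ == "__main__":
    print(dedup_findings(edge_case_1))
    print(spread_findings(edge_case_2))
    print(suppression_findings(edge_case_3))
```

The grouping key in `dedup_findings` deliberately includes user, host, and raw event reference, since `TRIAGE-DEDUP-03` warns against keys that omit them. A real check would also need process, cloud account, or tenant where those fields exist, and the ticket requirement from `TRIAGE-SUP-03` is left out here to stay aligned with the expected output of Edge Case 3.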