Add: New azure-data-collection-rules skill for Azure Monitor DCR authoring #2575
ivkhrul wants to merge 2 commits
…oring

Adds a skill that guides authoring, editing, and validating Azure Monitor Data Collection Rules (DCRs), including:
- DCR JSON schema and structure guidance
- KQL ingestion-time transformation patterns (7-stage pipeline)
- Stream declarations and destination routing
- Direct ingestion DCR authoring via Log Ingestion API
- Custom table creation for Log Analytics
- Client-side and ingestion-side processors
- PowerShell helper scripts (validate, put, get, send-logs)
- Token-optimized reference files (all under 2000-token limit)

Covers single-stage and multi-stage transformation DCRs, direct ingestion, and custom log tables.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds an Azure Data Collection Rules (DCR) “skill” with supporting PowerShell utilities, reference documentation, and example DCR JSONs to help author, validate, deploy, and test ingestion (agent-based and Direct ingestion).
Changes:
- Introduces PowerShell scripts for DCR validation, deployment (PUT/GET), table schema management, and sending sample logs via the Log Ingestion API.
- Adds a comprehensive documentation set covering DCR kinds, schema, routing rules, processor heuristics, limits, and KQL transform patterns.
- Provides working example DCR JSONs for common scenarios (syslog filtering, Windows event split, perf aggregation, direct ingestion).
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| plugin/skills/azure-data-collection-rules/scripts/validate-dcr.ps1 | New pre-deploy validator for structural rules, routing rules, and limits. |
| plugin/skills/azure-data-collection-rules/scripts/send-logs.ps1 | New helper to acquire AAD token and POST sample log payloads via Log Ingestion API. |
| plugin/skills/azure-data-collection-rules/scripts/put-dcr.ps1 | New helper to deploy (create/update) DCR via Invoke-AzRestMethod. |
| plugin/skills/azure-data-collection-rules/scripts/get-table-schema.ps1 | New helper to fetch Log Analytics table schema via ARM. |
| plugin/skills/azure-data-collection-rules/scripts/get-dcr.ps1 | New helper to retrieve and optionally save an existing DCR. |
| plugin/skills/azure-data-collection-rules/scripts/create-custom-table.ps1 | New helper to create/update a custom Log Analytics table schema. |
| plugin/skills/azure-data-collection-rules/references/supported-tables.md | Documents standard tables that accept custom streams. |
| plugin/skills/azure-data-collection-rules/references/stream-declarations.md | Documents streamDeclarations rules and constraints. |
| plugin/skills/azure-data-collection-rules/references/processors-operations.md | Documents processor configs/semantics (filter/map/parse/aggregate/enrich/KQL). |
| plugin/skills/azure-data-collection-rules/references/processors-headers.md | Documents header processors and stage availability. |
| plugin/skills/azure-data-collection-rules/references/processor-heuristics-transforms.md | Heuristics for mapping intent to transforms/processors. |
| plugin/skills/azure-data-collection-rules/references/processor-heuristics-staging.md | Heuristics for choosing client vs ingestion stage and cost implications. |
| plugin/skills/azure-data-collection-rules/references/processor-heuristics-filters.md | Heuristics for native vs processor/KQL filtering. |
| plugin/skills/azure-data-collection-rules/references/procedure.md | End-to-end workflow for gathering requirements → author → validate → deploy → verify. |
| plugin/skills/azure-data-collection-rules/references/limits.md | Consolidated limits relevant to DCR/table authoring and ingestion. |
| plugin/skills/azure-data-collection-rules/references/la-tables.md | Reference for standard/custom tables and creating custom tables. |
| plugin/skills/azure-data-collection-rules/references/kql-transforms.md | KQL ingestion transform patterns and constraints. |
| plugin/skills/azure-data-collection-rules/references/direct-ingestion.md | Direct ingestion (kind=Direct) architecture, requirements, and examples. |
| plugin/skills/azure-data-collection-rules/references/destination-routing.md | Stream-to-table routing rules and constraints. |
| plugin/skills/azure-data-collection-rules/references/decision-guide.md | Scenario-to-approach quick mapping. |
| plugin/skills/azure-data-collection-rules/references/dcr-schema.md | DCR schema reference and constraints. |
| plugin/skills/azure-data-collection-rules/references/dcr-kinds.md | Guide to selecting DCR kind and available transformation sections. |
| plugin/skills/azure-data-collection-rules/examples/windows-events-split.json | Example: split Windows events into custom + standard flows. |
| plugin/skills/azure-data-collection-rules/examples/syslog-filter-drop.json | Example: syslog with native filters + client-side column drops. |
| plugin/skills/azure-data-collection-rules/examples/perf-counter-aggregation.json | Example: perf counter aggregation routed to custom table. |
| plugin/skills/azure-data-collection-rules/examples/direct-ingestion-custom-table.json | Example: Direct ingestion mapping into a custom table. |
| plugin/skills/azure-data-collection-rules/examples/custom-json-log.json | Example: JSON text logs + parse + drop + ingestion KQL. |
| plugin/skills/azure-data-collection-rules/SKILL.md | Skill manifest tying procedure + references together for discoverability. |
    $scope = [System.Web.HttpUtility]::UrlEncode("https://monitor.azure.com//.default")
    $tokenBody = "client_id=$AppId&scope=$scope&client_secret=$AppSecret&grant_type=client_credentials"
Fixed in be68179. Scope URL corrected to single slash.
    Add-Type -AssemblyName System.Web
    $scope = [System.Web.HttpUtility]::UrlEncode("https://monitor.azure.com//.default")
    $tokenBody = "client_id=$AppId&scope=$scope&client_secret=$AppSecret&grant_type=client_credentials"
    $tokenHeaders = @{ "Content-Type" = "application/x-www-form-urlencoded" }
Fixed in be68179. Replaced manual URL encoding with hashtable -Body (PowerShell handles form-encoding natively). Also removed Add-Type -AssemblyName System.Web.
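For readers following the thread, here is a minimal sketch of the hashtable-body pattern this reply describes. It is not the code from be68179: `New-TokenRequest` is a hypothetical helper name, and the token endpoint is assumed to be the standard Microsoft identity platform v2.0 client-credentials URL.

```powershell
# Hypothetical helper (not from the PR): builds splattable arguments for the token request.
function New-TokenRequest {
    param(
        [Parameter(Mandatory)][string]$TenantId,
        [Parameter(Mandatory)][string]$AppId,
        [Parameter(Mandatory)][string]$AppSecret
    )
    @{
        Method = 'Post'
        # Assumed endpoint: Microsoft identity platform v2.0 token URL.
        Uri    = "https://login.microsoftonline.com/$TenantId/oauth2/v2.0/token"
        # A hashtable body on a POST is sent as application/x-www-form-urlencoded,
        # so no manual UrlEncode call or System.Web assembly is needed.
        Body   = @{
            client_id     = $AppId
            scope         = 'https://monitor.azure.com/.default'  # single slash
            client_secret = $AppSecret
            grant_type    = 'client_credentials'
        }
    }
}

# Usage (needs network access and a real app registration):
# $request = New-TokenRequest -TenantId $TenantId -AppId $AppId -AppSecret $AppSecret
# $token   = (Invoke-RestMethod @request).access_token
```

Keeping the request construction separate from the network call makes the scope string and body fields easy to check without contacting the token endpoint.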
    } catch {
        $statusCode = $_.Exception.Response.StatusCode.value__
        Write-Error "Failed to send data. Status: $statusCode. Error: $_"
        exit 1
    }
Fixed in be68179. Status code extraction is now conditional: if ($_.Exception.Response) { ... } else { 'N/A' }.
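A minimal sketch of the conditional extraction described in this reply, assuming strict mode is off so that a missing `Response` property evaluates to `$null`. `Get-StatusCodeText` is a hypothetical helper name, and the thrown string stands in for a DNS or TLS failure that never produces an HTTP response.

```powershell
# Hypothetical helper (not from the PR): returns the numeric HTTP status, or 'N/A'
# when the failure happened before any HTTP response existed.
function Get-StatusCodeText {
    param([Parameter(Mandatory)]$ErrorRecord)
    $response = $ErrorRecord.Exception.Response
    if ($response) { [int]$response.StatusCode } else { 'N/A' }
}

try {
    # Stand-in for a transport-level failure: the exception has no Response property.
    throw 'simulated DNS failure'
} catch {
    $statusCode = Get-StatusCodeText -ErrorRecord $_
    Write-Warning "Failed to send data. Status: $statusCode. Error: $_"
}
```

The sketch uses `Write-Warning` only so the demonstration keeps running; the script under review uses `Write-Error` followed by `exit 1`.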
    if (-not $streamName.StartsWith('Custom-') -and -not $streamName.StartsWith('Microsoft-')) {
        $errors += "Stream '$streamName' must start with 'Custom-' or 'Microsoft-'"
    }
Fixed in be68179. Microsoft-* keys in streamDeclarations now produce a hard error. Custom-* is the only valid prefix.
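The stricter rule could look roughly like the sketch below. `Test-StreamDeclarationNames` and its error wording are hypothetical and are not taken from validate-dcr.ps1.

```powershell
# Hypothetical helper (not from the PR): every streamDeclarations key must use the
# 'Custom-' prefix; anything else, including 'Microsoft-*', is reported as an error.
function Test-StreamDeclarationNames {
    param([Parameter(Mandatory)][string[]]$StreamNames)
    $errors = @()
    foreach ($streamName in $StreamNames) {
        if (-not $streamName.StartsWith('Custom-')) {
            $errors += "Stream declaration '$streamName' must start with 'Custom-'"
        }
    }
    $errors
}

# In a parsed DCR the names would typically come from something like
# $dcr.properties.streamDeclarations.PSObject.Properties.Name (assumption).
$errors = @(Test-StreamDeclarationNames -StreamNames 'Custom-MyLogs', 'Microsoft-Syslog')
```

Wrapping the call in `@()` keeps `$errors` an array even when the function emits nothing.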
    $supportedStandardTables = @(
        'ABAPAuditLog','ABAPAuthorizationDetails','ABAPChangeDocsLog','ABAPUserDetails',
        'ADAssessmentRecommendation','ADSecurityAssessmentRecommendation','Anomalies',
        'ASimAuditEventLogs','ASimAuthenticationEventLogs','ASimDhcpEventLogs','ASimDnsActivityLogs',
        'ASimFileEventLogs','ASimNetworkSessionLogs','ASimProcessEventLogs','ASimRegistryEventLogs',
        'ASimUserManagementActivityLogs','ASimWebSessionLogs',
        'AWSALBAccessLogs','AWSCloudTrail','AWSCloudWatch','AWSEKS','AWSELBFlowLogs','AWSGuardDuty',
        'AWSNetworkFirewallAlert','AWSNetworkFirewallFlow','AWSNetworkFirewallTls','AWSNLBAccessLogs',
        'AWSRoute53Resolver','AWSS3ServerAccess','AWSSecurityHubFindings','AWSVPCFlow','AWSWAF',
        'AzureAssessmentRecommendation','AzureMetricsV2','CommonSecurityLog',
        'CrowdStrikeAlerts','CrowdStrikeCases','CrowdStrikeDetections','CrowdStrikeHosts',
        'CrowdStrikeIncidents','CrowdStrikeVulnerabilities',
        'DeviceTvmSecureConfigurationAssessmentKB','DeviceTvmSoftwareVulnerabilitiesKB',
        'DnsAuditEvents','Event',
        'ExchangeAssessmentRecommendation','ExchangeOnlineAssessmentRecommendation',
        'GCPApigee','GCPAuditLogs','GCPCDN','GCPCloudRun','GCPCloudSQL','GCPComputeEngine',
        'GCPDNS','GCPFirewallLogs','GCPIAM','GCPIDS','GCPMonitoring','GCPNAT','GCPNATAudit',
        'GCPResourceManager','GCPVPCFlow','GKEAPIServer','GKEApplication','GKEAudit',
        'GKEControllerManager','GKEHPADecision','GKEScheduler','GoogleCloudSCC','GoogleWorkspaceReports',
        'IlumioInsights','OTelLogs','QualysKnowledgeBase',
        'Rapid7InsightVMCloudAssets','Rapid7InsightVMCloudVulnerabilities',
        'SCCMAssessmentRecommendation','SCOMAssessmentRecommendation','SecurityEvent',
        'SfBAssessmentRecommendation','SfBOnlineAssessmentRecommendation',
        'SharePointOnlineAssessmentRecommendation','SPAssessmentRecommendation','SQLAssessmentRecommendation',
        'Syslog','ThreatIntelIndicators','ThreatIntelligenceIndicator','ThreatIntelObjects',
        'UCClient','UCClientReadinessStatus','UCClientUpdateStatus','UCDeviceAlert',
        'UCDOAggregatedStatus','UCDOStatus','UCServiceUpdateStatus','UCUpdateAlert',
        'WindowsClientAssessmentRecommendation','WindowsEvent','WindowsServerAssessmentRecommendation'
    )
Fixed in be68179. Created references/supported-tables.json as the single source of truth. The validator now loads from this file at runtime, so docs and script stay consistent.
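A rough sketch of loading the list at runtime. The shape of supported-tables.json is not shown in this thread, so the sketch assumes a flat JSON array of table names and writes a small stand-in file to the temp directory instead of reading the real one.

```powershell
# Assumption: the JSON file is a flat array of table names. The real file's shape
# is not visible in this thread, so a three-entry stand-in is created here.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'supported-tables-demo.json'
'["Event","Syslog","SecurityEvent"]' | Set-Content -Path $path

# Read the whole file as one string, then parse it into an array of names.
$supportedStandardTables = Get-Content -Path $path -Raw | ConvertFrom-Json

# -contains is case-insensitive for strings in PowerShell.
$isSupported   = $supportedStandardTables -contains 'Syslog'
$isUnsupported = $supportedStandardTables -contains 'Heartbeat'

Remove-Item -Path $path
```

In a script such as the validator, the path would more plausibly be resolved relative to `$PSScriptRoot` so the script and the reference file travel together; that detail is also an assumption.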
    # DCR Structure Limits
    $dataSourceTypes = @('syslog','windowsEventLogs','performanceCounters','logFiles','iisLogs','extensions')

    ### Authentication

    Token audience (scope): `https://monitor.azure.com/.default`
Fixed in be68179. Code sample now uses hashtable body with correct single-slash scope, matching the prose.
    | 2 | **Custom stream → custom table** always works. Table name must end with `_CL`. | `Custom-MyLogs` → `Custom-MyLogs_CL` |
    | 3 | **Custom stream → supported standard table** works only for tables on the [supported list](#standard-tables-accepting-custom-streams). | `Custom-MyEvents` → `Microsoft-Event` (Event is on the list) |
    | 4 | **Custom stream → unsupported standard table** is **not allowed**. Route to a custom table instead (`Custom-*_CL`). In direct ingestion, only custom streams are available, so this is the only option. | Cannot send `Custom-X` to a table not on the supported list → create a custom table |
    | 5 | **Standard stream → custom table** requires `outputStream` set to `Custom-{Table}_CL`. A `transformKql` may not be required (even a pass-through `"source"` might be unnecessary). **TODO: validate whether `transformKql` is needed or if `outputStream` alone suffices.** | Split syslog: `Microsoft-Syslog` + `outputStream: "Custom-SyslogArchive_CL"` |
Fixed in be68179. Confirmed behavior: transformKql is required for standard-to-custom routing. TODO removed; rule 5 now states this definitively.
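To make the resolved rule concrete, here is a sketch of a `dataFlows` entry for the syslog-archive example from rule 5, built as a PowerShell object and serialized to JSON. The destination name `myWorkspace` is hypothetical, and the sketch assumes a pass-through `source` is enough to satisfy the `transformKql` requirement.

```powershell
# Sketch of one dataFlows entry routing a standard stream to a custom table.
$dataFlow = [ordered]@{
    streams      = @('Microsoft-Syslog')
    destinations = @('myWorkspace')            # hypothetical destination name
    transformKql = 'source'                    # assumed: pass-through transform is sufficient
    outputStream = 'Custom-SyslogArchive_CL'   # custom table stream from rule 5's example
}

# Serialize for inclusion under properties.dataFlows in the DCR JSON.
$json = $dataFlow | ConvertTo-Json -Depth 3
$json
```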
    [Parameter(Mandatory)][string]$TenantId,
    [Parameter(Mandatory)][string]$AppId,
    [Parameter(Mandatory)][string]$AppSecret,
Acknowledged. Keeping as plain string intentionally. This script is designed to be invoked by the Copilot agent runtime, which passes parameters as strings. SecureString would break the automated invocation flow. Users running the script manually should use environment variables or a secret manager to avoid shell history exposure.
…ble list, validation fixes

- send-logs.ps1: Replace Add-Type System.Web with hashtable body (PS7+ cross-platform)
- send-logs.ps1: Fix double-slash in monitor.azure.com scope URL
- send-logs.ps1: Handle null Response in catch block (DNS/TLS failures)
- validate-dcr.ps1: Load supported tables from centralized supported-tables.json
- validate-dcr.ps1: Reject Microsoft-* keys in streamDeclarations (hard error)
- validate-dcr.ps1: Remove unused dataSourceTypes variable
- direct-ingestion.md: Fix double-slash scope URL and use hashtable auth pattern
- destination-routing.md: Resolve TODO on rule 5 (transformKql is required)
- references/supported-tables.json: New centralized table list (single source of truth)
Summary
Adds a new skill azure-data-collection-rules that guides authoring, editing, and validating Azure Monitor Data Collection Rules (DCRs).
What's included
Capabilities
Token compliance
Checklist