diff --git a/.github/plugins/azure-skills/.claude-plugin/plugin.json b/.github/plugins/azure-skills/.claude-plugin/plugin.json index 7cd8adf0..20262485 100644 --- a/.github/plugins/azure-skills/.claude-plugin/plugin.json +++ b/.github/plugins/azure-skills/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "azure", "description": "Microsoft Azure MCP and Skills integration for cloud resource management, deployments, and Azure services. Manage your Azure infrastructure, monitor applications, and deploy resources directly from Claude Code.", - "version": "1.1.67", + "version": "1.1.68", "author": { "name": "Microsoft", "url": "https://www.microsoft.com" diff --git a/.github/plugins/azure-skills/.cursor-plugin/plugin.json b/.github/plugins/azure-skills/.cursor-plugin/plugin.json index 0be6c8df..2f4ddcf9 100644 --- a/.github/plugins/azure-skills/.cursor-plugin/plugin.json +++ b/.github/plugins/azure-skills/.cursor-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "azure", "description": "Microsoft Azure MCP and Skills integration for cloud resource management, deployments, and Azure services. Manage your Azure infrastructure, monitor applications, and deploy resources directly from Cursor.", - "version": "1.1.67", + "version": "1.1.68", "author": { "name": "Microsoft", "url": "https://www.microsoft.com" diff --git a/.github/plugins/azure-skills/.plugin/plugin.json b/.github/plugins/azure-skills/.plugin/plugin.json index 49a3f41e..9108cab4 100644 --- a/.github/plugins/azure-skills/.plugin/plugin.json +++ b/.github/plugins/azure-skills/.plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "azure", "description": "Microsoft Azure MCP and Skills integration for cloud resource management, deployments, and Azure services. 
Manage your Azure infrastructure, monitor applications, and deploy resources directly from your development environment.", - "version": "1.1.67", + "version": "1.1.68", "author": { "name": "Microsoft", "url": "https://www.microsoft.com" diff --git a/.github/plugins/azure-skills/CHANGELOG.md b/.github/plugins/azure-skills/CHANGELOG.md index 3efb1674..f5ad6512 100644 --- a/.github/plugins/azure-skills/CHANGELOG.md +++ b/.github/plugins/azure-skills/CHANGELOG.md @@ -1,5 +1,9 @@ # Changelog +## 1.1.68 + +- feat(azure-compute): add VM creation workflow with approval gate before deploy ([#2297](https://github.com/microsoft/GitHub-Copilot-for-Azure/pull/2297)) + ## 1.1.67 - feat: (azure-cost) add storage optimization guide and fix token limit ([#2554](https://github.com/microsoft/GitHub-Copilot-for-Azure/pull/2554)) diff --git a/.github/plugins/azure-skills/skills/azure-compute/SKILL.md b/.github/plugins/azure-skills/skills/azure-compute/SKILL.md index 851568c5..fce17d1f 100644 --- a/.github/plugins/azure-skills/skills/azure-compute/SKILL.md +++ b/.github/plugins/azure-skills/skills/azure-compute/SKILL.md @@ -1,63 +1,46 @@ --- name: azure-compute -description: "Azure VM and VMSS router for recommendations, pricing, autoscale, orchestration, connectivity troubleshooting, capacity reservations, and Essential Machine Management. WHEN: Azure VM, VMSS, scale set, recommend, compare, server, website, burstable, lightweight, VM family, workload, GPU, learning, simulation, dev/test, backend, autoscale, load balancer, Flexible orchestration, Uniform orchestration, cost estimate, connect, refused, Linux, black screen, reset password, reach VM, port 3389, NSG, troubleshoot, capacity reservation, CRG, reserve VMs, guarantee capacity, pre-provision capacity, CRG association, CRG disassociation, essential machine management, EMM, machine enrollment." +description: "Azure VM/VMSS router. 
WHEN: create / provision / deploy / spin-up VM, recommend VM size, compare VM pricing, VMSS, scale set, autoscale, burstable, lightweight server, website, backend, GPU, machine learning, HPC simulation, dev/test, workload, family, load balancer, Flexible orchestration, Uniform orchestration, cost estimate, can't connect / RDP / SSH, refused, black screen, reset password, reach VM, port 3389, NSG, security, Linux, troubleshoot, troubleshooting, connectivity, capacity reservation (CRG), reserve, guarantee capacity, pre-provision, CRG association, CRG disassociation, machine enrollment (EMM), Essential Machine Management, monitor. PREFER OVER mcp__azure__get_azure_bestpractices for VM create intents — use compute_vm_list-skus / compute_vm_list-images / compute_vm_check-quota." license: MIT metadata: author: Microsoft - version: "2.4.2" + version: "2.4.3" --- # Azure Compute Skill -Routes Azure VM requests to the appropriate workflow based on user intent. +Routes Azure VM and Virtual Machine Scale Set (VMSS) requests to the right workflow. 
## When to Use This Skill -Activate this skill when the user: -- Asks about Azure Virtual Machines (VMs) or VM Scale Sets (VMSS) -- Asks about choosing a VM, VM sizing, pricing, or cost estimates -- Needs a workload-based recommendation for scenarios like database, GPU, deep learning, HPC, web tier, or dev/test -- Mentions VM families, autoscale, load balancing, or Flexible versus Uniform orchestration -- Wants to troubleshoot Azure VM connectivity issues such as unreachable VMs, RDP/SSH failures, black screens, NSG/firewall issues, or credential resets -- Asks about Capacity Reservation Groups (CRGs), reserving VM capacity, associating/disassociating VMs with a CRG, or guaranteeing compute capacity -- Asks about Essential Machine Management (EMM), machine enrollment, onboarding VMs for monitoring/security, or enabling machine management at subscription level -- Uses prompts like "Help me choose a VM" +- User wants to **recommend, compare, or price** a VM or VMSS +- User wants to **create, provision, or deploy** a VM or VMSS +- User **can't connect** to a VM (RDP / SSH / port refused / black screen / password reset) +- User asks about **Capacity Reservation Groups** (CRG) — reserve, guarantee capacity, pre-provision +- User asks about **Essential Machine Management** (EMM) — machine enrollment, monitor + +**Disambiguate with `azure-prepare`:** if the user wants to deploy an **application** (Docker service, web app, API, serverless workload), route to `azure-prepare`. `vm-creator` is for **bare VM/VMSS infrastructure** only. ## Routing -```text -User intent? 
-├─ Recommend / choose / compare / price a VM or VMSS -│ └─ Route to [VM Recommender](workflows/vm-recommender/vm-recommender.md) -│ -├─ Can't connect / RDP / SSH / troubleshoot a VM -│ └─ Route to [VM Troubleshooter](workflows/vm-troubleshooter/vm-troubleshooter.md) -│ -├─ Capacity reservation / CRG / reserve capacity / associate VM with CRG -│ └─ Route to [Capacity Reservation](workflows/capacity-reservation/capacity-reservation.md) -│ -├─ Essential Machine Management / EMM / machine enrollment -│ └─ Route to [Essential Machine Management](workflows/essential-machine-management/essential-machine-management.md) -│ -└─ Unclear - └─ Ask: "Are you looking for a VM recommendation, troubleshooting a connectivity issue, managing capacity reservations, or enabling Essential Machine Management?" +``` +Azure compute intent? +├── Recommend / compare / price a VM or VMSS → VM Recommender +├── Create / provision / deploy a VM or VMSS → VM Creator +├── Can't connect / RDP / SSH / port refused → VM Troubleshooter +├── Reserve / guarantee capacity (CRG) → Capacity Reservation +├── Machine enrollment / Essential Machine Management → Essential Machine Management +└── Unclear → Ask which of the above ``` -| Signal | Workflow | -| ----------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ | -| "recommend VM", "which VM", "VM size", "VM pricing", "VMSS", "scale set" | [VM Recommender](workflows/vm-recommender/vm-recommender.md) | -| "can't connect", "RDP", "SSH", "NSG blocking", "reset password", "black screen" | [VM Troubleshooter](workflows/vm-troubleshooter/vm-troubleshooter.md) | -| "capacity reservation", "CRG", "reserve capacity", "guarantee capacity", "associate VM with CRG" | [Capacity Reservation](workflows/capacity-reservation/capacity-reservation.md) | -| "essential machine management", "EMM", "machine enrollment" | [Essential Machine 
Management](workflows/essential-machine-management/essential-machine-management.md) | - -> **Routing rule:** Always read the matched workflow file before accessing any reference files. The workflow file contains the step-by-step guidance and the reference routing table for the user's request. +**Routing rule:** read the matched workflow file before any reference file. The workflow owns the step-by-step guidance; references are looked up on demand. ## Workflows -| Workflow | Purpose | References | -| ------------------------- | -------------------------------------------------------- | ---------------------------------------------------------------------------- | -| **VM Recommender** | Recommend VM sizes, VMSS, pricing using public APIs/docs | [vm-families](references/vm-families.md), [retail-prices-api](references/retail-prices-api.md), [vmss-guide](references/vmss-guide.md), [vm-quotas](references/vm-quotas.md) | -| **VM Troubleshooter** | Diagnose and resolve VM connectivity failures (RDP/SSH) | [cannot-connect-to-vm](workflows/vm-troubleshooter/references/cannot-connect-to-vm.md) | -| **Capacity Reservation** | Create and manage Capacity Reservation Groups (CRGs) | [capacity-reservation-overview](workflows/capacity-reservation/references/capacity-reservation-overview.md), [association-disassociation](workflows/capacity-reservation/references/association-disassociation.md) | -| **Essential Machine Management** | Enable and manage EMM for subscription-level VM onboarding | [emm-overview](workflows/essential-machine-management/references/emm-overview.md), [emm-prerequisites](workflows/essential-machine-management/references/emm-prerequisites.md), [emm-enable-flow-portal-guidance](workflows/essential-machine-management/references/emm-enable-flow-portal-guidance.md), [emm-enable-flow](workflows/essential-machine-management/references/emm-enable-flow.md) | - +| Workflow | File | Use when | +|---|---|---| +| **VM Recommender** | 
[vm-recommender.md](workflows/vm-recommender/vm-recommender.md) | User asks which VM/VMSS to choose, wants pricing, or wants to compare options | +| **VM Creator** | [vm-creator.md](workflows/vm-creator/vm-creator.md) | User wants to provision a bare VM or VMSS (not an app deployment) | +| **VM Troubleshooter** | [vm-troubleshooter.md](workflows/vm-troubleshooter/vm-troubleshooter.md) | User can't connect, RDP/SSH refused, black screen, needs password reset | +| **Capacity Reservation** | [capacity-reservation.md](workflows/capacity-reservation/capacity-reservation.md) | User needs to reserve / guarantee VM capacity (CRG create / associate / disassociate) | +| **Essential Machine Management** | [essential-machine-management.md](workflows/essential-machine-management/essential-machine-management.md) | User asks about EMM / machine enrollment / monitor | diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/bicep/README.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/bicep/README.md new file mode 100644 index 00000000..169f7e9b --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/bicep/README.md @@ -0,0 +1,48 @@ +# {vm-name} — Bicep + +Deploys a single Linux VM with VNet, subnet, NSG (SSH allow), public IP, and NIC. 
+ +## Prerequisites +- Azure CLI (`az login`) +- An existing resource group +- SSH public key at `~/.ssh/id_rsa.pub` + +## Quickstart + +```bash +az deployment group what-if \ + --resource-group {resourceGroup} \ + --template-file main.bicep \ + --parameters vmName={vmName} adminUsername={adminUsername} adminPublicKey="$(cat ~/.ssh/id_rsa.pub)" sshSourceAddressPrefix={sshSourceAddressPrefix} + +az deployment group create \ + --resource-group {resourceGroup} \ + --template-file main.bicep \ + --parameters vmName={vmName} adminUsername={adminUsername} adminPublicKey="$(cat ~/.ssh/id_rsa.pub)" sshSourceAddressPrefix={sshSourceAddressPrefix} +``` + +## Parameters + +| Name | Required | Default | Notes | +|---|---|---|---| +| `vmName` | * | — | VM resource name | +| `adminUsername` | * | — | Linux admin user | +| `adminPublicKey` | * | — | Contents of `id_rsa.pub` (secure) | +| `sshSourceAddressPrefix` | * | — | Your public IP as `/32` or a trusted CIDR. `*` opens port 22 to the internet; only pass it if you have accepted that risk. | +| `location` | | resourceGroup location | Azure region | +| `vmSize` | | `Standard_D2s_v5` | Verify availability with `compute_vm_list-skus` | +| `vnetAddressPrefix` | | `10.0.0.0/16` | Address space for the new VNet | +| `subnetAddressPrefix` | | `10.0.0.0/24` | Subnet prefix | +| `osDiskSizeGb` | | `30` | | +| `osDiskType` | | `Premium_LRS` | | +| `zone` | | `''` | `1`/`2`/`3`, or empty for regional | +| `tags` | | `{}` | | + +`*` = required (no default). + +## Outputs +- `vmId` — full ARM resource ID +- `publicIpAddress` — connect with `ssh {adminUsername}@{publicIpAddress}` + +## VMSS variant +Swap `Microsoft.Compute/virtualMachines` for `Microsoft.Compute/virtualMachineScaleSets@2024-07-01`, add `sku: { name: vmSize, capacity: instanceCount }`, `properties.orchestrationMode: 'Flexible'`, and move `osProfile`/`storageProfile`/`networkProfile` inside `properties.virtualMachineProfile`. 
+ +## Cleanup +```bash +az group delete --name {resourceGroup} --yes --no-wait +``` diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/bicep/main.bicep b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/bicep/main.bicep new file mode 100644 index 00000000..2be62449 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/bicep/main.bicep @@ -0,0 +1,155 @@ +@description('Name of the VM') +param vmName string + +@description('Azure region') +param location string = resourceGroup().location + +@description('VM size, e.g. Standard_D2s_v5') +param vmSize string = 'Standard_D2s_v5' + +@description('Admin username') +param adminUsername string + +@description('SSH public key contents') +@secure() +param adminPublicKey string + +@description('Address space for the new VNet') +param vnetAddressPrefix string = '10.0.0.0/16' + +@description('Subnet prefix') +param subnetAddressPrefix string = '10.0.0.0/24' + +@description('OS disk size in GB') +param osDiskSizeGb int = 30 + +@description('OS disk storage type') +param osDiskType string = 'Premium_LRS' + +@description('Availability zone (1, 2, or 3); empty for regional') +param zone string = '' + +@description('Tags applied to all resources') +param tags object = {} + +@description('Source address prefix allowed for SSH inbound (CIDR or IP). Required — supply your public IP (e.g. "203.0.113.42/32") or a trusted CIDR range. 
"*" exposes port 22 to the entire internet; only pass it explicitly when you have accepted that risk.') +param sshSourceAddressPrefix string + +var vnetName = '${vmName}-vnet' +var subnetName = 'default' +var nsgName = '${vmName}-nsg' +var publicIpName = '${vmName}-ip' +var nicName = '${vmName}-nic' + +resource nsg 'Microsoft.Network/networkSecurityGroups@2024-05-01' = { + name: nsgName + location: location + tags: tags + properties: { + securityRules: [ + { + name: 'AllowSSH' + properties: { + priority: 1000 + access: 'Allow' + direction: 'Inbound' + protocol: 'Tcp' + sourceAddressPrefix: sshSourceAddressPrefix + sourcePortRange: '*' + destinationAddressPrefix: '*' + destinationPortRange: '22' + } + } + ] + } +} + +resource vnet 'Microsoft.Network/virtualNetworks@2024-05-01' = { + name: vnetName + location: location + tags: tags + properties: { + addressSpace: { addressPrefixes: [vnetAddressPrefix] } + subnets: [ + { + name: subnetName + properties: { + addressPrefix: subnetAddressPrefix + networkSecurityGroup: { id: nsg.id } + } + } + ] + } +} + +resource publicIp 'Microsoft.Network/publicIPAddresses@2024-05-01' = { + name: publicIpName + location: location + tags: tags + sku: { name: 'Standard' } + properties: { publicIPAllocationMethod: 'Static' } +} + +resource nic 'Microsoft.Network/networkInterfaces@2024-05-01' = { + name: nicName + location: location + tags: tags + properties: { + ipConfigurations: [ + { + name: 'ipconfig1' + properties: { + subnet: { id: '${vnet.id}/subnets/${subnetName}' } + publicIPAddress: { id: publicIp.id } + privateIPAllocationMethod: 'Dynamic' + } + } + ] + } +} + +resource vm 'Microsoft.Compute/virtualMachines@2024-07-01' = { + name: vmName + location: location + tags: tags + zones: empty(zone) ? 
null : [zone] + properties: { + hardwareProfile: { vmSize: vmSize } + storageProfile: { + imageReference: { + publisher: 'Canonical' + offer: 'ubuntu-24_04-lts' + sku: 'server' + version: 'latest' + } + osDisk: { + createOption: 'FromImage' + diskSizeGB: osDiskSizeGb + managedDisk: { storageAccountType: osDiskType } + } + } + osProfile: { + computerName: vmName + adminUsername: adminUsername + linuxConfiguration: { + disablePasswordAuthentication: true + ssh: { + publicKeys: [ + { + path: '/home/${adminUsername}/.ssh/authorized_keys' + keyData: adminPublicKey + } + ] + } + } + } + networkProfile: { + networkInterfaces: [ + { id: nic.id } + ] + } + } +} + +output vmId string = vm.id +output publicIpAddress string = publicIp.properties.ipAddress diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/README.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/README.md new file mode 100644 index 00000000..61d306f4 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/README.md @@ -0,0 +1,52 @@ +# {vm-name} — Terraform + +Deploys a Linux VM (RG, VNet, subnet, NSG with SSH allow, public IP, NIC). 
+ +## Prerequisites +- `terraform >= 1.5` +- `az login` +- Exported `AZ_SUB=` env var +- SSH public key at `~/.ssh/id_rsa.pub` + +## Quickstart + +```bash +MY_IP=$(curl -s ifconfig.me)/32 # your current public IP, locked to /32 +terraform init +terraform plan -var "vm_name=dev-vm" -var "admin_public_key=$(cat ~/.ssh/id_rsa.pub)" -var "subscription_id=$AZ_SUB" -var "resource_group_name=dev-vm-rg" -var "ssh_source_address_prefix=$MY_IP" +terraform apply -var "vm_name=dev-vm" -var "admin_public_key=$(cat ~/.ssh/id_rsa.pub)" -var "subscription_id=$AZ_SUB" -var "resource_group_name=dev-vm-rg" -var "ssh_source_address_prefix=$MY_IP" +``` + +## Variables (see `variables.tf`) + +| Variable | Type | Default | Notes | +|---|---|---|---| +| `subscription_id` * | string | — | Azure subscription | +| `resource_group_name` * | string | — | RG will be created | +| `vm_name` * | string | — | VM resource name | +| `admin_public_key` * | string (sensitive) | — | Contents of `id_rsa.pub` | +| `ssh_source_address_prefix` * | string | — | Your public IP as `/32` or a trusted CIDR. `"*"` opens port 22 to the internet — only pass it if you have accepted that risk. | +| `location` | string | `eastus` | Azure region | +| `size` | string | `Standard_D2s_v5` | Verify with `compute_vm_list-skus` | +| `admin_username` | string | `azureuser` | | +| `zone` | string | `""` | `1`/`2`/`3`, or empty for regional | +| `os_disk_type` | string | `Premium_LRS` | | +| `os_disk_size_gb` | number | `30` | | +| `tags` | map(string) | `{}` | | + +`*` = required (no default). + +## Outputs (see `outputs.tf`) +- `vm_id` — full ARM resource ID +- `public_ip` — connect with `ssh {admin_username}@{public_ip}` + +## VMSS variant +Replace `azurerm_linux_virtual_machine` with `azurerm_linux_virtual_machine_scale_set`; add `instances`, `upgrade_mode = "Manual" | "Automatic" | "Rolling"`. Inline NIC inside the scale set via `network_interface { ip_configuration { ... } }`. 
+ +## Notes +`ssh_source_address_prefix` is required because an open SSH port is a credential-stuffing target within minutes of going public. Always pass `/32` (or a trusted CIDR) — even for dev. For production, also add managed identity, diagnostics, and backup. + +## Cleanup +```bash +terraform destroy -var "vm_name=dev-vm" -var "admin_public_key=$(cat ~/.ssh/id_rsa.pub)" -var "subscription_id=$AZ_SUB" -var "resource_group_name=dev-vm-rg" -var "ssh_source_address_prefix=$MY_IP" -auto-approve +``` diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/main.tf b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/main.tf new file mode 100644 index 00000000..53fc859f --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/main.tf @@ -0,0 +1,110 @@ +terraform { + required_providers { + azurerm = { + source = "hashicorp/azurerm" + version = "~> 4.0" + } + } +} + +provider "azurerm" { + features {} + subscription_id = var.subscription_id +} + +resource "azurerm_resource_group" "main" { + name = var.resource_group_name + location = var.location + tags = var.tags +} + +resource "azurerm_network_security_group" "main" { + name = "${var.vm_name}-nsg" + location = azurerm_resource_group.main.location + resource_group_name = azurerm_resource_group.main.name + tags = var.tags + + security_rule { + name = "AllowSSH" + priority = 1000 + direction = "Inbound" + access = "Allow" + protocol = "Tcp" + source_port_range = "*" + destination_port_range = "22" + source_address_prefix = var.ssh_source_address_prefix + destination_address_prefix = "*" + } +} + +resource "azurerm_virtual_network" "main" { + name = "${var.vm_name}-vnet" + address_space = ["10.0.0.0/16"] + location = azurerm_resource_group.main.location + resource_group_name = azurerm_resource_group.main.name + tags = var.tags +} + +resource "azurerm_subnet" "main" { + name = "default" 
+ resource_group_name = azurerm_resource_group.main.name + virtual_network_name = azurerm_virtual_network.main.name + address_prefixes = ["10.0.0.0/24"] +} + +resource "azurerm_subnet_network_security_group_association" "main" { + subnet_id = azurerm_subnet.main.id + network_security_group_id = azurerm_network_security_group.main.id +} + +resource "azurerm_public_ip" "main" { + name = "${var.vm_name}-ip" + location = azurerm_resource_group.main.location + resource_group_name = azurerm_resource_group.main.name + allocation_method = "Static" + sku = "Standard" + tags = var.tags +} + +resource "azurerm_network_interface" "main" { + name = "${var.vm_name}-nic" + location = azurerm_resource_group.main.location + resource_group_name = azurerm_resource_group.main.name + tags = var.tags + + ip_configuration { + name = "ipconfig1" + subnet_id = azurerm_subnet.main.id + private_ip_address_allocation = "Dynamic" + public_ip_address_id = azurerm_public_ip.main.id + } +} + +resource "azurerm_linux_virtual_machine" "main" { + name = var.vm_name + location = azurerm_resource_group.main.location + resource_group_name = azurerm_resource_group.main.name + size = var.size + admin_username = var.admin_username + network_interface_ids = [azurerm_network_interface.main.id] + zone = var.zone == "" ? 
null : var.zone + tags = var.tags + + admin_ssh_key { + username = var.admin_username + public_key = var.admin_public_key + } + + os_disk { + caching = "ReadWrite" + storage_account_type = var.os_disk_type + disk_size_gb = var.os_disk_size_gb + } + + source_image_reference { + publisher = "Canonical" + offer = "ubuntu-24_04-lts" + sku = "server" + version = "latest" + } +} diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/outputs.tf b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/outputs.tf new file mode 100644 index 00000000..142307aa --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/outputs.tf @@ -0,0 +1,7 @@ +output "vm_id" { + value = azurerm_linux_virtual_machine.main.id +} + +output "public_ip" { + value = azurerm_public_ip.main.ip_address +} diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/variables.tf b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/variables.tf new file mode 100644 index 00000000..4603e6f6 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/examples/terraform/variables.tf @@ -0,0 +1,56 @@ +variable "subscription_id" { + type = string +} + +variable "resource_group_name" { + type = string +} + +variable "location" { + type = string + default = "eastus" +} + +variable "vm_name" { + type = string +} + +variable "size" { + type = string + default = "Standard_D2s_v5" +} + +variable "admin_username" { + type = string + default = "azureuser" +} + +variable "admin_public_key" { + type = string + sensitive = true +} + +variable "zone" { + type = string + default = "" +} + +variable "os_disk_type" { + type = string + default = "Premium_LRS" +} + +variable "os_disk_size_gb" { + type = number + default = 30 +} + +variable "tags" { + type = map(string) + default = 
{} +} + +variable "ssh_source_address_prefix" { + description = "Source address prefix allowed for SSH inbound (CIDR or IP). Required — supply your public IP (e.g. \"203.0.113.42/32\") or a trusted CIDR range. \"*\" exposes port 22 to the entire internet; only pass it explicitly when you have accepted that risk." + type = string +} diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/github-pr.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/github-pr.md new file mode 100644 index 00000000..41fa12b8 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/github-pr.md @@ -0,0 +1,98 @@ +# Mode C — Sync to a GitHub repo + +## Step 1 — Gather repo + branch + path + +Ask three things (batch into one message; accept all three at once or walk through them): + +1. **Repo** — accept any of: + - `owner/name` (e.g., `myorg/azure-infra`) + - Full URL (e.g., `https://github.com/myorg/azure-infra`) + - Local checkout path (e.g., `~/source/azure-infra`) — skips clone +2. **Branch** — default: `infra/vm-{vm-name}`. **Refuse to commit directly to `main` / `master` / default.** Always a branch + PR. +3. **Target path inside repo** — default: `infra/{vm-name}/`. If the repo has a recognizable infra root (`/terraform`, `/bicep`, `/infrastructure`), suggest that. + +## Step 2 — Pre-flight + +```bash +# 1. gh installed and authenticated +gh auth status # must succeed; if not, fall back to Mode C-fallback below + +# 2. user has push rights +gh repo view {owner/repo} --json viewerPermission --jq '.viewerPermission' +# Must be ADMIN, MAINTAIN, or WRITE; otherwise pivot to fork-and-PR + +# 3. branch doesn't already exist +gh api repos/{owner/repo}/branches/{branch} # 404 = good (new branch) +``` + +If `gh` is missing or unauthenticated, **switch to Mode C-fallback** — don't block. 
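The pre-flight checks above could be folded into one decision helper. The following is a minimal sketch, not part of the workflow: the function names `can_push` and `choose_delivery` and the three result labels are hypothetical, and it assumes that a failed permission lookup should be treated the same as insufficient rights (pivot to fork-and-PR, as the Step 2 comment suggests). The branch-existence check is left out for brevity.

```bash
#!/usr/bin/env bash
# Illustrative sketch of the Mode C pre-flight decision.
# can_push and choose_delivery are hypothetical helper names.

# Returns 0 when the viewer permission is enough to push a branch.
can_push() {
  case "$1" in
    ADMIN|MAINTAIN|WRITE) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints one of: "mode-c", "fork-and-pr", "fallback".
choose_delivery() {
  local repo="$1" permission
  # gh missing or not authenticated: do not block, use the local fallback.
  if ! command -v gh >/dev/null 2>&1 || ! gh auth status >/dev/null 2>&1; then
    echo "fallback"
    return 0
  fi
  # Assumption: a failed lookup is handled like a missing permission.
  permission=$(gh repo view "$repo" --json viewerPermission \
    --jq '.viewerPermission' 2>/dev/null) || permission=""
  if can_push "$permission"; then
    echo "mode-c"
  else
    echo "fork-and-pr"
  fi
}
```

Calling `choose_delivery myorg/azure-infra` before Step 3 would tell the workflow whether to clone and push, fork first, or write the files locally.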
+ +## Step 3 — Clone, write, commit, push, PR + +Echo each command so the user can replay: + +```bash +# 1. Working tree +TMP=$(mktemp -d) && cd "$TMP" +gh repo clone {owner/repo} . + +# 2. Branch off the default branch +DEFAULT_BRANCH=$(gh repo view --json defaultBranchRef --jq '.defaultBranchRef.name') +git checkout -b {branch} "origin/$DEFAULT_BRANCH" + +# 3. Write the artifact files (same layout as Mode B) +mkdir -p {targetPath} +# ... Write tool calls for each file ... + +# 4. Stage, commit, push +git add {targetPath} +git commit -m "Add {vm-name} VM infrastructure (Bicep)" \ + -m "Generated by azure-compute vm-creator workflow." +git push -u origin {branch} + +# 5. Open the PR +gh pr create \ + --title "Add {vm-name} VM infrastructure" \ + --body "$(cat <<'EOF' +## Summary +Adds Bicep templates to provision \`{vm-name}\` in \`{location}\`. + +## Plan Card +{paste-plan-card-markdown} + +## Deploy +\`\`\`bash +cd {targetPath} +az deployment group create --resource-group {resourceGroup} \\ + --template-file main.bicep --parameters vmName={vm-name} ... +\`\`\` +EOF +)" \ + --base "$DEFAULT_BRANCH" --head {branch} +``` + +Echo the PR URL back: + +> ✅ Opened PR: https://github.com/myorg/azure-infra/pull/42 + +## Step 4 — Optional follow-ups + +After the PR is open, **offer** (don't force): +- *"Want me to add a GitHub Action to validate this with `az deployment group what-if` on every PR?"* — generates `.github/workflows/bicep-whatif.yml` +- *"Want OIDC auth so CI can apply without a service principal secret?"* — points at the `azure-prepare` skill + +## Mode C-fallback (no `gh` / unauthenticated / no push rights) + +Write the files to `~/Desktop/{vm-name}-infra/` (Mode B path) and print: + +> ⚠ I can't push for you (`gh auth status` failed), so I wrote the files locally. +> +> To open the PR yourself: +> ```bash +> cd ~/Desktop/dev-vm-infra +> git init && git add . 
&& git commit -m "Add dev-vm infrastructure" +> git remote add origin git@github.com:myorg/azure-infra.git +> git checkout -b infra/vm-dev-vm +> git push -u origin infra/vm-dev-vm +> gh pr create --fill # after `gh auth login` +> ``` diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/index.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/index.md new file mode 100644 index 00000000..d768373d --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/index.md @@ -0,0 +1,44 @@ +# Delivery Options + +After the Plan Card is approved (Step 5) and the output format is chosen (Step 6), the workflow has a generated artifact. **Before printing it into the chat**, ask one more question — where should it go? + +| Mode | When the user wants this | File | +|---|---|---| +| **A. Print in chat** | Quick one-off, copy-paste, just learning | [print.md](print.md) | +| **B. Save to local folder** | Wants files on disk, will commit/run later | [save-local.md](save-local.md) | +| **C. Sync to a GitHub repo** | Has an infra repo, wants the change as a PR | [github-pr.md](github-pr.md) | + +> **Skip the question for Adapter 4 (Apply via MCP).** Delivery is moot — the resources get created in Azure directly. + +## Asking + +> *"Where should I drop this {format} artifact?"* +> *1. Print in the chat (default for quick / one-off)* +> *2. Save to a local folder — I'll suggest a path based on your current workspace* +> *3. Sync to a GitHub repo as a new branch + PR* + +If the user already implied a target ("save it to my infra repo" / "open a PR" / "just print it"), skip the question and use that mode. + +## Re-emitting / changing delivery mid-flow + +After delivery, the Plan Card is cached. 
The user can say: + +| User says | Action | +|---|---| +| "also save it locally" | Re-run Mode B with the same artifact | +| "open a PR instead" | Re-run Mode C | +| "give me terraform now too" | Re-render that adapter, then re-ask delivery | +| "delete that file you wrote" | `rm` the path that was written | + +Never re-ask Plan Card questions when only the delivery target is changing. + +## Safety checks (B and C) + +| Check | Why | +|---|---| +| Don't overwrite without showing the diff and getting explicit yes | User may have manual edits | +| Scrub `adminPassword` from generated files → replace with `${VM_ADMIN_PASSWORD}` env var | Repos leak passwords | +| Never push directly to `main` / `master` / default branch — feature branch + PR only | Auditability and review | +| Never `git push --force` (use `--force-with-lease` if a rewrite is truly needed) | Catastrophic on shared branches | +| Emit a parameter placeholder for SSH keys, not the literal key contents | Keys committed to git stay there forever | +| If the repo has `CODEOWNERS` for the path, mention who'll be auto-tagged | User shouldn't be surprised by review routing | diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/print.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/print.md new file mode 100644 index 00000000..bdf34d19 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/print.md @@ -0,0 +1,9 @@ +# Mode A — Print in chat + +Default. Render fenced code blocks for each file, in this order: + +1. **Bicep:** `main.bicep` + the `az deployment group create` command at the bottom +2. **Terraform:** `main.tf` + `variables.tf` (separate fenced blocks) + the `terraform init && terraform apply` command +3. **bash:** the single script + +Append a one-liner reminder: *"Want me to save this to disk or open a PR? 
Just ask."* — so the user can shift to Mode B/C without restarting. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/save-local.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/save-local.md new file mode 100644 index 00000000..2167fb13 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/delivery-options/save-local.md @@ -0,0 +1,52 @@ +# Mode B — Save to a local folder + +## Step 1 — Suggest a path + +Detect the user's current working context (in priority order; pick the **first** that succeeds): + +| Signal | Suggested path | +|---|---| +| Host session has an `--add-dir` workspace containing a `.git` directory | `/infra/{vm-name}/` | +| User mentioned a repo path earlier in the conversation | `/infra/{vm-name}/` | +| `pwd` is inside a git repo | `/infra/{vm-name}/` | +| None of the above | `~/Desktop/{vm-name}-infra/` | + +Present the suggestion and let the user override: + +> *"I'll save to `~/Desktop/dev-vm-infra/` (no repo detected). Use that, or pick another path?"* + +If the path already exists with files, **always show the diff and ask before overwriting**. Never silently clobber. 
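The overwrite rule above (show the diff, ask, never silently clobber) can be sketched as a small guard around the write step. This is a minimal illustration, not the workflow's actual implementation: `write_if_approved` and its three arguments are hypothetical names, and the real host would normally use its own `Write` tool rather than `cp`.

```bash
# Hypothetical helper: write_if_approved <target> <candidate> [approved]
# Copies <candidate> to <target> only when the target is missing, is already
# identical, or the caller passes "yes" after the user has seen the diff.
write_if_approved() {
  target=$1
  candidate=$2
  approved=${3:-no}

  # New file: nothing to clobber, so write it.
  if [ ! -e "$target" ]; then
    mkdir -p "$(dirname "$target")"
    cp "$candidate" "$target"
    echo "wrote"
    return 0
  fi

  # Identical content: no write needed.
  if cmp -s "$target" "$candidate"; then
    echo "unchanged"
    return 0
  fi

  # The user has already reviewed the diff and said yes.
  if [ "$approved" = "yes" ]; then
    cp "$candidate" "$target"
    echo "overwrote"
    return 0
  fi

  # Show what would change and stop; the caller asks the user, then retries.
  diff -u "$target" "$candidate" || true
  echo "needs-approval"
  return 1
}
```

A caller would run it once without the third argument, surface the printed diff to the user, and call it again with `yes` only after an explicit confirmation.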
+ +## Step 2 — Choose the filename(s) + +| Format | Files written | +|---|---| +| bash | `create-vm.sh` (chmod +x) | +| Bicep | `main.bicep`, `README.md` | +| Terraform | `main.tf`, `variables.tf`, `outputs.tf`, `terraform.tfvars.example`, `README.md` | + +The `README.md` contains: +- The full **Plan Card** markdown (so the next reader knows what this deploys) +- **Deploy** section with exact commands (`az deployment group create ...` / `terraform init && apply ...` / `bash create-vm.sh`) +- **Verify** section: `az vm show` / `az vm list-ip-addresses` +- **Cleanup** section: `az group delete --name --yes --no-wait` + +## Step 3 — Write and confirm + +After writing, echo back absolute paths and the single command the user runs next: + +> ✅ Wrote 2 files: +> - `~/source/my-infra/infra/dev-vm/main.bicep` +> - `~/source/my-infra/infra/dev-vm/README.md` +> +> Next step: +> ```bash +> cd ~/source/my-infra/infra/dev-vm +> az deployment group create --resource-group dev-vm-rg --template-file main.bicep \ +> --parameters vmName=dev-vm adminUsername=azureuser \ +> adminPublicKey="$(cat ~/.ssh/id_rsa.pub)" +> ``` + +## Tool implementation + +Use the `Write` tool (or the host's equivalent) for each file. For shell scripts: write, then `chmod +x` via Bash. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/beginner.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/beginner.md new file mode 100644 index 00000000..89fa0a2b --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/beginner.md @@ -0,0 +1,21 @@ +# Beginner / fast-path + +Goal: get to a working Plan Card in **≤ 2 questions**, then show defaults and let the user edit. + +| # | Question | Default if skipped | +|---|---|---| +| 1 | "What region? I can recommend if you're not sure." | `eastus` | +| 2 | "Linux or Windows? Default is Ubuntu 24.04." 
| `Ubuntu2404` (Linux) | + +## Silent defaults (show in Plan Card, don't ask) + +- **Size:** `Standard_D2s_v5` (2 vCPU / 8 GB) +- **Auth:** SSH key from `~/.ssh/id_rsa.pub` (Linux) — read the file; ask only if missing +- **VNet:** create new `-vnet` with `10.0.0.0/16` +- **Subnet:** `default` with `10.0.0.0/24` +- **NSG:** create new, allow SSH 22 (Linux) or RDP 3389 (Windows) from **the user's current public IP** (detect via `curl -s ifconfig.me` or equivalent) — only fall back to `*` if detection fails, and always flag the chosen source in the Plan Card with a ⚠ so the user can edit before apply +- **Public IP:** Standard SKU, static (Standard SKU public IPs support static allocation only) +- **OS disk:** 30 GB Premium SSD +- **Zone:** none (regional) + +If the user is in a region you haven't validated, call `compute_vm_list-skus` to confirm `Standard_D2s_v5` is available there before locking it in. If not, fall back to whatever the recommender suggests. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/cost-deep.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/cost-deep.md new file mode 100644 index 00000000..71e7bed4 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/cost-deep.md @@ -0,0 +1,17 @@ +# Cost-deep branch + +| Topic | Question | Default | +|---|---|---| +| Spot vs regular | "Spot eligible (interruptible)?" | `regular` unless dev / batch | +| Spot max price | "Max spot price ($/hr) or pay up to on-demand?" | `-1` (= on-demand cap) | +| Reservations | "1-yr / 3-yr reservation?" | Skip — recommend post-deploy when usage is known | +| Hybrid Benefit | "Bring Windows Server / RHEL / SLES license?" | Ask only if OS is Windows or RHEL | +| Autoscale floor/ceiling (VMSS) | "Min / max instances?" | `min=1, max=3` for web tier; `min=0, max=10` for batch | +| Schedule shutdown | "Auto-shutdown nightly?"
| Offer for dev / sandbox workloads | +| Disk tier | "OS disk tier: Premium SSD / Standard SSD / Standard HDD?" | Premium SSD | + +## Notes + +- Spot interruption rates vary by region and SKU; mention the user can check before committing via the Azure portal "Spot eviction rate" view. +- Reservations and savings plans need 30+ days of usage telemetry to recommend confidently — don't push them on a brand-new workload. +- Auto-shutdown via DevTest Labs is the cheapest scheduled-stop option for single VMs; for VMSS, scale-to-zero is better. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/index.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/index.md new file mode 100644 index 00000000..b482f2a0 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/index.md @@ -0,0 +1,45 @@ +# Depth Probe — Meet the User Where They Are + +The VM Creator adapts its questioning to the user's expertise and intent. A beginner asking for a "dev VM" should not get peppered with networking and egress questions. An advanced networking engineer specifying "VMSS behind App Gateway with private endpoints" should not be asked whether they want a public IP. + +## Philosophy + +1. **Never ask a question whose answer can be inferred or safely defaulted.** +2. **Batch silent inferences into a Plan Card.** Defaulted decisions should be visible and editable. +3. **Defaults ladder.** When you must ask, prefer `[recommended default] / [show alternatives] / [I have specifics]`. +4. **Branching is signal-driven, not flag-driven.** Reclassify any time the user volunteers a deep signal. + +## Classification — read the initial request + +Score each signal that appears in the user's first 1-2 messages. The highest-scoring branch wins; a user can be in multiple branches. 
+ +| Signal phrase / keyword | Branch | +|---|---| +| "VNet", "subnet", "NSG", "egress", "private endpoint", "App Gateway", "accelerated networking", "service tag", "UDR", "IPv6", "DNS", "Bastion" | [networking-deep](networking-deep.md) | +| "vCPUs", "GPU", "memory", "family", "D-series", "N-series", "ephemeral OS disk", "proximity placement", "AMD", "Intel", "generation", "SR-IOV", "trusted launch" | [spec-deep](spec-deep.md) | +| "spot", "reserved", "savings plan", "hybrid benefit", "autoscale floor/ceiling", "$", "budget", "cheapest", "cost-optimize" | [cost-deep](cost-deep.md) | +| "managed identity", "Entra", "RBAC", "Key Vault", "JIT", "encryption at host", "CMK", "confidential", "compliance", "FedRAMP", "HIPAA" | [security-deep](security-deep.md) | +| "dev", "sandbox", "quick", "test out", "play with", "just need", "simple", or nothing specific | [beginner](beginner.md) | + +> **Tiebreak:** prefer the branch that affects the most expensive defaults: Networking > Security > Spec > Cost > Beginner. Networking mistakes are the hardest to undo post-deployment. + +## Cross-branch follow-ups (ask once, after primary branch) + +| Question | When to ask | +|---|---| +| "Tags? (env, owner, cost-center)" | Always — but accept "none" without follow-up | +| "Resource group: existing or new?" | Always — propose `-rg` if new | +| "Number of instances?" | Only for VMSS | +| "Orchestration mode (Flexible/Uniform)?" | Only for VMSS — default Flexible | + +## Reclassification mid-flow + +If the user volunteers a deep signal at any step ("oh wait, I also need a NAT Gateway"), jump into that branch's questions for that topic. Never restart the whole flow — append the new questions and update the Plan Card. + +## Anti-patterns + +- ❌ Asking "what OS?" 
when the user said "Ubuntu sandbox" +- ❌ Asking about spot pricing for a Windows production VM +- ❌ Asking 8 networking questions before showing a Plan Card +- ❌ Defaulting to public IP open to `*` without flagging it in the Plan Card +- ❌ Burying the cost estimate at the bottom — put it on the Plan Card top row diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/networking-deep.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/networking-deep.md new file mode 100644 index 00000000..af3b04ba --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/networking-deep.md @@ -0,0 +1,22 @@ +# Networking-deep branch + +Ask only what cannot be inferred. Volunteer the advanced switches. + +| Topic | Question | Default offered | +|---|---|---| +| VNet | "Existing VNet, or new?" | If existing: ask name + RG; offer to list via `network_vnet_list` MCP or `az network vnet list` | +| Subnet sizing | "Subnet CIDR?" | `/24` if new | +| NSG | "Inbound rules: default (SSH/RDP from your IP) or paste a rule set?" | Restrict source to user's current public IP — fetch via `curl -s ifconfig.me` if not provided | +| Public IP | "Public IP, or private only?" | Public unless user said "private", "internal", "no internet" | +| Accelerated networking | "Enable accelerated networking?" | `true` if size supports it (most D/E/F series ≥ 2 vCPU) | +| Private endpoints | "Any private endpoints to attach?" | Not by default; ask only if user mentioned data / Key Vault / storage targets | +| Outbound | "Outbound: default Azure SNAT, NAT Gateway, or Firewall route?" | Default SNAT if user didn't mention egress; if mentioned, default NAT Gateway | +| DNS | "Custom DNS servers?" | Azure-provided | +| IP version | "IPv4 only or dual-stack?" | IPv4 | +| Service endpoints | "Service endpoints on subnet?" 
| None unless user mentioned a target | + +## Notes + +- Don't auto-create a NAT Gateway just because the user said "secure" — confirm intent first; NAT Gateway is ~$30/mo before traffic. +- If the user wants "private only", offer Azure Bastion as the management path; don't silently leave them with no way in. +- `accelerated networking` defaults to **on** for supporting SKUs because the cost is zero and the throughput gain is large. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/security-deep.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/security-deep.md new file mode 100644 index 00000000..31ee8fe0 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/security-deep.md @@ -0,0 +1,17 @@ +# Security-deep branch + +| Topic | Question | Default | +|---|---|---| +| Managed identity | "System-assigned managed identity?" | `true` (off by default in raw `az vm create`, but we recommend on) | +| Encryption at host | "Encryption at host?" | `true` (requires subscription opt-in — flag if not enabled) | +| Disk encryption set | "Customer-managed key (CMK) on OS disk?" | Skip unless compliance mentioned | +| Confidential VM | "Confidential compute (AMD SEV-SNP)?" | Only if user mentioned `confidential` / `attestation` | +| JIT access | "Enable Just-In-Time RDP/SSH (Defender for Cloud)?" | Offer if subscription has Defender plan | +| Boot diagnostics | "Managed boot diagnostics?" | `true` (Azure-managed storage) | +| Vulnerability scanning | "Enable Defender for Servers Plan 2?" | Mention; do not auto-enable (incurs cost) | + +## Notes + +- Encryption-at-host needs the subscription feature flag `EncryptionAtHost` registered — check via `az feature show` and surface a remediation step if not. 
+- CMK setup is multi-resource (Key Vault + Disk Encryption Set + RBAC); for first-time users, suggest scaffolding via the `azure-prepare` skill instead. +- JIT access is per-VM and per-port; default to 3-hour windows on 22/3389, not the wider "all common ports" preset. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/spec-deep.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/spec-deep.md new file mode 100644 index 00000000..25e61d1f --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/depth-probe/spec-deep.md @@ -0,0 +1,20 @@ +# Spec-deep branch + +Goal: nail the SKU. Lean on `compute_vm_list-skus` (with filters) instead of asking the user to memorize SKU names. + +| Topic | Question | How to answer | +|---|---|---| +| vCPUs / RAM | "Target vCPUs and memory?" | Call `compute_vm_list-skus` with `minVCpus` + `minMemoryGb`; show top 3 | +| GPU | "GPU workload?" | If yes: `familyPrefix=Standard_N`; ask CUDA vs render | +| Family | "Family preference (general D, compute F, memory E, burstable B, GPU N)?" | Skip if vCPU/RAM already nailed it down | +| Generation | "VM generation (v5/v6, AMD/Intel)?" | Default to latest gen available in region | +| Ephemeral OS disk | "Ephemeral OS disk (faster, but no resize/restore)?" | `false` unless workload is stateless tier | +| Trusted Launch / Gen2 | "Gen2 / Trusted Launch?" | `true` (Azure default for new VMs since 2023) | +| Proximity placement | "Need low-latency between VMs?" | Skip unless multi-VM cluster | +| Zone | "Pin to an Availability Zone?" | Skip for single VM unless HA; default `1` for VMSS spreading | + +## Notes + +- For GPU asks, always include the per-hour price in the Plan Card — N-series VMs can hit $3-30/hr. +- If the user says "compute-heavy", default to F-series (compute-optimized) before D-series. 
+- Burstable B-series is only correct for spiky/idle workloads — flag it explicitly so the user knows it'll throttle under sustained load. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/mcp-tools.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/mcp-tools.md new file mode 100644 index 00000000..9706f3b2 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/mcp-tools.md @@ -0,0 +1,47 @@ +# MCP tools used by vm-creator + +The Azure MCP server exposes the `compute` area as a single namespace proxy: `mcp__azure__compute({intent, command, parameters})`. The commands below are what the workflow dispatches against it. + +## Read-only validation (Step 4) + +| Command | Purpose | Key parameters | +|---|---|---| +| `compute_vm_list-skus` | Confirm SKU availability + filter by vCPU/memory/family | `subscription`, `location`, `minVCpus`, `minMemoryGb`, `familyPrefix`, `top` | +| `compute_vm_list-images` | Resolve image alias / URN | `subscription`, `location`, `publisher`, `offer`, `sku` | +| `compute_vm_check-quota` | Verify vCPU headroom for the family | `subscription`, `location`, `family` | +| `compute_vm_recommend-region` | Find regions where the workload fits | workload hints; returns ranked regions | + +## Apply (Step 6, Adapter 4) + +| Command | Purpose | +|---|---| +| `compute_vm_create` | Create a single VM from Plan Card fields | +| `compute_vmss_create` | Create a VMSS (adds `instance-count`, `upgrade-policy`) | +| `compute_vm_get` | Inspect after create | +| `compute_vm_update` | Tag changes, size resize, identity attach | +| `compute_vm_delete` | Cleanup | + +See [output-adapters/mcp-apply.md](output-adapters/mcp-apply.md) for the full parameter mapping and failure-handling table. 
+ +## CLI fallbacks (when Azure MCP is not connected) + +| MCP command | CLI equivalent | +|---|---| +| `compute_vm_list-skus` | `az vm list-skus --location --output table` | +| `compute_vm_list-images` | `az vm image list --location --offer --all` | +| `compute_vm_check-quota` | `az vm list-usage --location --output table` | +| `compute_vm_recommend-region` | (no CLI equivalent — fall back to docs) | +| `compute_vm_create` | `az vm create ...` (see [az-cli.md](output-adapters/az-cli.md)) | + +## Why the proxy form matters + +The CLI / tool host shows `mcp__azure__compute` as a single tool. Sub-operations like `vm check-quota`, `vm list-skus`, etc. are not separate tools — they are passed through the `command` parameter. Every command is invoked as: + +``` +mcp__azure__compute({ + command: "vm check-quota", + parameters: { location: "eastus", family: "standardDSv5Family" } +}) +``` + +When tracing tool calls or writing must-call rubrics, look for the `command=` argument, not a distinct tool name. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/az-cli.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/az-cli.md new file mode 100644 index 00000000..bff24cfc --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/az-cli.md @@ -0,0 +1,69 @@ +# az CLI adapter + +Fast, portable, scriptable. Works anywhere `az` is installed and logged in. 
+ +## VM template + +```bash +#!/usr/bin/env bash +set -euo pipefail + +# {plan-card-summary} + +az group create --name "{resourceGroup}" --location "{location}" + +az vm create \ + --resource-group "{resourceGroup}" \ + --name "{vmName}" \ + --location "{location}" \ + --image "{image}" \ + --size "{vmSize}" \ + --admin-username "{adminUsername}" \ + --ssh-key-values @{sshKeyPath} \ + --vnet-name "{vnetName}" \ + --subnet "{subnetName}" \ + --nsg "{nsgName}" \ + --public-ip-address "{publicIpName}" \ + --zone {zone} \ + --os-disk-size-gb {osDiskSizeGb} \ + --storage-sku {osDiskType} \ + --tags {tagsKv} +``` + +## VMSS template + +Replace `az vm create` with `az vmss create`, swap `--size` for `--vm-sku`, add `--instance-count {n}`, `--orchestration-mode Flexible`, `--upgrade-policy-mode Manual|Automatic|Rolling`. + +## Filled example — dev Linux VM in eastus + +```bash +#!/usr/bin/env bash +set -euo pipefail + +# dev-vm | eastus | Ubuntu2404 | Standard_D2s_v5 | new VNet | est. $70/mo + +az group create --name dev-vm-rg --location eastus + +az vm create \ + --resource-group dev-vm-rg \ + --name dev-vm \ + --location eastus \ + --image Ubuntu2404 \ + --size Standard_D2s_v5 \ + --admin-username azureuser \ + --ssh-key-values @~/.ssh/id_rsa.pub \ + --vnet-name dev-vm-vnet \ + --subnet default \ + --nsg dev-vm-nsg \ + --public-ip-address dev-vm-ip \ + --os-disk-size-gb 30 \ + --storage-sku Premium_LRS \ + --tags env=dev owner=team-name +``` + +## Notes +- Windows VMs: swap `--ssh-key-values @...` for `--admin-password '{password}'`. +- Linux: prefer SSH keys (`~/.ssh/id_rsa.pub` or `~/.ssh/id_ed25519.pub`). Never paste private keys. +- `--zone` is optional; omit the flag entirely (don't pass empty) for regional VMs. +- `--tags` uses space-separated `k=v` pairs. +- Pre-check quota: `compute_vm_check-quota` (or `az vm list-usage --location {location} -o table`). 
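The note about `--zone` (omit the flag entirely for regional VMs instead of passing an empty value) can be handled when the script is assembled. The sketch below is one way to do that; `zone_args` is a hypothetical helper name, not part of the adapter.

```bash
# Hypothetical helper: zone_args [zone]
# Prints "--zone <n>" when a zone is pinned and prints nothing for a
# regional VM, so the flag never reaches `az vm create` with an empty value.
zone_args() {
  if [ -n "${1:-}" ]; then
    printf '%s %s' --zone "$1"
  fi
}

# Usage sketch (left unquoted on purpose so an empty result adds no argument):
#   az vm create --resource-group dev-vm-rg --name dev-vm \
#     --image Ubuntu2404 --size Standard_D2s_v5 $(zone_args "$ZONE")
```

With `ZONE=2` the command gains `--zone 2`; with `ZONE` empty or unset it gains nothing.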
diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/bicep.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/bicep.md new file mode 100644 index 00000000..c07e3e3b --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/bicep.md @@ -0,0 +1,36 @@ +# Bicep adapter + +Production IaC, repeatable deployments, supports `az deployment group what-if` preview before commit. + +## Template + +Emit [`examples/bicep/main.bicep`](../../examples/bicep/main.bicep) alongside the README. Single-file template — no modules to wire. Parameters (`vmName`, `adminUsername`, `adminPublicKey` required; others have defaults). + +Always emit `examples/bicep/README.md` next to the template so the artifact is self-contained (Plan Card, prereqs, quickstart, parameter table, cleanup). + +## Deploy + +```bash +az deployment group what-if \ + --resource-group {resourceGroup} \ + --template-file main.bicep \ + --parameters vmName={vmName} adminUsername={adminUsername} \ + adminPublicKey="$(cat ~/.ssh/id_rsa.pub)" + +az deployment group create \ + --resource-group {resourceGroup} \ + --template-file main.bicep \ + --parameters vmName={vmName} adminUsername={adminUsername} \ + adminPublicKey="$(cat ~/.ssh/id_rsa.pub)" +``` + +Always run `what-if` first — it's free and surfaces any quota / role / naming conflict before the change lands. + +## VMSS + +Swap `Microsoft.Compute/virtualMachines@2024-07-01` for `Microsoft.Compute/virtualMachineScaleSets@2024-07-01`. Add `sku: { name: vmSize, capacity: instanceCount }`, `properties.orchestrationMode: 'Flexible'`, and move `osProfile` / `storageProfile` / `networkProfile` inside `properties.virtualMachineProfile`. + +## Notes +- Secure params (`adminPassword`, `adminPublicKey`) are `@secure()`; don't echo them in logs. 
+- For `zone`, pass `'1'`/`'2'`/`'3'` to pin a zone, or `''` for regional. The template handles both via `empty(zone) ? null : [zone]`. +- For private VMs, set `publicIPAllocationMethod` to nothing and drop the `publicIPAddress` block from the NIC. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/index.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/index.md new file mode 100644 index 00000000..75d8858f --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/index.md @@ -0,0 +1,47 @@ +# Output Adapters + +The VM Creator's final step emits the user's choice of one (or more) output formats from the same approved Plan Card. The user can re-emit a different format at any time without restarting the conversation. + +## Choosing the format + +| Format | When to offer | File | +|---|---|---| +| **az CLI bash** | Quick one-off, learning, ad-hoc creation, CI scripts | [az-cli.md](az-cli.md) | +| **Bicep** | Production IaC, repeatable, what-if preview, org standardized on Bicep | [bicep.md](bicep.md) | +| **Terraform** | Multi-cloud, existing TF state, org standardized on Terraform | [terraform.md](terraform.md) | +| **Apply via Azure MCP** | "Just do it" — user is in an MCP-connected host and trusts the Plan Card | [mcp-apply.md](mcp-apply.md) | + +Always show the user all four as a numbered choice at Step 6. Default-suggest based on signals: +- "I want a script" → az CLI +- "I want infra-as-code" → Bicep (default) / Terraform (if user mentioned TF) +- "Just create it" / "deploy it now" → Apply via Azure MCP + +## Plan Card → parameter mapping + +Every adapter draws from these Plan Card fields. Capture them once; transform on emit. 
+ +| Plan Card field | az CLI | Bicep | Terraform | MCP parameter | +|---|---|---|---|---| +| name | `--name` | `vmName` | `vm_name` | `vm-name` | +| resourceGroup | `--resource-group` | (scope) | `resource_group_name` | `resource-group` | +| subscription | `--subscription` | (scope) | `subscription_id` | `subscription` | +| location | `--location` | `location` | `location` | `location` | +| size | `--size` | `vmSize` | `size` | `vm-size` | +| image | `--image` | `imageReference` | `source_image_reference` | `image` | +| adminUsername | `--admin-username` | `adminUsername` | `admin_username` | `admin-username` | +| sshKey | `--ssh-key-values` | `adminPublicKey` | `admin_ssh_key.public_key` | `ssh-public-key` | +| adminPassword | `--admin-password` | `adminPassword` (secure) | `admin_password` (sensitive) | `admin-password` | +| vnetName | `--vnet-name` | `vnetName` | `azurerm_virtual_network.name` | `virtual-network` | +| subnetName | `--subnet` | `subnetName` | `azurerm_subnet.name` | `subnet` | +| publicIp | `--public-ip-address` | `publicIpName` | `azurerm_public_ip` | `public-ip-address` | +| nsgName | `--nsg` | `nsgName` | `azurerm_network_security_group.name` | `network-security-group` | +| zone | `--zone` | `zones: [N]` | `zone` | `zone` | +| osDiskType | `--storage-sku` | `osDisk.managedDisk.storageAccountType` | `os_disk.storage_account_type` | `os-disk-type` | +| osDiskSizeGb | `--os-disk-size-gb` | `osDisk.diskSizeGB` | `os_disk.disk_size_gb` | `os-disk-size-gb` | +| tags | `--tags` | `tags` | `tags` | (none — emit separately) | + +For VMSS, also map `instanceCount` and `upgradePolicy` (see each adapter file). + +## Re-emitting after a format switch + +After the user picks one format, save the Plan Card. If they later say "actually give me the bicep too" or "show me terraform", regenerate from the same Plan Card — do not re-ask any questions. The Plan Card is the canonical state. 
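The "capture once, transform on emit" idea behind the mapping table can be sketched as a lookup from Plan Card field to adapter-specific name. The example covers only the az CLI column, copied from the table above; `plan_field_to_az_flag` is a hypothetical function name, and the other adapters would need their own lookups.

```bash
# Hypothetical lookup: Plan Card field name -> az CLI flag, taken from the
# az CLI column of the mapping table. Returns 1 for fields the table does
# not list, so a typo fails loudly instead of emitting a wrong flag.
plan_field_to_az_flag() {
  case "$1" in
    name)          printf '%s\n' "--name" ;;
    resourceGroup) printf '%s\n' "--resource-group" ;;
    subscription)  printf '%s\n' "--subscription" ;;
    location)      printf '%s\n' "--location" ;;
    size)          printf '%s\n' "--size" ;;
    image)         printf '%s\n' "--image" ;;
    adminUsername) printf '%s\n' "--admin-username" ;;
    sshKey)        printf '%s\n' "--ssh-key-values" ;;
    adminPassword) printf '%s\n' "--admin-password" ;;
    vnetName)      printf '%s\n' "--vnet-name" ;;
    subnetName)    printf '%s\n' "--subnet" ;;
    publicIp)      printf '%s\n' "--public-ip-address" ;;
    nsgName)       printf '%s\n' "--nsg" ;;
    zone)          printf '%s\n' "--zone" ;;
    osDiskType)    printf '%s\n' "--storage-sku" ;;
    osDiskSizeGb)  printf '%s\n' "--os-disk-size-gb" ;;
    tags)          printf '%s\n' "--tags" ;;
    *)             printf 'unmapped Plan Card field: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

Because the Plan Card stays the canonical state, re-emitting in another format only means swapping the lookup, not re-asking the user.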
diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/mcp-apply.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/mcp-apply.md new file mode 100644 index 00000000..539e4a2c --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/mcp-apply.md @@ -0,0 +1,66 @@ +# Apply via Azure MCP + +User is in an MCP-connected host (Claude Code, VS Code Copilot, Cursor) with the Azure MCP server enabled and says "just do it" / "create it now" / "deploy". + +## Prerequisite +Azure MCP server installed, and the user is signed in to Azure (e.g. `az login`). The exact auth mechanism is an implementation detail of the MCP server and may vary. + +## Pre-flight checks (read-only) + +``` +compute_vm_check-quota(subscription, location, family={derived from vmSize}) +compute_vm_list-skus(subscription, location, familyPrefix={derived from vmSize}) +``` + +Confirm: +- Quota has headroom for the requested vCPU count +- The chosen SKU is available in that region (and zone, if specified) + +## Apply — VM + +Call `compute_vm_create` with the Plan Card values: + +| MCP parameter | Plan Card field | +|---|---| +| `subscription` | subscription | +| `resource-group` | resourceGroup | +| `vm-name` | name | +| `location` | location | +| `image` | image | +| `vm-size` | size | +| `admin-username` | adminUsername | +| `ssh-public-key` (Linux) / `admin-password` (Windows) | sshKey / adminPassword | +| `virtual-network` | vnetName (omit to auto-create) | +| `subnet` | subnetName (omit to auto-create) | +| `network-security-group` | nsgName (omit to auto-create) | +| `public-ip-address` | publicIpName | +| `no-public-ip` | true if Plan Card says private only | +| `source-address-prefix` | restrict NSG inbound source (e.g., user's IP) | +| `zone` | zone | +| `os-disk-size-gb` | osDiskSizeGb | +| `os-disk-type` | osDiskType | +| `os-type` | 
linux / windows (usually auto-detected from image) | + +## Apply — VMSS + +Call `compute_vmss_create` with the same fields, plus: + +| MCP parameter | Plan Card field | +|---|---| +| `vmss-name` | name | +| `instance-count` | instanceCount | +| `upgrade-policy` | upgradePolicy (Manual / Automatic / Rolling) | + +## After apply +- Tool returns `VmCreateResult` / `VmssCreateResult` with `Id`, `Name`, `Location`, `VmSize`, `ProvisioningState`, `PublicIpAddress`, `PrivateIpAddress`, `Zones`, `Tags`. +- Echo back to user: hostname / IP / SSH command (`ssh {adminUsername}@{publicIp}`) or RDP command (`mstsc /v:{publicIp}`). +- Offer next steps: list/inspect (`compute_vm_get`), update (`compute_vm_update`), delete (`compute_vm_delete`). + +## Failure handling + +| Error | Action | +|---|---| +| `Quota exceeded` | re-run `compute_vm_check-quota`; suggest smaller SKU or different family | +| `A VM with the specified name already exists` | ask for a new name | +| `Resource not found` on RG | create the RG first (`group_create` MCP or `az group create`) | +| `Authorization failed` | user needs Contributor or VM Contributor on the RG | diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/terraform.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/terraform.md new file mode 100644 index 00000000..75b7025f --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/output-adapters/terraform.md @@ -0,0 +1,32 @@ +# Terraform adapter + +Multi-cloud, existing TF state, organization standardized on Terraform. 
+ +## Templates + +Emit three code files to [`examples/terraform/`](../../examples/terraform/README.md): +- `main.tf` — provider, RG, VNet, subnet, NSG (SSH allow), public IP, NIC, Linux VM +- `variables.tf` — typed input variables with defaults +- `outputs.tf` — `vm_id`, `public_ip` + +Always emit `examples/terraform/README.md` (Plan Card, prereqs, quickstart, variables table, outputs, cleanup). When the user requests a PR (Mode C), the same README becomes the PR body. + +## Deploy + +```bash +terraform init +terraform plan -var "vm_name={vmName}" -var "admin_public_key=$(cat ~/.ssh/id_rsa.pub)" \ + -var "subscription_id=$AZ_SUB" -var "resource_group_name={resourceGroup}" +terraform apply -var "vm_name={vmName}" -var "admin_public_key=$(cat ~/.ssh/id_rsa.pub)" \ + -var "subscription_id=$AZ_SUB" -var "resource_group_name={resourceGroup}" +``` + +## VMSS + +Replace `azurerm_linux_virtual_machine` with `azurerm_linux_virtual_machine_scale_set` (or Windows variants). Add `instances`, `upgrade_mode = "Manual" | "Automatic" | "Rolling"`. NIC moves inline inside the scale set resource via `network_interface { ip_configuration { ... } }`. + +## Notes +- Provider version pinned to `~> 4.0` — bump deliberately, not implicitly. +- `admin_public_key` is `sensitive = true`; don't print it. +- `zone` is `""` by default (regional); to pin, pass `"1"`, `"2"`, or `"3"`. +- Pre-check quota with `compute_vm_check-quota` before `terraform apply`. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/plan-card.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/plan-card.md new file mode 100644 index 00000000..e72b73aa --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/plan-card.md @@ -0,0 +1,64 @@ +# Plan Card + +The Plan Card is the single source of truth for the create-flow. 
It renders **every decision** — explicit user answers and silent defaults — as a markdown table the user can read top-to-bottom and either approve or edit. + +## Rendering rules + +1. **Cost + quota on the top half.** The user must see both without scrolling. +2. **Source column is mandatory.** Each row says where the value came from: `user`, `default`, `inferred`, or the MCP tool that produced it. +3. **Flag risky defaults with ⚠.** Open NSG to `*`, public IP exposed, no managed identity, password auth on Windows — all get a marker so the user can edit. +4. **No invisible state.** If you defaulted it, it goes in the card. +5. **Re-emit on every change.** When the user edits a row, render the full card again — diffs in chat are easy to miss. + +## Schema + +| Column | Required? | Notes | +|---|---|---| +| Setting | yes | Human-readable label (`Region`, `Size`, `OS disk`) | +| Value | yes | The concrete value, in backticks if it's a literal | +| Source | yes | `user` / `default` / `inferred` / `` | + +## Example — Linux dev VM in eastus + +```markdown +| Setting | Value | Source | +|---|---|---| +| Estimated cost | ~$0.096/hr (~$70/mo) | from `compute_vm_list-skus` | +| Quota | ✅ 4/100 vCPUs used in `standardDSv5Family` | from `compute_vm_check-quota` | +| Hosting model | Single VM | user | +| Name | `dev-vm-01` | default (`-vm-`) | +| Region | `eastus` | user | +| Resource group | `dev-vm-01-rg` (new) | default | +| Image | `Ubuntu2404` | user | +| Size | `Standard_D2s_v5` (2 vCPU / 8 GB) | default | +| Auth | SSH key from `~/.ssh/id_rsa.pub` | inferred | +| VNet | new `dev-vm-01-vnet` (`10.0.0.0/16`) | default | +| Subnet | `default` (`10.0.0.0/24`) | default | +| NSG | SSH 22 from your public IP (`203.0.113.42`) | default — ⚠ change to `*` only if needed | +| Public IP | Standard, static | default | +| OS disk | 30 GB Premium SSD | default | +| Boot diagnostics | Managed | default | +``` + +## After rendering — single batched action picker + +Render the Plan
Card markdown **inline in the chat first** (so the user can read it), then ask **one** AskUserQuestion that combines approval + output format + delivery: + +> *"Looks good? Pick how you want it delivered:"* +> +> 1. **Save Bicep to `./infra/{vm-name}/`** *(Recommended for repos)* +> 2. **Print az CLI in chat** *(Quick copy-paste)* +> 3. **Save Terraform to `./infra/{vm-name}/`** +> 4. **Open GitHub PR with Bicep** +> 5. **Apply live via Azure MCP** *(actually creates resources)* +> 6. **Edit a row first** *(then re-render and re-ask)* + +This collapses what used to be 3 sequential popups (approve → output format → delivery) into **1**. + +Implementation: a single `AskUserQuestion` tool call with `header: "Deliver"`, `multiSelect: false`, and 6 options (the most likely combinations above) — the user can also pick "Other" to type a custom answer like "give me both bicep and terraform". + +**If the user picks "Edit a row first":** then ask which row, update, re-render the full Plan Card, and re-ask the same batched action picker. Do not splinter into multiple popups. + +**If the user already implied the answer in their original prompt** ("save bicep to ./infra" / "open a PR" / "just print az CLI" / "apply it"): **skip this prompt entirely** and proceed straight to delivery. + +**Explicit-override fast path — skip the Plan Card table too.** If the user combines an explicit deliverable ("give me the Bicep", "just print az CLI") with an explicit refusal of dialog ("no questions", "skip planning", "no plan", "just do it"): **do not render the Plan Card markdown table.** Instead emit a single-line preview of the high-signal decisions — e.g. *"→ Deploying `Standard_D2s_v5` in `eastus`, NSG = your public IP only on 22, est. ~$70/mo"* — and follow it immediately with the requested artifact. End with a one-liner noting the full Plan Card is available on request if they want to edit rows. Step 4 validation gates still run. 
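The rendering rules earlier in this file (mandatory Source column, ⚠ on risky defaults, no invisible state) can be illustrated with a minimal sketch. `PlanRow`, its `risky` flag, and `render_plan_card` are hypothetical names introduced here for illustration only; the skill does not prescribe an implementation, and which defaults count as risky is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class PlanRow:
    setting: str
    value: str
    source: str          # "user", "default", "inferred", or an MCP tool name
    risky: bool = False  # assumption: the caller decides which defaults are risky


def render_plan_card(rows):
    """Render every row as a three-column markdown table."""
    lines = ["| Setting | Value | Source |", "|---|---|---|"]
    for row in rows:
        # Rule 2: the Source column is mandatory for every row.
        if not row.source:
            raise ValueError(f"Source is mandatory for {row.setting!r}")
        # Rule 3: risky defaults get a visible marker.
        source = f"{row.source} ⚠" if row.risky else row.source
        lines.append(f"| {row.setting} | {row.value} | {source} |")
    return "\n".join(lines)


rows = [
    PlanRow("Region", "`eastus`", "user"),
    PlanRow("Public IP", "Standard", "default", risky=True),
]
card = render_plan_card(rows)
print(card)
```

Re-emitting the card after an edit (rule 5) then amounts to updating one `PlanRow` and calling `render_plan_card` again on the full list.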
diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/validation-gates.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/validation-gates.md new file mode 100644 index 00000000..3efc3d57 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/references/validation-gates.md @@ -0,0 +1,39 @@ +# Validation Gates + +Step 4 of the create-flow runs four read-only checks before the Plan Card is shown. Required path: Azure MCP. Fallback: CLI patterns in [mcp-tools.md](mcp-tools.md). Do **not** substitute generic guidance tools (`get_azure_bestpractices`, `pricing`) — they don't validate quota, SKU availability, or region support. + +## Checks + +| Check | MCP tool | What to verify | +|---|---|---| +| SKU exists in region | `compute_vm_list-skus` (`location`, `minVCpus`, `minMemoryGb`, optional `familyPrefix`) | At least one matching SKU, no `restrictions` in target zone | +| Image is current | `compute_vm_list-images` (alias or `publisher`/`offer`/`sku`) | Alias resolves to a published URN | +| vCPU quota | `compute_vm_check-quota` (`location`, `family`) | `currentValue + requestedVCPUs ≤ limit` | +| Region availability | `compute_vm_recommend-region` (workload hints) | Region exists and supports the family | + +## Outcomes + +| Result | Action | +|---|---| +| ✅ Sufficient | Proceed to Step 5 (Plan Card) | +| ⚠️ Near limit (>80%) | Proceed but flag in Plan Card; suggest quota increase | +| ❌ Insufficient / SKU missing | Propose alternate SKU or region; do not generate output | + +## Common failures + +| Symptom | Cause | Fix | +|---|---|---| +| `compute_vm_list-skus` returns empty | Filter too narrow; SKU not in region | Drop `familyPrefix`; lower `minVCpus`; try another region | +| Quota at limit | Subscription cap | Smaller SKU / different family / different region / quota-increase request | +| Image URN unresolved | Wrong alias; deprecated image | Switch to 
`publisher`/`offer`/`sku`/`version` form; check Marketplace | +| Region rejects family | Family not GA in region | Use `compute_vm_recommend-region` to find a region that supports the family | + +## When Azure MCP is not connected + +Warn the user that pre-flight checks are reduced. Use the CLI equivalents in [mcp-tools.md](mcp-tools.md): + +- `az vm list-skus --location --output table` +- `az vm image list --location --offer ubuntu-24_04-lts --all` +- `az vm list-usage --location --output table` + +These don't gate the artifact generation — they're informational. Surface their output verbatim so the user can self-check. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/vm-creator.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/vm-creator.md new file mode 100644 index 00000000..04e42db2 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-creator/vm-creator.md @@ -0,0 +1,116 @@ +# Azure VM/VMSS Creator + +Guided create-flow for Azure Virtual Machines (VMs) and VM Scale Sets (VMSS). Adapts to the user's expertise — beginners get sensible defaults; networking/spec/cost/security experts get the deep questions for their domain only — then emits the chosen artifact: az CLI bash, Bicep, Terraform, or live apply via Azure MCP. + +## When to use + +- User wants to **create / provision / deploy / spin up** a VM or VMSS (not just pick a SKU) +- User has a recommendation in hand and wants a deployable artifact +- User asks for a "create VM" script, template, or commands in az CLI, Bicep, or Terraform + +> **Disambiguator.** If the user wants to deploy an **application** (Docker service, web app, API, function), route to `azure-prepare`. This workflow is for **bare VM/VMSS infrastructure** only. +> **Recommender first.** If the user has not picked a SKU yet ("what should I pick?"), pause and run [vm-recommender](../vm-recommender/vm-recommender.md) Steps 1–6 first, then resume here. 
+ +## Workflow + +### Step 1 — Determine VM vs VMSS + +If the user already said "VM" or "VMSS" / "scale set", use that. Otherwise: autoscaling, multiple identical instances, or stateless tier behind a load balancer → **VMSS**; everything else → **VM**. If unsure, default to single VM and ask one confirmation. + +### Step 2 — Depth Probe + +Classify the user's first 1–2 messages against the signal table in [depth-probe/index.md](references/depth-probe/index.md) and pick the highest-scoring branch: + +| Branch | File | +|---|---| +| Beginner / fast-path | [beginner.md](references/depth-probe/beginner.md) | +| Networking-deep | [networking-deep.md](references/depth-probe/networking-deep.md) | +| Spec-deep | [spec-deep.md](references/depth-probe/spec-deep.md) | +| Cost-deep | [cost-deep.md](references/depth-probe/cost-deep.md) | +| Security-deep | [security-deep.md](references/depth-probe/security-deep.md) | + +> Never ask a question whose answer can be inferred or safely defaulted. Batch silent inferences into the Plan Card so the user can see and edit them. + +### Step 3 — Adaptive Gather + +Ask **only** the questions from the matched branch's matrix. Use the defaults ladder when asking: + +> *"NSG inbound rules — `[Recommended: SSH from your IP only]` / `[Show alternatives]` / `[I have specifics]`"* + +Cross-branch follow-ups (once, after the primary branch): +- Resource group (existing or new — propose `-rg`) +- Tags (accept "none" without follow-up) +- VMSS only: instance count, orchestration mode (default **Flexible**) + +If the user volunteers a deep signal mid-flow, append the relevant matrix questions for that topic. Do not restart. + +### Step 4 — Validate + +> **GATE — do not present the Plan Card until validation passes.** + +Use the Azure MCP read-only tools listed in [validation-gates.md](references/validation-gates.md) (SKU exists in region / image is current / quota headroom / region availability). 
Required path; CLI fallback is documented in [mcp-tools.md](references/mcp-tools.md). + +Outcomes: + +| Result | Action | +|---|---| +| ✅ Sufficient | Proceed to Step 5 | +| ⚠️ Near limit (>80%) | Proceed but flag in Plan Card; suggest quota increase | +| ❌ Insufficient / SKU missing | Propose alternate SKU or region; do **not** generate output | + +### Step 5 — Plan Card (with explicit-override fast path) + +**Default path.** Render a single markdown table summarizing **every decision** (explicit answers + silent defaults). The user reads top-to-bottom and either approves or edits any row before output is generated. See [plan-card.md](references/plan-card.md) for the schema, example, and rendering rules. + +Ask: *"Approve as-is, edit a row, or change output format?"* — do not generate until approved. + +**Explicit-override fast path.** If the user's prompt combines (a) an explicit deliverable ("give me the Bicep", "just print the az CLI", "apply it via MCP") **and** (b) an explicit refusal of dialog ("no questions", "skip planning", "no plan", "just do it"), **respect them**. Skip the Plan Card table and the approval AskUserQuestion. Instead: + +1. Emit a **single-line preview** that surfaces the high-signal decisions inline — e.g. *"→ Deploying `Standard_D2s_v5` in `eastus`, OS `Ubuntu2404`, NSG = your public IP only on 22, est. ~$70/mo."* +2. Immediately emit the requested artifact (Bicep / Terraform / az CLI / MCP apply). +3. Mention once, at the end, that the full Plan Card is available on request if they want to edit rows. + +Step 4 validation gates (SKU / image / quota / region) still run on the fast path — they protect against broken artifacts, not user intent. If validation fails, fall back to the ❌ outcome in Step 4 (propose alternate SKU/region, do not generate output). 
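Because the Step 4 gates still run on the fast path, it may help to see the quota outcome as plain arithmetic. The sketch below is illustrative only: `classify_quota` is a hypothetical helper, and it assumes the 80% threshold is measured on projected usage (current plus requested vCPUs), which the workflow does not spell out.

```python
def classify_quota(current_value, requested_vcpus, limit, near_limit_ratio=0.8):
    """Map a vCPU quota check onto the three Step 4 outcomes.

    Assumption: "near limit" means
    (current_value + requested_vcpus) / limit > near_limit_ratio.
    """
    projected = current_value + requested_vcpus
    if limit <= 0 or projected > limit:
        return "insufficient"  # propose alternate SKU or region; no output
    if projected / limit > near_limit_ratio:
        return "near_limit"    # proceed, but flag in the Plan Card
    return "sufficient"        # proceed to Step 5


print(classify_quota(4, 2, 100))
print(classify_quota(80, 2, 100))
print(classify_quota(99, 2, 100))
```

If the intended reading is that the threshold applies to current usage alone, only the `projected / limit` comparison would change.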
+ +### Step 6 — Output Choice + +Ask the user to pick one of four formats (or use the one they already specified): + +| Format | When | Adapter | +|---|---|---| +| **az CLI bash** | Quick one-off, learning, copy-paste | [az-cli.md](references/output-adapters/az-cli.md) | +| **Bicep** | Repeatable IaC, production, ARM-native | [bicep.md](references/output-adapters/bicep.md) | +| **Terraform** | Existing TF state, multi-cloud | [terraform.md](references/output-adapters/terraform.md) | +| **Apply via Azure MCP** | "Just do it" — MCP connected, user trusts the Plan Card | [mcp-apply.md](references/output-adapters/mcp-apply.md) | + +All four adapters consume the **same Plan Card parameter set** — switching format is a re-render, not a re-gather. For Apply via MCP, confirm one more time (the only destructive path) before calling `compute_vm_create` / `compute_vmss_create`. + +### Step 7 — Delivery + +> **Skip for Apply via MCP** — the artifact is the live deployment. + +For `az CLI` / `Bicep` / `Terraform`, ask one final question: *where should it land?* See [delivery-options/index.md](references/delivery-options/index.md) for the decision logic. Three modes: [print](references/delivery-options/print.md), [save locally](references/delivery-options/save-local.md), [GitHub PR](references/delivery-options/github-pr.md). + +If the user later says "also save it locally" or "open the PR now", re-run delivery with the cached Plan Card — **do not re-ask Plan Card questions**. 
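The point that switching format is a re-render rather than a re-gather can be sketched as one cached parameter set passed to interchangeable renderers. Every name here (`plan`, `render_az_cli`, `render_preview`, `ADAPTERS`, `deliver`) is hypothetical; the real adapters are the referenced adapter files. The `az vm create` options shown are standard CLI flags, but the exact command the az CLI adapter emits is an assumption, and auth and networking flags are left out for brevity.

```python
import shlex


def render_az_cli(plan):
    """Build an `az vm create` command line from the cached plan values."""
    args = [
        "az", "vm", "create",
        "--resource-group", plan["resource_group"],
        "--name", plan["name"],
        "--location", plan["region"],
        "--image", plan["image"],
        "--size", plan["size"],
    ]
    # Quote each argument so the printed command is safe to paste into a shell.
    return " ".join(shlex.quote(a) for a in args)


def render_preview(plan):
    """One-line preview, in the spirit of the Step 5 fast path."""
    return f"Deploying {plan['size']} in {plan['region']}, OS {plan['image']}"


ADAPTERS = {"az-cli": render_az_cli, "preview": render_preview}


def deliver(plan, fmt):
    # Re-render from the cached plan; nothing is re-asked.
    return ADAPTERS[fmt](plan)


plan = {
    "name": "dev-vm-01",
    "resource_group": "dev-vm-01-rg",
    "region": "eastus",
    "image": "Ubuntu2404",
    "size": "Standard_D2s_v5",
}
print(deliver(plan, "az-cli"))
print(deliver(plan, "preview"))
```

Keeping the plan as plain data is what lets a later "also save it locally" request reuse the same values without another round of questions.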
+ +## Error handling + +| Scenario | Action | +|---|---| +| Azure MCP not connected | Skip MCP pre-flight; warn that quota / SKU availability is unverified; offer CLI fallback | +| `compute_vm_list-skus` returns empty | Broaden filter (drop `familyPrefix`, lower `minVCpus`); if still empty, suggest another region | +| Quota insufficient | Show the gap; offer (a) smaller SKU, (b) different family, (c) different region, (d) quota-increase link | +| User wants Windows but supplies SSH key | Switch auth to password (with strength check) or RDP + cert; do not generate broken artifact | +| User asks "what was that az CLI again?" after picking Bicep | Re-render via Adapter 1; do not re-ask questions | +| Custom image / Shared Image Gallery | Pass full resource ID to `compute_vm_list-images`; do not try to map to an alias | +| User requests confidential / FedRAMP / HIPAA controls mid-flow | Append Security-deep questions; flag any defaults that fail the compliance bar | + +## Routing back / handoff + +| Situation | Route to | +|---|---| +| Deploy an **application** (not a bare VM) | `azure-prepare` skill | +| Reserve capacity *before* creating | [capacity-reservation](../capacity-reservation/capacity-reservation.md) | +| Enroll the new VM in management | [essential-machine-management](../essential-machine-management/essential-machine-management.md) | +| Compare more SKU / pricing options | [vm-recommender](../vm-recommender/vm-recommender.md) Steps 1–6 | +| Post-create RDP / SSH issues | [vm-troubleshooter](../vm-troubleshooter/vm-troubleshooter.md) | diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/references/handoff-to-creator.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/references/handoff-to-creator.md new file mode 100644 index 00000000..70515e40 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/references/handoff-to-creator.md @@ -0,0 +1,32 @@ +# 
Hand-off to vm-creator + +When the user wants to **provision** the recommended option (not just compare), hand off to [vm-creator](../../vm-creator/vm-creator.md). Don't skip directly to an output adapter — the user must see and approve the Plan Card first. + +## Required before hand-off + +Render the [Plan Card](../../vm-creator/references/plan-card.md) markdown table **in chat** with the chosen SKU, region, instance count, pricing, and quota status pre-filled from the recommender's work. The user is approving the Plan Card, not the artifact. + +## Routing signals + +| User says | Action | +|---|---| +| "let's create it" / "spin one up" / "deploy this" | Render Plan Card → route to `vm-creator` Step 5 with selected SKU + region pre-filled | +| "give me the az CLI / Bicep / Terraform" | Render Plan Card → route to `vm-creator` Step 6 (Output Choice) | +| "just compare prices" / "I'm still deciding" | Stay in `vm-recommender`; offer to revisit | + +## Example hand-off message + +> *"Want me to generate the create command? I can output az CLI, Bicep, Terraform, or apply it via Azure MCP — I'll carry over the SKU, region, and pricing we just landed on."* + +## What carries over + +| Recommender output | Plan Card row | +|---|---| +| Hosting Model (VM vs VMSS) | `Hosting model` | +| VM Size (ARM SKU) | `Size` | +| Region | `Region` | +| Instance Count (or `min–max`) | `Instance count` (VMSS only) | +| Estimated $/hr | `Estimated cost` | +| Quota Status (✅/⚠️/❌) | `Quota` | + +`vm-creator` Steps 2–4 (Depth Probe, Adaptive Gather, Validate) still run after hand-off to fill in OS, auth, networking, and tagging — they're additive on top of the recommender's spec choice. 
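The carry-over table can be read as a simple field mapping. The sketch below assumes the recommender's output is available as a dictionary with the hypothetical keys shown; neither those keys nor `to_plan_card_rows` come from the skill itself.

```python
# Hypothetical recommender keys mapped to the Plan Card row labels.
CARRY_OVER = {
    "hosting_model": "Hosting model",
    "vm_size": "Size",
    "region": "Region",
    "instance_count": "Instance count",
    "estimated_hourly": "Estimated cost",
    "quota_status": "Quota",
}


def to_plan_card_rows(recommendation):
    """Pre-fill Plan Card rows from a recommender result."""
    rows = {}
    for key, label in CARRY_OVER.items():
        # The Instance count row applies to VMSS only.
        if key == "instance_count" and recommendation.get("hosting_model") != "VMSS":
            continue
        if key in recommendation:
            rows[label] = recommendation[key]
    return rows


single_vm = {"hosting_model": "VM", "vm_size": "Standard_D2s_v5",
             "region": "eastus", "instance_count": 1}
print(to_plan_card_rows(single_vm))
```

Rows that the recommender cannot supply (OS, auth, networking, tags) are simply absent here, which matches the note that Steps 2–4 of `vm-creator` fill them in afterwards.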
diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/references/web-fetch-policy.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/references/web-fetch-policy.md new file mode 100644 index 00000000..5840ed61 --- /dev/null +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/references/web-fetch-policy.md @@ -0,0 +1,35 @@ +# web_fetch policy + +Steps 2 and 3 of the recommender rely on `web_fetch` against `learn.microsoft.com` to verify that a recommendation reflects current capabilities (especially VMSS features, family availability, and Spot eligibility). + +## When `web_fetch` succeeds + +Use the live documentation as the source of truth. Cite the URL in the recommendation so the user can verify. + +## When `web_fetch` fails (timeout, 404, blocked, offline) + +Proceed using the reference files in `../../references/` — but **always** include this warning in the recommendation: + +> ⚠ Unable to verify against latest Azure documentation. Recommendation is based on reference material that may not reflect recent updates (e.g., new VM families, Spot eligibility changes, regional rollouts). + +Do not block the recommendation on `web_fetch` failure. The user is better served by an annotated recommendation than by no recommendation. 
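The success and failure paths above can be sketched as a small wrapper. This is an illustration under assumptions, not the skill's mechanism: `verify_with_docs` is a hypothetical helper, `fetch` stands in for whatever `web_fetch` implementation is available, and the warning text is abbreviated from the one quoted above.

```python
STALE_WARNING = (
    "⚠ Unable to verify against latest Azure documentation. "
    "Recommendation is based on reference material that may not reflect recent updates."
)


def verify_with_docs(url, fetch, reference_text):
    """Return (content, source_url, warning).

    `fetch` is any callable that returns page text or raises on failure.
    """
    try:
        # Success: live documentation is the source of truth; keep the URL to cite.
        return fetch(url), url, None
    except Exception:
        # Timeout, 404, blocked, offline: do not block, annotate instead.
        return reference_text, None, STALE_WARNING


def offline_fetch(url):
    raise TimeoutError("simulated network failure")


content, cited, warning = verify_with_docs(
    "https://learn.microsoft.com/azure/virtual-machine-scale-sets/overview",
    offline_fetch,
    "reference file content",
)
print(content, cited, warning)
```

Catching every exception is deliberately broad for this sketch, since the policy treats all fetch failures the same way.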
+ +## What to fetch (Step 2 — VMSS) + +``` +https://learn.microsoft.com/azure/virtual-machine-scale-sets/overview +https://learn.microsoft.com/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview +``` + +## What to fetch (Step 3 — VM family) + +``` +https://learn.microsoft.com/azure/virtual-machines/sizes// +``` + +Examples: +- B-series: `https://learn.microsoft.com/azure/virtual-machines/sizes/general-purpose/b-family` +- D-series: `https://learn.microsoft.com/azure/virtual-machines/sizes/general-purpose/ddsv5-series` +- GPU: `https://learn.microsoft.com/azure/virtual-machines/sizes/gpu-accelerated/nc-family` + +For Spot, also: `https://learn.microsoft.com/azure/virtual-machine-scale-sets/use-spot`. diff --git a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/vm-recommender.md b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/vm-recommender.md index 130b5ebe..4b6b8607 100644 --- a/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/vm-recommender.md +++ b/.github/plugins/azure-skills/skills/azure-compute/workflows/vm-recommender/vm-recommender.md @@ -1,168 +1,127 @@ # Azure VM Recommender -Recommend Azure VM sizes, VM Scale Sets (VMSS), and configurations by analyzing workload type, performance requirements, scaling needs, and budget. No Azure subscription required — all data comes from public Microsoft documentation and the unauthenticated Retail Prices API. +Recommend Azure VM sizes, VM Scale Sets (VMSS), and configurations by analyzing workload type, performance requirements, scaling needs, and budget. No Azure subscription required — data comes from public Microsoft documentation and the unauthenticated Retail Prices API. 
## When to Use This Skill - User asks which Azure VM or VMSS to choose for a workload -- User needs VM size recommendations for web, database, ML, batch, HPC, or other workloads - User wants to compare VM families, sizes, or pricing tiers -- User asks about trade-offs between VM options (cost vs performance) -- User needs a cost estimate for Azure VMs without an Azure account -- User asks whether to use a single VM or a scale set -- User needs autoscaling, high availability, or load-balanced VM recommendations -- User asks about VMSS orchestration modes (Flexible vs Uniform) +- User asks about trade-offs (cost vs performance, single VM vs scale set, orchestration modes) +- User needs a cost estimate without an Azure subscription +- User asks whether autoscaling is needed, or wants to decide between a single VM and a scale set ## Workflow -> Use reference files for initial filtering - -> **CRITICAL: then always verify with live documentation** from learn.microsoft.com before making final recommendations. If `web_fetch` fails, use reference files as fallback but warn the user the information may be stale. +> Use reference files for initial filtering. Then **verify with live documentation** via `web_fetch` before final recommendations. If `web_fetch` fails, fall back to the reference files and surface the staleness warning from [web-fetch-policy.md](references/web-fetch-policy.md).
### Step 1: Gather Requirements -Ask the user for (infer when possible): - -| Requirement | Examples | -| ---------------------- | ------------------------------------------------------------------ | -| **Workload type** | Web server, relational DB, ML training, batch processing, dev/test | -| **vCPU / RAM needs** | "4 cores, 16 GB RAM" or "lightweight" / "heavy" | -| **GPU needed?** | Yes → GPU families; No → general/compute/memory | -| **Storage needs** | High IOPS, large temp disk, premium SSD | -| **Budget priority** | Cost-sensitive, performance-first, balanced | -| **OS** | Linux or Windows (affects pricing) | -| **Region** | Affects availability and price | -| **Instance count** | Single instance, fixed count, or variable/dynamic | -| **Scaling needs** | None, manual scaling, autoscale based on metrics or schedule | -| **Availability needs** | Best-effort, fault-domain isolation, cross-zone HA | -| **Load balancing** | Not needed, Azure Load Balancer (L4), Application Gateway (L7) | +Ask the user (infer when possible): + +| Requirement | Examples | +|---|---| +| Workload type | Web server, relational DB, ML training, batch, dev/test | +| vCPU / RAM needs | "4 cores, 16 GB" or "lightweight" / "heavy" | +| GPU needed? | Yes → GPU families; No → general / compute / memory | +| Storage needs | High IOPS, large temp disk, premium SSD | +| Budget priority | Cost-sensitive, performance-first, balanced | +| OS | Linux or Windows (affects pricing) | +| Region | Affects availability and price | +| Instance count | Single, fixed count, or variable | +| Scaling needs | None, manual, autoscale (metrics / schedule) | +| Availability needs | Best-effort, fault-domain, cross-zone HA | +| Load balancing | None, Azure Load Balancer (L4), Application Gateway (L7) | ### Step 2: Determine VM vs VMSS -**Workflow:** - -1. Review [VMSS Guide](../../references/vmss-guide.md) to understand when VMSS vs single VM is appropriate -2. 
Use the gathered requirements to decide which approach fits best -3. **REQUIRED: If recommending VMSS**, fetch current documentation to verify capabilities: - ```bash - web_fetch https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/overview - web_fetch https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview - ``` -4. **If `web_fetch` fails**, proceed with reference file guidance but include this warning: - > Unable to verify against latest Azure documentation. Recommendation based on reference material that may not reflect recent updates. - -```text -Needs autoscaling? -├─ Yes → VMSS -├─ No -│ ├─ Multiple identical instances needed? -│ │ ├─ Yes → VMSS -│ │ └─ No -│ │ ├─ High availability across fault domains / zones? -│ │ │ ├─ Yes, many instances → VMSS -│ │ │ └─ Yes, 1-2 instances → VM + Availability Zone -│ │ └─ Single instance sufficient? → VM -``` +Review [VMSS Guide](../../references/vmss-guide.md). Decision shortcut — start by asking **Needs autoscaling?** then walk the table: -| Signal | Recommendation | Why | -| --------------------------------------------- | ----------------------------- | --------------------------------------------------------------------- | -| Autoscale on CPU, memory, or schedule | **VMSS** | Built-in autoscale; no custom automation needed | -| Stateless web/API tier behind a load balancer | **VMSS** | Homogeneous fleet with automatic distribution | -| Batch / parallel processing across many nodes | **VMSS** | Scale out on demand, scale to zero when idle | -| Mixed VM sizes in one group | **VMSS (Flexible)** | Flexible orchestration supports mixed SKUs | -| Single long-lived server (jumpbox, AD DC) | **VM** | No scaling benefit; simpler management | -| Unique per-instance config required | **VM** | Scale sets assume homogeneous configuration | -| Stateful workload, tightly-coupled cluster | **VM** (or VMSS case-by-case) | Evaluate carefully; VMSS Flexible can work 
for some stateful patterns | +| Signal | Pick | +|---|---| +| Autoscale on CPU, memory, or schedule | **VMSS** | +| Stateless web/API tier behind a load balancer | **VMSS** | +| Batch / parallel processing across many nodes | **VMSS** | +| Mixed VM sizes in one group | **VMSS (Flexible)** | +| Single long-lived server (jumpbox, AD DC) | **VM** | +| Unique per-instance config | **VM** | +| Stateful, tightly-coupled cluster | **VM** (or VMSS case-by-case) | -> **Warning:** If the user is unsure, default to **single VM** for simplicity. Recommend VMSS only when scaling, HA, or fleet management is clearly needed. +If recommending VMSS, verify with `web_fetch` per [web-fetch-policy.md](references/web-fetch-policy.md). When in doubt, default to a single **VM**. ### Step 3: Select VM Family -**Workflow:** - -1. Review [VM Family Guide](../../references/vm-families.md) to identify 2-3 candidate VM families that match the workload requirements -2. **REQUIRED: verify specifications** for your chosen candidates by fetching current documentation: - ```bash - web_fetch https://learn.microsoft.com/en-us/azure/virtual-machines/sizes// - ``` - - Examples: - - B-series: `https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/general-purpose/b-family` - - D-series: `https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/general-purpose/ddsv5-series` - - GPU: `https://learn.microsoft.com/en-us/azure/virtual-machines/sizes/gpu-accelerated/nc-family` - -3. **If considering Spot VMs**, also fetch: - ```bash - web_fetch https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/use-spot - ``` +Review [VM Family Guide](../../references/vm-families.md) and pick 2–3 candidate families. Verify each candidate's specs with `web_fetch` against: -4. **If `web_fetch` fails**, proceed with reference file guidance but include this warning: - > Unable to verify against latest Azure documentation. 
Recommendation based on reference material that may not reflect recent updates or limitations (e.g., Spot VM compatibility). +``` +https://learn.microsoft.com/en-us/azure/virtual-machines/sizes// +``` -This step applies to both single VMs and VMSS since scale sets use the same VM SKUs. +For Spot eligibility, also fetch `https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/use-spot`. If any fetch fails, follow [web-fetch-policy.md](references/web-fetch-policy.md). Same SKUs apply to single VMs and VMSS. ### Step 4: Look Up Pricing -Query the Azure Retail Prices API — [Retail Prices API Guide](../../references/retail-prices-api.md) +Query the Azure Retail Prices API per [Retail Prices API Guide](../../references/retail-prices-api.md). -> **Tip:** VMSS has no extra charge — pricing is per-VM instance. Use the same VM pricing from the API and multiply by the expected instance count to estimate VMSS cost. For autoscaling workloads, estimate cost at both the minimum and maximum instance count. +> **VMSS:** no extra charge — pricing is per-VM. Multiply per-instance price × expected count. For autoscale, estimate at both `min` and `max`. ### Step 5: Validate Quota Availability -> **GATE — Do not present recommendations until quota is validated.** +> **GATE — do not present recommendations until quota is validated.** -If the user has an Azure subscription and region, follow the [VM Quota Validation Guide](../../references/vm-quotas.md) to check vCPU capacity for each candidate VM family. Skip this step if no subscription — add a note that quota should be checked before deployment. +If the user has a subscription + region, review and run the checks from [VM Quota Validation Guide](../../references/vm-quotas.md). Without a subscription, note quota must be checked before deployment. 
| Outcome | Action | |---|---| | ✅ Sufficient | Proceed to Step 6 | -| ⚠️ Near limit (>80%) | Proceed but warn; suggest requesting increase | +| ⚠️ Near limit (>80%) | Proceed but warn; suggest quota increase | | ❌ Insufficient | Request increase, swap family, or try another region | -Include a "Quota Status" column (✅/⚠️/❌) in the recommendation table. - -> 📖 **Full details:** See [VM Quota Validation Guide](../../references/vm-quotas.md) for quota structure, CLI commands, VMSS considerations, and fallback methods. +Include a "Quota Status" column (✅/⚠️/❌) in the table. ### Step 6: Present Recommendations Provide **2–3 options** with trade-offs: -| Column | Purpose | -| -------------- | ----------------------------------------------- | -| Hosting Model | VM or VMSS (with orchestration mode if VMSS) | -| VM Size | ARM SKU name (e.g., `Standard_D4s_v5`) | -| vCPUs / RAM | Core specs | -| Instance Count | 1 for VM; min–max range for VMSS with autoscale | -| Estimated $/hr | Per-instance pay-as-you-go from API | -| Why | Fit for the workload | -| Trade-off | What the user gives up | +| Column | Purpose | +|---|---| +| Hosting Model | VM or VMSS (with orchestration mode if VMSS) | +| VM Size | ARM SKU name (e.g., `Standard_D4s_v5`) | +| vCPUs / RAM | Core specs | +| Instance Count | `1` for VM; `min–max` for VMSS with autoscale | +| Estimated $/hr | Per-instance pay-as-you-go | +| Why | Workload fit | +| Trade-off | What the user gives up | -> **Tip:** Always explain *why* a family fits and what the user trades off (cost vs cores, burstable vs dedicated, single VM simplicity vs VMSS scalability, etc.). +Always explain *why* a family fits and the Trade-off (cost vs cores, burstable vs dedicated, VM simplicity vs VMSS scale). 
-For VMSS recommendations, also mention: -- Recommended orchestration mode (Flexible for most new workloads) -- Autoscale strategy (metric-based, schedule-based, or both) -- Load balancer type (Azure Load Balancer for L4, Application Gateway for L7/TLS) +For VMSS, also mention orchestration mode (default **Flexible**), autoscale strategy (metric / schedule / both), and load balancer type. ### Step 7: Offer Next Steps -- Compare reservation / savings plan pricing (query API with `priceType eq 'Reservation'`) +- Compare reservation / savings plan pricing (`priceType eq 'Reservation'` in the API) - Suggest [Azure Pricing Calculator](https://azure.microsoft.com/pricing/calculator/) for full estimates -- For VMSS: suggest reviewing [autoscale best practices](https://learn.microsoft.com/en-us/azure/azure-monitor/autoscale/autoscale-best-practices) and [VMSS networking](https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-networking) +- For VMSS: [autoscale best practices](https://learn.microsoft.com/azure/azure-monitor/autoscale/autoscale-best-practices), [VMSS networking](https://learn.microsoft.com/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-networking) + +### Step 8: Hand Off to VM Creator (Optional) + +If the user wants to **actually provision** what was recommended, hand off to [vm-creator](../vm-creator/vm-creator.md). See [handoff-to-creator.md](references/handoff-to-creator.md) for the required Plan Card render and routing rules. 
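For the Step 4 pricing guidance (per-instance price multiplied by instance count, estimated at both `min` and `max`), a worked sketch may help. `vmss_cost_range` is a hypothetical helper, and 730 hours per month is a common approximation assumed here rather than something the skill specifies.

```python
HOURS_PER_MONTH = 730  # assumption: average hours in a month


def vmss_cost_range(hourly_price, min_instances, max_instances):
    """Estimate monthly VMSS cost at the autoscale minimum and maximum.

    VMSS itself adds no charge, so cost is per-instance price times count.
    """
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    low = hourly_price * min_instances * HOURS_PER_MONTH
    high = hourly_price * max_instances * HOURS_PER_MONTH
    return round(low, 2), round(high, 2)


low, high = vmss_cost_range(0.096, 2, 10)
print(f"Estimated monthly cost: ${low} to ${high}")
```

The hourly price would come from the Retail Prices API lookup; the figure used here is only a placeholder input.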
## Error Handling -| Scenario | Action | -| ------------------------------- | ------------------------------------------------------------------------------ | -| API returns empty results | Broaden filters — check `armRegionName`, `serviceName`, `armSkuName` spelling | -| User unsure of workload type | Ask clarifying questions; default to General Purpose D-series | -| Region not specified | Use `eastus` as default; note prices vary by region | -| Unclear if VM or VMSS needed | Ask about scaling and instance count; default to single VM if unsure | -| User asks VMSS pricing directly | Use same VM pricing API — VMSS has no extra charge; multiply by instance count | +| Scenario | Action | +|---|---| +| API returns empty results | Broaden filters — check `armRegionName`, `serviceName`, `armSkuName` spelling | +| User unsure of workload type | Ask clarifying questions; default to General Purpose D-series | +| Region not specified | Use `eastus` as default; note prices vary by region | +| Unclear if VM or VMSS needed | Ask about scaling + instance count; default to single VM if still unsure | +| User asks VMSS pricing directly | Same VM pricing API; VMSS has no extra charge — multiply by instance count | ## References -- [VM Family Guide](../../references/vm-families.md) — Family-to-workload mapping and selection -- [Retail Prices API Guide](../../references/retail-prices-api.md) — Query patterns, filters, and examples -- [VMSS Guide](../../references/vmss-guide.md) — When to use VMSS, orchestration modes, and autoscale patterns -- [VM Quota Validation Guide](../../references/vm-quotas.md) — vCPU quota checks, CLI commands, and capacity planning +- [VM Family Guide](../../references/vm-families.md) — family-to-workload mapping +- [Retail Prices API Guide](../../references/retail-prices-api.md) — query patterns, filters +- [VMSS Guide](../../references/vmss-guide.md) — when to use VMSS, orchestration, autoscale +- [VM Quota Validation Guide](../../references/vm-quotas.md) 
— vCPU checks, CLI commands +- [web-fetch-policy.md](references/web-fetch-policy.md) — fail-safe behavior for live docs lookups +- [handoff-to-creator.md](references/handoff-to-creator.md) — Step 8 hand-off rules +- [vm-creator](../vm-creator/vm-creator.md) — provision the recommended SKU