IaC for a single-node Kubernetes cluster running on Azure.
This Terraform module:
- Deploys a single Azure VM
- Installs k3s in server configuration
- Installs the `kube-prometheus-stack` Helm chart and exposes Grafana at `grafana.${root_domain}` for monitoring the system
- Installs `cert-manager` and configures certs and Ingress for your `${root_domain}`
- Optionally installs `ImagePullSecrets` for your Azure Container Registry in your chosen namespace (defaults to `apps`)

Prerequisites:
- Terraform installed
- Azure CLI installed and logged in (`az login`)
- kubectl installed
- An active Azure account with permissions to create VMs and networking resources
- A domain registered with Cloudflare (with an API token that has DNS edit permissions)
- SSH keypair for VM access (RSA format required by Azure, e.g., `ssh-keygen -t rsa -b 4096`)
- (Optional) An existing Azure Container Registry for deploying your applications

Make sure you have your API keys and credentials ready!
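
The tooling prerequisites above can be checked before running anything. The following is a minimal sketch, not part of `gnode.sh`; the helper name `missing_tools` is hypothetical and only reports which commands are absent from your `PATH`.

```shell
# Print the name of every listed command that is not found on PATH.
# `missing_tools` is a hypothetical helper, not something gnode.sh provides.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example (not run automatically):
#   missing="$(missing_tools terraform az kubectl ssh-keygen)"
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```

An empty result means every listed tool was found. This does not verify that `az login` has been completed.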
Log in to Azure and run the setup script:

    az login
    ./gnode.sh

The script will interactively prompt for all configuration variables and secrets (see Configuration below for details), then automatically deploy both infrastructure and applications.
To tear down all resources:

    ./gnode.sh destroy

This destroys the applications module first, then the infrastructure module.
⚠️ Azure Quirks Warning: Terraform may not successfully delete all Azure resources. After running `destroy`, verify in the Azure Portal that the resource group has been fully deleted. A VNet or Resource Group might be left behind.
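
As an alternative to checking the Portal, the same verification can be sketched with the Azure CLI. This assumes the default resource group name `gnode`. The function name `resource_group_exists` is hypothetical, while `az group exists` is a standard Azure CLI command that prints `true` or `false`.

```shell
# Succeeds (exit status 0) if the named resource group still exists.
# Relies on `az group exists`, which prints "true" or "false".
resource_group_exists() {
  [ "$(az group exists --name "$1")" = "true" ]
}

# Example (not run automatically), assuming the default group name "gnode":
#   if resource_group_exists gnode; then
#     echo "Resource group still present; delete it manually." >&2
#   fi
```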
To remove all local Terraform state and configuration files (useful for a fresh start):

    ./gnode.sh cleanup

This deletes `.terraform/` directories, `*.tfstate` files, and `*.tfvars` files from both modules.
The whole process should take approximately 10-15 minutes end to end. Terraform will output the public IP of the instance and a properly configured kubeconfig to connect to the cluster.
The gnode.sh script prompts for all configuration variables and secrets. Variables are written to terraform/infra/prod.tfvars and terraform/apps/prod.tfvars, while secrets are written to corresponding secrets.auto.tfvars files.

Variables:

| Name | Required | Description |
|---|---|---|
| `resource_group_name` | No | Azure resource group name (default: `gnode`) |
| `location` | No | Azure region (default: `westus`) |
| `vm_size` | No | Azure VM size (default: `Standard_D4s_v3`, ~$140/mo) |
| `admin_username` | No | SSH admin username for the VM (default: `g`) |
| `vm_name` | No | Name of the Azure VM (default: `gnode`) |
| `enable_github_actions_ips` | No | Whether to allow access to the Kubernetes API from GitHub Actions IP ranges (default: `false`) |
| `local_ip_address` | No | Your local IP address (CIDR) to allow Kubernetes API access; auto-detected if not provided |
| `root_domain` | Yes | Your root domain name (e.g., `gerardosalazar.com`) |
| `acr_registry_url` | No | Azure Container Registry URL (e.g., `myregistry.azurecr.io`) |
| `acr_secret_name` | No | Name of the Kubernetes image pull secret (default: `acr-secret`; only prompted if ACR URL provided) |
| `acr_secret_namespace` | No | Namespace for the image pull secret (default: `apps`; only prompted if ACR URL provided) |
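
If you prefer to set `local_ip_address` yourself rather than rely on auto-detection, the value can be derived roughly as follows. This is a sketch under assumptions: the helper name `to_cidr` is hypothetical, and the single-host prefix lengths (`/32` for IPv4, `/128` for IPv6) are a guess at what the module expects rather than something this README confirms.

```shell
# Turn a bare IP address into single-host CIDR notation.
# Values that already contain a prefix length are passed through unchanged.
to_cidr() {
  case "$1" in
    */*) printf '%s\n' "$1" ;;       # already CIDR
    *:*) printf '%s/128\n' "$1" ;;   # bare IPv6 address
    *)   printf '%s/32\n' "$1" ;;    # bare IPv4 address
  esac
}

# Example (not run automatically): api.ipify.org returns your public IP
# as plain text.
#   to_cidr "$(curl -fsS https://api.ipify.org)"
```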

Secrets:

| Name | Required | Description |
|---|---|---|
| `ssh_public_key` | Yes | SSH public key content for VM access (auto-generated if not provided) |
| `ssh_private_key_path` | No | Path to the SSH private key file (default: `~/.ssh/id_rsa_gnode`) |
| `cloudflare_api_token` | Yes | Cloudflare API token with DNS edit permissions for your domain |
| `grafana_admin_password` | Yes | Password for the Grafana admin user |
| `acr_username` | Conditional | ACR username/token (required only if `acr_registry_url` is provided) |
| `acr_password` | Conditional | ACR password (required only if `acr_registry_url` is provided) |

ACR integration is optional. If you don't provide an acr_registry_url, no image pull secrets will be created, and you can use public container images or configure registry credentials manually.
When to enable:
- You have a private Azure Container Registry with your application images
- You want Terraform to automatically create and manage Kubernetes image pull secrets
How to enable:
- Set `acr_registry_url` to your registry URL (e.g., `myregistry.azurecr.io`)
- Provide `acr_username` and `acr_password` credentials
- Optionally customize `acr_secret_name` (default: `acr-secret`) and `acr_secret_namespace` (default: `apps`)
The setup script will only prompt for ACR credentials if you provide a registry URL.
By default, the Kubernetes API (port 6443) is only accessible from your local IP address. If you want to deploy to the cluster from GitHub Actions CI/CD pipelines, you can enable access from GitHub's runner IP ranges.
When to enable:
- You have GitHub Actions workflows that need to run `kubectl` commands against the cluster
- You want automated deployments from GitHub Actions
How to enable:
Set enable_github_actions_ips = true in your prod.tfvars file.
Security considerations:
- This opens port 6443 to all GitHub Actions runner IPs (both IPv4 and IPv6 ranges)
- The Kubernetes API still requires valid kubeconfig credentials for authentication
- Consider using a more restrictive approach (e.g., VPN, bastion host) for production environments with sensitive workloads
Note: GitHub Actions IPs are fetched from GitHub's meta API and chunked to comply with Azure NSG limits (max 4000 addresses per rule).
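
The chunking described in the note can be illustrated with a short sketch. This is not the module's Terraform logic; the helper name `chunk_cidrs` and the output file prefix are hypothetical. It only shows the idea of splitting a long CIDR list into groups no larger than the per-rule limit.

```shell
# Split newline-separated CIDRs from stdin into files holding at most
# $1 entries each. Files are named <prefix>aa, <prefix>ab, and so on.
chunk_cidrs() {
  local max="$1" prefix="$2"
  split -l "$max" - "$prefix"
}

# Example (not run automatically; requires curl and jq). GitHub's meta API
# lists Actions runner ranges under the "actions" key.
#   curl -fsSL https://api.github.com/meta \
#     | jq -r '.actions[]' \
#     | chunk_cidrs 4000 ./actions-cidrs.
```

Each resulting file would then correspond to one NSG rule's address list.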
After Terraform completes, you have two ways to interact with your cluster:
SSH into the VM using the admin username you configured and the VM's public IP (output by Terraform):

    ssh <admin_username>@<vm_public_ip>

For example, with the default username:

    ssh g@$(terraform output -raw vm_public_ip)

The node will have kubectl properly configured to interact with the cluster.
First make sure you have kubectl installed locally.
Terraform generates a kubeconfig.yaml file in the repository root, pre-configured with the cluster's public IP. A placeholder is created automatically during setup to ensure the Terraform providers can initialize correctly.
Option 1: Use the `KUBECONFIG` environment variable

    export KUBECONFIG=$(pwd)/kubeconfig.yaml
    kubectl get nodes

Option 2: Copy to your default kubeconfig location

To avoid setting the environment variable each time, copy the kubeconfig to the default location:

    # Make sure the kubeconfig directory exists
    mkdir -p ~/.kube
    # Back up your existing kubeconfig if you have one
    [ -f ~/.kube/config ] && cp ~/.kube/config ~/.kube/config.backup
    # Copy the new kubeconfig
    cp kubeconfig.yaml ~/.kube/config
    # Verify access
    kubectl get nodes

You can now use kubectl without setting any environment variables:

    kubectl apply -f my-app.yaml
    kubectl get pods -n apps

After Terraform completes, an Ingress is created for your root domain (`${root_domain}` and `www.${root_domain}`) with TLS certificates automatically provisioned via Let's Encrypt. However, this Ingress points to a service that does not exist yet.
To serve traffic on your root domain, you need to deploy:
- A Deployment (or Pod) - Your application workload
- A Service - Named `${domain_name_sanitized}-service` (e.g., for `example.com`, the service should be named `example-com-service`)
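
To work out the expected Service name for your own domain, a sketch like the following may help. The module's exact sanitization rule is not shown here, so this assumes (based only on the `example.com` example) that dots are replaced with hyphens. The function name `service_name_for` is hypothetical.

```shell
# Derive the Service name the root-domain Ingress is expected to point at.
# Assumption: sanitizing means replacing every "." with "-".
service_name_for() {
  printf '%s-service\n' "$(printf '%s' "$1" | tr '.' '-')"
}

# Example:
#   service_name_for example.com
```

If the Ingress still returns 503 after you deploy, compare this name against the backend shown by `kubectl describe ingress -n apps`, assuming the Ingress lives in the `apps` namespace.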
Example deployment in the apps namespace:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
      namespace: apps
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          containers:
            - name: app
              image: "" # put your image here; it can be from your ACR
              ports:
                - containerPort: 80
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: example-com-service # Replace with your sanitized domain name
      namespace: apps
    spec:
      selector:
        app: my-app
      ports:
        - port: 80
          targetPort: 80

Deploy using kubectl:

    kubectl apply -f my-app.yaml

Until you deploy this service, requests to your root domain will return 503 errors.
Once the deployment is complete you can monitor your system via Grafana by visiting grafana.${root_domain} and logging in as the admin user with the password you configured.
The deployment is split into two phases for better stability and provider isolation:
The first phase (infrastructure) handles the Azure resources and the k3s installation.
- Navigate to the directory: `cd terraform/infra`
- Initialize: `terraform init`
- Apply: `terraform apply -var-file=prod.tfvars`

Outputs: This will generate a `kubeconfig.yaml` in the project root.

The second phase (applications) handles Helm charts and Kubernetes manifests.
- Navigate to the directory: `cd terraform/apps`
- Initialize: `terraform init`
- Apply: `terraform apply -var-file=prod.tfvars`

- Resource Group (`azurerm_resource_group.gnode_rg`)
  - Creates the Azure resource group which groups all other resources
- Virtual Network & Subnet (`azurerm_virtual_network.gnode_vnet`, `azurerm_subnet.gnode_subnet`)
  - Creates VNet (10.0.0.0/16) and subnet (10.0.1.0/24)
- Public IP (`azurerm_public_ip.gnode_ip`)
  - Allocates static public IP for VM
- Network Security Group (`azurerm_network_security_group.gnode_nsg`)
  - Configures firewall rules:
    - SSH (port 22) - open to all
    - Kubernetes API (port 6443) - restricted to GitHub Actions IPs (if enabled) + local IP
    - HTTP (port 80) - open to all
    - HTTPS (port 443) - open to all
  - Uses local IP detection from `main.tf` (via ipify.org if not provided)
- Network Interface (`azurerm_network_interface.gnode_nic`)
  - Attaches VM to subnet and public IP
- NSG Association (`azurerm_network_interface_security_group_association.gnode_nic_nsg`)
  - Applies security rules to network interface
- Virtual Machine (`azurerm_linux_virtual_machine.gnode_vm`)
  - Creates Ubuntu 22.04 LTS VM
  - Executes `manifests/cloud-init.yaml` on first boot, which:
    - Updates packages
    - Installs curl, wget, vim, git
    - Downloads and installs k3s via `curl -sfL https://get.k3s.io | sh -`
    - Enables and starts k3s service
- Wait for K3s (`null_resource.wait_for_k3s`)
  - Polls via SSH to check if k3s is ready using two conditions:
    - `systemctl is-active k3s` - verifies the k3s service is active
    - `test -f /etc/rancher/k3s/k3s.yaml` - verifies the kubeconfig file exists
- Copy Kubeconfig (`null_resource.copy_kubeconfig`)
  - SSHes into VM and copies `/etc/rancher/k3s/k3s.yaml`
  - Replaces `127.0.0.1:6443` with VM's public IP
  - Saves as `kubeconfig.yaml` in this module's directory
- Wait for K8s API (`data.kubernetes_nodes.cluster`)
  - Uses Kubernetes provider to check that the cluster is ready
- Install kube-prometheus-stack (`helm_release.kube_prometheus_stack`)
  - Creates `monitoring` namespace
  - Installs Prometheus, Grafana, and Alertmanager
  - Configures storage and retention settings for a small node
  - Waits for all resources to be ready (300s timeout)
- Install cert-manager (`helm_release.cert_manager`)
  - Creates `cert-manager` namespace
  - Installs cert-manager for TLS certificate management
  - Waits for all resources to be ready (300s timeout)
- Wait for cert-manager to be ready
  - The `helm_release.cert_manager` resource is configured with `wait = true`, which ensures the deployment and its webhook are ready before proceeding.
- Create Cloudflare API Token Secret (`kubernetes_secret.cloudflare_api_token`)
  - Creates a Kubernetes secret in the `cert-manager` namespace containing the Cloudflare API token
  - Used by cert-manager for DNS-01 ACME challenge validation
- Apply cert-manager Manifests (`null_resource` with kubectl)
  - Manifests are templated from `manifests/certs.yaml` using `var.root_domain` and applied via kubectl:
    - ClusterIssuer (`null_resource.letsencrypt_cluster_issuer`): Creates Let's Encrypt ClusterIssuer with email `admin@${root_domain}`. Depends on `helm_release.cert_manager`.
    - Certificate (`null_resource.root_domain_certificate`): Creates TLS certificate for `${root_domain}` and `www.${root_domain}`
    - Ingress (`null_resource.root_domain_ingress`): Creates ingress for `${root_domain}` and `www.${root_domain}` pointing to `${domain_name_sanitized}-service`
- Wait for Grafana
  - The `helm_release.kube_prometheus_stack` resource is configured with `wait = true`, which ensures the Grafana deployment is ready before proceeding.
- Apply Grafana Ingress (`null_resource.grafana_ingress`)
  - Manifest is templated from `manifests/grafana-ingress.yaml` using `var.root_domain`
  - Creates ingress for `grafana.${root_domain}`
  - Points to `kube-prometheus-stack-grafana` service in `monitoring` namespace (port 80)
  - Uses cert-manager ClusterIssuer `letsencrypt-prod` for automatic TLS certificate provisioning
  - Uses Traefik ingress class with `web` and `websecure` entrypoints
  - Depends on `helm_release.kube_prometheus_stack` and the ClusterIssuer.
- Create apps Namespace & ACR Secret (conditional, only if ACR configured)
  - `kubernetes_namespace.apps`: Creates the namespace specified by `acr_secret_namespace` (default: `apps`)
  - `kubernetes_secret.acr_image_pull_secret`: Creates the image pull secret for Azure Container Registry
  - Only created when `acr_registry_url`, `acr_username`, and `acr_password` are all provided
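
The "Wait for K3s" and "Copy Kubeconfig" steps described above can be approximated by hand, which is sometimes useful when debugging a stuck apply. This is a rough sketch, not the module's provisioner code: the helper names `retry` and `rewrite_server`, the attempt count, and the delay are all hypothetical, and the usage assumes the default admin user `g` with passwordless `sudo` on the VM.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Rewrite the loopback API address in a kubeconfig read from stdin so it
# points at the given public IP instead.
rewrite_server() {
  sed "s/127\.0\.0\.1:6443/$1:6443/"
}

# Example (not run automatically), with VM_IP set to the VM's public IP:
#   retry 30 10 ssh "g@$VM_IP" \
#     'systemctl is-active k3s && test -f /etc/rancher/k3s/k3s.yaml'
#   ssh "g@$VM_IP" 'sudo cat /etc/rancher/k3s/k3s.yaml' \
#     | rewrite_server "$VM_IP" > kubeconfig.yaml
```

The readiness check combines the same two conditions listed above, so it only succeeds once the service is active and the kubeconfig file exists.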

    Resource Group
    ├── Virtual Network
    │   └── Subnet
    ├── Public IP
    ├── NSG
    │
    └── Network Interface
        └── NSG Association
            └── VM
                └── Wait for K3s
                    └── Copy Kubeconfig
                        └── Wait for K8s API
                            ├── Install kube-prometheus-stack
                            │   └── Apply Grafana Ingress
                            ├── Install cert-manager
                            │   └── Create Cloudflare API Token Secret
                            │       └── Apply ClusterIssuer
                            │           └── Apply Certificate
                            │               └── Apply root domain Ingress
                            └── (If ACR configured) Create apps Namespace
                                └── Create ACR Image Pull Secret