morph-eos/docker2azure4student

Terraform blueprint for the student-friendly Azure stack

This repository contains modular Terraform that stands up a small Azure footprint tailored for Azure for Students subscriptions: one Linux VM that runs your containerized workload, a managed PostgreSQL Flexible Server, an Azure Key Vault for application secrets, Application Insights / Log Analytics for observability, and automation to keep costs predictable. The accompanying GitHub Actions pipeline deploys it without stored secrets (Azure login via OpenID Connect, no long-lived credentials), and the VM reads its runtime configuration from Key Vault using its own managed identity.

Architecture overview

Everything lives in a single resource group whose name is derived from environment_name (normalized and truncated to 45 characters). The Terraform is split into reusable modules under modules/:

| Module | Resources |
| --- | --- |
| network | /16 virtual network with one VM subnet, NSG (HTTP/HTTPS open, SSH limited to allowed_admin_cidrs), a static Standard-SKU public IP, and the VM NIC. |
| compute | Ubuntu 22.04 LTS VM with a 64 GB Premium SSD OS disk, SSH-key auth only, and a system-assigned managed identity. |
| database | PostgreSQL Flexible Server (Burstable B1ms, 32 GB, auto-grow off by default to stay in the free tier) and the default postgres database. The endpoint is public but firewalled to the VM's static public IP only. |
| storage | Optional Azure Storage Account for blobs (blob_storage_enabled), Standard LRS, TLS 1.2 only, public access disabled, with blob versioning and 7-day soft delete for recoverability. |
| automation | Azure Automation account + runbooks, created only when at least one automation feature is enabled (VM start/stop schedules, ad-hoc snapshots, snapshot cleanup, on-demand PostgreSQL backups). |
| keyvault | Azure Key Vault holding the application secrets (see Secret management below), with access policies for the pipeline (read/write) and the VM identity (read). |
| monitoring | Log Analytics workspace (with a daily ingestion cap to keep costs modest), workspace-based Application Insights, diagnostic settings that route PostgreSQL and Key Vault logs/metrics to the workspace, and a free observability workbook (requests / exceptions / traces over KQL). |

The root module wires the modules together, owns the resource group, stores the Application Insights connection string in Key Vault, and exposes connection details (SSH command, VM IP, database FQDN/connection string, storage account name, Key Vault name) as outputs.
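
As an illustration of the resource group naming described above, here is a minimal sketch. The exact normalization rule is not spelled out in this README, so the slug logic (lowercase, non-alphanumerics collapsed to hyphens) and the function name `resource_group_name` are assumptions; only the 45-character limit comes from the text.

```python
import re

MAX_LEN = 45  # truncation length stated in the README


def resource_group_name(environment_name: str) -> str:
    """Hypothetical normalization: lowercase, collapse other characters to '-', truncate."""
    slug = re.sub(r"[^a-z0-9]+", "-", environment_name.lower()).strip("-")
    # Strip again after truncating so the name never ends in a hyphen.
    return slug[:MAX_LEN].rstrip("-")


print(resource_group_name("My Student App!"))  # my-student-app
```

The real locals.tf may add a prefix or suffix; treat this only as a way to reason about why long or oddly punctuated environment names still yield a valid resource group name.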

Network security

  • The VM exposes only HTTP/HTTPS publicly; SSH is restricted to allowed_admin_cidrs and the pipeline opens a temporary, run-scoped SSH rule that is always removed afterwards.
  • PostgreSQL keeps a public endpoint but is firewalled to a single IP — the VM's static public IP. The default vm_public_ip_static = true guarantees that IP is stable across VM stop/start, so the allow-list rule stays valid. (A full private endpoint / VNet-integrated server is intentionally not used: on a Flexible Server the networking model is fixed at creation and switching it is a destructive, data-migrating operation.)
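
The two allow-list rules above can be reasoned about with a small sketch. It only models the matching logic (a CIDR list for SSH, a single address for PostgreSQL); the function names and the sample addresses, taken from the documentation-only IP ranges, are illustrative assumptions and not part of the Terraform code.

```python
import ipaddress


def ssh_allowed(source_ip: str, allowed_admin_cidrs: list[str]) -> bool:
    """True when source_ip falls inside any CIDR from allowed_admin_cidrs."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False)
        for cidr in allowed_admin_cidrs
    )


def postgres_allowed(source_ip: str, vm_public_ip: str) -> bool:
    """The database firewall admits exactly one address: the VM's static public IP."""
    return ipaddress.ip_address(source_ip) == ipaddress.ip_address(vm_public_ip)


admin_cidrs = ["203.0.113.0/24"]  # example value for allowed_admin_cidrs
print(ssh_allowed("203.0.113.42", admin_cidrs))           # True
print(postgres_allowed("198.51.100.7", "198.51.100.7"))   # True
```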

Authentication (secretless / OIDC)

The pipeline authenticates to Azure through workload identity federation (OpenID Connect) — there is no Service Principal secret stored anywhere:

  • azure/login exchanges a short-lived GitHub OIDC token for an Azure access token, using the AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_SUBSCRIPTION_ID repository variables (these are identifiers, not secrets).
  • The deployment job runs in the production GitHub environment, so the OIDC token's subject matches a federated credential registered on the Azure app registration.
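
To make the subject matching concrete, the sketch below builds the fields a federated credential would need for a job that runs in a GitHub environment. The issuer, audience, and `repo:<owner>/<repo>:environment:<name>` subject format are the standard GitHub-to-Azure values; the credential name and the `example-org/example-repo` repository are placeholders, not values from this project.

```python
def federated_credential(owner: str, repo: str, environment: str = "production") -> dict:
    """Fields of a federated credential for a job bound to a GitHub environment."""
    return {
        "name": f"github-{environment}",  # hypothetical display name
        "issuer": "https://token.actions.githubusercontent.com",
        "subject": f"repo:{owner}/{repo}:environment:{environment}",
        "audiences": ["api://AzureADTokenExchange"],
    }


cred = federated_credential("example-org", "example-repo")
print(cred["subject"])  # repo:example-org/example-repo:environment:production
```

If the job ran outside the production environment, the token's subject would differ and the token exchange would be rejected.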

Secret management (Key Vault)

Application secrets live in Azure Key Vault rather than in the application repository:

  • The pipeline assembles the full application environment (static base + the live database connection and, when enabled, the storage account credentials) and publishes it to Key Vault as the app-env secret. The static base is seeded once from APP_ENV_VARS_B64 and thereafter stored as app-env-base.
  • The database connection string and the storage account name/key are also stored as individual Key Vault secrets.
  • At deploy time the VM fetches app-env from Key Vault using its managed identity (via the instance metadata service) and writes app.env; the container then starts with --env-file. No application secret is copied over SSH.
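
The managed-identity fetch in the last bullet might look like the following sketch. It uses the Azure instance metadata service token endpoint and the Key Vault REST API directly; the pipeline's actual script is not shown in this README, so the function names, the API versions chosen, and the use of Python rather than a shell one-liner are all assumptions.

```python
import json
import urllib.parse
import urllib.request

IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def imds_token_request(resource: str = "https://vault.azure.net") -> urllib.request.Request:
    """Request a managed-identity token for Key Vault from the metadata service."""
    query = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    return urllib.request.Request(f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"})


def secret_request(vault_name: str, secret_name: str, token: str) -> urllib.request.Request:
    """Request the latest version of a secret from the vault's data plane."""
    url = f"https://{vault_name}.vault.azure.net/secrets/{secret_name}?api-version=7.4"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_app_env(vault_name: str, secret_name: str = "app-env") -> str:
    """Only works on an Azure VM whose identity has read access to the vault."""
    with urllib.request.urlopen(imds_token_request(), timeout=10) as resp:
        token = json.load(resp)["access_token"]
    with urllib.request.urlopen(secret_request(vault_name, secret_name, token), timeout=10) as resp:
        return json.load(resp)["value"]
```

On the VM, the string returned by `fetch_app_env("<key-vault-name>")` would be written to app.env before the container starts with --env-file.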

Observability (Application Insights / Log Analytics)

Telemetry and resource logs land in a single Log Analytics workspace, with cost kept predictable by design:

  • Application Insights is workspace-based; its connection string is stored in Key Vault (appinsights-connection-string) and injected into the application environment, so the app reports requests, dependencies, exceptions, and traces.
  • Diagnostic settings forward PostgreSQL and Key Vault logs/metrics into the same workspace.
  • A free Application Insights workbook (<prefix> observability) charts requests, top exceptions, and recent traces with ready-made KQL queries.
  • A daily ingestion cap keeps the bill modest: log_max_total_gb (default 3) is enforced as daily_quota_gb = log_max_total_gb / retention_days. Set it to -1 to disable the cap. (Azure's minimum workspace retention is 30 days, so the cap limits ingestion rate rather than deleting old data row by row.)
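
The cap formula above works out as in this short sketch. It assumes that -1 is set on log_max_total_gb and passed through unchanged as the "no cap" value, and that retention_days defaults to the 30-day minimum mentioned in the text; the function itself is illustrative, not the module's code.

```python
def daily_quota_gb(log_max_total_gb: float, retention_days: int = 30) -> float:
    """Spread the total budget evenly over the retention window."""
    if log_max_total_gb == -1:
        return -1  # assumed pass-through: -1 disables the cap
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return log_max_total_gb / retention_days


# With the default budget of 3 GB and 30 days of retention,
# the workspace may ingest at most 0.1 GB per day.
print(daily_quota_gb(3, 30))
```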

Remote state

Terraform state is stored remotely in Azure Storage so that it survives ephemeral CI runners, is shared across machines, and is protected against concurrent writes (the azurerm backend takes a blob lease, which stops two terraform apply runs from corrupting the state at the same time).

The backend is bootstrapped automatically by the pipeline — no manual setup, no extra file or secret:

  • versions.tf declares an empty backend "azurerm" {}; the concrete settings are injected at init time via -backend-config.
  • On every run the workflow ensures the backing resources exist (create-if-missing, idempotent):
    • Resource group tfstate-rg
    • Storage account named <environment_name><suffix>, where the suffix is derived deterministically from the subscription ID (globally unique yet stable across runs)
    • Container tfstate, state stored under the key docker2azure.tfstate
  • The state storage account has blob versioning and soft delete (7 days) enabled, so a bad apply can be recovered.
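
One way to get a storage account name that is both globally unique and stable across runs is to hash the subscription ID, as sketched below. The README does not say how the suffix is actually computed, so the SHA-256 choice, the 8-character suffix length, and the function names are assumptions; the 24-character lowercase alphanumeric limit is Azure's storage account naming rule, and the backend values are the ones listed above.

```python
import hashlib
import re


def state_account_name(environment_name: str, subscription_id: str, suffix_len: int = 8) -> str:
    """Hypothetical derivation: <environment_name><hash suffix>, at most 24 characters."""
    base = re.sub(r"[^a-z0-9]", "", environment_name.lower())
    suffix = hashlib.sha256(subscription_id.encode()).hexdigest()[:suffix_len]
    return base[: 24 - suffix_len] + suffix


def backend_config_args(account_name: str) -> list[str]:
    """The -backend-config flags passed to terraform init."""
    settings = {
        "resource_group_name": "tfstate-rg",
        "storage_account_name": account_name,
        "container_name": "tfstate",
        "key": "docker2azure.tfstate",
    }
    return [f"-backend-config={key}={value}" for key, value in settings.items()]


name = state_account_name("student-app", "00000000-0000-0000-0000-000000000000")
print(" ".join(["terraform", "init", *backend_config_args(name)]))
```

Because the suffix depends only on the subscription ID, every run resolves to the same account, which is what makes the create-if-missing step idempotent.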

To work against the same remote state locally, initialise with the matching backend settings:

terraform init \
  -backend-config=resource_group_name=tfstate-rg \
  -backend-config=storage_account_name=<the-account-name> \
  -backend-config=container_name=tfstate \
  -backend-config=key=docker2azure.tfstate

Automation toggles

| Feature | Variables | What it does |
| --- | --- | --- |
| VM daily schedule | vm_schedule_enabled, vm_schedule_start_time, vm_schedule_stop_time, vm_schedule_timezone | Automation runbooks + schedules that start/stop the VM daily to save credits. |
| Manual VM snapshot | vm_snapshot_runbook_enabled | Deploys the *-snapshot runbook for on-demand OS-disk snapshots. |
| Snapshot cleanup | vm_snapshot_cleanup_enabled, vm_snapshot_retention_days, vm_snapshot_cleanup_time, vm_snapshot_cleanup_timezone | Scheduled runbook that deletes snapshots older than the retention window. |
| PostgreSQL on-demand backup | db_backup_enabled, db_backup_time, db_backup_timezone | Runbook + schedule that calls the Flexible Server REST API for an extra daily backup. |

Set the boolean flags to false when you do not need a capability; Terraform skips the related Automation modules, runbooks, schedules, and job bindings.
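
The toggle behaviour can be summarized in a few lines. The sketch below models two rules from this section: the Automation account exists only when at least one flag is on, and snapshot cleanup removes snapshots older than the retention window. The class and function names are illustrative, and the real runbooks are not written in this form.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AutomationToggles:
    vm_schedule_enabled: bool = False
    vm_snapshot_runbook_enabled: bool = False
    vm_snapshot_cleanup_enabled: bool = False
    db_backup_enabled: bool = False

    def automation_account_needed(self) -> bool:
        """The Automation account is created only if some feature is enabled."""
        return any((
            self.vm_schedule_enabled,
            self.vm_snapshot_runbook_enabled,
            self.vm_snapshot_cleanup_enabled,
            self.db_backup_enabled,
        ))


def snapshot_expired(created_at: datetime, retention_days: int, now: datetime) -> bool:
    """True when a snapshot is older than vm_snapshot_retention_days."""
    return created_at < now - timedelta(days=retention_days)


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
old = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(AutomationToggles().automation_account_needed())  # False
print(snapshot_expired(old, 7, now))                    # True
```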

Prerequisites

You do not need Terraform or the Azure CLI installed to use this — the pipeline provisions the remote state, the infrastructure, and the application end to end. The only requirements are:

  • An Azure subscription with the deployment identity configured (OIDC federated credentials) and the repository's deployment variables/secrets set.
  • An SSH public key (ed25519 or RSA), which becomes the only authentication method for the VM.

Everything else (state bootstrap, resource creation, secret distribution, container deploy) happens automatically on each run.

Running it locally (optional)

The pipeline already does all of this; you only need the steps below if you want to drive Terraform yourself. They require Terraform >= 1.5 and an az login session.

# 1) provide values
cp terraform.tfvars.example terraform.tfvars   # then edit: environment_name, location,
                                               # admin_ssh_public_key, db_admin_password, ...

# 2) initialise against the shared remote state (see "Remote state" above) and apply
terraform init \
  -backend-config=resource_group_name=tfstate-rg \
  -backend-config=storage_account_name=<the-account-name> \
  -backend-config=container_name=tfstate \
  -backend-config=key=docker2azure.tfstate
terraform plan
terraform apply

Outputs and what to do with them

  • resource_group_name – Scope Azure CLI commands after deployment.
  • vm_public_ip / ssh_connection_string – Connect to the VM.
  • database_fqdn / database_connection_string – Configure your application. The connection string uses TLS (sslmode=require).
  • storage_account_name – Available only when blob_storage_enabled = true.
  • key_vault_name – The Key Vault that holds the application secrets (including the Application Insights connection string).
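
For orientation, a TLS-enforcing connection string could be assembled from the outputs as sketched here. The exact format of database_connection_string is not shown in this README, so the URI layout, the default port, and the sample host and credentials are assumptions; only the postgres database name and sslmode=require come from the text.

```python
from urllib.parse import quote


def connection_string(fqdn: str, user: str, password: str,
                      database: str = "postgres", port: int = 5432) -> str:
    """Build a PostgreSQL URI, percent-encoding the credentials."""
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"postgresql://{credentials}@{fqdn}:{port}/{database}?sslmode=require"


print(connection_string("db.example.postgres.database.azure.com", "appadmin", "p@ss/word"))
# postgresql://appadmin:p%40ss%2Fword@db.example.postgres.database.azure.com:5432/postgres?sslmode=require
```

Percent-encoding matters because a password containing characters such as @ or / would otherwise break URI parsing in the client.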

Operations

Almost everything is automatic or configuration-driven — there are no manual post-deploy steps:

  • VM scheduling, snapshot cleanup, and PostgreSQL backups run on their own once enabled via the automation toggles above.
  • Changing the infrastructure (firewall CIDRs, VM size, automation toggles) means editing the Terraform variables; the next deploy reconciles everything, including the matching database firewall rule.
  • The only operator-initiated actions are taking an on-demand VM snapshot via the *-snapshot runbook (when enabled) and triggering a rollback of the container or the infrastructure via the rollback workflow (see Rollback below).

GitHub Actions integration

The sync/... branches used by deployment automation are temporary delivery branches, not feature branches. Before any important infrastructure change, update the affected README or .md files (Terraform variables, deployment flow, required secrets, operational runbooks).

Pull request validation

Every pull request targeting main runs .github/workflows/pr-validation.yml:

  1. terraform fmt -check and terraform validate always run (no cloud credentials required).
  2. When Azure access is configured, it also runs terraform plan against the live remote state.
  3. The validation output (and the plan, when produced) is published as a build artifact.
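
The branching in steps 1 and 2 can be sketched as follows. It assumes that "Azure access is configured" means the three identifier variables from the Authentication section are present; the workflow's real condition is not shown here, and `validation_steps` is a hypothetical helper.

```python
import os

AZURE_VARS = ("AZURE_CLIENT_ID", "AZURE_TENANT_ID", "AZURE_SUBSCRIPTION_ID")


def validation_steps(env=None) -> list[list[str]]:
    """Commands a PR run would execute, in order."""
    env = os.environ if env is None else env
    # Formatting and validation never need cloud credentials.
    steps = [["terraform", "fmt", "-check"], ["terraform", "validate"]]
    # The plan runs only when all Azure identifiers are available (assumed condition).
    if all(env.get(name) for name in AZURE_VARS):
        steps.append(["terraform", "plan"])
    return steps


print(len(validation_steps({})))  # 2
```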

Security scanning

.github/workflows/security-scan.yml runs Trivy on every push and pull request:

  • IaC misconfiguration scan of the Terraform (fails the job on CRITICAL/HIGH findings; an accepted baseline is documented in .trivyignore).
  • Secret scan of the working tree.
  • Results are also uploaded as SARIF to the GitHub Security tab where Advanced Security is available.

Continuous deployment

.github/workflows/deploy-from-sync.yml runs on a short-lived sync/... branch that carries a sync-bundle/ directory with the application artifacts and Dockerfile. The job:

  1. Logs in to Azure via OIDC and ensures the remote state backend exists.
  2. On a brand-new environment, adopts any pre-existing Azure resources into state; on a populated state this step is skipped.
  3. Runs terraform plan -out=tfplan, publishes the plan as an artifact, and applies exactly that plan.
  4. Builds and pushes the container image, publishes the assembled app-env to Key Vault, and has the VM load it via its managed identity.
  5. Redeploys the container over SSH and always deletes the temporary NSG rule and the sync/... branch when it finishes.
  6. Records two pointers in Key Vault — app-image-current and app-image-previous — so a rollback always knows the last known-good image (see Rollback below).
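
Step 6 amounts to a small pointer rotation, sketched below with a plain dictionary standing in for Key Vault. The secret names come from this README; the function name, the sample tags, and the rule that redeploying the same tag leaves app-image-previous untouched are assumptions.

```python
def rotate_image_pointers(secrets: dict, new_image: str) -> dict:
    """Return updated pointers after a successful deploy of new_image."""
    updated = dict(secrets)
    current = secrets.get("app-image-current")
    # Keep the old pointer if the same tag is deployed again (assumed behaviour).
    if current and current != new_image:
        updated["app-image-previous"] = current
    updated["app-image-current"] = new_image
    return updated


vault = {"app-image-current": "app:abc123"}
print(rotate_image_pointers(vault, "app:def456"))
# {'app-image-current': 'app:def456', 'app-image-previous': 'app:abc123'}
```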

Rollback

.github/workflows/rollback.yml is a manual (workflow_dispatch) workflow for incident response. It has a target (either container or terraform) and an apply switch so you can preview first and apply only after review:

  • Container — redeploys a previous, immutable image tag on the VM. The container registry already stores every image; by default the rollback uses app-image-previous from Key Vault (the last known-good tag), or you can pass an explicit image_tag. The application configuration on the VM is left untouched — only the image is swapped.
  • Terraform — re-applies the infrastructure code from a known-good git_ref (SHA, tag or branch) against the live remote state. A saved plan is deliberately not used for rollback because it goes stale as soon as the state changes; the canonical, reproducible description of the infrastructure is the git commit, and the state account's versioning + soft delete are the safety net.

Refer to AUTOMATION.md for the full automation playbook, including required secrets/variables and how the application and infrastructure repositories coordinate.

Repository layout

.
├── main.tf                # Root module: resource group + module wiring + Key Vault secrets
├── variables.tf           # Input variables with defaults and docs
├── locals.tf              # Naming helpers
├── outputs.tf             # Connection details for operators and CI
├── moved.tf               # State `moved` blocks mapping resources to their module addresses
├── providers.tf / versions.tf  # Providers + remote azurerm backend declaration
├── modules/               # network, compute, database, automation, storage, keyvault, monitoring
├── .github/workflows/     # pr-validation.yml, security-scan.yml, deploy-from-sync.yml, rollback.yml
├── scripts/tfvars_meta.py # Utility used by CI to read tfvars metadata
├── .trivyignore           # Accepted security-scan baseline
├── terraform.tfvars.example
├── README.md
└── AUTOMATION.md
