POps-Rox/terraform-az-overlays-datafactory

Azure Data Factory Overlay

Changelog Notice MIT License TF Registry

This overlay Terraform module creates an Azure Data Factory and manages its related parameters for use in an SCCA-compliant network.

SCCA Compliance

This module can be deployed in an SCCA-compliant network. To make the deployment itself SCCA compliant, enable private endpoints and apply SCCA-compliant network rules.

For more information, please read the SCCA documentation.
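As a rough illustration of the network-related settings mentioned above, the sketch below combines the public_network_enabled, managed_virtual_network_enabled and enable_private_endpoint inputs listed under Inputs. The resource group, virtual network and subnet names and the required input values are placeholders assumed for this example, and these settings alone are not claimed to be sufficient for SCCA compliance.

```hcl
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x" # pin to a released version

  # Required inputs (placeholder values, assumed for illustration)
  location           = "eastus"
  environment        = "public"
  deploy_environment = "dev"
  org_name           = "example"
  workload_name      = "adf"

  existing_resource_group_name = "rg-example" # hypothetical name

  # Network-related inputs taken from the Inputs table
  public_network_enabled          = false
  managed_virtual_network_enabled = true
  enable_private_endpoint         = true
  existing_virtual_network_name   = "vnet-example" # hypothetical name
  existing_private_subnet_name    = "snet-example" # hypothetical name
}
```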

Contributing

Contributions to this Terraform module are welcome.

More details are available in the CONTRIBUTING.md file.

Using Azure Clouds

This module is built for both the Azure public and US Government clouds, selected through the environment variable (for example, public). When using the module with the Azure Government Cloud, set environment to usgovernment and set the azurerm provider's environment argument to the matching cloud so that the correct Azure Government endpoints are used. You must also set the location variable to a valid Azure Government region.

Example Usage for Azure Government Cloud:

provider "azurerm" {
  # The azurerm 3.x provider requires a features block
  features {}
  environment = "usgovernment"
}

module "overlays-datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "2.0.0"

  location    = "usgovvirginia"
  environment = "usgovernment"
  ...
}

Resources Used

Module Usage

# Azurerm Provider configuration
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x"
  
  # Per the Inputs table, create_data_factory_resource_group defaults to
  # false, so pass the name of an existing resource group below. The
  # module then creates its resources in that resource group's location.
  # Set create_data_factory_resource_group = true to have the module
  # create the resource group instead.
  existing_resource_group_name = azurerm_resource_group.datafactory_rg.name
  location                     = module.mod_azure_region_lookup.location_cli
  environment                  = var.environment
  deploy_environment           = var.deploy_environment
  org_name                     = var.org_name
  workload_name                = var.workload_name

}

Optional Features

Data Factory Overlay has optional features that can be enabled by setting parameters on the deployment.

Create resource group

The module can either create a resource group or deploy into an existing one. To have the module create the resource group, set create_data_factory_resource_group = true. To use an existing resource group, set existing_resource_group_name to its name and leave create_data_factory_resource_group = false (the default listed under Inputs).

When an existing resource group is used, the module creates all of its resources in that resource group's location.
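A minimal sketch of the other case, where the module creates the resource group itself. It uses the create_data_factory_resource_group and custom_resource_group_name inputs as named in the Inputs table; the resource group name and the required input values are placeholders assumed for this example.

```hcl
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x" # pin to a released version

  # Let the module create the resource group
  create_data_factory_resource_group = true

  # Optional: override the generated resource group name (hypothetical value)
  custom_resource_group_name = "rg-adf-example"

  # Required inputs (placeholder values, assumed for illustration)
  location           = "eastus"
  environment        = "public"
  deploy_environment = "dev"
  org_name           = "example"
  workload_name      = "adf"
}
```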

Azure Runtime

This module can create Azure integration runtimes for the Data Factory. To use this feature, set the azure_integration_runtime variable. It is a map of objects that defines the Azure Integration Runtime nodes that are required. The key of each object is the name of a new node, and an empty object ({}) accepts the defaults listed under Inputs.

# Azurerm Provider configuration
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x"

  # ---Left out for brevity---

  azure_integration_runtime = {
    # Compute-optimized node with explicit settings
    "az-ir-co-01" = {
      compute_type    = "ComputeOptimized"
      cleanup_enabled = true
      core_count      = 16
    },
    # Nodes that use the default settings
    "az-ir-gen-01" = {},
    "az-ir-gen-02" = {},
  }
}

Self Hosted Runtime

This module can create self-hosted integration runtimes for the Data Factory. To use this feature, set the selfhosted_integration_runtime variable. It is a map of objects that defines the Self Hosted Integration Runtime nodes that are required. The key of each object is the name of a new node, and an empty object ({}) accepts the defaults listed under Inputs.

# Azurerm Provider configuration
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x"

  # ---Left out for brevity---

  selfhosted_integration_runtime = {
    # Node with an explicit description
    "sh-ir-co-01" = {
      description = "Self Hosted Integration Runtime"
    },
    # Nodes that use the default settings
    "sh-ir-gen-01" = {},
    "sh-ir-gen-02" = {},
  }
}

Azure SSIS Runtime

This module can create Azure-SSIS integration runtimes for the Data Factory. To use this feature, set the azure_ssis_integration_runtime variable. It is a map of objects that defines the Azure SSIS Integration Runtime nodes that are required. The key of each object is the name of a new node, and an empty object ({}) accepts the defaults listed under Inputs.

# Azurerm Provider configuration
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x"

  # ---Left out for brevity---

  azure_ssis_integration_runtime = {
    # Node with explicit settings
    "az-ssis-ir-co-01" = {
      node_size                        = "Standard_D4_v3"
      number_of_nodes                  = 1
      max_parallel_executions_per_node = 1
      edition                          = "Standard"
      license_type                     = "LicenseIncluded"
    },
    # Nodes that use the default settings
    "az-ssis-ir-gen-01" = {},
    "az-ssis-ir-gen-02" = {},
  }
}

Private Endpoint

This module can be used with the Private Endpoint Module to create private endpoints for the Data Factory. To enable them, set the enable_private_endpoint variable to true and provide the existing_virtual_network_name and existing_private_subnet_name variables. This creates a private endpoint connection to the Data Factory. To reuse an existing private DNS zone, also set existing_private_dns_zone (the variable listed under Inputs). If it is not set, the module creates the private DNS zone for the Data Factory.

# Azurerm Provider configuration
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x"

  # The following variables are used to create a private endpoint connection
  enable_private_endpoint       = true
  existing_virtual_network_name = azurerm_virtual_network.datafactory_vnet.name
  existing_private_subnet_name  = azurerm_subnet.datafactory_subnet.name

  # Optional: reuse an existing private DNS zone (Azure public cloud zone name for Data Factory)
  existing_private_dns_zone = "privatelink.datafactory.azure.net"
}

Resource Locks

This module can be used with the Resource Lock Module to create resource locks for the Data Factory and its resource group. Locks are controlled by the enable_resource_locks and lock_level variables.
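A minimal sketch of enabling locks, using the enable_resource_locks and lock_level inputs from the Inputs table. "CanNotDelete" is the default listed there and "ReadOnly" is the other level accepted by azurerm_management_lock; the resource group name and the required input values are placeholders assumed for this example.

```hcl
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x" # pin to a released version

  # Required inputs (placeholder values, assumed for illustration)
  location           = "eastus"
  environment        = "public"
  deploy_environment = "dev"
  org_name           = "example"
  workload_name      = "adf"

  existing_resource_group_name = "rg-example" # hypothetical name

  # Create management locks on the Data Factory and its resource group
  enable_resource_locks = true
  lock_level            = "CanNotDelete" # or "ReadOnly"
}
```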

Recommended naming and tagging conventions

Apply tags to your Azure resources, resource groups, and subscriptions to logically organize them into a taxonomy. Each tag consists of a name and a value pair. For example, you can apply the name Environment and the value Production to all the resources in production. For recommendations on how to implement a tagging strategy, see Resource naming and tagging decision guide.

Important: Tag names are case-insensitive for operations. A tag with a given name is updated or retrieved regardless of its casing. However, the resource provider might keep the casing you provide for the tag name, and that casing appears in cost reports. Tag values are case-sensitive.
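Tags can be passed to this module through the add_tags input (a map of strings) listed under Inputs. The sketch below applies the Environment = Production pair from the paragraph above; the Owner tag, the resource group name and the required input values are assumptions made for illustration.

```hcl
provider "azurerm" {
  features {}
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x" # pin to a released version

  # Required inputs (placeholder values, assumed for illustration)
  location           = "eastus"
  environment        = "public"
  deploy_environment = "prod"
  org_name           = "example"
  workload_name      = "adf"

  existing_resource_group_name = "rg-example" # hypothetical name

  # Keep the module's default tags and add custom ones
  default_tags_enabled = true
  add_tags = {
    Environment = "Production"
    Owner       = "data-platform" # hypothetical tag
  }
}
```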

An effective naming convention assembles resource names by using important resource information as parts of a resource's name. For example, using these recommended naming conventions, a public IP resource for a production SharePoint workload is named like this: pip-sharepoint-prod-westus-001.
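The naming pattern in that example can be sketched in plain Terraform with the built-in format function. This is only an illustration of the convention, not how this module generates names (the module uses its own naming provider); the local names are chosen for this example.

```hcl
locals {
  # Name components, taken from the example in the text above
  resource_type = "pip"
  workload      = "sharepoint"
  environment   = "prod"
  region        = "westus"
  instance      = 1

  # %03d zero-pads the instance number to three digits
  resource_name = format("%s-%s-%s-%s-%03d", local.resource_type, local.workload, local.environment, local.region, local.instance)
}

output "resource_name" {
  # Evaluates to "pip-sharepoint-prod-westus-001"
  value = local.resource_name
}
```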

Requirements

| Name | Version |
|------|---------|
| terraform | >= 1.9 |
| popsrox-utils | ~> 1.0.4 |
| azurerm | ~> 3.116 |

Providers

| Name | Version |
|------|---------|
| popsrox-utils | ~> 1.0.4 |
| azurerm | ~> 3.116 |

Modules

| Name | Source | Version |
|------|--------|---------|
| mod_azure_region_lookup | POps-Rox/overlays-azregions-lookup/azurerm | ~> 1.0.0 |
| mod_scaffold_rg | POps-Rox/overlays-resource-group/azurerm | ~> 1.0.1 |

Resources

| Name | Type |
|------|------|
| azurerm_data_factory.main_data_factory | resource |
| azurerm_data_factory_integration_runtime_azure.integration_runtime | resource |
| azurerm_data_factory_integration_runtime_azure_ssis.integration_runtime | resource |
| azurerm_data_factory_integration_runtime_self_hosted.integration_runtime | resource |
| azurerm_management_lock.data_factory_level_lock | resource |
| azurerm_management_lock.resource_group_level_lock | resource |
| azurerm_private_dns_a_record.a_rec | resource |
| azurerm_private_dns_zone.dns_zone | resource |
| azurerm_private_dns_zone_virtual_network_link.vnet_link | resource |
| azurerm_private_endpoint.pep | resource |
| popsrox_resource_name.data_factory_name | data source |
| azurerm_client_config.current | data source |
| azurerm_private_endpoint_connection.pip | data source |
| azurerm_resource_group.rgrp | data source |
| azurerm_subnet.snet | data source |
| azurerm_virtual_network.vnet | data source |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| add_tags | Map of custom tags. | `map(string)` | `{}` | no |
| azure_devops_configuration | Azure DevOps configuration for data factory. See documentation at https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/data_factory#vsts_configuration | `map(string)` | `null` | no |
| azure_integration_runtime | Map object to define any Azure Integration Runtime nodes that are required. The key of each object is the name of a new node. Configuration parameters within the object allow customisation. Example: `azure_integration_runtime = { "az-ir-co-01" = { compute_type = "ComputeOptimized", cleanup_enabled = true, core_count = 16 }, "az-ir-gen-01" = {}, "az-ir-gen-02" = {} }` | `map(object({ description = optional(string, "Azure Integrated Runtime"), compute_type = optional(string, "General"), virtual_network_enabled = optional(string, true), core_count = optional(number, 8), time_to_live_min = optional(number, 0), cleanup_enabled = optional(bool, true) }))` | `{}` | no |
| azure_ssis_integration_runtime | Map object to define any Azure SSIS Integration Runtime nodes that are required. The key of each object is the name of a new node. Configuration parameters within the object allow customisation. Example: `azure_ssis_integration_runtime = { "az-ssis-ir-co-01" = { node_size = "Standard_D4_v3", number_of_nodes = 1, max_parallel_executions_per_node = 1, edition = "Standard", license_type = "LicenseIncluded" }, "az-ssis-ir-gen-01" = {}, "az-ssis-ir-gen-02" = {} }` | `map(object({ description = optional(string, "Azure SSIS Integration Runtime"), node_size = optional(string, "Standard_D4_v3"), number_of_nodes = optional(number, 1), max_parallel_executions_per_node = optional(number, 1), edition = optional(string, "Standard"), license_type = optional(string, "LicenseIncluded") }))` | `{}` | no |
| create_data_factory_resource_group | Create a resource group for the data factory. If set to false, the existing_resource_group_name variable must be set. Default is false. | `bool` | `false` | no |
| custom_data_factory_name | Custom name of the Data Factory, generated if not set. | `string` | `null` | no |
| custom_resource_group_name | Custom name of the resource group, generated if not set. | `string` | `null` | no |
| default_tags_enabled | Option to enable or disable default tags. | `bool` | `true` | no |
| deploy_environment | Name of the workload's environment. | `string` | n/a | yes |
| enable_private_endpoint | Manages a Private Endpoint for the Data Factory. Default is false. | `bool` | `false` | no |
| enable_resource_locks | (Optional) Enable resource locks. Default is false. If true, resource locks are created for the resource group and the Data Factory. | `bool` | `false` | no |
| environment | The Terraform backend environment, e.g. public or usgovernment. | `string` | n/a | yes |
| existing_private_dns_zone | Name of the existing private DNS zone. | `any` | `null` | no |
| existing_private_subnet_name | Name of the existing subnet for the private endpoint. | `any` | `null` | no |
| existing_resource_group_name | The name of the existing resource group to use. If not set, the name will be generated using the org_name, workload_name, deploy_environment and environment variables. | `string` | `null` | no |
| existing_virtual_network_name | Name of the virtual network for the private endpoint. | `any` | `null` | no |
| github_configuration | GitHub configuration for data factory. See documentation at https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/data_factory#github_configuration | `map(string)` | `null` | no |
| global_parameters | Global parameters for data factory. See documentation at https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/data_factory#global_parameter | `list(map(string))` | `[]` | no |
| identity_ids | Specifies a list of User Assigned Managed Identity IDs to be assigned to this Data Factory. | `list(string)` | `null` | no |
| identity_type | Specifies the type of Managed Service Identity that should be configured on this Data Factory. Possible values are `SystemAssigned`, `UserAssigned`, and `SystemAssigned, UserAssigned` (to enable both). | `string` | `"SystemAssigned"` | no |
| integration_runtime_custom_name | Name of the integration_runtime resource. | `string` | `null` | no |
| location | Azure region in which the instance will be hosted. | `string` | n/a | yes |
| lock_level | (Optional) If locks are enabled, specifies the level to be used for this lock. | `string` | `"CanNotDelete"` | no |
| managed_virtual_network_enabled | True to enable the managed virtual network. | `bool` | `true` | no |
| name_prefix | Optional prefix for the generated name. | `string` | `""` | no |
| name_suffix | Optional suffix for the generated name. | `string` | `""` | no |
| org_name | Name of the organization. | `string` | n/a | yes |
| public_network_enabled | True to make the data factory visible to the public network. | `bool` | `false` | no |
| selfhosted_integration_runtime | Map object to define any Self Hosted Integration Runtime nodes that are required. The key of each object is the name of a new node. Configuration parameters within the object allow customisation. Example: `selfhosted_integration_runtime = { "sh-ir-co-01" = { description = "Self Hosted Integration Runtime" }, "sh-ir-gen-01" = {}, "sh-ir-gen-02" = {} }` | `map(object({ description = optional(string, "Self Hosted Integration Runtime") }))` | `{}` | no |
| use_location_short_name | Use the short location name for resource naming (e.g. eastus -> eus). Default is true. If set to false, the full CLI location name is used. If custom naming is set, this variable is ignored. | `bool` | `true` | no |
| use_naming | Use the Azure NoOps naming provider to generate the default resource name. custom_data_factory_name overrides this if set. The legacy default name is used if this is set to false. | `bool` | `true` | no |
| workload_name | Name of the workload. | `string` | n/a | yes |
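As a hedged illustration of two of the less obvious inputs, the sketch below sets a user-assigned identity and one global parameter. The keys name, type and value mirror the global_parameter block of the azurerm_data_factory resource; that the module reads exactly these keys is an assumption. The variable user_assigned_identity_id, the parameter itself, the resource group name and the required input values are hypothetical.

```hcl
provider "azurerm" {
  features {}
}

# Hypothetical variable holding the ID of an existing user-assigned identity
variable "user_assigned_identity_id" {
  type        = string
  description = "Resource ID of an existing user-assigned managed identity."
}

module "mod_datafactory" {
  source  = "POps-Rox/tf-az-overlays-datafactory/azurerm"
  version = "x.x.x" # pin to a released version

  # Required inputs (placeholder values, assumed for illustration)
  location           = "eastus"
  environment        = "public"
  deploy_environment = "dev"
  org_name           = "example"
  workload_name      = "adf"

  existing_resource_group_name = "rg-example" # hypothetical name

  # Managed identity configuration
  identity_type = "UserAssigned"
  identity_ids  = [var.user_assigned_identity_id]

  # list(map(string)); key names assumed to follow the azurerm global_parameter block
  global_parameters = [
    {
      name  = "environment"
      type  = "String"
      value = "dev"
    }
  ]
}
```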

Outputs

| Name | Description |
|------|-------------|
| data_factory_id | Data factory id |
| data_factory_location | Data factory location |
| data_factory_managed_identity | Type of managed identity |
| data_factory_name | Data factory name |
| data_factory_resource_group_name | Data factory resource group name |
| global_paramaters | A map showing any created Global Parameters. |
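These outputs can be referenced from the calling configuration. The sketch below assumes a module block named mod_datafactory, as in the Module Usage section, and re-exports a few of the values. The output names on the left are chosen for this example.

```hcl
# Re-export selected module outputs from the root configuration
output "adf_id" {
  description = "Resource ID of the Data Factory."
  value       = module.mod_datafactory.data_factory_id
}

output "adf_name" {
  description = "Name of the Data Factory."
  value       = module.mod_datafactory.data_factory_name
}

output "adf_resource_group_name" {
  description = "Resource group that contains the Data Factory."
  value       = module.mod_datafactory.data_factory_resource_group_name
}
```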