Deploy an Azure SQL Managed Instance on GKE using a Terraform plan

The following scanario will guide you on how to deploy a “Ready to Go” environment so you can deploy Azure Arc-enabled data services on a Google Kubernetes Engine (GKE) cluster using Terraform.

By the end of this guide, you will have a GKE cluster deployed with an Azure Arc Data Controller (in “Directly Connected” mode), Azure SQL MI with a sample database and a Microsoft Windows Server 2019 (Datacenter) GKE compute instance VM installed and pre-configured with all the required tools needed to work with Azure Arc data services:

Deployed Architecture

Note: Currently, Azure Arc-enabled data services with PostgreSQL Hyperscale is in public preview.

Deployment Process Overview

  • Create a Google Cloud Platform (GCP) project, IAM Role & Service Account
  • Download GCP credentials file
  • Clone the Azure Arc Jumpstart repository
  • Edit TF_VAR variables values
  • Export TFVAR values
  • terraform init
  • terraform apply
  • User remotes into client Windows VM, which automatically kicks off the DataServicesLogonScript PowerShell script that deploys and configures Azure Arc-enabled data services on the GKE cluster.
  • Open Azure Data Studio and connect to SQL MI instance and sample database
  • kubectl delete namespace arc
  • terraform destroy


  • Clone the Azure Arc Jumpstart repository

    git clone
  • Install or update Azure CLI to version 2.20.0 and above. Use the below command to check your current installed version.

    az --version
  • Google Cloud account with billing enabled - Create a free trial account. To create Windows Server virtual machines, you must upgraded your account to enable billing. Click Billing from the menu and then select Upgrade in the lower right.

    Screenshot showing how to enable billing on GCP account

    Screenshot showing how to enable billing on GCP account

    Screenshot showing how to enable billing on GCP account

    Disclaimer - To prevent unexpected charges, please follow the “Delete the deployment” section at the end of this README

  • Install Terraform 1.0 or higher

  • Create Azure service principal (SP)

    To be able to complete the scenario and its related automation, Azure service principal assigned with the “Contributor” role is required. To create it, login to your Azure account run the below command (this can also be done in Azure CloudShell)

    az login
    az ad sp create-for-rbac -n "<Unique SP Name>" --role contributor

    For example:

    az ad sp create-for-rbac -n "http://AzureArcData" --role contributor

    Output should look like this:

    "displayName": "AzureArcData",
    "name": "http://AzureArcData",

    Note: It is optional, but highly recommended, to scope the SP to a specific Azure subscription.

  • Create a new GCP Project, IAM Role & Service Account. In order to deploy resources in GCP, we will create a new GCP Project as well as a service account to allow Terraform to authenticate against GCP APIs and run the plan to deploy resources.

  • Browse to and login with your Google Cloud account. Once logged in, create a new project named “Azure Arc Demo”. After creating it, be sure to copy down the project id as it is usually different then the project name.

    GCP new project

    GCP new project

    GCP new project

  • Enable the Compute Engine API for the project, create a project Owner service account credentials and download the private key JSON file and copy the file to the directory where Terraform files are located. Change the JSON file name (for example account.json). The Terraform plan will be using the credentials stored in this file to authenticate against your GCP project.

    Enable Compute Engine API

    Enable Compute Engine API

    Add credentials

    Add credentials

    Add credentials

    Add credentials

    Add credentials

    Create private key

    Create private key

    Create private key

    Create private key


  • Enable the Kubernetes Engine API for the project

    Enable the Kubernetes Engine API

    Enable the Kubernetes Engine API

Automation Flow

Read the below explanation to get familiar with the automation and deployment flow.

  • User edits and exports TF_VAR Terraform runtime environment variables (1-time edit). The variable values are used throughout the deployment.

  • User deploys the Terraform plan which will deploy a GKE cluster and compute instance VM as well as an Azure resource group. The Azure resource group is required to host the Azure Arc services such as the Azure Arc-enabled Kubernetes cluster, the custom location, the Azure Arc data controller, and the SQL MI database service.

  • In addition, the plan will copy the local_ssd_sc.yaml file which will be used to create a Kubernetes Storage Class backed by SSD disks. These disks will be used by Azure Arc Data Controller to create persistent volume claims (PVC).

    Note: Depending on the GCP region, make sure you do not have any SSD quota limit in the region, otherwise, the Azure Arc Data Controller kubernetes resources will fail to deploy.

  • As part of the Windows Server 2019 VM deployment, there are 4 script executions:

    1. azure_arc.ps1 script will be created automatically as part of the Terraform plan runtime and is responsible for injecting the TF_VAR variable values as environment variables on the Windows instance which will then be used in both the ClientTools and the LogonScript scripts.

    2. password_reset.ps1 script will be created automatically as part of the Terraform plan runtime and is responsible for creating the Windows username and password.

    3. Bootstrap.ps1 script will run during Terraform plan runtime and will:

      • Create the Bootstrap.log file
      • Install the required tools – az cli, az cli Powershell module, kubernetes-cli, Visual C++ Redistributable (Chocolaty packages)
      • Download Azure Data Studio & Azure Data CLI
      • Disable Windows Server Manager, remove Internet Explorer, disable Windows Firewall
      • Download the DataServicesLogonScript.ps1 PowerShell script
      • Create the Windows schedule task to run the DataServicesLogonScript at first login
    4. DataServicesLogonScript.ps1 script will run on first login to Windows and will:

      • Create the DataServicesLogonScript.log file
      • Install the Azure Data Studio Azure Data CLI, Azure Arc and PostgreSQL extensions
      • Create the Azure Data Studio desktop shortcut
      • Apply the local_ssd_sc.yaml file on the GKE cluster
      • Use Azure CLI to connect the GKE cluster to Azure as an Azure Arc-enabled Kubernetes cluster
      • Create a custom location for use with the Azure Arc-enabled Kubernetes cluster
      • Open another Powershell session which will execute a command to watch the deployed Azure Arc Data Controller Kubernetes pods
      • Deploy an ARM template that will deploy the Azure Arc data controller on the GKE cluster
      • Execute a secondary DeploySQLMI.ps1 script which will configure the SQL MI instance, download and install the sample Adventureworks database, and configure Azure Data Studio to connect to the SQL MI database instance
      • Unregister the logon script Windows scheduler task so it will not run after first login


As mentioned, the Terraform plan and automation scripts will deploy a GKE cluster, the Azure Arc Data Controller on that cluster, a SQL Managed Instance with sample database, and a Windows Server 2019 Client GCP compute instance.

  • Before running the Terraform plan, edit the below TF_VAR values and export it (simply copy/paste it after you finished edit these). An example TF_VAR shell script file is located here

    Terraform vars export

    • export TF_VAR_gcp_project_id='Your GCP Project ID (Created in the prerequisites section)’

    • export TF_VAR_gcp_credentials_filename='Your GCP Credentials JSON filename (Created in the prerequisites section)’

    • export TF_VAR_gcp_region='GCP region where resource will be created’

    • export TF_VAR_gcp_zone='GCP zone where resource will be created’

    • export TF_VAR_gke_cluster_name='GKE cluster name’

    • export TF_VAR_admin_username='GKE cluster administrator username’

    • export TF_VAR_admin_password='GKE cluster administrator password’

    • export TF_VAR_gke_cluster_node_count='GKE cluster number of worker nodes’

    • export TF_VAR_windows_username='Windows Server Client compute instance VM administrator username’ Do not use ‘Administrator’

    • export TF_VAR_windows_password='Windows Server Client compute instance VM administrator password’ (The password must be at least 8 characters long and contain characters from three of the following four sets: uppercase letters, lowercase letters, numbers, and symbols as well as not containing the user’s account name or parts of the user’s full name that exceed two consecutive characters)

    • export TF_VAR_SPN_CLIENT_ID='Your Azure service principal name’

    • export TF_VAR_SPN_CLIENT_SECRET='Your Azure service principal password’

    • export TF_VAR_SPN_TENANT_ID='Your Azure tenant ID’

    • export TF_VAR_SPN_AUTHORITY= Do not change

    • export TF_VAR_AZDATA_USERNAME='Azure Arc Data Controller admin username’

    • export TF_VAR_AZDATA_PASSWORD='Azure Arc Data Controller admin password’ (The password must be at least 8 characters long and contain characters from three of the following four sets: uppercase letters, lowercase letters, numbers, and symbols)

    • export TF_VAR_ARC_DC_NAME='Azure Arc Data Controller name’ (The name must consist of lowercase alphanumeric characters or ‘-', and must start and end with a alphanumeric character. This name will be used for k8s namespace as well)

    • export TF_VAR_ARC_DC_SUBSCRIPTION='Azure Arc Data Controller Azure subscription ID’

    • export TF_VAR_ARC_DC_RG='Azure resource group where all future Azure Arc resources will be deployed’

    • export TF_VAR_ARC_DC_REGION='Azure location where the Azure Arc Data Controller resource will be created in Azure’ (Currently, supported regions supported are eastus, eastus2, centralus, westus2, westeurope, southeastasia)

    • export TF_VAR_deploy_SQLMI='Boolean that sets whether or not to deploy SQL Managed Instance, for this scenario we leave it set to true’

    • export TF_VAR_deploy_PostgreSQL='Boolean that sets whether or not to deploy PostgreSQL Hyperscale, for this scenario we leave it set to false’

    • export TF_VAR_templateBaseUrl='GitHub URL to the deployment template - filled in by default to point to Microsoft/Azure Arc repository, but you can point this to your forked repo as well - e.g.’

      Note: If you are running in a PowerShell environment, to set the Terraform environment variables see example below

  • Navigate to the folder that has Terraform binaries.

    cd azure_arc_data_jumpstart/gke/terraform
  • Run the terraform init command which is used to initialize a working directory containing Terraform configuration files and load the required Terraform providers.

    terraform init

  • (Optional but recommended) Run the terraform plan command to make sure everything is configured properly.

    terraform plan

  • Run the terraform apply --auto-approve command and wait for the plan to finish. Runtime for deploying all the GCP resources for this plan is ~20-30min.

  • Once completed, you can review the GKE cluster and the worker nodes resources as well as the GCP compute instance VM created.

    terraform apply completed

    GKE cluster

    GKE cluster

    GCP VM instances

    GCP VM instances

  • In the Azure Portal, a new empty Azure resource group was created which will be used for Azure Arc Data Controller and the other data services you will be deploying in the future.

    New empty Azure resource group

Windows Login & Post Deployment

Now that we have both the GKE cluster and the Windows Server Client instance created, it is time to login to the Client VM.

  • Select the Windows instance, click on the RDP dropdown and download the RDP file. Using your windows_username and windows_password credentials, log in to the VM.

    GCP Client VM RDP

    GCP Client VM RDP

  • At first login, as mentioned in the “Automation Flow” section, a logon script will get executed. This script was created as part of the automated deployment process.

    Let the script run its course and do not close the PowerShell session, this will be done for you once completed. You will notice that the Azure Arc Data Controller gets deployed on the GKE cluster. The logon script run time is approximately 10min long.

    Once the script finishes its run, the logon script PowerShell session will be closed and the Azure Arc Data Controller will be deployed on the GKE cluster and be ready to use.

    PowerShell login script run

    PowerShell login script run

    PowerShell login script run

    PowerShell login script run

    PowerShell login script run

    PowerShell login script run

  • When the scripts are complete, all PowerShell windows will close.

    PowerShell login script run

  • From Azure Portal, navigate to the resource group and confirm that the Azure Arc-enabled Kubernetes cluster, the Azure Arc data controller resource and the Custom Location resource are present.

    Azure Portal showing data controller resource

  • Another tool automatically deployed is Azure Data Studio along with the Azure Data CLI, the Azure Arc and the PostgreSQL extensions. Using the Desktop shortcut created for you, open Azure Data Studio and expand the SQL MI connection to see the Adventureworks sample database.

    Azure Data Studio shortcut

    Azure Data Studio extension

    Azure Data studio sample database

    Azure Data studio sample database


Azure Arc-enabled SQL Managed Instance Stress Simulation

Included in this scenario, is a dedicated SQL stress simulation tool named SqlQueryStress automatically installed for you on the Client VM. SqlQueryStress will allow you to generate load on the Azure Arc-enabled SQL Managed Instance that can be done used to showcase how the SQL database and services are performing as well to highlight operational practices described in the next section.

  • To start with, open the SqlQueryStress desktop shortcut and connect to the SQL Managed Instance primary endpoint IP address. This can be found in the SQLMI Endpoints text file desktop shortcut that was also created for you alongside the username and password you used to deploy the environment.

    Open SqlQueryStress

    SQLMI Endpoints text file

Note: Secondary SQL Managed Instance endpoint will be available only when using the HA deployment model (“Business Critical”).

  • To connect, use “SQL Server Authentication” and select the deployed sample AdventureWorks database (you can use the “Test” button to check the connection).

    SqlQueryStress connected

  • To generate some load, we will be running a simple stored procedure. Copy the below procedure and change the number of iterations you want it to run as well as the number of threads to generate even more load on the database. In addition, change the delay between queries to 1ms for allowing the stored procedure to run for a while.

    exec [dbo].[uspGetEmployeeManagers] @BusinessEntityID = 8
  • As you can see from the example below, the configuration settings are 100,000 iterations, five threads per iteration, and a 1ms delay between queries. These configurations should allow you to have the stress test running for a while.

    SqlQueryStress settings

    SqlQueryStress running

Azure Arc-enabled SQL Managed Instance monitoring using Grafana

When deploying Azure Arc-enabled data services, a Grafana instance is also automatically deployed on the same Kubernetes cluster and include built-in dashboards for both Kubernetes infrastructure as well SQL Managed Instance monitoring (PostgreSQL dashboards are included as well but we will not be covering these in this section).

  • Now that you have the SqlQueryStress stored procedure running and generating load, we can look how this is shown in the the built-in Grafana dashboard. As part of the automation, a new URL desktop shortcut simply named “Grafana” was created.

    Grafana desktop shortcut

  • [Optional] The IP address for this instance represents the Kubernetes LoadBalancer external IP that was provision as part of Azure Arc-enabled data services. Use the kubectl get svc -n arc command to view the metricsui external service IP address.

    metricsui Kubernetes service

  • To log in, use the same username and password that is in the SQLMI Endpoints text file desktop shortcut.

    Grafana username and password

  • Navigate to the built-in “SQL Managed Instance Metrics” dashboard.

    Grafana dashboards

    Grafana “SQL Managed Instance Metrics” dashboard

  • Change the dashboard time range to “Last 5 minutes” and re-run the stress test using SqlQueryStress (in case it was already finished).

    Last 5 minutes time range

  • You can now see how the SQL graphs are starting to show increased activity and load on the database instance.

    Increased load activity

    Increased load activity

Delete the deployment

To completely delete the environment, follow the below steps.

  • Delete the data services resources by using kubectl. Run the below command from a PowerShell window on the client VM.

    kubectl delete namespace arc

    Delete database resources

  • Use terraform to delete all of the GCP resources as well as the Azure resource group. The terraform destroy run time is approximately ~5-6min long.

    terraform destroy --auto-approve

    terraform destroy

Known Issues

  • Webhook pods go into error state, even after Data Controller/SQL MI/Postgres pods are up, caused by a known Helm-related backend issue that is being worked on. These errors can be safely ignored and do not impact the functionality of Azure Arc-enabled data services and the Jumpstart automation.

    webhook known issue