Deploy Azure Arc-enabled SQL Managed Instance in directly connected mode on Kubeadm Kubernetes cluster with Azure provider using an ARM Template

The following Jumpstart scenario will guide you on how to deploy a “Ready to Go” environment so you can start using Azure Arc-enabled data services and SQL Managed Instance deployed on Kubeadm Kubernetes cluster.

By the end of this scenario, you will have a Kubeadm Kubernetes cluster deployed with an Azure Arc Data Controller and a Microsoft Windows Server 2022 (Datacenter) Azure client VM, installed & pre-configured with all the required tools needed to work with Azure Arc-enabled data services.

NOTE: Currently, Azure Arc-enabled data services with PostgreSQL is in public preview.

Prerequisites

  • Clone the Azure Arc Jumpstart repository

    git clone https://github.com/microsoft/azure_arc.git
    
  • Install or update Azure CLI to version 2.42.0 and above. Use the below command to check your current installed version.

    az --version
    
  • Generate a new SSH key pair or use an existing one (Windows 10 and above now comes with a built-in ssh client).

    ssh-keygen -t rsa -b 4096
    

    To retrieve the SSH public key after it’s been created, depending on your environment, use one of the below methods:

    • In Linux, use the cat ~/.ssh/id_rsa.pub command.
    • In Windows (CMD/PowerShell), use the SSH public key file that by default, is located in the C:\Users\WINUSER/.ssh/id_rsa.pub folder.

    SSH public key example output:

    ssh-rsa o1djFhyNe5NXyYk7XVF7wOBAAABgQDO/QPJ6IZHujkGRhiI+6s1ngK8V4OK+iBAa15GRQqd7scWgQ1RUSFAAKUxHn2TJPx/Z/IU60aUVmAq/OV9w0RMrZhQkGQz8CHRXc28S156VMPxjk/gRtrVZXfoXMr86W1nRnyZdVwojy2++sqZeP/2c5GoeRbv06NfmHTHYKyXdn0lPALC6i3OLilFEnm46Wo+azmxDuxwi66RNr9iBi6WdIn/zv7tdeE34VAutmsgPMpynt1+vCgChbdZR7uxwi66RNr9iPdMR7gjx3W7dikQEo1djFhyNe5rrejrgjerggjkXyYk7XVF7wOk0t8KYdXvLlIyYyUCk1cOD2P48ArqgfRxPIwepgW78znYuwiEDss6g0qrFKBcl8vtiJE5Vog/EIZP04XpmaVKmAWNCCGFJereRKNFIl7QfSj3ZLT2ZXkXaoLoaMhA71ko6bKBuSq0G5YaMq3stCfyVVSlHs7nzhYsX6aDU6LwM/BTO1c= user@pc
    
  • Create Azure service principal (SP). To deploy this scenario, an Azure service principal assigned with the following Role-based access control (RBAC) is required:

    • “Owner” - Required for provisioning Azure resources, interact with Azure Arc-enabled data services billing, monitoring metrics, and logs management and creating role assignment for the Monitoring Metrics Publisher role.

      To create it login to your Azure account run the below command (this can also be done in Azure Cloud Shell.

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Owner" --scopes /subscriptions/$subscriptionId
      

      For example:

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Owner" --scopes /subscriptions/$subscriptionId
      

      Output should look like this:

      {
      "appId": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "displayName": "JumpstartArcDataSvc",
      "password": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "tenant": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
      

      NOTE: The Jumpstart scenarios are designed with as much ease of use in-mind and adhering to security-related best practices whenever possible. It is optional but highly recommended to scope the service principal to a specific Azure subscription and resource group as well considering using a less privileged service principal account

Automation Flow

For you to get familiar with the automation and deployment flow, below is an explanation.

  • User is editing the ARM template parameters file (1-time edit) and export the Azure Custom Location Resource Provider (RP) Object ID (OID) variable to use it as a parameter. These parameters values are being used throughout the deployment.

  • Main azuredeploy ARM template will initiate the deployment of the linked ARM templates:

    • VNET - Deploys a VNET and Subnet for Client and K8s VMs.
    • ubuntuKubeadm - Deploys two Ubuntu Linux VMs which will be transformed into a
    • Kubeadm management cluster (a single control-plane and a single Worker node) using the installKubeadm and the installKubeadmWorker shell scripts. This Kubeadm cluster will be used by the rest of the Azure Arc-enabled data services automation deploy.
    • clientVm - Deploys the client Windows VM. This is where all user interactions with the environment are made from.
    • mgmtStagingStorage - Used for staging files in automation scripts.
    • logAnalytics - Deploys Azure Log Analytics workspace to support Azure Arc-enabled data services logs uploads.
  • User remotes into client Windows VM, which automatically kicks off the DataServicesLogonScript PowerShell script that creates a new Azure Arc-enabled Kubernetes cluster and configure Azure Arc-enabled data services on the kubeadm workload cluster including the Data Controller. Azure Arc-enabled data services deployed in directly connected are using this type of resource in order to deploy the data services cluster extension as well as for using Azure Arc Custom Location.

  • In addition to deploying the data controller and SQL Managed Instance, the sample AdventureWorks database will restored automatically for you as well.

Deployment

As mentioned, this deployment will leverage ARM templates. You will deploy a single template that will initiate the entire automation for this scenario.

  • The deployment is using the ARM template parameters file. Before initiating the deployment, edit the azuredeploy.parameters.json file located in your local cloned repository folder. An example parameters file is located here.

    • sshRSAPublicKey - Your SSH public key
    • spnClientId - Your Azure service principal id
    • spnClientSecret - Your Azure service principal secret
    • spnTenantId - Your Azure tenant id
    • windowsAdminUsername - Client Windows VM Administrator name
    • windowsAdminPassword - Client Windows VM Password. Password must have 3 of the following: 1 lower case character, 1 upper case character, 1 number, and 1 special character. The value must be between 12 and 123 characters long.
    • logAnalyticsWorkspaceName - Unique name for the deployment log analytics workspace.
    • deploySQLMI - Boolean that sets whether or not to deploy SQL Managed Instance, for this Azure Arc-enabled SQL Managed Instance scenario we will set it to true.
    • SQLMIHA - Boolean that sets whether or not to deploy SQL Managed Instance with high-availability (business continuity) configurations, set this to either true or false.
    • deployPostgreSQL - Boolean that sets whether or not to deploy PostgreSQL, for this scenario we leave it set to false.
    • deployBastion - Choice (true | false) to deploy Azure Bastion or not to connect to the client VM.
    • bastionHostName - Azure Bastion host name.
  • You will also need to get the Azure Custom Location Resource Provider (RP) Object ID (OID) and export it as an environment variable. This is required to enable Custom Location on your cluster.

    NOTE: You need permissions to list all the service principals.

    Option 1: Bash

    customLocationRPOID=$(az ad sp list --filter "displayname eq 'Custom Locations RP'" --query "[?appDisplayName=='Custom Locations RP'].id" -o tsv)
    

    Option 2: PowerShell

    $customLocationRPOID=(az ad sp list --filter "displayname eq 'Custom Locations RP'" --query "[?appDisplayName=='Custom Locations RP'].id" -o tsv)
    
  • To deploy the ARM template, navigate to the local cloned deployment folder and run the below command:

    az group create --name <Name of the Azure resource group> --location <Azure Region>
    az deployment group create \
    --resource-group <Name of the Azure resource group> \
    --name <The name of this deployment> \
    --template-uri https://raw.githubusercontent.com/microsoft/azure_arc/main/azure_arc_data_jumpstart/kubeadm/azure/ARM/azuredeploy.json \
    --parameters <The _azuredeploy.parameters.json_ parameters file location> \
    --parameters customLocationRPOID="$customLocationRPOID"
    

    NOTE: Make sure that you are using the same Azure resource group name as the one you’ve just used in the azuredeploy.parameters.json file

    For example:

    az group create --name Arc-Data-Demo --location "East US"
    az deployment group create \
    --resource-group Arc-Data-Demo \
    --name arcdatademo \
    --template-uri https://raw.githubusercontent.com/microsoft/azure_arc/main/azure_arc_data_jumpstart/kubeadm/azure/ARM/azuredeploy.json \
    --parameters customLocationRPOID="$customLocationRPOID" \
    --parameters azuredeploy.parameters.json
    

    NOTE: The deployment time for this scenario can take ~15-20min

  • Once Azure resources has been provisioned, you will be able to see it in Azure portal.

    Screenshot showing ARM template deployment completed

    Screenshot showing the new Azure resource group with all resources

Windows Login & Post Deployment

Various options are available to connect to Arc-Data-Client VM, depending on the parameters you supplied during deployment.

  • RDP - available after configuring access to port 3389 on the Arc-Data-Client-NSG, or by enabling Just-in-Time access (JIT).
  • Azure Bastion - available if true was the value of your deployBastion parameter during deployment.

Connecting directly with RDP

By design, port 3389 is not allowed on the network security group. Therefore, you must create an NSG rule to allow inbound 3389.

  • Open the Arc-Data-Client-NSG resource in Azure portal and click “Add” to add a new rule.

    Screenshot showing Arc-Data-Client-NSG with blocked RDP

    Screenshot showing adding a new inbound security rule

  • Specify the IP address that you will be connecting from and select RDP as the service with “Allow” set as the action. You can retrieve your public IP address by accessing https://icanhazip.com or https://whatismyip.com.

    Screenshot showing all inbound security rule

    Screenshot showing all NSG rules after opening RDP

    Screenshot showing connecting to the VM using RDP

Connect using Azure Bastion

  • If you have chosen to deploy Azure Bastion in your deployment, use it to connect to the VM.

    Screenshot showing connecting to the VM using Bastion

    NOTE: When using Azure Bastion, the desktop background image is not visible. Therefore some screenshots in this guide may not exactly match your experience if you are connecting with Azure Bastion.

Connect using just-in-time access (JIT)

If you already have Microsoft Defender for Cloud enabled on your subscription and would like to use JIT to access the Client VM, use the following steps:

  • In the Client VM configuration pane, enable just-in-time. This will enable the default settings.

    Screenshot showing the Microsoft Defender for cloud portal, allowing RDP on the client VM

    Screenshot showing connecting to the VM using JIT

Post Deployment

  • At first login, as mentioned in the “Automation Flow” section above, the DataServicesLogonScript PowerShell logon script will start it’s run.

  • Let the script to run its course and do not close the PowerShell session, this will be done for you once completed. Once the script will finish it’s run, the logon script PowerShell session will be closed, the Windows wallpaper will change and the Azure Arc Data Controller and the SQL Managed Instance will be deployed on the cluster and be ready to use.

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the PowerShell logon script run

    Screenshot showing the post-run desktop

  • Since this scenario is onboarding your Kubernetes cluster with Arc and deploying the Azure Arc Data Controller, you will also notice additional newly deployed Azure resources in the resources group. The important ones to notice are:

    • Custom location - provides a way for tenant administrators to use their Azure Arc-enabled Kubernetes clusters as target locations for deploying Azure services instances.

    • Azure Arc Data Controller - The data controller that is now deployed on the Kubernetes cluster.

    • Azure Arc-enabled SQL Managed Instance - The SQL Managed Instance that is now deployed on the Kubernetes cluster.

    Screenshot showing additional Azure resources in the resource group

  • As part of the automation, Azure Data Studio is installed along with the Azure Data CLI, Azure CLI, Azure Arc and the PostgreSQL extensions. Using the Desktop shortcut created for you, open Azure Data Studio and click the Extensions settings to see the installed extensions.

    Screenshot showing Azure Data Studio shortcut

    Screenshot showing Azure Data Studio extensions

  • Additionally, the SQL Managed Instance connection will be configured automatically for you. As mentioned, the sample AdventureWorks database was restored as part of the automation.

    Screenshot showing Azure Data Studio SQL MI connection

Cluster extensions

In this scenario, two Azure Arc-enabled Kubernetes cluster extensions were installed:

High Availability with SQL Always-On availability groups

Azure Arc-enabled SQL Managed Instance is deployed on Kubernetes as a containerized application and uses kubernetes constructs such as stateful sets and persistent storage to provide built-in health monitoring, failure detection, and failover mechanisms to maintain service health. For increased reliability, you can also configure Azure Arc-enabled SQL Managed Instance to deploy with extra replicas in a high availability configuration.

For showcasing and testing SQL Managed Instance with Always On availability groups, a dedicated Jumpstart scenario is available to help you simulate failures and get hands-on experience with this deployment model.

Operations

Azure Arc-enabled SQL Managed Instance stress simulation

Included in this scenario, is a dedicated SQL stress simulation tool named SqlQueryStress automatically installed for you on the Client VM. SqlQueryStress will allow you to generate load on the Azure Arc-enabled SQL Managed Instance that can be done used to showcase how the SQL database and services are performing as well to highlight operational practices described in the next section.

  • To start with, open the SqlQueryStress desktop shortcut and connect to the SQL Managed Instance primary endpoint IP address. This can be found in the SQLMI Endpoints text file desktop shortcut that was also created for you alongside the username and password you used to deploy the environment.

    Screenshot showing opened SqlQueryStress

    Screenshot showing SQLMI Endpoints text file

NOTE: Secondary SQL Managed Instance endpoint will be available only when using the HA deployment model (“Business Critical”).

  • To connect, use “SQL Server Authentication” and select the deployed sample AdventureWorks database (you can use the “Test” button to check the connection).

    Screenshot showing SqlQueryStress connected

  • To generate some load, we will be running a simple stored procedure. Copy the below procedure and change the number of iterations you want it to run as well as the number of threads to generate even more load on the database. In addition, change the delay between queries to 1ms for allowing the stored procedure to run for a while.

    exec [dbo].[uspGetEmployeeManagers] @BusinessEntityID = 8
    
  • As you can see from the example below, the configuration settings are 100,000 iterations, five threads per iteration, and a 1ms delay between queries. These configurations should allow you to have the stress test running for a while.

    Screenshot showing SqlQueryStress settings

    Screenshot showing SqlQueryStress running

Azure Arc-enabled SQL Managed Instance monitoring using Grafana

When deploying Azure Arc-enabled data services, a Grafana instance is also automatically deployed on the same Kubernetes cluster and include built-in dashboards for both Kubernetes infrastructure as well SQL Managed Instance monitoring (PostgreSQL dashboards are included as well but we will not be covering these in this section).

  • Now that you have the SqlQueryStress stored procedure running and generating load, we can look how this is shown in the the built-in Grafana dashboard. As part of the automation, a new URL desktop shortcut simply named “Grafana” was created.

    Screenshot showing Grafana desktop shortcut

  • [Optional] The IP address for this instance represents the Kubernetes LoadBalancer external IP that was provision as part of Azure Arc-enabled data services. Use the kubectl get svc -n arc command to view the metricsui external service IP address.

    Screenshot showing metricsui Kubernetes service

  • To log in, use the same username and password that is in the SQLMI Endpoints text file desktop shortcut.

    Screenshot showing Grafana username and password

  • Navigate to the built-in “SQL Managed Instance Metrics” dashboard.

    Screenshot showing Grafana dashboards

    Screenshot showing Grafana “SQL Managed Instance Metrics” dashboard

  • Change the dashboard time range to “Last 5 minutes” and re-run the stress test using SqlQueryStress (in case it was already finished).

    Screenshot showing “Last 5 minutes” time range

  • You can now see how the SQL graphs are starting to show increased activity and load on the database instance.

    Screenshot showing increased load activity

    Screenshot showing increased load activity

Exploring logs from the Client virtual machine

Occasionally, you may need to review log output from scripts that run on the Arc-Data-Client, Arc-Data-Kubeadm-MGMT-Master or Arc-Data-Kubeadm-MGMT-Worker virtual machines in case of deployment failures. To make troubleshooting easier, the scenario deployment scripts collect all relevant logs in the C:\Temp folder on Arc-Data-Client. A short description of the logs and their purpose can be seen in the list below:

Logfile Description
C:\Temp\Bootstrap.log Output from the initial bootstrapping script that runs on Arc-Data-Client.
C:\Temp\DataServicesLogonScript.log Output of DataServicesLogonScript.ps1 which configures Azure Arc-enabled data services baseline capability.
C:\Temp\DeploySQLMI.log Output of deploySQL.ps1 which deploys and configures SQL Managed Instance with Azure Arc.
C:\Temp\installKubeadm.log Output from the custom script extension which runs on Arc-Data-Kubeadm-MGMT-Master and configures the Kubeadm cluster Master Node. If you encounter ARM deployment issues with ubuntuKubeadm.json then review this log.
C:\Temp\installKubeadmWorker.log Output from the custom script extension which runs on _Arc-Data-Kubeadm-MGMT-Worker and configures the Kubeadm cluster Worker Node. If you encounter ARM deployment issues with ubuntuKubeadm.json then review this log.
C:\Temp\SQLMIEndpoints.log Output from SQLMIEndpoints.ps1 which collects the service endpoints for SQL MI and uses them to configure Azure Data Studio connection settings.

Screenshot showing the Temp folder with deployment logs

Cleanup

  • If you want to delete the entire environment, simply delete the deployment resource group from the Azure portal.

    Screenshot showing Azure resource group deletion