Configure disaster recovery in Azure Arc-enabled SQL Managed Instance on AKS using an ARM Template

The following Jumpstart scenario will guide you on how to deploy a “Ready to Go” environment so you can configure disaster recovery for Azure Arc-enabled data services and SQL Managed Instance deployed on Azure Kubernetes Service (AKS) clusters, using an Azure ARM Template.

By the end of this guide, you will have two Azure Kubernetes Service (AKS) clusters deployed in two separate Azure virtual networks, an Azure Arc-enabled SQL Managed Instance deployed on each cluster, a disaster recovery architecture configured between the two sites, and a Microsoft Windows Server 2022 (Datacenter) Azure client VM, installed and pre-configured with all the required tools needed to work with Azure Arc-enabled data services:

Screenshot showing the deployed architecture

Prerequisites

  • Clone the Azure Arc Jumpstart repository

    git clone https://github.com/microsoft/azure_arc.git
    
  • Install or update Azure CLI to version 2.36.0 or above. Use the below command to check your currently installed version.

    az --version
    
  • Generate SSH Key (or use existing ssh key).
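
    If you need to generate a new key pair, a minimal sketch using ssh-keygen is shown below (the file path and comment are just placeholders; any location works, and an existing key is fine too):

    ssh-keygen -t rsa -b 4096 -C "jumpstart" -f ~/.ssh/id_rsa
    cat ~/.ssh/id_rsa.pub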

  • Create Azure service principal (SP). To deploy this scenario, an Azure service principal assigned with multiple RBAC roles is required:

    • “Contributor” - Required for provisioning Azure resources

    • “Security admin” - Required for installing the Cloud Defender Azure Arc-enabled Kubernetes extension and dismissing alerts

    • “Security reader” - Required to be able to view Azure Arc-enabled Kubernetes Cloud Defender extension findings

    • “Monitoring Metrics Publisher” - Required for Azure Arc-enabled data services billing, monitoring metrics upload, and logs management

      To create it, login to your Azure account and run the below commands (this can also be done in Azure Cloud Shell).

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Contributor" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Security admin" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Security reader" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Monitoring Metrics Publisher" --scopes /subscriptions/$subscriptionId
      

      For example:

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Contributor" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Security admin" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Security reader" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Monitoring Metrics Publisher" --scopes /subscriptions/$subscriptionId
      

      Output should look like this:

      {
      "appId": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "displayName": "JumpstartArcDataSvc",
      "password": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "tenant": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
      

      NOTE: If you create multiple subsequent role assignments on the same service principal, your client secret (password) will be destroyed and recreated each time. Therefore, make sure you grab the correct password.
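
      If you prefer to keep a single client secret, an alternative (not part of the original flow above) is to create the service principal once with the Contributor role and then append the remaining roles to the same principal using az role assignment create. A sketch, assuming you substitute the appId returned by the create command:

      appId="<appId from the create-for-rbac output>"
      az role assignment create --assignee $appId --role "Security admin" --scope /subscriptions/$subscriptionId
      az role assignment create --assignee $appId --role "Security reader" --scope /subscriptions/$subscriptionId
      az role assignment create --assignee $appId --role "Monitoring Metrics Publisher" --scope /subscriptions/$subscriptionId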

      NOTE: The Jumpstart scenarios are designed with ease of use in mind while adhering to security-related best practices whenever possible. It is optional but highly recommended to scope the service principal to a specific Azure subscription and resource group, as well as to consider using a less privileged service principal account.

Automation Flow

To help you get familiar with the automation and deployment flow, below is an explanation.

  • User edits the ARM template parameters file (one-time edit). These parameter values are used throughout the deployment.

  • Main azuredeploy ARM template will initiate the deployment of the linked ARM templates:

    • VNET - Deploys three Virtual Networks: one for each of the two sites where the clusters will be located, and a third Virtual Network to be used by the Client virtual machine.

    • aks - Deploys the two AKS clusters in both sites (primary and secondary) where all the Azure Arc data services will be deployed.

    • clientVm - Deploys the client Windows VM. This is where all user interactions with the environment are made from.

    • logAnalytics - Deploys Azure Log Analytics workspace to support Azure Arc-enabled data services logs uploads.

    • User remotes into the client Windows VM, which automatically kicks off the DataServicesLogonScript PowerShell script that deploys and configures Azure Arc-enabled data services on the AKS clusters, including the data controllers and SQL Managed Instances, in addition to configuring disaster recovery between the two clusters.

    • In addition to deploying the data controllers and SQL Managed Instances, the sample AdventureWorks database will be restored automatically for you on the primary cluster as well.

Deployment

As mentioned, this deployment will leverage ARM templates. You will deploy a single template that will initiate the entire automation for this scenario.

  • The deployment uses the ARM template parameters file. Before initiating the deployment, edit the azuredeploy.parameters.json file located in your local cloned repository folder. An example parameters file is located here, and an illustrative snippet is shown after the parameter list below.

    • sshRSAPublicKey - Your SSH public key

    • spnClientId - Your Azure service principal id

    • spnClientSecret - Your Azure service principal secret

    • spnTenantId - Your Azure tenant id

    • windowsAdminUsername - Client Windows VM Administrator name

    • windowsAdminPassword - Client Windows VM Password. Password must have 3 of the following: 1 lower case character, 1 upper case character, 1 number, and 1 special character. The value must be between 12 and 123 characters long.

    • myIpAddress - Your local public IP address. This is used to allow remote RDP and SSH connections to the client Windows VM.

    • logAnalyticsWorkspaceName - Unique name for the deployment log analytics workspace.

    • deploySQLMI - Boolean that sets whether or not to deploy SQL Managed Instance. For this Azure Arc-enabled SQL Managed Instance scenario, we will set it to true.

    • SQLMIHA - Boolean that sets whether or not to deploy SQL Managed Instance with high-availability (business continuity) configurations. Set this to either true or false.

    • deployBastion - Choice (true | false) whether or not to deploy Azure Bastion for connecting to the client VM.

    • bastionHostName - Azure Bastion host name.

      NOTE: This scenario goes through the capability to fail over one Azure Arc-enabled SQL Managed Instance to another instance on a different cluster. If you would like to learn about high availability within the same cluster, you can check our Jumpstart SQL Managed Instance Availability Groups Failover scenario.
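
    For illustration only, an edited azuredeploy.parameters.json might look similar to the snippet below. All values are placeholders, and the example parameters file in the repository is the authoritative reference for the format and parameter names:

    {
      "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "sshRSAPublicKey": { "value": "ssh-rsa AAAA<your public key>" },
        "spnClientId": { "value": "<your service principal appId>" },
        "spnClientSecret": { "value": "<your service principal secret>" },
        "spnTenantId": { "value": "<your Azure tenant id>" },
        "windowsAdminUsername": { "value": "<client VM admin username>" },
        "windowsAdminPassword": { "value": "<12-123 character password>" },
        "myIpAddress": { "value": "<your public IP address>" },
        "logAnalyticsWorkspaceName": { "value": "<unique workspace name>" },
        "deploySQLMI": { "value": true },
        "SQLMIHA": { "value": false },
        "deployBastion": { "value": false },
        "bastionHostName": { "value": "<Bastion host name>" }
      }
    }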

  • To deploy the ARM template, navigate to the local cloned deployment folder and run the below command:

    az group create --name <Name of the Azure resource group> --location <Azure Region>
    az deployment group create \
    --resource-group <Name of the Azure resource group> \
    --name <The name of this deployment> \
    --template-uri https://raw.githubusercontent.com/microsoft/azure_arc/main/azure_arc_data_jumpstart/aks/DR/ARM/azuredeploy.json \
    --parameters <The *azuredeploy.parameters.json* parameters file location>
    

    NOTE: Make sure that you are using the same Azure resource group name as the one you’ve just used in the azuredeploy.parameters.json file

    For example:

    az group create --name Arc-Data-Demo --location "East US"
    az deployment group create \
    --resource-group Arc-Data-Demo \
    --name arcdata \
    --template-uri https://raw.githubusercontent.com/microsoft/azure_arc/main/azure_arc_data_jumpstart/aks/DR/ARM/azuredeploy.json \
    --parameters azuredeploy.parameters.json
    

    NOTE: The deployment time for this scenario can take ~15-20 minutes.

  • Once the Azure resources have been provisioned, you will be able to see them in the Azure portal. At this point, the resource group should have 11 various Azure resources deployed (if you chose to deploy Azure Bastion, you will have 12 Azure resources).

    Screenshot showing ARM template deployment completed

    Screenshot showing the new Azure resource group with all resources
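
    Optionally, you can also review the deployed resources from the CLI; a quick check with az resource list (using the resource group name you created earlier):

    az resource list --resource-group <Name of the Azure resource group> --output table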

Windows Login & Post Deployment

  • Now that the first phase of the automation is completed, it is time to RDP to the client VM. If you have not chosen to deploy Azure Bastion in the ARM template, RDP to the VM using its public IP.

    Screenshot showing the Client VM public IP
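
    If you prefer the CLI over the portal, one way to retrieve the client VM public IP is az vm list-ip-addresses (using the resource group you deployed into):

    az vm list-ip-addresses --resource-group <Name of the Azure resource group> --output table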

  • If you have chosen to deploy Azure Bastion in the ARM template, use it to connect to the VM.

    Screenshot showing connecting using Azure Bastion

  • At first login, as mentioned in the “Automation Flow” section above, the DataServicesLogonScript PowerShell logon script will start its run.

  • Let the script run its course and do not close the PowerShell session; this will be done for you once it completes. Once the script finishes its run, the logon script PowerShell session will be closed, the Windows wallpaper will change, and the Azure Arc Data Controllers and SQL Managed Instances will be deployed on both clusters and be ready to use.

    Screenshot showing the PowerShell logon script run

    Screenshot showing the post-run desktop

  • Since this scenario is deploying the Azure Arc Data Controller and SQL Managed Instance, you will also notice additional newly deployed Azure resources in the resource group (at this point you should have 19 various Azure resources deployed). The important ones to notice are:

    • Azure Arc-enabled Kubernetes cluster - Azure Arc-enabled data services deployed in directly connected mode use this type of resource in order to deploy the data services cluster extension, as well as to use Azure Arc Custom locations.

    • Custom location - provides a way for tenant administrators to use their Azure Arc-enabled Kubernetes clusters as target locations for deploying Azure services instances.

    • Azure Arc Data Controller - The data controllers that are now deployed on the Kubernetes clusters.

    • Azure Arc-enabled SQL Managed Instance - The SQL Managed Instances that are now deployed on the Kubernetes clusters.

      Screenshot showing additional Azure resources in the resource group

  • As part of the automation, Azure Data Studio is installed along with the Azure Data CLI, Azure CLI, Azure Arc and the PostgreSQL extensions. Using the Desktop shortcut created for you, open Azure Data Studio and click the Extensions settings to see the installed extensions.

    Screenshot showing Azure Data Studio shortcut

    Screenshot showing Azure Data Studio extensions

  • Additionally, the SQL Managed Instances connections will be configured automatically for you. As mentioned, the sample AdventureWorks database was restored as part of the automation on the primary instance.

    Screenshot showing Azure Data Studio SQL MI connection

Cluster extensions

In this scenario, two Azure Arc-enabled Kubernetes cluster extensions were installed.

In order to view these cluster extensions, click on the Azure Arc-enabled Kubernetes resource Extensions settings.

Screenshot showing the Azure Arc-enabled Kubernetes cluster extensions settings

Screenshot showing the Azure Arc-enabled Kubernetes installed extensions
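
The extensions can also be listed from the CLI using the k8s-extension Azure CLI extension. A minimal sketch, assuming you substitute the name of the Azure Arc-enabled Kubernetes resource from your resource group:

    az k8s-extension list --resource-group <Name of the Azure resource group> \
      --cluster-name <Azure Arc-enabled Kubernetes cluster name> \
      --cluster-type connectedClusters \
      --output table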

Disaster recovery with SQL distributed availability groups

Azure Arc-enabled SQL Managed Instance is deployed on Kubernetes as a containerized application and uses Kubernetes constructs such as stateful sets and persistent storage to provide built-in health monitoring, failure detection, and failover mechanisms to maintain service health. For increased reliability, you can also configure disaster recovery between multiple Azure Arc-enabled SQL Managed Instances to be able to fail over to another instance on another Kubernetes cluster. Disaster recovery is supported for both General Purpose and Business Critical tiers.

Operations

Azure Arc-enabled SQL Managed Instance distributed availability group validation

  • To be able to fail over to a different cluster, the automation flow creates a distributed availability group that spans the primary and secondary clusters. This can be validated by running the following command (an optional check against both clusters is sketched after the screenshot below):

    az sql instance-failover-group-arc show --name primarycr --use-k8s --k8s-namespace arc
    

    Screenshot showing disaster recovery configuration
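
    Optionally, you can run the same check against both clusters. The sketch below assumes the automation created kubectl contexts named primary and secondary (the secondary context and the secondarycr resource name are used later in this guide):

    kubectx primary     # assumes a kubectl context named "primary" exists
    az sql instance-failover-group-arc show --name primarycr --use-k8s --k8s-namespace arc
    kubectx secondary
    az sql instance-failover-group-arc show --name secondarycr --use-k8s --k8s-namespace arc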

  • As part of the automation, the script will also create a new text file and a desktop shortcut named Endpoints that includes both the primary and the secondary SQL endpoints for both SQL instances.

    Screenshot showing the Azure Arc-enabled SQL Managed Instances endpoint URLs text file

  • Open Microsoft SQL Server Management Studio (SSMS), which is installed automatically for you as part of the Jumpstart scenario bootstrap. Use the primary endpoint IP address for the primary cluster and log in to the primary DB instance with the username and password provided in the text file mentioned above.

    Screenshot showing opening SQL Server Management Studio from the start menu

  • Use the username and password you entered when you provisioned the environment and select “SQL Server Authentication”. Alternatively, you can retrieve the username and password from the $env:AZDATA_USERNAME and $env:AZDATA_PASSWORD environment variables.

    Screenshot showing logging into the SQL Server Management Studio

  • Connect to the secondary instance as well, using the primary endpoint IP address for the secondary cluster from the text file mentioned above.

    Screenshot showing the SQL Server Management Studio after login

    Screenshot showing the SQL Server Management Studio after login

    Screenshot showing the two Azure Arc-enabled SQL Managed Instances connected in the SQL Management Studio

  • Expand the Always On High Availability node on both instances to verify that the distributed availability group is created.

    Screenshot showing the local and distributed Availability groups on both instances

  • You will find the AdventureWorks2019 database already deployed into the primary instance (js-sql-pr) and automatically replicated to the secondary instance (js-sql-dr) as part of the distributed availability group.

    Screenshot showing adventureworks database opened on the primary instance

    NOTE: You will not be able to browse the AdventureWorks2019 database from the secondary instance since this instance is configured as a disaster recovery instance.

    Screenshot showing adventureworks database opened on the secondary instance

Simulating failure on the primary site

  • First, to test that the DB replication is working, a simple table modification is needed. For this example, on the primary replica, run the following query to update the job title of one of the rows to be Jumpstart Administrator:

     USE [AdventureWorks2019]
     GO
     UPDATE [HumanResources].[Employee]
     SET [JobTitle] = 'Jumpstart Administrator'
     WHERE NationalIDNumber = 245797967
     GO
    

    Screenshot showing updating a record in the database

    Screenshot showing the updated record in the database

  • To simulate a disaster situation, navigate to the AKS cluster in the Azure portal and stop the primary cluster.

    Screenshot showing stopping the primary AKS cluster
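
    Alternatively, if you prefer the CLI, the primary cluster can also be stopped with az aks stop (the cluster name below is a placeholder; use the primary AKS cluster name from your resource group):

    az aks stop --name <Primary AKS cluster name> --resource-group <Name of the Azure resource group>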

  • Wait a couple of minutes for the cluster to shut down, then try to refresh the connection to the primary instance; you will see that it is no longer available.

    Screenshot showing unavailable primary instance

Initiating a forced failover to the secondary site

  • On the client VM, run the following commands against the secondary cluster to promote the secondary instance to primary with a forced failover, incurring potential data loss.

     kubectx secondary
     az sql instance-failover-group-arc update --k8s-namespace arc --name secondarycr --use-k8s --role force-primary-allow-data-loss
    

    Screenshot showing the forced failover to the secondary instance
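
    Optionally, you can re-run the failover group show command against the secondary cluster to confirm its new role (a quick check; the exact output fields may vary by CLI extension version):

    az sql instance-failover-group-arc show --name secondarycr --use-k8s --k8s-namespace arc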

  • Browse to the secondary instance in Microsoft SQL Server Management Studio (SSMS) and you will see that the secondary (js-sql-dr) instance has now been promoted to primary.

    Screenshot showing browsing to the secondary instance

  • To validate that the data you updated earlier has been replicated to the secondary instance, right-click the “HumanResources.Employee” table and select “Edit Top 200 Rows”.

    Screenshot showing Edit Top 200 Rows

Cleanup

  • If you want to delete the entire environment, simply delete the deployment resource group from the Azure portal.

    Screenshot showing Azure resource group deletion
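
    Alternatively, the resource group can be deleted from the CLI:

    az group delete --name <Name of the Azure resource group> --yes --no-wait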