Deploy a vanilla Azure Arc Data Controller in directly connected mode on Cluster API Kubernetes cluster with Azure provider using an ARM Template

The following Jumpstart scenario will guide you on how to deploy a “Ready to Go” environment so you can start using Azure Arc-enabled data services deployed on a Cluster API (CAPI) Kubernetes cluster and its Cluster API Azure provider (CAPZ).

By the end of this scenario, you will have a CAPI Kubernetes cluster deployed with an Azure Arc Data Controller and a Microsoft Windows Server 2022 (Datacenter) Azure client VM, installed and pre-configured with all the tools needed to work with Azure Arc-enabled data services.

NOTE: Currently, Azure Arc-enabled data services with PostgreSQL is in public preview.

Prerequisites

  • Clone the Azure Arc Jumpstart repository

    git clone https://github.com/microsoft/azure_arc.git
    
  • Install or update Azure CLI to version 2.36.0 or above. Use the command below to check your currently installed version.

    az --version
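
    If your installed version is older than 2.36.0, the CLI can usually update itself in place (assuming it was installed through a method that supports in-place upgrades; otherwise use your package manager):

    # Update the Azure CLI to the latest released version
    az upgrade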
    
  • Generate an SSH key (or use an existing SSH key).
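
    If you need to generate a new one, a minimal example (the key type and size here are just a common choice):

    # Generate a 4096-bit RSA key pair; accept the default file location or choose your own
    ssh-keygen -t rsa -b 4096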

  • Create Azure service principal (SP). To deploy this scenario, an Azure service principal assigned with multiple RBAC roles is required:

    • “Contributor” - Required for provisioning Azure resources

    • “Security admin” - Required for installing the Cloud Defender Azure Arc-enabled Kubernetes extension and dismissing alerts

    • “Security reader” - Required for viewing Cloud Defender extension findings on the Azure Arc-enabled Kubernetes cluster

    • “Monitoring Metrics Publisher” - Required for Azure Arc-enabled data services billing, monitoring metrics, and logs management

      To create it, log in to your Azure account and run the below commands (this can also be done in Azure Cloud Shell).

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Contributor" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Security admin" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Security reader" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Monitoring Metrics Publisher" --scopes /subscriptions/$subscriptionId
      

      For example:

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Contributor" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Security admin" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Security reader" --scopes /subscriptions/$subscriptionId
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Monitoring Metrics Publisher" --scopes /subscriptions/$subscriptionId
      

      Output should look like this:

      {
      "appId": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "displayName": "JumpstartArcDataSvc",
      "password": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "tenant": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
      

      NOTE: If you create multiple subsequent role assignments on the same service principal, your client secret (password) will be destroyed and recreated each time. Therefore, make sure you grab the correct password.

      NOTE: The Jumpstart scenarios are designed with as much ease of use in mind as possible, while adhering to security-related best practices whenever possible. It is optional but highly recommended to scope the service principal to a specific Azure subscription and resource group, as well as to consider using a less privileged service principal account.
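
      For example, the “Contributor” assignment could be scoped down to a single resource group (the resource group name here is just the example used later in this scenario):

      # Scope the role assignment to one resource group instead of the whole subscription
      az ad sp create-for-rbac -n "JumpstartArcDataSvc" --role "Contributor" --scopes /subscriptions/$subscriptionId/resourceGroups/Arc-Data-Demo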

Architecture (In a nutshell)

From the Cluster API Book docs:

“Cluster API requires an existing Kubernetes cluster accessible via kubectl; during the installation process the Kubernetes cluster will be transformed into a management cluster by installing the Cluster API provider components, so it is recommended to keep it separated from any application workload.”

In this scenario, and as part of the automation flow (described below), a Rancher K3s cluster will be deployed and used as the management cluster. This cluster will then be used to deploy the workload cluster using the Cluster API Azure provider (CAPZ).
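
For illustration, this transformation into a management cluster is the standard clusterctl bootstrap step shown below; the actual installCAPI script may use different flags and pinned versions:

    # Install the Cluster API components and the Azure (CAPZ) provider
    # into the cluster behind the current kubectl context
    clusterctl init --infrastructure azure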

Automation Flow

To help you get familiar with the automation and deployment flow, below is an explanation.

  • The user edits the ARM template parameters file (one-time edit). These parameter values are used throughout the deployment.

  • The main azuredeploy ARM template will initiate the deployment of the linked ARM templates:

    • VNET - Deploys a Virtual Network with a single subnet to be used by the Client virtual machine.
    • ubuntuCapi - Deploys an Ubuntu Linux VM which will have Rancher K3s installed and transformed into a Cluster API management cluster via the Azure CAPZ provider. As part of its automation and the installCAPI shell script, a new Azure Arc-enabled Kubernetes cluster will already be created, to be used by the rest of the Azure Arc-enabled data services automation (see the verification example at the end of this list). Azure Arc-enabled data services deployed in directly connected mode use this resource type to deploy the data services cluster extension, as well as to use the Azure Arc Custom location.
    • clientVm - Deploys the client Windows VM. This is where all user interactions with the environment are made from.
    • mgmtStagingStorage - Used for staging files in automation scripts.
    • logAnalytics - Deploys Azure Log Analytics workspace to support Azure Arc-enabled data services logs uploads.
  • The user remotes into the client Windows VM, which automatically kicks off the DataServicesLogonScript PowerShell script that deploys and configures Azure Arc-enabled data services on the CAPI workload cluster, including the data controller.
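
    Once the onboarding has happened, one hedged way to confirm the Arc-enabled Kubernetes cluster resource exists (requires the connectedk8s Azure CLI extension; substitute your own names):

    # Show the onboarded Azure Arc-enabled Kubernetes cluster resource and its connectivity status
    az connectedk8s show \
    --name <Arc-enabled cluster name> \
    --resource-group <Name of the Azure resource group>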

Deployment

As mentioned, this deployment will leverage ARM templates. You will deploy a single template that will initiate the entire automation for this scenario.

  • The deployment uses the ARM template parameters file. Before initiating the deployment, edit the azuredeploy.parameters.json file located in your local cloned repository folder. An example parameters file is located here; a trimmed sketch is also shown after the parameter list below.

    • ‘sshRSAPublicKey’ - Your SSH public key
    • ‘spnClientId’ - Your Azure service principal id
    • ‘spnClientSecret’ - Your Azure service principal secret
    • ‘spnTenantId’ - Your Azure tenant id
    • ‘windowsAdminUsername’ - Client Windows VM Administrator name
    • ‘windowsAdminPassword’ - Client Windows VM Password. Password must have 3 of the following: 1 lower case character, 1 upper case character, 1 number, and 1 special character. The value must be between 12 and 123 characters long.
    • ‘myIpAddress’ - Your local IP address. This is used to allow remote RDP and SSH connections to the client Windows VM and K3s Rancher VM.
    • ‘logAnalyticsWorkspaceName’ - Unique name for the deployment log analytics workspace.
    • ‘deploySQLMI’ - Boolean that sets whether or not to deploy SQL Managed Instance; for this vanilla data controller scenario, leave it set to false.
    • ‘SQLMIHA’ - Boolean that sets whether or not to deploy SQL Managed Instance with high-availability (business continuity) configurations; for this vanilla data controller scenario, leave it set to false.
    • ‘deployPostgreSQL’ - Boolean that sets whether or not to deploy PostgreSQL; for this vanilla data controller scenario, leave it set to false.
    • ‘deployBastion’ - Choice (true | false) whether to deploy Azure Bastion for connecting to the client VM.
    • ‘bastionHostName’ - Azure Bastion host name.
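
    To illustrate, a sketch of what azuredeploy.parameters.json can look like (all values below are placeholders; the parameters file in your cloned repository is the authoritative template):

    {
      "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
      "contentVersion": "1.0.0.0",
      "parameters": {
        "sshRSAPublicKey": { "value": "ssh-rsa AAAAB3... you@example.com" },
        "spnClientId": { "value": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" },
        "spnClientSecret": { "value": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX" },
        "spnTenantId": { "value": "XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX" },
        "windowsAdminUsername": { "value": "arcdemo" },
        "windowsAdminPassword": { "value": "<your 12-123 character password>" },
        "myIpAddress": { "value": "x.x.x.x" },
        "logAnalyticsWorkspaceName": { "value": "<unique workspace name>" },
        "deploySQLMI": { "value": false },
        "SQLMIHA": { "value": false },
        "deployPostgreSQL": { "value": false },
        "deployBastion": { "value": false },
        "bastionHostName": { "value": "Arc-Data-Demo-Bastion" }
      }
    }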
  • To deploy the ARM template, navigate to the local cloned deployment folder and run the below command:

    az group create --name <Name of the Azure resource group> --location <Azure Region>
    az deployment group create \
    --resource-group <Name of the Azure resource group> \
    --name <The name of this deployment> \
    --template-uri https://raw.githubusercontent.com/microsoft/azure_arc/main/azure_arc_data_jumpstart/cluster_api/capi_azure/ARM/azuredeploy.json \
    --parameters <The azuredeploy.parameters.json parameters file location>
    

    NOTE: Make sure that you are using the same Azure resource group name as the one you’ve just used in the azuredeploy.parameters.json file

    For example:

    az group create --name Arc-Data-Demo --location "East US"
    az deployment group create \
    --resource-group Arc-Data-Demo \
    --name arcdatademo \
    --template-uri https://raw.githubusercontent.com/microsoft/azure_arc/main/azure_arc_data_jumpstart/cluster_api/capi_azure/ARM/azuredeploy.json \
    --parameters azuredeploy.parameters.json
    

    NOTE: The deployment of this scenario can take ~15-20 minutes.
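
    While you wait, you can optionally poll the deployment state from another shell (names follow the example above):

    # Query the ARM deployment provisioning state (Running / Succeeded / Failed)
    az deployment group show \
    --resource-group Arc-Data-Demo \
    --name arcdatademo \
    --query properties.provisioningState \
    --output tsv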

  • Once the Azure resources have been provisioned, you will be able to see them in the Azure portal. As mentioned, a new Azure Arc-enabled Kubernetes cluster resource will already be available at this point.

    Screenshot showing ARM template deployment completed

    Screenshot showing the new Azure resource group with all resources

Windows Login & Post Deployment

  • Now that the first phase of the automation is completed, it is time to RDP to the client VM. If you have not chosen to deploy Azure Bastion in the ARM template, RDP to the VM using its public IP.

    Screenshot showing Client VM public IP
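
    You can also look the public IP up with the CLI (Arc-Data-Client is the client VM name referenced later in this scenario; the resource group follows the earlier example):

    # Show the client VM's public IP address
    az vm show --show-details \
    --resource-group Arc-Data-Demo \
    --name Arc-Data-Client \
    --query publicIps \
    --output tsv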

  • If you have chosen to deploy Azure Bastion in the ARM template, use it to connect to the VM.

    Screenshot showing connecting using Azure Bastion
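
    As an alternative to the portal, recent Azure CLI versions can tunnel RDP through Bastion (this requires the bastion CLI extension and a native Windows RDP client; resource names below are the example ones):

    # Open an RDP session to the client VM through Azure Bastion
    az network bastion rdp \
    --name <Azure Bastion host name> \
    --resource-group Arc-Data-Demo \
    --target-resource-id $(az vm show --resource-group Arc-Data-Demo --name Arc-Data-Client --query id --output tsv)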

  • At first login, as mentioned in the “Automation Flow” section above, the DataServicesLogonScript PowerShell logon script will start its run.

  • Let the script run its course and do not close the PowerShell session; this will be done for you once it completes. Once the script finishes its run, the logon script PowerShell session will be closed, the Windows wallpaper will change, and the Azure Arc Data Controller will be deployed on the cluster and ready to use.

    Screenshot showing the PowerShell logon script run

    Screenshot showing the post-run desktop

  • Since this scenario is deploying the Azure Arc Data Controller, you will also notice additional newly deployed Azure resources in the resource group. The important ones to notice are:

    • Custom location - provides a way for tenant administrators to use their Azure Arc-enabled Kubernetes clusters as target locations for deploying Azure services instances.

    • Azure Arc Data Controller - The data controller that is now deployed on the Kubernetes cluster.

    Screenshot showing additional Azure resources in the resource group
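
    To inspect the same data controller from the Kubernetes side, one hedged check from the client VM (the namespace and resource names depend on your deployment):

    # List data controller custom resources across all namespaces
    kubectl get datacontrollers --all-namespaces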

  • As part of the automation, Azure Data Studio is installed along with the Azure Data CLI, Azure CLI, Azure Arc and the PostgreSQL extensions. Using the Desktop shortcut created for you, open Azure Data Studio and click the Extensions settings to see the installed extensions.

    Screenshot showing Azure Data Studio shortcut

    Screenshot showing Azure Data Studio extensions

Cluster extensions

In this scenario, four Azure Arc-enabled Kubernetes cluster extensions were installed on the cluster.
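
A hedged way to enumerate them yourself from the client VM (requires the k8s-extension Azure CLI extension; substitute your own cluster and resource group names):

    # List the cluster extensions installed on the Arc-enabled Kubernetes cluster
    az k8s-extension list \
    --cluster-name <Arc-enabled cluster name> \
    --resource-group <Name of the Azure resource group> \
    --cluster-type connectedClusters \
    --output table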

Exploring logs from the Client virtual machine

Occasionally, you may need to review log output from scripts that run on the Arc-Data-Client or Arc-Data-CAPI-MGMT virtual machines in case of deployment failures. To make troubleshooting easier, the scenario deployment scripts collect all relevant logs in the C:\Temp folder on Arc-Data-Client. A short description of the logs and their purpose can be seen in the list below:

  • C:\Temp\Bootstrap.log - Output from the initial bootstrapping script that runs on Arc-Data-Client.
  • C:\Temp\DataServicesLogonScript.log - Output of DataServicesLogonScript.ps1, which configures the Azure Arc-enabled data services baseline capability.
  • C:\Temp\installCAPI.log - Output from the custom script extension that runs on Arc-Data-CAPI-MGMT and configures the Cluster API for Azure cluster and onboards it as an Azure Arc-enabled Kubernetes cluster. If you encounter ARM deployment issues with ubuntuCapi.json, review this log.

Screenshot showing the Temp folder with deployment logs

Cleanup

  • If you want to delete the entire environment, simply delete the deployment resource group from the Azure portal.

    Screenshot showing Azure resource group deletion
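
    Deleting the resource group can also be done from the CLI (the resource group name follows the earlier example):

    # Delete the resource group and everything in it, without waiting for completion
    az group delete --name Arc-Data-Demo --yes --no-wait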