The following Jumpstart scenario will guide you through enabling Azure Monitor Container insights and configuring all of its recommended metric alerts for an Azure Arc-enabled Kubernetes cluster.

In this scenario, you will connect the Azure Arc-enabled Kubernetes cluster to Azure Monitor Container insights by deploying the Azure Monitor cluster extension on your Kubernetes cluster, which starts collecting Kubernetes-related logs and telemetry. The recommended alerts are then enabled through an ARM template.

NOTE: This scenario assumes you have already deployed a Kubernetes cluster and connected it to Azure Arc. If you haven't, this repository offers you a way to do so in an automated fashion.

Kubernetes extensions are add-ons for Kubernetes clusters. The extensions feature on Azure Arc-enabled Kubernetes clusters lets you use Azure Resource Manager-based APIs, the Azure CLI, and the Azure portal to deploy extension components (Helm charts in the initial release), and it also provides lifecycle management capabilities such as automatic or manual extension version upgrades.
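Once the k8s-extension Azure CLI extension is installed (see the prerequisites below), you can, for example, list the extension instances deployed on a connected cluster. The resource names in the sketch below are placeholders:

# List the extension instances deployed on an Azure Arc-enabled Kubernetes cluster
az k8s-extension list --cluster-type connectedClusters --cluster-name <Azure Arc Cluster Name> --resource-group <Azure resource group name> --output table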

Prerequisites

  • Clone the Azure Arc Jumpstart repository

    git clone https://github.com/microsoft/azure_arc.git
    
  • Install or update Azure CLI to version 2.36.0 or above. Use the command below to check your currently installed version.

    az --version
    
  • Create Azure service principal (SP). To deploy this scenario, an Azure service principal assigned with an RBAC Contributor role is required:

    • “Contributor” - Required for provisioning Azure resources

      To create it, log in to your Azure account and run the command below (this can also be done in Azure Cloud Shell).

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "<Unique SP Name>" --role "Contributor" --scopes /subscriptions/$subscriptionId
      

      For example:

      az login
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "JumpstartArcK8s" --role "Contributor" --scopes /subscriptions/$subscriptionId
      

      Output should look like this:

      {
      "appId": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "displayName": "JumpstartArcK8s",
      "password": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX",
      "tenant": "XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
      }
      

      NOTE: The Jumpstart scenarios are designed with ease of use in mind while adhering to security-related best practices whenever possible. It is optional but highly recommended to scope the service principal to a specific Azure subscription and resource group, and to consider using a less privileged service principal account, as in the example below.
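      For example, a sketch of a more narrowly scoped service principal, limited to a single resource group (the resource group name is a placeholder):

      # Hypothetical example: scope the service principal to a single resource group instead of the whole subscription
      subscriptionId=$(az account show --query id --output tsv)
      az ad sp create-for-rbac -n "JumpstartArcK8s" --role "Contributor" --scopes /subscriptions/$subscriptionId/resourceGroups/<Azure resource group name>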

Automation Flow

For you to get familiar with the automation and deployment flow, below is an explanation:

  • User has deployed a Kubernetes cluster and has it connected as an Azure Arc-enabled Kubernetes cluster.

  • User is editing the environment variables in the shell script file (1-time edit), which will then be used throughout the extension deployment.

  • User is running the shell script. The script will use the extension management feature of Azure Arc to deploy the Azure Monitor cluster extension on the Azure Arc-enabled Kubernetes cluster and create all the recommended alerts.

  • User is verifying that the cluster is shown in Azure Monitor and that the extension and all the recommended alerts are deployed.

  • User is simulating an alert.

Create Azure Monitor cluster extensions instance

To create a new extension instance, we will use the k8s-extension create command while passing in values for the mandatory parameters. This scenario provides you with the automation to deploy the Azure Monitor cluster extension on your Azure Arc-enabled Kubernetes cluster.
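For reference, the extension instance the automation creates corresponds to a command along these lines (the cluster name, resource group, and Log Analytics workspace resource ID below are placeholders, and the exact invocation in the script may differ):

# Sketch of creating the Azure Monitor Containers extension instance on an Azure Arc-enabled cluster
az k8s-extension create --name azuremonitor-containers --cluster-type connectedClusters --cluster-name <Azure Arc Cluster Name> --resource-group <Azure resource group name> --extension-type Microsoft.AzureMonitor.Containers --configuration-settings logAnalyticsWorkspaceResourceID=<Log Analytics workspace resource ID>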

  • Before integrating the cluster with Azure Monitor, click on the “Extensions” tab for the connected Azure Arc cluster to show that the cluster is not currently being monitored by Azure Monitor.

    Screenshot showing Azure Portal with Azure Arc-enabled Kubernetes resource extensions

  • Navigate to the folder that has the deployment script.

  • Edit the environment variables in the script to match your environment parameters.

    • subscriptionId - Your Azure subscription ID
    • appId - Your Azure service principal application (client) ID
    • password - Your Azure service principal password
    • tenantId - Your Azure tenant ID
    • resourceGroup - Azure resource group name
    • arcClusterName - Azure Arc Cluster Name
    • azureLocation - Azure region
    • logAnalyticsWorkspace - Log Analytics Workspace Name
    • k8sExtensionName - Azure Monitor extension name, should be azuremonitor-containers
    • actionGroupName - Action Group for the Alerts
    • email - Email for the Action Group

    Screenshot parameter examples
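    As an illustration, the edited variable block might look similar to the following (all values are placeholders, and the exact variable layout in the script may differ):

    # Example values only - replace with your own environment details
    export subscriptionId='00000000-0000-0000-0000-000000000000'
    export appId='<Your service principal application (client) ID>'
    export password='<Your service principal password>'
    export tenantId='<Your Azure tenant ID>'
    export resourceGroup='Arc-K8s-Demo'
    export arcClusterName='Arc-K8s-Cluster'
    export azureLocation='eastus'
    export logAnalyticsWorkspace='arc-k8s-law'
    export k8sExtensionName='azuremonitor-containers'
    export actionGroupName='arc-k8s-alerts-ag'
    export email='user@contoso.com'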

  • After editing the variables, navigate to the script folder and run the script with the following command:

    sudo chmod +x azure_monitor_alerts.sh && . ./azure_monitor_alerts.sh
    

    NOTE: The extra dot is needed because the shell script exports environment variables, and these must be set in the same shell session as the rest of the commands.

    The script will:

    • Login to your Azure subscription using the service principal credentials
    • Add or Update your local connectedk8s and k8s-extension Azure CLI extensions
    • Create the Azure Monitor cluster extension instance
    • Create an action group and all recommended alerts
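    In essence, these steps correspond to commands along the following lines (a simplified sketch only; the script contains the exact invocations, and the "admin" email receiver name is just an example):

    # Simplified sketch of the script's main steps (not the literal script contents)
    az login --service-principal --username $appId --password $password --tenant $tenantId
    az extension add --name connectedk8s --upgrade
    az extension add --name k8s-extension --upgrade
    az monitor action-group create --name $actionGroupName --resource-group $resourceGroup --action email admin $email
    # ...followed by creating the azuremonitor-containers extension instance and the recommended alert rules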
  • Verify under the extensions tab of the Azure Arc-enabled Kubernetes cluster that the Azure Monitor cluster extension is correctly installed.

    Screenshot showing Azure Portal with Azure Arc-enabled Kubernetes resource extensions
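    Alternatively, you can check the extension's provisioning state from the CLI, for example:

    # Should return "Succeeded" once the extension is installed
    az k8s-extension show --name azuremonitor-containers --cluster-type connectedClusters --cluster-name $arcClusterName --resource-group $resourceGroup --query provisioningState --output tsv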

  • You can also verify the Azure Monitor agent pods by running the command below:

    kubectl get pod -n kube-system --kubeconfig <kubeconfig> | grep omsagent
    

    Screenshot extension pods on cluster

  • Verify under the Alert rules tab in the Alerts section of the Azure Arc-enabled Kubernetes cluster that the alert rules were created correctly.

    Screenshot showing Azure Portal with Azure Arc-enabled Kubernetes resource alerts rules
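    If you prefer the CLI, you can also list the metric alert rules created in the resource group, for example:

    # List the metric alert rules in the resource group
    az monitor metrics alert list --resource-group $resourceGroup --output table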

Simulate an alert

  • To verify that the recommended alerts are working properly, create the pod below to simulate an OOMKilledContainers alert. The container requests 50Mi of memory with a 100Mi limit, but the stress tool tries to allocate 250M, so the container is repeatedly OOM-killed:

    pod-test.yaml

    apiVersion: v1
    kind: Pod
    metadata:
      name: memory-demo
    spec:
      containers:
      - name: memory-demo-ctr
        image: polinux/stress
        resources:
          requests:
            memory: "50Mi"
          limits:
            memory: "100Mi"
        command: ["stress"]
        args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
    
  • Create the above file and run the following command to create the pod:

    kubectl apply -f pod-test.yaml --kubeconfig <kubeconfig>
    
  • In a few minutes an alert will be created, and you will see it in the Azure portal under the Alerts tab of your Azure Arc-enabled cluster.

    Screenshot Monitor alert
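  • You can also confirm from the cluster side that the container is being OOM-killed, for example:

    # The pod status should show OOMKilled (or CrashLoopBackOff as the container keeps exceeding its memory limit)
    kubectl get pod memory-demo --kubeconfig <kubeconfig>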

  • You will also receive an email like this:

    Screenshot Monitor alert email

Delete resources

Complete the following steps to clean up your environment. The commands below delete the extension instance, recommended alerts, action group and Log Analytics workspace.

export arcClusterName='<Azure Arc Cluster Name>'
export resourceGroup='<Azure resource group name>'
export logAnalyticsWorkspace='<Log Analytics Workspace Name>'
export actionGroupName='<Action Group for the Alerts>'
az k8s-extension delete --name azuremonitor-containers --cluster-type connectedClusters --cluster-name $arcClusterName --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentContainerCPU --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentContainerWorkingSetMemory --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentNodeCPU --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentNodeDiskUsage --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentNodeNotReady --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentNodeWorkingSetMemory --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentOOMKilledContainers --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentPodsReady --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentFailedPodCounts --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentPersistentVolumeUsage --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentRestartingContainerCount --resource-group $resourceGroup
az monitor metrics alert delete --name alertDeploymentCompletedJobCount --resource-group $resourceGroup
az monitor scheduled-query delete --name alertDailyDataCapBreachedForWorkspace --resource-group $resourceGroup
az monitor action-group delete --name $actionGroupName --resource-group $resourceGroup
az monitor log-analytics workspace delete --resource-group $resourceGroup --workspace-name $logAnalyticsWorkspace
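If you created the test pod to simulate an alert, you may also want to remove it:

kubectl delete -f pod-test.yaml --kubeconfig <kubeconfig>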