Katonic MLOps Platform on Azure

This guide describes how to install, operate, administer, and configure the Katonic Platform in your own Azure Kubernetes cluster. This content is applicable to Katonic users with self-installation licenses.

Hardware Configurations

This configuration is intended for high-availability (HA) deployments and performance testing. It is sized to support real-time execution of analytics, machine learning (ML), and artificial intelligence (AI) applications in a production pipeline.

Katonic on Azure

Katonic can run on a Kubernetes cluster provided by Azure Kubernetes Service. When running on AKS, the Katonic architecture uses Azure resources to fulfil the Katonic MLOps platform requirements as follows:

[Figure: Architecture diagram]

Runtime platform:

A: AKS cluster deployed in 2 Availability Zones (AZ), version 1.29, Node/instances: Virtual Machine Scale Set.

B: Platform nodes: Node pool (min 2) Standard_DS3_v2

C: Compute nodes: Node pool (Variable) Standard_D8s_v3

D: GPU compute nodes: Node pool (Variable) Standard_NC6s_v3

Storage:

A: Shared filesystem and datasets: Azure Storage Account

B: Backups: Azure Storage Account

C: Environment and model image: Azure Container Registry

Networking:

A: Ingress Load Balancer: Standard SKU Azure Load Balancer

B: Cluster network: Azure Virtual Network with a subnet with 65536 IP addresses (/16 subnet mask).

When running on AKS, Katonic uses Azure resources to fulfil the cluster requirements as follows:

  • Kubernetes control is handled by the AKS control plane with managed Kubernetes masters

  • The AKS node pool labeled katonic.ai/node-pool=platform hosts the Katonic platform

  • Additional AKS node pools provide compute (labeled katonic.ai/node-pool=compute) and GPU (labeled katonic.ai/node-pool=gpu) nodes for user workloads

  • An Azure storage account stores Katonic blob data and datasets

  • The kubernetes.io/azure-disk provisioner is used to create persistent volumes for Katonic executions

  • Ingress to the Katonic application is handled by an SSL-terminating Application Gateway that points to a Kubernetes load balancer
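As an illustration of how the node pool labels above steer scheduling, the following is a minimal sketch of a pod pinned to the compute pool. The pod name, image, and command are hypothetical placeholders, not part of Katonic; only the katonic.ai/node-pool label comes from this guide. The platform normally schedules user workloads itself, so this is only useful for checking that the labels are in place.

```yaml
# Hypothetical test pod; name, image, and command are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-compute-workload
spec:
  # Schedule only onto nodes that carry the compute node-pool label.
  nodeSelector:
    katonic.ai/node-pool: compute
  containers:
    - name: main
      image: busybox:1.36
      command: ["sh", "-c", "echo running on a compute node && sleep 3600"]
```

Replacing the label value with gpu or platform would target the other pools in the same way.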

Setting up an AKS cluster for the Katonic Platform

This section describes how to configure an Azure AKS cluster for use with Katonic. When configuring an AKS cluster for Katonic, you must be familiar with the following Azure services:

  • Azure Kubernetes Service (AKS)
  • Virtual Network (VNet)
  • Virtual Machines and Disks
  • Azure File System storage
  • Azure Blob Storage

Additionally, a basic understanding of Kubernetes concepts like node pools, network CNI, storage classes, autoscaling, and Docker will be useful when deploying the cluster.

Service quotas

Azure maintains default service quotas for each of the services listed above. You can check the default quotas and manage your quotas on the Quotas page of the Azure portal.

Create Azure Kubernetes Service (AKS)

By default, the Katonic installer creates the AKS cluster. If you create the AKS cluster yourself, first create a new, separate resource group, and then create the cluster in that resource group.

Dynamic block storage

AKS clusters come equipped with several kubernetes.io/azure-disk backed storage classes by default. Katonic recommends the use of Standard SSD disks for better input and output performance. The Standard SSD-based storage class (kfs) is created by default by the Katonic installer.

If you are creating the cluster yourself, you need to create a storage class named kfs. Use the following YAML to create it.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  name: kfs
parameters:
  skuname: StandardSSD_LRS
provisioner: disk.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true

Note: Make sure kfs is the only default storage class. Remove the default designation from any other storage class.
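To show how a workload would consume the kfs storage class, here is a minimal sketch of a PersistentVolumeClaim. The claim name and the requested size are hypothetical; only the storage class name kfs comes from this guide.

```yaml
# Hypothetical claim; the name and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-kfs-claim
spec:
  # Request a volume from the Standard SSD-backed class.
  storageClassName: kfs
  accessModes:
    - ReadWriteOnce   # an Azure managed disk attaches to one node at a time
  resources:
    requests:
      storage: 10Gi
```

Because the storage class uses volumeBindingMode: WaitForFirstConsumer, a claim like this stays Pending until a pod that mounts it is scheduled.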

Dynamic shared storage

AKS clusters come equipped with an Azure file storage class by default. Katonic recommends the use of that Azure file system storage class for better input and output performance.

The Katonic installer has an optional parameter, Shared Storage.create, that creates a kfs-shared storage class based on the Azure file system for the Katonic platform.

If you are creating the cluster yourself and want to use shared storage, you need to create an Azure file system-based storage class. Use the following YAML to create it.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kfs-shared
parameters:
  skuName: Standard_LRS
provisioner: file.csi.azure.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
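As a sketch of why shared storage differs from block storage, the claim below requests a volume that several pods on different nodes can mount at once. It assumes a storage class named kfs-shared backed by the Azure file system exists, as described in this section; the claim name and size are hypothetical.

```yaml
# Hypothetical claim; the name and size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-shared-claim
spec:
  # Assumes the Azure file system-based class described above.
  storageClassName: kfs-shared
  accessModes:
    - ReadWriteMany   # file shares can be mounted read-write by many nodes
  resources:
    requests:
      storage: 50Gi
```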

Domain

Katonic must be configured to serve from a specific FQDN. To serve Katonic securely over HTTPS, you will also need an SSL certificate that covers the chosen name. Record the FQDN for use when installing Katonic.

By default, all versions of the Katonic Platform can use the .katonic.ai domain. If you have your own domain, you can use it instead with any version of the Katonic Platform.

Resources Provisioned Post-Installation

When the platform is installed, the following resources are created. Take this into account when selecting your installation configuration.

| SR NO. | TYPE | AMOUNT | WHEN | NOTES |
|---|---|---|---|---|
| 1 | Network interface | 1 per node | Always | |
| 2 | OS boot disk (Azure managed disk) | 1 per node | Always | |
| 3 | Public IP address | 1 per node | The platform has public IP addresses. | |
| 4 | VNet | 1 | The platform is deployed to a new VNet. | |
| 5 | Network security group | 1 | Always | See Network Security Groups Configuration (Azure). |
| 6 | AKS Cluster | 1 | When AKS is used as the application cluster | Version 1.29 |
| 7 | Azure File System | 1 | When you enable shared storage while installing Katonic platform. | |

Kubernetes (AKS) version

Version 5.0.9 of the Katonic platform has been validated with Kubernetes (AKS) version 1.29 and above.

Network plugin

Katonic relies on Kubernetes network policies to manage secure communication between pods in the cluster. Network policies are implemented by the network plugin, so your cluster must use a networking solution that supports NetworkPolicy, such as Calico.
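For readers unfamiliar with NetworkPolicy, the following is a minimal sketch of the kind of object the network plugin has to enforce. It is not one of Katonic's own policies; the policy name and namespace are hypothetical. It allows ingress to every pod in the namespace only from other pods in that same namespace.

```yaml
# Hypothetical policy; the name and namespace are placeholders.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-allow-same-namespace
  namespace: example-namespace
spec:
  # An empty selector applies the policy to all pods in the namespace.
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        # An empty podSelector here matches pods in this namespace only.
        - podSelector: {}
```

On a cluster whose network plugin does not support NetworkPolicy, an object like this is accepted by the API server but has no effect, which is why the plugin choice matters.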

You must ensure the subnets you use for your cluster have CIDR ranges of sufficient size, as every pod deployed in the cluster will be assigned an IP address from the subnet. Katonic recommends at least a /23 CIDR for the cluster.
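As a quick check on these sizes, the number of IPv4 addresses in a subnet with prefix length n is 2^(32-n). The worked arithmetic below covers the recommended minimum (/23) and the /16 subnet listed under Networking. The subtraction of five addresses reflects the addresses Azure reserves in every subnet.

```latex
N(n) = 2^{32-n}
\qquad
N(23) = 2^{9} = 512
\qquad
N(16) = 2^{16} = 65536

N_{\text{usable}}(23) = 512 - 5 = 507
```

A /23 subnet therefore leaves roughly 500 addresses to share between nodes and pods.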

The Katonic-hosting cluster should use the default network plugin created when AKS is deployed.

Data Visualisation

  • Katonic MLOps platform 5.0.9 includes Superset version 2.0.1 for data visualization.

  • You require an additional DNS entry if you are installing Superset.

    Example:

Connectors

  • Katonic MLOps platform 5.0.9 includes Airbyte version 0.40.32 for connectors.

  • You require an additional DNS entry if you are installing Airbyte.

    Example:

Katonic Platform Installation

Installation of the Katonic platform has been segmented based on product. When you click the link, you will be redirected to the installation process documentation.