Advanced Azure Kubernetes Service (AKS) microservices architecture

This reference architecture describes several configurations to consider when you run microservices on Azure Kubernetes Service (AKS). This article discusses network policy configuration, pod autoscaling, and distributed tracing across a microservice-based application.

This architecture builds on the AKS baseline architecture, which Microsoft recommends as the starting point for AKS infrastructure. The AKS baseline describes infrastructural features like Microsoft Entra Workload ID, ingress and egress restrictions, resource limits, and other secure AKS infrastructure configurations. These features aren't covered in this article. We recommend that you become familiar with the AKS baseline architecture before you proceed with the microservices content.

A reference implementation of this architecture is available on GitHub.

Architecture

Architecture diagram that shows a hub-spoke network that has two peered virtual networks and the Azure resources that this implementation uses.

Download a Visio file of this architecture.

If you prefer to start with a more basic microservices example on AKS, see Microservices architecture on AKS.

Workflow

This request flow implements the Publisher-Subscriber, Competing Consumers, and Gateway Routing cloud design patterns. The following workflow corresponds to the previous diagram:

  1. An HTTPS request is submitted to schedule a drone pickup. The request passes through Azure Application Gateway into the ingestion web application, which runs as an in-cluster microservice in AKS.

  2. The ingestion web application produces a message and sends it to the Azure Service Bus message queue.

  3. The back-end system assigns a drone and notifies the user. The workflow microservice does the following tasks:

    • Consumes message information from the Service Bus message queue
    • Sends an HTTPS request to the delivery microservice, which passes data to Azure Cache for Redis external data storage
    • Sends an HTTPS request to the drone scheduler microservice
    • Sends an HTTPS request to the package microservice, which passes data to MongoDB external data storage
  4. To check the delivery status, the client sends an HTTPS GET request. This request passes through Application Gateway into the delivery microservice.

  5. The delivery microservice reads data from Azure Cache for Redis.

Components

  • AKS provides a managed Kubernetes cluster. When you use AKS, Azure manages the Kubernetes API server. The cluster operator can access and manage the Kubernetes nodes or node pools.

  • Azure Virtual Network provides isolated and highly secure environments for running virtual machines (VMs) and applications. This reference architecture uses a peered hub-spoke virtual network topology. The hub virtual network hosts the Azure Firewall and Azure Bastion subnets. The spoke virtual network contains the AKS system and user node pool subnets and the Application Gateway subnet.

  • Azure Private Link allocates private IP addresses so that Azure Container Registry and Azure Key Vault can be reached over the Microsoft backbone network. Platform as a service (PaaS) solutions like Container Registry and Key Vault are accessed through a private endpoint from the AKS system and user node pool subnet.

  • Application Gateway with web application firewall (WAF) load balances web traffic to the web application. In this architecture, Application Gateway exposes the ingestion microservice as a public endpoint.

  • Azure Firewall is a cloud-native, intelligent network firewall security service that provides threat protection for your Azure cloud workloads. The firewall allows only approved services and fully qualified domain names (FQDNs) as egress traffic. In this architecture, Azure Firewall controls outbound communications from microservices to resources outside the virtual network.

External storage and other components

  • Key Vault stores and manages security keys, secret values, and certificates for Azure services. In this architecture, Azure key vaults store credentials for Azure Cosmos DB and Azure Cache for Redis.

  • Container Registry stores private container images that can be run in the AKS cluster. AKS authenticates with Container Registry by using its Microsoft Entra managed identity. You can also use other container registries like Docker Hub. In this architecture, Container Registry stores the container images for microservices.

  • Azure Cosmos DB is a fully managed NoSQL, relational, and vector database. Microservices are typically stateless and write their state to external data stores. Azure Cosmos DB has open-source APIs for MongoDB, PostgreSQL, and Cassandra. In this architecture, Azure Cosmos DB and Azure Cosmos DB for MongoDB serve as data stores for each microservice.

  • Service Bus provides reliable cloud messaging as a service and simple hybrid integration. Service Bus supports asynchronous messaging patterns that are common in microservices applications. In this architecture, Service Bus serves as the asynchronous queueing layer between the ingestion and workflow microservices.

  • Azure Cache for Redis adds a caching layer to the application architecture to improve speed and performance for heavy-traffic loads. In this architecture, the delivery microservice uses Azure Cache for Redis as the state store and side cache.

  • Azure Monitor collects and stores metrics and logs, including application telemetry and Azure platform and service metrics. In this architecture, you can use this data to monitor the application, set up alerts and dashboards, and perform root cause analysis of failures.

Other operations support system components

  • Helm is a package manager for Kubernetes that bundles Kubernetes objects into a single unit that you can publish, deploy, version, and update. In this architecture, Helm packages and deploys the microservices to the AKS cluster.

  • The Key Vault provider for Secrets Store CSI Driver integrates a key vault as a secret store with an AKS cluster through a CSI volume. In this architecture, key vault secrets are mounted as a volume in microservice containers so that the microservices can retrieve credentials for Azure Cosmos DB, Azure Cache for Redis, and Service Bus.

  • Flux is an open and extensible continuous delivery solution for Kubernetes that enables GitOps in AKS.

Alternatives

Instead of the managed NGINX ingress with the application routing add-on, you can use alternatives like Application Gateway for Containers and the Istio ingress gateway add-on. For a comparison of ingress options in AKS, see Ingress in AKS. Application Gateway for Containers is an evolution of the Application Gateway Ingress Controller and provides extra features such as traffic splitting and weighted round-robin load balancing.

You can use ArgoCD as the GitOps tool instead of Flux v2. Both Flux v2 and ArgoCD are available as cluster extensions.

Instead of storing credentials for Azure Cosmos DB and Azure Cache for Redis in key vaults, we recommend that you use managed identities to authenticate because password-free authentication mechanisms are more secure. For more information, see Use managed identities to connect to Azure Cosmos DB from an Azure VM and Authenticate a managed identity by using Microsoft Entra ID to access Service Bus resources. Azure Cache for Redis also supports authentication by using managed identities.

Scenario details

The example Fabrikam Drone Delivery Shipping App shown in the preceding diagram implements the architectural components and practices that this article describes. In this example, Fabrikam, Inc., a fictitious company, manages a fleet of drone aircraft. Businesses register with the service, and users can request a drone to pick up goods for delivery. When a customer schedules a pickup, the back-end system assigns a drone and notifies the user with an estimated delivery time. While the delivery is in progress, the customer can track the drone's location and see a continuously updated estimated time of arrival.

Recommendations

You can apply the following recommendations to most scenarios. Follow these recommendations unless you have a specific requirement that overrides them.

Managed NGINX ingress with application routing add-on

API Gateway Routing is a general microservices design pattern. An API gateway functions as a reverse proxy that routes requests from clients to microservices. The Kubernetes ingress resource and the ingress controller handle most API gateway functionality by performing the following actions:

  • Routing client requests to the correct back-end services to provide a single endpoint for clients and help decouple clients from services

  • Aggregating multiple requests into a single request to reduce chatter between the client and the back end

  • Offloading functionality like Secure Sockets Layer (SSL) termination, authentication, IP address restrictions, and client rate-limiting or throttling from the back-end services

Ingress controllers simplify traffic ingestion into AKS clusters, improve safety and performance, and save resources. This architecture uses the managed NGINX ingress with the application routing add-on for ingress control.

We recommend that you use the ingress controller with an internal (private) IP address and an internal load balancer, and that you integrate it with Azure Private DNS zones for host-name resolution of microservices. Configure the private IP address or host name of the ingress controller as the back-end pool address in Application Gateway. Application Gateway receives traffic on the public endpoint, performs WAF inspections, and routes the traffic to the ingress controller's private IP address.

You should configure the ingress controller with a custom domain name and SSL certificate so that traffic is encrypted end to end. Application Gateway receives traffic on the HTTPS listener. After WAF inspections, Application Gateway routes traffic to the HTTPS endpoint of the ingress controller. Configure all microservices with custom domain names and SSL certificates so that communication between microservices within the AKS cluster is also encrypted.
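
As an illustration, the following sketch shows what an ingress resource for the delivery microservice might look like when you use the managed NGINX ingress with the application routing add-on. The host name, namespace, service name, and Key Vault certificate URI are placeholders, not values from the reference implementation:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: delivery-ingress
  namespace: backend-dev
  annotations:
    # Assumption: the add-on pulls the ingress SSL certificate from a key vault.
    kubernetes.azure.com/tls-cert-keyvault-uri: https://<key-vault-name>.vault.azure.net/certificates/delivery-tls
    # Re-encrypt to the back-end service so that in-cluster traffic also uses SSL.
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  ingressClassName: webapprouting.kubernetes.azure.com
  tls:
  - hosts:
    - delivery.example.com
    secretName: keyvault-delivery-ingress
  rules:
  - host: delivery.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: delivery
            port:
              number: 443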

Multitenant workloads or a single cluster that supports development and testing environments might require more ingress controllers. The application routing add-on supports advanced configurations and customizations, including multiple ingress controllers within the same AKS cluster and using annotations to configure ingress resources.

Zero Trust network policies

Network policies specify how AKS pods are allowed to communicate with each other and with other network endpoints. By default, all ingress and egress traffic is allowed to and from pods. When you design how your microservices communicate with each other and with other endpoints, consider following a Zero Trust principle, where access to any service, device, application, or data repository requires explicit configuration.

One strategy to implement a Zero Trust policy is to create a network policy that denies all ingress and egress traffic to all pods within the target namespace. The following example shows a deny-all policy that applies to all pods in the backend-dev namespace.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: backend-dev
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

After a restrictive policy is in place, begin to define specific network rules to allow traffic into and out of each pod in the microservice. In the following example, the network policy is applied to any pod in the backend-dev namespace that has a label that matches app.kubernetes.io/component: backend. The policy denies any traffic unless it's sourced from a pod that has a label that matches app.kubernetes.io/part-of: dronedelivery.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: package-v010-dev-np-allow-ingress-traffic
  namespace: backend-dev
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/part-of: dronedelivery
    ports:
    - port: 80
      protocol: TCP
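
Egress follows the same pattern. As an illustrative sketch, the following policy allows the same back-end pods to resolve DNS through kube-system while the deny-all policy continues to block all other egress. The policy name is hypothetical:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-dev-np-allow-dns-egress
  namespace: backend-dev
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: backend
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    # Allow DNS lookups only.
    - port: 53
      protocol: UDP
    - port: 53
      protocol: TCP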

For more information about Kubernetes network policies and more examples of potential default policies, see Network policies in the Kubernetes documentation.

Azure provides three network policy engines for enforcing network policies:

  • Cilium for AKS clusters that use Azure CNI Powered by Cilium
  • Azure network policy manager
  • Calico, an open-source network and network security solution

We recommend that you use Cilium as the network policy engine.

Resource quotas

Administrators use resource quotas to reserve and limit resources across a development team or project. You can set resource quotas on a namespace and use them to set limits on the following resources:

  • Compute resources, such as CPU and memory, or GPUs
  • Storage resources, including the number of volumes or amount of disk space for a given storage class
  • Object count, such as the maximum number of secrets, services, or jobs that can be created

After the cumulative total of resource requests or limits exceeds the assigned quota, no further deployments succeed.

Resource quotas ensure that the total set of pods assigned to the namespace can't exceed the namespace's resource quota. For example, the front-end services can't consume all of the resources and starve the back-end services, or vice versa.

When you define resource quotas, all pods created in the namespace must provide limits or requests in their pod specifications. If they don't provide these values, the deployment is rejected.
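
The quota itself is a standard Kubernetes ResourceQuota object that's scoped to a namespace. The following minimal sketch uses the backend-dev namespace from the earlier examples; the limit values are illustrative:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: backend-dev-quota
  namespace: backend-dev
spec:
  hard:
    # Cumulative requests and limits across all pods in the namespace.
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    # Object count limit.
    pods: "20"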

The following example shows a container specification that sets resource requests and limits. These values count against the namespace quota:

resources:
  requests:
    cpu: 100m
    memory: 350Mi
  limits:
    cpu: 200m
    memory: 500Mi

For more information about resource quotas, see Resource Quotas in the Kubernetes documentation.

Autoscaling

Kubernetes supports autoscaling to increase the number of pods allocated to a deployment or to increase the nodes in the cluster to increase the total available compute resources. Autoscaling is a self-correcting autonomous feedback system. You can scale pods and nodes manually, but autoscaling minimizes the chances of services reaching resource limits at high loads. An autoscaling strategy must account for both pods and nodes.

Cluster autoscaling

The Cluster Autoscaler (CA) scales the number of nodes. If pods can't be scheduled because of resource constraints, the cluster autoscaler provisions more nodes. You define a minimum number of nodes to keep the AKS cluster and your workloads operational and a maximum number of nodes for heavy traffic. The CA checks every few seconds for pending pods or empty nodes and scales the AKS cluster appropriately.

The following example shows the CA configuration from the Bicep template:

autoScalerProfile: {
  'scan-interval': '10s'
  'scale-down-delay-after-add': '10m'
  'scale-down-delay-after-delete': '20s'
  'scale-down-delay-after-failure': '3m'
  'scale-down-unneeded-time': '10m'
  'scale-down-unready-time': '20m'
  'scale-down-utilization-threshold': '0.5'
  'max-graceful-termination-sec': '600'
  'balance-similar-node-groups': 'false'
  expander: 'random'
  'skip-nodes-with-local-storage': 'true'
  'skip-nodes-with-system-pods': 'true'
  'max-empty-bulk-delete': '10'
  'max-total-unready-percentage': '45'
  'ok-total-unready-count': '3'
}

The following lines in the Bicep template set example minimum and maximum nodes for the cluster autoscaler:

minCount: 2
maxCount: 5

Horizontal pod autoscaling

The Horizontal Pod Autoscaler (HPA) scales pods based on observed CPU, memory, or custom metrics. To configure horizontal pod scaling, you specify target metrics and the minimum and maximum number of replicas in the Kubernetes deployment pod specification. Load test your services to determine these numbers.

CA and HPA work together, so enable both autoscaler options in your AKS cluster. HPA scales the application, while CA scales the infrastructure.

The following example sets resource metrics for HPA:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: delivery-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: delivery
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

HPA looks at actual resources consumed or other metrics from running pods. The CA provisions nodes for pods that aren't scheduled yet. As a result, CA looks at the requested resources, as specified in the pod specification. Use load testing to fine-tune these values.

For more information, see Scaling options for applications in AKS.

Vertical pod autoscaling

The Vertical Pod Autoscaler (VPA) automatically adjusts the CPU and memory requests for your pods to match the usage patterns of your workloads. When it's configured, the VPA automatically sets resource requests and limits on containers for each workload based on past usage. The VPA makes CPU and memory available for other pods and helps ensure effective utilization of your AKS clusters.

In this architecture, VPA increases the CPU and memory requests and limits for microservices based on their past usage. For example, if the workflow microservice consumes more CPU compared to other microservices, the VPA can monitor this usage and increase the CPU limits for the workflow microservice.
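
As a sketch of how this configuration might look, assuming that the VPA add-on is enabled on the cluster and that the workflow microservice runs as a deployment named workflow, a VPA object resembles the following example:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: workflow-vpa
  namespace: backend-dev
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: workflow
  updatePolicy:
    # Auto lets the VPA apply its recommendations by re-creating pods.
    updateMode: "Auto"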

Kubernetes event-driven autoscaling

The Kubernetes Event-driven Autoscaler (KEDA) add-on enables event-driven autoscaling to scale your microservices to meet demand in a sustainable and cost-efficient manner. For example, KEDA can scale microservices out when the number of messages in the Service Bus queue surpasses a specific threshold.

In the Fabrikam drone delivery example, KEDA scales out the workflow microservice depending on the Service Bus queue depth and based on the ingestion microservice output. For a list of KEDA scalers for Azure services, see Integrations with KEDA on AKS.
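
As an illustrative sketch, a KEDA ScaledObject for the workflow microservice might resemble the following example. The queue name, Service Bus namespace, threshold, and TriggerAuthentication reference are hypothetical values, not taken from the reference implementation:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: workflow-scaledobject
  namespace: backend-dev
spec:
  scaleTargetRef:
    name: workflow
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: <queue-name>
      namespace: <service-bus-namespace>
      # Target roughly five waiting messages per replica.
      messageCount: "5"
    authenticationRef:
      # References a TriggerAuthentication object (not shown) that uses a
      # managed identity or connection string to reach Service Bus.
      name: workflow-trigger-auth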

Health probes

Kubernetes load balances traffic to pods that match a label selector for a service. Only pods that started successfully and are healthy receive traffic. If a container crashes, Kubernetes removes the pod and schedules a replacement.

Kubernetes defines three types of health probes that a pod can expose:

  • The readiness probe tells Kubernetes whether the pod is ready to accept requests.

  • The liveness probe tells Kubernetes whether a pod should be removed and a new instance started.

  • The startup probe tells Kubernetes whether the pod is started.

The liveness probes handle pods that are still running but are unhealthy and should be recycled. For example, if a container serving HTTP requests hangs, the container doesn't crash, but it stops serving requests. The HTTP liveness probe stops responding, which alerts Kubernetes to restart the pod.

Sometimes, a pod might not be ready to receive traffic, even though the pod started successfully. For example, the application running in the container might be performing initialization tasks. The readiness probe indicates whether the pod is ready to receive traffic.

Microservices should expose endpoints in their code that facilitate health probes, with delay and timeout tailored specifically to the checks that they perform. The HPA formula relies on the pod's ready phase, so it's crucial that health probes exist and are accurate.
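
As a minimal sketch, a container specification for the delivery microservice might declare the three probes as follows. The port, paths, and timing values are illustrative and should be tuned to the checks that the service performs:

containers:
- name: delivery
  image: <registry-name>.azurecr.io/delivery:0.1.0
  ports:
  - containerPort: 8080
  startupProbe:
    # Give a slow-starting container up to 150 seconds before the
    # liveness probe takes over.
    httpGet:
      path: /healthz
      port: 8080
    failureThreshold: 30
    periodSeconds: 5
  livenessProbe:
    # Restart the pod if the service stops responding.
    httpGet:
      path: /healthz
      port: 8080
    periodSeconds: 10
    timeoutSeconds: 2
  readinessProbe:
    # Remove the pod from load balancing until it reports ready.
    httpGet:
      path: /ready
      port: 8080
    periodSeconds: 5
    timeoutSeconds: 2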

Monitoring

In a microservices application, application performance management (APM) monitoring is crucial for detecting anomalies, diagnosing problems, and quickly understanding the dependencies between services. Application Insights, a feature of Azure Monitor, provides APM monitoring for live applications written in .NET Core, Node.js, Java, and many other application languages.

Azure provides various mechanisms for monitoring microservice workloads:

  • Managed Prometheus for metric collection. Use Prometheus to monitor and alert on the performance of infrastructure and workloads.

  • Azure Monitor managed service for Prometheus and container insights work together for complete monitoring of your Kubernetes environment.

  • Managed Grafana for cluster and microservice visualization.

To contextualize service telemetry in Kubernetes, integrate Azure Monitor telemetry with AKS to collect metrics from controllers, nodes, containers, and container and node logs. You can integrate Application Insights with AKS without code changes.
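
In a Bicep template, these monitoring features map to properties on the AKS cluster resource. The following fragment is a sketch, assuming an existing Log Analytics workspace resource named logAnalyticsWorkspace:

properties: {
  // Managed Prometheus metric collection.
  azureMonitorProfile: {
    metrics: {
      enabled: true
    }
  }
  // Container insights through the monitoring add-on.
  addonProfiles: {
    omsagent: {
      enabled: true
      config: {
        logAnalyticsWorkspaceResourceID: logAnalyticsWorkspace.id
      }
    }
  }
}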

Considerations

These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that you can use to improve the quality of a workload. For more information, see Well-Architected Framework.

Security

Security provides assurances against deliberate attacks and the misuse of your valuable data and systems. For more information, see Design review checklist for Security.

Consider the following points when you plan for security.

  • Use deployment safeguards in the AKS cluster. Deployment safeguards enforce Kubernetes best practices in your AKS cluster through Azure Policy controls.

  • Integrate security scanning into the microservice build and deployment pipelines. Manage your DevOps environment by using Microsoft Defender for Cloud DevOps security. Use agentless code scanning and run static code analysis tools as part of continuous integration and continuous deployment (CI/CD) pipelines so that you can identify and address the microservice code vulnerabilities as part of the build and deployment processes.

  • An AKS pod authenticates itself by using a workload identity that's stored in Microsoft Entra ID. You should use a workload identity because it doesn't require a client secret.

  • When you use managed identities, the application can quickly get Azure Resource Manager OAuth 2.0 tokens when it runs. It doesn't need passwords or connection strings. In AKS, you can assign identities to individual pods by using Workload ID.

  • Each service in the microservice application should be assigned a unique workload identity to facilitate least-privileged role-based access control (RBAC) assignments. You should assign identities only to services that require them.

  • In cases where an application component requires Kubernetes API access, ensure that application pods are configured to use a service account with appropriately scoped API access. For more information, see Manage Kubernetes service accounts.

  • Not all Azure services support using Microsoft Entra ID for data plane authentication. To store credentials or application secrets for those services, for non-Microsoft services, or for API keys, use Key Vault. Key Vault provides centralized management, access control, encryption at rest, and auditing of all keys and secrets.

  • In AKS, you can mount one or more secrets from Key Vault as a volume. The pod can then read the Key Vault secrets just like a regular volume. For more information, see Use the Key Vault provider for Secrets Store CSI Driver in an AKS cluster. We recommend that you maintain a separate key vault for each microservice, which is the approach that the reference implementation takes. A sketch of this configuration appears after this list.

  • If a microservice needs to communicate with resources outside the cluster, such as external URLs, control that access through Azure Firewall. If the microservice doesn't need to make any outbound calls, use a network-isolated cluster.

  • Enable Microsoft Defender for Containers to provide security posture management, vulnerability assessment for microservices, run-time threat protection, and other security features.
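
The following sketch shows what a SecretProviderClass for the delivery microservice's key vault might look like. The identity client ID, key vault name, tenant ID, and secret name are placeholders:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: delivery-secrets
  namespace: backend-dev
spec:
  provider: azure
  parameters:
    # Authenticate with the microservice's Microsoft Entra workload identity.
    clientID: <workload-identity-client-id>
    keyvaultName: <delivery-key-vault-name>
    tenantId: <tenant-id>
    objects: |
      array:
        - |
          objectName: redis-connection-string
          objectType: secret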

Cost Optimization

Cost Optimization focuses on ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Design review checklist for Cost Optimization.

  • The Cost Optimization section in the Well-Architected Framework describes cost considerations.

  • Use the Azure pricing calculator to estimate costs for your specific scenario.

  • In the Free tier, AKS has no costs associated with deployment, management, and operations of the Kubernetes cluster. You only pay for the VM instances, storage, and networking resources that the cluster consumes. Cluster autoscaling can significantly reduce the cost of the cluster by removing empty or unused nodes.

  • Consider using the Free tier of AKS for development workloads, and use the Standard and Premium tiers for production workloads.

  • Consider enabling AKS cost analysis for granular cluster infrastructure cost allocation by Kubernetes-specific constructs.

Operational Excellence

Operational Excellence covers the operations processes that deploy an application and keep it running in production. For more information, see Design review checklist for Operational Excellence.

Consider the following points when you plan for manageability.

  • Manage the AKS cluster infrastructure via an automated deployment pipeline. The reference implementation for this architecture provides a GitHub Actions workflow that you can reference when you build your pipeline. A minimal sketch follows this list.

  • The workflow file deploys the infrastructure only, not the workload, into the already-existing virtual network and Microsoft Entra configuration. Deploying the infrastructure and the workload separately lets you address distinct life cycle and operational concerns.

  • Consider your workflow as a mechanism to deploy to another region if there's a regional failure. Build the pipeline so that you can deploy a new cluster in a new region with parameter and input changes.
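
The following sketch outlines such a pipeline, assuming federated (OIDC) credentials for the login action. The template path, resource group, and parameters are placeholders, not the reference implementation's actual workflow:

name: deploy-aks-infra

on:
  workflow_dispatch:

permissions:
  id-token: write   # Federated (OIDC) login, no stored secrets.
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Deploy cluster infrastructure
        run: |
          az deployment group create \
            --resource-group <resource-group-name> \
            --template-file cluster/aks-cluster.bicep \
            --parameters location=<region>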

Performance Efficiency

Performance Efficiency refers to your workload's ability to scale to meet user demands efficiently. For more information, see Design review checklist for Performance Efficiency.

Consider the following points when you plan for scalability.

  • Don't combine autoscaling and imperative or declarative management of the number of replicas. Users and an autoscaler both attempting to modify the number of replicas might cause unexpected behavior. When HPA is enabled, reduce the number of replicas to the minimum number that you want to be deployed.

  • A side effect of pod autoscaling is that pods might be created or evicted frequently as the application scales in or scales out. To mitigate these effects, perform the following actions:

    • Use readiness probes to let Kubernetes know when a new pod is ready to accept traffic.
    • Use pod disruption budgets to limit how many pods can be evicted from a service at a time.
  • If a microservice generates a large number of outbound flows, consider using a NAT gateway to avoid source network address translation (SNAT) port exhaustion.

  • Multitenant or other advanced workloads might have node pool isolation requirements that demand more and likely smaller subnets. For more information, see Add node pools with unique subnets. Organizations have different standards for their hub-spoke implementations. Be sure to follow your organizational guidelines.

  • Consider using Azure CNI with overlay networking to conserve network address space, as shown in the sketch after this list.
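
As a sketch, the network profile for Azure CNI overlay networking with the Cilium data plane resembles the following Bicep fragment. The pod CIDR value is illustrative:

networkProfile: {
  networkPlugin: 'azure'
  // Overlay mode assigns pod IP addresses from a private CIDR that's
  // separate from the virtual network address space.
  networkPluginMode: 'overlay'
  // Cilium provides the data plane and network policy engine.
  networkDataplane: 'cilium'
  podCidr: '192.168.0.0/16'
}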

Next steps