Scale the Istio Control Plane

Istio has two major components:

  • The proxy is responsible for handling the incoming and outgoing traffic of a Workload, and is injected as a sidecar into the Pods of every Workload.
  • The Control Plane is responsible for coordinating the proxies by sending them the latest configuration.

This section focuses on the performance characteristics of the Control Plane. For details on the proxies, see Scale the Istio proxy sidecar of a workload.

Find the Control Plane

By default, Service Mesh Manager installs the Control Plane into the istio-system namespace:

kubectl get pods -n istio-system

The output should be similar to:

NAME                                                    READY   STATUS    RESTARTS   AGE
istio-meshexpansion-gateway-cp-v115x-5d647494bb-qnn8b   1/1     Running   0          3h57m
istiod-cp-v115x-749c46569-ld8fn                         1/1     Running   0          3h57m

From this list, the istiod-cp-v115x-749c46569-ld8fn Pod is the Istio Control Plane for version 1.15.x (v115x). The other Pod provides cross-cluster connectivity.

In practice, these Pods are provisioned from an IstioControlPlane resource in the same namespace:

kubectl get istiocontrolplanes -A

The output should be similar to:

NAMESPACE      NAME       MODE     NETWORK    STATUS      MESH EXPANSION   EXPANSION GW IPS   ERROR   AGE
istio-system   cp-v115x   ACTIVE   network1   Available   true             ["172.18.250.1"]           42h 
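
To inspect the full specification of this resource (for example, the current resource settings and replica counts), you can print it as YAML. The resource name cp-v115x is taken from the output above; adjust it to match your installation:

kubectl get istiocontrolplane cp-v115x -n istio-system -o yaml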

When analyzing resource usage on the Service Mesh Manager dashboard, check these Workloads.

Understand scaling behavior

For the Control Plane, CPU usage scales horizontally: if you add more Pods to the Kubernetes cluster, istiod requires more CPU resources to send the latest configuration to all of those proxies. By adding more istiod Pods, you increase the number of workloads the service mesh can support.

On the other hand, each istiod Pod needs to maintain an inventory of all the running Pods, Services, and Istio custom resources, so as more of these are added to the Kubernetes cluster, each Pod needs more memory. From the memory consumption point of view, the service therefore behaves as if it scaled vertically.

It is quite common for a service to exhibit both vertical and horizontal scaling characteristics. In such cases, focus on the horizontal scalability properties, as long as the vertically scaling resource (memory, in this case) can be kept within reasonable bounds.
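
To see how these two dimensions behave on a live cluster, check the actual CPU and memory consumption of the Control Plane Pods (this assumes the Kubernetes Metrics Server is installed, which kubectl top requires):

kubectl top pods -n istio-system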

According to the upstream Istio documentation:

When namespace isolation is enabled, a single Istiod instance can support 1000 services, 2000 sidecars with 1 vCPU and 1.5 GB of memory. You can increase the number of Istiod instances to reduce the amount of time it takes for the configuration to reach all proxies.

When deciding on how to scale the workload, it is worth looking at the backing Kubernetes nodes. Usually the nodes are smaller ones: they have a few CPU cores (2-4) and 8-16 GB of memory. For the sake of this example, let’s say we have Kubernetes nodes with 2 vCPUs and 8GB RAM available.
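
To check what the nodes in your cluster actually provide, you can list their allocatable resources; the custom-columns paths below are standard Node status fields:

kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory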

In this case, giving 1 vCPU to Istio allocates 50% of the available CPU resources, while giving it 1.5 GB only reserves about 19% of the available memory.

  • If we scale vertically (a single instance growing based on memory), we can only double the instance's resource allocation before we exhaust all of the available CPUs on the underlying node.
  • With horizontal scaling, allocating 1 vCPU to each Istio Control Plane instance, we can add up to 8 instances if needed before we exhaust the available resources, assuming that both memory and CPU utilization scale linearly.

Set resource limits via the ControlPlane resource

Service Mesh Manager provides two ways to change resource limits. The easiest one is to change the ControlPlane resource by running the following commands:

cat > istio-cp-limits.yaml <<EOF
spec:
  meshManager:
    istio:
      pilot:
        resources:
          requests:
            cpu: 500m
            memory: "1500M"
          limits:
            cpu: "1"
            memory: "2000M"
EOF

kubectl patch controlplane --type=merge --patch "$(cat istio-cp-limits.yaml)" smm

  • If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
  • If you are using the imperative mode, run the smm operator reconcile command to apply the changes.
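
After the change is reconciled, you can verify that the new limits were applied to the istiod Pod. The app=istiod label selector below is an assumption based on the upstream Istio labels; adjust it if your deployment uses different labels:

kubectl get pods -n istio-system -l app=istiod -o jsonpath='{.items[0].spec.containers[0].resources}'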

Note: By default, the Istio Control Plane has an HPA (Horizontal Pod Autoscaler) set up with a minimum of 1 and a maximum of 5 Pods.
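
To check the current state of this autoscaler and the number of replicas it is maintaining:

kubectl get hpa -n istio-system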

Set resource limits via the IstioControlPlane resource

If the deployment needs more control over Istio's behavior, then the IstioControlPlane resource in the istio-system namespace must be changed. Except for the settings (resources and images) that are defined in the ControlPlane resource, any modifications made to the IstioControlPlane resource are preserved even if the operator reconcile command is invoked, or if Service Mesh Manager is deployed in Operator Mode.

For more details on this approach, see the Open Source Istio operator documentation.
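
As a sketch of that approach (assuming the resource name cp-v115x from earlier; the exact field layout under spec, such as the istiod deployment settings, depends on the installed CRD version, so inspect the schema first):

# Inspect the fields available under the IstioControlPlane spec
kubectl explain istiocontrolplane.spec

# Open the resource for in-place editing
kubectl edit istiocontrolplane cp-v115x -n istio-system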