Cluster registry controller

Service Mesh Manager uses the cluster registry controller to synchronize Kubernetes resources across the clusters in a multi-cluster setup. That way, the resources required by the multi-cluster topologies of Istio and by the multi-cluster features of Service Mesh Manager (for example, observability, multi-cluster topology view, tracing, traffic tapping) are synchronized automatically.

In addition, you can use the resource synchronization capabilities of Service Mesh Manager to synchronize any Kubernetes resources on demand between the clusters of your mesh.

Overview

When you install Service Mesh Manager in imperative mode from the command line, it automatically deploys the cluster registry controller to every cluster of the mesh and creates the Cluster CRs with default values that are suitable for most common scenarios.

The Cluster resource represents a Kubernetes cluster. The cluster registry controller fills the status of the Cluster CR with cluster-related metadata, and distributes the Cluster CRs to all participating Kubernetes clusters. In addition, the credentials for all clusters (usually stored in Kubernetes secrets) are automatically distributed to all clusters to help bootstrap the cluster group itself.

Note: If your clusters have unique networking requirements, you have to manually configure the Cluster CR or the operator’s Helm values file, for example, by setting the KubernetesAPIEndpoints of the cluster.

In such a multi-cluster setup, here is how the cluster registry controller works:

  • The controller writes only to the local cluster where it is deployed.
  • The controller only reads from peer clusters.

By default, the required resources are kept in sync between all clusters. You can define your own ResourceSyncRule resources to sync other Kubernetes resources between these clusters. You can further adjust a ResourceSyncRule to specify from which clusters and to which clusters certain resources should be synced.

Service Mesh Manager operator mode

When you are using Service Mesh Manager in operator mode in a multi-cluster environment, note the following points:

  1. You must explicitly enable the cluster registry in the ControlPlane CR or the operator’s Helm values file.

    In the following snippet, replace <cluster-name> with the name of your cluster. The cluster name format must comply with the RFC 1123 DNS subdomain/label format (alphanumeric string without “_” or “.” characters). Otherwise, you get an error message starting with: Reconciler error: cannot determine cluster name controller=controlplane, controllerGroup=smm.cisco.com, controllerKind=ControlPlane

    spec:
      clusterName: <cluster-name>
      clusterRegistry:
        enabled: true
        namespace: cluster-registry
    
  2. To create trust between the clusters, you must exchange the Secret CRs of the clusters. For an example, see GitOps - multi-cluster installation.
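
As an illustration of the naming rule in step 1, here is a minimal sketch of a ControlPlane resource with a valid cluster name. The apiVersion, the resource name smm, and the cluster name demo-cluster-1 are assumptions made for this example; merge the clusterName and clusterRegistry fields into your existing ControlPlane CR rather than applying this fragment as-is.

    apiVersion: smm.cisco.com/v1alpha1   # assumed API version, check your installed CRD
    kind: ControlPlane
    metadata:
      name: smm                          # assumed resource name
    spec:
      clusterName: demo-cluster-1        # hypothetical; valid: lowercase letters, digits, and "-"
      # clusterName: demo_cluster.1      # invalid: contains "_" and "."
      clusterRegistry:
        enabled: true
        namespace: cluster-registry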

Networking requirements

The cluster registry controller instances running on the clusters must be able to reach the API server of every other cluster in the cluster group, so every cluster can read the relevant resources from the other clusters.

The cluster registry controller pod connects directly to the Kubernetes API servers of the peer clusters. This works automatically if the API servers are publicly available. Otherwise, configure a reachable endpoint for them in the Cluster CR spec. (For security reasons, we recommend making the API server addresses available only from the IP ranges of the peer clusters.)
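
A rough sketch of what such an endpoint override could look like in a Cluster CR follows. The field names kubernetesApiEndpoints and serverAddress, the address format, and the cluster name are assumptions and are not confirmed by this page; verify them against the Cluster CRD installed on your cluster (for example, with kubectl explain) before using them.

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: Cluster
    metadata:
      name: demo-cluster-1               # hypothetical cluster name
    spec:
      # Other spec fields created by Service Mesh Manager are omitted here.
      kubernetesApiEndpoints:            # assumed field name
      - serverAddress: api.demo-cluster-1.example.com:6443   # hypothetical address that peer clusters can reach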

ResourceSyncRule example usage

Sync everywhere

  1. Create a sample secret on the third cluster, which will be copied around:

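    apiVersion: v1
    kind: Secret
    metadata:
      name: test-secret
      # Must match the namespace referenced by the ResourceSyncRule in the next step
      namespace: cluster-registry
    data: {}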
    
  2. Create a ResourceSyncRule on the first cluster to synchronize the secret to all clusters:

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ResourceSyncRule
    metadata:
      name: test-secret-sink
    spec:
      groupVersionKind:
        kind: Secret
        version: v1
      rules:
      - match:
        - objectKey:
            name: test-secret
            namespace: cluster-registry
    

    Both the ResourceSyncRule resource and the secret should appear shortly on all clusters of the cluster group.

    At this point, if the secret is deleted or modified on any cluster other than the one where it originates, the cluster registry controller immediately syncs it back.
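
The same pattern should carry over to other resource kinds. The following is a minimal sketch of a rule that syncs a single ConfigMap, assuming the ResourceSyncRule schema shown above; the rule name, ConfigMap name, and namespace are hypothetical. Because the controller can read only namespace, node, and secret resources by default, a rule like this also requires extending the RBAC rules as described in RBAC considerations.

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ResourceSyncRule
    metadata:
      name: demo-config-sink             # hypothetical rule name
    spec:
      groupVersionKind:
        kind: ConfigMap
        version: v1
      rules:
      - match:
        - objectKey:
            name: demo-config            # hypothetical ConfigMap name
            namespace: default           # hypothetical namespace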

Sync to a set of clusters

The cluster registry controller can be configured to sync only to specific clusters in the cluster group (instead of all of them). To do that, annotate the ResourceSyncRule so that the rule itself is no longer synced, then delete the rule from the clusters that you don’t want to sync to.

  1. Add the following annotation to the ResourceSyncRule on the first cluster:

    annotations:
      cluster-registry.k8s.cisco.com/resource-sync-disabled: "true"
    
  2. Delete the ResourceSyncRule from the second cluster.

    The ResourceSyncRule resource will not be recreated because of the annotation that you just added.

    If the annotation is not added as described in the previous step, then the ResourceSyncRule will be recreated.

  3. Delete the test-secret from the second cluster.

    The secret will not be recreated because the ResourceSyncRule resource does not exist on the second cluster.
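
For reference, here is a sketch of how the rule from the Sync everywhere example might look on the first cluster once the annotation is in place. It assumes the annotation belongs under metadata.annotations of the ResourceSyncRule, as step 1 describes; all other values are taken from the earlier example.

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ResourceSyncRule
    metadata:
      name: test-secret-sink
      annotations:
        # Stops this rule from being synced to, or recreated on, peer clusters
        cluster-registry.k8s.cisco.com/resource-sync-disabled: "true"
    spec:
      groupVersionKind:
        kind: Secret
        version: v1
      rules:
      - match:
        - objectKey:
            name: test-secret
            namespace: cluster-registry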

Sync from a set of clusters

The cluster registry controller can be configured to sync only from specific clusters in the cluster group (instead of all of them). To do that, create a ClusterFeature resource on the clusters that you want to sync from, and add a clusterFeatureMatch field to the ResourceSyncRule resources on the clusters that you want to sync to.

  1. Add the following field to the ResourceSyncRule spec on the first cluster:

    clusterFeatureMatch:
    - featureName: test-secret-feature
    

    As a result, the secret is synced only from clusters that have a matching ClusterFeature resource defined.

    At this point, no ClusterFeature is present on any cluster, so if you deleted the secret from the first cluster now, it would not be recreated.

  2. Apply the following ClusterFeature to the third cluster:

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ClusterFeature
    metadata:
      name: test-secret-source
    spec:
      featureName: test-secret-feature
    
  3. Delete the test-secret from the first cluster.

    The secret should now be recreated, because the controller can sync it from the third cluster.
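
Put together, the rule on the first cluster might look like the following sketch. It assumes that clusterFeatureMatch sits directly under spec, as step 1 states, and reuses the names from the examples above; the feature name must match the featureName of the ClusterFeature on the source cluster.

    apiVersion: clusterregistry.k8s.cisco.com/v1alpha1
    kind: ResourceSyncRule
    metadata:
      name: test-secret-sink
    spec:
      # Sync only from clusters that advertise this feature
      clusterFeatureMatch:
      - featureName: test-secret-feature
      groupVersionKind:
        kind: Secret
        version: v1
      rules:
      - match:
        - objectKey:
            name: test-secret
            namespace: cluster-registry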

RBAC considerations

The cluster registry controller writes only to the local cluster and only reads from peer clusters. By default, it has access to read namespace, node, and secret resources. If you want to sync other resources, expand the RBAC rules of the operator as needed (it uses aggregated ClusterRoles).

  • On the cluster where the resources are read from (usually where ClusterFeature resources are present), define a ClusterRole with the required read permissions and add the following label:

    labels:
      cluster-registry.k8s.cisco.com/reader-aggregated: "true"
    
  • On the cluster where the resources are written to (usually where ResourceSyncRule resources are present), define a ClusterRole with the required write permissions and add the following label:

    labels:
      cluster-registry.k8s.cisco.com/controller-aggregated: "true"
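
As a sketch of what these two ClusterRoles could look like, the following example grants access to ConfigMaps. The role names and the exact verb lists are assumptions chosen for illustration; only the labels come from the description above. Adjust the resources and verbs to what you actually sync.

    # Apply on the cluster that the resources are read from
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: cluster-registry-configmap-reader   # hypothetical name
      labels:
        cluster-registry.k8s.cisco.com/reader-aggregated: "true"
    rules:
    - apiGroups: [""]
      resources: ["configmaps"]
      verbs: ["get", "list", "watch"]            # assumed read verbs
    ---
    # Apply on the cluster that the resources are written to
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: cluster-registry-configmap-writer   # hypothetical name
      labels:
        cluster-registry.k8s.cisco.com/controller-aggregated: "true"
    rules:
    - apiGroups: [""]
      resources: ["configmaps"]
      verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]   # assumed write verbs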