Manage Schema Registry instances

Streaming Data Manager automates Schema Registry deployment by introducing a custom resource called SchemaRegistry. Streaming Data Manager provides two ways to manage Schema Registry instances:

  • Imperative, using the Streaming Data Manager CLI
  • Declarative, using the SchemaRegistry custom resource

Both methods use the SchemaRegistry custom resource under the hood to manage Schema Registry instances.

Imperative management of Schema Registry instances

The Streaming Data Manager CLI provides commands to deploy Schema Registry instances with either default or custom settings.

Note: If you install Streaming Data Manager with the following command, it sets up everything needed to start playing with Apache Kafka and Schema Registry on Kubernetes, which is ideal for testing: smm install -a --install-sdm

To deploy additional Schema Registry instances or manage existing ones, run the smm sdm cluster schemaregistry create and smm sdm cluster schemaregistry update commands. Both of these commands expect a Schema Registry descriptor.
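As an illustration, a minimal descriptor could look like the following sketch. It assumes that the descriptor has the same shape as the SchemaRegistry custom resource described later on this page. The instance name my-second-schema-registry is hypothetical, and kafka is assumed to be the name of an existing KafkaCluster resource.

```yaml
# Hypothetical descriptor for an additional Schema Registry instance.
# Assumption: the descriptor follows the SchemaRegistry custom resource format.
apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-second-schema-registry   # hypothetical instance name
  namespace: kafka
spec:
  clusterRef:
    name: kafka                     # KafkaCluster this instance connects to
  servicePort: 8081                 # default API port
  minReplicas: 1
  maxReplicas: 3
```

Fields that are left out fall back to the defaults listed in the custom resource description.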

Declarative management of Schema Registry instances

To manage Schema Registry instances declaratively, create and update the SchemaRegistry custom resource. Streaming Data Manager automatically monitors the Schema Registry deployment and the configuration settings specified in the SchemaRegistry custom resource (for details, see the description of the custom resource).

When the custom resource changes, Streaming Data Manager performs the necessary steps to spin up new Schema Registry instances or reconfigure existing ones with the desired configuration.

Schema Registry API endpoints

Access Schema Registry from inside the Kubernetes cluster

The deployed Schema Registry instances are reachable at the schema-registry-svc-<schema-registry-name>.<namespace>.svc:<servicePort> endpoint.
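For example, for a SchemaRegistry resource named my-schema-registry in the kafka namespace with the default servicePort of 8081, the pattern resolves to schema-registry-svc-my-schema-registry.kafka.svc:8081. The sketch below hands that address to an in-cluster client through a ConfigMap. The ConfigMap name and the http:// scheme are assumptions for illustration; schema.registry.url is the property that Confluent serializer clients read.

```yaml
# Hypothetical client configuration for an application running inside the cluster.
apiVersion: v1
kind: ConfigMap
metadata:
  name: payment-client-config   # hypothetical name
  namespace: kafka
data:
  # <schema-registry-name> = my-schema-registry, <namespace> = kafka, <servicePort> = 8081
  # Assumption: the application speaks plain HTTP and the mesh handles mTLS.
  schema.registry.url: http://schema-registry-svc-my-schema-registry.kafka.svc:8081
```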

Access Schema Registry from outside the Kubernetes cluster

To access Schema Registry from a client application running outside the Kubernetes cluster, complete the following steps:

  1. Make sure that Kafka has at least one external listener that exposes it outside the Kubernetes cluster.
  2. In the SchemaRegistry custom resource, set the name and namespace of the istioControlPlane field (see the example custom resource below). The istioControlPlane field references the IstioControlPlane custom resource. It is mandatory from SDM version 1.7.0 onward, because those versions use Istio operator v2, which supports multiple Istio control planes on the same cluster, so you must specify the control plane that the Istio ingress gateway belongs to.
  3. (Optional) Set MTLS to false if you don't need client authentication and secure communication between the client application and the external endpoint.
  4. Get the public IP address of the LoadBalancer-type service schema-registry-meshgateway-<schema-registry-name>, and use it to connect client applications from outside the Kubernetes cluster.

This public endpoint is made available when the Kafka cluster that the Schema Registry is bound to becomes exposed externally.
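The custom resource settings from steps 2 and 3 can be sketched as follows. The control plane name icp-sample and the istio-system namespace are hypothetical placeholders; replace them with the name and namespace of the IstioControlPlane resource in your cluster.

```yaml
# Sketch of a SchemaRegistry resource prepared for external access.
apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-schema-registry
  namespace: kafka
spec:
  clusterRef:
    name: kafka              # this Kafka cluster needs an external listener (step 1)
  # Step 2: reference the Istio control plane of the ingress gateway
  istioControlPlane:
    name: icp-sample         # hypothetical IstioControlPlane name
    namespace: istio-system  # hypothetical namespace
  # Step 3 (optional): turn off mTLS between clients and the external endpoint
  MTLS: false
```

Leave MTLS at its default of true if external clients must authenticate.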

Security

By default, all participants that connect to Schema Registry are authenticated using mTLS. To disable this, set the MTLS field of the SchemaRegistry custom resource to false.

Schema Registry Kafka ACLs

When ACLs are enabled in the Apache Kafka cluster that a Schema Registry instance is bound to, Streaming Data Manager automatically creates the Kafka ACLs that Schema Registry requires.

Registering schemas declaratively

By default, client applications automatically try to register new schemas when they produce new messages to a topic. This behavior can be useful during development, but should be avoided in production environments.

Aside from registering schemas through the Schema Registry API endpoint, Streaming Data Manager also makes it possible to register schemas declaratively, through Kubernetes ConfigMaps that hold schema definitions. Streaming Data Manager watches for ConfigMaps with specific labels, reads the schema definitions from them, and registers them in the Schema Registry running in that namespace. This approach is in line with GitOps and Configuration as Code practices.

[Figure: Streaming Data Manager SchemaRegistry structure]

See the following example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-topic-value-schema
  labels:
    schema-registry.banzaicloud.io/name: my-schema-registry
    schema-registry.banzaicloud.io/subject: my-topic-value
data:
  schema.json: |-
    {
      "namespace": "io.examples",
      "type": "record",
      "name": "Payment",
      "fields": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "amount",
          "type": "double"
        }
      ]
    }

Notice the two labels on the ConfigMap:

  • schema-registry.banzaicloud.io/name - The name of the SchemaRegistry custom resource that represents the Schema Registry deployment where the schema should be registered.
  • schema-registry.banzaicloud.io/subject - The name of the subject where the schema should be registered.

The schema specification must be under the schema.json key in the ConfigMap. Any update made to the schema specification in the ConfigMap creates a new version of the underlying schema in the Schema Registry.

When the ConfigMap is deleted, all versions of the corresponding schema are deleted from the Schema Registry.
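To illustrate how an update produces a new schema version, the sketch below edits the earlier ConfigMap to add a field. The region field is hypothetical. It carries a default value, which keeps the change compatible under BACKWARD compatibility (the Schema Registry default); the example assumes the subject uses that mode.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-topic-value-schema
  labels:
    schema-registry.banzaicloud.io/name: my-schema-registry
    schema-registry.banzaicloud.io/subject: my-topic-value
data:
  # Applying this change registers a new version under the my-topic-value subject
  schema.json: |-
    {
      "namespace": "io.examples",
      "type": "record",
      "name": "Payment",
      "fields": [
        { "name": "id", "type": "string" },
        { "name": "amount", "type": "double" },
        { "name": "region", "type": "string", "default": "" }
      ]
    }
```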

The SchemaRegistry custom resource

apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-schema-registry
  namespace: kafka
spec:
  clusterRef:
    # Name of the KafkaCluster custom resource that represents the Kafka cluster this Schema Registry instance will connect to
    name: kafka

  # The port Schema Registry listens on for API requests (default: 8081)
  servicePort: 8081

  # Labels to be applied to the Schema Registry pod
  podLabels:

  # Annotations to be applied to the Schema Registry pod
  podAnnotations:

  # Annotations to be applied to the service that exposes the Schema Registry API on port `servicePort`
  serviceAnnotations:

  # Labels to be applied to the service that exposes the Schema Registry API on port `servicePort`
  serviceLabels:

  # Service account for the Schema Registry pod
  serviceAccountName:

  # Description of compute resource requirements
  #
  # requests:
  #   cpu: 200m
  #   memory: 800Mi
  # limits:
  #   cpu: 1
  #   memory: 1.2Gi
  resources:

  # https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector:

  # https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity
  affinity:

  # https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:

  # Minimum number of replicas (default: 1)
  minReplicas: 1

  # Maximum number of replicas the Horizontal Pod Autoscaler can scale up to (default: 5)
  maxReplicas: 5

  # Controls whether mTLS is enforced between Schema Registry and client applications,
  # as well as between Schema Registry instances
  # (default: true)
  MTLS: true

  # Heap settings for Schema Registry (default: -Xms512M -Xmx1024M)
  heapOpts: -Xms512M -Xmx1024M

  # Defines the config values for Schema Registry in the form of key-value pairs
  schemaRegistryConfig:

  # IstioControlPlane specifies the namespace and name of the IstioControlPlane custom resource
  # which represents the Istio control plane. Starting from SDM 1.7.0 this field is required if Schema Registry is exposed
  # outside of the Kubernetes cluster.
  istioControlPlane:
    name: <name of the IstioControlPlane custom resource>
    namespace: <namespace of the IstioControlPlane custom resource>

More about the istioControlPlane field:

  • Istio operator v2 is supported from SDM version 1.7.0 onward. Because Istio operator v2 supports multiple Istio control planes on the same cluster, you must specify the control plane that the Istio ingress gateway belongs to.

The following Schema Registry configuration properties are computed and maintained by Streaming Data Manager, and cannot be overridden:

  • host.name
  • listeners
  • kafkastore.bootstrap.servers
  • kafkastore.group.id
  • schema.registry.group.id
  • master.eligibility - always true
  • kafkastore.topic
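Properties outside this list can be set through schemaRegistryConfig. The fragment below is a minimal sketch: schema.compatibility.level and kafkastore.topic.replication.factor are standard Confluent Schema Registry properties, but passing them through this field unchanged and quoting the values as strings are assumptions, not confirmed behavior.

```yaml
spec:
  # Key-value pairs handed to Schema Registry.
  # Assumption: values are given as strings.
  schemaRegistryConfig:
    schema.compatibility.level: "full"
    kafkastore.topic.replication.factor: "3"
```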

Example: Schema Registry with a demo application

For a detailed example on managing Schema Registry with a demo Spring Boot application, see our Kafka Schema Registry on Kubernetes blog post.