Manage Schema Registry instances
Streaming Data Manager automates Schema Registry deployment by introducing a new custom resource called SchemaRegistry. Streaming Data Manager provides two modes to manage Schema Registry instances:
- imperative, using the Streaming Data Manager CLI
- declarative, using the SchemaRegistry custom resource
Both methods use the SchemaRegistry custom resource under the hood to manage Schema Registry instances.
Imperative management of Schema Registry instances
The Streaming Data Manager CLI provides commands to deploy Schema Registry instances with either default or custom settings with ease.
Note: If you install Streaming Data Manager with the following command, it sets up everything needed to start playing with Apache Kafka and Schema Registry on Kubernetes, which is ideal for testing:
smm install -a --install-sdm
To deploy additional Schema Registry instances or manage existing ones, run the smm sdm cluster schemaregistry create and smm sdm cluster schemaregistry update commands. Both of these commands expect a Schema Registry descriptor.
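The descriptor format is not spelled out in this section. As a minimal sketch, assuming the descriptor mirrors the SchemaRegistry custom resource described later on this page, it could look like the following. The resource name, namespace, and KafkaCluster name are illustrative values, and how the file is passed to the command is not shown here, so check the command help for that.

```yaml
# Minimal descriptor sketch (assumption: same shape as the SchemaRegistry custom resource)
apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-schema-registry   # illustrative name
  namespace: kafka           # illustrative namespace
spec:
  clusterRef:
    # KafkaCluster custom resource this instance connects to (illustrative)
    name: kafka
  # Port for API requests; 8081 is the documented default
  servicePort: 8081
```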
Declarative management of Schema Registry instances
Managing Schema Registry instances with Streaming Data Manager is as simple as creating and updating the SchemaRegistry custom resource. Streaming Data Manager automatically monitors the Schema Registry deployment and the configuration settings specified in the SchemaRegistry custom resource (for details, see the description of the custom resource).
When the custom resource is created or changed, Streaming Data Manager performs the necessary steps to spin up new Schema Registry instances or reconfigure existing ones with the desired configuration.
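As an illustration of a declarative reconfiguration, the sketch below changes the replica bounds and heap settings of an existing instance. The field names come from the custom resource reference on this page; the values and the resource name are assumptions chosen for the example. Applying the edited manifest (for example with kubectl apply) is what triggers the reconfiguration.

```yaml
# Sketch: reconfigure an existing Schema Registry instance by editing its custom resource
apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-schema-registry   # illustrative name
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  # Raised from the default of 1 (illustrative value)
  minReplicas: 2
  # Lowered from the default of 5 (illustrative value)
  maxReplicas: 4
  # Larger heap than the default -Xms512M -Xmx1024M (illustrative value)
  heapOpts: -Xms1024M -Xmx2048M
```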
Schema Registry API endpoints
Access Schema Registry from inside the Kubernetes cluster
The deployed Schema Registry instances are reachable at the schema-registry-svc-<schema-registry-name>.<namespace>.svc:<servicePort> endpoint.
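For example, for a SchemaRegistry resource named my-schema-registry in the kafka namespace with the default servicePort of 8081, the pattern above resolves to schema-registry-svc-my-schema-registry.kafka.svc:8081. The fragment below sketches how an in-cluster client might receive that address; the environment variable name is hypothetical, so use whatever setting your client actually reads.

```yaml
# Fragment of a client Deployment's container spec (hypothetical application)
env:
  # Hypothetical variable name; not defined by Streaming Data Manager
  - name: SCHEMA_REGISTRY_ENDPOINT
    # schema-registry-svc-<schema-registry-name>.<namespace>.svc:<servicePort>
    value: "schema-registry-svc-my-schema-registry.kafka.svc:8081"
```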
Access Schema Registry from outside the Kubernetes cluster
In order to access the Schema Registry from a client application which is outside the Kubernetes cluster, do the following:
- Make sure you have at least one external listener for Kafka exposing it outside the Kubernetes cluster.
- Configure the SchemaRegistry custom resource with the following options:
  - Set the name and namespace of the istioControlPlane field (see the example in the SchemaRegistry custom resource section below). The istioControlPlane field is a reference to the IstioControlPlane custom resource. It is mandatory from SDM version 1.7.0, because SDM 1.7.0 and later use Istio operator v2, which supports multiple Istio control planes on the same cluster. That is why the control plane that belongs to the Istio ingress gateway must be specified.
  - (Optional) Set MTLS to false if you don’t need client authentication and secure communication between the client application and the external endpoint.
- Get the public IP of the LoadBalancer-type service schema-registry-meshgateway-<schema-registry-name> to connect client applications from outside the Kubernetes cluster.
This public endpoint is made available when the Kafka cluster that the Schema Registry is bound to becomes exposed externally.
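The custom resource settings from the steps above can be sketched as follows. The IstioControlPlane name and namespace shown here are assumptions; replace them with the values of the control plane that serves your Istio ingress gateway. Setting MTLS to false is optional and only appropriate when client authentication is not needed.

```yaml
# Sketch: SchemaRegistry exposed outside the cluster
apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-schema-registry
  namespace: kafka
spec:
  clusterRef:
    name: kafka
  # Required from SDM 1.7.0 when Schema Registry is exposed externally
  istioControlPlane:
    name: icp-sample          # hypothetical IstioControlPlane name
    namespace: istio-system   # assumed namespace of that resource
  # Optional: disables mTLS between clients and the external endpoint
  MTLS: false
```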
Security
All participants that connect to Schema Registry are authenticated using mTLS by default.
This can be disabled via the MTLS field of the SchemaRegistry custom resource.
Schema Registry Kafka ACLs
When ACLs are enabled in the Apache Kafka cluster that the particular Schema Registry is bound to, Streaming Data Manager automatically creates the required Kafka ACLs for Schema Registry.
Registering schemas declaratively
By default, client applications automatically try to register new schemas when they produce messages to a topic. This behavior can be useful during development, but should be avoided in production environments.
Aside from registering schemas through Schema Registry’s API endpoint, Streaming Data Manager also makes it possible to register schemas declaratively, through Kubernetes ConfigMaps that hold schema definitions. Streaming Data Manager watches ConfigMaps with specific labels, reads the schema definitions from them, and registers the schemas into the Schema Registry running in that namespace. This approach is in line with GitOps and Configuration as Code practices. For example:
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-topic-value-schema
  labels:
    schema-registry.banzaicloud.io/name: my-schema-registry
    schema-registry.banzaicloud.io/subject: my-topic-value
data:
  schema.json: |-
    {
      "namespace": "io.examples",
      "type": "record",
      "name": "Payment",
      "fields": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "amount",
          "type": "double"
        }
      ]
    }
Notice the two labels on the ConfigMap:
- schema-registry.banzaicloud.io/name - The name of the SchemaRegistry custom resource that represents the Schema Registry deployment where the schema should be registered.
- schema-registry.banzaicloud.io/subject - The name of the subject where the schema should be registered.
The schema specification should be under the schema.json key in the ConfigMap. Any updates made to the schema specification in the ConfigMap create new versions of the underlying schema in the Schema Registry. When the ConfigMap is deleted, all versions of the corresponding schema are deleted from the Schema Registry.
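To illustrate the versioning behavior, the sketch below edits the Payment schema from the earlier ConfigMap by adding one field. The currency field and its default value are hypothetical; a default is included because adding a field with a default is a backward-compatible change in Avro. Based on the behavior described above, applying this ConfigMap would be expected to register a new version under the my-topic-value subject.

```yaml
# Sketch: updated ConfigMap; the changed schema.json should produce a new schema version
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-topic-value-schema
  labels:
    schema-registry.banzaicloud.io/name: my-schema-registry
    schema-registry.banzaicloud.io/subject: my-topic-value
data:
  schema.json: |-
    {
      "namespace": "io.examples",
      "type": "record",
      "name": "Payment",
      "fields": [
        { "name": "id", "type": "string" },
        { "name": "amount", "type": "double" },
        { "name": "currency", "type": "string", "default": "USD" }
      ]
    }
```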
The SchemaRegistry custom resource
apiVersion: kafka.banzaicloud.io/v1beta1
kind: SchemaRegistry
metadata:
  name: my-schema-registry
  namespace: kafka
spec:
  clusterRef:
    # Name of the KafkaCluster custom resource that represents the Kafka cluster this Schema Registry instance will connect to
    name: kafka
  # The port Schema Registry listens on for API requests (default: 8081)
  servicePort: 8081
  # Labels to be applied to the Schema Registry pod
  podLabels:
  # Annotations to be applied to the Schema Registry pod
  podAnnotations:
  # Annotations to be applied to the service that exposes the Schema Registry API on port `servicePort`
  serviceAnnotations:
  # Labels to be applied to the service that exposes the Schema Registry API on port `servicePort`
  serviceLabels:
  # Service account for the Schema Registry pod
  serviceAccountName:
  # Description of compute resource requirements
  #
  # requests:
  #   cpu: 200m
  #   memory: 800Mi
  # limits:
  #   cpu: 1
  #   memory: 1.2Gi
  resources:
  # https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector
  nodeSelector:
  # https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity
  affinity:
  # https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/
  tolerations:
  # Minimum number of replicas (default: 1)
  minReplicas: 1
  # Maximum number of replicas the Horizontal Pod Autoscaler can scale up to (default: 5)
  maxReplicas: 5
  # Controls whether mTLS is enforced between Schema Registry and client applications
  # as well as between Schema Registry instances
  # (default: true)
  MTLS: true
  # Heap settings for Schema Registry (default: -Xms512M -Xmx1024M)
  heapOpts: -Xms512M -Xmx1024M
  # Defines the config values for Schema Registry in the form of key-value pairs
  schemaRegistryConfig:
  # IstioControlPlane specifies the namespace and name of the IstioControlPlane custom resource
  # which represents the Istio control plane. Starting from SDM 1.7.0 this field is required if Schema Registry is exposed
  # outside of the Kubernetes cluster.
  istioControlPlane:
    name: <name of the IstioControlPlane custom resource>
    namespace: <namespace of the IstioControlPlane custom resource>
More about the istioControlPlane field:
- Istio operator v2 is supported from SDM version 1.7.0. Because Istio operator v2 supports multiple Istio control planes on the same cluster, you must specify the control plane that belongs to the Istio ingress gateway.
The following SchemaRegistry configurations are computed and maintained by Streaming Data Manager, and cannot be overridden:
- host.name
- listeners
- kafkastore.bootstrap.servers
- kafkastore.group.id
- schema.registry.group.id
- master.eligibility - always true
- kafkastore.topic
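Settings outside this list can be passed through the schemaRegistryConfig field. The fragment below is a sketch that assumes the field accepts string key-value pairs named after standard Schema Registry configuration properties; the two properties and their values are illustrative and are not taken from this page.

```yaml
# Fragment of a SchemaRegistry spec (sketch; keys assumed to map to Schema Registry properties)
spec:
  schemaRegistryConfig:
    # Default compatibility level for new subjects (illustrative value)
    schema.compatibility.level: "full"
    # Timeout for initializing the Kafka store, in milliseconds (illustrative value)
    kafkastore.init.timeout.ms: "120000"
```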
Example: Schema Registry with a demo application
For a detailed example on managing Schema Registry with a demo Spring Boot application, see our Kafka Schema Registry on Kubernetes blog post.