Install SDM - GitOps
This guide details how to set up a GitOps environment for Streaming Data Manager using Argo CD. The same principles can be used for other tools as well.
Prerequisites
- Service Mesh Manager is already installed using GitOps (for details, see Install SMM - GitOps - single cluster).

  CAUTION: To install Streaming Data Manager on an existing Service Mesh Manager installation, the cluster must run Service Mesh Manager version 1.11 or later. If your cluster is running an earlier Service Mesh Manager version, you must upgrade it first.

  CAUTION: When using Streaming Data Manager on Amazon EKS, you must install the EBS CSI driver add-on on your cluster.

- The cluster meets the resource requirements to run Service Mesh Manager and Streaming Data Manager:

  CAUTION:

  Supported providers and Kubernetes versions

  The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, or 1.24.

  Service Mesh Manager is tested and known to work on the following Kubernetes providers:

  - Amazon Elastic Kubernetes Service (Amazon EKS)
  - Google Kubernetes Engine (GKE)
  - Azure Kubernetes Service (AKS)
  - Red Hat OpenShift 4.11
  - On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
  Calisti resource requirements

  Make sure that your Kubernetes or OpenShift cluster has sufficient resources to install Calisti. The following resources are needed on the cluster (a quick way to check per-node allocatable resources is shown after this list):

  - CPU: 32 vCPU in total, with 4 vCPU available for allocation per worker node. (If you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS.)
  - Memory: 64 GiB in total, with 4 GiB available for allocation per worker node for the Kubernetes cluster (8 GiB in case of an OpenShift cluster).
  - Storage: 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics).

  These minimum requirements must be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly. Enabling additional features, such as High Availability, increases these requirements.

  The default installation, when enough headroom is available in the cluster, can support at least 150 running Pods and the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.

- Argo CD is already installed.
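A quick way to check how much CPU and memory your worker nodes can still hand out is to query the node status with kubectl (a rough check only; it does not subtract resources already requested by other workloads):

kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory
kubectl describe nodes | grep -A 8 "Allocated resources"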
Procedure overview
The high-level steps of the procedure are:
- Prepare the Git repository
- Deploy Service Mesh Manager
- Extend the trust between the meshes
- Deploy other required resources
- Deploy the Kafka cluster application
Set up the environment
- Set the KUBECONFIG location and context name for the management-cluster cluster.

  MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
  MANAGEMENT_CLUSTER_CONTEXT=management-cluster

  kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"

  Expected output:

  CURRENT   NAME                 CLUSTER              AUTHINFO   NAMESPACE
  *         management-cluster   management-cluster
- Set the KUBECONFIG location and context name for the workload-cluster-1 cluster.

  WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
  WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1

  kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"

  Expected output:

  CURRENT   NAME                 CLUSTER              AUTHINFO   NAMESPACE
  *         workload-cluster-1   workload-cluster-1

  Repeat this step for any additional workload clusters you want to use.
- Add the cluster configurations to KUBECONFIG. Include any additional workload clusters you want to use. Export the variable so that subsequent kubectl commands can see all contexts.

  export KUBECONFIG=$KUBECONFIG:$MANAGEMENT_CLUSTER_KUBECONFIG:$WORKLOAD_CLUSTER_1_KUBECONFIG
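  To confirm that both clusters are now visible through the merged configuration, list the available contexts; management-cluster and workload-cluster-1 should both appear:

  kubectl config get-contexts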
- Make sure the management-cluster Kubernetes context is the current context.

  kubectl config use-context "${MANAGEMENT_CLUSTER_CONTEXT}"

  Expected output:

  Switched to context "management-cluster".
- Get the password for the Argo CD admin user.

  kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo

  Expected output:

  argocd-admin-password
- Check the external-ip-or-hostname address of the argocd-server service.

  kubectl get service -n argocd argocd-server

  Expected output:

  NAME            TYPE           CLUSTER-IP      EXTERNAL-IP               PORT(S)                      AGE
  argocd-server   LoadBalancer   10.108.14.130   external-ip-or-hostname   80:31306/TCP,443:30063/TCP   7d13h
- Log in to the Argo CD server using the https://external-ip-or-hostname URL.

  open https://$(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
- Log in using the CLI.

  argocd login $(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}') --insecure --username admin --password $(kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d)

  Expected output:

  'admin:login' logged in successfully
  Context 'external-ip-or-hostname' updated
- List the Argo CD clusters and verify that your clusters are registered in Argo CD.

  argocd cluster list

  Expected output:

  SERVER                                      NAME                 VERSION  STATUS   MESSAGE                                                   PROJECT
  https://kubernetes.default.svc              in-cluster                    Unknown  Cluster has no applications and is not being monitored.
  https://workload-cluster-1-ip-or-hostname   workload-cluster-1            Unknown  Cluster has no applications and is not being monitored.
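  If a workload cluster is missing from the list, you can register it from its kubectl context. This is a sketch for workload-cluster-1; the command installs a service account on the target cluster, so it requires admin access there:

  argocd cluster add "${WORKLOAD_CLUSTER_1_CONTEXT}" --name workload-cluster-1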
Prepare the Git repository
- Create an empty repository on GitHub (or another provider that Argo CD supports) and initialize it with a README.md file so that you can clone the repository. Alternatively, you can use the repository you used for the Service Mesh Manager GitOps installation. Because credentials will be stored in this repository, make it a private repository.

  GITHUB_ID="github-id"
  GITHUB_REPOSITORY_NAME="calisti-gitops"
- Obtain a personal access token for the repository (on GitHub, see Creating a personal access token) that has the following permissions:

  - admin:org_hook
  - admin:repo_hook
  - read:org
  - read:public_key
  - repo
- Log in with your personal access token for git.

  export GH_TOKEN="github-personal-access-token" # Note: this environment variable must be exported so that the git binary automatically uses it for authentication.
- Clone the repository into your local workspace.

  git clone "https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git"

  Expected output:

  Cloning into 'calisti-gitops'...
  remote: Enumerating objects: 144, done.
  remote: Counting objects: 100% (144/144), done.
  remote: Compressing objects: 100% (93/93), done.
  remote: Total 144 (delta 53), reused 135 (delta 47), pack-reused 0
  Receiving objects: 100% (144/144), 320.08 KiB | 746.00 KiB/s, done.
  Resolving deltas: 100% (53/53), done.
- Add the repository to Argo CD by running the following command. Alternatively, you can add it on the Argo CD Web UI.

  argocd repo add "https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git" --name "${GITHUB_REPOSITORY_NAME}" --username "${GITHUB_ID}" --password "${GH_TOKEN}"

  Expected output:

  Repository 'https://github.com/github-id/calisti-gitops.git' added
- Verify that the repository is connected by running:

  argocd repo list

  In the output, Status should be Successful:

  TYPE  NAME            REPO                                              INSECURE  OCI    LFS    CREDS  STATUS      MESSAGE  PROJECT
  git   calisti-gitops  https://github.com/github-id/calisti-gitops.git  false     false  false  true   Successful
- Change into the directory of the cloned repository and create the charts, apps, and manifests directories.

  cd "${GITHUB_REPOSITORY_NAME}"
  mkdir charts apps manifests

  The final structure of the repository will look like this:

  calisti-gitops
  ├── apps
  │   ├── sdm-applicationmanifest
  │   │   └── sdm-applicationmanifest-app.yaml
  │   ├── sdm-csr-operator-ca-certs
  │   │   └── sdm-csr-operator-ca-certs-app.yaml
  │   ├── sdm-istiocontrolplane
  │   │   └── sdm-istiocontrolplane-app.yaml
  │   ├── sdm-istiomesh-ca-trust-extension
  │   │   └── sdm-istiomesh-ca-trust-extension-app.yaml
  │   ├── sdm-kafka-cluster
  │   │   └── sdm-kafka-cluster-app.yaml
  │   ├── sdm-operator
  │   │   └── sdm-operator-app.yaml
  │   └── sdm-zookeeper-cluster
  │       └── sdm-zookeeper-cluster-app.yaml
  ├── charts
  │   └── supertubes-control-plane
  │       ├── Chart.yaml
  │       ├── README.md
  │       ├── templates
  │       │   └── ...
  │       └── values.yaml
  └── manifests
      ├── sdm-applicationmanifest
      │   └── sdm-applicationmanifest.yaml
      ├── sdm-csr-operator-ca-certs
      │   └── sdm-csr-operator-ca-certs-secret.yaml
      ├── sdm-istiocontrolplane
      │   ├── kustomization.yaml
      │   ├── sdm-istio-external-ca-cert-secret.yaml
      │   └── sdm-icp-v115x.yaml
      ├── sdm-istiomesh-ca-trust-extension
      │   ├── kustomization.yaml
      │   ├── istiomesh-ca-trust-extension-job.yaml
      │   └── istiomesh-ca-trust-extension-script-cm.yaml
      ├── sdm-kafka-cluster
      │   └── sdm-kafka-cluster.yaml
      └── sdm-zookeeper-cluster
          └── sdm-zookeeper-cluster.yaml
Prepare the helm charts
- You need an active Service Mesh Manager registration to download the Streaming Data Manager charts and images. You can sign up for free, or obtain Enterprise credentials, on the official Cisco Service Mesh Manager page. After registration, you can obtain your username and password from the Download Center. Set them as environment variables.

  CALISTI_USERNAME="calisti-username"
  CALISTI_PASSWORD="calisti-password"
- Download the supertubes-control-plane chart from registry.eticloud.io into the charts directory of your Streaming Data Manager GitOps repository and unpack it. Run the following commands:

  export HELM_EXPERIMENTAL_OCI=1 # Needed prior to Helm version 3.8.0
  echo "${CALISTI_PASSWORD}" | helm registry login registry.eticloud.io -u "${CALISTI_USERNAME}" --password-stdin

  Expected output:

  Login Succeeded

  helm pull oci://registry.eticloud.io/sdm-charts/supertubes-control-plane --destination charts --untar

  Expected output:

  Pulled: registry.eticloud.io/sdm-charts/supertubes-control-plane:latest-stable-version
  Digest: sha256:someshadigest
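  To double-check that the chart was unpacked where the sdm-operator application expects it, you can inspect the local copy (helm show chart accepts a local chart directory; the reported version depends on the release you pulled):

  helm show chart charts/supertubes-control-plane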
(Optional) Deploy CA certificate and private key secret
This step is optional because the CA secret is created automatically by default. If you want to use your custom CA certificate and private key, complete the following steps.
The CSR-operator uses this CA secret to sign Certificate Signing Requests for workloads in the Istio mesh and for KafkaUser CRs (Kafka clients).
CAUTION: Do not push the secrets directly into the git repository, especially when it is a public repository. Argo CD provides solutions to keep secrets safe.

- Create the sdm-csr-operator-ca-certs Secret in the manifests/sdm-csr-operator-ca-certs directory.

mkdir -p manifests/sdm-csr-operator-ca-certs

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
CA_CRT_B64="$(cat ca_crt.pem | base64)"
CA_KEY_B64="$(cat ca_key.pem | base64)"
CHAIN_CRT_B64="$(cat intermediate.pem root.pem | base64)"

cat > "manifests/sdm-csr-operator-ca-certs/sdm-csr-operator-ca-certs-secret.yaml" << EOF
# manifests/sdm-csr-operator-ca-certs/sdm-csr-operator-ca-certs-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: sdm-csr-operator-ca-certs
  namespace: csr-operator-system
data:
  ca_crt.pem: ${CA_CRT_B64}
  ca_key.pem: ${CA_KEY_B64}
  # chain_crt.pem is optional. Only needed when an intermediate CA is used (root CA -> .. -> intermediate CA)
  chain_crt.pem: ${CHAIN_CRT_B64}
EOF
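If you do not already have a CA, the following sketch generates a self-signed root CA for testing purposes only; the file names match the ones referenced above, and the CN is an arbitrary example. For production, use certificates issued by your own PKI.

# Generate a 4096-bit private key for the CA (written to ca_key.pem, as used above)
openssl genrsa -out ca_key.pem 4096
# Create a self-signed CA certificate valid for one year (written to ca_crt.pem)
openssl req -x509 -new -key ca_key.pem -sha256 -days 365 -subj "/CN=sdm-test-ca" -out ca_crt.pem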
- Create the sdm-csr-operator-ca-certs Application CR in the apps/sdm-csr-operator-ca-certs directory.

mkdir -p apps/sdm-csr-operator-ca-certs

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > "apps/sdm-csr-operator-ca-certs/sdm-csr-operator-ca-certs-app.yaml" << EOF
# apps/sdm-csr-operator-ca-certs/sdm-csr-operator-ca-certs-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-csr-operator-ca-certs
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: manifests/sdm-csr-operator-ca-certs
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: csr-operator-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
EOF
- Commit and push the calisti-gitops repository.

git add .
git commit -m "add sdm-csr-operator-ca-certs-secret"
git push

Expected output:

Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 12 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 6.31 KiB | 6.31 MiB/s, done.
Total 10 (delta 4), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (4/4), completed with 2 local objects.
To github.com:github-id/calisti-gitops.git
   af4e16f..8a81019  main -> main
- Apply the Application CR.

kubectl apply -f "apps/sdm-csr-operator-ca-certs/sdm-csr-operator-ca-certs-app.yaml"
Expected output:
application.argoproj.io/sdm-csr-operator-ca-certs created
- Verify that the application has been added to Argo CD and is healthy.

argocd app list

Expected output:

NAME                       CLUSTER             NAMESPACE            PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                                 TARGET
...
sdm-csr-operator-ca-certs  workload-cluster-1  csr-operator-system  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  manifests/sdm-csr-operator-ca-certs  HEAD
...
Note: You will need to configure the ApplicationManifest CR so that the CSR-operator uses the previously created secret. To do this, you need to change:

- the ApplicationManifest CR's csr-operator/valuesOverride/.../issuer/autoGenerated field to false, and
- the ApplicationManifest CR's csr-operator/valuesOverride/.../issuer/secretName field to the name of your CA secret.
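In the ApplicationManifest created in the next section, those settings correspond to the following fragment of the csrOperator values (shown here with the secret created above; this illustrates the two fields only, not a complete manifest):

csrOperator:
  valuesOverride: |-
    csroperator:
      config:
        privateCASigner:
          issuer:
            secretName: "sdm-csr-operator-ca-certs"
            autoGenerated: false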
Deploy Streaming Data Manager
Deploy the sdm-operator application
- Create the sdm-operator Application CR in the apps/sdm-operator directory.

mkdir -p apps/sdm-operator

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > "apps/sdm-operator/sdm-operator-app.yaml" << EOF
# apps/sdm-operator/sdm-operator-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-operator
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: charts/supertubes-control-plane
    helm:
      releaseName: sdm-operator
      values: |
        imagePullSecrets:
          - name: "smm-registry.eticloud.io-pull-secret"
        operator:
          image:
            repository: registry.eticloud.io/sdm/supertubes-control-plane
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: smm-registry-access
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - PruneLast=true
      - CreateNamespace=true
EOF
- Commit and push the calisti-gitops repository.

git add .
git commit -m "add sdm-operator app"

Expected output:

[main 3f57c62] add sdm-operator app
 11 files changed, 842 insertions(+)
 create mode 100644 apps/sdm-operator/sdm-operator-app.yaml
 create mode 100644 charts/supertubes-control-plane/.helmignore
 create mode 100644 charts/supertubes-control-plane/Chart.yaml
 create mode 100644 charts/supertubes-control-plane/README.md
 create mode 100644 charts/supertubes-control-plane/templates/_helpers.tpl
 create mode 100644 charts/supertubes-control-plane/templates/supertubes-crd.yaml
 create mode 100644 charts/supertubes-control-plane/templates/supertubes-deployment.yaml
 create mode 100644 charts/supertubes-control-plane/templates/supertubes-rbac.yaml
 create mode 100644 charts/supertubes-control-plane/templates/supertubes-service.yaml
 create mode 100644 charts/supertubes-control-plane/templates/supertubes-webhooks.yaml
 create mode 100644 charts/supertubes-control-plane/values.yaml

git push

Expected output:

Enumerating objects: 33, done.
Counting objects: 100% (33/33), done.
Delta compression using up to 12 threads
Compressing objects: 100% (29/29), done.
Writing objects: 100% (29/29), 14.38 KiB | 2.88 MiB/s, done.
Total 29 (delta 7), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (7/7), completed with 2 local objects.
To github.com:github-id/calisti-gitops.git
 + 8a81019...3f57c62 main -> main (forced update)
- Apply the Application manifest.

kubectl apply -f "apps/sdm-operator/sdm-operator-app.yaml"
Expected output:
application.argoproj.io/sdm-operator created
- Verify that the application has been added to Argo CD and is healthy.

argocd app list

Expected output:

NAME          CLUSTER             NAMESPACE            PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                             TARGET
...
sdm-operator  workload-cluster-1  smm-registry-access  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  charts/supertubes-control-plane  HEAD
...

You can check the sdm-operator application on the Argo CD Web UI as well.

open https://$(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
Deploy the sdm-applicationmanifest application
- Create the applicationmanifest ApplicationManifest CR in the manifests/sdm-applicationmanifest directory.

mkdir -p manifests/sdm-applicationmanifest

For Kubernetes:

ISSUER_SECRET_NAME="sdm-csr-operator-ca-certs"
ISSUER_AUTOGENERATED="true"
cat > "manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml" << EOF
# manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml
apiVersion: supertubes.banzaicloud.io/v1beta1
kind: ApplicationManifest
metadata:
  name: applicationmanifest
spec:
  clusterRegistry:
    enabled: false
    namespace: cluster-registry
  csrOperator:
    enabled: true
    namespace: csr-operator-system
    valuesOverride: |-
      image:
        repository: "registry.eticloud.io/csro/csr-operator"
      csroperator:
        config:
          privateCASigner:
            issuer:
              secretName: "${ISSUER_SECRET_NAME}"
              autoGenerated: ${ISSUER_AUTOGENERATED}
  imagePullSecretsOperator:
    enabled: false
    namespace: supertubes-system
  istioOperator:
    enabled: false
    namespace: istio-system
  kafkaMinion:
    enabled: false
  kafkaOperator:
    enabled: false
    namespace: kafka
    valuesOverride: |
      alertManager:
        permissivePeerAuthentication:
          create: true
      monitoring:
        grafanaDashboards:
          enabled: false
          label: app.kubernetes.io/supertubes_managed_grafana_dashboard
  prometheusOperator:
    enabled: false
    namespace: supertubes-system
    valuesOverride: |
      prometheus:
        prometheusSpec:
          resources:
            limits:
              cpu: 2
              memory: 2Gi
            requests:
              cpu: 1
              memory: 1Gi
  supertubes:
    enabled: false
    namespace: supertubes-system
    valuesOverride: |
      ui-backend:
        image:
          repository: "registry.eticloud.io/sdm/supertubes-ui"
        podLabels:
          smm.cisco.com/jwt-auth-from-ingress: "true"
      operator:
        image:
          repository: "registry.eticloud.io/sdm/supertubes"
        cruiseControlModules:
          image:
            repository: "registry.eticloud.io/sdm/cruisecontrol-modules"
        kafkaAuthAgent:
          image:
            repository: "registry.eticloud.io/sdm/kafka-authn-agent"
        kafkaModules:
          image:
            repository: "registry.eticloud.io/sdm/kafka-modules"
  zookeeperOperator:
    enabled: false
    namespace: zookeeper
EOF
For OpenShift:

Note: the controllerSettings.platform value must be set to openshift, and some components need additional resource requirements and special settings to run Streaming Data Manager in an OpenShift environment.

ISSUER_SECRET_NAME="sdm-csr-operator-ca-certs"
ISSUER_AUTOGENERATED="true"
cat > "manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml" << EOF
# manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml
apiVersion: supertubes.banzaicloud.io/v1beta1
kind: ApplicationManifest
metadata:
  name: applicationmanifest
spec:
  controllerSettings:
    platform: openshift
  clusterRegistry:
    enabled: false
    namespace: cluster-registry
  csrOperator:
    enabled: true
    namespace: csr-operator-system
    valuesOverride: |-
      image:
        repository: "registry.eticloud.io/csro/csr-operator"
      csroperator:
        config:
          privateCASigner:
            issuer:
              secretName: "${ISSUER_SECRET_NAME}"
              autoGenerated: ${ISSUER_AUTOGENERATED}
  imagePullSecretsOperator:
    enabled: false
    namespace: supertubes-system
  istioOperator:
    enabled: false
    namespace: istio-system
  kafkaMinion:
    enabled: false
  kafkaOperator:
    enabled: false
    namespace: kafka
    valuesOverride: |
      alertManager:
        permissivePeerAuthentication:
          create: true
      monitoring:
        grafanaDashboards:
          enabled: false
          label: app.kubernetes.io/supertubes_managed_grafana_dashboard
  prometheusOperator:
    enabled: false
    namespace: supertubes-system
    valuesOverride: |
      prometheus:
        prometheusSpec:
          resources:
            limits:
              cpu: 2
              memory: 4Gi
            requests:
              cpu: 1
              memory: 3Gi
      prometheus-node-exporter:
        service:
          port: 9123
          targetPort: 9123
      prometheusOperator:
        admissionWebhooks:
          createSecretJob:
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                  - "ALL"
          patchWebhookJob:
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                  - "ALL"
        containerSecurityContext:
          capabilities:
            drop:
              - "ALL"
        resources:
          limits:
            cpu: 400m
            memory: 400Mi
          requests:
            cpu: 200m
            memory: 200Mi
  supertubes:
    enabled: false
    namespace: supertubes-system
    valuesOverride: |
      ui-backend:
        image:
          repository: "registry.eticloud.io/sdm/supertubes-ui"
        podLabels:
          smm.cisco.com/jwt-auth-from-ingress: "true"
      operator:
        image:
          repository: "registry.eticloud.io/sdm/supertubes"
        cruiseControlModules:
          image:
            repository: "registry.eticloud.io/sdm/cruisecontrol-modules"
        kafkaAuthAgent:
          image:
            repository: "registry.eticloud.io/sdm/kafka-authn-agent"
        kafkaModules:
          image:
            repository: "registry.eticloud.io/sdm/kafka-modules"
  zookeeperOperator:
    enabled: false
    namespace: zookeeper
EOF
- If you want to use a custom CA secret instead of one generated by the CSR-operator, modify the ApplicationManifest CR:

  - Set the csr-operator/valuesOverride/.../issuer/autoGenerated field to false.
  - Set the csr-operator/valuesOverride/.../issuer/secretName field to the name of your CA secret.
- Create the sdm-applicationmanifest Application CR in the apps/sdm-applicationmanifest directory.

mkdir -p apps/sdm-applicationmanifest

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > "apps/sdm-applicationmanifest/sdm-applicationmanifest-app.yaml" <<EOF
# apps/sdm-applicationmanifest/sdm-applicationmanifest-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-applicationmanifest
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: manifests/sdm-applicationmanifest
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: smm-registry-access
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
EOF
- Commit and push the calisti-gitops repository.

git add .
git commit -m "add sdm-applicationmanifest application"

Expected output:

[main 0eae6a5] add sdm-applicationmanifest app
 2 files changed, 206 insertions(+)
 create mode 100644 apps/sdm-applicationmanifest/sdm-applicationmanifest-app.yaml
 create mode 100644 manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml

git push origin

Expected output:

Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 12 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 1.90 KiB | 1.90 MiB/s, done.
Total 10 (delta 5), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (5/5), completed with 3 local objects.
To github.com:github-id/calisti-gitops.git
   3f57c62..0eae6a5  main -> main
- Apply the Application manifest.

kubectl apply -f "apps/sdm-applicationmanifest/sdm-applicationmanifest-app.yaml"
Expected output:
application.argoproj.io/sdm-applicationmanifest created
- Verify that the application has been added to Argo CD and is healthy.

argocd app list

Expected output:

NAME                     CLUSTER             NAMESPACE            PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                               TARGET
...
sdm-applicationmanifest  workload-cluster-1  smm-registry-access  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  manifests/sdm-applicationmanifest  HEAD
...
- Check that all pods are healthy and running in the csr-operator-system namespace on workload-cluster-1.

kubectl get pods -n csr-operator-system --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"

Expected output:

NAME                            READY   STATUS    RESTARTS   AGE
csr-operator-587df64776-5bs2p   2/2     Running   0          59s
Deploy the sdm-istiocontrolplane application
- Create the kustomization.yaml file in the manifests/sdm-istiocontrolplane directory.

mkdir -p manifests/sdm-istiocontrolplane

ISTIO_MINOR_VERSION="1.15"
cat > "manifests/sdm-istiocontrolplane/kustomization.yaml" <<EOF
# manifests/sdm-istiocontrolplane/kustomization.yaml
resources:
  - sdm-istio-external-ca-cert-secret.yaml
  - sdm-icp-v${ISTIO_MINOR_VERSION/.}x.yaml
EOF
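The ${ISTIO_MINOR_VERSION/.} parameter expansion removes the dot from the version string, so the generated file name is sdm-icp-v115x.yaml. You can check what it expands to:

echo "sdm-icp-v${ISTIO_MINOR_VERSION/.}x.yaml" # prints: sdm-icp-v115x.yaml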
- Set the sdm-istio-external-ca-cert secret.

  - When there is no intermediate CA, the sdm-istio-external-ca-cert secret has to contain the CA certificate from the sdm-csr-operator-ca-certs secret with the data key ca_crt.pem.
  - Otherwise, the sdm-istio-external-ca-cert secret has to contain the CA certificate chain from the sdm-csr-operator-ca-certs secret with the data key chain_crt.pem.

ISSUER_SECRET_NAME="sdm-csr-operator-ca-certs"
cat > "manifests/sdm-istiocontrolplane/sdm-istio-external-ca-cert-secret.yaml" <<EOF
# manifests/sdm-istiocontrolplane/sdm-istio-external-ca-cert-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: sdm-istio-external-ca-cert
  namespace: istio-system
data:
  # When there is an intermediate CA:
  # root-cert.pem: $(kubectl --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}" --namespace csr-operator-system get secret ${ISSUER_SECRET_NAME} -o 'jsonpath={.data.chain_crt\.pem}')
  # When there is no intermediate CA:
  root-cert.pem: $(kubectl --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}" --namespace csr-operator-system get secret ${ISSUER_SECRET_NAME} -o 'jsonpath={.data.ca_crt\.pem}')
EOF
- Create the SDM IstioControlPlane CR in the manifests/sdm-istiocontrolplane directory.

For Kubernetes:

ISTIO_VERSION="1.15.3"
ISTIO_MINOR_VERSION="1.15"
ISTIO_PILOT_VERSION="v1.15.3-bzc.0"
ISTIO_PROXY_VERSION="v1.15.3-bzc-kafka.0"
cat > "manifests/sdm-istiocontrolplane/sdm-icp-v${ISTIO_MINOR_VERSION/.}x.yaml" <<EOF
# manifests/sdm-istiocontrolplane/sdm-icp-v${ISTIO_MINOR_VERSION/.}x.yaml
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
  labels:
    banzaicloud.io/managed-by: supertubes
  name: sdm-icp-v${ISTIO_MINOR_VERSION/.}x
  namespace: istio-system
spec:
  containerImageConfiguration:
    imagePullPolicy: Always
    imagePullSecrets:
      - name: smm-pull-secret
  distribution: cisco
  istiod:
    deployment:
      env:
        - name: PILOT_SKIP_VALIDATE_TRUST_DOMAIN
          value: "true"
        - name: EXTERNAL_CA
          value: ISTIOD_RA_KUBERNETES_API
        - name: K8S_SIGNER
          value: csr.banzaicloud.io/privateca
        - name: ISTIO_MULTIROOT_MESH
          value: "true"
      image: registry.eticloud.io/sdm/istio-pilot:${ISTIO_PILOT_VERSION}
  k8sResourceOverlays:
    - groupVersionKind:
        group: apps
        kind: Deployment
        version: v1
      objectKey:
        name: istiod-sdm-icp-v${ISTIO_MINOR_VERSION/.}x
      patches:
        - parseValue: true
          path: /spec/template/spec/volumes/-
          type: replace
          value: |
            name: external-ca-cert
            secret:
              secretName: sdm-istio-external-ca-cert
              optional: true
        - parseValue: true
          path: /spec/template/spec/containers/name=discovery/volumeMounts/-
          type: replace
          value: |
            name: external-ca-cert
            mountPath: /etc/external-ca-cert
            readOnly: true
    - groupVersionKind:
        group: rbac.authorization.k8s.io
        kind: ClusterRole
        version: v1
      objectKey:
        name: istiod-sdm-icp-v${ISTIO_MINOR_VERSION/.}x-istio-system
      patches:
        - parseValue: true
          path: /rules/-
          type: replace
          value: |
            apiGroups:
              - certificates.k8s.io
            resourceNames:
              - csr.banzaicloud.io/privateca
            resources:
              - signers
            verbs:
              - approve
  meshConfig:
    defaultConfig:
      proxyMetadata:
        PROXY_CONFIG_XDS_AGENT: "true"
    enableAutoMtls: true
    protocolDetectionTimeout: 5s
  meshID: sdm
  mode: ACTIVE
  networkName: network1
  proxy:
    image: registry.eticloud.io/sdm/istio-proxyv2:v1.15.3-bzc-kafka.0
  telemetryV2:
    enabled: true
  version: 1.15.3
EOF
For OpenShift:

ISTIO_VERSION="1.15.3"
ISTIO_MINOR_VERSION="1.15"
ISTIO_PILOT_VERSION="v1.15.3-bzc.1"
ISTIO_PROXY_VERSION="v1.15.3-bzc-kafka.0"
ISTIO_CNI_VERSION="v1.15.3-bzc.1"
cat > "manifests/sdm-istiocontrolplane/sdm-icp-v${ISTIO_MINOR_VERSION/.}x.yaml" <<EOF
# manifests/sdm-istiocontrolplane/sdm-icp-v${ISTIO_MINOR_VERSION/.}x.yaml
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
  labels:
    banzaicloud.io/managed-by: supertubes
  name: sdm-icp-v${ISTIO_MINOR_VERSION/.}x
  namespace: istio-system
spec:
  containerImageConfiguration:
    imagePullPolicy: Always
    imagePullSecrets:
      - name: smm-pull-secret
  distribution: cisco
  istiod:
    deployment:
      env:
        - name: PILOT_SKIP_VALIDATE_TRUST_DOMAIN
          value: "true"
        - name: EXTERNAL_CA
          value: ISTIOD_RA_KUBERNETES_API
        - name: K8S_SIGNER
          value: csr.banzaicloud.io/privateca
        - name: ISTIO_MULTIROOT_MESH
          value: "true"
      image: registry.eticloud.io/sdm/istio-pilot:${ISTIO_PILOT_VERSION}
  k8sResourceOverlays:
    - groupVersionKind:
        group: apps
        kind: Deployment
        version: v1
      objectKey:
        name: istiod-sdm-icp-v${ISTIO_MINOR_VERSION/.}x
      patches:
        - parseValue: true
          path: /spec/template/spec/volumes/-
          type: replace
          value: |
            name: external-ca-cert
            secret:
              secretName: sdm-istio-external-ca-cert
              optional: true
        - parseValue: true
          path: /spec/template/spec/containers/name=discovery/volumeMounts/-
          type: replace
          value: |
            name: external-ca-cert
            mountPath: /etc/external-ca-cert
            readOnly: true
    - groupVersionKind:
        group: rbac.authorization.k8s.io
        kind: ClusterRole
        version: v1
      objectKey:
        name: istiod-sdm-icp-v${ISTIO_MINOR_VERSION/.}x-istio-system
      patches:
        - parseValue: true
          path: /rules/-
          type: replace
          value: |
            apiGroups:
              - certificates.k8s.io
            resourceNames:
              - csr.banzaicloud.io/privateca
            resources:
              - signers
            verbs:
              - approve
  meshConfig:
    defaultConfig:
      proxyMetadata:
        PROXY_CONFIG_XDS_AGENT: "true"
    discoverySelectors:
      - matchExpressions:
          - key: kubernetes.io/metadata.name
            operator: NotIn
            values:
              - openshift
              - kube-system
              - kube-node-lease
              - kube-public
          - key: openshift.io/cluster-monitoring
            operator: NotIn
            values:
              - "true"
          - key: hive.openshift.io/managed
            operator: NotIn
            values:
              - "true"
    enableAutoMtls: true
    protocolDetectionTimeout: 5s
  meshID: sdm
  mode: ACTIVE
  networkName: network1
  proxy:
    image: registry.eticloud.io/sdm/istio-proxyv2:${ISTIO_PROXY_VERSION}
  proxyInit:
    cni:
      binDir: /var/lib/cni/bin
      chained: false
      confDir: /etc/cni/multus/net.d
      confFileName: istio-cni-sdm-icp-v${ISTIO_MINOR_VERSION/.}x-istio-system.conf
      daemonset:
        image: registry.eticloud.io/smm/istio-install-cni:${ISTIO_CNI_VERSION}
        podMetadata:
          annotations:
            appnet.cisco.com/image-registry-access: "true"
        securityContext:
          privileged: true
      enabled: true
    image: registry.eticloud.io/sdm/istio-proxyv2:${ISTIO_PROXY_VERSION}
  telemetryV2:
    enabled: true
  version: ${ISTIO_VERSION}
EOF
- Create the sdm-istiocontrolplane Application CR in the apps/sdm-istiocontrolplane directory.

mkdir -p apps/sdm-istiocontrolplane

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > apps/sdm-istiocontrolplane/sdm-istiocontrolplane-app.yaml <<EOF
# apps/sdm-istiocontrolplane/sdm-istiocontrolplane-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-istiocontrolplane
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: manifests/sdm-istiocontrolplane
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: istio-system
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
EOF
- Commit and push the calisti-gitops repository.

git add apps/sdm-istiocontrolplane manifests/sdm-istiocontrolplane
git commit -m "add sdm-istiocontrolplane app"

Expected output:

[main 0d8959e] add sdm-istiocontrolplane app
 4 files changed, 120 insertions(+)
 create mode 100644 apps/sdm-istiocontrolplane/sdm-istiocontrolplane-app.yaml
 create mode 100644 manifests/sdm-istiocontrolplane/kustomization.yaml
 create mode 100644 manifests/sdm-istiocontrolplane/sdm-icp-v115x.yaml
 create mode 100644 manifests/sdm-istiocontrolplane/sdm-istio-external-ca-cert-secret.yaml

git push origin

Expected output:

Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 12 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 3.63 KiB | 1.81 MiB/s, done.
Total 10 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To github.com:github-id/calisti-gitops.git
   63600fc..0d8959e  main -> main
- Apply the sdm-istiocontrolplane Application CR.

kubectl apply -f apps/sdm-istiocontrolplane/sdm-istiocontrolplane-app.yaml
Expected output:
application.argoproj.io/sdm-istiocontrolplane created
- Verify that the application has been added to Argo CD and is healthy.

argocd app list

Expected output:

NAME                   CLUSTER             NAMESPACE     PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                             TARGET
...
sdm-istiocontrolplane  workload-cluster-1  istio-system  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  manifests/sdm-istiocontrolplane  HEAD
...
Extend the trust between the Istio meshes
Extend the Service Mesh Manager Istio mesh trusted CA certificates with the Streaming Data Manager external CA certificate.
- Create the kustomization.yaml file in the manifests/sdm-istiomesh-ca-trust-extension directory.

mkdir -p manifests/sdm-istiomesh-ca-trust-extension

cat > manifests/sdm-istiomesh-ca-trust-extension/kustomization.yaml <<EOF
# manifests/sdm-istiomesh-ca-trust-extension/kustomization.yaml
resources:
  - istiomesh-ca-trust-extension-job.yaml
  - istiomesh-ca-trust-extension-script-cm.yaml
EOF
- Create the istiomesh-ca-trust-extension Job in the manifests/sdm-istiomesh-ca-trust-extension directory.

cat > manifests/sdm-istiomesh-ca-trust-extension/istiomesh-ca-trust-extension-job.yaml <<EOF
# manifests/sdm-istiomesh-ca-trust-extension/istiomesh-ca-trust-extension-job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: istiomesh-ca-trust-extension
  namespace: smm-registry-access
spec:
  completions: 1
  template:
    metadata:
      name: istiomesh-ca-trust-extension
    spec:
      containers:
        - command:
            - /scripts/run.sh
          image: lachlanevenson/k8s-kubectl:v1.16.10
          imagePullPolicy: IfNotPresent
          name: istio-trust-extension-job
          volumeMounts:
            - mountPath: /scripts
              name: run
              readOnly: false
      dnsPolicy: ClusterFirst
      restartPolicy: Never
      serviceAccount: sdm-operator-supertubes-control-plane
      serviceAccountName: sdm-operator-supertubes-control-plane
      volumes:
        - configMap:
            defaultMode: 365
            name: istiomesh-ca-trust-extension-script
          name: run
EOF
- Create the istiomesh-ca-trust-extension-script ConfigMap in the manifests/sdm-istiomesh-ca-trust-extension directory.

ISTIO_MINOR_VERSION="1.15"
cat > manifests/sdm-istiomesh-ca-trust-extension/istiomesh-ca-trust-extension-script-cm.yaml <<EOF
# manifests/sdm-istiomesh-ca-trust-extension/istiomesh-ca-trust-extension-script-cm.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: istiomesh-ca-trust-extension-script
  namespace: smm-registry-access
data:
  run.sh: |-
    #!/bin/sh

    # Fill these fields properly----------------------
    export CA_SECRET_NAMESPACE="istio-system"
    export CA_SECRET_NAME="sdm-istio-external-ca-cert"
    # ------------------------------------------------

    export ICP_NAME="cp-v${ISTIO_MINOR_VERSION/.}x"
    export ICP_NAMESPACE="istio-system"

    export CA_CERT=\$(kubectl get secret -n \$CA_SECRET_NAMESPACE \$CA_SECRET_NAME -o jsonpath='{.data.root-cert\.pem}' | base64 -d | sed '\$ ! s/\$/\\\n/' | tr -d '\n')

    read -r -d '' PATCH << EOF
    {"spec": {"meshConfig": {"caCertificates": [{"pem": "\$CA_CERT"}]}}}
    EOF

    read -r -d '' INSERT_PATCH << EOF
    [{"op": "add", "path": "/spec/meshConfig/caCertificates/-", "value": {"pem": "\$CA_CERT"}}]
    EOF

    kubectl patch istiocontrolplanes.servicemesh.cisco.com \$ICP_NAME -n \$ICP_NAMESPACE --type json --patch="\$INSERT_PATCH" || kubectl patch istiocontrolplanes.servicemesh.cisco.com \$ICP_NAME -n \$ICP_NAMESPACE --type merge --patch="\$PATCH"
EOF
- Create the sdm-istiomesh-ca-trust-extension Application CR in the apps/sdm-istiomesh-ca-trust-extension directory.

mkdir -p apps/sdm-istiomesh-ca-trust-extension

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > "apps/sdm-istiomesh-ca-trust-extension/sdm-istiomesh-ca-trust-extension-app.yaml" <<EOF
# apps/sdm-istiomesh-ca-trust-extension/sdm-istiomesh-ca-trust-extension-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-istiomesh-ca-trust-extension
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: manifests/sdm-istiomesh-ca-trust-extension
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: smm-registry-access
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
EOF
- Commit and push the calisti-gitops repository.

git add apps/sdm-istiomesh-ca-trust-extension manifests/sdm-istiomesh-ca-trust-extension
git commit -m "add sdm-istiomesh-ca-trust-extension app"

Expected output:

[main 1ff5fae] add sdm-istiomesh-ca-trust-extension app
 4 files changed, 84 insertions(+)
 create mode 100644 apps/sdm-istiomesh-ca-trust-extension/sdm-istiomesh-ca-trust-extension-app.yaml
 create mode 100644 manifests/sdm-istiomesh-ca-trust-extension/istiomesh-ca-trust-extension-job.yaml
 create mode 100644 manifests/sdm-istiomesh-ca-trust-extension/istiomesh-ca-trust-extension-script-cm.yaml
 create mode 100644 manifests/sdm-istiomesh-ca-trust-extension/kustomization.yaml

git push origin

Expected output:

Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Delta compression using up to 12 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 2.21 KiB | 2.21 MiB/s, done.
Total 10 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To github.com:github-id/calisti-gitops.git
   98d53c4..1ff5fae  main -> main
- Apply the sdm-istiomesh-ca-trust-extension Application CR.

kubectl apply -f apps/sdm-istiomesh-ca-trust-extension/sdm-istiomesh-ca-trust-extension-app.yaml
Expected output:
application.argoproj.io/sdm-istiomesh-ca-trust-extension created
- Verify that the application has been added to Argo CD and is healthy.

argocd app list

Expected output:

NAME                              CLUSTER             NAMESPACE            PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                                        TARGET
...
sdm-istiomesh-ca-trust-extension  workload-cluster-1  smm-registry-access  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  manifests/sdm-istiomesh-ca-trust-extension  HEAD
...
Deploy the remaining Streaming Data Manager resources
Modify the ApplicationManifest CR to deploy the remaining Streaming Data Manager resources.
- Edit the ApplicationManifest CR.

For Kubernetes:

ISSUER_SECRET_NAME="sdm-csr-operator-ca-certs"
ISSUER_AUTOGENERATED="false"
cat > "manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml" << EOF
# manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml
apiVersion: supertubes.banzaicloud.io/v1beta1
kind: ApplicationManifest
metadata:
  name: applicationmanifest
spec:
  clusterRegistry:
    enabled: false
    namespace: cluster-registry
  csrOperator:
    enabled: true
    namespace: csr-operator-system
    valuesOverride: |-
      image:
        repository: "registry.eticloud.io/csro/csr-operator"
      csroperator:
        config:
          privateCASigner:
            issuer:
              secretName: "${ISSUER_SECRET_NAME}"
              autoGenerated: ${ISSUER_AUTOGENERATED}
  imagePullSecretsOperator:
    enabled: false
    namespace: supertubes-system
  istioOperator:
    enabled: false
    namespace: istio-system
  kafkaMinion:
    enabled: true
  kafkaOperator:
    enabled: true
    namespace: kafka
    valuesOverride: |
      alertManager:
        permissivePeerAuthentication:
          create: true
      monitoring:
        grafanaDashboards:
          enabled: true
          label: app.kubernetes.io/supertubes_managed_grafana_dashboard
  prometheusOperator:
    enabled: true
    namespace: supertubes-system
    valuesOverride: |
      prometheus:
        prometheusSpec:
          resources:
            limits:
              cpu: 2
              memory: 2Gi
            requests:
              cpu: 1
              memory: 1Gi
  supertubes:
    enabled: true
    namespace: supertubes-system
    valuesOverride: |
      ui-backend:
        image:
          repository: "registry.eticloud.io/sdm/supertubes-ui"
        podLabels:
          smm.cisco.com/jwt-auth-from-ingress: "true"
      operator:
        image:
          repository: "registry.eticloud.io/sdm/supertubes"
        cruiseControlModules:
          image:
            repository: "registry.eticloud.io/sdm/cruisecontrol-modules"
        kafkaAuthAgent:
          image:
            repository: "registry.eticloud.io/sdm/kafka-authn-agent"
        kafkaModules:
          image:
            repository: "registry.eticloud.io/sdm/kafka-modules"
  zookeeperOperator:
    enabled: true
    namespace: zookeeper
EOF
For OpenShift:

Note: the controllerSettings.platform value must be set to openshift, and some components need additional resource requirements and special settings to run Streaming Data Manager in an OpenShift environment.

ISSUER_SECRET_NAME="sdm-csr-operator-ca-certs"
ISSUER_AUTOGENERATED="false"
cat > "manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml" << EOF
# manifests/sdm-applicationmanifest/sdm-applicationmanifest.yaml
apiVersion: supertubes.banzaicloud.io/v1beta1
kind: ApplicationManifest
metadata:
  name: applicationmanifest
spec:
  controllerSettings:
    platform: openshift
  clusterRegistry:
    enabled: false
    namespace: cluster-registry
  csrOperator:
    enabled: true
    namespace: csr-operator-system
    valuesOverride: |-
      image:
        repository: "registry.eticloud.io/csro/csr-operator"
      csroperator:
        config:
          privateCASigner:
            issuer:
              secretName: "${ISSUER_SECRET_NAME}"
              autoGenerated: ${ISSUER_AUTOGENERATED}
  imagePullSecretsOperator:
    enabled: false
    namespace: supertubes-system
  istioOperator:
    enabled: false
    namespace: istio-system
  kafkaMinion:
    enabled: true
  kafkaOperator:
    enabled: true
    namespace: kafka
    valuesOverride: |
      alertManager:
        permissivePeerAuthentication:
          create: true
      monitoring:
        grafanaDashboards:
          enabled: true
          label: app.kubernetes.io/supertubes_managed_grafana_dashboard
  prometheusOperator:
    enabled: true
    namespace: supertubes-system
    valuesOverride: |
      prometheus:
        prometheusSpec:
          resources:
            limits:
              cpu: 2
              memory: 4Gi
            requests:
              cpu: 1
              memory: 3Gi
      prometheus-node-exporter:
        service:
          port: 9123
          targetPort: 9123
      prometheusOperator:
        admissionWebhooks:
          createSecretJob:
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                  - "ALL"
          patchWebhookJob:
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                  - "ALL"
        containerSecurityContext:
          capabilities:
            drop:
              - "ALL"
        resources:
          limits:
            cpu: 400m
            memory: 400Mi
          requests:
            cpu: 200m
            memory: 200Mi
  supertubes:
    enabled: true
    namespace: supertubes-system
    valuesOverride: |
      ui-backend:
        image:
          repository: "registry.eticloud.io/sdm/supertubes-ui"
        podLabels:
          smm.cisco.com/jwt-auth-from-ingress: "true"
      operator:
        image:
          repository: "registry.eticloud.io/sdm/supertubes"
        cruiseControlModules:
          image:
            repository: "registry.eticloud.io/sdm/cruisecontrol-modules"
        kafkaAuthAgent:
          image:
            repository: "registry.eticloud.io/sdm/kafka-authn-agent"
        kafkaModules:
          image:
            repository: "registry.eticloud.io/sdm/kafka-modules"
  zookeeperOperator:
    enabled: true
    namespace: zookeeper
EOF
- If you want to use a custom CA secret instead of one generated by the CSR-operator, modify the ApplicationManifest CR:

  - Set the csr-operator/valuesOverride/.../issuer/autoGenerated field to false.
  - Set the csr-operator/valuesOverride/.../issuer/secretName field to the name of your CA secret.
- Commit and push the calisti-gitops repository.

git add manifests/sdm-applicationmanifest
git commit -m "update sdm-applicationmanifest app"

Expected output:

[main 18df6de] update sdm-applicationmanifest app
 1 file changed, 6 insertions(+), 6 deletions(-)

git push

Expected output:

Enumerating objects: 9, done.
Counting objects: 100% (9/9), done.
Delta compression using up to 12 threads
Compressing objects: 100% (5/5), done.
Writing objects: 100% (5/5), 662 bytes | 662.00 KiB/s, done.
Total 5 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To github.com:github-id/calisti-gitops.git
   1ff5fae..18df6de  main -> main
- Sync the changes in Argo CD.

argocd app sync sdm-applicationmanifest

Expected output:

TIMESTAMP                  GROUP                      KIND                 NAMESPACE            NAME                 STATUS     HEALTH  HOOK  MESSAGE
2022-10-31T13:07:07+01:00  supertubes.banzaicloud.io  ApplicationManifest  smm-registry-access  applicationmanifest  OutOfSync

Name:               sdm-applicationmanifest
Project:            default
Server:             workload-cluster-1
Namespace:          smm-registry-access
URL:                https://a90baf9b2fd8e42ff8bbcbfb60ba59b0-779918959.us-east-1.elb.amazonaws.com/applications/sdm-applicationmanifest
Repo:               https://github.com/github-id/calisti-gitops.git
Target:             HEAD
Path:               manifests/sdm-applicationmanifest
SyncWindow:         Sync Allowed
Sync Policy:        Automated (Prune)
Sync Status:        Synced to HEAD (18df6de)
Health Status:      Healthy

Operation:          Sync
Sync Revision:      18df6de9a4b64863496e666d7e9217a1b10f351d
Phase:              Succeeded
Start:              2022-10-31 13:07:07 +0100 CET
Finished:           2022-10-31 13:07:08 +0100 CET
Duration:           1s
Message:            successfully synced (all tasks run)

GROUP                      KIND                 NAMESPACE            NAME                 STATUS  HEALTH  HOOK  MESSAGE
supertubes.banzaicloud.io  ApplicationManifest  smm-registry-access  applicationmanifest  Synced                applicationmanifest.supertubes.banzaicloud.io/applicationmanifest configured
- Wait about 5 minutes, then check that all pods are healthy and running in the supertubes-system, kafka, and zookeeper namespaces on workload-cluster-1.

kubectl get pods -n supertubes-system --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"

Expected output:

NAME                                                      READY   STATUS    RESTARTS      AGE
prometheus-operator-grafana-6c48dc7db9-qvz4r              4/4     Running   0             32m
prometheus-operator-kube-state-metrics-6cfbf4ff4c-zkm7t   2/2     Running   2 (32m ago)   32m
prometheus-operator-operator-785464cdb5-fpckj             2/2     Running   2 (32m ago)   32m
prometheus-operator-prometheus-node-exporter-85f4h        1/1     Running   0             32m
prometheus-operator-prometheus-node-exporter-bgnd2        1/1     Running   0             32m
prometheus-operator-prometheus-node-exporter-dn22n        1/1     Running   0             32m
prometheus-operator-prometheus-node-exporter-sk2nc        1/1     Running   0             32m
prometheus-prometheus-operator-prometheus-0               3/3     Running   0             32m
supertubes-6d4f68655b-jrgml                               3/3     Running   2 (31m ago)   31m
supertubes-ui-backend-55976fc6fc-mc9t5                    2/2     Running   2 (31m ago)   31m

kubectl get pods -n kafka --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"

Expected output:

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
kafka-operator-operator   1/1     1            1           39m

kubectl get pods -n zookeeper --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"

Expected output:

NAME                                            READY   STATUS      RESTARTS      AGE
zookeeper-operator-77df49fd-tv7dl               2/2     Running     1 (38m ago)   38m
zookeeper-operator-post-install-upgrade-5vzkp   0/1     Completed   0             38m
Deploy the ZooKeeper cluster application
- Create the sdm-zookeeper-cluster ZookeeperCluster CR in the manifests/sdm-zookeeper-cluster directory. The cluster name zookeeper is referenced later by the KafkaCluster CR ($ZOOKEEPER_CLUSTER_NAME-client.zookeeper:2181).

mkdir -p manifests/sdm-zookeeper-cluster

cat > manifests/sdm-zookeeper-cluster/sdm-zookeeper-cluster.yaml <<EOF
# manifests/sdm-zookeeper-cluster/sdm-zookeeper-cluster.yaml
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
  name: zookeeper
  namespace: zookeeper
spec:
  replicas: 3
  persistence:
    reclaimPolicy: Delete
EOF
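After the sdm-zookeeper-cluster application is synced (next steps), you can watch the cluster come up through its custom resource; the plural resource name follows from the zookeeper.pravega.io API group used above:

kubectl get zookeeperclusters.zookeeper.pravega.io -n zookeeper --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"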
- Create the sdm-zookeeper-cluster Application CR in the apps/sdm-zookeeper-cluster directory.

mkdir -p apps/sdm-zookeeper-cluster

ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > apps/sdm-zookeeper-cluster/sdm-zookeeper-cluster-app.yaml <<EOF
# apps/sdm-zookeeper-cluster/sdm-zookeeper-cluster-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-zookeeper-cluster
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: manifests/sdm-zookeeper-cluster
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: zookeeper
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
EOF
- Commit and push the calisti-gitops repository.

git add apps/sdm-zookeeper-cluster manifests/sdm-zookeeper-cluster
git commit -m "add sdm-zookeeper-cluster app"

Expected output:

[main d3e8b60] add sdm-zookeeper-cluster app
 2 files changed, 36 insertions(+)
 create mode 100644 apps/sdm-zookeeper-cluster/sdm-zookeeper-cluster-app.yaml
 create mode 100644 manifests/sdm-zookeeper-cluster/sdm-zookeeper-cluster.yaml

git push

Expected output:

Enumerating objects: 11, done.
Counting objects: 100% (11/11), done.
Delta compression using up to 12 threads
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 1.24 KiB | 1.24 MiB/s, done.
Total 8 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To github.com:github-id/calisti-gitops.git
   18df6de..d3e8b60  main -> main
- Apply the sdm-zookeeper-cluster Application CR.

kubectl apply -f apps/sdm-zookeeper-cluster/sdm-zookeeper-cluster-app.yaml
Expected output:
application.argoproj.io/sdm-zookeeper-cluster created
- Verify that the application has been added to Argo CD and is healthy.

argocd app list

Expected output:

NAME                   CLUSTER             NAMESPACE  PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                             TARGET
...
sdm-zookeeper-cluster  workload-cluster-1  zookeeper  default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  manifests/sdm-zookeeper-cluster  HEAD
...
- Wait about 5 minutes, then check that all pods are healthy and running in the zookeeper namespace on workload-cluster-1.

kubectl get pods --namespace zookeeper --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"

Expected output:

NAME                                            READY   STATUS      RESTARTS      AGE
zookeeper-0                                     2/2     Running     0             113s
zookeeper-1                                     2/2     Running     0             79s
zookeeper-2                                     2/2     Running     0             42s
zookeeper-operator-5698f684bb-b7lf6             2/2     Running     2 (12m ago)   12m
zookeeper-operator-post-install-upgrade-xbj8d   0/1     Completed   0             12m
Deploy the Kafka cluster application
- Create the kafka KafkaCluster CR in the manifests/sdm-kafka-cluster directory.

mkdir -p manifests/sdm-kafka-cluster
- Create your custom KafkaCluster CR in the manifests/sdm-kafka-cluster directory based on the samples and the Creating your own KafkaCluster section of the documentation. Edit your KafkaCluster CR and set the .spec.zkAddresses field to $ZOOKEEPER_CLUSTER_NAME-client.zookeeper:2181.

ISTIO_MINOR_VERSION="1.15"
BANZAI_KAFKA_VERSION="2.13-3.1.0"
ZOOKEEPER_CLUSTER_NAME="zookeeper"
cat > "manifests/sdm-kafka-cluster/sdm-kafka-cluster.yaml" <<EOF
# manifests/sdm-kafka-cluster/sdm-kafka-cluster.yaml
apiVersion: kafka.banzaicloud.io/v1beta1
kind: KafkaCluster
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: kafka
  namespace: kafka
spec:
  headlessServiceEnabled: false
  ingressController: "istioingress"
  istioIngressConfig:
    gatewayConfig:
      mode: ISTIO_MUTUAL
  istioControlPlane:
    name: sdm-icp-v${ISTIO_MINOR_VERSION/.}x
    namespace: istio-system
  zkAddresses:
    - "${ZOOKEEPER_CLUSTER_NAME}-client.zookeeper:2181"
  # monitoringConfig:
  #   jmxImage: "ghcr.io/banzaicloud/jmx-javaagent:0.16.1"
  #   pathToJar: "/jmx_prometheus_javaagent.jar"
  #   kafkaJMXExporterConfig: |
  #     lowercaseOutputName: true
  #     lowercaseOutputLabelNames: true
  #     ssl: false
  #     whitelistObjectNames:
  #     - kafka.cluster:type=Partition,name=UnderMinIsr,*
  #     - kafka.cluster:type=Partition,name=UnderReplicated,*
  #     - kafka.controller:type=ControllerChannelManager,name=QueueSize,*
  #     - kafka.controller:type=ControllerChannelManager,name=TotalQueueSize
  #     - kafka.controller:type=ControllerStats,*
  #     - kafka.controller:type=KafkaController,*
  #     - kafka.log:type=Log,name=LogEndOffset,*
  #     - kafka.log:type=Log,name=LogStartOffset,*
  #     - kafka.log:type=Log,name=Size,*
  #     - kafka.log:type=LogManager,*
  #     - kafka.network:type=Processor,name=IdlePercent,*
  #     - kafka.network:type=RequestChannel,name=RequestQueueSize
  #     - kafka.network:type=RequestChannel,name=ResponseQueueSize,*
  #     - kafka.network:type=RequestMetrics,name=ErrorsPerSec,*
  #     - kafka.network:type=RequestMetrics,name=RequestsPerSec,*
  #     - kafka.network:type=RequestMetrics,name=LocalTimeMs,request=Produce
  #     - kafka.network:type=RequestMetrics,name=LocalTimeMs,request=FetchConsumer
  #     - kafka.network:type=RequestMetrics,name=LocalTimeMs,request=FetchFollower
  #     - kafka.network:type=RequestMetrics,name=RemoteTimeMs,request=Produce
  #     - kafka.network:type=RequestMetrics,name=RemoteTimeMs,request=FetchConsumer
  #     - kafka.network:type=RequestMetrics,name=RemoteTimeMs,request=FetchFollower
  #     - kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request=Produce
  #     - kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request=FetchConsumer
  #     - kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request=FetchFollower
  #     - kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=Produce
  #     - kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=FetchConsumer
  #     - kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=FetchFollower
  #     - kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=Produce
  #     - kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchConsumer
  #     - kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=FetchFollower
  #     - kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce
  #     - kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer
  #     - kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower
  #     - kafka.network:type=SocketServer,name=NetworkProcessorAvgIdlePercent
  #     - kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,*
  #     - kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,*
  #     - kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,*
  #     - kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesInPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=ReassignmentBytesOutPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesInPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=ReplicationBytesOutPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=TotalFetchRequestsPerSec,*
  #     - kafka.server:type=BrokerTopicMetrics,name=TotalProduceRequestsPerSec,*
  #     - kafka.server:type=BrokerTopicMetrics,name=FetchMessageConversionsPerSec
  #     - kafka.server:type=BrokerTopicMetrics,name=ProduceMessageConversionsPerSec
  #     - kafka.server:type=DelayedOperationPurgatory,*
  #     - kafka.server:type=FetcherLagMetrics,name=ConsumerLag,*
  #     - kafka.server:type=FetcherStats,name=BytesPerSec,*
  #     - kafka.server:type=KafkaRequestHandlerPool,*
  #     - kafka.server:type=KafkaServer,name=BrokerState
  #     - kafka.server:type=Fetch
  #     - kafka.server:type=Produce
  #     - kafka.server:type=Request
  #     - kafka.server:type=app-info,*
  #     - kafka.server:type=ReplicaFetcherManager,*
  #     - kafka.server:type=ReplicaManager,*
  #     - kafka.server:type=SessionExpireListener,*
  #     - kafka.server:type=ZooKeeperClientMetrics,name=ZooKeeperRequestLatencyMs
  #     - kafka.server:type=socket-server-metrics,listener=*,*
  #     - java.lang:type=*
  #     - java.nio:*
  #     rules:
  #     - pattern: kafka.cluster<type=(Partition), name=(UnderMinIsr|UnderReplicated), topic=([-.\w]+), partition=(\d+)><>(Value)
  #       name: kafka_controller_$1_$2_$5
  #       type: GAUGE
  #       labels:
  #         topic: $3
  #         partition: $4
  #       cache: true
  #     - pattern: kafka.controller<type=(ControllerChannelManager), name=(QueueSize), broker-id=(\d+)><>(Value)
  #       name: kafka_controller_$1_$2_$4
  #       type: GAUGE
  #       labels:
  #         broker_id: $3
  #       cache: true
  #     - pattern: kafka.controller<type=(ControllerChannelManager), name=(TotalQueueSize)><>(Value)
  #       name: kafka_controller_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.controller<type=(ControllerStats), name=([-.\w]+)><>(Count)
  #       name: kafka_controller_$1_$2_$3
  #       type: COUNTER
  #       cache: true
  #     - pattern: kafka.controller<type=(KafkaController), name=([-.\w]+)><>(Value)
  #       name: kafka_controller_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.log<type=Log, name=(LogEndOffset|LogStartOffset|Size), topic=([-.\w]+), partition=(\d+)><>(Value)
  #       name: kafka_log_$1_$4
  #       type: GAUGE
  #       labels:
  #         topic: $2
  #         partition: $3
  #       cache: true
  #     - pattern: kafka.log<type=(LogManager), name=(LogDirectoryOffline), logDirectory="(.+)"><>(Value)
  #       name: kafka_log_$1_$2_$4
  #       labels:
  #         log_directory: $3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.log<type=(LogManager), name=(OfflineLogDirectoryCount)><>(Value)
  #       name: kafka_log_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.network<type=(Processor), name=(IdlePercent), networkProcessor=(\d+)><>(Value)
  #       name: kafka_network_$1_$2_$4
  #       labels:
  #         network_processor: $3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.network<type=(RequestChannel), name=(RequestQueueSize)><>(Value)
  #       name: kafka_network_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.network<type=(RequestChannel), name=(ResponseQueueSize), processor=(\d+)><>(Value)
  #       name: kafka_network_$1_$2_$4
  #       type: GAUGE
  #       labels:
  #         processor: $3
  #       cache: true
  #     - pattern: kafka.network<type=(RequestMetrics), name=(ErrorsPerSec), request=(\w+), error=(\w+)><>(Count)
  #       name: kafka_network_$1_$2_$5
  #       type: COUNTER
  #       labels:
  #         request: $3
  #         error: $4
  #       cache: true
  #     - pattern: kafka.network<type=(RequestMetrics), name=(RequestsPerSec), request=(\w+), version=(\d+)><>(Count)
  #       name: kafka_network_$1_$2_$5
  #       type: COUNTER
  #       labels:
  #         request: $3
  #         version: $4
  #       cache: true
  #     - pattern: kafka.network<type=(RequestMetrics), name=(\w+TimeMs), request=(\w+)><>(Count)
  #       name: kafka_network_$1_$2_$4
  #       type: COUNTER
  #       labels:
  #         request: $3
  #       cache: true
  #     - pattern: kafka.network<type=(SocketServer), name=(NetworkProcessorAvgIdlePercent)><>(Value)
  #       name: kafka_network_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.server<type=(BrokerTopicMetrics), name=(\w+PerSec), topic=([-.\w]+)><>(Count)
  #       name: kafka_server_$1_$2_$4
  #       type: COUNTER
  #       labels:
  #         topic: $3
  #       cache: true
  #     - pattern: kafka.server<type=(BrokerTopicMetrics), name=(\w+PerSec)><>(Count)
  #       name: kafka_server_$1_$2_total_$3
  #       type: COUNTER
  #       cache: true
  #     - pattern: kafka.server<type=(DelayedOperationPurgatory), name=(\w+), delayedOperation=(\w+)><>(Value)
  #       name: kafka_server_$1_$2_$4
  #       type: GAUGE
  #       labels:
  #         delayed_operation: $3
  #       cache: true
  #     - pattern: kafka.server<type=(FetcherLagMetrics), name=(ConsumerLag), clientId=([-.\w]+), topic=([-.\w]+), partition=(\d+)><>(Value)
  #       name: kafka_server_$1_$2_$6
  #       type: GAUGE
  #       labels:
  #         client_id: $3
  #         topic: $4
  #         partition: $5
  #       cache: true
  #     - pattern: kafka.server<type=(FetcherStats), name=(\w+PerSec), clientId=([-.\w]+), brokerHost=([-.\w]+), brokerPort=(\d+)><>(Count)
  #       name: kafka_server_$1_$2_$6
  #       type: COUNTER
  #       labels:
  #         client_id: $3
  #         broker_host: $4
  #         broker_port: $5
  #       cache: true
  #     - pattern: kafka.server<type=(KafkaRequestHandlerPool), name=(\w+)><>(Count)
  #       name: kafka_server_$1_$2_$3
  #       type: COUNTER
  #       cache: true
  #     - pattern: kafka.server<type=(KafkaRequestHandlerPool), name=(\w+)><>(OneMinuteRate)
  #       name: kafka_server_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.server<type=(KafkaServer), name=(\w+)><>(Value)
  #       name: kafka_server_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.server<type=(Fetch|Produce|Request)><>(queue-size)
  #       name: kafka_server_$1_$2
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.server<type=(app-info), id=(\d+)><>(StartTimeMs)
  #       name: kafka_server_$1_$3
  #       type: COUNTER
  #       labels:
  #         broker_id: $2
  #       cache: true
  #     - pattern: 'kafka.server<type=(app-info), id=(\d+)><>(Version): ([-.~+\w\d]+)'
  #       name: kafka_server_$1_$3
  #       type: COUNTER
  #       labels:
  #         broker_id: $2
  #         version: $4
  #       value: 1.0
  #       cache: false
  #     - pattern: kafka.server<type=(ReplicaFetcherManager), name=(\w+), clientId=([-.\w]+)><>(Value)
  #       name: kafka_server_$1_$2_$4
  #       type: GAUGE
  #       labels:
  #         client_id: $3
  #       cache: true
  #     - pattern: kafka.server<type=(ReplicaManager), name=(\w+)><>(Value)
  #       name: kafka_server_$1_$2_$3
  #       cache: true
  #     - pattern: kafka.server<type=(ReplicaManager), name=(\w+)><>(Count)
  #       name: kafka_server_$1_$2_$3
  #       type: COUNTER
  #       cache: true
  #     - pattern: kafka.server<type=(SessionExpireListener), name=(\w+)><>(Value)
  #       name: kafka_server_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.server<type=(SessionExpireListener), name=(\w+)><>(Count)
  #       name: kafka_server_$1_$2_$3
  #       type: COUNTER
  #       cache: true
  #     - pattern: kafka.server<type=(ZooKeeperClientMetrics), name=(\w+)><>(\d+)thPercentile
  #       name: kafka_server_$1_$2
  #       type: GAUGE
  #       labels:
  #         quantile: 0.$3
  #       cache: true
  #     - pattern: kafka.server<type=(ZooKeeperClientMetrics), name=(\w+)><>(Mean|Max|StdDev)
  #       name: kafka_server_$1_$2_$3
  #       type: GAUGE
  #       cache: true
  #     - pattern: kafka.server<type=(socket-server-metrics), listener=([-.\w]+), networkProcessor=(\d+)><>([-\w]+(?:-count|-rate))
  #       name: kafka_server_$1_$4
  #       type: GAUGE
  #       labels:
  #         listener: $2
  #         network_processor: $3
  #       cache: true
  #     - pattern: java.lang.+
  #       cache: true
  #     - pattern: java.nio.+
  #       cache: true
  oneBrokerPerNode: false
  clusterImage: "ghcr.io/banzaicloud/kafka:${BANZAI_KAFKA_VERSION}"
  readOnlyConfig: |
    auto.create.topics.enable=false
    offsets.topic.replication.factor=2
    cruise.control.metrics.topic.auto.create=true
    cruise.control.metrics.topic.num.partitions=12
    cruise.control.metrics.topic.replication.factor=2
    cruise.control.metrics.topic.min.insync.replicas=1
    super.users=User:CN=kafka-default;User:CN=kafka-kafka-operator;User:CN=supertubes-system-supertubes;User:CN=supertubes-system-supertubes-ui
  brokerConfigGroups:
    default:
      serviceAccountName: default
      brokerAnnotations:
        prometheus.istio.io/merge-metrics: "false"
      storageConfigs:
        - mountPath: "/kafka-logs"
          pvcSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 10Gi
  brokers:
    - id: 0
      brokerConfigGroup: "default"
    - id: 1
      brokerConfigGroup: "default"
  rollingUpgradeConfig:
    failureThreshold: 1
  listenersConfig:
    internalListeners:
      - type: "plaintext"
        name: "internal"
        containerPort: 29092
        usedForInnerBrokerCommunication: true
      - type: "plaintext"
        name: "controller"
        containerPort: 29093
        usedForInnerBrokerCommunication: false
        usedForControllerCommunication: true
  cruiseControlConfig:
    serviceAccountName: default
    config: |
      metadata.max.age.ms=60000
      client.id=kafka-cruise-control
      send.buffer.bytes=131072
      receive.buffer.bytes=131072
      connections.max.idle.ms=540000
      reconnect.backoff.ms=50
      request.timeout.ms=30000
      logdir.response.timeout.ms=10000
      num.metric.fetchers=1
      metric.sampler.class=com.linkedin.kafka.cruisecontrol.monitor.sampling.CruiseControlMetricsReporterSampler
      sampling.allow.cpu.capacity.estimation=true
      metric.reporter.topic=__CruiseControlMetrics
      sample.store.class=com.linkedin.kafka.cruisecontrol.monitor.sampling.KafkaSampleStore
      partition.metric.sample.store.topic=__KafkaCruiseControlPartitionMetricSamples
      broker.metric.sample.store.topic=__KafkaCruiseControlModelTrainingSamples
      sample.store.topic.replication.factor=2
      num.sample.loading.threads=8
      metric.sampler.partition.assignor.class=com.linkedin.kafka.cruisecontrol.monitor.sampling.DefaultMetricSamplerPartitionAssignor
      metric.sampling.interval.ms=120000
      partition.metrics.window.ms=300000
      num.partition.metrics.windows=12
      min.samples.per.partition.metrics.window=1
      broker.metrics.window.ms=300000
      num.broker.metrics.windows=20
      min.samples.per.broker.metrics.window=1
      capacity.config.file=config/capacity.json
      prometheus.server.endpoint=http://localhost:9090
      prometheus.query.resolution.step.ms=60000
      prometheus.query.supplier=com.linkedin.kafka.cruisecontrol.monitor.sampling.prometheus.DefaultPrometheusQuerySupplier
default.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PotentialNwOutGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.TopicReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderBytesInDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.kafkaassigner.KafkaAssignerDiskUsageDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.kafkaassigner.KafkaAssignerEvenRackAwareGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.PreferredLeaderElectionGoal intra.broker.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.IntraBrokerDiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.IntraBrokerDiskUsageDistributionGoal hard.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal min.valid.partition.ratio=0.95 cpu.balance.threshold=1.1 disk.balance.threshold=1.1 network.inbound.balance.threshold=1.1 network.outbound.balance.threshold=1.1 replica.count.balance.threshold=1.1 cpu.capacity.threshold=0.7 disk.capacity.threshold=0.8 network.inbound.capacity.threshold=0.8 network.outbound.capacity.threshold=0.8 cpu.low.utilization.threshold=0.15 
disk.low.utilization.threshold=0.15 network.inbound.low.utilization.threshold=0.15 network.outbound.low.utilization.threshold=0.15 metric.anomaly.percentile.upper.threshold=90.0 metric.anomaly.percentile.lower.threshold=10.0 proposal.expiration.ms=60000 max.replicas.per.broker=10000 num.proposal.precompute.threads=1 topics.excluded.from.partition.movement= leader.replica.count.balance.threshold=1.1 topic.replica.count.balance.threshold=3.0 goal.balancedness.priority.weight=1.1 goal.balancedness.strictness.weight=1.5 default.replica.movement.strategies=com.linkedin.kafka.cruisecontrol.executor.strategy.BaseReplicaMovementStrategy topic.replica.count.balance.min.gap=2 topic.replica.count.balance.max.gap=40 maintenance.event.class=com.linkedin.kafka.cruisecontrol.detector.MaintenanceEvent maintenance.event.reader.class=com.linkedin.kafka.cruisecontrol.detector.NoopMaintenanceEventReader maintenance.event.enable.idempotence=true maintenance.event.idempotence.retention.ms=180000 maintenance.event.max.idempotence.cache.size=25 maintenance.event.stop.ongoing.execution=true zookeeper.security.enabled=false num.concurrent.partition.movements.per.broker=10 num.concurrent.intra.broker.partition.movements=2 num.concurrent.leader.movements=1000 execution.progress.check.interval.ms=10000 anomaly.notifier.class=com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier metric.anomaly.finder.class=com.linkedin.kafka.cruisecontrol.detector.KafkaMetricAnomalyFinder anomaly.detection.interval.ms=300000 goal.violation.detection.interval.ms=300000 metric.anomaly.detection.interval.ms=300000 disk.failure.detection.interval.ms=300000 topic.anomaly.detection.interval.ms=300000 anomaly.detection.goals=com.linkedin.kafka.cruisecontrol.analyzer.goals.RackAwareDistributionGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.DiskCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkInboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.NetworkOutboundCapacityGoal,com.linkedin.kafka.cruisecontrol.analyzer.goals.CpuCapacityGoal metric.anomaly.analyzer.metrics=BROKER_PRODUCE_LOCAL_TIME_MS_50TH,BROKER_PRODUCE_LOCAL_TIME_MS_999TH,BROKER_CONSUMER_FETCH_LOCAL_TIME_MS_50TH,BROKER_CONSUMER_FETCH_LOCAL_TIME_MS_999TH,BROKER_FOLLOWER_FETCH_LOCAL_TIME_MS_50TH,BROKER_FOLLOWER_FETCH_LOCAL_TIME_MS_999TH,BROKER_LOG_FLUSH_TIME_MS_50TH,BROKER_LOG_FLUSH_TIME_MS_999TH broker.failure.exclude.recently.demoted.brokers=true broker.failure.exclude.recently.removed.brokers=true goal.violation.exclude.recently.demoted.brokers=true goal.violation.exclude.recently.removed.brokers=true failed.brokers.zk.path=/CruiseControlBrokerList topic.config.provider.class=com.linkedin.kafka.cruisecontrol.config.KafkaTopicConfigProvider cluster.configs.file=config/clusterConfigs.json completed.kafka.monitor.user.task.retention.time.ms=86400000 completed.cruise.control.monitor.user.task.retention.time.ms=86400000 completed.kafka.admin.user.task.retention.time.ms=604800000 completed.cruise.control.admin.user.task.retention.time.ms=604800000 completed.user.task.retention.time.ms=86400000 demotion.history.retention.time.ms=1209600000 removal.history.retention.time.ms=1209600000 max.cached.completed.kafka.monitor.user.tasks=20 max.cached.completed.cruise.control.monitor.user.tasks=20 max.cached.completed.kafka.admin.user.tasks=30 max.cached.completed.cruise.control.admin.user.tasks=30 max.cached.completed.user.tasks=500 max.active.user.tasks=25 
self.healing.enabled=true self.healing.broker.failure.enabled=true self.healing.goal.violation.enabled=true self.healing.metric.anomaly.enabled=true self.healing.disk.failure.enabled=true self.healing.topic.anomaly.enabled=false self.healing.exclude.recently.demoted.brokers=true self.healing.exclude.recently.removed.brokers=true topic.anomaly.finder.class=com.linkedin.kafka.cruisecontrol.detector.NoopTopicAnomalyFinder self.healing.maintenance.event.enabled=true goal.violation.distribution.threshold.multiplier=1.0 self.healing.target.topic.replication.factor=1 topic.excluded.from.replication.factor.check= topic.replication.topic.anomaly.class=com.linkedin.kafka.cruisecontrol.detector.TopicReplicationFactorAnomaly topic.replication.factor.margin=1 topic.min.isr.record.retention.time.ms=43200000 webserver.http.port=9090 webserver.http.address=0.0.0.0 webserver.http.cors.enabled=false webserver.http.cors.origin=http://localhost:8080/ webserver.http.cors.allowmethods=OPTIONS,GET,POST webserver.http.cors.exposeheaders=User-Task-ID webserver.api.urlprefix=/kafkacruisecontrol/* webserver.ui.diskpath=./cruise-control-ui/dist/ webserver.ui.urlprefix=/* webserver.request.maxBlockTimeMs=10000 webserver.session.maxExpiryTimeMs=60000 webserver.session.path=/ webserver.accesslog.enabled=true webserver.accesslog.path=access.log webserver.accesslog.retention.days=14 two.step.verification.enabled=false two.step.purgatory.retention.time.ms=1209600000 two.step.purgatory.max.requests=25 clusterConfig: | { "min.insync.replicas": 1 } EOF
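Optionally, verify that the shell expanded the variables in the generated manifest before committing it. The following is a minimal sketch using `grep`: with `ISTIO_MINOR_VERSION="1.15"`, the Istio control plane reference should read `sdm-icp-v115x`, and no unexpanded `${...}` references should remain in the file.

```bash
# Expect the expanded control-plane name, for example: name: sdm-icp-v115x
grep -n 'sdm-icp-v' manifests/sdm-kafka-cluster/sdm-kafka-cluster.yaml

# Expect no output here: any match means a variable was not expanded.
grep -Fn '${' manifests/sdm-kafka-cluster/sdm-kafka-cluster.yaml
```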
- Create the `sdm-kafka-cluster` Application CR in the `apps/sdm-kafka-cluster` directory.

```bash
mkdir -p apps/sdm-kafka-cluster
```

```bash
ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}"
cat > apps/sdm-kafka-cluster/sdm-kafka-cluster-app.yaml <<EOF
# apps/sdm-kafka-cluster/sdm-kafka-cluster-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: sdm-kafka-cluster
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
    targetRevision: HEAD
    path: manifests/sdm-kafka-cluster
  destination:
    name: ${ARGOCD_CLUSTER_NAME}
    namespace: kafka
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - Validate=false
      - CreateNamespace=true
      - PrunePropagationPolicy=foreground
      - PruneLast=true
EOF
```
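The Application CR above references the `GITHUB_ID` and `GITHUB_REPOSITORY_NAME` variables that were set earlier when the Git repository was prepared. If you are running these steps in a new shell, a quick guard like the following sketch can catch an empty `repoURL` before it reaches Argo CD:

```bash
# Abort with an error if either repository variable is unset or empty.
: "${GITHUB_ID:?GITHUB_ID is not set}"
: "${GITHUB_REPOSITORY_NAME:?GITHUB_REPOSITORY_NAME is not set}"

# Double-check the rendered repository URL in the generated CR.
grep 'repoURL' apps/sdm-kafka-cluster/sdm-kafka-cluster-app.yaml
```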
- Commit and push the `calisti-gitops` repository.

```bash
git add apps/sdm-kafka-cluster manifests/sdm-kafka-cluster
git commit -m "add sdm-kafka-cluster app"
```

Expected output:

```
[main 13402f4] add sdm-kafka-cluster app
 2 files changed, 712 insertions(+)
 create mode 100644 apps/sdm-kafka-cluster/sdm-kafka-cluster-app.yaml
 create mode 100644 manifests/sdm-kafka-cluster/sdm-kafka-cluster.yaml
```

```bash
git push
```

Expected output:

```
Enumerating objects: 11, done.
Counting objects: 100% (11/11), done.
Delta compression using up to 12 threads
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 8.93 KiB | 4.47 MiB/s, done.
Total 8 (delta 3), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (3/3), completed with 3 local objects.
To github.com:github-id/calisti-gitops.git
   d3e8b60..13402f4  main -> main
```
- Apply the `sdm-kafka-cluster` Application CR.

```bash
kubectl apply -f apps/sdm-kafka-cluster/sdm-kafka-cluster-app.yaml
```

Expected output:

```
application.argoproj.io/sdm-kafka-cluster created
```
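Because the application is created with an automated sync policy, Argo CD starts deploying it on its own. If you do not want to wait for the next reconciliation cycle, you can trigger the sync manually with the `argocd` CLI (this sketch assumes you are already logged in to the Argo CD server):

```bash
# Start a sync immediately instead of waiting for automation.
argocd app sync sdm-kafka-cluster

# Show the current sync and health status of the application.
argocd app get sdm-kafka-cluster
```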
- Verify that the application has been added to Argo CD and is healthy.

```bash
argocd app list
```

Expected output:

```
NAME               CLUSTER             NAMESPACE  PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                              PATH                         TARGET
...
sdm-kafka-cluster  workload-cluster-1  kafka      default  Synced  Healthy  Auto-Prune  <none>      https://github.com/github-id/calisti-gitops.git  manifests/sdm-kafka-cluster  HEAD
...
```
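Instead of polling `argocd app list`, you can block until the application becomes synced and healthy. The broker pods can take several minutes to start, so this sketch uses a generous timeout (in seconds):

```bash
# Wait up to 10 minutes for the application to report Synced and Healthy.
argocd app wait sdm-kafka-cluster --sync --health --timeout 600
```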
- Wait about 5 minutes, then check that all pods in the `kafka` namespace on `workload-cluster-1` are healthy and running.

```bash
kubectl get pods -n kafka --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
```

Expected output:

```
NAME                                       READY   STATUS    RESTARTS   AGE
kafka-0-j5vq4                              2/2     Running   0          5d
kafka-1-dv9p7                              2/2     Running   0          5d
kafka-2-wx55p                              2/2     Running   0          5d
...
kafka-cruisecontrol-775fd5f6fb-d5tml       2/2     Running   0          5d
kafka-kminion-59cf847cdb-hc7zt             2/2     Running   3          5d
kafka-operator-operator-6cb66c5dbd-7mwxv   3/3     Running   2          5d
```
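As an additional check, the Kafka operator reports the overall cluster state in the status of the `KafkaCluster` custom resource; on a fully reconciled cluster this typically shows `ClusterRunning`:

```bash
# Print the reconciliation state of the KafkaCluster resource.
kubectl get kafkacluster kafka -n kafka \
  --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" \
  --context "${WORKLOAD_CLUSTER_1_CONTEXT}" \
  -o jsonpath='{.status.state}'
```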