Upgrade SMM - GitOps - single cluster
This document describes how to upgrade Calisti and a business application.
- If you are using Calisti on a single cluster, follow this guide.
- If you are using Calisti in a multi-cluster setup, see Upgrade SMM - GitOps - multi-cluster.
CAUTION:
Do not push secrets directly into the Git repository, especially when it is a public repository. Argo CD provides solutions to keep secrets safe.
Prerequisites
To complete this procedure, you need:
- A free registration for the Calisti download page
- A Kubernetes cluster running Argo CD (called management-cluster in the examples).
- A Kubernetes cluster running the previous version of Calisti (called workload-cluster-1 in the examples). It is assumed that Calisti has been installed on this cluster as described in the Calisti 1.12.0 documentation, and that the cluster meets the resource requirements of Calisti version 1.12.1.
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- Red Hat OpenShift 4.11
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Calisti resource requirements
Make sure that your Kubernetes or OpenShift cluster has sufficient resources to install Calisti. The following table shows the number of resources needed on the cluster:
| Resource | Required |
| --- | --- |
| CPU | 32 vCPU in total, with 4 vCPU available for allocation per worker node. (If you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS.) |
| Memory | 64 GiB in total, with 4 GiB available for allocation per worker node for a Kubernetes cluster (8 GiB per worker node for an OpenShift cluster). |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics). |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
When enough headroom is available in the cluster, the default installation should be able to support at least 150 running Pods and the same number of Services. To set up Service Mesh Manager for bigger workloads, see Scaling Service Mesh Manager.
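Before starting the upgrade, you can optionally verify that the worker nodes still report enough allocatable CPU and memory. This is a quick sanity check, not part of the official procedure; run it with your kubeconfig pointing at the workload cluster:
# List the allocatable CPU and memory each node reports to the Kubernetes scheduler.
kubectl get nodes -o custom-columns='NODE:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'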
This document describes how to upgrade Service Mesh Manager version 1.12.0 to Service Mesh Manager version 1.12.1.
Set up the environment
- Set the KUBECONFIG location and context name for the management-cluster cluster.
MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
MANAGEMENT_CLUSTER_CONTEXT=management-cluster
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
CURRENT   NAME                 CLUSTER              AUTHINFO   NAMESPACE
*         management-cluster   management-cluster
- Set the KUBECONFIG location and context name for the workload-cluster-1 cluster.
WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1
kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
CURRENT   NAME                 CLUSTER              AUTHINFO   NAMESPACE
*         workload-cluster-1   workload-cluster-1
Repeat this step for any additional workload clusters you want to use.
- Add the cluster configurations to KUBECONFIG. Include any additional workload clusters you want to use.
KUBECONFIG=$KUBECONFIG:$MANAGEMENT_CLUSTER_KUBECONFIG:$WORKLOAD_CLUSTER_1_KUBECONFIG
- Make sure the management-cluster Kubernetes context is the current context.
kubectl config use-context "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
Switched to context "management-cluster".
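kubectl only picks up the merged configuration if KUBECONFIG is available as an environment variable in your shell. If the context switch above fails, the following optional check, which is not part of the original procedure, exports the variable and lists the merged contexts:
# Export the merged kubeconfig list and confirm that both contexts are visible.
export KUBECONFIG
kubectl config get-contexts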
Upgrade Service Mesh Manager
The high-level steps of the upgrade process are:
- Upgrade the smm-operator.
- If you are also running Streaming Data Manager on your Calisti cluster and have installed it using the GitOps guide, upgrade the sdm-operator chart.
- Upgrade the business applications (demo-app) to use the new control plane.
Upgrade the smm-operator
- Clone your calisti-gitops repository.
- Remove the old version (1.12.0) of the smm-operator Helm chart.
rm -rf charts/smm-operator
- Pull the new version (1.12.1) of the smm-operator Helm chart and extract it into the charts folder.
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.12.1
- Commit and push the changes to the Git repository.
git add .
git commit -m "upgrade smm to 1.12.1"
git push
- Wait a few minutes until the upgrade is completed. (To follow the sync from the command line, see the sketch after this list.)
- Open the Service Mesh Manager dashboard. On the Dashboard Overview page everything should look healthy, except for some validation issues. These validation issues indicate that the business application (demo-app) is behind the smm-controlplane and should be updated.
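If you prefer to follow the upgrade from the command line instead of the Argo CD UI, you can watch the Argo CD Application that manages the smm-operator chart. This is a minimal sketch: the application name smm-operator and the argocd namespace are assumptions based on a typical GitOps setup and may differ in your repository.
# Watch the sync and health status of the smm-operator Application (names assumed; adjust to your setup).
kubectl --context "${MANAGEMENT_CLUSTER_CONTEXT}" get applications.argoproj.io -n argocd smm-operator -w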
Upgrade the sdm-operator
If you are also running Streaming Data Manager on your Calisti cluster and have installed it using the GitOps guide, upgrade the sdm-operator chart. Otherwise, skip this section and upgrade your business applications.
- Check your username and password on the download page.
- Remove the previously installed sdm-operator chart from the GitOps repository with the following command:
rm -rf charts/supertubes-control-plane
- Download the sdm-operator chart from registry.eticloud.io into the charts directory of your Streaming Data Manager GitOps repository and extract it. Run the following commands:
export HELM_EXPERIMENTAL_OCI=1 # Needed prior to Helm version 3.8.0
echo "${CALISTI_PASSWORD}" | helm registry login registry.eticloud.io -u "${CALISTI_USERNAME}" --password-stdin
Expected output:
Login Succeeded
helm pull oci://registry.eticloud.io/sdm-charts/supertubes-control-plane --destination ./charts/ --untar --version 1.12.1
Expected output:
Pulled: registry.eticloud.io/sdm-charts/supertubes-control-plane:1.12.1
Digest: sha256:someshadigest
- Modify the sdm-operator Application CR by editing the apps/sdm-operator/sdm-operator-app.yaml file in the GitOps repository.
Note: This is needed because of a change in the sdm-operator Helm chart: the imagePullSecrets Helm value must be extended with a name key, as in the following example.
spec:
  ...
  source:
    ...
    helm:
      values: |
        imagePullSecrets:
          - name: "smm-registry.eticloud.io-pull-secret"
  ...
- Commit the changes and push the repository.
git add .
git commit -m "Update sdm-operator"
git push origin
- Apply the modified Application CR.
kubectl apply -f "apps/sdm-operator/sdm-operator-app.yaml"
Expected output:
application.argoproj.io/sdm-operator configured
- Follow the progress of the upgrade by checking the status of the ApplicationManifest. You can do this on the Argo CD UI, or with the following command (a shorter status-only check is sketched after this list):
kubectl describe applicationmanifests.supertubes.banzaicloud.io -n smm-registry-access applicationmanifest
Expected output when the upgrade is finished:
Status:
  Cluster ID: ...
  Components:
    Cluster Registry:
      Image:   ghcr.io/cisco-open/cluster-registry-controller:...
      Status:  Removed
    Csr Operator:
      Image:   registry.eticloud.io/csro/csr-operator:...
      Status:  Available
    Imps Operator:
      Image Pull Secret Status:  Unmanaged
      Status:                    Removed
    Istio Operator:
      Status:  Removed
    Kafka Operator:
      Image:   ghcr.io/banzaicloud/kafka-operator:...
      Status:  Available
    Monitoring:
      Status:  Available
    Supertubes:
      Image:   registry.eticloud.io/sdm/supertubes:...
      Status:  Available
    Zookeeper Operator:
      Image:   pravega/zookeeper-operator:...
      Status:  Available
  Status:  Succeeded
- If the following error shows up in the ApplicationManifest under the Message field:
resource type is not allowed to be recreated: Job.batch "zookeeper-operator-post-install-upgrade" is invalid...
Delete the zookeeper-operator-post-install-upgrade job so it is recreated when ZooKeeper is reconciled:
kubectl delete job -n zookeeper zookeeper-operator-post-install-upgrade
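If you only need the aggregated result rather than the full describe output, you can query the status field directly. This is a sketch under the assumption that the aggregated value shown as Status: Succeeded above is exposed at .status.status on the ApplicationManifest resource; adjust the field path if your CRD version differs.
# Print only the aggregated ApplicationManifest status (expected value: Succeeded); the field path is assumed.
kubectl get applicationmanifests.supertubes.banzaicloud.io -n smm-registry-access applicationmanifest -o jsonpath='{.status.status}{"\n"}'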
Upgrade Demo application
- Update the demo-app.
kubectl --context "${WORKLOAD_CLUSTER_1_CONTEXT}" -n smm-demo rollout restart deploy
- Check the dashboard.
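In addition to the dashboard, you can confirm from the command line that the demo application pods were re-created by the rollout restart. This is an optional check, not part of the original procedure:
# Confirm that the restarted demo-app pods are Running and Ready.
kubectl --context "${WORKLOAD_CLUSTER_1_CONTEXT}" -n smm-demo get pods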