Upgrade SMM - GitOps - single cluster

This document describes how to upgrade Calisti and a business application.

CAUTION:

Do not push secrets directly into the Git repository, especially if it is a public repository. The Argo CD ecosystem provides solutions for keeping secrets safe.
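For example, you can keep secrets in Git in encrypted form with a tool such as Bitnami Sealed Secrets, and let Argo CD sync the encrypted manifest like any other resource. A minimal sketch; the secret name, namespace, and key are illustrative:

    # Create the Secret manifest locally without applying it to the cluster.
    kubectl create secret generic demo-credentials \
      --namespace smm-demo \
      --from-literal=password=changeme \
      --dry-run=client -o yaml > secret.yaml

    # Encrypt it with the cluster's sealing key (kubeseal fetches the public key
    # from the sealed-secrets controller); only the controller can decrypt it.
    kubeseal --format yaml < secret.yaml > sealed-secret.yaml

    # Commit only the encrypted manifest; never commit secret.yaml itself.
    git add sealed-secret.yaml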

Prerequisites

To complete this procedure, you need:

  • A free registration for the Calisti download page
  • A Kubernetes cluster running Argo CD (called management-cluster in the examples).
  • A Kubernetes cluster running the previous version of Calisti (called workload-cluster-1 in the examples). It is assumed that Calisti has been installed on this cluster as described in the Calisti 1.12.0 documentation, and that the cluster meets the resource requirements of Calisti version 1.12.1.

CAUTION:

Supported providers and Kubernetes versions

The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.

Service Mesh Manager is tested and known to work on the following Kubernetes providers:

  • Amazon Elastic Kubernetes Service (Amazon EKS)
  • Google Kubernetes Engine (GKE)
  • Azure Kubernetes Service (AKS)
  • Red Hat OpenShift 4.11
  • On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)

Calisti resource requirements

Make sure that your Kubernetes or OpenShift cluster has sufficient resources to install Calisti. The following table shows the resources required on the cluster:

Resource   Required
--------   ------------------------------------------------------------------
CPU        - 32 vCPU in total
           - 4 vCPU available for allocation per worker node (if you are
             testing on a cluster at a cloud provider, use nodes that have at
             least 4 CPUs, for example, c5.xlarge on AWS)
Memory     - 64 GiB in total
           - 4 GiB available for allocation per worker node for a Kubernetes
             cluster (8 GiB for an OpenShift cluster)
Storage    12 GB of ephemeral storage on the Kubernetes worker nodes
           (for Traces and Metrics)

These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
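To check how much CPU and memory your nodes can still allocate, and to spot pods stuck in the Pending state, you can run the following quick checks:

    # Show allocatable CPU and memory per node.
    kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory'

    # List pods that Kubernetes could not schedule.
    kubectl get pods --all-namespaces --field-selector=status.phase=Pending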

Enabling additional features, such as High Availability, increases these requirements.

The default installation, when enough headroom is available in the cluster, should be able to support at least 150 running Pods and the same number of Services. To set up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.

This document describes how to upgrade Service Mesh Manager from version 1.12.0 to version 1.12.1.

Set up the environment

  1. Set the KUBECONFIG location and context name for the management-cluster cluster.

    MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
    MANAGEMENT_CLUSTER_CONTEXT=management-cluster
    kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"
    

    Expected output:

    CURRENT   NAME                 CLUSTER              AUTHINFO   NAMESPACE
    *         management-cluster   management-cluster
    
  2. Set the KUBECONFIG location and context name for the workload-cluster-1 cluster.

    WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
    WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1
    kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"
    

    Expected output:

    CURRENT   NAME                 CLUSTER              AUTHINFO                                          NAMESPACE
    *         workload-cluster-1   workload-cluster-1
    

    Repeat this step for any additional workload clusters you want to use.

  3. Add the cluster configurations to KUBECONFIG. Include any additional workload clusters you want to use. (To persist the merged configuration across shells, see the sketch after this list.)

    KUBECONFIG=$KUBECONFIG:$MANAGEMENT_CLUSTER_KUBECONFIG:$WORKLOAD_CLUSTER_1_KUBECONFIG
    
  4. Make sure the management-cluster Kubernetes context is the current context.

    kubectl config use-context "${MANAGEMENT_CLUSTER_CONTEXT}"
    

    Expected output:

    Switched to context "management-cluster".
    

Upgrade Service Mesh Manager

The high-level steps of the upgrade process are:

  • Upgrade the smm-operator.
  • If you are also running Streaming Data Manager on your Calisti cluster and have installed it using the GitOps guide, upgrade the sdm-operator chart.
  • Upgrade the business applications (demo-app) to use the new control plane.

Upgrade the smm-operator

  1. Clone your calisti-gitops repository.

  2. Remove the old version (1.12.0) of the smm-operator Helm chart.

    rm -rf charts/smm-operator
    
  3. Pull the new version (1.12.1) of the smm-operator Helm chart and extract it into the charts folder.

    helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.12.1
    
  4. Commit and push the changes to the Git repository.

    git add .
    
    git commit -m "upgrade smm to 1.12.1"
    
    git push
    
  5. Wait a few minutes until the upgrade is completed. To follow the sync status from the command line, see the sketch after this list.

  6. Open the Service Mesh Manager dashboard.

    On the Dashboard Overview page everything should look healthy, except for some validation issues. These issues indicate that the business application (demo-app) is behind the smm-controlplane and must be updated.

    Figure: Service Mesh Manager Overview
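To follow the smm-operator upgrade from the command line instead of the dashboard or the Argo CD UI, you can query the Argo CD Application resource. A minimal sketch, assuming the Application is named smm-operator and Argo CD runs in the argocd namespace:

    # Print the sync and health status of the smm-operator Application.
    kubectl --context "${MANAGEMENT_CLUSTER_CONTEXT}" -n argocd get application smm-operator \
      -o jsonpath='{.status.sync.status}{"\t"}{.status.health.status}{"\n"}'

When the upgrade has finished, the command should report Synced and Healthy.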

Upgrade the sdm-operator

If you are also running Streaming Data Manager on your Calisti cluster and have installed it using the GitOps guide, upgrade the sdm-operator chart. Otherwise, skip this section and upgrade your business applications.

  1. Check your username and password on the download page.

  2. Remove the previously installed sdm-operator chart from the GitOps repository with the following command:

    rm -rf charts/supertubes-control-plane
    
  3. Download the sdm-operator chart from registry.eticloud.io into the charts directory of your Streaming Data Manager GitOps repository and extract it. Run the following commands:

    export HELM_EXPERIMENTAL_OCI=1 # Needed prior to Helm version 3.8.0
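    # CALISTI_USERNAME and CALISTI_PASSWORD are your credentials for the Calisti download page (see step 1).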
    
    echo "${CALISTI_PASSWORD}" | helm registry login registry.eticloud.io -u "${CALISTI_USERNAME}" --password-stdin
    

    Expected output:

    Login Succeeded
    
    helm pull oci://registry.eticloud.io/sdm-charts/supertubes-control-plane --destination ./charts/ --untar --version 1.12.1
    

    Expected output:

    Pulled: registry.eticloud.io/sdm-charts/supertubes-control-plane:1.12.1
    Digest: sha256:someshadigest
    
  4. Modify the sdm-operator Application CR by editing the apps/sdm-operator/sdm-operator-app.yaml file from the GitOps repository.

    Note: This is needed because of a change in the sdm-operator Helm chart: each entry of the imagePullSecrets Helm value must now include a name key, as in the following example.

    spec:
    ...
      source:
      ...
        helm:
          values: |
            imagePullSecrets:
              - name: "smm-registry.eticloud.io-pull-secret"
    ...
    
  5. Commit the changes and push the repository.

    git add .
    git commit -m "Update sdm-operator"
    git push origin
    
  6. Apply the modified Application CR.

    kubectl apply -f "apps/sdm-operator/sdm-operator-app.yaml"
    

    Expected output:

    application.argoproj.io/sdm-operator configured
    
  7. Follow the progress of the upgrade by checking the status of the ApplicationManifest. You can do this on the Argo CD UI, or with the following command:

    kubectl describe applicationmanifests.supertubes.banzaicloud.io -n smm-registry-access applicationmanifest
    

    Expected output when the upgrade is finished:

    Status:
      Cluster ID:  ...
      Components:
        Cluster Registry:
          Image:   ghcr.io/cisco-open/cluster-registry-controller:...
          Status:  Removed
        Csr Operator:
          Image:   registry.eticloud.io/csro/csr-operator:...
          Status:  Available
        Imps Operator:
          Image Pull Secret Status:  Unmanaged
          Status:                    Removed
        Istio Operator:
          Status:  Removed
        Kafka Operator:
          Image:   ghcr.io/banzaicloud/kafka-operator:...
          Status:  Available
        Monitoring:
          Status:  Available
        Supertubes:
          Image:   registry.eticloud.io/sdm/supertubes:...
          Status:  Available
        Zookeeper Operator:
          Image:   pravega/zookeeper-operator:...
          Status:  Available
      Status:      Succeeded
    
  8. If the following error shows up in the ApplicationManifest under the Message field:

    resource type is not allowed to be recreated: Job.batch "zookeeper-operator-post-install-upgrade" is invalid...
    

    Delete the zookeeper-operator-post-install-upgrade job so it is recreated when ZooKeeper is reconciled:

    kubectl delete job -n zookeeper zookeeper-operator-post-install-upgrade
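
    After deleting the job, you can confirm that it was recreated during reconciliation (a quick check):

    kubectl get jobs -n zookeeper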
    

Upgrade Demo application

  1. Update the demo-app. (To verify the restart from the command line, see the sketch after this list.)

    kubectl --context "${WORKLOAD_CLUSTER_1_CONTEXT}" -n smm-demo rollout restart deploy

  2. Check the dashboard.

    Figure: Service Mesh Manager Overview
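To verify the restart before checking the dashboard, you can watch the demo pods until every deployment is rolled out and all pods are Running (a quick check):

    kubectl --context "${WORKLOAD_CLUSTER_1_CONTEXT}" -n smm-demo get pods -w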