Calisti
Calisti includes the following main components:
- Service Mesh Manager is a multi- and hybrid-cloud enabled service mesh platform for constructing modern applications. Built on Kubernetes and our Istio distribution, Service Mesh Manager enables flexibility, portability, and consistency across on-premise datacenters and cloud environments.
- Cisco Streaming Data Manager (Streaming Data Manager) is the deployment tool for setting up and operating production-ready Apache Kafka clusters on Kubernetes, leveraging a Cloud Native technology stack. Streaming Data Manager includes ZooKeeper, Koperator, Envoy, and many other components hosted in a managed service mesh. All components are automatically installed, configured, and managed in order to operate a production-ready Kafka cluster on Kubernetes.
Service Mesh Manager
Service Mesh Manager helps you to confidently scale your microservices over single- and multi-cluster environments and to make daily operational routines standardized and more efficient. The componentization and scaling of modern applications inevitably leads to a number of optimization and management issues:
- How do you spot bottlenecks? Are all components functioning correctly?
- How are connections between components secured?
- How does one reliably upgrade service components?
Service Mesh Manager helps you accomplish these tasks and many others in a simple and scalable way, by leveraging the Istio service mesh and building many automations around it. Our tag-line for the product captures this succinctly:
Service Mesh Manager operationalizes the service mesh to bring deep observability, convenient management, and policy-based security to modern container-based applications.
Learn more about Service Mesh Manager.
Streaming Data Manager
Cisco Streaming Data Manager is specifically built to run and manage Apache Kafka on Kubernetes. Other solutions use Kubernetes StatefulSets to run Apache Kafka, but this approach is not well suited to it. Streaming Data Manager is based on simple Kubernetes resources (Pods, ConfigMaps, and PersistentVolumeClaims), allowing a much more flexible approach that makes it possible to:
- Modify the configuration of unique Brokers: you can configure every Broker individually.
- Remove specific Brokers from clusters.
- Use multiple Persistent Volumes for each Broker.
- Do rolling upgrades without data loss or service disruption.
Learn more about Streaming Data Manager.
1 - What's new
Release 1.11.0 (2022-11-07)
Streaming Data Manager
Calisti now has a new component called Streaming Data Manager. Streaming Data Manager is a cloud-native, turnkey solution for deploying and managing Apache Kafka over Istio, providing:
- Security and encryption
- out-of-the-box observability
- RBAC integration
- Scaling
For details, see Overview.
Note: When using Streaming Data Manager on Amazon EKS, you must install the EBS CSI driver add-on on your cluster.
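For example, on an existing EKS cluster you can add the driver with the AWS CLI. This is a minimal sketch: the cluster name is a placeholder, and depending on your setup the add-on may also need an IAM role (service account role ARN) for the EBS CSI controller.
aws eks create-addon --cluster-name <your-cluster-name> --addon-name aws-ebs-csi-driver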
GitOps support
Service Mesh Manager and Streaming Data Manager can be used in GitOps environments as well. For details, see Install SMM - GitOps - single cluster, Install SMM - GitOps - multi-cluster, and Install SDM - GitOps.
Istio 1.15 support
Service Mesh Manager now supports Istio 1.15 and provides our Istio distribution based on that codebase.
This also means that Service Mesh Manager is fully compatible with Kubernetes v1.24.x.
Other changes
- The health views of the Services and Workloads pages now have fixed URLs to make sharing easier.
- If the name of your cluster doesn’t comply with the RFC 1123 DNS label/subdomain restrictions, Service Mesh Manager now automatically converts it to a compliant format and sets it as the name of the cluster. In earlier versions, you had to manually set a compliant name for clusters with non-compliant names, otherwise certain operations (like smm install and smm attach) failed. Service Mesh Manager now automatically applies the following conversions if needed:
- Replace ‘_’ characters with ‘-’
- Replace ‘.’ characters with ‘-’
- Replace ‘:’ characters with ‘-’
- Truncate the name to 63 characters
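The same rules can be reproduced with standard shell tools. This is only a sketch of the documented conversion, not the exact code Service Mesh Manager runs, and the example cluster name is hypothetical:
echo "my_cluster.prod:eu-west-1" | tr '_.:' '---' | cut -c1-63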
- The Service Mesh Manager CLI now returns an error message when trying to run a command on a cluster that’s running an unsupported Kubernetes version.
- In Kubernetes 1.24 or newer, token secrets for service accounts aren’t created automatically. If Service Mesh Manager is running on a Kubernetes 1.24 (or newer) cluster, then when adding virtual machines to the mesh, you must create the token secrets manually, as shown in the sketch below. For details, see Add a virtual machine to the mesh.
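A minimal sketch of creating such a token secret manually. The secret name, namespace, and service account name are placeholders; the exact resources Service Mesh Manager expects are described in Add a virtual machine to the mesh:
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Secret
metadata:
  name: vm-sa-token
  namespace: vm-namespace
  annotations:
    kubernetes.io/service-account.name: vm-service-account
type: kubernetes.io/service-account-token
EOF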
Release 1.10.0 (2022-08-09)
RedHat-based virtual machines
Service Mesh Manager now supports attaching virtual machines running RedHat Enterprise Linux 8 to the mesh. For details, see Integrating Virtual Machines into the mesh.
Istio 1.13 support
Service Mesh Manager now supports Istio 1.13 and provides our Istio distribution based on that codebase.
Enterprise licenses
Paid-tier and enterprise licenses are now available for Service Mesh Manager.
- If you are interested in purchasing a license, contact us.
- If you have already purchased a license, apply it to your Service Mesh Manager deployments. For details, see Licensing options.
Other changes
- The smm CLI tool now supports macOS running on M1 chips.
- The Prometheus node exporter service now uses port 19101 instead of 19100. That way, the Prometheus deployment of Service Mesh Manager can work side-by-side with a pre-existing Prometheus deployment. For details on other ports used by Service Mesh Manager, see Open Port Inventory.
Release 1.9.1 (2022-04-11)
Service Mesh Manager now supports attaching virtual machines to the mesh. After a virtual machine has been integrated into the mesh, Service Mesh Manager automatically updates the configuration of the virtual machine to ensure that it remains a part of the mesh and receives every configuration update it needs to operate in the mesh. In addition, the observability features available for Kubernetes pods are available for the virtual machines as well.
For details, see Integrating Virtual Machines into the mesh.
Release 1.9.0 (2022-03-08)
Free tier
From now on, after a free registration, you can use Service Mesh Manager to manage your mesh of up to ten nodes. For details, see Licensing options and Getting started with the Free Tier.
Istio 1.12 support
Service Mesh Manager now supports Istio 1.12 and provides our Istio distribution based on that codebase.
Other changes
This release includes the following fixes:
- All custom resources used by Service Mesh Manager have been moved to the smm.cisco.com group. The CLI can migrate existing objects to the new group.
- Topology:
  - Mesh gateways are now fully visible on the topology page, even in timeline mode
  - The topology view now shows pod counts in timeline mode
- Fixed an issue that caused new SLOs not to start calculating on creation
- IstioControlPlane settings can be overridden from Service Mesh Manager’s ControlPlane resource using the .spec.meshManager.istio.istioCRDOverrides key (which contains a YAML string), as sketched below.
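A minimal sketch of such an override. Only the key path comes from this release note; the file name, the surrounding ControlPlane fields, and the example meshConfig value are illustrative assumptions. The fragment would be merged into the ControlPlane manifest you apply, for example via the file passed to smm reconcile --from-file:
cat > controlplane-overrides.yaml <<'EOF'
spec:
  meshManager:
    istio:
      istioCRDOverrides: |
        spec:
          meshConfig:
            accessLogFile: /dev/stdout
EOF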
Removed features
The following commands have been removed from the Service Mesh Manager command-line tool. You can configure the related features from the dashboard:
- smm sidecar-proxy egress get
- smm sidecar-proxy egress set
- smm sidecar-proxy egress delete
- smm routing
- smm mtls
Integrated support for canary deployments has also been removed. You can use the Flagger operator instead.
Release 1.8.2 (2021-12-14)
This release includes the following fixes:
Active-active fixes
- Fix secret cleanup for Istio in active-active setups.
- Update istio-operator to latest.
- Multiple active Istio control-plane support.
- Cluster name is now visible in istio status command.
- Control plane list now shows clusters as well.
Mesh view
- Stabilize the ordering of Istio clusters to prevent changed ordering on the UI.
cert-manager
Auth
- Fix an issue where 1.7 specific authentication tokens were generated during upgrade scenarios.
UI
- Fix an issue which caused topology to crash for ingress gateways.
Operators
- Add RBAC for Coordination resources so that operator leader election can use the resources.
- If there is a merge conflict during reconciliation, the smm operator retries the reconciliation without failing.
- Istio 1.7 operators are now properly removed during uninstall.
Let’s Encrypt
- Validate DNS records on Let’s Encrypt-enabled ingresses to ensure that the ingress and DNS records match.
Registry access
- Sort secret names to prevent the reconciler from always detecting changes during reconciliation.
Release 1.8 (2021-10-26)
The primary goal of this release was to have a modern way to orchestrate Istio and the multi-cluster topologies Service Mesh Manager supports. As part of this work, the Cisco Istio Operator has been restructured from the ground up so that you can benefit from an API that has been adjusted to the modern Istio versions.
As this new version of the operator supports not just the Primary-Remote cluster topology, but also Multi-Primary both on the same and different network, this change paves the way for subsequent releases to add support into Service Mesh Manager for meshes with any number of Primary and Remote clusters.
Istio 1.11 support
Service Mesh Manager now supports Istio 1.11 and provides our Istio distribution based on that codebase.
This also means that Service Mesh Manager is fully compatible with Kubernetes v1.22.x.
OIDC and external dashboard access support
This release provides support for exposing the Service Mesh Manager dashboard via a public, https URL. For the required configuration please check out the Exposing the Dashboard page.
To entirely remove the need for downloading the Service Mesh Manager CLI and to better integrate with existing OIDC-enabled Kubernetes deployments, we are also supporting OIDC Authentication.
Release 1.7 (2021-07-28)
Release 1.7 focuses on compliance, integrations, tech debt, and reusability.
GraphQL federation
The Service Mesh Manager GraphQL API is now broken down into separate components to increase reusability, and to provide the ability to switch components on/off in Service Mesh Manager in the future.
Protocol-specific observability
Istio provides several useful metrics for the TCP, HTTP, and gRPC protocols. To give you better observability and more insight into the traffic of your services, Service Mesh Manager displays protocol-specific metrics normally not available in Istio for MySQL and PostgreSQL traffic. Support for more protocols is planned in future releases.
Istio 1.10 support
Service Mesh Manager now supports Istio 1.10.
Cluster registry
A generic, distributed Kubernetes cluster registry is now serving as the base for keeping multi-cluster metadata. Cluster metadata is replicated across clusters using a gossip-like protocol.
Unified Istio distribution with SecureCN
SecureCN and Service Mesh Manager are now using the same Istio distribution that enables better integration between the two products.
CSDL Compliance
Service Mesh Manager has now reached CSDL “Planned” status.
DevNet Sandbox
Service Mesh Manager is now available on DevNet sandbox for design partners for solution testing.
Release 1.6.1 (2021-05-06)
This release is a security and bugfix release.
Included changes are:
- Add support for Istio 1.8.5 for customers still using the old version of Istio instead of 1.9
- Fix an issue in the Istio operator that required permissions for the authentication.istio.io and config.istio.io groups, while those are only needed for Istio versions < 1.8
- The smm activate command now resets all of the user’s registry settings, making changing IAM credentials easier. Previously, the end user needed to remove the registry access credentials manually using the smm registry remove command
Release 1.6 (2021-04-09)
Group your clusters into networks to optimize your mesh topology using a mix of gateway-based and flat-network connections between your clusters, decreasing cross-cluster latencies and transfer costs. Clusters belonging to the same network can access each other directly, without using the cluster gateway. For details, see Cluster network and Attach a new cluster to the mesh.
UI improvements
Istio 1.9 support
Service Mesh Manager now supports Istio 1.9.
2 - Service Mesh Manager
Service Mesh Manager is a multi and hybrid-cloud enabled service mesh platform for constructing modern applications.
Built on Kubernetes and our Istio distribution, Service Mesh Manager enables flexibility, portability and consistency across on-premise datacenters and cloud environments.
Service Mesh Manager helps you to confidently scale your microservices over single- and multi-cluster environments and to make daily operational routines standardized and more efficient. The componentization and scaling of modern applications inevitably leads to a number of optimization and management issues:
- How do you spot bottlenecks? Are all components functioning correctly?
- How are connections between components secured?
- How does one reliably upgrade service components?
Service Mesh Manager helps you accomplish these tasks and many others in a simple and scalable way, by leveraging the Istio service mesh and building many automations around it. Our tag-line for the product captures this succinctly:
Service Mesh Manager operationalizes the service mesh to bring deep observability, convenient management, and policy-based security to modern container-based applications.
Learn more about Service Mesh Manager.
2.1 - Overview
Service Mesh Manager is a multi and hybrid-cloud enabled service mesh platform for constructing modern applications.
Built on Kubernetes and our Istio distribution, Service Mesh Manager enables flexibility, portability and consistency across on-premise datacenters and cloud environments.
Service Mesh Manager helps you to confidently scale your microservices over single- and multi-cluster environments and to make daily operational routines standardized and more efficient. The componentization and scaling of modern applications inevitably leads to a number of optimization and management issues:
- How do you spot bottlenecks? Are all components functioning correctly?
- How are connections between components secured?
- How does one reliably upgrade service components?
Service Mesh Manager helps you accomplish these tasks and many others in a simple and scalable way, by leveraging the Istio service mesh and building many automations around it. Our tag-line for the product captures this succinctly:
Service Mesh Manager operationalizes the service mesh to bring deep observability, convenient management, and policy-based security to modern container-based applications.
Key features
Service Mesh Manager takes the pain out of Istio by offering great UX from installation and mesh management to runtime diagnostics and more.
Istio distribution
Service Mesh Manager is built on Istio, but offers enhanced functionality, for example, operator-based Istio management, a full-featured CLI tool, and an intuitive and easy to use UI. It is not a new abstraction layer on top of Istio, and stays fully compatible with the upstream. Service Mesh Manager is designed for enterprise users and comes with commercial support.
For a detailed list of changes compared to upstream Istio, see Istio distribution.
Observability
The Service Mesh Manager UI gives you insight into the operation of your services. It not only shows the service topology with real-time and historical metrics, but also allows you to drill-down and analyze the metrics in context. Service Mesh Manager automatically calculates the health of your services and workloads based on the available metrics. If you still need additional details, you can access the related Grafana dashboards with a single click.
You can also monitor communications with services that are external to your mesh.
Root cause diagnostics
Root cause diagnostics help you efficiently isolate and solve operational issues related to your services. Service Mesh Manager offers:
Control
You can manage Istio through the Service Mesh Manager UI and the CLI.
Service Mesh Manager gives you easy access to the configuration of the Istio service mesh and its underlying traffic-management features, including:
With Service Mesh Manager, you can manage service-updates using automated, industry-standard upgrade strategies, like canary releases.
Multi-cluster
With Service Mesh Manager, you can monitor and manage your hybrid multi-cloud service infrastructure from a single pane of glass. You can easily attach and detach clusters using the CLI, and take advantage of enhanced multi-cluster telemetry.
Service Mesh Manager supports multiple mesh topologies, so you can use the one that best fits your use cases. In multi-cluster configurations, it provides automatic locality load-balancing.
Service Level Objectives and burn-rate alerts
Service Mesh Manager helps SREs and operation engineers to observe and control the health of their services and applications. You can create and track service level objectives and corresponding alerting rules on the Service Mesh Manager dashboard.
Security & Compliance
Service Mesh Manager helps you secure your services through industry-standard authorization and authentication practices, including:
High-level architecture
Service Mesh Manager consists of the following components:
- Service mesh management: The open source Cisco Istio operator helps to install/upgrade/manage Istio deployments. Its unique features include managing multiple ingress/egress gateways in a declarative fashion, and automated and fine-tuned multi-cluster management.
- The core components of Service Mesh Manager are:
  - the Service Mesh Manager backend (exposing a GraphQL API)
  - the Service Mesh Manager UI, a web interface
  - the Service Mesh Manager CLI
  - the Service Mesh Manager operator
  The heart of Service Mesh Manager is its backend, which exposes a GraphQL API. The Service Mesh Manager UI (dashboard) and CLI interact with this API. The Service Mesh Manager operator is an optional component that provides a declarative installation method to support GitOps workflows.
- External out-of-the-box integrations: These components are automatically installed and configured by Service Mesh Manager by default to be able to work with Istio. You can also integrate Service Mesh Manager with your own Prometheus, Grafana, Jaeger, or cert-manager - Service Mesh Manager follows the "batteries included, but replaceable" paradigm.
Istio-operator and Service Mesh Manager
The Calisti team actively maintains its fully upstream-compatible Istio distribution and several open-source projects and integrations that serve as the basis for Cisco Service Mesh Manager. From the perspective of Istio management, the Calisti team maintains the following:
- The Istio operator is an open source project, which is the core component involved in Istio control plane and gateway lifecycle management for Cisco Service Mesh Manager.
- Cisco Service Mesh Manager is a commercial product that includes all the features mentioned in this guide, enterprise support, and optionally integration support for customer environments.
Read the detailed comparison.
Next steps
2.1.1 - Features
Service Mesh Manager addresses the whole cloud-native lifecycle of a service mesh-based solution by providing various tools from day 0 to day 2 operations. Because such a solution requires many components to provide the core service mesh functionality, tracing, metrics, or safe canary-based deployments (to name a few), Service Mesh Manager is divided into the following layers:
Now let’s see how these layers add up to a complete solution over the whole lifecycle of the product:
Day 0
Day 0, in software development, represents the design phase, during which project requirements are specified and the architecture of the solution is decided. Service meshes, even though they offload the burden of security and traffic routing from the individual microservices, are complex in nature.
Service Mesh Manager is designed with Day 0 experimentation in mind: it provides a CLI tool that lets you install Service Mesh Manager without prior experience, so you can have an Istio-based service mesh up and running in 15 minutes, with monitoring and tracing included. The user interface allows rapid experimentation with Istio features via an intuitive dashboard, so during the design phase engineers can focus on what matters most: finding the right architecture.
Day 1
Day 1 involves developing and deploying software that was designed in the Day 0 phase. In this phase you create not only the application itself, but also its infrastructure, network, and external services, then implement the initial configuration of it all.
After the initial experimentation, Service Mesh Manager aids this process not just by providing facilities for configuring the service mesh, but also by providing validations to check for any issues in the deployed settings, and integrated metrics and outlier-detection information to pinpoint any issues with the freshly changed services.
In case of interoperability issues, the traffic tap and automated tracing feature provides more detailed insight into the real-time traffic.
Day 2
Day 2 is the time when the product is shipped or made available to the customer. Here, most of the effort is focused on maintaining, monitoring, and optimizing the system. Analyzing the behavior of the system and reacting correctly are of crucial importance, as the resulting feedback loop is applied until the end of the application’s life.
Service meshes and Istio in particular are developing fast. This is reflected in the N-1 support model it uses: a new Istio version is released every 3 months, and only the last two are supported. Service Mesh Manager helps decrease the risk of these upgrades by providing canary-like control plane upgrades: SREs can gradually upgrade their services to the new version even on a Workload level, and in case an issue happens, the old version of Istio is always available in the cluster to fall back to.
Service Mesh Manager provides a Service Level Objective feature that helps ensure the solution works within its expected operational parameters. If an issue occurs, the automated outlier detection system detects failures and shows them on the topology view. Postmortems are aided by the timeline feature, which lets you check the past state of the deployment, including health data.
2.1.1.1 - Istio distribution
Service Mesh Manager is built on Istio, but offers enhanced functionality, for example, operator-based Istio management, a full-featured CLI tool, and an intuitive and easy to use UI. It is not a new abstraction layer on top of Istio, and stays fully compatible with the upstream.
Service Mesh Manager is designed for enterprise users and comes with commercial support.
Notable changes compared to upstream Istio
FIPS 140-2 Level 1 compliant build
The FIPS build uses Google’s BoringCrypto for the go-based components and Envoy. All components are recompiled with the necessary configuration to provide Level 1 compliance. Also the allowed ciphers are restricted even more than FIPS would allow. For details, see FIPS-compliant service mesh.
Multiple control plane support
Upstream Istio does not properly support multiple, isolated control planes within one cluster. Various changes (ENV name overrides, ConfigMap name overrides, and so on) were made to support proper isolation between control planes.
Protocol specific observability
Istio uses Envoy proxy under the hood, which has support for various data protocols and provides protocol-specific metrics for them. The upstream Istio can enable those metrics if a supported protocol is detected. That list has been extended with PostgreSQL, and other protocols are coming soon.
Direct connect through gateways
Direct connect means that a workload can be exposed through an Istio ingress gateway in a way that the internal mTLS is not terminated, but rather the workload proxy port is directly accessible through the gateway. This allows communication to a workload with mTLS from an external client. This feature is mainly used in the Streaming Data Manager (formerly called Supertubes) product.
DNS capture and report
With this feature the Istio proxy is able to capture DNS requests and responses and report them to an API endpoint.
This feature is used in the SecureCN product.
TLS interception
The mesh Certificate Authority (CA) can issue TLS certificates for arbitrary domain names, to be able to look into TLS encrypted traffic. This feature is used in the SecureCN product.
The certificates issued by Istio CA can store arbitrary, workload-specific key-value attributes in the certificates' subject directory attribute property. This is used in Panoptica, the Cisco Secure Application Cloud to propagate workload-specific information between clusters, without the need for a central database.
Standalone sidecar injector component
The standalone sidecar injector is used in multi-cluster topologies on the peer clusters to have the sidecar-injection functionality of Istiod with much smaller resource requirements.
2.1.1.2 - Mesh lifecycle management
Operating a service mesh at scale is hard due to the inherent complexity of mesh configurations. To ensure optimal operation of Service Mesh Manager-based solutions, we provide validations to highlight any common issues with the existing configuration.
When it comes to supporting a service mesh-based solution in the long run, Service Mesh Manager supports dynamically extending the existing mesh with additional clusters, with zero downtime.
Given that Istio is a fast-moving project (a new release is available every 3 months), Service Mesh Manager needs to bridge the gap between a rapidly changing Cloud Native solution and the requirements of an enterprise deployment. To decrease the blast radius of these upgrades we have introduced canary control plane upgrades based on Cisco’s open source Istio operator.
2.1.1.3 - Observability toolbox
Service Mesh Manager includes integrated observability by default. This allows you to get the most out of your service mesh deployments, not just by using the security and traffic management features of Istio, but also by having access to all the monitoring and tracing features Istio is capable of.
Integrated monitoring
Service Mesh Manager includes Prometheus to ensure faster troubleshooting and recovery. For further information on our monitoring capabilities, see the Dashboard guide.
Integrated tracing
Distributed tracing is the process of tracking individual requests throughout their whole call stack in the system.
With distributed tracing in place, you can visualize full call stacks, to see:
- which service called which service,
- how long each call took, and
- the network latencies between them.
That way you can tell where a request failed, or which service took too much time to respond.
To access traces and see real-time traffic flowing through the cluster, see the traffic tap feature.
Automated outlier detection system
Complex systems might be hard to understand. Service Mesh Manager provides an automated (zero configuration required) outlier detection system available on the topology, workloads, and services pages of the Service Mesh Manager UI.
Service Level Objectives
To ensure, and most importantly to measure, the availability of the deployed workloads, you can use an integrated Service Level Objective-based alerting system.
For defining SLOs, see Tracking Service Level Objectives (SLOs).
For alerting settings, see SLO-based alerting in Production.
2.1.1.4 - Multi-cluster topologies
Multi-cluster topologies overview
Service Mesh Manager is able to construct an Istio service mesh that spans multiple clusters.
In this scenario you combine multiple clusters into a single service mesh that you can manage from either a single or from multiple Istio control planes.
Single mesh scenarios are best suited to use cases where clusters are configured together, sharing resources, and are generally treated as one infrastructural component within an organization.
Service Mesh Manager not only automates setting up multi-cluster topologies, but also:
- Updates resources to keep everything running in case a cluster changes (for example, the IP address of a load balancer changes, and so on)
- Keeps the Istio CRs in sync between the clusters
- Creates federated trust between the clusters (this is a difference compared to Istio)
- Provides observability, tracing, traffic tapping and other features over multiple clusters
Supported multi-cluster topologies
Istio supports a variety of mesh topologies, as detailed in the official documentation.
Service Mesh Manager implements the Primary-Remote model, either using the different or the same network model.
Service Mesh Manager also supports the Primary-Primary model, either using the different or the same network model.
Moreover, the topologies can be combined and one can have multiple primaries and multiple remotes using Service Mesh Manager.
Creating a multi-cluster mesh
Read the multi-cluster installation guide for details on how to set up a multi-cluster mesh.
2.1.2 - Modes of operation
To support the different use-cases from Day 0 to Day 2 operations, Service Mesh Manager has different modes of operation. The same binary can act as:
You can also use the operator in GitOps scenarios.
Imperative mode
The main purpose of the imperative mode is to install Service Mesh Manager, get you started, and help you experiment with the various components. You can access only a small subset of the available configuration options and features (mostly just the default settings and some of the most important configuration flags) to avoid getting overloaded with command line flags.
Most notably, you can install and delete Service Mesh Manager from the command line. Internally, the install and delete commands change the component-specific parts of the main configuration, then trigger the reconciliation of the affected components.
Other commands do not necessarily change the main configuration, nor trigger reconciliation of any component. Such commands create dynamic resources which are out of scope for the reconcilers, but are convenient for getting started without having to leave the CLI.
Once you are finished experimenting with Service Mesh Manager, the recommended way forward is to start using the reconcile command, and apply all configuration through the custom resource directly. This is analogous to how you use kubectl create and then switch to kubectl apply when you already have a full configuration and just want to apply changes incrementally. If you are an experienced Kubernetes user, you can probably skip the imperative mode and start using the reconcile command from the beginning.
The drawback of the imperative mode is that there is no overall state of components, so it can’t tell what has already been installed.
Also, it is not suitable for automation. CD systems typically require Helm charts, Kustomize, or pure YAML resources to operate with. Although the imperative commands of Service Mesh Manager have a --dump-resources flag that generates YAML files instead of applying them (see the sketch below), you would still have to run the install command locally for each component, and commit the generated resources into version control. The CD workflow would then have to specify sequential steps for each component separately, making the whole flow difficult to extend over time.
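A rough sketch of that flow. The exact files that --dump-resources produces, and whether it can be combined with the -a flag, are assumptions here:
smm install -a --dump-resources
# commit the generated YAML manifests into version control, for example:
git add . && git commit -m "Add generated Service Mesh Manager manifests"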
Using the imperative mode
To use Service Mesh Manager in imperative mode, install the smm-cli command-line tool, then use its commands to install Service Mesh Manager and perform other actions. For a list of available commands, see the CLI reference.
Note: You can also configure many aspects of your service mesh using the Service Mesh Manager web interface. To access the web interface, run the smm dashboard command (if your KUBECONFIG file is set properly, the smm dashboard command automatically performs the login).
Install/Uninstall components
The following components can be installed/uninstalled individually. The -a flag installs/uninstalls them all. For details on installing and uninstalling the Service Mesh Manager operator, see Operator mode.
Reconciler mode
Reconciler mode is a declarative CLI mode. The reconcile command is a one-shot version of an operator’s reconcile flow. It executes the component reconcilers in order, and can decide whether they require another reconciliation round, or are already finished. Reconciling can apply new configuration, and remove disabled components from the system.
Note: In this mode, the operator is not installed on the cluster. The controller code runs from the CLI on the client side.
A component can be anything that receives the whole configuration, understands its own part from it to configure itself, and is able to delete its managed resources when disabled or removed. Service Mesh Manager uses two different implementations:
- The native reconciler triggers a “resource builder” to create Kubernetes resources along with their desired state (present or absent) based on the configuration of the component. Such resource builders create CRDs, RBAC, and a Deployment resource to be able to run an operator.
- The other implementation is the Helm reconciler, which basically acts as a Helm operator. It installs and upgrades an embedded chart if it has changed, or uninstalls it if it has been removed from the main configuration.
Compared to kubectl apply, these solutions add ordering, and allow executing custom logic if required. Also, they remove resources that are no longer present in the config. The CLI in this case executes the control logic as well.
Compared to Terraform, dependencies are managed in a predefined execution order and have static binding using deterministic names. This gives lower performance, but is easier to follow. The remote state is the CR saved to the API server.
Using the reconciler mode
To use Service Mesh Manager in reconciler mode, complete the following steps. In this scenario, the manifest is read from a file, allowing you to declaratively provide custom configuration for the various components.
- Log in to your Service Mesh Manager installation.
- Prepare the configuration settings you want to apply in a YAML file, and run the following command. For details on the configuration settings, see the ControlPlane Custom Resource.
smm reconcile --from-file <path-to-file>
- The settings applied to the components are the result of merging the default settings + valuesOverride + managed settings. You cannot change the managed settings, to avoid misconfiguration and possible malfunction.
Operator mode
The operator mode follows the familiar operator pattern. In operator mode, Service Mesh Manager watches events on the ControlPlane Custom Resource, and triggers a reconciliation for all components in order, the same way you can trigger the reconcile command locally.
Note: Unlike in the declarative CLI mode, in operator mode the Service Mesh Manager operator is running inside Kubernetes, and not on a client machine. This naturally means that this mode is exclusive with the install, delete, and reconcile commands.
Using the operator mode is the recommended way to integrate the Service Mesh Manager installer into a Kubernetes-native continuous delivery solution, for example, Argo, where the integration boils down to applying YAML files to get the installer deployed as an operator.
Existing configurations managed using the reconcile
command work out-of-the box after switching to the operator mode.
Using the operator mode
To use Service Mesh Manager in operator mode, Install Service Mesh Manager in operator mode. In this scenario, the reconcile flow runs on the Kubernetes cluster as an operator that watches the ControlPlane
custom resources. Any changes made to the watched custom resource triggers the reconcile flow.
GitOps
GitOps is a way of implementing Continuous Deployment for cloud native applications. Based on Git and Continuous Deployment tools, GitOps provides a declarative way to store the desired state of your infrastructure and automated processes to realize the desired state in your production environment.
For example, to deploy a new application you update the repository, and the automated processes perform the actual deployment steps.
When used in operator mode, Service Mesh Manager works flawlessly with GitOps solutions such as Argo CD, and can be used to declaratively manage your service mesh. For a detailed tutorial on setting up Argo CD with Service Mesh Manager, see Install SMM - GitOps - single cluster.
2.1.3 - Istio-operator feature comparison
The Calisti team actively maintains its fully upstream-compatible Istio distribution and several open-source projects and integrations that serve as the basis for Cisco Service Mesh Manager. From the perspective of Istio management, the Calisti team maintains the following:
- The Istio operator is an open source project, which is the core component involved in Istio control plane and gateway lifecycle management for Cisco Service Mesh Manager.
- Cisco Service Mesh Manager is a commercial product that includes all the features mentioned in this guide, enterprise support, and optionally integration support for customer environments.
| | Istio operator | Cisco Service Mesh Manager |
| --- | --- | --- |
| Install Istio | ✓ | ✓ |
| Manage Istio | ✓ | ✓ |
| Upgrade Istio | ✓ | ✓ |
| Uninstall Istio | ✓ | ✓ |
| Multiple gateways support | ✓ | ✓ |
| Multi cluster support | needs some manual steps | fully automatic |
| Prometheus | | ✓ |
| Grafana | | ✓ |
| Jaeger | | ✓ |
| Cert manager | | ✓ |
| Dashboard | | ✓ |
| CLI | | ✓ |
| OIDC authentication | | ✓ |
| VM integration | | ✓ |
| Topology graph | | ✓ |
| Outlier detection | | ✓ |
| Service Level Objectives | | ✓ |
| Live access logs | | ✓ |
| mTLS management | | ✓ |
| Gateway management | | ✓ |
| Istio traffic management | | ✓ |
| Validations | | ✓ |
| Support | Community | Enterprise |
2.2 - Getting started with the Free Tier
This Getting started guide helps you access and install the free version of Service Mesh Manager. If you are a paying customer, see Installation for installation options.
To get started with Service Mesh Manager, you will install Service Mesh Manager and a demo application on a single cluster. After that, you can attach other clusters to the mesh and redeploy the demo application to run on multiple clusters.
Free tier limitations
- The free tier of Service Mesh Manager allows you to use Service Mesh Manager on a maximum of two Kubernetes clusters, with a total of 10 worker nodes across your clusters. For details, see Licensing options.
- The SMM Operator Helm charts are not supported.
To buy an enterprise license, contact your Cisco sales representative, or directly Cisco Emerging Technologies and Incubation.
Prerequisites
You need a Kubernetes cluster to run Service Mesh Manager. If you don’t already have a Kubernetes cluster to work with, then:
- Create a cluster that meets the following resource requirements with your favorite provider.
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
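To verify which Kubernetes version your cluster runs before installing, you can query the API server (this assumes kubectl points at the target cluster):
kubectl version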
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
| | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) | 24 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
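If the installation stalls, a quick way to spot the resource shortage is to list pods stuck in Pending state (a standard kubectl query, not specific to Service Mesh Manager):
kubectl get pods --all-namespaces --field-selector=status.phase=Pending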
Enabling additional features, such as High Availability, increases these requirements.
The default installation, when enough headroom is available in the cluster, should be able to support at least 150 running Pods with the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
- Set the Kubernetes configuration and context.
The Service Mesh Manager command-line tool uses your current Kubernetes context, from the file named in the KUBECONFIG environment variable (~/.kube/config by default). Check that this is the cluster where you plan to deploy the product by running the following command:
kubectl config get-contexts
If there are multiple contexts in the Kubeconfig file, specify the one you want to use with the use-context parameter, for example:
kubectl config use-context <context-to-use>
Preparation
To access and install the free version of Service Mesh Manager, complete the following steps.
- You’ll need a Cisco Customer account to download Service Mesh Manager. If you don’t already have one, here’s how to sign up:
  - Visit the Cisco Account registration page and complete the registration form.
  - Look out for an email from no-reply@mail-id.cisco.com titled Activate Account, and click the Activate Account button to activate your account.
- Download the Service Mesh Manager command-line tool.
- Visit the Service Mesh Manager download center.
- If you’re redirected to the home page, check the upper right-hand corner to see if you’re signed in. If you see a login button, go ahead and log in using your Cisco Customer account credentials. If, instead, you see a welcome message, then you are already logged in.
- Once you have logged in, navigate to the Service Mesh Manager download center again.
- Read and accept the End-User License Agreement (EULA).
- Download the Service Mesh Manager command-line tool (CLI) suitable for your system. The CLI supports macOS and Linux (x86_64). On Windows, install the Windows Subsystem for Linux (WSL) and use the Linux binary.
- Extract the archive. The archive contains two binaries, smm for Service Mesh Manager, and supertubes for Streaming Data Manager.
- Navigate to the directory where you have extracted the CLI.
- The Service Mesh Manager download page shows your credentials that you can use to access the Service Mesh Manager docker images.
Open a terminal and log in to the image registries of Service Mesh Manager by running:
SMM_REGISTRY_PASSWORD=<your-password> ./smm activate \
--host=registry.eticloud.io \
--prefix=smm \
--user='<your-username>'
Where the <your-password> and <your-username> parts contain the access credentials to the registries.
Install Service Mesh Manager on a single cluster
- Run the following command. This will install the main Service Mesh Manager components.
smm install -a --cluster-name <name-of-your-cluster>
Note: If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, Amazon EKS, AKS, or GKE) or kOps, the cluster name auto-discovered by Service Mesh Manager is incompatible with Kubernetes resource naming restrictions and Istio’s method of identifying clusters in a multicluster mesh.
In earlier Service Mesh Manager versions, you had to manually use the --cluster-name parameter to set a cluster name that complies with the RFC 1123 DNS subdomain/label format (an alphanumeric string without “_” or “.” characters).
Starting with Service Mesh Manager version 1.11, non-compliant names are automatically converted using the following rules:
- Replace ‘_’ characters with ‘-’
- Replace ‘.’ characters with ‘-’
- Replace ‘:’ characters with ‘-’
- Truncate the name to 63 characters
Service Mesh Manager supports KUBECONFIG contexts having the following authentication methods:
- certfile and keyfile
- certdata and keydata
- bearer token
- exec/auth provider
Username-password pairs are not supported.
If you are installing Service Mesh Manager in a test environment, you can install it without requiring authentication by running:
smm install --anonymous-auth -a --run-demo
If you experience errors during the installation, try running the installation in verbose mode: smm install -v
- Wait until the installation is completed. This can take a few minutes.
- (Optional) If you don’t already have Istio workload and traffic on this cluster, install the demo application.
- Run the following command to open the dashboard. If you don’t already have Istio workload and traffic, the dashboard will be empty.
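smm dashboard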
- (Optional) If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, AWS, Azure, or Google Cloud), assign admin roles so that you can tail the logs of your containers from the Service Mesh Manager UI and perform various tasks from the CLI that require custom permissions. Run the following command:
kubectl create clusterrolebinding user-cluster-admin --clusterrole=cluster-admin --user=<gcp/aws/azure username>
- At this point, Service Mesh Manager is up and running. On the dashboard, select MENU > TOPOLOGY to see how the traffic flows through your mesh, and experiment with any of the available features described in the documentation.
- To evaluate Streaming Data Manager, see Getting started with Streaming Data Manager.
Get help
If you run into errors, experience problems, or just have a question or feedback while using the Free Tier of Service Mesh Manager, visit our Application Networking and Observability community site.
Support details for the Pro and Enterprise Tiers are provided in the purchased plan.
2.3 - Installation
To evaluate the services Service Mesh Manager offers, we recommend using the free edition of Service Mesh Manager in a test environment and using our demo application.
This way you can start over any time, and try all the options you are interested in without having to worry about changes made to your existing environment, even if it’s not used in production.
Production installation is very similar, but of course you won’t need to deploy the demo application, and you must specify exactly which components you want to use.
2.3.1 - Prerequisites
Before deploying Service Mesh Manager on your cluster, complete the following tasks.
Create a cluster
You need a Kubernetes cluster to run Service Mesh Manager (and optionally, Streaming Data Manager). If you don’t already have a Kubernetes cluster to work with, create one with one of the methods described in Create a test cluster.
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
| | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) | 24 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
The default installation, when enough headroom is available in the cluster, should be able to support at least 150 running Pods with the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
Install the Service Mesh Manager tool
Install the Service Mesh Manager command-line tool. You can use the Service Mesh Manager CLI tool to install Service Mesh Manager and other components on your cluster.
Note: The Service Mesh Manager CLI supports macOS and Linux (x86_64). On Windows, install the Windows Subsystem for Linux (WSL) and use the Linux binary.
- Install the Service Mesh Manager CLI for your environment.
- Set the Kubernetes configuration and context.
The Service Mesh Manager command-line tool uses your current Kubernetes context, from the file named in the KUBECONFIG environment variable (~/.kube/config by default). Check that this is the cluster where you plan to deploy the product by running the following command:
kubectl config get-contexts
If there are multiple contexts in the Kubeconfig file, specify the one you want to use with the use-context parameter, for example:
kubectl config use-context <context-to-use>
Deploy Service Mesh Manager
After you have completed the previous steps, you can install Service Mesh Manager on a single cluster, or you can form a multi-cluster mesh right away.
Note: The default version of Service Mesh Manager is built with the standard SSL libraries. To use a FIPS-compliant version of Istio, see Install FIPS images.
Select the installation method you want to use:
You can install Service Mesh Manager on a single cluster first, and attach additional clusters later to form a multi-cluster mesh.
2.3.1.1 - Accessing the Service Mesh Manager binaries
To evaluate Service Mesh Manager we recommend using the free tier option.
If you don’t already have a Cisco Customer Identity (CCI) account, you’ll also have to complete a brief sign-up procedure.
To access the CLI binaries, you can either download them from the Service Mesh Manager download page or pull them from registry.eticloud.io using ORAS.
Download the CLI
- Visit the Service Mesh Manager download center.
- If you’re redirected to the home page, check the upper right-hand corner to see if you’re signed in. If you see a login button, go ahead and log in using your Cisco Customer account credentials. If, instead, you see a welcome message, then you are already logged in.
- Once you have logged in, navigate to the Service Mesh Manager download center again.
- Read and accept the End-User License Agreement (EULA).
- Download the Service Mesh Manager command-line tool (CLI) suitable for your system. The CLI supports macOS and Linux (x86_64). On Windows, install the Windows Subsystem for Linux (WSL) and use the Linux binary.
- Extract the archive. The archive contains two binaries, smm for Service Mesh Manager, and supertubes for Streaming Data Manager.
- Navigate to the directory where you have extracted the CLI.
Download the CLI using ORAS
To install the Service Mesh Manager CLI using ORAS, complete the following steps.
- Install OCI Registry As Storage (ORAS). For details, see the ORAS installation guide for your operating system. For example, on macOS you can run brew install oras.
- Log in to registry.eticloud.io using ORAS. You can find your credentials and the activation command on the Service Mesh Manager download page. (If you haven’t registered yet, sign up on the Service Mesh Manager page.)
Run the following command to log in, then enter your username and password.
oras login registry.eticloud.io
- Download the Service Mesh Manager CLI by running:
oras pull registry.eticloud.io/smm/smm-cli:v1.11.0
- To manage Apache Kafka installations using Streaming Data Manager, download the Streaming Data Manager command-line tool (called supertubes-cli) as well:
oras pull registry.eticloud.io/sdm/supertubes-cli:v1.11.0
- Extract the archive for your operating system.
- Navigate to the directory where you have extracted the CLI.
Activate the CLI
Due to legal requirements, the docker images for Service Mesh Manager are stored in a docker registry that requires authentication. Service Mesh Manager has built-in support for transparently performing this authentication. For this feature to work, you must “activate” the CLI on every workstation that will be used to install, upgrade, or change the Service Mesh Manager deployment. For using the dashboard or any other CLI command, this activation step can be skipped.
You can find your credentials and the activation command on the Service Mesh Manager download page.
Open a terminal and log in to the image registries of Service Mesh Manager by running:
SMM_REGISTRY_PASSWORD=<your-password> ./smm activate \
--host=registry.eticloud.io \
--prefix=smm \
--user='<your-username>'
Where the <your-password> and <your-username> parts contain the access credentials to the registries.
After the activation, you can install Service Mesh Manager on a single cluster or multiple clusters, or manage an existing installation.
Upgrading an already activated 1.8.x SMM
In case your local 1.8.x CLI is already activated using the ECR repositories (the old activate command is still available as the activate-ecr command), feel free to continue using the existing ECR repositories; they will remain supported.
If you’d like to start using the new repositories, execute the activate command as shown above. That updates the local environment to rely on the new repositories. The next time you use the install or operator reconcile command, the Kubernetes cluster will be automatically updated to use the new access credentials.
2.3.1.2 - Create a test cluster
You need a Kubernetes cluster to run Service Mesh Manager. If you don’t already have a Kubernetes cluster to work with, create one with one of the following methods.
- Run locally (~5 minutes): Deploy Service Mesh Manager to a single-node Kubernetes cluster running on your development machine.
- Run on a Kubernetes cluster (~10 minutes): Deploy Service Mesh Manager to a Kubernetes cluster of your choice.
Run Service Mesh Manager locally
Recommended if you don’t have or don’t want to create a Kubernetes cluster, but want to try out Service Mesh Manager quickly.
- Install one of the following tools to run a Kubernetes cluster locally:
- Ensure that the local Kubernetes cluster meets the following requirements:
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
| | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) | 24 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
When enough headroom is available in the cluster, the default installation can support at least 150 running Pods and the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
-
Launch the local Kubernetes cluster with one of the following tools:
-
Proceed to Install Service Mesh Manager.
-
When you’re done experimenting, you can remove the demo application, Service Mesh Manager, and Istio from your cluster with the following command, which removes all of these components in the correct order:
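Assuming the uninstall subcommand mirrors the install flags used in this guide, the removal typically looks like this:
smm uninstall -a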
Note: Uninstalling Service Mesh Manager does not remove the Custom Resource Definitions (CRDs) from the cluster, because removing a CRD removes all related resources. Since Service Mesh Manager uses several external components, this could remove things not belonging to Service Mesh Manager.
Run on a Kubernetes cluster
Recommended if you have a Kubernetes cluster and want to try out Service Mesh Manager quickly.
-
Create a cluster that meets the following resource requirements with your favorite provider.
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
|  | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) | 24 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
When enough headroom is available in the cluster, the default installation can support at least 150 running Pods and the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
-
Set Kubernetes configuration and context.
The Service Mesh Manager command-line tool uses your current Kubernetes context, from the file named in the KUBECONFIG environment variable (~/.kube/config by default). Check that this is the cluster where you plan to deploy the product by running the following command:
kubectl config get-contexts
If there are multiple contexts in the Kubeconfig file, specify the one you want to use with the use-context
parameter, for example:
kubectl config use-context <context-to-use>
-
Proceed to Install Service Mesh Manager.
2.3.2 - Licensing options
Service Mesh Manager is available in two different editions:
| Tier | Capacity | Support |
| --- | --- | --- |
| Free | Maximum of 10 nodes total in a maximum of 2 clusters. | Community |
| Pro | Maximum of 25 nodes total in any number of clusters. | Cisco |
| Enterprise | Unlimited, per-node pricing. | Cisco |
Free tier
The free tier allows using a maximum of 10 nodes across at most 2 clusters inside the same mesh (active/active or active/passive setups are allowed).
For example, you can use a single cluster with 10 worker nodes (counting the active nodes too), or you can have an active cluster with 6 worker nodes and a passive cluster attached with 4 additional nodes.
If you exceed these limits, the management functionalities of Service Mesh Manager become restricted until the number of nodes and clusters are back to the allowed values:
-
CLI commands for installing Service Mesh Manager and attaching clusters to the mesh will fail until the node count has been decreased.
Note: The uninstall command works regardless of the node count.
-
The dashboard shows an error detailing the license violations when you are over the limits. As Kubernetes is dynamic in nature, you can exceed the node limit by 1 for one day, so you can keep using the Service Mesh Manager dashboard during node rotations.
The Istio data plane is available regardless of the number of nodes, ensuring that no production outage happens due to a license violation.
To register for free-tier access, see Getting started with the Free Tier.
Paid tier
To buy a Pro (paid-tier) license, visit the Service Mesh Manager website. If you have purchased a Pro license, you must apply it to your Service Mesh Manager installations. To do so, complete the following steps.
-
Copy your license token into a file (for example, license.key).
-
Apply the license to your Service Mesh Manager installation. If you have a multi-cluster setup, apply it to the primary cluster (where the Service Mesh Manager control plane is running). Run the following command:
smm license apply --licenseKeyPath <license.key>
Note: If you are using Service Mesh Manager with a commercial license in a multi-cluster scenario, Service Mesh Manager automatically synchronizes the license to the attached clusters. If the peer cluster already has a license, it is automatically deleted and replaced with the license of the primary Service Mesh Manager cluster. Detaching a peer cluster automatically deletes the license from the peer cluster.
-
Run the following command to verify that the new license has been added to the Service Mesh Manager installation.
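Assuming the license subcommand used for apply above also lists the current license, the check could look like this (hypothetical invocation):
smm license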
Alternatively, you can open the Service Mesh Manager dashboard, open the user account in the top-right, then select License to display the details of the license.
The details of the license include:
- the number of permitted clusters (MaxClusters) in the mesh,
- the total number of permitted nodes in the mesh (MaxNodes), and
- the number of maximum permitted nodes for a cluster (MaxNodesPerCluster).
Exceeding the paid-tier license limit
In case you exceed the license limit for a paid-tier license, you lose access to the Service Mesh Manager dashboard until you decrease the size of the mesh to comply with the license limits.
The Istio data plane is available regardless of the number of nodes, ensuring that no production outage happens due to a license violation.
Enterprise license
To buy an enterprise license, contact your Cisco sales representative, or directly the Service Mesh Manager sales team.
To apply an enterprise license to your Service Mesh Manager installations, follow the steps described for the Paid tier.
2.3.3 - Create single cluster mesh
Prerequisites
You need the Service Mesh Manager CLI
tool installed on your computer and a Kubernetes cluster as described in the Prerequisites section.
Install Service Mesh Manager
For a quick demo or evaluation, complete the following steps to install Service Mesh Manager with every component, including the demo application. If you prefer a more interactive installation, see Installing Service Mesh Manager interactively.
Note: If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, Amazon EKS, AKS, or GKE) or kOps, the cluster name auto-discovered by Service Mesh Manager is incompatible with Kubernetes resource naming restrictions and Istio’s method of identifying clusters in a multicluster mesh.
In earlier Service Mesh Manager versions, you had to manually use the --cluster-name
parameter to set a cluster name that complies with the RFC 1123 DNS subdomain/label format (alphanumeric string without “_” or “.” characters).
Starting with Service Mesh Manager version 1.11, non-compliant names are automatically converted using the following rules:
- Replace ‘_’ characters with ‘-’
- Replace ‘.’ characters with ‘-’
- Replace ‘:’ characters with ‘-’
- Truncate the name to 63 characters
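For example, a hypothetical auto-discovered GKE-style name such as gke_my-project_europe-west1_demo is converted to gke-my-project-europe-west1-demo.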
-
Run the following command. This will install the main Service Mesh Manager components.
-
If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, Amazon EKS, AKS, or GKE) or kOps, run
smm install -a --cluster-name <name-of-your-cluster>
-
Otherwise, run
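Based on the all-components installation shown in the multi-cluster guide, this is typically:
smm install -a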
Service Mesh Manager supports KUBECONFIG contexts having the following authentication methods:
- certfile and keyfile
- certdata and keydata
- bearer token
- exec/auth provider
Username-password pairs are not supported.
If you are installing Service Mesh Manager in a test environment, you can install it without requiring authentication by running:
smm install --anonymous-auth -a --run-demo
If you experience errors during the installation, try running the installation in verbose mode: smm install -v
Note: If you are installing Service Mesh Manager on a local cluster (for example, using MiniKube) and you don’t have a local LoadBalancer setup, disable the meshexpansion gateway support.
To do that, create a file called local_icp_cr.yaml
with the following content:
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
name: mesh
namespace: istio-system
spec:
meshExpansion:
enabled: false
Then, run the following command: smm install --istio-cr-file local_icp_cr.yaml
-
Wait until the installation is completed. This can take a few minutes. Run the following command to open the dashboard.
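As shown later in the GitOps guide, the dashboard can be opened with:
smm dashboard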
If you don’t already have Istio workload and traffic, the dashboard will be empty. To install the demo application, run:
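Assuming the demoapp subcommand shown in the multi-cluster guide, installing all demo components on a single cluster typically looks like this:
smm demoapp install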
After installation, the demo application automatically starts generating traffic, and the dashboard draws a picture of the data flow. (If it doesn’t, run the smm demoapp load start
command, or Generate load on the UI. If you want to stop generating traffic, run smm demoapp load stop
.)
-
If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, AWS, Azure, or Google Cloud), assign admin roles so that you can tail the logs of your containers from the Service Mesh Manager UI and perform various tasks from the CLI that require custom permissions. Run the following command:
kubectl create clusterrolebinding user-cluster-admin --clusterrole=cluster-admin --user=<gcp/aws/azure username>
-
At this point, Service Mesh Manager is up and running. On the dashboard select MENU > TOPOLOGY to see how the traffic flows through your mesh, and experiment with any of the available features described in the documentation.
-
If you have purchased a commercial license for Service Mesh Manager, apply the license. For details, see Paid tier.
Install Service Mesh Manager interactively
With the interactive installation, you can:
- Install the Service Mesh Manager core, which provides a dashboard and an internal API for handling the service mesh.
- Install and execute the Istio operator.
- Install a demo application (optional).
Complete the following steps.
-
Start the installation.
-
If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, AWS, Azure, or GCP), run
smm install --cluster-name <name-of-your-cluster>
-
Otherwise, run
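Based on the managed-provider variant above, this is typically:
smm install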
If you experience errors during the installation, try running the installation in verbose mode: smm install -v
During installation, answer the interactive questions in the terminal.
? Install istio-operator (recommended). Press enter to accept Yes
? Install cert-manager (recommended). Press enter to accept Yes
? Install canary-operator (recommended). Press enter to accept Yes
? Install and run demo application (optional). Press enter to skip (y/N) y
Note: If you don’t need the demo application, simply accept the defaults by pressing Enter for each question; this installs only the core components. You can install additional components later.
-
Wait until the installation is completed. This can take a few minutes. If you have selected to install the demo application, the Service Mesh Manager dashboard automatically opens in your browser. Otherwise, run the following command to open the dashboard.
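As in the quick installation above, the dashboard can be opened with:
smm dashboard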
If you don’t already have Istio workload and traffic, the dashboard will be empty. To install the demo application, run:
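Assuming the demoapp subcommand shown in the multi-cluster guide, this is typically:
smm demoapp install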
After installation, the demo application automatically starts generating traffic, and the dashboard draws a picture of the data flow. (If it doesn’t, run the smm demoapp load start
command, or Generate load on the UI. If you want to stop generating traffic, run smm demoapp load stop
.)
-
If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, AWS, Azure, or Google Cloud), assign admin roles so that you can tail the logs of your containers from the Service Mesh Manager UI and perform various tasks from the CLI that require custom permissions. Run the following command:
kubectl create clusterrolebinding user-cluster-admin --clusterrole=cluster-admin --user=<gcp/aws/azure username>
-
At this point, Service Mesh Manager is up and running. On the dashboard select MENU > TOPOLOGY to see how the traffic flows through your mesh, and experiment with any of the available features described in the documentation.
-
If you have purchased a commercial license for Service Mesh Manager, apply the license. For details, see Paid tier.
2.3.4 - Create multi-cluster mesh
Prerequisites
To create a multi-cluster mesh with Service Mesh Manager, you need:
- At least two Kubernetes clusters, with access to their kubeconfig files.
- The
Service Mesh Manager CLI
tool installed on your computer.
- Network connectivity properly configured between the participating clusters.
Create a multi-cluster mesh
To create a multi-cluster mesh with Service Mesh Manager, complete the following steps.
-
Install Service Mesh Manager on the primary cluster. This installs all Service Mesh Manager components on the cluster. Run smm install -a --cluster-name <name-of-your-cluster>, or simply smm install -a.
Note: If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, Amazon EKS, AKS, or GKE) or kOps, the cluster name auto-discovered by Service Mesh Manager is incompatible with Kubernetes resource naming restrictions and Istio’s method of identifying clusters in a multicluster mesh.
In earlier Service Mesh Manager versions, you had to manually use the --cluster-name
parameter to set a cluster name that complies with the RFC 1123 DNS subdomain/label format (alphanumeric string without “_” or “.” characters).
Starting with Service Mesh Manager version 1.11, non-compliant names are automatically converted using the following rules:
- Replace ‘_’ characters with ‘-’
- Replace ‘.’ characters with ‘-’
- Replace ‘:’ characters with ‘-’
- Truncate the name to 63 characters
If you experience errors during the installation, try running the installation in verbose mode: smm install -v
Service Mesh Manager supports KUBECONFIG contexts having the following authentication methods:
- certfile and keyfile
- certdata and keydata
- bearer token
- exec/auth provider
Username-password pairs are not supported.
If you are installing Service Mesh Manager in a test environment, you can install it without requiring authentication by running:
smm install --anonymous-auth -a --run-demo
-
On the primary Service Mesh Manager cluster, attach the peer cluster to the mesh using one of the following commands.
Note: To understand the difference between the remote Istio and primary Istio clusters, see the Istio control plane models section in the official Istio documentation.
The short summary is that remote Istio clusters do not have a separate Istio control plane, while primary Istio clusters do.
The following commands automate the process of creating the resources necessary for the peer cluster, generating and setting up the kubeconfig for that cluster, and attaching the cluster to the mesh.
-
To attach a remote Istio cluster with the default options, run:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE>
-
To attach a primary Istio cluster (one that has an active Istio control plane installed), run:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --active-istio-control-plane
Note: If the name of the cluster cannot be used as a Kubernetes resource name (for example, because it contains the underscore, colon, or another special character), you must manually specify a name to use when you are attaching the cluster to the service mesh. For example:
smm istio cluster attach <PEER-CLUSTER-KUBECONFIG-FILE> --name <KUBERNETES-COMPLIANT-CLUSTER-NAME> --active-istio-control-plane
Otherwise, the following error occurs when you try to attach the cluster:
could not attach peer cluster: graphql: Secret "example-secret" is invalid: metadata.name: Invalid value: "gke_gcp-cluster_region": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.'
-
Verify the name that will be used to refer to the cluster in the mesh. To use the name of the cluster, press Enter.
Note: If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, Amazon EKS, AKS, or GKE) or kOps, the cluster name auto-discovered by Service Mesh Manager is incompatible with Kubernetes resource naming restrictions and Istio’s method of identifying clusters in a multicluster mesh.
In earlier Service Mesh Manager versions, you had to manually use the --cluster-name
parameter to set a cluster name that complies with the RFC 1123 DNS subdomain/label format (alphanumeric string without “_” or “.” characters).
Starting with Service Mesh Manager version 1.11, non-compliant names are automatically converted using the following rules:
- Replace ‘_’ characters with ‘-’
- Replace ‘.’ characters with ‘-’
- Replace ‘:’ characters with ‘-’
- Truncate the name to 63 characters
? Cluster must be registered. Please enter the name of the cluster (<current-name-of-the-cluster>)
-
Wait until the peer cluster is attached. Attaching the peer cluster takes some time, because it can be completed only after the ingress gateway address becomes available. You can verify that the peer cluster is attached successfully with the following command:
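Assuming the istio cluster subcommand family used above, the status check could look like this (hypothetical invocation):
smm istio cluster status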
The process is finished when you see Available in the Status field of all clusters.
To attach other clusters, or to customize the network settings of the cluster, see Attach a new cluster to the mesh.
-
Deploy the demo application. You can deploy the demo application in a distributed way to multiple clusters with the following commands:
smm demoapp install -s frontpage,catalog,bookings,postgresql
smm -c <PEER_CLUSTER_KUBECONFIG_FILE> demoapp install -s movies,payments,notifications,analytics,database,mysql --peer
After installation, the demo application automatically starts generating traffic, and the dashboard draws a picture of the data flow. (If it doesn’t, run the smm demoapp load start
command, or Generate load on the UI. If you want to stop generating traffic, run smm demoapp load stop
.)
If you are looking to deploy your own application, check out Deploy custom application for some guidelines.
-
If you are installing Service Mesh Manager on a managed Kubernetes solution of a public cloud provider (for example, AWS, Azure, or Google Cloud), assign admin roles so that you can tail the logs of your containers from the Service Mesh Manager UI and perform various tasks from the CLI that require custom permissions. Run the following command:
kubectl create clusterrolebinding user-cluster-admin --clusterrole=cluster-admin --user=<gcp/aws/azure username>
-
Open the dashboard and look around.
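If the Service Mesh Manager CLI is installed on your machine, the dashboard can be opened with:
smm dashboard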
-
If you have purchased a commercial license for Service Mesh Manager, apply the license. For details, see Paid tier.
Cleanup
-
To remove the demo application from a peer cluster, run the following command:
smm -c <PEER_CLUSTER_KUBECONFIG_FILE> demoapp uninstall
-
To remove a peer cluster from the mesh, run the following command:
smm istio cluster detach <PEER_CLUSTER_KUBECONFIG_FILE>
For details, see Detach a cluster from the mesh.
2.3.5 - SMM Operator helm charts
You can deploy Service Mesh Manager by using Helm with the SMM Operator chart.
SMM Operator is a Kubernetes operator that deploys and manages Service Mesh Manager. In this chart, the CRDs are not managed by the operator; we expect your CI/CD tooling to take care of updating them.
CAUTION:
Installing Service Mesh Manager by using the smm-operator is recommended only for advanced users. In general, the recommended method is to install Service Mesh Manager by using the
smm CLI tool, as the tool handles all the integration and setup tasks of the ControlPlane resource.
2.3.5.1 - Install SMM with the SMM Operator chart
SMM Operator is a Kubernetes operator that deploys and manages Service Mesh Manager. In this chart, the CRDs are not managed by the operator; we expect your CI/CD tooling to take care of updating them.
If you have your own cluster deployed and are authorized to fetch images from the Cisco-provided repositories, you can rely on basic authentication (URL, username, and password) to pull the images.
You can get a username and password by signing up for the Free tier version of Service Mesh Manager.
Prerequisites
Helm version 3.7 or newer.
Steps
-
Create two namespaces: one for smm-operator (called smm-registry-access), and one for cert-manager:
kubectl create ns smm-registry-access
kubectl create ns cert-manager
(The smm-registry-access
namespace is used because smm-operator should be in the same namespace as the imagepullsecrets-controller.)
-
Run the following helm commands. Replace <your-username>
and <your-password>
with the ones shown on your Service Mesh Manager download page.
export HELM_EXPERIMENTAL_OCI=1
echo <your-password> | helm registry login registry.eticloud.io -u '<your-username>' --password-stdin
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --version 1.11.0
helm install \
--namespace=smm-registry-access \
--set "global.ecr.enabled=false" \
--set "global.basicAuth.username=<your-username>" \
--set "global.basicAuth.password=<your-password>" \
smm-operator \
oci://registry.eticloud.io/smm-charts/smm-operator --version 1.11.0
-
Install Service Mesh Manager by creating a ControlPlane resource. We recommend that you start with the following ControlPlane resource. This CR assumes that you are using docker-registry authentication; the secrets referenced in .spec.registryAccess are used to pull the smm-operator image and are synced across the other namespaces created by the smm-operator chart.
Replace <cluster-name>
with the name of your cluster. The cluster name format must comply with the RFC 1123 DNS subdomain/label format (alphanumeric string without “_” or “.” characters). Otherwise, you get an error message starting with: Reconciler error: cannot determine cluster name controller=controlplane, controllerGroup=smm.cisco.com, controllerKind=ControlPlane
kubectl apply -f - << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
clusterName: <cluster-name>
certManager:
enabled: true
namespace: cert-manager
clusterRegistry:
enabled: true
namespace: cluster-registry
log: {}
meshManager:
enabled: true
istio:
enabled: true
istioCRRef:
name: cp-v115x
namespace: istio-system
operators:
namespace: smm-system
namespace: smm-system
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
oneEye: {}
registryAccess:
enabled: true
imagePullSecretsController: {}
namespace: smm-registry-access
pullSecrets:
- name: smm-registry.eticloud.io-pull-secret
namespace: smm-registry-access
repositoryOverride:
host: registry.eticloud.io
prefix: smm
role: active
smm:
als:
enabled: true
log: {}
application:
enabled: true
log: {}
auth:
mode: impersonation
certManager:
enabled: true
enabled: true
federationGateway:
enabled: true
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
leo:
enabled: true
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
prometheusOperator: {}
releaseName: smm
role: active
sre:
enabled: true
useIstioResources: true
EOF
Uninstalling the chart
To uninstall/delete the ControlPlane
resource and smm-operator
release, run:
kubectl delete controlplanes.smm.cisco.com smm
helm uninstall --namespace=smm-registry-access smm-operator
Chart configuration
The following table lists the configurable parameters of the Service Mesh Manager chart and their default values.
| Parameter | Description | Default |
| --- | --- | --- |
| operator.image.repository | Operator container image repository | registry.eticloud.io/smm/smm-operator |
| operator.image.tag | Operator container image tag | Same as chart version |
| operator.image.pullPolicy | Operator container image pull policy | IfNotPresent |
| operator.resources | CPU/Memory resource requests/limits (YAML) | Memory: 256Mi, CPU: 200m |
| prometheusMetrics.enabled | If true, use direct access for Prometheus metrics | false |
| prometheusMetrics.authProxy.enabled | If true, use auth proxy for Prometheus metrics | true |
| prometheusMetrics.authProxy.image.repository | Auth proxy container image repository | gcr.io/kubebuilder/kube-rbac-proxy |
| prometheusMetrics.authProxy.image.tag | Auth proxy container image tag | v0.5.0 |
| prometheusMetrics.authProxy.image.pullPolicy | Auth proxy container image pull policy | IfNotPresent |
| rbac.enabled | Create rbac service account and roles | true |
| rbac.psp.enabled | Create pod security policy and binding | false |
| ecr.enabled | Should SMM Operator Chart handle the ECR login procedure | true |
| ecr.accessKeyID | Access Key ID to be used for ECR logins | Empty |
| ecr.secretAccessKey | Secret Access Key to be used for ECR logins | Empty |
2.3.5.2 - The ControlPlane Custom Resource
Service Mesh Manager installs the ControlPlane Custom Resource with the following default values.
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
smm:
als:
enabled: true
log: {}
application:
enabled: true
log: {}
auth:
mode: impersonation
certManager:
enabled: true
enabled: true
highAvailability:
enabled: true
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
leo:
enabled: true
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
releaseName: smm
sre:
enabled: true
useIstioResources: true
web:
enabled: true
canaryOperator:
enabled: false
namespace: smm-canary
prometheusURL: http://smm-prometheus.smm-system.svc.cluster.local:59090/prometheus
releaseName: ""
certManager:
manageNamespace: true
enabled: true
namespace: cert-manager
clusterRegistry:
enabled: true
namespace: cluster-registry
meshManager:
enabled: true
istio:
istioCRRef:
name: cp-v115x
namespace: istio-system
namespace: smm-system
prometheusMetrics:
authProxy:
image:
repository: quay.io/brancz/kube-rbac-proxy
tag: v0.11.0
log: {}
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
oneEye: {}
registryAccess:
enabled: true
imagePullSecretsController: {}
namespace: smm-registry-access
pullSecrets:
- name: ecr-smm-operator-eti-sre
namespace: smm-system
- name: ecr-smm-operator-banzai-customer
namespace: smm-system
role: active
To understand how Service Mesh Manager can be customized through its CRs, see Customize Installation.
2.3.6 - Install SMM - GitOps - single cluster
This guide details how to set up a GitOps environment for Service Mesh Manager using Argo CD. The same principles can be used for other tools as well.
CAUTION:
Do not push the secrets directly into the git repository, especially when it is a public repository. Argo CD provides solutions to
keep secrets safe.
Architecture
The high level architecture for Argo CD with a single-cluster Service Mesh Manager consists of the following components:
- A git repository that stores the various charts and manifests,
- a management cluster that runs the Argo CD server, and
- the Service Mesh Manager cluster managed by Argo CD.
Prerequisites
To complete this procedure, you need:
- A free registration for the Service Mesh Manager download page
- A Kubernetes cluster to deploy Argo CD on (called management-cluster in the examples).
- A Kubernetes cluster to deploy Service Mesh Manager on (called workload-cluster-1 in the examples).
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
|  | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) | 24 vCPU in total; 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
When enough headroom is available in the cluster, the default installation can support at least 150 running Pods and the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
Procedure overview
The high-level steps of the procedure are:
- Install Argo CD and register the clusters
- Prepare the Git repository
- Deploy Service Mesh Manager
Install Argo CD
Complete the following steps to install Argo CD on the management cluster.
Set up the environment
-
Set the KUBECONFIG location and context name for the management-cluster
cluster.
MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
MANAGEMENT_CLUSTER_CONTEXT=management-cluster
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* management-cluster management-cluster
-
Set the KUBECONFIG location and context name for the workload-cluster-1
cluster.
WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1
kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* workload-cluster-1 workload-cluster-1
Repeat this step for any additional workload clusters you want to use.
-
Make sure the management-cluster
Kubernetes context is the current context.
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" use-context "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
Switched to context "management-cluster".
Install Argo CD Server
-
Install the Argo CD Server. Run the following commands.
kubectl create namespace argocd
Expected output:
namespace/argocd created
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
-
Wait until the installation is complete, then check that the Argo CD pods are up and running.
kubectl get pods -n argocd
The output should be similar to:
NAME READY STATUS RESTARTS AGE
pod/argocd-application-controller-0 1/1 Running 0 7h59m
pod/argocd-applicationset-controller-78b8b554f9-pgwbl 1/1 Running 0 7h59m
pod/argocd-dex-server-6bbc85c688-8p7zf 1/1 Running 0 16h
pod/argocd-notifications-controller-75847756c5-dbbm5 1/1 Running 0 16h
pod/argocd-redis-f4cdbff57-wcpxh 1/1 Running 0 7h59m
pod/argocd-repo-server-d5c7f7ffb-c8962 1/1 Running 0 7h59m
pod/argocd-server-76497676b-pnvf4 1/1 Running 0 7h59m
-
For the Argo CD UI, set the argocd-server service
type to LoadBalancer
.
kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'
Expected output:
service/argocd-server patched
-
Patch the App of Apps health check in Argo CD configuration to ignore diffs of controller/operator managed fields. For details about this patch, see the Argo CD documentation sections Resource Health and Diffing Customization.
Apply the new Argo CD health check configurations:
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: argocd-cm
namespace: argocd
labels:
app.kubernetes.io/name: argocd-cm
app.kubernetes.io/part-of: argocd
data:
# App of app health check
resource.customizations.health.argoproj.io_Application: |
hs = {}
hs.status = "Progressing"
hs.message = ""
if obj.status ~= nil then
if obj.status.health ~= nil then
hs.status = obj.status.health.status
if obj.status.health.message ~= nil then
hs.message = obj.status.health.message
end
end
end
return hs
# Ignoring RBAC changes made by AggregateRoles
resource.compareoptions: |
# disables status field diffing in specified resource types
ignoreAggregatedRoles: true
# disables status field diffing in specified resource types
# 'crd' - CustomResourceDefinition-s (default)
# 'all' - all resources
# 'none' - disabled
ignoreResourceStatusField: all
EOF
Expected output:
configmap/argocd-cm configured
-
Get the initial password for the admin
user.
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo
-
Check the external-ip-or-hostname
address of the argocd-server
service.
kubectl get service -n argocd argocd-server
The output should be similar to:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 10.108.14.130 external-ip-or-hostname 80:31306/TCP,443:30063/TCP 7d13h
-
Open the https://external-ip-or-hostname
URL and log in to the Argo CD server using the password received in the previous step.
# Exactly one of hostname or IP will be available and used for the remote URL.
open https://$(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}')
Install Argo CD CLI
-
Install Argo CD CLI on your computer. For details, see the Argo CD documentation.
-
Log in with the CLI:
# Exactly one of hostname or IP will be available and used for the remote URL.
argocd login $(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}') --insecure --username admin --password $(kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d)
Expected output:
'admin:login' logged in successfully
For more details about Argo CD installation, see the Argo CD getting started guide.
Register clusters
-
Register the clusters that will run Service Mesh Manager in Argo CD. In this example, register workload-cluster-1
using one of the following methods.
-
Register the cluster from the command line by running:
argocd cluster add --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `workload-cluster-1` with full cluster level privileges. Do you want to continue [y/N]? y
INFO[0005] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0005] ClusterRole "argocd-manager-role" created
INFO[0005] ClusterRoleBinding "argocd-manager-role-binding" created
INFO[0011] Created bearer token secret for ServiceAccount "argocd-manager"
Cluster 'https://workload-cluster-1-ip-or-hostname' added
-
Alternatively, you can register clusters declaratively as Kubernetes secrets. Modify the following command for your environment and apply it. For details, see the Argo CD documentation.
WORKLOAD_CLUSTER_1_IP="https://workload-cluster-1-IP" ARGOCD_BEARER_TOKEN="authentication-token" ARGOCD_CA_B64="base64 encoded certificate" ; kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
name: workload-cluster-1-secret
labels:
argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
name: workload-cluster-1
server: "${WORKLOAD_CLUSTER_1_IP}"
config: |
{
"bearerToken": "${ARGOCD_BEARER_TOKEN}",
"tlsClientConfig": {
"insecure": false,
"caData": "${ARGOCD_CA_B64}"
}
}
EOF
-
Make sure that the cluster is registered in Argo CD by running the following command:
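You can list the registered clusters with the Argo CD CLI:
argocd cluster list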
The output should be similar to:
SERVER NAME VERSION STATUS MESSAGE PROJECT
https://kubernetes.default.svc in-cluster Unknown Cluster has no applications and is not being monitored.
https://workload-cluster-1-ip-or-hostname workload-cluster-1 Unknown Cluster has no applications and is not being monitored.
Prepare Git repository
-
Create an empty repository called calisti-gitops on GitHub (or another provider that Argo CD supports) and initialize it with a README.md file so that you can clone the repository. Because Service Mesh Manager credentials will be stored in this repository, make it a private repository.
GITHUB_ID="github-id"
GITHUB_REPOSITORY_NAME="calisti-gitops"
-
Obtain a personal access token to the repository (on GitHub, see Creating a personal access token), that has the following permissions:
- admin:org_hook
- admin:repo_hook
- read:org
- read:public_key
- repo
-
Log in with your personal access token with git
.
export GH_TOKEN="github-personal-access-token" # Note: this environment variable needs to be exported so the `git` binary is going to use it automatically for authentication.
-
Clone the repository into your local workspace.
git clone "https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git"
Expected output:
Cloning into 'calisti-gitops'...
remote: Enumerating objects: 144, done.
remote: Counting objects: 100% (144/144), done.
remote: Compressing objects: 100% (93/93), done.
remote: Total 144 (delta 53), reused 135 (delta 47), pack-reused 0
Receiving objects: 100% (144/144), 320.08 KiB | 746.00 KiB/s, done.
Resolving deltas: 100% (53/53), done.
-
Add the repository to Argo CD by running the following command. Alternatively, you can add it on Argo CD Web UI.
argocd repo add "https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git" --name "${GITHUB_REPOSITORY_NAME}" --username "${GITHUB_ID}" --password "${GH_TOKEN}"
Expected output:
Repository 'https://github.com/github-id/calisti-gitops.git' added
-
Verify that the repository is connected by running:
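You can list the configured repositories with the Argo CD CLI:
argocd repo list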
In the output, Status should be Successful:
TYPE NAME REPO INSECURE OCI LFS CREDS STATUS MESSAGE PROJECT
git calisti-gitops https://github.com/github-id/calisti-gitops.git false false false true Successful
-
Change into the root directory of the cloned repository and create the following directories.
cd "${GITHUB_REPOSITORY_NAME}"
mkdir -p apps/demo-app apps/smm-controlplane apps/smm-operator charts demo-app manifests
The final structure of the repository will look like this:
.
├── apps
│ ├── demo-app
│ │ └── demo-app.yaml
│ ├── smm-controlplane
│ │ └── smm-controlplane.yaml
│ └── smm-operator
│ └── smm-operator.yaml
├── charts
│ └── smm-operator
│ └── ...
├── demo-app
│ ├── demo-app-ns.yaml
│ └── demo-app.yaml
└── manifests
├── cert-manager-namespace.yaml
├── smm-controlplane.yaml
├── istio-cp-v115x.yaml
└── istio-system-namespace.yaml
- The apps folder contains the Argo CD Applications for smm-operator, smm-controlplane, and demo-app.
- The charts folder contains the Helm chart of smm-operator.
- The demo-app folder contains the manifest files of the demo application that represents your business application.
- The manifests folder contains the smm-controlplane file, the istio-controlplane file, and the cert-manager and istio-system namespace files.
Prepare the helm charts
-
You need an active Service Mesh Manager registration to download the Service Mesh Manager charts and images. You can sign up for free, or obtain Enterprise credentials on the official Cisco Service Mesh Manager page. After registration, you can obtain your username and password from the Download Center. Set them as environment variables.
CALISTI_USERNAME="<your-calisti-username>"
CALISTI_PASSWORD="<your-calisti-password>"
-
Download the smm-operator chart from registry.eticloud.io into the charts directory of your Service Mesh Manager GitOps repository and extract it. Run the following commands:
export HELM_EXPERIMENTAL_OCI=1 # Needed prior to Helm version 3.8.0
echo "${CALISTI_PASSWORD}" | helm registry login registry.eticloud.io -u "${CALISTI_USERNAME}" --password-stdin
Expected output:
Login Succeeded
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.11.0
Expected output:
Pulled: registry.eticloud.io/smm-charts/smm-operator:latest-stable-version
Digest: sha256:someshadigest
Deploy Service Mesh Manager
Deploy the smm-operator application
Complete the following steps to deploy the smm-operator
chart using Argo CD.
-
Create an Argo CD Application CR for smm-operator
.
Before running the following command, edit it if needed:
- If you are not using a GitHub repository, set the
repoURL
field to your repository.
- Set the value of the
PUBLIC_API_SERVER_ENDPOINT_ADDRESS
variable to the public API endpoint of your cluster. Some managed Kubernetes solutions of public cloud providers have different API Server endpoints for internal and public access.
ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}" PUBLIC_API_SERVER_ENDPOINT_ADDRESS="" ; cat > "apps/smm-operator/smm-operator-app.yaml" <<EOF
# apps/smm-operator/smm-operator-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: smm-operator
namespace: argocd
finalizers:
- resources-finalizer.argocd.argoproj.io
spec:
project: default
source:
repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
targetRevision: HEAD
path: charts/smm-operator
helm:
parameters:
- name: "global.ecr.enabled"
value: 'false'
- name: "global.basicAuth.username"
value: "${CALISTI_USERNAME}"
- name: "global.basicAuth.password"
value: "${CALISTI_PASSWORD}"
- name: "apiServerEndpointAddress"
value: "${PUBLIC_API_SERVER_ENDPOINT_ADDRESS}" # The publicly accessible address of the k8s api server. Some Cloud providers have different API Server endpoint for internal and for public access. In that case the public endpoint needs to be specified here.
destination:
name: ${ARGOCD_CLUSTER_NAME}
namespace: smm-registry-access
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- Validate=false
- PruneLast=true
- CreateNamespace=true
- Replace=true
EOF
-
Commit and push the calisti-gitops
repository.
git add apps/smm-operator charts/smm-operator
git commit -m "add smm-operator app"
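Push the commit (the expected output below corresponds to the push):
git push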
Expected output:
Enumerating objects: 48, done.
Counting objects: 100% (48/48), done.
Delta compression using up to 12 threads
Compressing objects: 100% (44/44), done.
Writing objects: 100% (47/47), 282.18 KiB | 1.99 MiB/s, done.
Total 47 (delta 20), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (20/20), done.
To github.com:pregnor/calisti-gitops.git
+ 8dd47c2...db9e7af main -> main (forced update)
-
Apply the Application manifest.
kubectl apply -f "apps/smm-operator/smm-operator-app.yaml"
Expected output:
application.argoproj.io/smm-operator created
-
Verify that the applications have been added to Argo CD and are healthy.
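You can list the applications with the Argo CD CLI:
argocd app list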
Expected output:
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS REPO PATH TARGET
smm-operator workload-cluster-1 smm-registry-access default Synced Healthy Auto-Prune <none> https://github.com/github-id/calisti-gitops.git charts/smm-operator HEAD
-
Check the smm-operator
application on the Argo CD Web UI.
Deploy the smm-controlplane application
-
Create the cert-manager namespace used by the Service Mesh Manager ControlPlane.
cat > manifests/cert-manager-namespace.yaml <<EOF
# manifests/cert-manager-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
annotations:
argocd.argoproj.io/sync-wave: "1"
name: cert-manager
EOF
-
Create the istio-system-namespace.yaml file.
cat > manifests/istio-system-namespace.yaml << EOF
# manifests/istio-system-namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
annotations:
argocd.argoproj.io/sync-wave: "2"
name: istio-system
EOF
-
Create the istio-cp-v115x.yaml
file.
cat > manifests/istio-cp-v115x.yaml << EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
containerImageConfiguration:
imagePullPolicy: Always
imagePullSecrets:
- name: smm-pull-secret
distribution: cisco
istiod:
deployment:
env:
- name: ISTIO_MULTIROOT_MESH
value: "true"
image: registry.eticloud.io/smm/istio-pilot:v1.15.3-bzc.0
k8sResourceOverlays:
- groupVersionKind:
group: apps
kind: Deployment
version: v1
objectKey:
name: istiod-cp-v115x
namespace: istio-system
patches:
- path: /spec/template/spec/containers/0/args/-
type: replace
value: --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256
meshConfig:
defaultConfig:
envoyAccessLogService:
address: smm-als.smm-system.svc.cluster.local:50600
tcpKeepalive:
interval: 10s
probes: 3
time: 10s
tlsSettings:
mode: ISTIO_MUTUAL
holdApplicationUntilProxyStarts: true
proxyMetadata:
ISTIO_META_ALS_ENABLED: "true"
PROXY_CONFIG_XDS_AGENT: "true"
tracing:
tlsSettings:
mode: ISTIO_MUTUAL
zipkin:
address: smm-zipkin.smm-system.svc.cluster.local:59411
enableEnvoyAccessLogService: true
enableTracing: true
meshExpansion:
enabled: true
gateway:
deployment:
podMetadata:
labels:
app: istio-meshexpansion-gateway
istio: meshexpansiongateway
service:
ports:
- name: tcp-smm-als-tls
port: 50600
protocol: TCP
targetPort: 50600
- name: tcp-smm-zipkin-tls
port: 59411
protocol: TCP
targetPort: 59411
meshID: mesh1
mode: ACTIVE
networkName: network1
proxy:
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
proxyInit:
cni:
daemonset:
image: registry.eticloud.io/smm/istio-install-cni:v1.15.3-bzc.0
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
sidecarInjector:
deployment:
image: registry.eticloud.io/smm/istio-sidecar-injector:v1.15.3-bzc.0
version: 1.15.3
EOF
-
Create the smm-controlplane
CR for the ControlPlane
.
ISTIO_MINOR_VERSION="1.15" ; cat > "manifests/smm-controlplane.yaml" <<EOF
# manifests/smm-controlplane.yaml
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "10"
name: smm
spec:
certManager:
enabled: true
namespace: cert-manager
clusterName: ${ARGOCD_CLUSTER_NAME}
clusterRegistry:
enabled: true
namespace: cluster-registry
log: {}
meshManager:
enabled: true
istio:
enabled: true
istioCRRef:
name: cp-v${ISTIO_MINOR_VERSION/.}x
namespace: istio-system
operators:
namespace: smm-system
namespace: smm-system
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
oneEye: {}
registryAccess:
enabled: true
imagePullSecretsController: {}
namespace: smm-registry-access
pullSecrets:
- name: smm-registry.eticloud.io-pull-secret
namespace: smm-registry-access
repositoryOverride:
host: registry.eticloud.io
prefix: smm
role: active
smm:
exposeDashboard:
meshGateway:
enabled: true
als:
enabled: true
log: {}
application:
enabled: true
log: {}
auth:
forceUnsecureCookies: true
mode: anonymous
certManager:
enabled: true
enabled: true
federationGateway:
enabled: true
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
impersonation:
enabled: true
istio:
revision: cp-v${ISTIO_MINOR_VERSION/.}x.istio-system
leo:
enabled: true
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
prometheusOperator: {}
releaseName: smm
role: active
sre:
enabled: true
useIstioResources: true
EOF
-
Create the Argo CD Application CR for the smm-controlplane
.
ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}" ; cat > "apps/smm-controlplane/smm-controlplane-app.yaml" <<EOF
# apps/smm-controlplane/smm-controlplane-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: smm-controlplane
namespace: argocd
finalizers:
- resources-finalizer.argocd.argoproj.io
spec:
project: default
source:
repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
targetRevision: HEAD
path: manifests
destination:
name: ${ARGOCD_CLUSTER_NAME}
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- Validate=false
- CreateNamespace=true
- PrunePropagationPolicy=foreground
- PruneLast=true
- Replace=true
EOF
-
Commit the changes and push the calisti-gitops
repository.
git add apps/smm-controlplane manifests
git commit -m "add smm-controlplane app"
Expected output:
[main 25ba7e8] add smm-controlplane app
5 files changed, 212 insertions(+)
create mode 100644 apps/smm-controlplane/smm-controlplane-app.yaml
create mode 100644 manifests/cert-manager-namespace.yaml
create mode 100644 manifests/istio-cp-v115x.yaml
create mode 100644 manifests/istio-system-namespace.yaml
create mode 100644 manifests/smm-controlplane.yaml
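Push the commit (the expected output below corresponds to the push):
git push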
Expected output:
Enumerating objects: 12, done.
Counting objects: 100% (12/12), done.
Delta compression using up to 10 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 2.70 KiB | 2.70 MiB/s, done.
Total 10 (delta 1), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (1/1), done.
To github.com:<username>/calisti-gitops.git
529545a..25ba7e8 main -> main
-
Apply the Application manifest.
kubectl apply -f "apps/smm-controlplane/smm-controlplane-app.yaml"
Expected output:
application.argoproj.io/smm-controlplane created
-
Verify that the application has been added to Argo CD and is healthy.
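You can list the applications with the Argo CD CLI:
argocd app list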
Expected output:
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS REPO PATH TARGET
smm-controlplane workload-cluster-1 default Synced Healthy Auto-Prune <none> https://github.com/github-id/calisti-gitops.git manifests HEAD
smm-operator workload-cluster-1 smm-registry-access default Synced Healthy Auto-Prune <none> https://github.com/github-id/calisti-gitops.git charts/smm-operator HEAD
-
Check that all pods are healthy and running in the smm-system
namespace of workload-cluster-1
.
kubectl get pods -n smm-system --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
-
Check the application on Argo CD Web UI.
# Exactly one of hostname or IP will be available and used for the remote URL.
open https://$(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}')
At this point, you have successfully installed smm-operator
and smm-controlplane
on workload-cluster-1
.
Deploy an application
If you want to deploy an application into the service mesh, complete the following steps. The examples use the Service Mesh Manager demo application.
-
Create a namespace for the application by creating the demo-app-ns.yaml file.
cat > demo-app/demo-app-ns.yaml << EOF
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: smm-demo
app.kubernetes.io/name: smm-demo
app.kubernetes.io/part-of: smm-demo
app.kubernetes.io/version: 0.1.4
istio.io/rev: cp-v115x.istio-system
name: smm-demo
EOF
-
Create the demo-app.yaml
file.
cat > demo-app/demo-app.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
deployIstioResources: true
deploySLOResources: true
enabled: true
enabledComponents:
- frontpage
- catalog
- bookings
- postgresql
- payments
- notifications
- movies
- analytics
- database
- mysql
istio:
revision: cp-v115x.istio-system
load:
enabled: true
maxRPS: 30
minRPS: 10
swingPeriod: 1380000000000
replicas: 1
resources:
limits:
cpu: "2"
memory: 192Mi
requests:
cpu: 40m
memory: 64Mi
EOF
-
Create an Argo CD Application manifest for the application: create the apps/demo-app/demo-app.yaml file.
ARGOCD_CLUSTER_NAME="${WORKLOAD_CLUSTER_1_CONTEXT}" ; cat > apps/demo-app/demo-app.yaml << EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: demo-app
namespace: argocd
finalizers:
- resources-finalizer.argocd.argoproj.io
spec:
project: default
source:
repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
targetRevision: HEAD
path: demo-app
destination:
name: ${ARGOCD_CLUSTER_NAME}
namespace: smm-demo
syncPolicy:
automated:
prune: true
selfHeal: true
syncOptions:
- Validate=false
- CreateNamespace=true
- PruneLast=true
- Replace=true
EOF
-
Commit and push the calisti-gitops
repository.
git add apps/demo-app demo-app
git commit -m "add demo app"
Expected output:
[main 58a236e] add demo app
3 files changed, 74 insertions(+)
create mode 100644 apps/demo-app/demo-app.yaml
create mode 100644 demo-app/demo-app-ns.yaml
create mode 100644 demo-app/demo-app.yaml
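Push the commit (the expected output below corresponds to the push):
git push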
Expected output:
Enumerating objects: 10, done.
Counting objects: 100% (10/10), done.
Delta compression using up to 10 threads
Compressing objects: 100% (7/7), done.
Writing objects: 100% (8/8), 1.37 KiB | 1.37 MiB/s, done.
Total 8 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:<username>/calisti-gitops.git
e16549e..58a236e main -> main
-
Deploy the application.
kubectl apply -f apps/demo-app/demo-app.yaml
-
Wait until all the pods in the application namespace (smm-demo
) are up and running.
kubectl get pods -n smm-demo --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME READY STATUS RESTARTS AGE
analytics-v1-7899bd4d4-bnf24 2/2 Running 0 109s
bombardier-6455fd74f6-jndpv 2/2 Running 0 109s
bookings-v1-559768454c-7vhzr 2/2 Running 0 109s
catalog-v1-99b7bb56d-fjvhl 2/2 Running 0 109s
database-v1-5cb4b4ff67-95ttk 2/2 Running 0 109s
frontpage-v1-5b4dcbfcb4-djr72 2/2 Running 0 108s
movies-v1-78fcf666dc-z8c2z 2/2 Running 0 108s
movies-v2-84d9f5658f-kc65j 2/2 Running 0 108s
movies-v3-86bbbc9745-r84bl 2/2 Running 0 108s
mysql-d6b6b78fd-b7dwb 2/2 Running 0 108s
notifications-v1-794c5dd8f6-lndh4 2/2 Running 0 108s
payments-v1-858d4b4ffc-vtxxl 2/2 Running 0 108s
postgresql-555fd55bdb-jn5pq 2/2 Running 0 108s
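Alternatively, you can block until every pod in the namespace reports Ready (a convenience command, assuming a 5-minute timeout is acceptable in your environment):
kubectl wait --for=condition=Ready pods --all -n smm-demo --timeout=300s --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"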
-
Verify that the application appears on the Argo CD admin view, and that it is Healthy and Synced.
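If you prefer the command line, you can check the same status with the Argo CD CLI (assuming you are still logged in with the argocd CLI):
argocd app get demo-app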
Access the Service Mesh Manager dashboard
-
You can access the Service Mesh Manager dashboard via the smm-ingressgateway-external
LoadBalancer external-ip-or-hostname
address. Run the following command to retrieve the IP address:
kubectl get services -n smm-system smm-ingressgateway-external --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
smm-ingressgateway-external LoadBalancer 10.0.0.199 external-ip-or-hostname 80:32505/TCP 2m28s
-
Open the Service Mesh Manager dashboard using one of the following methods:
-
Open the http://<external-ip-or-hostname>
URL in your browser.
-
Run the following command to open the dashboard with your default browser:
# Exactly one of hostname or IP will be available and used for the remote URL.
open http://$(kubectl get services -n smm-system smm-ingressgateway-external -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}' --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}")
-
If you have installed the Service Mesh Manager CLI on your machine, run the following command to open the Service Mesh Manager Dashboard in the default browser.
smm dashboard --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
✓ validate-kubeconfig ❯ checking cluster reachability...
✓ opening Service Mesh Manager at http://127.0.0.1:50500
-
Check the deployments on the dashboard, for example, on the MENU > Overview, MENU > MESH, and MENU > TOPOLOGY pages.
2.3.7 - Install SMM - GitOps - multi-cluster
This guide details how to set up a multi-cluster Service Mesh Manager scenario in a GitOps environment for Service Mesh Manager using Argo CD. The same principles can be used for other tools as well.
CAUTION:
Do not push the secrets directly into the git repository, especially when it is a public repository. Argo CD provides solutions to
keep secrets safe.
Architecture
Service Mesh Manager supports multiple mesh topologies, so you can use the one that best fits your use cases. In multi-cluster configurations it provides automatic locality load-balancing.
The high level architecture for Argo CD with a multi-cluster Service Mesh Manager setup consists of the following components:
- A git repository that stores the various charts and manifests,
- a management cluster that runs the Argo CD server, and
- the Service Mesh Manager clusters managed by Argo CD.
Deployment models
When deploying Service Mesh Manager in a multi-cluster scenario you can deploy Service Mesh Manager in an active-passive model. For details on Service Mesh Manager clusters and their relationship to Istio clusters, see Istio clusters and SMM clusters.
Prerequisites
- A free registration for the Service Mesh Manager download page
- A Kubernetes cluster to deploy Argo CD on (called
management-cluster
in the examples).
- Two Kubernetes clusters to deploy Service Mesh Manager on (called
workload-cluster-1
and workload-cluster-2
in the examples).
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
| | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
|---|---|---|
| CPU | 12 vCPU in total, 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) | 24 vCPU in total, 4 vCPU available for allocation per worker node (if you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS) |
| Memory | 16 GB in total, 2 GB available for allocation per worker node | 36 GB in total, 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability increases this value.
The default installation, when enough headroom is available in the cluster, should be able to support at least 150 running Pods and the same number of Services. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
Install Argo CD
Complete the following steps to install Argo CD on the management cluster.
Set up the environment
-
Set the KUBECONFIG location and context name for the management-cluster
cluster.
MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
MANAGEMENT_CLUSTER_CONTEXT=management-cluster
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* management-cluster management-cluster
-
Set the KUBECONFIG location and context name for the workload-cluster-1
cluster.
WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1
kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* workload-cluster-1 workload-cluster-1
Repeat this step for any additional workload clusters you want to use.
-
Make sure the management-cluster
Kubernetes context is the current context.
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" use-context "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
Switched to context "management-cluster".
Install Argo CD Server
-
Install the Argo CD Server. Run the following commands.
kubectl create namespace argocd
Expected output:
namespace/argocd created
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
-
Wait until the installation is complete, then check that the Argo CD pods are up and running.
kubectl get pods -n argocd
The output should be similar to:
NAME READY STATUS RESTARTS AGE
pod/argocd-application-controller-0 1/1 Running 0 7h59m
pod/argocd-applicationset-controller-78b8b554f9-pgwbl 1/1 Running 0 7h59m
pod/argocd-dex-server-6bbc85c688-8p7zf 1/1 Running 0 16h
pod/argocd-notifications-controller-75847756c5-dbbm5 1/1 Running 0 16h
pod/argocd-redis-f4cdbff57-wcpxh 1/1 Running 0 7h59m
pod/argocd-repo-server-d5c7f7ffb-c8962 1/1 Running 0 7h59m
pod/argocd-server-76497676b-pnvf4 1/1 Running 0 7h59m
-
For the Argo CD UI, set the argocd-server service
type to LoadBalancer
.
kubectl patch svc argocd-server -n argocd -p '{"spec": {"type": "LoadBalancer"}}'
Expected output:
service/argocd-server patched
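If your cluster does not provide LoadBalancer services, a commonly used alternative is to port-forward the Argo CD API server instead and open https://localhost:8080 in your browser:
kubectl port-forward svc/argocd-server -n argocd 8080:443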
-
Patch the App of Apps health check in Argo CD configuration to ignore diffs of controller/operator managed fields. For details about this patch, see the Argo CD documentation sections Resource Health and Diffing Customization.
Apply the new Argo CD health check configurations:
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
name: argocd-cm
namespace: argocd
labels:
app.kubernetes.io/name: argocd-cm
app.kubernetes.io/part-of: argocd
data:
# App of app health check
resource.customizations.health.argoproj.io_Application: |
hs = {}
hs.status = "Progressing"
hs.message = ""
if obj.status ~= nil then
if obj.status.health ~= nil then
hs.status = obj.status.health.status
if obj.status.health.message ~= nil then
hs.message = obj.status.health.message
end
end
end
return hs
# Ignoring RBAC changes made by AggregateRoles
resource.compareoptions: |
# disables status field diffing in specified resource types
ignoreAggregatedRoles: true
# disables status field diffing in specified resource types
# 'crd' - CustomResourceDefinition-s (default)
# 'all' - all resources
# 'none' - disabled
ignoreResourceStatusField: all
EOF
Expected output:
configmap/argocd-cm configured
-
Get the initial password for the admin
user.
kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo
The output is the randomly generated initial password of the admin user.
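Note: on newer Argo CD versions (v2.4 and later), the argocd CLI can print the same initial password; this assumes you have already installed the CLI as described below.
argocd admin initial-password -n argocd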
-
Check the external-ip-or-hostname
address of the argocd-server
service.
kubectl get service -n argocd argocd-server
The output should be similar to:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
argocd-server LoadBalancer 10.108.14.130 external-ip-or-hostname 80:31306/TCP,443:30063/TCP 7d13h
-
Open the https://external-ip-or-hostname
URL and log in to the Argo CD server using the password received in the previous step.
# Exactly one of hostname or IP will be available and used for the remote URL.
open https://$(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}')
Install Argo CD CLI
-
Install Argo CD CLI on your computer. For details, see the Argo CD documentation.
-
Log in with the CLI:
# Exactly one of hostname or IP will be available and used for the remote URL.
argocd login $(kubectl get service -n argocd argocd-server -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}') --insecure --username admin --password $(kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d)
Expected output:
'admin:login' logged in successfully
For more details about Argo CD installation, see the Argo CD getting started guide.
Register clusters
-
Register the clusters that will run Service Mesh Manager in Argo CD. In this example, register workload-cluster-1
and workload-cluster-2
using one of the following methods.
-
Register the cluster from the command line by running:
argocd cluster add --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `workload-cluster-1` with full cluster level privileges. Do you want to continue [y/N]? y
INFO[0005] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0005] ClusterRole "argocd-manager-role" created
INFO[0005] ClusterRoleBinding "argocd-manager-role-binding" created
INFO[0011] Created bearer token secret for ServiceAccount "argocd-manager"
Cluster 'https://workload-cluster-1-ip-or-hostname' added
argocd cluster add --kubeconfig "${WORKLOAD_CLUSTER_2_KUBECONFIG}" "${WORKLOAD_CLUSTER_2_CONTEXT}"
Expected output:
WARNING: This will create a service account `argocd-manager` on the cluster referenced by context `workload-cluster-2` with full cluster level privileges. Do you want to continue [y/N]? y
INFO[0005] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0005] ClusterRole "argocd-manager-role" created
INFO[0005] ClusterRoleBinding "argocd-manager-role-binding" created
INFO[0011] Created bearer token secret for ServiceAccount "argocd-manager"
Cluster 'https://workload-cluster-2-ip-or-hostname' added
-
Alternatively, you can register clusters declaratively as Kubernetes secrets. Modify the following command for your environment and apply it. For details, see the Argo CD documentation.
WORKLOAD_CLUSTER_1_IP="https://workload-cluster-1-IP" ARGOCD_BEARER_TOKEN="authentication-token" ARGOCD_CA_B64="base64 encoded certificate" ; kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
name: workload-cluster-1-secret
labels:
argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
name: workload-cluster-1
server: "${WORKLOAD_CLUSTER_1_IP}"
config: |
{
"bearerToken": "${ARGOCD_BEARER_TOKEN}",
"tlsClientConfig": {
"insecure": false,
"caData": "${ARGOCD_CA_B64}"
}
}
EOF
WORKLOAD_CLUSTER_2_IP="https://workload-cluster-2-IP" ARGOCD_BEARER_TOKEN="authentication-token" ARGOCD_CA_B64="base64 encoded certificate" ; kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
name: workload-cluster-2-secret
labels:
argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
name: workload-cluster-2
server: "${WORKLOAD_CLUSTER_2_IP}"
config: |
{
"bearerToken": "${ARGOCD_BEARER_TOKEN}",
"tlsClientConfig": {
"insecure": false,
"caData": "${ARGOCD_CA_B64}"
}
}
EOF
-
Make sure that the cluster is registered in Argo CD by running the following command:
argocd cluster list
The output should be similar to:
SERVER NAME VERSION STATUS MESSAGE PROJECT
https://kubernetes.default.svc in-cluster Unknown Cluster has no applications and is not being monitored.
https://workload-cluster-1-ip-or-hostname workload-cluster-1 Unknown Cluster has no applications and is not being monitored.
https://workload-cluster-2-ip-or-hostname workload-cluster-2 Unknown Cluster has no applications and is not being monitored.
Prepare Git repository
-
Create an empty repository called calisti-gitops
on GitHub (or another provider that Argo CD supports) and initialize it with a README.md file so that you can clone the repository. Because Service Mesh Manager credentials will be stored in this repository, make it a private repository.
GITHUB_ID="github-id"
GITHUB_REPOSITORY_NAME="calisti-gitops"
-
Obtain a personal access token to the repository (on GitHub, see Creating a personal access token), that has the following permissions:
- admin:org_hook
- admin:repo_hook
- read:org
- read:public_key
- repo
-
Log in with your personal access token with git
.
export GH_TOKEN="github-personal-access-token" # Note: this environment variable needs to be exported so the `git` binary is going to use it automatically for authentication.
-
Clone the repository into your local workspace, for example:
git clone "https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git"
Expected output:
Cloning into 'calisti-gitops'...
remote: Enumerating objects: 144, done.
remote: Counting objects: 100% (144/144), done.
remote: Compressing objects: 100% (93/93), done.
remote: Total 144 (delta 53), reused 135 (delta 47), pack-reused 0
Receiving objects: 100% (144/144), 320.08 KiB | 746.00 KiB/s, done.
Resolving deltas: 100% (53/53), done.
-
Add the repository to Argo CD by running the following command. Alternatively, you can add it on Argo CD Web UI.
argocd repo add "https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git" --name "${GITHUB_REPOSITORY_NAME}" --username "${GITHUB_ID}" --password "${GH_TOKEN}"
Expected output:
Repository 'https://github.com/github-id/calisti-gitops.git' added
-
Verify that the repository is connected by running:
argocd repo list
In the output, Status should be Successful:
TYPE NAME REPO INSECURE OCI LFS CREDS STATUS MESSAGE PROJECT
git calisti-gitops https://github.com/github-id/calisti-gitops.git false false false true Successful
-
Change into the directory of the cloned repository (for example, calisti-gitops
) and create the following directories.
cd "${GITHUB_REPOSITORY_NAME}"
mkdir -p apps/smm-controlplane apps/smm-operator apps/demo-app charts manifests/smm-controlplane/base manifests/smm-controlplane/overlays/workload-cluster-1 manifests/smm-controlplane/overlays/workload-cluster-2 manifests/demo-app/base manifests/demo-app/overlays/workload-cluster-1 manifests/demo-app/overlays/workload-cluster-2
The final structure of the repository will look like this:
.
├── README.md
├── apps
│ ├── smm-controlplane
│ │ └── app-set.yaml
│ ├── smm-operator
│ │ └── app-set.yaml
│ └── demo-app
│ └── app-set.yaml
├── charts
│ └── smm-operator
│ ├── Chart.yaml
│ └── ...
├── export-secrets.sh
└── manifests
├── smm-controlplane
│ ├── base
│ │ ├── control-plane.yaml
│ │ ├── cert-manager-namespace.yaml
│ │ ├── istio-system-namespace.yaml
│ │ ├── istio-cp-v115x.yaml
│ │ └── kustomization.yaml
│ └── overlays
│ ├── workload-cluster-1
│ │ ├── control-plane.yaml
│ │ ├── istio-cp-v115x.yaml
│ │ └── kustomization.yaml
│ └── workload-cluster-2
│ ├── control-plane.yaml
│ ├── istio-cp-v115x.yaml
│ └── kustomization.yaml
└── demo-app
├── base
│ ├── demo-app-namespace.yaml
│ ├── demo-app.yaml
│ └── kustomization.yaml
└── overlays
├── workload-cluster-1
│ ├── demo-app.yaml
│ └── kustomization.yaml
└── workload-cluster-2
├── demo-app.yaml
└── kustomization.yaml
- The
apps
folder contains the Argo CD Application of the smm-operator
, the smm-controlplane
, and the demo-app
.
- The
charts
folder contains the Helm chart of the smm-operator
.
- The
manifests/demo-app
folder contains the manifest files of the demo application that represents your business application.
- The
manifests/smm-controlplane
folder contains the manifest files of the SMM ControlPlane.
Prepare the helm charts
-
You need an active Service Mesh Manager registration to download the Service Mesh Manager charts and images. You can sign up for free, or obtain Enterprise credentials on the official Cisco Service Mesh Manager page. After registration, you can obtain your username and password from the Download Center. Set them as environment variables.
CALISTI_USERNAME="<your-calisti-username>"
CALISTI_PASSWORD="<your-calisti-password>"
-
Download the smm-operator
chart from registry.eticloud.io
into the charts
directory of your Service Mesh Manager GitOps repository and extract it. Run the following commands:
export HELM_EXPERIMENTAL_OCI=1 # Needed prior to Helm version 3.8.0
echo "${CALISTI_PASSWORD}" | helm registry login registry.eticloud.io -u "${CALISTI_USERNAME}" --password-stdin
Expected output:
Login Succeeded
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.11.0
Expected output:
Pulled: registry.eticloud.io/smm-charts/smm-operator:1.11.0
Digest: sha256:someshadigest
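Optionally, verify that the chart was downloaded and extracted correctly by printing its metadata:
helm show chart ./charts/smm-operator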
Deploy Service Mesh Manager
Deploy the smm-operator application set
Complete the following steps to deploy the smm-operator chart using Argo CD.
-
Create the smm-operator
’s Argo CD ApplicationSet
CR. Argo CD ApplicationSet
is perfect for deploying the same application on to different clusters. You can use list generators with cluster data in the ApplicationSet
.
Before running the following command, edit it if needed:
- If you are not using a GitHub repository, set the
repoURL
field to your repository.
- Set the value of the
PUBLIC_API_SERVER_ENDPOINT_ADDRESS
variable to the public API endpoint of your cluster. Some managed Kubernetes solutions of public cloud providers have different API Server endpoints for internal and public access. (One way to look up this address is shown after the command below.)
PUBLIC_API_SERVER_ENDPOINT_ADDRESS_1="" PUBLIC_API_SERVER_ENDPOINT_ADDRESS_2="" ;
cat > "apps/smm-operator/app-set.yaml" <<EOF
# apps/smm-operator/app-set.yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: smm-operator-appset
namespace: argocd
spec:
generators:
- list:
elements:
- cluster: "${WORKLOAD_CLUSTER_1_CONTEXT}"
apiServerEndpointAddress: "${PUBLIC_API_SERVER_ENDPOINT_ADDRESS_1}"
- cluster: "${WORKLOAD_CLUSTER_2_CONTEXT}"
apiServerEndpointAddress: "${PUBLIC_API_SERVER_ENDPOINT_ADDRESS_2}"
template:
metadata:
name: 'smm-operator-{{cluster}}'
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
targetRevision: HEAD
path: charts/smm-operator
helm:
parameters:
- name: "global.ecr.enabled"
value: 'false'
- name: "global.basicAuth.username"
value: "${CALISTI_USERNAME}"
- name: "global.basicAuth.password"
value: "${CALISTI_PASSWORD}"
- name: "apiServerEndpointAddress"
value: '{{apiServerEndpointAddress}}'
destination:
namespace: smm-registry-access
name: '{{cluster}}'
ignoreDifferences:
- kind: ValidatingWebhookConfiguration
group: admissionregistration.k8s.io
jsonPointers:
- /webhooks
syncPolicy:
automated:
prune: true
selfHeal: true
retry:
limit: 5
backoff:
duration: 5s
maxDuration: 3m0s
factor: 2
syncOptions:
- Validate=false
- PruneLast=true
- CreateNamespace=true
- Replace=true
EOF
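If you are unsure about the public API server address, one possible way to look it up is to read it from the kubeconfig (this assumes the address stored in your kubeconfig is the public endpoint; on some managed providers the internal and public endpoints differ):
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}{"\n"}' --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"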
-
Commit and push the calisti-gitops
repository.
git add apps/smm-operator charts/smm-operator
git commit -m "add smm-operator app"
-
Apply the Application manifests.
kubectl apply -f "apps/smm-operator/app-set.yaml"
-
Verify that the applications have been added to Argo CD and are healthy.
argocd app list
Expected output:
NAME CLUSTER NAMESPACE PROJECT STATUS HEALTH SYNCPOLICY CONDITIONS REPO PATH TARGET
argocd/smm-operator-workload-cluster-1 workload-cluster-1 smm-registry-access default Synced Healthy Auto-Prune <none> https://github.com/<github-user>/calisti-gitops-multi-cluster.git charts/smm-operator HEAD
argocd/smm-operator-workload-cluster-2 workload-cluster-2 smm-registry-access default Synced Healthy Auto-Prune <none> https://github.com/<github-user>/calisti-gitops-multi-cluster.git charts/smm-operator HEAD
-
Check the smm-operator
application on the Argo CD Web UI.
Deploy the smm-controlplane application
The following steps show you how to deploy the smm-controlplane
application as an active-passive deployment. To create an active-active deployment, follow the same steps; an optional step below changes the active-passive deployment to active-active. For details, see Deployment models.
Deploy the smm-controlplane application using Kustomize: the active control plane on workload-cluster-1 and the passive one on workload-cluster-2. The active cluster receives every component, while the passive cluster receives only a few required components. This part of the repository will look like this:
└── manifests
├── smm-controlplane
│ ├── base
│ │ ├── control-plane.yaml
│ │ ├── cert-manager-namespace.yaml
│ │ ├── istio-system-namespace.yaml
│ │ ├── istio-cp-v115x.yaml
│ │ └── kustomization.yaml
│ └── overlays
│ ├── workload-cluster-1
│ │ ├── control-plane.yaml
│ │ ├── istio-cp-v115x.yaml
│ │ └── kustomization.yaml
│ └── workload-cluster-2
│ ├── control-plane.yaml
│ ├── istio-cp-v115x.yaml
│ └── kustomization.yaml
-
Create the following namespaces files.
cat > manifests/smm-controlplane/base/cert-manager-namespace.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
annotations:
argocd.argoproj.io/sync-wave: "1"
name: cert-manager
EOF
cat > manifests/smm-controlplane/base/istio-system-namespace.yaml << EOF
apiVersion: v1
kind: Namespace
metadata:
annotations:
argocd.argoproj.io/sync-wave: "2"
name: istio-system
EOF
-
Create the IstioControlPlane file.
cat > manifests/smm-controlplane/base/istio-cp-v115x.yaml << EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
containerImageConfiguration:
imagePullPolicy: Always
imagePullSecrets:
- name: smm-pull-secret
distribution: cisco
istiod:
deployment:
env:
- name: ISTIO_MULTIROOT_MESH
value: "true"
image: registry.eticloud.io/smm/istio-pilot:v1.15.3-bzc.0
k8sResourceOverlays:
- groupVersionKind:
group: apps
kind: Deployment
version: v1
objectKey:
name: istiod-cp-v115x
namespace: istio-system
patches:
- path: /spec/template/spec/containers/0/args/-
type: replace
value: --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256
meshConfig:
defaultConfig:
envoyAccessLogService:
address: smm-als.smm-system.svc.cluster.local:50600
tcpKeepalive:
interval: 10s
probes: 3
time: 10s
tlsSettings:
mode: ISTIO_MUTUAL
holdApplicationUntilProxyStarts: true
proxyMetadata:
ISTIO_META_ALS_ENABLED: "true"
PROXY_CONFIG_XDS_AGENT: "true"
tracing:
tlsSettings:
mode: ISTIO_MUTUAL
zipkin:
address: smm-zipkin.smm-system.svc.cluster.local:59411
enableEnvoyAccessLogService: true
enableTracing: true
meshExpansion:
enabled: true
gateway:
deployment:
podMetadata:
labels:
app: istio-meshexpansion-gateway
istio: meshexpansiongateway
service:
ports:
- name: tcp-smm-als-tls
port: 50600
protocol: TCP
targetPort: 50600
- name: tcp-smm-zipkin-tls
port: 59411
protocol: TCP
targetPort: 59411
meshID: mesh1
mode: ACTIVE
networkName: network1
proxy:
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
proxyInit:
cni:
daemonset:
image: registry.eticloud.io/smm/istio-install-cni:v1.15.3-bzc.0
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
sidecarInjector:
deployment:
image: registry.eticloud.io/smm/istio-sidecar-injector:v1.15.3-bzc.0
version: 1.15.3
EOF
-
Create the kustomization.yaml
file.
cat > manifests/smm-controlplane/base/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
metadata:
name: cluster-secrets
resources:
- cert-manager-namespace.yaml
- istio-system-namespace.yaml
- istio-cp-v115x.yaml
- control-plane.yaml
EOF
-
Create the manifests/smm-controlplane/base/control-plane.yaml
file. You don’t need to set the CLUSTER-NAME
here, you will set it with the overlays
customization.
cat > manifests/smm-controlplane/base/control-plane.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "10"
name: smm
spec:
certManager:
namespace: cert-manager
clusterName: CLUSTER-NAME
clusterRegistry:
enabled: true
namespace: cluster-registry
log: {}
meshManager:
enabled: true
istio:
enabled: true
istioCRRef:
name: cp-v115x
namespace: istio-system
operators:
namespace: smm-system
namespace: smm-system
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
oneEye: {}
registryAccess:
enabled: true
imagePullSecretsController: {}
namespace: smm-registry-access
pullSecrets:
- name: smm-registry.eticloud.io-pull-secret
namespace: smm-registry-access
repositoryOverride:
host: registry.eticloud.io
prefix: smm
role: active
smm:
als:
enabled: true
log: {}
application:
enabled: true
log: {}
auth:
mode: impersonation
certManager:
enabled: true
enabled: true
federationGateway:
enabled: true
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
leo:
enabled: true
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
prometheusOperator: {}
releaseName: smm
role: active
sdm:
enabled: false
sre:
enabled: true
useIstioResources: true
EOF
-
Create the kustomization.yaml
file for workload-cluster-1
.
cat > manifests/smm-controlplane/overlays/workload-cluster-1/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patches:
- istio-cp-v115x.yaml
- control-plane.yaml
EOF
-
Set the clusterName
by overriding some settings coming from the base
configuration. Create the following files.
cat > manifests/smm-controlplane/overlays/workload-cluster-1/control-plane.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
clusterName: workload-cluster-1
certManager:
enabled: true
smm:
exposeDashboard:
meshGateway:
enabled: true
auth:
forceUnsecureCookies: true
mode: anonymous
EOF
cat > manifests/smm-controlplane/overlays/workload-cluster-1/istio-cp-v115x.yaml <<EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
meshID: mesh1
mode: ACTIVE
networkName: network1
EOF
-
Create the kustomization.yaml
file for workload-cluster-2
.
cat > manifests/smm-controlplane/overlays/workload-cluster-2/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patches:
- istio-cp-v115x.yaml
- control-plane.yaml
EOF
-
Create the following files for workload-cluster-2
. This sets the clusterName
, and also overrides some settings of the base
configuration.
cat > manifests/smm-controlplane/overlays/workload-cluster-2/control-plane.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
clusterName: workload-cluster-2
role: passive
smm:
als:
enabled: true
log: {}
application:
enabled: false
log: {}
auth:
mode: impersonation
certManager:
enabled: false
enabled: true
federationGateway:
enabled: false
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
grafana:
enabled: false
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
kubestatemetrics:
enabled: true
leo:
enabled: false
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
retentionTime: 8h
prometheusOperator: {}
releaseName: smm
role: passive
sdm:
enabled: false
sre:
enabled: false
tracing:
enabled: true
useIstioResources: false
web:
enabled: false
EOF
cat > manifests/smm-controlplane/overlays/workload-cluster-2/istio-cp-v115x.yaml <<EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
meshID: mesh1
mode: PASSIVE
networkName: workload-cluster-2
EOF
-
(Optional) If you want to change your active-passive deployment to active-active, complete this step. Otherwise, continue with the next step (creating the ApplicationSet CR).
-
Run the following commands to modify the control planes of workload-cluster-2
.
cat > manifests/smm-controlplane/overlays/workload-cluster-2/control-plane.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
clusterName: workload-cluster-2
role: active
EOF
cat > manifests/smm-controlplane/overlays/workload-cluster-2/istio-cp-v115x.yaml <<EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
meshID: mesh1
mode: ACTIVE
networkName: network1
EOF
-
Create the smm-controlplane
’s Argo CD ApplicationSet
CR.
cat > apps/smm-controlplane/app-set.yaml <<EOF
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: smm-cp-appset
namespace: argocd
spec:
generators:
- list:
elements:
- cluster: "${WORKLOAD_CLUSTER_1_CONTEXT}"
path: "${WORKLOAD_CLUSTER_1_CONTEXT}"
- cluster: "${WORKLOAD_CLUSTER_2_CONTEXT}"
path: "${WORKLOAD_CLUSTER_2_CONTEXT}"
template:
metadata:
name: 'smm-cp-{{cluster}}'
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
targetRevision: HEAD
path: manifests/smm-controlplane/overlays/{{path}}
destination:
name: '{{cluster}}'
syncPolicy:
automated:
prune: true
selfHeal: true
retry:
limit: 5
backoff:
duration: 5s
maxDuration: 3m0s
factor: 2
syncOptions:
- Validate=false
- PruneLast=true
- CreateNamespace=true
- Replace=true
EOF
-
Commit and push the calisti-gitops
repository.
git add apps/smm-controlplane manifests
git commit -m "add smm-controlplane app"
-
Apply the Application manifests.
kubectl apply -f "apps/smm-controlplane/app-set.yaml"
-
Verify that the applications have been added to Argo CD and are healthy.
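For example, you can list them with the Argo CD CLI and check the STATUS and HEALTH columns (assuming you are logged in with the argocd CLI):
argocd app list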
-
To create trust between workload-cluster-1
and workload-cluster-2
, you must exchange the Secret CRs of the clusters. The cluster registry controller helps to form a group of Kubernetes clusters and synchronize any resources across those clusters arbitrarily.
Create and run the following bash script.
cat > export-secrets.sh <<EOF
set -e
kubectl --context workload-cluster-1 get cluster workload-cluster-1 -o yaml | kubectl --context workload-cluster-2 apply -f -
kubectl --context workload-cluster-1 -n cluster-registry get secrets workload-cluster-1 -o yaml | kubectl --context workload-cluster-2 apply -f -
kubectl --context workload-cluster-2 get cluster workload-cluster-2 -o yaml | kubectl --context workload-cluster-1 apply -f -
kubectl --context workload-cluster-2 -n cluster-registry get secrets workload-cluster-2 -o yaml | kubectl --context workload-cluster-1 apply -f -
echo "Exporting cluster and secrets CRs successfully."
EOF
chmod +x export-secrets.sh
./export-secrets.sh
Expected output:
cluster.clusterregistry.k8s.cisco.com/workload-cluster-1 created
secret/workload-cluster-1 created
cluster.clusterregistry.k8s.cisco.com/workload-cluster-2 created
secret/workload-cluster-2 created
Exporting cluster and secrets CRs successfully.
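To double-check that the exchange worked, you can list the Cluster resources on both sides; each cluster should now show both workload-cluster-1 and workload-cluster-2 (this assumes the same kubeconfig contexts that the script above uses):
kubectl --context workload-cluster-1 get clusters.clusterregistry.k8s.cisco.com
kubectl --context workload-cluster-2 get clusters.clusterregistry.k8s.cisco.com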
-
Check that all pods are healthy and running in the smm-system
namespace on workload-cluster-1
and workload-cluster-2
. Note that it takes some time while the ControlPlane operator reconciles the resources.
For workload-cluster-1
:
kubectl get pods -n smm-system --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME READY STATUS RESTARTS AGE
istio-operator-v115x-7d77fc549f-fxmtd 2/2 Running 0 8m15s
mesh-manager-0 2/2 Running 0 19m
prometheus-node-exporter-9xnmj 1/1 Running 0 17m
prometheus-node-exporter-bf7g5 1/1 Running 0 17m
prometheus-node-exporter-cl69q 1/1 Running 0 17m
prometheus-smm-prometheus-0 4/4 Running 0 18m
smm-7f4d5d4fff-4dlcp 2/2 Running 0 18m
smm-7f4d5d4fff-59k7g 2/2 Running 0 18m
smm-als-7cc4bfb998-wjsr6 2/2 Running 0 18m
smm-authentication-569484f748-fj5zk 2/2 Running 0 18m
smm-federation-gateway-6964fb956f-pb5pv 2/2 Running 0 18m
smm-federation-gateway-operator-6664774695-9tmzj 2/2 Running 0 18m
smm-grafana-59c54f67f4-9snc5 3/3 Running 0 18m
smm-health-75bf4f49c5-z9tqg 2/2 Running 0 18m
smm-health-api-7767d4f46-744wn 2/2 Running 0 18m
smm-ingressgateway-6ffdfc6d79-jttjz 1/1 Running 0 11m
smm-ingressgateway-external-8c9bb9445-kjt8h 1/1 Running 0 11m
smm-kubestatemetrics-86c6f96789-lp576 2/2 Running 0 18m
smm-leo-67cd7d49b5-gmcvf 2/2 Running 0 18m
smm-prometheus-operator-ffbfb8b67-fwj6g 3/3 Running 0 18m
smm-sre-alert-exporter-6654968479-fthk6 2/2 Running 0 18m
smm-sre-api-86c9fb7cd7-mq7cm 2/2 Running 0 18m
smm-sre-controller-6889685f9-hxxh5 2/2 Running 0 18m
smm-tracing-5886d59dd-v8nb8 2/2 Running 0 18m
smm-vm-integration-5b89c4f7c9-wz4bt 2/2 Running 0 18m
smm-web-d5b49c7f6-jgz7b 3/3 Running 0 18m
For workload-cluster-2
:
kubectl get pods -n smm-system --kubeconfig "${WORKLOAD_CLUSTER_2_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_2_CONTEXT}"
Expected output:
NAME READY STATUS RESTARTS AGE
istio-operator-v115x-7d77fc549f-s5wnz 2/2 Running 0 9m5s
mesh-manager-0 2/2 Running 0 21m
prometheus-node-exporter-fzdn4 1/1 Running 0 5m18s
prometheus-node-exporter-rkbcl 1/1 Running 0 5m18s
prometheus-node-exporter-x2mwp 1/1 Running 0 5m18s
prometheus-smm-prometheus-0 3/3 Running 0 5m20s
smm-ingressgateway-5db7859d45-6d6ns 1/1 Running 0 12m
smm-kubestatemetrics-86c6f96789-j64q2 2/2 Running 0 19m
smm-prometheus-operator-ffbfb8b67-zwqn2 3/3 Running 1 (11m ago) 19m
-
Check the applications on the Argo CD Web UI.
At this point, you have successfully installed smm-operator
and smm-controlplane
on workload-cluster-1
and workload-cluster-2
. You can open the Service Mesh Manager dashboard to check them, or deploy an application.
Deploy an application
If you want to deploy an application into the service mesh, complete the following steps. The examples use the Service Mesh Manager demo application.
The file structure for the demo application looks like this:
.
├── README.md
├── apps
│ ├── demo-app
│ │ └── app-set.yaml
│ └── ...
...
└── manifests
└── demo-app
├── base
│ ├── demo-app-namespace.yaml
│ ├── demo-app.yaml
│ └── kustomization.yaml
└── overlays
├── workload-cluster-1
│ ├── demo-app.yaml
│ └── kustomization.yaml
└── workload-cluster-2
├── demo-app.yaml
└── kustomization.yaml
...
-
Create the application manifest files.
cat > manifests/demo-app/base/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
metadata:
name: demo-app
resources:
- demo-app-namespace.yaml
- demo-app.yaml
EOF
cat > manifests/demo-app/base/demo-app-namespace.yaml <<EOF
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: smm-demo
app.kubernetes.io/name: smm-demo
app.kubernetes.io/part-of: smm-demo
app.kubernetes.io/version: 0.1.4
istio.io/rev: cp-v115x.istio-system
name: smm-demo
EOF
cat > manifests/demo-app/base/demo-app.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
EOF
cat > manifests/demo-app/overlays/workload-cluster-1/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patches:
- demo-app.yaml
EOF
cat > manifests/demo-app/overlays/workload-cluster-1/demo-app.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
deployIstioResources: true
deploySLOResources: true
enabled: true
enabledComponents:
- frontpage
- catalog
- bookings
- postgresql
istio:
revision: cp-v115x.istio-system
load:
enabled: true
maxRPS: 30
minRPS: 10
swingPeriod: 1380000000000
replicas: 1
resources:
limits:
cpu: "2"
memory: 192Mi
requests:
cpu: 40m
memory: 64Mi
EOF
cat > manifests/demo-app/overlays/workload-cluster-2/kustomization.yaml <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patches:
- demo-app.yaml
EOF
cat > manifests/demo-app/overlays/workload-cluster-2/demo-app.yaml <<EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
deployIstioResources: false
deploySLOResources: false
enabled: true
enabledComponents:
- movies
- payments
- notifications
- analytics
- database
- mysql
istio:
revision: cp-v115x.istio-system
replicas: 1
resources:
limits:
cpu: "2"
memory: 192Mi
requests:
cpu: 40m
memory: 64Mi
EOF
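Before committing, you can optionally render an overlay locally to verify that the kustomization patches apply cleanly (kubectl has kustomize built in):
kubectl kustomize manifests/demo-app/overlays/workload-cluster-1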
-
Create the Demo application ApplicationSet
.
cat > apps/demo-app/app-set.yaml <<EOF
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
name: demo-app-appset
namespace: argocd
spec:
generators:
- list:
elements:
- cluster: "${WORKLOAD_CLUSTER_1_CONTEXT}"
path: "${WORKLOAD_CLUSTER_1_CONTEXT}"
- cluster: "${WORKLOAD_CLUSTER_2_CONTEXT}"
path: "${WORKLOAD_CLUSTER_2_CONTEXT}"
template:
metadata:
name: 'demo-app-{{cluster}}'
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/${GITHUB_ID}/${GITHUB_REPOSITORY_NAME}.git
targetRevision: HEAD
path: manifests/demo-app/overlays/{{path}}
destination:
name: '{{cluster}}'
syncPolicy:
automated:
prune: true
selfHeal: true
retry:
limit: 5
backoff:
duration: 5s
maxDuration: 3m0s
factor: 2
syncOptions:
- Validate=false
- PruneLast=true
- CreateNamespace=true
- Replace=true
EOF
-
Commit and push the calisti-gitops
repository.
git add apps/demo-app manifests
git commit -m "add demo application"
git push origin
-
Deploy the demo application on the clusters.
kubectl apply -f apps/demo-app/app-set.yaml
-
Wait until all the pods in the smm-demo
namespace are up and running.
kubectl get pods -n smm-demo --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME READY STATUS RESTARTS AGE
bombardier-5f59948978-zx99c 2/2 Running 0 3m21s
bookings-v1-68dd865855-fdcxk 2/2 Running 0 3m21s
catalog-v1-6d564bbcb8-qmhbx 2/2 Running 0 3m21s
frontpage-v1-b4686759b-fhfmv 2/2 Running 0 3m21s
postgresql-7cf55cd596-grs46 2/2 Running 0 3m21s
kubectl get pods -n smm-demo --kubeconfig "${WORKLOAD_CLUSTER_2_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_2_CONTEXT}"
Expected output:
NAME READY STATUS RESTARTS AGE
analytics-v1-799d668f84-p4nkk 2/2 Running 0 3m58s
database-v1-6896cd4b59-9xxgg 2/2 Running 0 3m58s
movies-v1-9594fff5f-8hv9l 2/2 Running 0 3m58s
movies-v2-5559c5567c-2279n 2/2 Running 0 3m58s
movies-v3-649b99d977-nkdxc 2/2 Running 0 3m58s
mysql-669466cc8d-bs4s9 2/2 Running 0 3m58s
notifications-v1-79bc79c89b-4bbss 2/2 Running 0 3m58s
payments-v1-547884bfdf-dg2dm 2/2 Running 0 3m58s
-
Check the applications on the Argo CD web UI.
-
Open the Service Mesh Manager web interface, select MENU > TOPOLOGY, then select the smm-demo
namespace.
Access the Service Mesh Manager dashboard
-
You can access the Service Mesh Manager dashboard via the smm-ingressgateway-external
LoadBalancer external-ip-or-hostname
address. Run the following command to retrieve the IP address:
kubectl get services -n smm-system smm-ingressgateway-external --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
smm-ingressgateway-external LoadBalancer 10.0.0.199 external-ip-or-hostname 80:32505/TCP 2m28s
-
Open the Service Mesh Manager dashboard using one of the following methods:
-
Open the http://<external-ip-or-hostname>
URL in your browser.
-
Run the following command to open the dashboard with your default browser:
# Exactly one of hostname or IP will be available and used for the remote URL.
open http://$(kubectl get services -n smm-system smm-ingressgateway-external -o jsonpath='{.status.loadBalancer.ingress[0].hostname}{.status.loadBalancer.ingress[0].ip}' --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}")
-
If you have installed the Service Mesh Manager CLI on your machine, run the following command to open the Service Mesh Manager Dashboard in the default browser.
smm dashboard --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
✓ validate-kubeconfig ❯ checking cluster reachability...
✓ opening Service Mesh Manager at http://127.0.0.1:50500
-
Check the deployments on the dashboard, for example, on the MENU > Overview, MENU > MESH, and MENU > TOPOLOGY pages.
2.3.8 - Install FIPS images
To install the FIPS-compliant build of Service Mesh Manager, complete the following steps.
-
Download the following YAML file and save it (for example, as istio-fips.yaml, the file name used in the installation command below). It contains the list of FIPS-compliant images the installer should use.
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
name: cp-v115x
spec:
version: 1.15.3
mode: ACTIVE
istiod:
deployment:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-pilot:v1.15.3-bzc.0-fips
proxy:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-proxyv2:v1.15.3-bzc.0-fips
proxyInit:
cni:
daemonset:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-install-cni:v1.15.3-bzc.0-fips
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-proxyv2:v1.15.3-bzc.0-fips
sidecarInjector:
deployment:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-sidecar-injector:v1.15.3-bzc.0-fips
-
Follow any of the regular installation guides (for example, Create single cluster mesh or Create multi-cluster mesh), but use the following customized YAML file with the initial installation command to use the FIPS-compliant versions of the images. For example, for a non-interactive single-cluster installation, run:
smm install -a --cluster-name <name-of-your-cluster> --istio-cr-file istio-fips.yaml
2.3.9 - Customize installation
The installation of Service Mesh Manager can be customized through its CRs.
This page covers the most frequently used configuration options for Service Mesh Manager.
Service Mesh Manager images
The ControlPlane
CR can be configured to set the following container images:
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
smm:
als:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-als
tag: v1.11.0
log: {}
application:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm
tag: v1.11.0
log: {}
auth:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-authentication
tag: v1.11.0
mode: impersonation
certManager:
enabled: true
enabled: true
federationGateway:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-federation-gateway
tag: v1.11.0
enabled: true
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-federation-gateway-operator
tag: v1.11.0
grafana:
enabled: true
image:
repository: grafana/grafana
tag: 7.5.11
sidecar:
image:
repository: ghcr.io/banzaicloud/k8s-sidecar
tag: v1.11.3-bzc
health:
enabled: true
api:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-health-api
tag: v1.11.0
controller:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-health
tag: v1.11.0
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
kubestatemetrics:
enabled: true
image:
repository: k8s.gcr.io/kube-state-metrics/kube-state-metrics
tag: v2.6.0
leo:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-leo
tag: v1.11.0
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
image:
repository: prom/prometheus
tag: v2.39.1
configReloader:
image:
repository: quay.io/prometheus-operator/prometheus-config-reloader
tag: v0.60.1
thanos:
image:
repository: quay.io/thanos/thanos
tag: v0.28.1
prometheusOperator:
image:
repository: quay.io/prometheus-operator/prometheus-operator
tag: v0.60.1
k8sproxy:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/k8s-proxy
tag: v0.0.9
releaseName: smm
role: active
sre:
enabled: true
alertExporter:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-sre-alert-exporter
tag: v1.11.0
api:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-sre-api
tag: v1.11.0
controller:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-sre
tag: v1.11.0
useIstioResources: true
tracing:
enabled: true
jaeger:
image:
repository: jaegertracing/all-in-one
tag: "1.28.0"
web:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-web
tag: v1.11.0
downloads:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-cli
tag: v1.11.0-nginx
vmIntegration:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-vm-integration
tag: v1.11.0
certManager:
enabled: true
namespace: cert-manager
manageNamespace: true
image:
repository: quay.io/jetstack/cert-manager-controller
tag: v1.9.1
cainjector:
image:
repository: quay.io/jetstack/cert-manager-cainjector
tag: v1.9.1
webhook:
image:
repository: quay.io/jetstack/cert-manager-webhook
tag: v1.9.1
clusterName: primary
clusterRegistry:
enabled: true
namespace: cluster-registry
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/cluster-registry-controller
tag: v0.2.4
log: {}
meshManager:
enabled: true
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-mesh-manager
tag: v1.11.0
istio:
istioCRRef:
name: cp-v115x
namespace: istio-system
istioCROverrides: |
spec:
istiod:
deployment:
podDisruptionBudget:
minAvailable: 0
sidecarInjector:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-sidecar-injector
tag: v1.15.3-bzc.0
initCNIConfiguration:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-install-cni
tag: v1.15.3-bzc.0
pilot:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-pilot
tag: v1.15.3-bzc.0
proxy:
image:
repository: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-proxyv2
tag: v1.15.3-bzc.0
operators:
namespace: smm-system
instances:
- image:
repository: ghcr.io/banzaicloud/istio-operator
tag: v2.15.3
name: v115x
version: 1.15.3
- image:
repository: ghcr.io/banzaicloud/istio-operator
tag: v2.13.5
name: v113x
version: 1.13.5
namespace: smm-system
prometheusMetrics:
authProxy:
image:
repository: quay.io/brancz/kube-rbac-proxy
tag: v0.11.0
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
image:
repository: quay.io/prometheus/node-exporter
tag: v1.2.2
registryAccess:
enabled: true
imagePullSecretsController:
image:
repository: ghcr.io/banzaicloud/imagepullsecrets
tag: v0.3.5
namespace: smm-registry-access
pullSecrets:
- name: smm--033498657557.dkr.ecr.us-east-2.amazonaws.com-pull-secret-c17fa163
namespace: smm-registry-access
- name: smm--626007623524.dkr.ecr.us-east-2.amazonaws.com-pull-secret-72b452b5
namespace: smm-registry-access
role: active
The IstioOperator
CR can be configured to set the istio-operator container image:
apiVersion: smm.cisco.com/v1alpha1
kind: IstioOperator
metadata:
name: v115x
spec:
enabled: true
image:
repository: ghcr.io/banzaicloud/istio-operator
tag: v2.15.3
version: 1.15.3
If you installed Service Mesh Manager in operator mode, the changes in these CRs should be reflected automatically on your cluster.
If you don’t have the Service Mesh Manager operator installed, run the following command so that the changes take effect:
Istio images
The IstioControlPlane
CR can be configured to set the following Istio container images:
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
name: cp-v115x
spec:
version: "1.15.3"
mode: ACTIVE
istiod:
deployment:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-pilot:v1.15.3-bzc.0
sidecarInjector:
deployment:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-sidecar-injector:v1.15.3-bzc.0
proxy:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-proxyv2:v1.15.3-bzc.0
proxyInit:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-proxyv2:v1.15.3-bzc.0
cni:
daemonset:
image: 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-install-cni:v1.15.3-bzc.0
These changes should be automatically reflected on your cluster after editing the CR.
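For example, you could edit the IstioControlPlane CR in place (this assumes the CRD is registered under the servicemesh.cisco.com group as shown above):
kubectl edit istiocontrolplanes.servicemesh.cisco.com cp-v115x -n istio-system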
List of configurable images
Based on the CRs above, you can configure the following components in Service Mesh Manager:
| Images | Repository | Tag |
|---|---|---|
| smm-als | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-als | v1.11.0 |
| smm | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm | v1.11.0 |
| smm-auth | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-authentication | v1.11.0 |
| smm-grafana | grafana/grafana | 7.5.11 |
| smm-federation-gateway | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-federation-gateway | v1.11.0 |
| smm-federation-gateway-operator | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-federation-gateway-operator | v1.11.0 |
| smm-health | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-health | v1.11.0 |
| smm-health-api | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-health-api | v1.11.0 |
| smm-kubestatemetrics | k8s.gcr.io/kube-state-metrics/kube-state-metrics | v2.6.0 |
| smm-leo | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-leo | v1.11.0 |
| smm-prometheus | prom/prometheus | v2.39.1 |
| smm-prometheus-config-reloader | quay.io/prometheus-operator/prometheus-config-reloader | v0.60.1 |
| smm-thanos | quay.io/thanos/thanos | v0.28.1 |
| smm-prometheus-operator | quay.io/prometheus-operator/prometheus-operator | v0.60.1 |
| smm-k8s-proxy | 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/k8s-proxy | v0.0.9 |
| smm-sre | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-sre | v1.11.0 |
| smm-sre-api | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-sre-api | v1.11.0 |
| smm-sre-alert-exporter | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-sre-alert-exporter | v1.11.0 |
| smm-tracing | jaegertracing/all-in-one | 1.28.0 |
| smm-web | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-web | v1.11.0 |
| smm-vm-integration | 033498657557.dkr.ecr.us-east-2.amazonaws.com/smm-vm-integration | v1.11.0 |
| kube-rbac-proxy | gcr.io/kubebuilder/kube-rbac-proxy | v0.11.0 |
| cert-manager | quay.io/jetstack/cert-manager-controller | v1.9.1 |
| cert-manager-cainjector | quay.io/jetstack/cert-manager-cainjector | v1.9.1 |
| cert-manager-webhook | quay.io/jetstack/cert-manager-webhook | v1.9.1 |
| cluster-registry | 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/cluster-registry-controller | v0.2.4 |
| imagepullsecrets-controller | ghcr.io/banzaicloud/imagepullsecrets | v0.3.5 |
| istio-operator | ghcr.io/banzaicloud/istio-operator | v2.15.3 (v115x) |
| istio-sidecarinjector | 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-sidecar-injector | v1.15.x-bzc.0 |
| istio-pilot | 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-pilot | v1.15.x-bzc.0 |
| istio-proxy | 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-proxyv2 | v1.15.x-bzc.0 |
| istio-init-cni | 033498657557.dkr.ecr.us-east-2.amazonaws.com/banzaicloud/istio-install-cni | v1.15.x-bzc.0 |
(Updated as of November 11, 2022)
If there is a Service Mesh Manager related image that you’d like to change and that image is not listed here, contact us!
Customize IstioControlPlane CR
You can customize the ControlPlane
CR to change the configuration of the IstioControlPlane
CR. Set your custom values under the spec.meshManager.istio.istioCROverrides of the ControlPlane
CR, and Service Mesh Manager merges them to the IstioControlPlane
CR.
For example, to enable basic DNS proxying, you can set the ISTIO_META_DNS_CAPTURE field using a similar configuration:
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
...
meshManager:
istio:
...
istioCROverrides: |
spec:
meshConfig:
defaultConfig:
proxyMetadata:
# Enable basic DNS proxying
ISTIO_META_DNS_CAPTURE: "true"
# Enable automatic address allocation, optional
ISTIO_META_DNS_AUTO_ALLOCATE: "true"
2.4 - Upgrade
Service Mesh Manager (SMM) provides safe upgrades both for the Istio control plane and the Service Mesh Manager dashboard.
Istio follows a rolling support cycle: only the last few versions are supported by the Istio community. The Cisco Istio Distribution included in Service Mesh Manager follows the same model.
Service Mesh Manager follows semantic versioning. To support a new Istio version, a new minor version is created (for example, Istio 1.11 was introduced in Service Mesh Manager 1.8, Istio 1.12 was introduced in Service Mesh Manager 1.9). Always consult the What’s new page to see if a new version of Istio has been introduced.
CAUTION:
Supported upgrade paths
Service Mesh Manager supports upgrades from the prior minor release and patch releases. The current supported upgrade path: v1.10.x
to v1.11.x
Overview of the upgrade procedure
The upgrade procedure consists of two steps.
-
Upgrading the Service Mesh Manager control plane. This is needed regardless of the target Istio version. This step ensures that all Service Mesh Manager components are containing the latest features and security fixes.
This upgrade also upgrades Istio to the latest patch level. For example: if before the upgrade the cluster had Istio 1.11.0, and the target Service Mesh Manager version contains Istio 1.11.2, then this step upgrades Istio to 1.11.2.
For details on performing this step, see Upgrading SMM and SDM.
-
If the new version of Service Mesh Manager contains a new minor or major version of Istio (for example, you have Istio 1.11.2 installed, and the new version contains Istio 1.12), complete the Canary control plane upgrades procedure after upgrading Service Mesh Manager.
Service Mesh Manager avoids big changes to the production traffic by running two versions of the Istio control planes in parallel (for example, 1.11.2 and 1.12.0) on the same cluster. After the upgrade, the existing workloads continue using the older version of Istio (for example, 1.11.2). You can gradually (on a per-namespace basis) move workloads to the new (in the example the 1.12.0) version. This allows operators to start moving services with less business value or risk associated to the new Istio version before moving on to more mission critical services.
2.4.1 - Upgrading SMM and SDM
The procedure to upgrade Service Mesh Manager depends on whether you have installed Service Mesh Manager in imperative mode or in operator mode.
- If you have installed Service Mesh Manager in imperative mode, upgrade it using the CLI.
- If you have installed Service Mesh Manager in operator mode, upgrade the operator.
- If you have installed Service Mesh Manager using our GitOps guide, upgrade the operator chart.
CAUTION:
Supported upgrade paths
Service Mesh Manager supports upgrades from the prior minor release and patch releases. The current supported upgrade path: v1.10.x
to v1.11.x
Before upgrading
If you have cert-manager installed on your Service Mesh Manager cluster, optionally complete the following step.
Before upgrading Service Mesh Manager 1.10 to 1.11, apply the following patch to your Service Mesh Manager v1.10 cluster. It sets the ttlSecondsAfterFinished field of the cert-manager-startupapicheck
job so that the job is cleaned up 100 seconds after it completes. If you skip this step, you might see a “cert-manager-startupapicheck” related error during the upgrade. The error is non-blocking and doesn’t stop the upgrade process. Alternatively, you can apply the patch after you have upgraded the cluster.
kubectl patch jobs.batch -n cert-manager cert-manager-startupapicheck -p '{"spec":{"ttlSecondsAfterFinished":100}}' --type=merge
Using the CLI
If your Service Mesh Manager deployment is managed using the Service Mesh Manager CLI, use the CLI to upgrade to the new version.
For an example of upgrading Service Mesh Manager from 1.10.0 to 1.11.0 in a multi-cluster setup, see Multi-cluster upgrade from 1.10.0 to 1.11.0.
-
Download the Service Mesh Manager command-line tool for version 1.11.0. The archive contains the smm
and supertubes
binaries. Extract these binaries and update your local copy on your machine. For details, see Accessing the Service Mesh Manager binaries.
-
Deploy a new version of Service Mesh Manager.
The following command upgrades the Service Mesh Manager control plane. It also installs the new Istio control plane (version 1.15.x), but the applications keep using the old control plane until you restart your workloads.
In the following examples, smm
refers to version 1.11.0 of the binary.
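A minimal sketch, assuming the upgrade is performed with the same smm install command that was used for the initial deployment, run with the new 1.11.0 binary:
smm install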
-
Check that the Service Mesh Manager control plane is upgraded and already uses the new Istio control plane.
-
If you are upgrading only Service Mesh Manager, run the following command to verify that the installation is complete.
kubectl get pods -n=smm-system -L istio.io/rev
The output should be similar to:
NAME READY STATUS RESTARTS AGE REV
istio-operator-v113x-64bc574fdf-mdtwj 2/2 Running 0 21m
istio-operator-v115x-8558dbb88c-6r6fx 2/2 Running 0 21m
mesh-manager-0 2/2 Running 0 21m
prometheus-node-exporter-76jwv 1/1 Running 0 18m
prometheus-node-exporter-ptbwk 1/1 Running 0 18m
prometheus-node-exporter-w86lc 1/1 Running 0 18m
prometheus-smm-prometheus-0 4/4 Running 0 19m cp-v115x.istio-system
smm-6b5575474d-l88lg 2/2 Running 0 19m cp-v115x.istio-system
smm-6b5575474d-wp727 2/2 Running 0 19m cp-v115x.istio-system
smm-als-6b995458c-z8jt9 2/2 Running 0 19m cp-v115x.istio-system
smm-authentication-78d96d6fc9-hg89p 2/2 Running 0 19m cp-v115x.istio-system
smm-federation-gateway-7c7d9b7fb5-xgv5t 2/2 Running 0 19m cp-v115x.istio-system
smm-federation-gateway-operator-ff8598cb7-xj7pk 2/2 Running 0 19m cp-v115x.istio-system
smm-grafana-7bcf9f5885-jhwpg 3/3 Running 0 19m cp-v115x.istio-system
smm-health-56896f5b9b-r54w8 2/2 Running 0 19m cp-v115x.istio-system
smm-health-api-665d4787-pw7z4 2/2 Running 0 19m cp-v115x.istio-system
smm-ingressgateway-b6d5b5b84-l5llx 1/1 Running 0 17m cp-v115x.istio-system
smm-kubestatemetrics-5455b9697-5tbgq 2/2 Running 0 19m cp-v115x.istio-system
smm-leo-7b64559786-2sj4c 2/2 Running 0 19m cp-v115x.istio-system
smm-prometheus-operator-66dbdb499d-sz6t8 3/3 Running 1 19m cp-v115x.istio-system
smm-sre-alert-exporter-668d9cbd68-926t5 2/2 Running 0 19m cp-v115x.istio-system
smm-sre-api-86cf44fbbb-lxvxd 2/2 Running 0 19m cp-v115x.istio-system
smm-sre-controller-858b984df6-6b5r6 2/2 Running 0 19m cp-v115x.istio-system
smm-tracing-76c688ff6f-7ctjk 2/2 Running 0 19m cp-v115x.istio-system
smm-vm-integration-5df64bdb4b-68xgh 2/2 Running 0 19m cp-v115x.istio-system
smm-web-677b9f4f5b-ss9zs 3/3 Running 0 19m cp-v115x.istio-system
-
If you are upgrading both Service Mesh Manager and Streaming Data Manager, run the following command to verify that the installation is complete.
kubectl get pods -A -L istio.io/rev
The output should be similar to:
NAMESPACE NAME READY STATUS RESTARTS AGE REV
cert-manager cert-manager-67575448dd-8qbws 1/1 Running 0 5h56m
cert-manager cert-manager-cainjector-79f8d775c7-ww7fw 1/1 Running 0 5h56m
cert-manager cert-manager-webhook-5949cc4b67-gwknv 1/1 Running 0 5h56m
cluster-registry cluster-registry-controller-b86f8857c-44jh8 1/1 Running 0 5h57m
csr-operator-system csr-operator-5955b44674-bvl9p 2/2 Running 0 5h56m
istio-system istio-meshexpansion-v115x-d8555488f-btdx6 1/1 Running 0 37m v115x.istio-system
istio-system istiod-v115x-555749b797-dcwwm 1/1 Running 0 5h55m v115x.istio-system
istio-system istiod-sdm-iv115x-6c8cfb5fc5-85w2d 1/1 Running 0 5h55m sdm-iv115x.istio-system
kafka kafka-operator-operator-76df6db8d4-l4kkq 3/3 Running 2 (5h52m ago) 5h53m sdm-iv115x.istio-system
smm-registry-access imagepullsecrets-controller-6c45b46459-qb9j8 1/1 Running 0 6h1m
smm-system istio-operator-v113x-6fb944b86b-xgpbd 2/2 Running 0 5h55m
smm-system istio-operator-v115x-68dcbc59c8-vt2mp 2/2 Running 0 5h55m
smm-system mesh-manager-0 2/2 Running 0 5h56m
smm-system prometheus-node-exporter-74dcm 1/1 Running 0 5h53m
smm-system prometheus-node-exporter-8s458 1/1 Running 0 5h59m
smm-system prometheus-node-exporter-vmth4 1/1 Running 0 5h59m
smm-system prometheus-node-exporter-xsk8j 1/1 Running 0 5h59m
smm-system prometheus-smm-prometheus-0 4/4 Running 0 5h55m v115x.istio-system
smm-system smm-656d45f7cc-c2kd6 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-656d45f7cc-xrx9n 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-als-855c6878b7-55gvd 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-authentication-666547f79f-hwt6t 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-federation-gateway-fd4bbb4f8-4nql8 2/2 Running 1 (5h54m ago) 5h55m v115x.istio-system
smm-system smm-federation-gateway-operator-bd94d8444-nbvjz 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-grafana-59c54f67f4-tft2h 3/3 Running 0 5h55m v115x.istio-system
smm-system smm-health-86b8dbdf68-k8bfr 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-health-api-69bc97d89-gkdp5 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-ingressgateway-9875bc895-v95m9 1/1 Running 0 37m v115x.istio-system
smm-system smm-kubestatemetrics-86c6f96789-cxsrb 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-leo-8446486596-2w7fc 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-prometheus-operator-77cd64556d-ghz5r 3/3 Running 1 (5h55m ago) 5h55m v115x.istio-system
smm-system smm-sre-alert-exporter-5dd8b64d58-ccrnh 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-sre-api-998fc554b-lpvsq 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-sre-controller-68c974c9db-grb44 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-tracing-5886d59dd-7k6kt 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-vm-integration-5cb96cdd78-mh5lh 2/2 Running 0 5h55m v115x.istio-system
smm-system smm-web-55f45cc8c5-gd894 3/3 Running 0 5h55m v115x.istio-system
supertubes-control-plane supertubes-control-plane-5bdbfcf5b6-85bw7 2/2 Running 0 5h57m
supertubes-system prometheus-operator-grafana-5fd88bcf86-55kgg 4/4 Running 0 5h53m sdm-iv115x.istio-system
supertubes-system prometheus-operator-kube-state-metrics-5dbf8656db-wlzfw 2/2 Running 2 (5h53m ago) 5h53m sdm-iv115x.istio-system
supertubes-system prometheus-operator-operator-7bdc575546-b4n94 2/2 Running 1 (5h53m ago) 5h53m sdm-iv115x.istio-system
supertubes-system prometheus-operator-prometheus-node-exporter-69cmx 1/1 Running 0 5h53m
supertubes-system prometheus-operator-prometheus-node-exporter-75b7q 1/1 Running 0 5h53m
supertubes-system prometheus-operator-prometheus-node-exporter-skksk 1/1 Running 0 5h53m
supertubes-system prometheus-operator-prometheus-node-exporter-v2pll 1/1 Running 0 5h53m
supertubes-system prometheus-prometheus-operator-prometheus-0 3/3 Running 0 5h53m sdm-iv115x.istio-system
supertubes-system supertubes-6f6b86b497-c5zqf 3/3 Running 1 (5h54m ago) 5h54m sdm-iv115x.istio-system
supertubes-system supertubes-ui-backend-c97564f84-c2vd6 2/2 Running 2 (5h54m ago) 5h54m sdm-iv115x.istio-system
zookeeper zookeeper-operator-6ff85cf58d-6kxhk 2/2 Running 1 (5h54m ago) 5h54m sdm-iv115x.istio-system
zookeeper zookeeper-operator-post-install-upgrade-qq4kf 0/1 Completed 0 5h54m
-
Restart your workloads to move them to the v115x mesh.
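For example, to restart the Deployments in a namespace (see Restarting workloads below for details):
kubectl rollout restart deployment --namespace <name-of-your-namespace>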
In operator mode
If the deployment is managed in operator mode, the upgrade procedure consists only of installing a newer version of the operator Helm chart and allowing it to reconcile the cluster.
SMM upgrade
-
Uninstall the previous version (1.10.0) of the smm-operator chart.
helm uninstall smm-operator --namespace smm-registry-access
-
Install the new version (1.11.0) of the smm-operator chart.
helm install \
--namespace=smm-registry-access \
--set "global.ecr.enabled=false" \
--set "global.basicAuth.username=<your-username>" \
--set "global.basicAuth.password=<your-password>" \
smm-operator \
oci://registry.eticloud.io/smm-charts/smm-operator --version 1.11.0
Note: If the system uses helm for deploying the chart (and not some other CI/CD solution such as Argo CD), then the CustomResourceDefinitions (CRDs) will not be automatically upgraded. In this case, fetch the helm chart locally using the helm pull
command and apply the CRDs in the crds
folder of the helm chart manually.
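For example, a sketch that follows the helm pull convention used elsewhere in this guide (large CRDs might require server-side apply):
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.11.0
kubectl apply --server-side -f ./charts/smm-operator/crds/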
-
After the operator has been started, monitor the status of the ControlPlane resource until it finishes the upgrade (reconciliation). Run the following command:
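For example, assuming the ControlPlane resource is cluster-scoped and named smm as in the manifests shown in this guide:
kubectl describe controlplanes.smm.cisco.com smm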
After the upgrade is finished, the output should be similar to the following. The Status: Succeeded
line shows that the deployment has been upgraded. In case of any errors, consult the Kubernetes logs of the operator (installed by Helm) for further information.
...
Status:
Components:
Cert Manager:
Status: Available
Cluster Registry:
Status: Available
Mesh Manager:
Status: Available
Node Exporter:
Status: Available
Registry Access:
Status: Available
Smm:
Status: Available
Status: Succeeded
-
Restart your workloads to move them to the v115x mesh.
SDM upgrade
-
Uninstall the previous version of the sdm-operator chart (if Streaming Data Manager is installed).
helm uninstall --namespace supertubes-control-plane sdm-operator
-
Install the new version (1.8.0) of the sdm-operator chart.
helm install \
--namespace supertubes-control-plane \
--set imagePullSecrets={smm-pull-secret} \
--set operator.image.repository="registry.eticloud.io/sdm/supertubes-control-plane" \
sdm-operator \
oci://registry.eticloud.io/sdm-charts/supertubes-control-plane --version 1.8.0
-
After the operator has been started, monitor the status of the applicationmanifest resource until it finishes the upgrade (reconciliation). Run the following command:
kubectl describe applicationmanifests.supertubes.banzaicloud.io -n supertubes-control-plane sdm-applicationmanifest
The output should be similar to:
...
Status:
Components:
Cluster Registry:
Status: Removed
Csr Operator:
Status: Available
Imps Operator:
Image Pull Secret Status: Unmanaged
Status: Removed
Istio Operator:
Status: Removed
Kafka Operator:
Status: Available
Monitoring:
Status: Available
Supertubes:
Status: Available
Zookeeper Operator:
Status: Available
Status: Succeeded
In a GitOps scenario
If you have installed Service Mesh Manager using our GitOps guide, complete the following steps to upgrade the operator chart.
-
Check your username and password on the download page.
-
Download the smm-operator
chart from registry.eticloud.io
into the charts
directory of your Service Mesh Manager GitOps repository and extract it. Run the following commands:
export HELM_EXPERIMENTAL_OCI=1 # Needed prior to Helm version 3.8.0
echo "${CALISTI_PASSWORD}" | helm registry login registry.eticloud.io -u "${CALISTI_USERNAME}" --password-stdin
Expected output:
Login Succeeded
Then pull the chart:
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.11.0
Expected output:
Pulled: registry.eticloud.io/smm-charts/smm-operator:latest-stable-version
Digest: sha256:someshadigest
-
Commit the changes and push the repository.
git add .
git commit -m "Update smm-operator chart"
git push origin
-
Restart your workloads to move them to the v115x mesh.
Restarting workloads
After the upgrade has completed, the Pods running in the applications' namespaces are still running the old version of the Istio proxy sidecar.
-
To obtain the latest security patches, restart these Controllers
(Deployments
, StatefulSets
, and so on) either using the kubectl rollout
command, or by instructing the CI/CD systems enabled on the cluster. For example, to restart the deployments in a namespace, you can run:
kubectl rollout restart deployment --namespace <name-of-your-namespace>
-
If the upgrade also involved a minor or major version upgrade of Istio, the kubectl rollout
command only ensures that the Pods use the latest patch level of their current Istio minor version.
For example: Service Mesh Manager 1.8.2 comes with Istio 1.11, while Service Mesh Manager 1.9.0 is bundled with Istio 1.12. Upgrading from Service Mesh Manager 1.8.2 to 1.9.0 and then restarting the Controllers
only results in the latest Istio 1.11 sidecar proxy being started in the Pods.
To upgrade to the new minor/major version of Istio on your workloads, complete the Canary control plane upgrades procedure.
2.4.2 - Canary control plane upgrades
Overview
Upgrading between Istio minor/major releases (for example, from Istio 1.13.x to 1.15.x) is a high-risk operation. The official Istio distribution is designed in a way that the upgrade occurs as a big one-time upgrade, making recovery difficult in case of unexpected errors.
To address this issue, Service Mesh Manager runs both versions of the Istio control plane on the upgraded cluster, and allows you to migrate your workloads gradually to the new Istio version.
To list the Istio control planes running on the cluster, run:
kubectl get istiocontrolplanes -n istio-system
The output should be similar to:
NAME MODE NETWORK STATUS MESH EXPANSION EXPANSION GW IPS ERROR AGE
cp-v113x ACTIVE network1 Available true ["3.122.28.53","3.122.43.249"] 87m
cp-v115x ACTIVE network1 Available true ["3.122.31.252","18.195.79.209"] 66m
Here cp-v113x
is running Istio 1.13.x, while cp-v115x
is running Istio 1.15.x.
A special label on the namespaces specifies which Istio control plane should the proxies use in that namespace. In the following example the smm-demo
namespace is attached to the cp-v113x.istio-system
control plane (where the .istio-system
is the name of the namespace of the Istio control plane).
kubectl get ns smm-demo -o yaml
The output should be similar to:
apiVersion: v1
kind: Namespace
metadata:
...
labels:
istio.io/rev: cp-v113x.istio-system
name: smm-demo
spec:
finalizers:
- kubernetes
status:
phase: Active
Of course both cp-v113x
and cp-v115x
are able to discover services in all namespaces. This means that:
- Workloads can communicate with each other regardless which Istio control plane they are attached to.
- In case of an error, any namespace can be rolled back to the previous version of the Istio control plane by simply changing the istio.io/rev label.
Upgrading between major/minor Istio versions
-
To upgrade Istio, first upgrade Service Mesh Manager. The upgrade will also update the validation rules, detecting any possible issues with the existing Istio Custom Resources.
-
Before starting the migration of the workloads to the new Istio control plane, check the Validation UI and fix any errors with your configuration.
-
After the upgrade has been completed, find the name of the new Istio control plane by running the following command:
kubectl get istiocontrolplanes -n istio-system
The output should be similar to:
NAME MODE NETWORK STATUS MESH EXPANSION EXPANSION GW IPS ERROR AGE
cp-v113x ACTIVE network1 Available true ["3.122.28.53","3.122.43.249"] 87m
cp-v115x ACTIVE network1 Available true ["3.122.31.252","18.195.79.209"] 66m
In this case the new Istio Control Plane is called cp-v115x
which is running Istio 1.15.x.
-
Migrate a namespace to the new Istio control plane. Complete the following steps.
-
Select a namespace, preferably one with the least impact on production traffic. Edit the istio.io/rev
label on the namespace by running:
kubectl label ns <your-namespace> istio.io/rev=cp-v115x.istio-system --overwrite
Expected output:
namespace/<your-namespace> labeled
-
Restart all Controllers
(Deployments
, StatefulSets
, and so on) in the namespace. After the restart, the workloads in the namespace are attached to the new Istio control plane. For example, to restart the deployments in a namespace, you can run:
kubectl rollout restart deployment -n <name-of-your-namespace>
-
Test your application to verify that it works with the new control plane as expected. In case of any issues, refer to the rollback section to roll back to the original Istio control plane.
-
Migrate your other namespaces.
-
After all of the applications have been migrated to the new control plane and you have verified that the applications work as expected, you can delete the old Istio control plane.
Delete the old Istio Control Plane
After you have verified that your applications work as expected with the new Istio control plane, you can delete the old Istio control plane by completing the following steps.
-
Open the Service Mesh Manager dashboard and navigate to the MAIN MENU > MESH page.
-
Verify that no Pods are attached to the old Istio control plane (the number of proxies for the old control plane should be 0).
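Alternatively, you can check from the command line which control plane revision each pod is attached to, and confirm that no pods still reference the old revision (cp-v113x in this example):
kubectl get pods -A -L istio.io/rev | grep cp-v113x
If the command returns no results, no pods are attached to the old control plane.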
-
Delete the old Istio control plane:
kubectl delete istiocontrolplanes -n istio-system cp-v113x
Note: Deleting the prometheus-smm-prometheus-x pod erases historical timeline data. To persist timeline data across Prometheus rollouts, see Set up Persistent Volumes for Prometheus.
Roll back the data plane to the old control plane in case of issues
CAUTION:
Perform this step only if you have issues with your data plane pods, which were working with the old Istio control plane, and you deliberately want to move your workloads back to that control plane!
-
If there is a problem and you want to roll the namespace back to the old control plane, set the istio.io/rev label on the namespace to point to the old Istio control plane, and restart the pod using the kubectl rollout restart deployment
command:
kubectl label ns <name-of-your-namespace-with-issues> istio.io/rev=cp-v113x.istio-system
kubectl rollout restart deployment -n <name-of-your-namespace-with-issues>
2.4.3 - Upgrade SMM - GitOps - single cluster
This document describes how to upgrade SMM and a business application.
CAUTION:
Do not push the secrets directly into the git repository, especially when it is a public repository. Argo CD provides solutions to
keep secrets safe.
Prerequisites
To complete this procedure, you need:
- A free registration for the Service Mesh Manager download page
- A Kubernetes cluster running Argo CD (called
management-cluster
in the examples).
- A Kubernetes cluster running the previous version of Service Mesh Manager (called
workload-cluster-1
in the examples). It is assumed that Service Mesh Manager has been installed on this cluster as described in the Service Mesh Manager 1.10.0 documentation, and that the cluster meets the resource requirements of Service Mesh Manager version 1.11.0.
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
|   | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (If you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS.) | 24 vCPU in total; 4 vCPU available for allocation per worker node (If you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS.) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
The default installation, when enough headroom is available in the cluster, should be able to support at least 150 running Pods
with the same number of Services
. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
This document describes how to upgrade Service Mesh Manager version 1.10.0 to Service Mesh Manager version 1.11.0.
Set up the environment
-
Set the KUBECONFIG location and context name for the management-cluster
cluster.
MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
MANAGEMENT_CLUSTER_CONTEXT=management-cluster
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* management-cluster management-cluster
-
Set the KUBECONFIG location and context name for the workload-cluster-1
cluster.
WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1
kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* workload-cluster-1 workload-cluster-1
Repeat this step for any additional workload clusters you want to use.
-
Make sure the management-cluster
Kubernetes context is the current context.
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" use-context "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
Switched to context "management-cluster".
Upgrade Service Mesh Manager
The high-level steps of the upgrade process are:
- Install the new IstioControlPlane
istio-cp-v115x
on the workload cluster.
- Upgrade the
smm-operator
and the smm-controlplane
. The smm-controlplane
will use the new istio-cp-v115x
IstioControlPlane, but the business applications (for example, demo-app
) will still use the old istio-cp-v113x
control plane.
- Upgrade the business applications (
demo-app
) to use the new control plane.
-
Remove the old version (1.10.0) of the smm-operator
Helm chart.
rm -rf charts/smm-operator
-
Pull the new version (1.11.0) of the smm-operator
Helm chart and extract it into the charts
folder.
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.11.0
-
Create the istio-cp-v115x.yaml
file.
cat > manifests/istio-cp-v115x.yaml << EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
containerImageConfiguration:
imagePullPolicy: Always
imagePullSecrets:
- name: smm-pull-secret
distribution: cisco
istiod:
deployment:
env:
- name: ISTIO_MULTIROOT_MESH
value: "true"
image: registry.eticloud.io/smm/istio-pilot:v1.15.3-bzc.0
k8sResourceOverlays:
- groupVersionKind:
group: apps
kind: Deployment
version: v1
objectKey:
name: istiod-cp-v115x
namespace: istio-system
patches:
- path: /spec/template/spec/containers/0/args/-
type: replace
value: --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256
meshConfig:
defaultConfig:
envoyAccessLogService:
address: smm-als.smm-system.svc.cluster.local:50600
tcpKeepalive:
interval: 10s
probes: 3
time: 10s
tlsSettings:
mode: ISTIO_MUTUAL
holdApplicationUntilProxyStarts: true
proxyMetadata:
ISTIO_META_ALS_ENABLED: "true"
PROXY_CONFIG_XDS_AGENT: "true"
tracing:
tlsSettings:
mode: ISTIO_MUTUAL
zipkin:
address: smm-zipkin.smm-system.svc.cluster.local:59411
enableEnvoyAccessLogService: true
enableTracing: true
meshExpansion:
enabled: true
gateway:
deployment:
podMetadata:
labels:
app: istio-meshexpansion-gateway
istio: meshexpansiongateway
service:
ports:
- name: tcp-smm-als-tls
port: 50600
protocol: TCP
targetPort: 50600
- name: tcp-smm-zipkin-tls
port: 59411
protocol: TCP
targetPort: 59411
meshID: mesh1
mode: ACTIVE
networkName: network1
proxy:
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
proxyInit:
cni:
daemonset:
image: registry.eticloud.io/smm/istio-install-cni:v1.15.3-bzc.0
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
sidecarInjector:
deployment:
image: registry.eticloud.io/smm/istio-sidecar-injector:v1.15.3-bzc.0
version: 1.15.3
EOF
-
Update the smm-controlplane.yaml
file to use the istio-cp-v115x
IstioControlPlane.
cat > manifests/smm-controlplane.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "10"
name: smm
spec:
certManager:
enabled: true
namespace: cert-manager
clusterName: workload-cluster-1
clusterRegistry:
enabled: true
namespace: cluster-registry
log: {}
meshManager:
enabled: true
istio:
enabled: true
istioCRRef:
name: cp-v115x
namespace: istio-system
operators:
namespace: smm-system
namespace: smm-system
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
oneEye: {}
registryAccess:
enabled: true
imagePullSecretsController: {}
namespace: smm-registry-access
pullSecrets:
- name: smm-registry.eticloud.io-pull-secret
namespace: smm-registry-access
repositoryOverride:
host: registry.eticloud.io
prefix: smm
role: active
smm:
exposeDashboard:
meshGateway:
enabled: true
als:
enabled: true
log: {}
application:
enabled: true
log: {}
auth:
forceUnsecureCookies: true
mode: anonymous
certManager:
enabled: true
enabled: true
federationGateway:
enabled: true
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
leo:
enabled: true
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
prometheusOperator: {}
releaseName: smm
role: active
sre:
enabled: true
useIstioResources: true
EOF
-
If you are upgrading Service Mesh Manager from 1.10.0 to 1.11.0, complete this step; otherwise, skip to the next step.
Apply the following patch to your Service Mesh Manager v1.10.0 cluster so that the completed cert-manager-startupapicheck
job is cleaned up 100 seconds after it finishes. If you skip this step, you might see a cert-manager-startupapicheck
related error during the upgrade.
kubectl patch jobs.batch -n cert-manager cert-manager-startupapicheck -p '{"spec":{"ttlSecondsAfterFinished":100}}' --type=merge --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
-
Commit and push the changes to the Git repository.
git add .
git commit -m "upgrade smm to 1.11.0"
git push
-
Wait a few minutes, then check the new IstioControlPlane.
kubectl -n istio-system get istiocontrolplanes.servicemesh.cisco.com --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME MODE NETWORK STATUS MESH EXPANSION EXPANSION GW IPS ERROR AGE
cp-v113x ACTIVE network1 Available true ["52.208.63.154","54.155.81.181"] 61m
cp-v115x ACTIVE network1 Available true ["52.211.44.215","63.32.253.55"] 11m
-
Open the Service Mesh Manager dashboard.
On the MENU > OVERVIEW page everything should be fine, except for some validation issues. The validation issues show that the business application (demo-app
) is lagging behind the smm-control-plane
and should be updated to use the latest IstioControlPlane.
You can see the two IstioControlPlanes on the MENU > MESH page. The smm-control-plane
is using the cp-v115x.istio-system
IstioControlPlane and the demo-app
is using the cp-v113x.istio-system
IstioControlPlane.
Upgrade Demo application
The demo-app
application is just for demonstration purposes, but it represents your business applications. To upgrade a business application, set the istio.io/rev
label of the business application’s namespace to the target IstioControlPlane. Complete the following steps to update the demo-app
to use the cp-v115x.istio-system
IstioControlPlane and set the istio.io/rev
label accordingly.
-
Update the label on your application namespace (for the demo application, that's the demo-app-ns.yaml
file).
cat > demo-app/demo-app-ns.yaml << EOF
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: smm-demo
app.kubernetes.io/name: smm-demo
app.kubernetes.io/part-of: smm-demo
app.kubernetes.io/version: 0.1.4
istio.io/rev: cp-v115x.istio-system
name: smm-demo
EOF
-
Update the demo-app.yaml
file.
cat > demo-app/demo-app.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
deployIstioResources: true
deploySLOResources: true
enabled: true
enabledComponents:
- frontpage
- catalog
- bookings
- postgresql
- payments
- notifications
- movies
- analytics
- database
- mysql
istio:
revision: cp-v115x.istio-system
load:
enabled: true
maxRPS: 30
minRPS: 10
swingPeriod: 1380000000000
replicas: 1
resources:
limits:
cpu: "2"
memory: 192Mi
requests:
cpu: 40m
memory: 64Mi
EOF
-
Commit and push the changes to the Git repository.
git add .
git commit -m "upgrade demo-app to istio-cp-v115x"
git push
-
Wait a few minutes, then check the IstioControlPlane of the demo-app
.
kubectl get ns smm-demo -o=jsonpath='{.metadata.labels.istio\.io/rev}' --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output: cp-v115x.istio-system, which shows that the demo-app
is using the new istio-cp-v115x
IstioControlPlane.
-
Check the dashboard.
-
On the MENU > OVERVIEW page, everything should be fine.
-
On the MENU > MESH page, you can see both the old and the new IstioControlPlane, but both smm-controlplane
and demo-app
are using the new one.
-
Check the MENU > TOPOLOGY page.
2.4.4 - Upgrade SMM - GitOps - multi-cluster
This document describes how to upgrade SMM and a business application.
CAUTION:
Do not push the secrets directly into the git repository, especially when it is a public repository. Argo CD provides solutions to
keep secrets safe.
Prerequisites
To complete this procedure, you need:
- A free registration for the Service Mesh Manager download page
- A Kubernetes cluster running Argo CD (called
management-cluster
in the examples).
- Two Kubernetes clusters running the previous version of Service Mesh Manager (called
workload-cluster-1
and workload-cluster-2
in the examples). It is assumed that Service Mesh Manager has been installed on these clusters as described in the Service Mesh Manager 1.10.0 documentation, and that the clusters meet the resource requirements of Service Mesh Manager version 1.11.0.
CAUTION:
Supported providers and Kubernetes versions
The cluster must run a Kubernetes version that Service Mesh Manager supports: Kubernetes 1.21, 1.22, 1.23, 1.24.
Service Mesh Manager is tested and known to work on the following Kubernetes providers:
- Amazon Elastic Kubernetes Service (Amazon EKS)
- Google Kubernetes Engine (GKE)
- Azure Kubernetes Service (AKS)
- On-premises installation of stock Kubernetes with load balancer support (and optionally PVCs for persistence)
Resource requirements
Make sure that your Kubernetes cluster has sufficient resources. The default installation (with Service Mesh Manager and demo application) requires the following amount of resources on the cluster:
|   | Only Service Mesh Manager | Service Mesh Manager and Streaming Data Manager |
| --- | --- | --- |
| CPU | 12 vCPU in total; 4 vCPU available for allocation per worker node (If you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS.) | 24 vCPU in total; 4 vCPU available for allocation per worker node (If you are testing on a cluster at a cloud provider, use nodes that have at least 4 CPUs, for example, c5.xlarge on AWS.) |
| Memory | 16 GB in total; 2 GB available for allocation per worker node | 36 GB in total; 2 GB available for allocation per worker node |
| Storage | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) | 12 GB of ephemeral storage on the Kubernetes worker nodes (for Traces and Metrics) |
These minimum requirements need to be available for allocation within your cluster, in addition to the requirements of any other loads running in your cluster (for example, DaemonSets and Kubernetes node-agents). If Kubernetes cannot allocate sufficient resources to Service Mesh Manager, some pods will remain in Pending state, and Service Mesh Manager will not function properly.
Enabling additional features, such as High Availability, increases these requirements.
The default installation, when enough headroom is available in the cluster, should be able to support at least 150 running Pods
with the same number of Services
. For setting up Service Mesh Manager for bigger workloads, see scaling Service Mesh Manager.
This document describes how to upgrade Service Mesh Manager version 1.10.0 to Service Mesh Manager version 1.11.0.
Set up the environment
-
Set the KUBECONFIG location and context name for the management-cluster
cluster.
MANAGEMENT_CLUSTER_KUBECONFIG=management_cluster_kubeconfig.yaml
MANAGEMENT_CLUSTER_CONTEXT=management-cluster
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" get-contexts "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* management-cluster management-cluster
-
Set the KUBECONFIG location and context name for the workload-cluster-1
cluster.
WORKLOAD_CLUSTER_1_KUBECONFIG=workload_cluster_1_kubeconfig.yaml
WORKLOAD_CLUSTER_1_CONTEXT=workload-cluster-1
kubectl config --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" get-contexts "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* workload-cluster-1 workload-cluster-1
Repeat this step for any additional workload clusters you want to use.
-
Make sure the management-cluster
Kubernetes context is the current context.
kubectl config --kubeconfig "${MANAGEMENT_CLUSTER_KUBECONFIG}" use-context "${MANAGEMENT_CLUSTER_CONTEXT}"
Expected output:
Switched to context "management-cluster".
Upgrade Service Mesh Manager
The high-level steps of the upgrade process are:
- Install the new IstioControlPlane
istio-cp-v115x
on the workload clusters.
- Upgrade the
smm-operator
and the smm-controlplane
. The smm-controlplane
will use the new istio-cp-v115x
IstioControlPlane, but the business applications (for example, demo-app
) will still use the old istio-cp-v113x
control plane.
- Upgrade the business applications (
demo-app
) to use the new control plane.
-
Remove the old version (1.10.0) of the smm-operator
Helm chart.
rm -rf charts/smm-operator
-
Pull the new version (1.11.0) of the smm-operator
Helm chart and extract it into the charts
folder.
helm pull oci://registry.eticloud.io/smm-charts/smm-operator --destination ./charts/ --untar --version 1.11.0
-
Create the istio-cp-v115x.yaml
file, and its overlays for the workload clusters.
cat > manifests/istio-cp-v115x.yaml << EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
containerImageConfiguration:
imagePullPolicy: Always
imagePullSecrets:
- name: smm-pull-secret
distribution: cisco
istiod:
deployment:
env:
- name: ISTIO_MULTIROOT_MESH
value: "true"
image: registry.eticloud.io/smm/istio-pilot:v1.15.3-bzc.0
k8sResourceOverlays:
- groupVersionKind:
group: apps
kind: Deployment
version: v1
objectKey:
name: istiod-cp-v115x
namespace: istio-system
patches:
- path: /spec/template/spec/containers/0/args/-
type: replace
value: --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384,TLS_AES_128_GCM_SHA256
meshConfig:
defaultConfig:
envoyAccessLogService:
address: smm-als.smm-system.svc.cluster.local:50600
tcpKeepalive:
interval: 10s
probes: 3
time: 10s
tlsSettings:
mode: ISTIO_MUTUAL
holdApplicationUntilProxyStarts: true
proxyMetadata:
ISTIO_META_ALS_ENABLED: "true"
PROXY_CONFIG_XDS_AGENT: "true"
tracing:
tlsSettings:
mode: ISTIO_MUTUAL
zipkin:
address: smm-zipkin.smm-system.svc.cluster.local:59411
enableEnvoyAccessLogService: true
enableTracing: true
meshExpansion:
enabled: true
gateway:
deployment:
podMetadata:
labels:
app: istio-meshexpansion-gateway
istio: meshexpansiongateway
service:
ports:
- name: tcp-smm-als-tls
port: 50600
protocol: TCP
targetPort: 50600
- name: tcp-smm-zipkin-tls
port: 59411
protocol: TCP
targetPort: 59411
meshID: mesh1
mode: ACTIVE
networkName: network1
proxy:
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
proxyInit:
cni:
daemonset:
image: registry.eticloud.io/smm/istio-install-cni:v1.15.3-bzc.0
image: registry.eticloud.io/smm/istio-proxyv2:v1.15.3-bzc.0
sidecarInjector:
deployment:
image: registry.eticloud.io/smm/istio-sidecar-injector:v1.15.3-bzc.0
version: 1.15.3
EOF
For workload-cluster-1
:
cat > manifests/smm-controlplane/overlays/workload-cluster-1/istio-cp-v115x.yaml <<EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
meshID: mesh1
mode: ACTIVE
networkName: network1
EOF
For workload-cluster-2
:
cat > manifests/smm-controlplane/overlays/workload-cluster-2/istio-cp-v115x.yaml <<EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "5"
name: cp-v115x
namespace: istio-system
spec:
meshID: mesh1
mode: PASSIVE
networkName: workload-cluster-2
EOF
-
Update the control-plane.yaml
files to use the istio-cp-v115x
IstioControlPlane.
cat > manifests/smm-controlplane/base/control-plane.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
annotations:
argocd.argoproj.io/sync-wave: "10"
name: smm
spec:
certManager:
namespace: cert-manager
clusterName: CLUSTER-NAME
clusterRegistry:
enabled: true
namespace: cluster-registry
log: {}
meshManager:
enabled: true
istio:
enabled: true
istioCRRef:
name: cp-v115x
namespace: istio-system
operators:
namespace: smm-system
namespace: smm-system
nodeExporter:
enabled: true
namespace: smm-system
psp:
enabled: false
rbac:
enabled: true
oneEye: {}
registryAccess:
enabled: true
imagePullSecretsController: {}
namespace: smm-registry-access
pullSecrets:
- name: smm-registry.eticloud.io-pull-secret
namespace: smm-registry-access
repositoryOverride:
host: registry.eticloud.io
prefix: smm
role: active
smm:
als:
enabled: true
log: {}
application:
enabled: true
log: {}
auth:
mode: impersonation
certManager:
enabled: true
enabled: true
federationGateway:
enabled: true
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
leo:
enabled: true
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
prometheusOperator: {}
releaseName: smm
role: active
sdm:
enabled: false
sre:
enabled: true
useIstioResources: true
EOF
For workload-cluster-1
:
cat > manifests/smm-controlplane/overlays/workload-cluster-1/control-plane.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
clusterName: workload-cluster-1
certManager:
enabled: true
smm:
exposeDashboard:
meshGateway:
enabled: true
auth:
forceUnsecureCookies: true
mode: anonymous
EOF
For workload-cluster-2
:
cat > manifests/smm-controlplane/overlays/workload-cluster-2/control-plane.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: ControlPlane
metadata:
name: smm
spec:
clusterName: workload-cluster-2
role: passive
smm:
als:
enabled: true
log: {}
application:
enabled: false
log: {}
auth:
mode: impersonation
certManager:
enabled: false
enabled: true
federationGateway:
enabled: false
name: smm
service:
enabled: true
name: smm-federation-gateway
port: 80
federationGatewayOperator:
enabled: true
grafana:
enabled: false
impersonation:
enabled: true
istio:
revision: cp-v115x.istio-system
kubestatemetrics:
enabled: true
leo:
enabled: false
log: {}
log: {}
namespace: smm-system
prometheus:
enabled: true
replicas: 1
retentionTime: 8h
prometheusOperator: {}
releaseName: smm
role: passive
sdm:
enabled: false
sre:
enabled: false
tracing:
enabled: true
useIstioResources: false
web:
enabled: false
EOF
-
Create the kustomization files.
cat > manifests/smm-controlplane/base/kustomization.yaml << EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
metadata:
name: cluster-secrets
resources:
- cert-manager-namespace.yaml
- istio-system-namespace.yaml
- istio-cp-v113x.yaml
- istio-cp-v115x.yaml
- control-plane.yaml
EOF
cat > manifests/smm-controlplane/overlays/workload-cluster-1/kustomization.yaml << EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patches:
- istio-cp-v113x.yaml
- istio-cp-v115x.yaml
- control-plane.yaml
EOF
cat > manifests/smm-controlplane/overlays/workload-cluster-2/kustomization.yaml << EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patches:
- istio-cp-v113x.yaml
- istio-cp-v115x.yaml
- control-plane.yaml
EOF
-
If you are upgrading Service Mesh Manager from 1.10.0 to 1.11.0, complete this step; otherwise, skip to the next step.
Apply the following patch to your Service Mesh Manager v1.10.0 cluster so that the completed cert-manager-startupapicheck
job is cleaned up 100 seconds after it finishes. If you skip this step, you might see a cert-manager-startupapicheck
related error during the upgrade.
kubectl patch jobs.batch -n cert-manager cert-manager-startupapicheck -p '{"spec":{"ttlSecondsAfterFinished":100}}' --type=merge --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
For workload-cluster-2
:
kubectl patch jobs.batch -n cert-manager cert-manager-startupapicheck -p '{"spec":{"ttlSecondsAfterFinished":100}}' --type=merge --kubeconfig "${WORKLOAD_CLUSTER_2_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_2_CONTEXT}"
-
Commit and push the changes to the Git repository.
git add .
git commit -m "upgrade SMM to 1.11.0"
git push
-
Wait a few minutes, then check the new IstioControlPlane.
kubectl -n istio-system get istiocontrolplanes.servicemesh.cisco.com --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output:
NAME MODE NETWORK STATUS MESH EXPANSION EXPANSION GW IPS ERROR AGE
cp-v113x ACTIVE network1 Available true ["52.208.63.154","54.155.81.181"] 61m
cp-v115x ACTIVE network1 Available true ["52.211.44.215","63.32.253.55"] 11m
For workload-cluster-2
:
kubectl -n istio-system get istiocontrolplanes.servicemesh.cisco.com --kubeconfig "${WORKLOAD_CLUSTER_2_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_2_CONTEXT}"
-
Open the Service Mesh Manager dashboard.
On the MENU > OVERVIEW page everything should be fine, except for some validation issues. The validation issues show that the business application (demo-app
) is lagging behind the smm-control-plane
and should be updated to use the latest IstioControlPlane.
You can see the two IstioControlPlanes on the MENU > MESH page. The smm-control-plane
is using the cp-v115x.istio-system
IstioControlPlane and the demo-app
is using the cp-v113x.istio-system
IstioControlPlane.
Upgrade Demo application
The demo-app
application is just for demonstration purposes, but it represents your business applications. To upgrade a business application, set the istio.io/rev
label of the business application’s namespace to the target IstioControlPlane. Complete the following steps to update the demo-app
to use the cp-v115x.istio-system
IstioControlPlane and set the istio.io/rev
label accordingly.
-
Update the label on your application namespace (for the demo application, that's the demo-app-namespace.yaml
file).
cat > manifests/demo-app/base/demo-app-namespace.yaml << EOF
apiVersion: v1
kind: Namespace
metadata:
labels:
app.kubernetes.io/instance: smm-demo
app.kubernetes.io/name: smm-demo
app.kubernetes.io/part-of: smm-demo
app.kubernetes.io/version: 0.1.4
istio.io/rev: cp-v115x.istio-system
name: smm-demo
EOF
-
Update the demo-app.yaml
files.
For workload-cluster-1
:
cat > manifests/demo-app/overlays/workload-cluster-1/demo-app.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
deployIstioResources: true
deploySLOResources: true
enabled: true
enabledComponents:
- frontpage
- catalog
- bookings
- postgresql
istio:
revision: cp-v115x.istio-system
load:
enabled: true
maxRPS: 30
minRPS: 10
swingPeriod: 1380000000000
replicas: 1
resources:
limits:
cpu: "2"
memory: 192Mi
requests:
cpu: 40m
memory: 64Mi
EOF
For workload-cluster-2
:
cat > manifests/demo-app/overlays/workload-cluster-2/demo-app.yaml << EOF
apiVersion: smm.cisco.com/v1alpha1
kind: DemoApplication
metadata:
name: smm-demo
namespace: smm-demo
spec:
autoscaling:
enabled: true
controlPlaneRef:
name: smm
deployIstioResources: false
deploySLOResources: false
enabled: true
enabledComponents:
- movies
- payments
- notifications
- analytics
- database
- mysql
istio:
revision: cp-v115x.istio-system
replicas: 1
resources:
limits:
cpu: "2"
memory: 192Mi
requests:
cpu: 40m
memory: 64Mi
EOF
-
Commit and push the changes to the Git repository.
git add .
git commit -m "upgrade demo-app to istio-cp-v115x"
git push
-
Wait a few minutes, then check the IstioControlPlane of the demo-app
.
kubectl get ns smm-demo -o=jsonpath='{.metadata.labels.istio\.io/rev}' --kubeconfig "${WORKLOAD_CLUSTER_1_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_1_CONTEXT}"
Expected output: cp-v115x.istio-system, which shows that the demo-app
is using the new istio-cp-v115x
IstioControlPlane.
For workload-cluster-2
:
kubectl get ns smm-demo -o=jsonpath='{.metadata.labels.istio\.io/rev}' --kubeconfig "${WORKLOAD_CLUSTER_2_KUBECONFIG}" --context "${WORKLOAD_CLUSTER_2_CONTEXT}"
Expected output: cp-v115x.istio-system, which shows that the demo-app
is using the new istio-cp-v115x
IstioControlPlane.
-
Check the dashboard.
-
On the MENU > OVERVIEW page, everything should be fine.
-
On the MENU > MESH page, you can see both the old and the new IstioControlPlane, but both smm-controlplane
and demo-app
are using the new one.
-
Check the MENU > TOPOLOGY page.
2.4.5 - Multi-cluster upgrade from 1.10.0 to 1.11.0
This document shows you how to upgrade Service Mesh Manager in a multi-cluster mesh scenario. For details on how to set up a multi-cluster mesh, see the multi-cluster installation guide. To access the latest binary files, see Accessing the Service Mesh Manager binaries.
Upgrade from 1.10.0 to 1.11.0
To upgrade Service Mesh Manager from 1.10.0 to 1.11.0 for a multi-cluster setup, complete the following steps.
-
Before upgrading Service Mesh Manager 1.10 to 1.11, apply the following patch to your Service Mesh Manager v1.10 cluster so that the completed cert-manager-startupapicheck
job is cleaned up 100 seconds after it finishes. If you skip this step, you might see a “cert-manager-startupapicheck” related error during the upgrade. The error is non-blocking and does not stop the upgrade process. Alternatively, you can apply the patch after you have upgraded the cluster.
kubectl patch jobs.batch -n cert-manager cert-manager-startupapicheck -p '{"spec":{"ttlSecondsAfterFinished":100}}' --type=merge
-
Download the Service Mesh Manager command-line tool for version 1.11.0. The archive contains the smm
and supertubes
binaries. Extract these binaries and update your local copy on your machine. For details, see Accessing the Service Mesh Manager binaries.
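To confirm that you are now running the new binary, print its version (assuming the CLI supports the standard version flag):
smm --version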
The output should be similar to:
Service Mesh Manager CLI version 1.11.0 (6ba681d83) built on 2022-11-15T21:43:27Z
-
Deploy a new version of Service Mesh Manager.
The following command upgrades the Service Mesh Manager control plane. It also installs the new Istio control plane (version 1.15.x), but the applications keep using the old control plane until you restart your workloads.
In the following examples, smm
refers to version 1.11.0 of the binary.
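As in the single-cluster procedure, a minimal sketch assuming the upgrade is performed with the same smm install command that was used for the initial deployment:
smm install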
-
Restart the Prometheus instance so that it starts using the cp-v115x.istio-system
Istio control plane:
kubectl rollout restart statefulset prometheus-smm-prometheus --namespace smm-system
-
Rerun the attach command with --force
flag to upgrade Service Mesh Manager on the peer cluster:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --force
-
Select the 1.15.x version Istio control plane as that’s the Istio version supported by Service Mesh Manager version 1.11.0:
✓ validate-kubeconfig ❯ checking cluster reachability...
✓ validate-version ❯ checking K8s API server version...
✓ validate-version ❯ detected K8s API server version: 1.21.14
Multiple Istio control planes were found. Which one would you like this cluster to attach to?
? cp-v115x
The upgrade process is completed.
Note: If you see the following error message, rerun the attach command:
could not apply k8s resources: could not update resource: Internal error occurred: failed calling webhook "cluster-validator.clusterregistry.k8s.cisco.com": Post "https://cluster-registry-controller.cluster-registry.svc:443/validate-cluster?timeout=30s": x509: certificate is not valid for any names, but wanted to match cluster-registry-controller.cluster-registry.svc
Upgrade existing workloads
Now the Service Mesh Manager control plane is upgraded and is using the new 1.15.x Istio control plane. However, the workloads are still using the old Istio control plane and data plane. Complete the following steps to upgrade the Istio sidecar proxy in your application pods.
-
On your primary Service Mesh Manager cluster, set the istio.io/rev label to cp-v115x.istio-system
on your application namespaces (for example, the smm-demo
namespace). This label is automatically synchronized to the peer clusters.
kubectl label ns <name-of-your-namespace> istio.io/rev=cp-v115x.istio-system --overwrite
-
Run a kubectl rollout
command on the primary and peer clusters to ensure that the pods use the latest Istio sidecar proxy:
# Run this command on ALL clusters
kubectl rollout restart deployment --namespace <name-of-your-namespace>
-
To verify whether the upgrade process has succeeded on peer clusters, you can check the workload pods and see if they’re using the 1.15.x Istio proxies.
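For example, you can list the control plane revision each pod is attached to (run this on every cluster):
kubectl get pods --namespace <name-of-your-namespace> -L istio.io/rev
The REV column should show cp-v115x.istio-system for the upgraded pods.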
2.5 - Dashboard
Service Mesh Manager provides a dashboard interface that can be used to diagnose any issues with the underlying deployment. This section provides an introduction to the list of available features on this user interface.
Accessing the dashboard
To access the dashboard, set your KUBECONFIG file to the cluster where the Service Mesh Manager control plane is running, then run the following command to open a browser window to the dashboard.
smm dashboard --kubeconfig <path/to/kubeconfig>
If you are executing this command on a remote machine, complete the following additional steps.
-
Check the output of the command and forward the indicated port to your local machine.
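For example, assuming the dashboard is listening on port 50500 on the remote machine, you can forward the port over SSH (the user and host below are placeholders):
ssh -L 50500:127.0.0.1:50500 <user>@<remote-machine>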
-
Open a browser window and navigate to http://127.0.0.1:50500/
.
-
Service Mesh Manager asks for login credentials. To acquire the credentials, follow the instructions on the user interface.
Alternatively, you can complete the following steps.
-
Set your KUBECONFIG file to the cluster where the Service Mesh Manager control plane is running, then run the following command.
smm login --kubeconfig <path/to/kubeconfig>
A temporary login token is displayed. Now you can perform other actions from the command line.
2.5.1 - Dashboard overview
The MENU > OVERVIEW page on the Service Mesh Manager web interface shows information about the traffic in your mesh and the health of your services and workloads.
If your application hasn’t received any traffic yet, there will be no metrics in the system so you won’t see any visualization yet. To send some traffic to your services as a test, see Generate test load.
The page shows the following information and controls:
Metrics
- Requests per second
- Average latency (95th percentile)
- Error rate (for 5XX errors). Client-side errors (with 4XX status code) are not included.
- Clusters (number of clusters in the mesh)
- Services (number of services in the mesh / total number of services)
- Workloads (number of workloads in the mesh / total number of workloads)
- Pods (number of pods in the mesh / total number of pods)
- Validation issues
Dashboards
The OVERVIEW page shows charts about the health of the services and workloads, as well as the aggregated status of your service level objectives (SLOs). Click on the chart to get more information, for example, about the SLO burn rates that display warnings.
The OVERVIEW page also shows the following live Grafana dashboards:
- Requests per second
- Average latency (95th percentile)
- Error rate (for 5XX errors). Client-side errors (with 4XX status code) are not included.
Validations
To check the validation status of your YAML configuration files, select OVERVIEW > VALIDATION ISSUES. For details, see Validation.
2.5.2 - Mesh
The MENU > MESH page on the Service Mesh Manager web interface shows information about your service mesh and the control planes.
The page shows the following real-time information:
The mesh in numbers
- CONTROL PLANES: The number of Istio control planes in the mesh.
- CLUSTERS: The number of clusters in the mesh.
- ISTIO PROXIES MEMORY USAGE: Current memory usage of the Istio proxies (sidecars).
- ISTIO PROXIES CPU USAGE: Current CPU usage of the Istio proxies (sidecars).
- ISTIO PROXIES NOT RUNNING: The Istio proxies (sidecars) that are not running for some reason.
Clusters
Displays basic status information about the clusters in the mesh.
This is mostly useful in the multi-cluster setup when multiple clusters are in the mesh.
Control planes
This section displays information and metrics about the Istio control planes in the mesh, including version and revision information, and validation errors.
Click on a specific control plane to display information about its pods, proxies, trust bundles, validation issues, and metrics, as described in the following sections.
In addition, selecting a control plane also shows the following basic information:
- CLUSTER NAME: The name of the cluster the control plane is running on.
- VERSION: The Istio version of the service mesh.
- ROOT NAMESPACE: The administrative root namespace for Istio configuration of the service mesh.
- TRUST DOMAIN: The list of trust domains.
- AUTOMATIC MTLS: Shows whether automatic mutual TLS is enabled for the service mesh.
- OUTBOUND TRAFFIC POLICY: The default outbound traffic policy for accessing external services set for the mesh. For details, see External Services.
- PROXIES: The number of sidecar proxies in the mesh.
- CONFIG: Click the
icon to display the configuration of the control plane.
Pods
Shows information and status about the pods of the control plane.
Proxies
Lists the proxies managed by the control plane, and the synchronization status of the cluster discovery service (CDS), listener discovery service (LDS), endpoint discovery service (EDS), and route discovery service (RDS) for each proxy.
Trust bundles
Shows the trust bundles defined for the control plane.
Validation issues
Lists the validation issues for the entire control plane.
Metrics
The timeline charts show the version and revision of the Istio proxies used in the mesh, as well as error metrics from the Istio Pilot agent, for example, rejected CDS and EDS configurations. (Istio Pilot agent runs in the sidecar or gateway container and bootstraps Envoy.)
To display more detailed metrics about the resource usage of Istiod and the proxies, click on a control plane in the Control planes section.
2.5.2.1 - Validation
The Service Mesh Manager product:
- simplifies service mesh configuration and management,
- guides you through setting up complex traffic routing rules
- takes care of creating, merging and validating the YAML configuration.
And unlike some other similar products, it works in both directions: you can edit the YAML files manually, and you can still view and manipulate the configuration from Service Mesh Manager. That's possible because there's no intermediate configuration layer in Service Mesh Manager.
To support this bi-directional mesh configuration, Service Mesh Manager provides a validation subsystem for the entire mesh. Istio itself provides some syntactic and semantic validation for the individual Istio resources, but Service Mesh Manager goes even further: it performs complex validations that take the whole cluster state and related resources into account to check whether everything is configured correctly within the whole mesh.
Service Mesh Manager performs many syntactic and semantic validation checks for various aspects of the configuration. The validation checks are constantly curated, and new checks are added with every release. For example:
- Sidecar injection template validation: Validates whether there are any pods in the environment that run with an outdated sidecar proxy image or configuration.
- Gateway port protocol configuration conflict validation: Detects conflicting port configuration in different Gateway resources.
- Multiple gateways with the same TLS certificate validation: Configuring multiple gateways to use the same TLS certificate causes most browsers to produce 404 errors when accessing a second host after a connection to another host has already been established.
Check validation results on the Service Mesh Manager UI
The validations are constantly running in the background. To display the current results, navigate to OVERVIEW > VALIDATION ISSUES. You can use the NAMESPACES field to select the namespaces you want to observe.
To display the invalid part of the configuration in the invalid resource, click the icon.
To display every validation error of a control plane as a list, navigate to MENU > MESH, and click on the control plane in the Control planes section, then select VALIDATIONS. For details, see Validation issues.
Check validation results from the CLI
To check the results of the validation from the CLI, run the smm analyze command. To show only results affecting a specific namespace, use the --namespace option, for example: smm analyze --namespace smm-demo, or smm analyze --namespace istio-system.
The smm analyze command can also produce JSON output, for example:
smm analyze --namespace istio-system -o json
Example output:
{
  "gateway.networking.istio.io:master:istio-system:demo-gw-demo1": [
    {
      "checkID": "gateway/reused-cert",
      "istioRevision": "cp-v115x.istio-system",
      "subjectContextKey": "gateway.networking.istio.io:master:istio-system:demo-gw-demo1",
      "passed": false,
      "error": {},
      "errorMessage": "multiple gateways configured with same TLS certificate"
    }
  ],
  "gateway.networking.istio.io:master:istio-system:demo-gw-demo2": [
    {
      "checkID": "gateway/reused-cert",
      "istioRevision": "cp-v115x.istio-system",
      "subjectContextKey": "gateway.networking.istio.io:master:istio-system:demo-gw-demo2",
      "passed": false,
      "error": {},
      "errorMessage": "multiple gateways configured with same TLS certificate"
    }
  ]
}
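If you script around the validation results, you can post-process this JSON with standard tools. For example, the following command is only a sketch that assumes the jq command line tool is installed and the output structure shown above; it lists the failed checks and their error messages:
smm analyze --namespace istio-system -o json \
  | jq -r 'to_entries[] | .value[] | select(.passed == false) | "\(.checkID): \(.errorMessage)"'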
2.5.3 - Topology view
The MENU > TOPOLOGY page of the Service Mesh Manager web interface displays the topology of services and workloads inside the mesh, and annotates it with real-time information about latency, throughput, and HTTP request failures. You can also display historical data by adjusting the timeline.
The topology view is almost entirely based on metrics: metrics received from Prometheus and enhanced with information from Kubernetes.
The topology page serves as a starting point for diagnosing problems within the mesh. Service Mesh Manager is integrated with Grafana and Jaeger for easy access to in-depth monitoring and distributed traces of various services.
The nodes in the graph are services or workloads, while the arrows represent network connections between different services. This is based on Istio metrics retrieved from Prometheus.
For certain services like MySQL and PostgreSQL, protocol-specific metrics normally not available in Istio are also shown, for example, sessions or transactions per second.
Note: Protocol-specific metrics for MySQL and PostgreSQL are available only for certain versions of MySQL and PostgreSQL. For details, see the documentation of the MySQL and the PostgreSQL Envoy filters.
For example, the following image shows SQL sessions and transactions per second.
The graph serves as a visual monitoring tool, as it displays various errors and metrics in the system. Click the ? icon on the left to show a legend of the graph to better understand what the icons in the graph mean. Virtual machines integrated into the mesh are displayed as workloads with a blue icon in their corner.
If your application hasn’t received any traffic yet, there will be no metrics in the system so you won’t see any visualization yet. To send some traffic to your services as a test, see Generate test load.
Select the data displayed
You can select and configure what is displayed on the graph in the top bar of the screen. You can also display historical data using the TIMELINE.
Namespaces
Display data only for the selected namespaces.
Resources
You can select the type of resources you want to display in the graph. The following resources can be displayed in a cluster: clusters, namespaces, apps, services, and workloads.
Workloads are always shown; they cannot be disabled.
For example, you can display only the apps, services, and workloads.
Showing clusters is important in multi-cloud and hybrid-cloud environments. For details, see Multi-cluster.
Edge labels
The labels on the edges of the graph can display various real-time information about the traffic between services. You can display the following information:
- the protocol used in the communication (HTTP, gRPC, TCP) and the request rate (or throughput for TCP connections),
- actual P95 latency, or
- whether the connection is using mTLS or not.
For certain services like MySQL and PostgreSQL, protocol-specific metrics normally not available in Istio are also shown, for example, sessions or transactions per second.
Note: Protocol-specific metrics for MySQL and PostgreSQL are available only for certain versions of MySQL and PostgreSQL. For details, see the documentation of the MySQL and the PostgreSQL Envoy filters.
Timeline
By default, the graph displays the current data. The timeline view allows you to select a specific point in time, and then move minutes back and forth to see how your most important metrics have changed. For example, you can use it to check how things changed for a specific service, when the error rates went up, or how your latency values changed over time as RPS values increased. This can be a good indicator of where to look for errors in the logs, or whether something else changed in the cluster that can be related to a specific failure.
- To display the timeline, select TIMELINE on the left, then use the timeline bar to adjust the date and the time. The date corresponding to the displayed data is shown below the topology graph.
- To return to the current real-time data, select LIVE.
Drill-down to the pods and nodes
You can drill-down from the MENU > TOPOLOGY page by selecting a service or a workload in the Istio service mesh. You can trace back an issue from the top-level service mesh layer by navigating deeper in the stack, and see the status and most important metrics of your Kubernetes controllers, pods, and nodes.
See Drill-down for details.
2.5.3.1 - Drill-down
You can drill-down from the MENU > TOPOLOGY page by selecting a service or a workload in the Istio service mesh. You can trace back an issue from the top-level service mesh layer by navigating deeper in the stack, and see the status and most important metrics of your Kubernetes controllers, pods, and nodes.
For an example on how you can diagnose and solve a real-life problem using the drill-down view, see the Service Mesh Manager drill-down blog post.
Drill-down from the Topology view
The highest level of the Topology view is the service mesh layer. This level contains the most important network-level metrics, and an overview of the corresponding Kubernetes controllers.
Click on a workload or service to display its details on the right. Workloads running on virtual machines have a blue icon in their corner.
From the details overview, you can drill down to the underlying resources of the infrastructure, as described in the following sections.
To display the metrics-based health of the workload or the service, select the HEALTH tab. You can scroll down to display the charts of the selected metric (for example, saturation, latency, or success rate). Note that for workloads running on virtual machines, the total saturation of the virtual machine is shown.
Service overview
The following details of the service are displayed:
- Namespace: The namespace the service belongs to.
- APP: The application exposed using the service.
- PORTS: The ports where the service is accessible, for example:
  http 8080/TCP → 8080
  grpc 8082/TCP → 8082
  tcp 8083/TCP → 8083
- Services: The services exposed in this resource. Click on the name of the service to display the details of the service.
- Metrics: Dashboards of the most important metrics. Click the Grafana icon to open the related dashboards in Grafana.
- Traces: Click the Jaeger icon to run tracing with Jaeger.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
Details of the workload
The following details of the workload are displayed:
- Namespace: The namespace the workload belongs to.
- APP: The application running in the workload.
- VERSION: The version number of the workload, for example, v2.
- REPLICAS: The number of replicas for the workload.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Controllers: The controllers related to the workload. Click on a controller to display its details.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
Service details
Select a Service in the SERVICE section of a service overview to display the details of the service.
The following details of the service are displayed:
- Namespace: The namespace the service belongs to.
- CLUSTER: The name of the Kubernetes cluster the service belongs to.
- SELECTOR: The label selector used to select the set of Pods targeted by the Service.
- PORTS: The ports where the service is accessible, for example:
  http 8080/TCP → 8080
  grpc 8082/TCP → 8082
  tcp 8083/TCP → 8083
- TYPE: The ServiceType indicates how your service is exposed, for example, ClusterIP or LoadBalancer.
- CLUSTER IP: The IP address corresponding to the ServiceType.
- CREATED: The date when the service was started.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Pods: The list of pods running this service, including their name, number of containers in the pod, and their status. Click on the name of the pod to display the details of the pod.
- Events: Recent events related to the service resource.
Workload controller details
Select a deployment in the CONTROLLER section of a workload to display the details of the deployment. This view contains detailed information about the Kubernetes controller.
While the service mesh layer displays network level metrics and an aggregated view of the corresponding controllers, this view focuses on CPU and memory metrics, and the Kubernetes resources, like related pods or events. It’s also possible that multiple controllers belong to the same service mesh entity, for example, in a shared control plane multi-cluster scenario, when multiple clusters are running controllers that belong to the same logical workload.
The following details of the workload controller are displayed:
- Namespace: The namespace the workload belongs to.
- CLUSTER: The name of the Kubernetes cluster the workload belongs to.
- Kind: The type of the controller, for example, Deployment. If the workload is running on a virtual machine, the kind of the controller is WorkloadGroup.
- APP: The application running in the workload.
- VERSION: The version number of the workload, for example, v2.
- REPLICAS: The number of replicas for the workload.
- CREATED: The date when the workload was started.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Pods: The list of pods running this workload. Click on the name of the pod to display the details of the pod.
- WorkloadEntries: The list of virtual machines running this workload. Click on the name of the WorkloadEntry to display the details of the WorkloadEntry.
- Events: Recent events related to the resource.
Details of the pod
To check the details of a pod, select a pod in the CONTROLLER > POD or the SERVICE > POD section.
The following details of the pod are displayed:
- Namespace: The namespace the pod belongs to.
- CLUSTER: The name of the Kubernetes cluster the pod belongs to.
- NODE: The hostname of the node the pod is running on, for example, ip-192-168-1-1.us-east-2.compute.internal. Click on the name of the node to display the details of the node.
- IP: The IP address of the pod.
- STARTED: The date when the pod was started.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Containers: The list of containers in the pod. Also includes the Name, Image, and Status of the container.
- Events: Recent events related to the resource.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
To display the logs of the pod, click the icon. The pod logs are displayed at the bottom of the screen.
Note: In multi-cluster scenarios, live log streaming is available only for pods running on the primary cluster.
Details of the node
To check the health of a node, select a node in the pod details view. The node view is the deepest layer of the drill-down view and shows information about a Kubernetes node.
The following details of the node that the pod is running on are displayed:
- CLUSTER: The name of the Kubernetes cluster the node belongs to.
- OS: The operating system running on the node, for example: linux amd64 (Ubuntu 18.04.4 LTS)
- STARTED: The date when the node was started.
- TAINTS: The list of Kubernetes taints assigned to the node.
- LABELS: The list of Kubernetes labels assigned to the node.
- Conditions: The status of the node, for example, disk and memory pressure, or network and kubelet status.
- Pods: The list of pods currently running on the node.
- Events: Recent events related to the node.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
Details of the WorkloadEntry
The following details of the WorkloadEntry are displayed:
- NAMESPACE: The namespace the WorkloadEntry belongs to.
- CLUSTER: The name of the Kubernetes cluster the WorkloadEntry belongs to.
- NETWORK: The name of the network the virtual machine running the workload belongs to.
- ADDRESS: The IP address of the virtual machine.
- PORTS: The open ports and related protocols of the WorkloadGroup.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Events: Recent events related to the resource.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
2.5.3.2 - Generate test load
There are several places in the Service Mesh Manager interface where you can’t see anything if your application hasn’t received any traffic yet. For example, if there are no metrics in the system you won’t see any visualization on the MENU > OVERVIEW page.
Generate load on the UI
To send generated traffic to an endpoint, complete the following steps.
- On the Service Mesh Manager web interface, select MENU > TOPOLOGY.
- Click HTTP on the left side of the screen.
- Complete the form with the endpoint details and click SEND to generate test traffic to your services.
- In a few seconds a graph of your services appears.
Generate load in the CLI
If needed, you can generate constant traffic in the demo application by running: smm demoapp load start
2.5.4 - Workloads
You can drill-down from the MENU > TOPOLOGY page by selecting a service or a workload in the Istio service mesh. You can trace back an issue from the top-level service mesh layer by navigating deeper in the stack, and see the status and most important metrics of your Kubernetes controllers, pods, and nodes.
List of workloads
The MENU > WORKLOADS page contains information about the workloads in your service mesh. Above the list of workloads, there is a summary dashboard about the state of your workloads, showing the following information:
- Requests per second: Requests per second for the workloads.
- Average latency: Average latency for the workloads (95th percentile latency in milliseconds).
- Error rate: The percentage of requests returning a 5xx status code. Client-side errors (4xx status codes) are not included.
- Clusters: The number of clusters in the service mesh.
- Workloads: The number of workloads in the mesh compared to the total number of workloads.
- Pods: The number of running pods and the desired number of pods.
The list displays the workloads (grouped by namespaces), and a timeline of the metrics-based health score of each workload. Separate icons indicate Kubernetes workloads and workloads running on virtual machines. You can filter the list to show only the selected namespaces, and display historical data by adjusting the timeline.
- To display the details of a workload, click the name of the workload.
- To open the Grafana dashboards related to the workload, click the Grafana icon.
- To display the detailed health metrics of a workload, click the health indicator of the workload for the selected period.
From the mesh workload overview, you can drill down to the underlying resources of the infrastructure, as described in the following sections.
Workload details
Select a Workload from the list to display its details.
The following details of the workload are displayed:
- NAMESPACE: The namespace the workload belongs to.
- APP: The application running in the workload.
- VERSION: The version number of the workload, for example, v2.
- REPLICAS: The number of replicas for the workload.
- LABELS: The list of Kubernetes labels assigned to the resource.
- HEALTH: Indicates the health score of the workload. Click the chart to display more details.
- Controller: The controllers related to the workload. Click on a controller to display its details.
- CLUSTER: The name of the Kubernetes cluster the workload belongs to.
- KIND: The kind of the workload, for example, DaemonSet, Deployment, ReplicaSet, or StatefulSet.
- Pods: The list of pods running this workload. Click on the name of the pod to display the details of the pod. You can also display and filter logs of the pod.
- Events: Recent events related to the resource.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
Pod details
To check the details of a pod, click the name of a pod in the Pods section.
The following details of the pod are displayed:
- Namespace: The namespace the pod belongs to.
- CLUSTER: The name of the Kubernetes cluster the pod belongs to.
- NODE: The hostname of the node the pod is running on, for example, ip-192-168-1-1.us-east-2.compute.internal. Click on the name of the node to display the details of the node.
- IP: The IP address of the pod.
- STARTED: The date when the pod was started.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Containers: The list of containers in the pod. Also includes the Name, Image, and Status of the container.
- Events: Recent events related to the resource.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
To display the logs of the pod, click the icon. The pod logs are displayed at the bottom of the screen.
Note: In multi-cluster scenarios, live log streaming is available only for pods running on the primary cluster.
Node details
To check the health of a node, select a node in the pod details view. The node view is the deepest layer of the drill-down view and shows information about a Kubernetes node.
The following details of the node that the pod is running on are displayed:
- CLUSTER: The name of the Kubernetes cluster the node belongs to.
- OS: The operating system running on the node, for example: linux amd64 (Ubuntu 18.04.4 LTS)
- STARTED: The date when the node was started.
- TAINTS: The list of Kubernetes taints assigned to the node.
- LABELS: The list of Kubernetes labels assigned to the node.
- Conditions: The status of the node, for example, disk and memory pressure, or network and kubelet status.
- Pods: The list of pods currently running on the node.
- Events: Recent events related to the node.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
WorkloadEntry details
The following details of the WorkloadEntry are displayed:
- NAMESPACE: The namespace the WorkloadEntry belongs to.
- CLUSTER: The name of the Kubernetes cluster the WorkloadEntry belongs to.
- NETWORK: The name of the network the virtual machine running the workload belongs to.
- ADDRESS: The IP address of the virtual machine.
- PORTS: The open ports and related protocols of the WorkloadGroup.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Events: Recent events related to the resource.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
2.5.5 - Services
You can drill-down from the MENU > TOPOLOGY page by selecting a service or a workload in the Istio service mesh. You can trace back an issue from the top-level service mesh layer by navigating deeper in the stack, and see the status and most important metrics of your Kubernetes controllers, pods, and nodes.
List of services
The MENU > SERVICES page contains information about the services in your service mesh. Above the list of services, there is a summary dashboard about the state of your services.
The list displays the services (grouped by namespaces), and a timeline of the metrics-based health score of each service. You can filter the list to show only the selected namespaces, and display historical data by adjusting the timeline.
- To display the details of a service, click the name of the service.
- To open the Grafana dashboards related to the service, click the Grafana icon.
- To run tracing with Jaeger, click the Jaeger icon.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
- To display the detailed health metrics of a service, click the health indicator of the service for the selected period.
From the services list, you can drill down through the following levels to the underlying resources of the infrastructure: List of services > Mesh Service > Service > Pod > Node.
Mesh Service details
- Mesh Service: The multi-cluster mesh service available in the current mesh. Select a service from the SERVICE field to display the details of the service in any of the attached clusters.
- Namespace: The namespace the service belongs to.
- APP: The application exposed using the service.
- PORTS: The ports where the service is accessible, for example:
  http 8080/TCP → 8080
  grpc 8082/TCP → 8082
  tcp 8083/TCP → 8083
- HEALTH: Indicates the health score of the workload. Click the chart to display more details.
- Service Level Objectives: The details of the Service Level Objectives (SLOs) defined for the service. Click on an SLO to display its details.
- Services: The list of services belonging to the mesh service. Click on a service to display its details.
- Metrics: Dashboards of the most important metrics.
  - To open the related Grafana dashboards, click the Grafana icon.
  - To run tracing with Jaeger, click the Jaeger icon.
Display service details
Select a Service from the list to display its details.
- To run tracing with Jaeger, click the Jaeger icon. In case of a multi-cluster setup, you can select which cluster’s data to display.
The following details of the service are displayed:
- Namespace: The namespace the service belongs to.
- CLUSTER: The name of the Kubernetes cluster the service belongs to.
- SELECTOR: The label selector used to select the set of Pods targeted by the Service.
- PORTS: The ports where the service is accessible, for example:
  http 8080/TCP → 8080
  grpc 8082/TCP → 8082
  tcp 8083/TCP → 8083
- TYPE: The ServiceType indicates how your service is exposed, for example, ClusterIP or LoadBalancer.
- CLUSTER IP: The IP address corresponding to the ServiceType.
- CREATED: The date when the service was started.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Pods: The list of pods running this service, including their name, number of containers in the pod, and their status. Click on the name of the pod to display the details of the pod.
- Events: Recent events related to the service resource.
Display pod details
To check the details of a pod, click the name of the pod in the PODS section.
The following details of the pod are displayed:
- Namespace: The namespace the pod belongs to.
- CLUSTER: The name of the Kubernetes cluster the pod belongs to.
- NODE: The hostname of the node the pod is running on, for example, ip-192-168-1-1.us-east-2.compute.internal. Click on the name of the node to display the details of the node.
- IP: The IP address of the pod.
- STARTED: The date when the pod was started.
- LABELS: The list of Kubernetes labels assigned to the resource.
- Containers: The list of containers in the pod. Also includes the Name, Image, and Status of the container.
- Events: Recent events related to the resource.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
To display the logs of the pod, click the icon. The pod logs are displayed at the bottom of the screen.
Note: In multi-cluster scenarios, live log streaming is available only for pods running on the primary cluster.
Display node details
To check the health of a node, select a node in the pod details view. The node view is the deepest layer of the drill-down view and shows information about a Kubernetes node.
The following details of the node that the pod is running on are displayed:
- CLUSTER: The name of the Kubernetes cluster the node belongs to.
- OS: The operating system running on the node, for example: linux amd64 (Ubuntu 18.04.4 LTS)
- STARTED: The date when the node was started.
- TAINTS: The list of Kubernetes taints assigned to the node.
- LABELS: The list of Kubernetes labels assigned to the node.
- Conditions: The status of the node, for example, disk and memory pressure, or network and kubelet status.
- Pods: The list of pods currently running on the node.
- Events: Recent events related to the node.
- Metrics: Dashboards of the most important metrics. Click the icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
2.5.6 - Gateways
The MENU > GATEWAYS page of the Service Mesh Manager web interface allows you to list the gateways of your service mesh, monitor their traffic, and manage their host and port configuration.
Note: Service Mesh Manager uses the concept of IstioMeshGateways, a declarative representation of Istio ingress and egress gateway services and deployments. With the help of IstioMeshGateways, you can set up multiple gateways in a cluster, and use them for different purposes.
To create a new ingress gateway, see Create ingress gateway.
List gateways
To list the gateways of your service mesh, navigate to MENU > GATEWAYS.
For each gateway, the following information is shown:
- Name: The name of the gateway.
- Namespace: The namespace the gateway belongs to.
- Cluster: The cluster the gateway belongs to. Mainly useful in multi-cluster scenarios.
- Type: Type of the gateway. Ingress gateways define an entry point into your Istio mesh for incoming traffic, while egress gateways define an exit point from your Istio mesh for outgoing traffic.
- Open ports: The ports the gateway accepts connections on.
- Hosts: Number of hosts accessible using the gateway.
- Routes: Number of routing rules configured for the ingress traffic.
- Error rate: The rate of requests returning 5xx errors during the last polling interval. Client-side errors (4xx status codes) are not included.
- Requests per second: The number of requests per second during the last polling interval.
- Status: Status of the gateway.
Click the name of a gateway to display the details of the gateway (grouped into several tabs: Overview and host configuration, Routes, Deployment and Service).
To display the YAML configuration of MeshGateways, Gateways, or VirtualServices, click the name of the gateway in the list, then click the icon next to their name.
Monitor upstream traffic
Service Mesh Manager collects upstream metrics like latencies, throughput, RPS, or error rate from Prometheus, and provides a summary for each gateway. It also sets up a Grafana dashboard and displays appropriate charts in-place.
To monitor the upstream traffic of your Istio gateways, complete the following steps.
- Open the Service Mesh Manager web interface, and navigate to MENU > GATEWAYS.
- From the list of gateways, click the gateway you want to monitor.
- On the OVERVIEW tab, scroll down to the METRICS section. The most important metrics of the gateway are displayed on the Service Mesh Manager web interface (for example, upstream requests per second and error rate).
Note: You can also view the details of the service or the deployment related to the gateway.
Click the Grafana icon to open the related dashboards in Grafana.
CAUTION: If you have installed Service Mesh Manager in Anonymous mode, you won’t be able to access the Metrics and Traces dashboards from the UI. Clicking the Grafana or Jaeger icon in anonymous mode causes the RBAC: access denied error message.
Gateway deployment and service details
To display the details, events, and most important metrics of the deployment and service related to a gateway, navigate to MENU > GATEWAYS > <Gateway-to-inspect>, then click SERVICE or DEPLOYMENT.
2.5.6.1 - Create ingress gateway
Overview
Ingress gateways define an entry point into your Istio mesh for incoming traffic.
You can configure gateways using the Gateway and VirtualService custom resources of Istio, and the IstioMeshGateway CR of Service Mesh Manager.
- The Gateway resource describes the port configuration of the gateway deployment that operates at the edge of the mesh and receives incoming or outgoing HTTP/TCP connections. The specification describes a set of ports that should be exposed, the type of protocol to use, the TLS configuration of the exposed ports (if any), and so on. For more information about the Gateway resource, see the Istio documentation.
- The VirtualService resource defines a set of traffic routing rules to apply when a host is addressed. Each routing rule defines matching criteria for the traffic of a specific protocol. If the traffic matches a routing rule, then it is sent to a named destination service defined in the registry. For example, it can route requests to different versions of a service or to a completely different service than was requested. Requests can be routed based on the request source and destination, HTTP paths and header fields, and weights associated with individual service versions. For more information about VirtualServices, see the Istio documentation.
- Service Mesh Manager provides a custom resource called IstioMeshGateway and uses a separate controller to reconcile gateways, allowing you to use multiple gateways in multiple namespaces. That way you can also control who can manage gateways, without having permissions to modify other parts of the Istio mesh configuration.
Using IstioMeshGateway, you can add Istio ingress or egress gateways in the mesh and configure them. When you create a new IstioMeshGateway CR, Service Mesh Manager takes care of configuring and reconciling the necessary resources, including the Envoy deployment and its related Kubernetes service.
Note: Service Mesh Manager automatically creates an ingress gateway called smm-ingressgateway and a mesh expansion gateway called istio-meshexpansion-cp-v115x. The smm-ingressgateway serves as the main entry point for the services of Service Mesh Manager, for example, the dashboard and the API, while the mesh expansion gateway is used in multi-cluster setups to ensure communication between clusters for the Istio control plane and the user services.
Do not use these gateways for user workloads, because they are managed by Service Mesh Manager, and any change to their port configuration will be overwritten. Instead, create a new mesh gateway using the IstioMeshGateway custom resource.
Prerequisites
Auto sidecar injection must be enabled for the namespace of the service you want to make accessible.
Steps
To create a new ingress gateway and expose a service, complete the following steps.
- If you haven’t already done so, create and expose the service you want to make accessible through the gateway. For testing, you can download and apply the following echo service:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
  labels:
    k8s-app: echo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: echo
  template:
    metadata:
      labels:
        k8s-app: echo
    spec:
      terminationGracePeriodSeconds: 2
      containers:
      - name: echo-service
        image: k8s.gcr.io/echoserver:1.10
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: echo
  labels:
    k8s-app: echo
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    targetPort: 8080
  selector:
    k8s-app: echo
kubectl apply -f echo.yaml
Expected output:
deployment.apps/echo created
service/echo created
- Create a new ingress gateway using the IstioMeshGateway resource. Download the following resource and adjust it as needed for your environment:
  CAUTION: By default, the IstioMeshGateway pod runs without root privileges, therefore it cannot use ports under 1024. Either use ports above 1024 as target ports (for example, 8080 instead of 80), or run the gateway pod with root privileges by setting spec.runAsRoot: true in the IstioMeshGateway custom resource.
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioMeshGateway
metadata:
  name: demo-gw
spec:
  istioControlPlane:
    name: cp-v115x
    namespace: istio-system
  runAsRoot: false
  service:
    ports:
    - name: tcp-status-port
      port: 15021
      protocol: TCP
      targetPort: 15021
    - name: http
      port: 80
      protocol: TCP
      targetPort: 8080
    type: LoadBalancer
  type: ingress
- Apply the IstioMeshGateway resource. Service Mesh Manager creates a new ingress gateway deployment and a corresponding service, and automatically labels them with the gateway-name and gateway-type labels and their corresponding values.
kubectl apply -f meshgw.yaml
Expected output:
istiomeshgateway.servicemesh.cisco.com/demo-gw created
- Get the IP address of the gateway. (Adjust the name and namespace of the IstioMeshGateway as needed for your environment.)
kubectl -n default get istiomeshgateways demo-gw
The output should be similar to:
NAME TYPE SERVICE TYPE STATUS INGRESS IPS ERROR AGE CONTROL PLANE
demo-gw ingress LoadBalancer Available ["3.10.16.232"] 107s {"name":"cp-v115x","namespace":"istio-system"}
- Create the Gateway and VirtualService resources to configure listening ports on the matching gateway deployment. Make sure to adjust the hosts fields to the external hostname of the service. (You should manually set an external hostname that points to these addresses, but for testing purposes you can use, for example, nip.io, which is a domain name that provides wildcard DNS for any IP address.)
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: echo
  namespace: default
spec:
  selector:
    gateway-name: demo-gw
    gateway-type: ingress
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "echo.3.10.16.232.nip.io"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: echo
  namespace: default
spec:
  hosts:
  - "echo.3.10.16.232.nip.io"
  gateways:
  - echo
  http:
  - route:
    - destination:
        port:
          number: 80
        host: echo.default.svc.cluster.local
kubectl apply -f gwvs.yaml
Expected output:
gateway.networking.istio.io/echo created
virtualservice.networking.istio.io/echo created
- Access the service on the external address.
curl -i echo.3.10.16.232.nip.io
The output should be similar to:
HTTP/1.1 200 OK
date: Mon, 07 Mar 2022 19:22:15 GMT
content-type: text/plain
server: istio-envoy
x-envoy-upstream-service-time: 1
Hostname: echo-68578cf9d9-874rz
...
IstioMeshGateway CR reference
This section describes the fields of the IstioMeshGateway custom resource.
apiVersion (string)
Must be servicemesh.cisco.com/v1alpha1
kind (string)
Must be IstioMeshGateway
spec (object)
The configuration and parameters of the IstioMeshGateway.
spec.type (string, required)
Type of the mesh gateway. Ingress gateways define an entry point into your Istio mesh for incoming traffic, while egress gateways define an exit point from your Istio mesh for outgoing traffic. Possible values: ingress, egress.
spec.istioControlPlane (object, required)
Specifies the IstioControlPlane CR that the istio-proxy connects to, by namespaced name. When upgrading to a new Istio version (and thus to a new control plane), update this field.
For example:
spec:
  istioControlPlane:
    name: cp-v115x
    namespace: istio-system
spec.deployment (object)
Configuration options for the Kubernetes istio-proxy deployment. Metadata such as labels and annotations can be set here for the deployment or its pods as well, in spec.deployment.metadata.annotations or spec.deployment.podMetadata.annotations.
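For example, a possible configuration (the label and annotation keys and values below are placeholders, not required names):
spec:
  deployment:
    metadata:
      labels:
        team: gateway-operators            # placeholder deployment label
    podMetadata:
      annotations:
        example.com/log-level: info        # placeholder pod annotation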
spec.service (object, required)
Configuration options for the Kubernetes service. Annotations can be set in spec.service.metadata.annotations; they are often useful with cloud load balancers, for example to specify additional configuration for AWS.
For example:
service:
  ports:
  - name: tcp-status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  type: LoadBalancer
spec.runAsRoot (true | false)
Whether to run the gateway in a privileged container. If not running as root, only ports higher than 1024 can be opened on the container. Default value: false
2.5.6.2 - Create egress gateway
Egress traffic
Traffic that’s outbound from a pod that has an Istio sidecar also passes through that sidecar container (more precisely, through the Envoy proxy in the sidecar). Therefore, the accessibility of external services depends on the configuration of that Envoy proxy.
By default, Istio configures the Envoy proxy to enable requests for unknown services. Although this provides a convenient way of getting started with Istio, it’s generally a good idea to put stricter controls in place.
Allow only registered access
You can configure Service Mesh Manager to permit access only to specific external services. For details, see External Services.
Egress gateway
Egress gateways define an exit point from your Istio mesh for outgoing traffic. Egress gateways also allow you to apply Istio features on the traffic that exits the mesh, for example monitoring, routing rules, or retries.
Why do you need egress gateways? For example:
- Your organization requires some, or all, outbound traffic to go through dedicated nodes. These nodes could be separated from the rest of the nodes for the purposes of monitoring and policy enforcement.
- The application nodes of a cluster don’t have public IPs, so the in-mesh services that run on them cannot access the internet directly. Allocating public IPs to the egress gateway nodes and routing egress traffic through the gateway allows for controlled access to external services.
CAUTION: Using an egress gateway doesn’t restrict outgoing traffic; it only routes it through the egress gateway. To limit access to selected external services only, see External Services.
Create egress gateway
To create an egress gateway and route egress traffic through it, complete the following steps.
Note: The YAML samples work with the Service Mesh Manager demo application. Adjust their parameters (for example, namespace, service name, and so on) as needed for your environment.
CAUTION: Using an egress gateway doesn’t restrict outgoing traffic; it only routes it through the egress gateway. To limit access to selected external services only, see External Services.
- Create an egress gateway using the IstioMeshGateway resource.
  CAUTION: By default, the IstioMeshGateway pod runs without root privileges, therefore it cannot use ports under 1024. Either use ports above 1024 as target ports (for example, 8080 instead of 80), or run the gateway pod with root privileges by setting spec.runAsRoot: true in the IstioMeshGateway custom resource.
  For testing, you can download and apply the following resource to create a new egress gateway deployment and a corresponding service in the smm-demo namespace.
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioMeshGateway
metadata:
  name: egress-demo
  namespace: smm-demo
spec:
  istioControlPlane:
    name: cp-v115x
    namespace: istio-system
  deployment:
    replicas:
      max: 1
      min: 1
  service:
    ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: 8080
    type: ClusterIP
  type: egress
kubectl apply -f egress-meshgateway.yaml
Expected output:
istiomeshgateway.servicemesh.cisco.com/egress-demo created
- Create a Gateway resource for the egress gateway. The Gateway resource connects the Istio configuration resources and the deployment of a matching gateway. Apply the following Gateway resource to configure the outbound port (80 in the previous example) on the egress gateway that you defined in the previous step.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: egress-demo
  namespace: smm-demo
spec:
  selector:
    gateway-name: egress-demo
    gateway-type: egress
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
kubectl apply -f egress-gateway.yaml
Expected output:
gateway.networking.istio.io/egress-demo created
- Define a VirtualService resource to direct traffic from the sidecars to the egress gateway. Apply the following VirtualService to direct traffic from the sidecars to the egress gateway, and also from the egress gateway to the external service. Edit the VirtualService to match the external service you want to permit access to.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin-egress
  namespace: smm-demo
spec:
  hosts:
  - httpbin.org
  gateways:
  - egress-demo
  - mesh
  http:
  - match:
    - gateways:
      - mesh
      port: 80
    route:
    - destination:
        host: egress-demo.smm-demo.svc.cluster.local
        port:
          number: 80
  - match:
    - gateways:
      - egress-demo
      port: 80
    route:
    - destination:
        host: httpbin.org
        port:
          number: 80
kubectl apply -f egress-virtualservice.yaml
Expected output:
virtualservice.networking.istio.io/httpbin-egress created
- Test access to the external service. If you have installed the Service Mesh Manager demo application and used the examples in the previous steps, you can run the following command to start requests from the notifications-v1 workload to the external httpbin service:
kubectl -n smm-demo set env deployment/notifications-v1 'REQUESTS=http://httpbin.org/get#1'
- If everything is set up correctly, the new egress gateway appears on the MENU > GATEWAYS page.
- If there is egress traffic, the gateway appears on the MENU > GATEWAYS page (make sure to select the relevant namespace). Note that the traffic from the gateway to the external service is visible only if you create a ServiceEntry resource for the external service (a sketch is shown after these steps).
- If needed, permit access only to specific external services. For details, see External Services.
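The following is a minimal sketch of such a ServiceEntry for httpbin.org (the resource name is a placeholder; adjust the namespace, ports, and resolution to your environment). It registers the external host in the mesh so that traffic leaving through the egress gateway towards it becomes visible:
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-org          # placeholder name
  namespace: smm-demo
spec:
  hosts:
  - httpbin.org
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
  - number: 80
    name: http
    protocol: HTTP
Apply it with kubectl apply -f, just like the other resources in this procedure.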
IstioMeshGateway CR reference
This section describes the fields of the IstioMeshGateway custom resource.
apiVersion (string)
Must be servicemesh.cisco.com/v1alpha1
kind (string)
Must be IstioMeshGateway
spec (object)
The configuration and parameters of the IstioMeshGateway.
spec.type (string, required)
Type of the mesh gateway. Ingress gateways define an entry point into your Istio mesh for incoming traffic, while egress gateways define an exit point from your Istio mesh for outgoing traffic. Possible values: ingress, egress.
spec.istioControlPlane (object, required)
Specifies the IstioControlPlane CR that the istio-proxy connects to, by namespaced name. When upgrading to a new Istio version (and thus to a new control plane), update this field.
For example:
spec:
  istioControlPlane:
    name: cp-v115x
    namespace: istio-system
spec.deployment (object)
Configuration options for the Kubernetes istio-proxy deployment. Metadata such as labels and annotations can be set here for the deployment or its pods as well, in spec.deployment.metadata.annotations or spec.deployment.podMetadata.annotations.
spec.service (object, required)
Configuration options for the Kubernetes service. Annotations can be set in spec.service.metadata.annotations; they are often useful with cloud load balancers, for example to specify additional configuration for AWS.
For example:
service:
  ports:
  - name: tcp-status-port
    port: 15021
    protocol: TCP
    targetPort: 15021
  - name: http
    port: 80
    protocol: TCP
    targetPort: 8080
  type: LoadBalancer
spec.runAsRoot (true | false)
Whether to run the gateway in a privileged container. If not running as root, only ports higher than 1024 can be opened on the container. Default value: false
2.5.6.3 - Manage host and port configuration
Service Mesh Manager understands the Gateway CRs of Istio and the gateway’s service configuration in Kubernetes (with the help of the MeshGateway CR), so it can display information about ports, hosts, and protocols that are configured on a specific gateway.
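If you prefer the command line, you can list the same gateways with kubectl (assuming your kubeconfig points to the cluster where Service Mesh Manager is installed):
kubectl get istiomeshgateways --all-namespaces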
- Open the Service Mesh Manager web interface, and navigate to MENU > GATEWAYS.
- From the list of gateways, click the gateway you want to monitor.
- You can see the host and port configurations on the OVERVIEW tab, in the Ports & Hosts section.
The following information is shown about each entry point.
- GATEWAY NAME: The name of the gateway.
- GATEWAY NAMESPACE: The namespace the gateway belongs to.
- PORT: The list of open ports on the gateway.
- PROTOCOL: The protocols permitted on the gateway.
- HOSTS: The host selector that determines which hosts are accessible using the gateway.
- TLS: The TLS settings applying to the gateway.
- To modify an existing route, click the edit icon, change the settings as needed, then click APPLY.
- To delete a route, click the delete icon.
- To create a new entry point, click CREATE NEW.
Create new ingress entry point
You can set up a new entry point for your Istio ingress gateways, and Service Mesh Manager translates your configuration to declarative custom resources.
- Navigate to MENU > GATEWAYS > <Gateway-to-modify> > OVERVIEW.
- In the Ports & Hosts section, click CREATE NEW.
- Set the parameters of the entry point. As a minimum, you must set the port number for the entry point, and the protocol (for example, HTTP, HTTPS, or GRPC) that is accepted at the entry point.
  Note: DNS resolution is not managed by Service Mesh Manager. Once you’ve configured ingress for a particular service, Service Mesh Manager displays the IP address of the ingress gateway service. Do not forget to create the corresponding DNS records that point to this IP address.
- Click CREATE.
Gateway TLS settings
When setting up a service on a gateway with TLS, you need to configure a certificate for the host(s). You can do that by bringing your own certificate, storing it in a Kubernetes secret, and configuring it for a gateway server. This works for simple use cases, but involves many manual steps when obtaining or renewing a certificate. The Automated Certificate Management Environment (ACME) protocol automates these kinds of interactions with the certificate provider.
ACME is most widely used with Let’s Encrypt and - when in a Kubernetes environment - cert-manager. Service Mesh Manager helps you set up cert-manager, and you can quickly obtain a valid Let’s Encrypt certificate through the dashboard with a few clicks.
Note: For details on using your own domain name with Let’s Encrypt, see Using Let’s Encrypt with your own domain name.
To set TLS encryption for a gateway, complete the following steps.
- Navigate to MENU > GATEWAYS > <Gateway-to-modify> > OVERVIEW.
- In the Ports & Hosts section, click the edit icon in the row of the gateway you want to modify.
- Set PORT PROTOCOL to HTTPS.
- Decide how you want to provide the certificate for the gateway.
  - To use Let’s Encrypt, select USE LET’S ENCRYPT FOR TLS, then enter a valid email address into the CONTACT EMAIL field. The provided email address is used to notify you about expirations and to communicate about any issues specific to your account.
  - To use a custom certificate, upload the certificate as a Kubernetes secret, then set the name of the secret in the TLS SECRET NAME field. Note that currently you cannot upload the certificate from the Service Mesh Manager UI, use regular Kubernetes tools instead (see the example after these steps).
- Click CREATE.
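For example, if you already have a certificate and a private key file, you can create the TLS secret with kubectl. The secret name, namespace, and file paths below are placeholders; use the secret name that you enter in the TLS SECRET NAME field, and create the secret in the namespace expected by your gateway configuration:
kubectl create secret tls my-gateway-cert \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key \
  --namespace=istio-system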
2.5.6.4 - Routes and traffic management
Note: This section describes the routing rules of ingress gateways. To configure routing rules for in-mesh services, see Routing.
One of the main reasons to use Istio gateways instead of native Kubernetes ingress is that you can use VirtualServices to configure the routing of incoming traffic, just like for in-mesh routes. You can apply Istio concepts to incoming requests, like redirects, rewrites, timeouts, retries, or fault injection.
Service Mesh Manager displays routes and their related configuration on the Gateways page, and gives you the ability to configure routing. Service Mesh Manager translates the inputs to Istio CRs (mainly VirtualServices), then validates and applies them to the cluster.
The MENU > GATEWAYS > <Gateway-to-inspect> > ROUTES page displays the following information about the routes of the gateway.
- VIRTUAL SERVICE: The name of the VirtualService resource for the gateway. To display the YAML configuration of the VirtualService, click the icon next to its name.
- GATEWAYS: The names of gateways and sidecars that apply these routes.
- HOSTS: The host selector that determines which hosts are accessible using the route.
- MATCH: The route applies only to requests matching this expression.
- DESTINATION: The destinations of the routing rule.
- ACTIONS: Any special action related to the route (for example, rewrite).
- PROTOCOL: The protocol permitted in the route.
- To modify an existing route, click the edit icon.
- To delete a route, click the delete icon.
- To create a new route, click CREATE NEW.
Routing rule precedence
Note the following points about how Service Mesh Manager evaluates the routing rules:
- Rules are evaluated in top-down order.
- Rules that match any traffic are always the last to help avoid rule shadowing.
- Changing the order of rules is not supported in Service Mesh Manager.
When you specify multiple MATCH arguments, they have a logical OR relationship: the rule matches any request that fits one of the match rules. Within a match rule, you can specify multiple rules that have an AND relation. That way you can match requests against a specific URL and an HTTP method, for example.
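To illustrate these semantics in Istio terms, the following VirtualService fragment is only a sketch (the resource name, host, and gateway are placeholders): the two entries under match are combined with OR, while the method and port conditions inside the second entry are combined with AND, so the rule matches any GET request, as well as PUT requests received on port 8080.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: demo-routes              # placeholder name
  namespace: default
spec:
  hosts:
  - "echo.example.com"           # placeholder host
  gateways:
  - demo-gw                      # placeholder gateway
  http:
  - match:
    - method:
        exact: GET               # first match entry: any GET request
    - method:
        exact: PUT               # second match entry: PUT requests...
      port: 8080                 # ...received on port 8080
    route:
    - destination:
        host: echo.default.svc.cluster.local
        port:
          number: 80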
Create a new route
To create a new route on the gateway, or to apply redirects, rewrites, or other changes to the incoming requests, complete the following steps.
Note: Rules are evaluated in top-down order. For more details, see Rule precedence.
- Navigate to MENU > GATEWAYS > <Gateway-to-modify> > ROUTES.
- Click CREATE NEW.
- Select the gateways that should apply this rule in the GATEWAYS field.
- By default, the new rule matches every incoming request. Click ADD CUSTOM MATCH to select only specific traffic for the rule based on scheme, method, URI, host, port, or authority.
  When you specify multiple MATCH arguments, they have a logical OR relationship: the rule matches any request that fits one of the match rules. Within a match rule, you can specify multiple rules that have an AND relation. That way you can match requests against a specific URL and an HTTP method at the same time. For example, the following rule matches GET requests, and PUT requests received on port 8080.
- Set the action you want to execute on the matching requests.
  - You can Route the requests to a specific service. To route a portion of the traffic to a different destination, select ADD DESTINATION and use the WEIGHT parameter to split the traffic between multiple destination services.
    Note: If you want to mirror the traffic (that is, send the same requests to multiple destinations), see Mirroring.
  - Alternatively, you can Redirect the traffic to a specific URI. Redirect rules overwrite the Path portion of the URL with this value. Note that the entire path is replaced, irrespective of the request URI being matched as an exact path or prefix. To overwrite the Authority/Host portion of the URL, set the AUTHORITY field as well.
- Set TIMEOUT and RETRY options as needed for your environment.
- Click Apply. The new rule appears on the ROUTES tab. You can later edit or delete the routing rule by clicking the edit or delete icons, respectively.
2.5.6.5 - Using Let's Encrypt with your own domain name
The following procedure shows you how to set up an encrypted HTTPS port under your own domain name for your services, and obtain a matching certificate from Let’s Encrypt.
This requires solving the ACME HTTP-01 challenge, which involves routing an HTTP request from the ACME server (the Certificate Authority) to the cert-manager challenge-solver pod.
Complete the following steps.
- Open the Service Mesh Manager web interface, and navigate to MENU > GATEWAYS > OVERVIEW.
- Select the gateway you want to secure. Note that the SERVICE TYPE of the gateway must be LoadBalancer. The load balancer determines the IP address(es) to be used for the ACME HTTP-01 challenge. In the following example, it’s istio-meshexpansion-gateway-cp-v115x.
- Point your domain name to the IP address or DNS name found in the ADDRESS field.
- Configure the ingress gateway.
  - In the Ports & Hosts section, click CREATE NEW in the upper right corner.
  - Select the HTTPS protocol and the port you want to accept incoming connections on (typically 443).
  - Enter your domain name into the HOSTS field. To enter multiple domain names, press Enter after each one.
  - Select Use Let’s Encrypt for TLS to get a certificate for your domain from Let’s Encrypt.
  - Enter your email address. This address is forwarded to Let’s Encrypt and is used for ACME account management.
  - Click CREATE.
- Two more items appear in the Ports & Hosts list for your domain name:
  - one on the HTTPS port (for example, 443) for the incoming connection requests, and
  - the other on port 80 for solving the ACME HTTP-01 challenge.
  A warning icon shows if the HTTPS port is not valid yet.
- Wait while the certificate arrives. After a short while, the item with port 80 and protocol HTTP disappears, and a green check mark appears next to HTTPS. This shows that the certificate has been issued and is used to secure your domain.
- Set up routing for your service. Use the gateway, host, and port number you provided in this procedure. For details, see Routes and traffic management.
- Test that your service can be accessed, and that it shows the proper certificate (see the example after these steps).
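For example, assuming your domain is www.example.com (a placeholder), you can verify the certificate from the command line:
curl -vI https://www.example.com
In the verbose output, check that the certificate issuer is Let’s Encrypt and that the request returns the expected HTTP response.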
2.5.7 - Traffic tap
The traffic tap feature of Service Mesh Manager enables you to monitor live access logs of the Istio sidecar proxies.
Each sidecar proxy outputs access information for the individual HTTP requests or HTTP/gRPC streams.
The access logs contain information about the:
- reporter proxy,
- source and destination workloads,
- request,
- response, as well as the
- timings.
Note: For workloads that are running on virtual machines, the name of the pod is the hostname of the virtual machine.
Traffic tap using the UI
Traffic tap is also available from the dashboard. To use it, complete the following steps.
- Select MENU > TRAFFIC TAP.
- Select the reporter (namespace or workload) from the REPORTING SOURCE field.
- Click FILTERS to set additional filters, for example, HTTP method, destination, status code, or HTTP headers.
- Click START STREAMING.
- Select an individual log to see its details.
- After you are done, click PAUSE STREAMING.
Traffic tap using the CLI
These examples work out of the box with the demo application packaged with Service Mesh Manager.
Change the service name and namespace to match your service.
To watch the access logs for an individual namespace, workload or pod, use the smm tap
command. For example, to tap the smm-demo namespace, run:
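smm tap ns/smm-demo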
The output should be similar to:
✓ start tapping max-rps=0
2022-04-25T10:56:47Z outbound frontpage-v1-776d76965-b7w76 catalog-v1-5864c4b7d7-j5cmf "http GET / HTTP11" 200 121.499879ms "tcp://10.10.48.169:8080"
2022-04-25T10:56:47Z outbound frontpage-v1-776d76965-b7w76 catalog-v1-5864c4b7d7-j5cmf "http GET / HTTP11" 200 123.066985ms "tcp://10.10.48.169:8080"
2022-04-25T10:56:47Z inbound bombardier-66786577f7-sgv8z frontpage-v1-776d76965-b7w76 "http GET / HTTP11" 200 145.422013ms "tcp://10.20.2.98:8080"
2022-04-25T10:56:47Z outbound frontpage-v1-776d76965-b7w76 catalog-v1-5864c4b7d7-j5cmf "http GET / HTTP11" 200 129.024302ms "tcp://10.10.48.169:8080"
2022-04-25T10:56:47Z outbound frontpage-v1-776d76965-b7w76 catalog-v1-5864c4b7d7-j5cmf "http GET / HTTP11" 200 125.462172ms "tcp://10.10.48.169:8080"
2022-04-25T10:56:47Z inbound bombardier-66786577f7-sgv8z frontpage-v1-776d76965-b7w76 "http GET / HTTP11" 200 143.590923ms "tcp://10.20.2.98:8080"
2022-04-25T10:56:47Z outbound frontpage-v1-776d76965-b7w76 catalog-v1-5864c4b7d7-j5cmf "http GET / HTTP11" 200 121.868301ms "tcp://10.10.48.169:8080"
2022-04-25T10:56:47Z inbound bombardier-66786577f7-sgv8z frontpage-v1-776d76965-b7w76 "http GET / HTTP11" 200 145.090036ms "tcp://10.20.2.98:8080"
...
Filter on workload or pod
You can tap into specific workloads and pods, for example:
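# These examples assume that the workload/ and pod/ selectors follow the same
# pattern as the ns/ selector shown above; run 'smm tap --help' to confirm the
# exact syntax for your version.
smm tap workload/smm-demo/frontpage-v1
smm tap pod/smm-demo/frontpage-v1-776d76965-b7w76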
At large volume it’s difficult to find the relevant or problematic logs, but you can use filter flags to display only the relevant lines, for example:
# Show only server errors
smm tap ns/smm-demo --method GET --response-code 500,599
The output can be similar to:
2020-02-06T14:00:13Z outbound frontpage-v1-57468c558c-8c9cb bookings:8080 " GET / HTTP11" 503 173.099µs "tcp://10.10.111.111:8080"
2020-02-06T14:00:18Z outbound frontpage-v1-57468c558c-8c9cb bookings:8080 " GET / HTTP11" 503 157.164µs "tcp://10.10.111.111:8080"
2020-02-06T14:00:19Z outbound frontpage-v1-57468c558c-4w26k bookings:8080 " GET / HTTP11" 503 172.541µs "tcp://10.10.111.111:8080"
2020-02-06T14:00:15Z outbound frontpage-v1-57468c558c-8c9cb bookings:8080 " GET / HTTP11" 503 165.05µs "tcp://10.10.111.111:8080"
2020-02-06T14:00:15Z outbound frontpage-v1-57468c558c-8c9cb bookings:8080 " GET / HTTP11" 503 125.671µs "tcp://10.10.111.111:8080"
2020-02-06T14:00:19Z outbound frontpage-v1-57468c558c-8c9cb bookings:8080 " GET / HTTP11" 503 101.701µs "tcp://10.10.111.111:8080"
You can also change the output format to JSON, and use the jq
command line tool to further filter or map the log entries, for example:
# Show pods with a specific user-agent
smm tap ns/smm-demo -o json | jq 'select(.request.userAgent=="fasthttp") | .source.name'
The output can be similar to:
"payments-v1-7c955bccdd-vt2pg"
"bookings-v1-7d8d76cd6b-f96tm"
"bookings-v1-7d8d76cd6b-f96tm"
"payments-v1-7c955bccdd-vt2pg"
"bookings-v1-7d8d76cd6b-f96tm"
2.5.8 - Managing Kafka clusters
The Streaming Data Manager dashboard is integrated into the Service Mesh Manager dashboard, allowing you to manage and overview your Apache Kafka deployments, including brokers, topics, ACLs, and more. For details on using the dashboard features related to Apache Kafka, see Dashboard.
2.5.9 - Configuring the Dashboard
2.5.9.1 - Exposing the Dashboard
By default, Service Mesh Manager relies on Kubernetes' built-in authentication and proxying capabilities to allow our users to access the Dashboard. In some cases, it makes sense to allow developers to access the Dashboard via a public URL, to make distributing Service Mesh Manager client binaries easier.
You can download the Service Mesh Manager client binaries from the login page:
Alternatively, the deployment can use an OIDC-compliant External Provider for authentication, so that there is no need to download and install the CLI binary.
Expose the dashboard
While planning to expose the dashboard, consider the following:
- Does the Kubernetes cluster running Service Mesh Manager support LoadBalancer typed services natively? If not, see exposing via NodePort.
- Where to terminate the TLS connections? (Should it be terminated by Istio inside the cluster, or should it be terminated by an external LoadBalancer?)
- How to manage the TLS certificate for the dashboard? (Do you want to use Let’s Encrypt for certificates, or does your organization have its own certificate authority?)
For some of the examples, we assume that the externalDNS controller is installed and functional on the cluster. If not, make sure to manually set up the required DNS record based on your deployment.
This document covers a few scenarios to address the setups based on the answers to the previous questions.
Recommended setup
In this scenario, we are assuming that:
- Your Kubernetes cluster supports LoadBalancer typed services to expose services externally.
- You use Istio to terminate the TLS connections inside the cluster.
- You want to use Let’s Encrypt to manage the certificates.
- The externalDNS controller is operational on the cluster.
The dashboard will be exposed on the domain name smm.example.org
. To expose Service Mesh Manager on that URL, add the following to the Service Mesh Manager ControlPlane resource:
cat > enable-dashboard-expose.yaml <<EOF
spec:
smm:
exposeDashboard:
meshGateway:
enabled: true
service:
annotations:
external-dns.alpha.kubernetes.io/hostname: smm.example.org.
tls:
enabled: true
letsEncrypt:
dnsNames:
- smm.example.org
enabled: true
# server: https://acme-staging-v02.api.letsencrypt.org/directory
EOF
kubectl patch controlplane --type=merge --patch "$(cat enable-dashboard-expose.yaml )" smm
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
The dashboard is now available on the https://smm.example.org/
URL.
Note: When externalDNS is not present on the cluster, make sure that the external name of the MeshGateway
service is assigned to the right DNS name. Otherwise, Certificate
requests will fail. To check the IP address/name of the service, run the kubectl get service smm-ingressgateway-external --namespace smm-system
command. The output should be similar to:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
smm-ingressgateway-external LoadBalancer 10.10.157.144 afd8bac546b1e46faab0e284fa0dc5da-580525876.eu-north-1.elb.amazonaws.com 15021:30566/TCP,80:32436/TCP,443:30434/TCP 20h
Terminate TLS on the LoadBalancer
To terminate TLS on the LoadBalancer, in the Service Mesh Manager ControlPlane resource you must set the .spec.smm.exposeDashboard.meshGateway.tls.enabled
value to false
.
If the Kubernetes Service requires additional annotations to enable TLS, add these annotations to the ControlPlane resource. For example, for AWS/EKS you can use the following settings to terminate TLS with AWS Certificate Manager:
cat > enable-dashboard-expose.yaml <<EOF
spec:
smm:
exposeDashboard:
meshGateway:
enabled: true
service:
annotations:
service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:{region}:{user id}:certificate/{id}
service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
external-dns.alpha.kubernetes.io/hostname: smm.example.org.
tls:
enabled: true
externalTermination: true
EOF
kubectl patch controlplane --type=merge --patch "$(cat enable-dashboard-expose.yaml )" smm
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
Note: In the previous example, the externalTermination: true setting instructs Service Mesh Manager to expose a plain HTTP endpoint on port 443 so that the external LoadBalancer can terminate TLS for that port too.
Using NodePort
In this setup the LoadBalancer is managed externally. Each worker node exposes the configured ports, and you can create a LoadBalancer that points to the relevant port on every worker node.
To expose the SMM service through NodePorts, run the following command. This example exposes HTTP on port 40080 and HTTPS on port 40443 of every worker node.
Note: The HTTPS port is only available if the TLS settings are explicitly enabled; this example omits that part. Either use the TLS settings from the LoadBalancer example, or check the section on user-provided TLS settings.
cat > enable-dashboard-expose.yaml <<EOF
spec:
smm:
exposeDashboard:
meshGateway:
enabled: true
service:
type: NodePort
nodePorts:
http: 40080
https: 40443
EOF
kubectl patch controlplane --type=merge --patch "$(cat enable-dashboard-expose.yaml )" smm
After that, set up the LoadBalancer and the DNS names manually.
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
Expose using custom TLS credentials
You can provide your own TLS credentials in a Kubernetes secret (in this example, a secret called my-own-secret in the smm-system namespace). The following command configures the system to use that secret for in-cluster TLS termination:
cat > enable-dashboard-expose.yaml <<EOF
spec:
smm:
exposeDashboard:
meshGateway:
enabled: true
tls:
enabled: true
credentialName: "my-own-secret"
EOF
kubectl patch controlplane --type=merge --patch "$(cat enable-dashboard-expose.yaml )" smm
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
Known limitations in HTTP access
As a security measure, Service Mesh Manager operates only over HTTPS when exposed via an external URL. Make sure that somewhere in the traffic chain some component (Istio or LoadBalancer) terminates the TLS connections, otherwise every login attempt to the dashboard will fail.
2.5.9.2 - OIDC authentication
Service Mesh Manager allows for authenticating towards an OIDC External Provider instead of relying on the kubeconfig
based authentication. This is useful when your organization already has an existing OIDC Provider that is used for user authentication on the Kubernetes clusters.
Since Service Mesh Manager does not require the Kubernetes cluster to be relying on OIDC authentication, you (or the operator of the cluster) might need to set up additional Groups in the cluster (for details, see the Setting up user permissions).
If your organization uses a central authentication database which is not OIDC compliant, check out Dex. Dex can act as an OIDC provider and supports LDAP, GitHub, or any OAuth2 identity provider as a backend. For an example on setting up Service Mesh Manager to use GitHub authentication using Dex, see Using Dex for authentication.
Note: Even if OIDC is enabled in Service Mesh Manager, you can access Service Mesh Manager from the command line by running smm dashboard
. This is a fallback authentication/access method in case the OIDC provider is down.
Prerequisites
Before starting to set up OIDC authentication, make sure that you have already:
Enable OIDC authentication
To enable the OIDC authentication, patch the ControlPlane resource with the following settings:
cat > oidc-enable.yaml <<EOF
spec:
smm:
auth:
oidc:
enabled: true
client:
id: ${OIDC_CLIENT_ID}
issuerURL: https://${IDENTITY_PROVIDER_EXTERNAL_URL}
secret: ${OIDC_CLIENT_SECRET}
EOF
Where:
${OIDC_CLIENT_ID}
is the client id obtained from the External OIDC Provider of your organization.
${OIDC_CLIENT_SECRET}
is the client secret obtained from the External OIDC Provider of your organization.
${IDENTITY_PROVIDER_EXTERNAL_URL}
is the URL of the External OIDC Provider of your organization.
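Apply the patch to the ControlPlane resource:
kubectl patch controlplane --type=merge --patch "$(cat oidc-enable.yaml)" smm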
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
After this change, the dashboard will allow logging in using an External OIDC Provider:
Set up user and group mappings
After completing the previous step, the users will be able to authenticate via OIDC. However, Service Mesh Manager needs to map them to Kubernetes users. As Service Mesh Manager uses Kubernetes RBAC for access control, it relies on the same mapping as the Kubernetes API Server’s OIDC authentication backend.
You can use the following settings in the ControlPlane resource:
spec:
smm:
auth:
oidc:
username:
claim: # Claim to take the username from
          prefix: # Prepend this prefix to all usernames
groups:
claim: # Claim to take the user's groups from
          prefix: # Prepend this prefix to all group names the user has
requiredClaims:
<CLAIM>: "<VALUE>" # Only allow authentication if the given claim has the specified value
If the target cluster has OIDC enabled, the following table helps map the OIDC options of the API server to the settings of Service Mesh Manager:

| API Server Setting | Description | ControlPlane setting |
|---|---|---|
| --oidc-issuer-url | URL of the provider which allows the API server to discover public signing keys. Only URLs which use the https:// scheme are accepted. This URL should point to the level below .well-known/openid-configuration | .spec.smm.auth.oidc.client.issuerURL |
| --oidc-client-id | A client id that all tokens must be issued for. | .spec.smm.auth.oidc.client.id |
| | A client secret that all tokens must be issued for. | .spec.smm.auth.oidc.client.secret |
| --oidc-username-claim | JWT claim to use as the user name. By default sub, which is expected to be a unique identifier of the end user. | .spec.smm.auth.oidc.username.claim |
| --oidc-username-prefix | Prefix prepended to username claims to prevent clashes with existing names (such as system:users). For example, the value oidc: will create usernames like oidc:jane.doe. If this flag isn't provided and --oidc-username-claim is a value other than email, the prefix defaults to the value of --oidc-issuer-url. Use the value - to disable all prefixing. | .spec.smm.auth.oidc.username.prefix |
| --oidc-groups-claim | JWT claim to use as the user's group. If the claim is present, it must be an array of strings. | .spec.smm.auth.oidc.groups.claim |
| --oidc-groups-prefix | Prefix prepended to group claims to prevent clashes with existing names (such as system:groups). For example, the value oidc: will create group names like oidc:engineering and oidc:infra. | .spec.smm.auth.oidc.groups.prefix |
| --oidc-required-claim | A key=value pair that describes a required claim in the ID Token. If set, the claim is verified to be present in the ID Token with a matching value. | .spec.smm.auth.oidc.requiredClaims |
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
Set up user permissions
Note: This step is only required if the target cluster does not already have OIDC authentication set up. If the Kubernetes cluster’s OIDC authentication settings are matching the ones set in the ControlPlane resource, no further action is needed.
By default, when using OIDC authentication, users and groups cannot modify the resources in the target cluster, so you need to create the right ClusterRoleBindings for these Groups or Users.
The groups a given user belongs to are shown in the right-hand menu of the user interface:
In this example, the username is oidc:test@example.org
and the user belongs to only one group, called oidc:example-org:test
.
If the Kubernetes Cluster is not using OIDC for authentication, create the relevant ClusterRoleBindings
against these Groups
and Users
.
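For example, the following sketch grants administrative access to the example group shown above (oidc:example-org:test); adjust the group name and the ClusterRole to match your own access policy:
cat > oidc-group-access.yaml <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: oidc-group-access
subjects:
- kind: Group
  name: 'oidc:example-org:test'
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
EOF
kubectl apply -f oidc-group-access.yaml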
2.5.9.2.1 - Using Dex for authentication
Dex is an identity service that uses OpenID Connect to drive authentication for other apps.
Dex acts as a portal to other identity providers through “connectors.” This lets Dex defer authentication to LDAP servers, SAML providers, or established identity providers like GitHub, Google, and Active Directory. Clients write their authentication logic once to talk to Dex, then Dex handles the protocols for a given backend.
This section shows you how to set up GitHub authentication using Service Mesh Manager. To set up other authentication backends such as Active Directory or LDAP, see the DEX Connectors documentation.
Enable GitHub authentication
As GitHub is an OAuth 2 provider, Service Mesh Manager requires a bridge between OAuth 2 (or any other authentication backend) and OIDC.
Prerequisites
Before starting to set up GitHub authentication to Service Mesh Manager, make sure that you have already:
You need the following information to follow this guide:
GITHUB_CLIENT_ID
: The Client ID from the GitHub OAuth 2 registration.
GITHUB_CLIENT_SECRET
: The Client Secret from the GitHub OAuth 2 registration.
GITHUB_ORG_NAME
: The name of the GitHub organization to authenticate against. If you want to support multiple organizations, consult the Dex manual.
GITHUB_ADMIN_TEAM_NAME
: The name of the GitHub team that contains the users who receive administrative privileges.
DEX_EXTERNAL_URL
: The URL where Dex will be exposed. This must be separate from the dashboard URL.
SMM_DASHBOARD_URL
: The URL where the Service Mesh Manager dashboard is exposed.
OIDC_CLIENT_SECRET
: The secret to be used between Dex and the Service Mesh Manager authentication backend (can be any random string).
To follow the examples, export these values as environment variables from your terminal, as these will be needed in multiple steps:
export GITHUB_CLIENT_ID=<***>
export GITHUB_CLIENT_SECRET=<***>
export GITHUB_ORG_NAME=my-github-org
export GITHUB_ADMIN_TEAM_NAME=admin
export DEX_EXTERNAL_URL=dex.example.org
export SMM_DASHBOARD_URL=smm.example.org
export OIDC_CLIENT_SECRET=$(openssl rand -base64 32) # or any random string
Create namespace for Dex
Dex will be installed into its own namespace for isolation. Create the namespace for it:
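kubectl create namespace dex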
Dex will be exposed externally using Istio, so enable Istio sidecar injection on the namespace:
kubectl label ns dex istio.io/rev=cp-v115x.istio-system
Create MeshGateway for Dex
GitHub needs to access Dex to invoke the OAuth 2 callback, so that Dex can learn the result of the authentication on the GitHub side.
Create an externally available MeshGateway:
cat > dex-meshgateway.yaml <<EOF
apiVersion: servicemesh.cisco.com/v1alpha1
kind: IstioMeshGateway
metadata:
labels:
app.kubernetes.io/instance: dex
app.kubernetes.io/name: dex-ingress
name: dex-ingress
namespace: dex
spec:
istioControlPlane:
name: cp-v115x
namespace: istio-system
deployment:
metadata:
labels:
app.kubernetes.io/instance: dex
app.kubernetes.io/name: dex-ingress
gateway-name: dex-ingress
gateway-type: ingress
replicas:
max: 1
min: 1
count: 1
service:
metadata:
annotations:
external-dns.alpha.kubernetes.io/hostname: ${DEX_EXTERNAL_URL}.
ports:
- name: http2
port: 80
protocol: TCP
targetPort: 8080
- name: https
port: 443
protocol: TCP
targetPort: 8443
type: LoadBalancer
type: ingress
---
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
labels:
app.kubernetes.io/instance: dex
app.kubernetes.io/name: dex-ingress
name: dex-ingress
namespace: dex
spec:
selector:
app.kubernetes.io/instance: dex
app.kubernetes.io/name: dex-ingress
servers:
- hosts:
- '*'
port:
name: http
number: 80
protocol: HTTP
- hosts:
- '*'
port:
name: https
number: 443
protocol: HTTPS
tls:
credentialName: dex-ingress-tls
mode: SIMPLE
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
labels:
app.kubernetes.io/instance: dex
app.kubernetes.io/name: dex-ingress
name: dex-ingress
namespace: dex
spec:
gateways:
- dex-ingress
hosts:
- '*'
http:
- match:
- uri:
prefix: /
route:
- destination:
host: dex
port:
number: 80
EOF
kubectl apply -f dex-meshgateway.yaml
Get certificates for Dex
The secret referenced in the MeshGateway resource is not yet available. To secure the communication between the end-user’s browser and your Dex installation, enable the Let’s Encrypt support for gateways in Service Mesh Manager:
cat > certs.yaml <<EOF
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
name: dex-issuer
namespace: dex
spec:
acme:
email: noreply@cisco.com
preferredChain: ""
privateKeySecretRef:
name: smm-letsencrypt-issuer
server: https://acme-v02.api.letsencrypt.org/directory
solvers:
- http01:
ingress:
class: nginx
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: dex-tls
namespace: dex
annotations:
acme.smm.cisco.com/gateway-selector: |
{
"app.kubernetes.io/instance": "dex",
"app.kubernetes.io/name": "dex-ingress"
}
spec:
dnsNames:
- ${DEX_EXTERNAL_URL}
duration: 2160h0m0s
issuerRef:
group: cert-manager.io
kind: Issuer
name: dex-issuer
privateKey:
algorithm: RSA
encoding: PKCS1
size: 2048
renewBefore: 360h0m0s
secretName: dex-ingress-tls
usages:
- server auth
- client auth
EOF
kubectl apply -f certs.yaml
After executing the previous commands, check that the Certificate has been successfully issued by running the kubectl get certificate
command. The output should be similar to:
NAME READY SECRET AGE
dex-tls True dex-ingress-tls 24h
If the READY column shows True, then the Certificate has been issued. If not, refer to the Cert Manager documentation for troubleshooting the issue.
Provision Dex
Now you can install Dex onto the namespace using helm
. First create a file called dex-values.yaml
for the Dex installation:
cat > dex-values.yaml <<EOF
---
config:
issuer: https://${DEX_EXTERNAL_URL}
storage:
type: kubernetes
config:
inCluster: true
connectors:
- type: github
id: github
name: GitHub
config:
clientID: $GITHUB_CLIENT_ID
clientSecret: "$GITHUB_CLIENT_SECRET"
redirectURI: https://${DEX_EXTERNAL_URL}/callback
orgs:
- name: $GITHUB_ORG_NAME
loadAllGroups: true
oauth2:
skipApprovalScreen: true
staticClients:
- id: smm-app
redirectURIs:
- "https://${SMM_DASHBOARD_URL}/auth/callback"
name: 'Cisco Service Mesh Manager'
secret: ${OIDC_CLIENT_SECRET}
service:
enabled: true
ports:
http:
port: 80
https:
port: 443
EOF
Run the following commands to install Dex using these values:
helm repo add dex https://charts.dexidp.io
helm install -n dex dex -f dex-values.yaml dex/dex
Verify that Dex has started successfully by running the kubectl get pods -n dex
command. The output should be similar to:
NAME READY STATUS RESTARTS AGE
dex-6d879bb86d-pxtvm 2/2 Running 1 20m
dex-ingress-6885b4f747-c5l96 1/1 Running 0 24m
Enable Dex as an OIDC provider to Service Mesh Manager by patching the ControlPlane resource:
cat > smm-oidc-enable.yaml <<EOF
spec:
smm:
auth:
oidc:
enabled: true
client:
id: smm-app
issuerURL: https://${DEX_EXTERNAL_URL}
secret: ${OIDC_CLIENT_SECRET}
groups:
claim: groups
prefix: 'oidc:'
username:
claim: email
prefix: 'oidc:'
EOF
kubectl patch --type=merge --patch "$(cat smm-oidc-enable.yaml )" controlplane smm
- If you are using Service Mesh Manager in Operator Mode, then the Istio deployment is updated automatically.
- If you are using the imperative mode, run the
smm operator reconcile
command to apply the changes.
Create user mapping
After logging in, the users will be mapped to have the:
- Username of
oidc:<email-of-the-github-user>
, and the
- groups of
oidc:$GITHUB_ORG_NAME:<team-name>
for each of the GitHub Teams the user is a member of.
By default, these users and groups cannot modify the resources in the target cluster, so you need to create the right ClusterRoleBindings for these Groups or Users. For example, to grant administrative access to the users in the $GITHUB_ADMIN_TEAM_NAME
GitHub Team, run the following command:
cat > allow-admin-access.yaml <<EOF
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: oidc-admin-access
subjects:
- kind: Group
name: 'oidc:$GITHUB_ORG_NAME:$GITHUB_ADMIN_TEAM_NAME'
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
EOF
kubectl apply -f allow-admin-access.yaml
The groups a given user belongs to are shown in the right-hand menu of the user interface:
In this example, the username is oidc:test@example.org
and the user belongs to only one group, called oidc:example-org:test
.
Verify login
To test that the login works, navigate to the URL where the Service Mesh Manager dashboard is exposed ($SMM_DASHBOARD_URL
), and select Sign in with OIDC.
2.6 - Mesh Management
2.6.1 - Multi-cluster - single mesh
Multi-cluster overview
Service Mesh Manager is able to construct an Istio service mesh that spans multiple clusters.
In this scenario you combine multiple clusters into a single service mesh that you can manage from either a single or from multiple Istio control planes.
Single mesh scenarios are best suited to use cases where clusters are configured together, sharing resources, and are generally treated as one infrastructural component within an organization.
Istio clusters and SMM clusters
When you are working with Service Mesh Manager in a multi-cluster scenario, you must understand the following concepts:
- Every Istio cluster you attach to the mesh is either a remote Istio cluster or a primary Istio cluster. Remote Istio clusters don’t have a separate Istio control plane, while primary Istio clusters do. To understand the difference between the remote Istio and primary Istio clusters, see the Istio control plane models document.
- When you install Service Mesh Manager on a cluster, it installs a primary Istio cluster. This cluster is effectively the primary Service Mesh Manager cluster.
- Even if you add multiple primary Istio clusters to the mesh, Service Mesh Manager runs only on the primary Service Mesh Manager cluster (even though some of its components are replicated to the other clusters).
- You can deploy Service Mesh Manager in an active-passive model. The active Service Mesh Manager control plane has all components installed on a primary Istio cluster. The passive Service Mesh Manager control plane has only a limited number of components installed on a primary or remote Istio cluster. Only one Service Mesh Manager control plane is active, all other Service Mesh Manager control planes are passive.
This means that when using the Service Mesh Manager CLI (for example, to attach or detach a new cluster), you must run it in the context of the active Service Mesh Manager cluster, even if there are multiple primary Istio clusters in the mesh.
Creating a multi-cluster mesh
Read the multi-cluster installation guide for details on how to set up a multi-cluster mesh.
2.6.1.1 - Cluster network
A multi-cluster mesh connects multiple clusters into a single service mesh. The topology of the mesh – how the different clusters are grouped into networks and how each cluster connects to the mesh – determines how the clusters connect to each other and how the pods, services, and workloads can access resources in other clusters.
Communication between clusters
In a multi-cluster mesh, every cluster belongs to a specific network. Clusters belonging to the same mesh can access the services of each other, but how this happens depends on which network the cluster belongs to.
- If the clusters belong to the same network, their pods can access each other directly over a flat network, without using a cluster gateway.
- If the clusters belong to different networks, the services of the cluster can be accessed only through the gateway of the cluster. Since Service Mesh Manager assigns each cluster to its own network by default, this is the default behavior.
The networkName label of the cluster determines which network the cluster belongs to. By default, every cluster belongs to its own network, where the name of the network is the name of the cluster.
Note: If the name of the cluster cannot be used as a Kubernetes resource name (for example, because it contains an underscore, colon, or other special character), you must manually specify a name to use when you are attaching the cluster to the service mesh. For example:
smm istio cluster attach <PEER-CLUSTER-KUBECONFIG-FILE> --name <KUBERNETES-COMPLIANT-CLUSTER-NAME> --active-istio-control-plane
Otherwise, the following error occurs when you try to attach the cluster:
could not attach peer cluster: graphql: Secret "example-secret" is invalid: metadata.name: Invalid value: "gke_gcp-cluster_region": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.'
You can specify the network of the cluster when you are attaching the cluster to the mesh.
Assigning clusters to different networks allows you to optimize the topology of your mesh network. Depending on your cloud provider, there might be differences in cross-cluster latencies and transfer costs between the different connection types.
Network connectivity requirements
This section lists the networking configuration required for a multi-cluster scenario.
- If the clusters belong to the same network, network connectivity already works and nothing else needs to be done.
- If the clusters belong to different networks, and the endpoints in the networks are publicly accessible without restrictions, then again nothing needs to be done.
- If the clusters belong to different networks, but there are restrictions on which endpoints can be accessed, at least the following endpoints must be accessible for a proper multi-cluster setup with Service Mesh Manager:
- From all clusters:
- From the primary cluster(s):
- All peer clusters' k8s API server address
- All IP addresses or host names of the
meshexpansion-gateway
LoadBalancer type services on the peer clusters on port 15443
- From peer clusters:
- IP address or host name of the
meshexpansion-gateway
LoadBalancer type service on the primary cluster(s) on ports 15443,15012
- IP address or host name of the
meshexpansion-gateway
LoadBalancer type service on the primary cluster where Service Mesh Manager is installed on ports 50600,59411
2.6.1.2 - Attach a new cluster to the mesh
Service Mesh Manager automates the process of creating the resources necessary for the peer cluster, generates and sets up the kubeconfig for that cluster, and attaches the cluster to the mesh.
Note: If you are using Service Mesh Manager with a commercial license in a multi-cluster scenario, Service Mesh Manager automatically synchronizes the license to the attached clusters. If the peer cluster already has a license, it is automatically deleted and replaced with the license of the primary Service Mesh Manager cluster. Detaching a peer cluster automatically deletes the license from the peer cluster.
To attach a new cluster to the service mesh managed by Service Mesh Manager, complete the following steps. For an overview of the network settings of the cluster, see Cluster network.
Prerequisites
- The Service Mesh Manager CLI tool installed on your computer.
- Access to the KUBECONFIG file of the cluster you want to attach to the service mesh.
- Access to the KUBECONFIG file of the cluster that runs the primary Service Mesh Manager service.
- Network connectivity properly configured between the participating clusters.
Steps
-
Find out the name of the network you want to attach the cluster to.
- By default, every cluster belongs to its own network, where the name of the network is the name of the cluster.
- If you want to attach the cluster to an existing network, you must manually specify the name of the network when you are attaching the cluster to the service mesh using the
--network-name
option in the next step.
If you have to specify the network name manually, note the name of the network you want to use. You can check the existing network names using the smm istio cluster status
command.
-
On the primary Service Mesh Manager cluster, attach the peer cluster to the mesh using one of the following commands.
Note: To understand the difference between the remote Istio and primary Istio clusters, see the Istio control plane models section in the official Istio documentation.
The short summary is that remote Istio clusters do not have a separate Istio control plane, while primary Istio clusters do.
The following commands automate the process of creating the resources necessary for the peer cluster, generate and set up the kubeconfig for that cluster, and attach the cluster to the mesh.
-
To attach a remote Istio cluster with the default options, run:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE>
-
To attach a primary Istio cluster (one that has an active Istio control plane installed), run:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --active-istio-control-plane
Note: If the name of the cluster cannot be used as a Kubernetes resource name (for example, because it contains an underscore, colon, or other special character), you must manually specify a name to use when you are attaching the cluster to the service mesh. For example:
smm istio cluster attach <PEER-CLUSTER-KUBECONFIG-FILE> --name <KUBERNETES-COMPLIANT-CLUSTER-NAME> --active-istio-control-plane
Otherwise, the following error occurs when you try to attach the cluster:
could not attach peer cluster: graphql: Secret "example-secret" is invalid: metadata.name: Invalid value: "gke_gcp-cluster_region": a DNS-1123 subdomain must consist of lower case alphanumeric characters, '-' or '.'
-
To override the name of the cluster, run:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --name <kubernetes-compliant-cluster-name>
-
To specify the network name, run:
smm istio cluster attach <PEER_CLUSTER_KUBECONFIG_FILE> --network-name <network-name>
Note: If you are using Service Mesh Manager with a commercial license in a multi-cluster scenario, Service Mesh Manager automatically synchronizes the license to the attached clusters. If the peer cluster already has a license, it is automatically deleted and replaced with the license of the primary Service Mesh Manager cluster. Detaching a peer cluster automatically deletes the license from the peer cluster.
-
Wait until the peer cluster is attached. Attaching the peer cluster takes some time, because it can be completed only after the ingress gateway address works. You can verify that the peer cluster is attached successfully with the following command:
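smm istio cluster status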
The process is finished when you see Available
in the Status
field of all clusters.
-
(Optional) Open the Service Mesh Manager dashboard and verify that the new peer cluster is visible on the MENU > TOPOLOGY page.
2.6.1.3 - Deploy applications on multiple clusters
After you have one or more clusters attached to the mesh, here are some best practices to deploy applications on multiple clusters.
Deploy demo application
If you just want to get started with any demo application in a multi-cluster mesh, the easiest is to install the built-in Service Mesh Manager demo application.
-
You can deploy the demo application in a distributed way to multiple clusters with the following commands:
smm demoapp install -s frontpage,catalog,bookings
smm -c <PEER_CLUSTER_KUBECONFIG_FILE> demoapp install -s movies,payments,notifications,analytics,database --peer
After installation, the demo application automatically starts generating traffic, and the dashboard draws a picture of the data flow.
(If it doesn’t, run the smm demoapp load start
command, or Generate load on the UI.
If you want to stop generating traffic, run smm demoapp load stop
.)
-
Open the dashboard and look around.
Deploy custom application
Here is how you can deploy your own application on multiple clusters with Service Mesh Manager.
-
Create the namespace where you would like to run your applications on every cluster:
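For example, to create a namespace called test (the name used in the following steps), run the following command against each cluster:
kubectl create namespace test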
-
In the cluster where Service Mesh Manager is installed, enable sidecar injection in that namespace:
smm sidecar-proxy auto-inject on test
This will place an istio.io/rev
label and set it to the appropriate Istio control plane (if there are multiple control planes, you get to choose which one).
(The sidecar injection can be enabled from the Service Mesh Manager dashboard as well.)
Service Mesh Manager (more specifically, the Istio operator) takes care of adding the same label to this namespace on all other clusters.
(If not, check the istio-operator pod logs on the particular cluster for any potential issues.)
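To verify that the label has been propagated, you can list the labels of the namespace on a peer cluster (a standard kubectl check; adjust the namespace name and kubeconfig to your setup):
kubectl --kubeconfig <PEER_CLUSTER_KUBECONFIG_FILE> get namespace test --show-labels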
-
Deploy your application on the clusters as you would usually do:
One caveat is that you should deploy all kubernetes service
resources on all clusters even if pods are only present on a subset of clusters.
This is needed for Istio to be able to do proper routing across clusters.
-
Make sure that the sidecar proxies are indeed injected into your application pods.
If not, check the official Istio documentation for potential issues.
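A quick way to check (assuming the test namespace used above) is to list the pods and confirm that each one reports two containers, the application container and the istio-proxy sidecar:
kubectl get pods -n test
# Each pod should show 2/2 in the READY column once the sidecar is injected.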
-
Send traffic to your applications, then open the dashboard and look around.
2.6.1.4 - Detach a cluster from the mesh
To detach a cluster from the service mesh managed by Service Mesh Manager, complete the following steps.
Prerequisites
- The Service Mesh Manager CLI tool installed on your computer.
- Access to the KUBECONFIG file of the cluster you want to detach from the service mesh.
- Access to the KUBECONFIG file of the cluster that runs the primary Service Mesh Manager service.
Steps
-
On the primary Service Mesh Manager cluster, detach the peer cluster from the mesh by running the following command.
smm istio cluster detach <PEER_CLUSTER_KUBECONFIG_FILE>
-
Wait until the peer cluster is detached. You can check the status of peer clusters by running the following command:
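smm istio cluster status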
-
(Optional) Navigate to the MENU > MESH page of the Service Mesh Manager dashboard and verify that the cluster you have detached is not shown in the Clusters list.
2.6.1.5 - Cluster registry controller
Service Mesh Manager uses the cluster registry controller to synchronize any Kubernetes resources across the clusters in a multi-cluster setup. That way, the necessary resources are automatically synchronized, so the multi-cluster topologies of Istio and the multi-cluster features (for example, observability, multi-cluster topology view, tracing, traffic tapping) of Service Mesh Manager work in a multi-cluster environment.
In addition, you can use the resource synchronization capabilities of Service Mesh Manager to synchronize any Kubernetes resources on demand between the clusters of your mesh.
Overview
When installing Service Mesh Manager in imperative mode from the command line, Service Mesh Manager automatically deploys the cluster registry controller