Cloud Service Mesh overview

Cloud Service Mesh is a service mesh available on Google Cloud and across supported GKE Enterprise platforms. It supports services running on a range of computing infrastructures. Cloud Service Mesh is controlled by APIs designed for Google Cloud, for open source, or for both.

This document is for you if you're a new Cloud Service Mesh user or a continuing Anthos Service Mesh or Traffic Director customer.

What is a service mesh?

A service mesh is an architecture that enables managed, observable, and secure communication among your services, making it easier for you to create robust enterprise applications made up of many microservices on your chosen infrastructure. Service meshes manage the common requirements of running a service, such as monitoring, networking, and security, with consistent, powerful tools, so that service developers and operators can focus on creating and managing great applications for their users.

Architecturally, a service mesh consists of one or more control planes and a data plane. The service mesh monitors all traffic into and out of your services. On Kubernetes, a proxy is deployed as a sidecar alongside each microservice in the mesh. On Compute Engine, you can deploy proxies on VMs or use proxyless gRPC for the data plane.

This pattern decouples application or business logic from network functions, and enables developers to focus on the features that the business needs. Service meshes also let operations teams and development teams decouple their work from one another.

Architecting your applications as microservices provides many benefits. However, your workloads can become more complex and fragmented as they scale. A service mesh helps solve this fragmentation problem and makes it easier to manage your microservices.

What is Cloud Service Mesh?

Cloud Service Mesh is Google's solution for both Google Cloud and supported GKE Enterprise environments.

  • On Google Cloud: Cloud Service Mesh provides APIs that are specific to the computing infrastructure on which your workloads run.
  • Off Google Cloud: With Distributed Cloud Virtual or GKE multicloud, Cloud Service Mesh supports the Istio APIs for Kubernetes workloads.

Whether on or off Google Cloud, Cloud Service Mesh lets you manage, observe, and secure your services without having to change your application code.

Cloud Service Mesh reduces the toil for your operations and development teams by simplifying service delivery, from traffic management and mesh telemetry to securing communications between services. Google's fully managed service mesh lets you manage complex environments and enjoy the benefits they promise.

Features

Cloud Service Mesh has a suite of features for traffic management, observability and telemetry, and security.

Traffic management

Cloud Service Mesh controls the flow of traffic among services in the mesh, into the mesh (ingress), and to outside services (egress). You configure and deploy resources to manage this traffic at the application (L7) layer. For example, you can do the following:

  • Use service discovery.
  • Configure load balancing among services.
  • Create canary and blue-green deployments.
  • Finely control routing for your services.
  • Set up circuit breakers.

To manage the flow of traffic, Cloud Service Mesh maintains a service registry: a list of all services in the mesh by name and by their respective endpoints, such as Kubernetes Pod IP addresses or the IP addresses of Compute Engine VMs in a managed instance group. By using this service registry, and by running the proxies side-by-side with the services, the mesh can direct traffic to the appropriate endpoint. Proxyless gRPC workloads can also run in parallel with workloads that use Envoy proxies.
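
As an illustration of the canary and routing capabilities in the list above, the following Go sketch creates a weighted routing rule that splits traffic between two versions of a service. It is a minimal sketch, assuming a cluster configured through the open source Istio APIs and the istio.io/client-go and istio.io/api packages; the reviews service and its v1 and v2 subsets are hypothetical placeholders.

  // Minimal canary sketch: send 90% of traffic to subset v1 and 10% to
  // subset v2 of a hypothetical "reviews" service.
  package main

  import (
      "context"
      "log"

      apinetworking "istio.io/api/networking/v1alpha3"
      networking "istio.io/client-go/pkg/apis/networking/v1alpha3"
      istioclient "istio.io/client-go/pkg/clientset/versioned"
      metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
      "k8s.io/client-go/rest"
  )

  func main() {
      // Assumes the code runs inside the cluster; load a kubeconfig with
      // clientcmd when running elsewhere.
      cfg, err := rest.InClusterConfig()
      if err != nil {
          log.Fatal(err)
      }
      ic, err := istioclient.NewForConfig(cfg)
      if err != nil {
          log.Fatal(err)
      }

      vs := &networking.VirtualService{
          ObjectMeta: metav1.ObjectMeta{Name: "reviews-canary", Namespace: "default"},
          Spec: apinetworking.VirtualService{
              Hosts: []string{"reviews"},
              Http: []*apinetworking.HTTPRoute{{
                  Route: []*apinetworking.HTTPRouteDestination{
                      {Destination: &apinetworking.Destination{Host: "reviews", Subset: "v1"}, Weight: 90},
                      {Destination: &apinetworking.Destination{Host: "reviews", Subset: "v2"}, Weight: 10},
                  },
              }},
          },
      }

      if _, err := ic.NetworkingV1alpha3().VirtualServices("default").Create(
          context.Background(), vs, metav1.CreateOptions{}); err != nil {
          log.Fatal(err)
      }
  }

A corresponding DestinationRule would define the v1 and v2 subsets; shifting the weights over time turns the same rule into a gradual canary or blue-green rollout.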

Observability insights

The Cloud Service Mesh user interface in the Google Cloud console provides insights into your service mesh. For workloads configured through the Istio APIs, the following metrics and insights are generated automatically:

  • Service metrics and logs for HTTP traffic within your mesh's GKE cluster are automatically ingested into Google Cloud.
  • Preconfigured service dashboards give you the information you need to understand your services.
  • In-depth telemetry—powered by Cloud Monitoring, Cloud Logging, and Cloud Trace—lets you dig deep into your service metrics and logs. You can filter and segment your data on a wide variety of attributes.
  • Service-to-service relationships help you understand, at a glance, inter-service dependencies and who connects to each service.
  • You can quickly see the communication security posture not only of your service, but also of its relationships to other services.
  • Service level objectives (SLOs) give you insight into the health of your services. You can define an SLO and alert on your own standards of service health.
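
Because this telemetry lands in Cloud Monitoring, you can also query it programmatically. The following Go sketch lists recent request-count time series; the istio.io/service/server/request_count metric name and the my-project project ID are assumptions to adapt to your own environment.

  // Sketch: list mesh request-count time series from the last hour.
  package main

  import (
      "context"
      "fmt"
      "log"
      "time"

      monitoring "cloud.google.com/go/monitoring/apiv3/v2"
      "cloud.google.com/go/monitoring/apiv3/v2/monitoringpb"
      "google.golang.org/api/iterator"
      "google.golang.org/protobuf/types/known/timestamppb"
  )

  func main() {
      ctx := context.Background()
      client, err := monitoring.NewMetricClient(ctx)
      if err != nil {
          log.Fatal(err)
      }
      defer client.Close()

      now := time.Now()
      it := client.ListTimeSeries(ctx, &monitoringpb.ListTimeSeriesRequest{
          Name:   "projects/my-project", // placeholder project ID
          Filter: `metric.type = "istio.io/service/server/request_count"`,
          Interval: &monitoringpb.TimeInterval{
              StartTime: timestamppb.New(now.Add(-time.Hour)),
              EndTime:   timestamppb.New(now),
          },
      })
      for {
          ts, err := it.Next()
          if err == iterator.Done {
              break
          }
          if err != nil {
              log.Fatal(err)
          }
          fmt.Println(ts.GetResource().GetLabels(), len(ts.GetPoints()))
      }
  }

The same data backs the preconfigured service dashboards in the Google Cloud console.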

Learn more about Cloud Service Mesh's observability features in the Observability guide.

Security benefits

Cloud Service Mesh provides you with many security benefits.

  • Mitigates the risk of replay or impersonation attacks that use stolen credentials. Cloud Service Mesh relies on mutual TLS (mTLS) certificates to authenticate peers, rather than on bearer tokens such as JSON Web Tokens (JWT).
  • Ensures encryption in transit. Using mTLS for authentication also ensures that all TCP communications are encrypted in transit.
  • Mitigates the risk that unauthorized clients can access a service with sensitive data, irrespective of the network location of the client and the application-level credentials.
  • Mitigates the risk of user data breach within your production network. You can ensure that insiders can only access sensitive data through authorized clients.
  • Identifies which clients accessed a service with sensitive data. Cloud Service Mesh access logging captures the mTLS identity of the client in addition to the IP address.
  • All in-cluster control plane components are built with FIPS 140-2 validated encryption modules.
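
For illustration of the mTLS points above, the following Go fragment enforces STRICT mutual TLS for a hypothetical prod namespace through the open source Istio security APIs. It is a sketch, not part of Cloud Service Mesh itself: ic is an Istio clientset created as in the earlier traffic-management sketch, and the import aliases are assumptions.

  // Sketch: require mTLS for every workload in the "prod" namespace.
  // Assumed imports:
  //   apisecurity "istio.io/api/security/v1beta1"
  //   security    "istio.io/client-go/pkg/apis/security/v1beta1"
  pa := &security.PeerAuthentication{
      ObjectMeta: metav1.ObjectMeta{Name: "default", Namespace: "prod"},
      Spec: apisecurity.PeerAuthentication{
          Mtls: &apisecurity.PeerAuthentication_MutualTLS{
              Mode: apisecurity.PeerAuthentication_MutualTLS_STRICT,
          },
      },
  }
  if _, err := ic.SecurityV1beta1().PeerAuthentications("prod").Create(
      context.Background(), pa, metav1.CreateOptions{}); err != nil {
      log.Fatal(err)
  }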

Learn more about Cloud Service Mesh's security benefits and features in the Security guide.

Deployment options

You have the following deployment options in Cloud Service Mesh:

  • On Google Cloud
    • Managed Cloud Service Mesh - managed control and data plane for GKE (recommended)
    • Managed Cloud Service Mesh - managed control and data plane for Compute Engine with VMs (recommended)
    • In-cluster control plane for GKE with Istio APIs (discouraged)
  • Off Google Cloud
    • In-cluster control plane for Kubernetes with Istio APIs

Managed Cloud Service Mesh

Managed Cloud Service Mesh consists of the managed control plane for all infrastructures and the managed data plane for GKE. With Managed Cloud Service Mesh, Google handles upgrades, scaling, and security for the control plane, the data plane, and related resources, minimizing manual maintenance for you.

Data plane implementation

If you use Google Cloud APIs, your data plane can be provided by Envoy proxies or by proxyless gRPC applications. If you are updating an existing application, the sidecar-based approach lets you integrate it into the mesh without changing your application code. If you want to avoid the overhead of running a sidecar, you can update your application to use gRPC.

Envoy proxies and proxyless gRPC both use the xDS API to connect to the control plane. If you use proxyless gRPC, you have a choice of supported languages for your applications, including Go, C++, Java, and Python.
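
For example, a proxyless gRPC client in Go only needs to import the xDS support package and dial an xds: target. This is a minimal sketch: the helloworld-gke:8000 target is a placeholder for a service configured in your mesh, and the xDS bootstrap file is typically supplied to the workload through the GRPC_XDS_BOOTSTRAP environment variable.

  // Minimal proxyless gRPC client sketch.
  package main

  import (
      "log"

      "google.golang.org/grpc"
      "google.golang.org/grpc/credentials/insecure"
      _ "google.golang.org/grpc/xds" // registers the "xds" resolver and balancers
  )

  func main() {
      // The xds:/// scheme tells gRPC to resolve the target through the
      // control plane instead of DNS.
      conn, err := grpc.Dial("xds:///helloworld-gke:8000",
          grpc.WithTransportCredentials(insecure.NewCredentials()))
      if err != nil {
          log.Fatalf("dial failed: %v", err)
      }
      defer conn.Close()
      // Use conn with your generated gRPC stubs as usual.
  }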

If you use open source Istio APIs, your data plane is provided by Envoy proxies.

Control plane implementation

Your Cloud Service Mesh control plane implementation depends on whether your configuration is on or off Google Cloud and on whether you are a new or continuing customer.

Control plane implementation for existing users

To determine your current control plane, read Identify control plane implementation. For more information on control planes and control plane migration, see Managed control plane overview for continuing customers.

Control plane implementation for new users

Control plane migration

If you are a continuing Anthos Service Mesh customer and you use the Istio APIs, your clusters will start migrating to the Traffic Director control plane. You can continue to use the Istio APIs for configuration.

To determine whether your clusters still use the Istio control plane or have migrated to the new global control plane, read Identify control plane implementation.

What's next