36
Istio A modern service mesh Louis Ryan Google @louiscryan Shriram Rajagopalan IBM @rshriram

Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

  • Upload
    vukhue

  • View
    219

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

IstioA modern service mesh

Louis RyanGoogle@louiscryan

Shriram RajagopalanIBM@rshriram

Page 2: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Policy Enforcement

Page 3: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Why do you need this?

● Microservices

Page 4: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Why do you want this?

● Microservices

● Infrastructure Bloat X Polyglot

Page 5: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Why do you want this?

● Microservices

● Infrastructure Bloat X Polyglot

● Operational Velocity

Page 6: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Why do you want this?

● Microservices

● Infrastructure Bloat X Polyglot

● Operational Velocity

● Control

Page 7: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Policy Enforcement

Page 8: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

You need control over load balancing. But stop (mis)using the kernel for it!

Lightweight sidecars to manage traffic between services

Sidecars can do much more than just load balancing!

So you want to build a service mesh?

Page 9: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Weaving the mesh

svcA

sidecarsidecar

Service A

svcB

sidecar

Service B

External Services

HTTP/1.1, HTTP/2, gRPC, TCP with or without TLS

HTTP/1.1, HTTP/2, gRPC, TCP with or without TLS

Internet

Outbound features:❖ Service authentication❖ Load balancing❖ Retry and circuit breaker❖ Fine-grained routing❖ Telemetry❖ Request Tracing❖ Fault Injection

Inbound features:❖ Service authentication❖ Authorization❖ Rate limits❖ Load shedding❖ Telemetry❖ Request Tracing❖ Fault Injection

Page 10: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Our sidecar of choice - Envoy● A C++ based L4/L7 proxy

● Low memory footprint

● Battle-tested @ Lyft

○ 100+ services ○ 10,000+ VMs ○ 2M req/s

Plus an awesome team willing to work with the community!

Goodies:❖ HTTP/2 & gRPC❖ Zone-aware load balancing w/ failover❖ Health checks, circuit breakers, timeouts, retry

budgets❖ No hot reloads - API driven config updates

Istio’s contributions:❖ Transparent proxying w/ SO_ORIGINAL_DST❖ Traffic routing and splitting❖ Request tracing using Zipkin❖ Fault injection

Page 11: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Putting it all together

svcA

Envoy

Pod

Service A

svcB

Envoy

Service B

Pilot

Control Plane API

Mixer

Discovery & Config data to Envoys

Policy checks, telemetry

Control flow during request processing Istio-Auth

TLS certs to Envoy

Traffic is transparently intercepted and proxied. App is

unaware of Envoy’s presence

Page 12: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Modeling the Service Mesh

etcd

Kube

rnet

es

Cons

ul

Envoy

Abstract Model

Cust

om

disc

over

y

Platform Adapter

Envoy APIRule

s AP

I

Pilot

Envoy EnvoyEnvoy

Service discovery & traffic rules

1. Environment-specific topology extraction

2. Topology is mapped to a platform-agnostic model.

3. Additional rules are layered onto the model. E.g. retries, traffic splits etc.

4. Configuration is pushed to Envoy and applied without restarts

Page 13: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Policy Enforcement

Page 14: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

VisibilityMonitoring & tracing should not be an afterthought in the infrastructure

Goals● Metrics without instrumenting apps● Consistent metrics across fleet● Trace flow of requests across services● Portable across metric backend providers

Istio Zipkin tracing dashboard

Istio - Grafana dashboard w/ Prometheus backend

Page 15: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Metrics flow

svcA

Envoy

Pod

Service A

svcB

Envoy

Service B

API: /svcBLatency: 10msStatus Code: 503Src: 10.0.0.1Dst: 10.0.0.2…...

Prometheus InfluxDB

Prom

ethe

us

Adap

ter

Influ

xDB

Adap

ter

Cus

tom

Ad

apte

r

Mixer

● Mixer collects metrics emitted by Envoys● Adapters in the Mixer normalize and

forward to monitoring backends● Metrics backend can be swapped at

runtime

PrometheusPrometheus

InfluxDBInfluxDB Custom

backend

Page 16: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Visibility: Tracing

svcA

Envoy

Pod

Service A

svcB

Envoy

Service B

Trace HeadersX-B3-TraceIdX-B3-SpanId

X-B3-ParentSpanIdX-B3-SampledX-B3-Flags svcC

Envoy

Service C

SpansSpans

Prometheus InfluxDB

Zipk

in

Adap

ter

Stac

kdriv

er

Adap

ter

Cus

tom

Ad

apte

r

Mixer

PrometheusZipkin

InfluxDBStackdriver Custom

backend

● Application do not have to deal with generating spans or correlating causality

● Envoys generate spans○ Applications need to *forward*

context headers on outbound calls

● Envoys send traces to Mixer● Adapters at Mixer send traces to

respective backends

Page 17: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Control

Page 18: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

ResiliencyIstio adds fault tolerance to your application without any changes to code

Resilience features❖ Timeouts❖ Retries with timeout budget❖ Circuit breakers❖ Health checks❖ AZ-aware load balancing w/

automatic failover❖ Control connection pool size and

request load❖ Systematic fault injection

// Circuit breakers

destination: serviceB.example.cluster.localpolicy:- tags: version: v1 circuitBreaker: simpleCb: maxConnections: 100 httpMaxRequests: 1000 httpMaxRequestsPerConnection: 10 httpConsecutiveErrors: 7 sleepWindow: 15m httpDetectionInterval: 5m

Page 19: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Resiliency Testing

Systematic fault injection to identify weaknesses in failure recovery policies○ HTTP/gRPC error codes ○ Delay injection

svcA

Envoy

Service A

svcB

Envoy

Service B

svcC

Envoy

Service C

Timeout: 100msRetries: 3300ms

Timeout: 200msRetries: 2400ms

Page 20: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Efficiency● L7 load balancing

○ Passive/Active health checks, circuit breaks○ Backend subsets○ Affinity

● Inter-service communication happens over HTTP/2○ HTTP/1.1 connections are transparently upgraded○ QUIC on the roadmap

● TLS offload○ No more JSSE or stale SSL versions.

● HTTP/2 and gRPC proxying

Page 21: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Policy Enforcement

Page 22: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Traffic Splitting

svcA

Envoy

Pod

Service A

svcB

Envoy

Serv

ice

B

http://serviceB.example

Pod Labels: version: v1.5env: us-prod

svcB

Envoy

Pod Labels:version: v2.0-alpha, env:us-staging

serviceB.example.cluster.local

Traffic routing rules

99%

1%

Rules API

Pilot

Traffic control is decoupled from infrastructure scaling

// A simple traffic splitting rule

destination: serviceB.example.cluster.localmatch: source: serviceA.example.cluster.localroute:- tags: version: v1.5 env: us-prod weight: 99

- tags: version: v2.0-alpha env: us-staging weight: 1

Page 23: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

svcA

Service AsvcB

Service Bversion: v1

Pod 3Pod 2

Pod 1

Content-based traffic steering

svcA

Service A

svcB

Service B

version: v1

Pod 3Pod 2

Pod 1

User-agent: *Android*

svcB’

version: canary

Pod 4

User-agent: *iPhone*

Traffic Steering// Content-based traffic steering rule

destination: serviceB.example.cluster.local

match:

httpHeaders:

user-agent:

regex: ^(.*?;)?(iPhone)(;.*)?$

precedence: 2

route:

- tags:

version: canary

Page 24: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Policy Enforcement

Page 25: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Securing Microservices

● Verifiable identity

● Secure naming / addressing

● Traffic encryption

● Revocation

Page 26: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Problem: Strong Service Security at Scale

Concerns● Concerned about insider access risks● Adopting a (micro-)services architecture● Audit & Compliance

Issues● Modern architectures are based on dynamically placed workloads and remotely

accessed shared (micro-)services. ● Existing network based security paradigms either enable broad access within a

network or are brittle / hard to manage. ● Customers want a way to limit sensitive data access to only limited services (or

identities) and enforce strong authentication at scale.

Page 27: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Istio - Security at Scale

spiffe.io

Page 28: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What is a ‘Service Mesh’ ?

A network for services, not bytes

● Visibility

● Resiliency & Efficiency

● Traffic Control

● Security

● Policy Enforcement

Page 29: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Putting it all together

svcA

Envoy

Pod

Service A

svcB

Envoy

Service B

Pilot

Control Plane API

Mixer

Discovery & Config data to Envoys

Policy checks, telemetry

Control flow during request processing Istio-Auth

TLS certs to Envoy

Page 30: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

What’s Mixer For?● Nexus for policy evaluation and telemetry reporting

○ Precondition checking

○ Quotas & Rate Limiting

● Primary point of extensibility

● Enabler for platform mobility

● Operator-focused configuration model

Page 31: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Plugin Model for Extensibility● Mixer uses pluggable adapters to extend its

functionality○ Adapters are modules that interface to infrastructure backends○ They expose specialized interfaces (logging, metrics, quotas, etc)○ Multi-interface adapters are possible (e.g., a Stackdriver adapter

exposing logging & monitoring)

● Adapters run within the Mixer process Mixer

GCP

AWS

Prometheus

Heapster

New Relic

Bluemix

Page 32: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Attributes - The behavioral vocabulary

target.service = “playlist.svc.cluster.local”request.size = 345request.time = 2017-04-12T12:34:56Zsource.ip = 192.168.10.1source.name = “music-fe.serving.cluster.local”source.user = “[email protected]”api.operation = “GetPlaylist”

Page 33: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Attributes● Typed name-value tuples that describe behaviors within the mesh

○ Base vocabulary○ Extensible

● Envoy and Services produce attributes, Mixer consumes them● Attributes are fundamental to how operators experience Istio

Page 34: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Roadmap

● More networking features - UDP, Payload transforms, Websocket, Global LB● VMs and other environments● Hybrid cloud & federation● Value-add integrations - ACLs, Telemetry, Audit, Policy, ....● Security - vTPM/HSM & Cert stores, Federation, Cloud Platforms, ...● Stability

Page 35: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Community Partners● RedHat● Pivotal● WeaveWorks● Tigera● Datawire● Scytale (SPIFFE)

… and you!

Page 36: Istio · Weaving the mesh svcA sidecar sidecar Service A ... Fault Injection ... Istio adds fault tolerance to your application

Thanks! Phew