Thialfi: A Client Notification Service for Internet-Scale Applications

Atul Adya, Gregory Cooper, Daniel Myers, Michael Piatek

Google Seattle

A Case for Notifications

Problem: Ensuring cached data is fresh across users and devices

Common Application Patterns

• Clients poll to detect changes
  – Simple and reliable, but slow and inefficient

• Push updates to the client
  – Fast but complex
  – Add backup polling to get reliability
  – Tail latencies can be high: masks bugs
  – Application-specific protocol

sacrifice reliability

Our Solution: Thialfi

• Scalable: tracks millions of clients and objects

• Fast: notifies clients in less than a second

• Reliable: even when entire data centers fail

• Easy to use: deployed in Chrome Sync, Contacts, Google Plus

Talk Outline

• Thialfi’s abstraction: reliable signaling

• Delivering notifications in the common case

• Detecting and recovering from failures

• Evaluation and experience

Thialfi Overview

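[Diagram: Clients C1 and C2 each run the Thialfi client library and register for object X. The Thialfi Service in the data center records X: C1, C2. When a client sends Update X to the application backend, the backend passes Update X to the Thialfi Service, which sends Notify X to C1 and C2.]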

Thialfi Abstraction

• Objects have unique IDs and version numbers, monotonically increasing on every update

• Delivery guarantee
  – Registered clients learn latest version number
  – Reliable signal only: cached object ID X at version Y

Why Signal, Not Data?

• Developers want reliable, in-order data delivery

• Adds complexity to Thialfi and application, e.g.,
  – Hard state, arbitrary buffering
  – Offline applications flooded with data on wakeup

• For most applications, reliable signal is enough
  – Invoke polling path on signal: simplifies integration
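
A minimal sketch of "invoke polling path on signal", assuming the application already has a polling function that returns a (version, payload) pair. The names `SignalDrivenCache`, `poll`, and `on_notify` are hypothetical and are not part of the Thialfi client library.

```python
from typing import Callable, Dict, Optional, Tuple

Versioned = Tuple[int, str]  # (version, payload); an assumed shape for cached data


class SignalDrivenCache:
    """Hypothetical client cache that reuses its polling path when signaled."""

    def __init__(self, poll: Callable[[str], Versioned]) -> None:
        self._poll = poll  # the application's existing polling function
        self._cache: Dict[str, Versioned] = {}
        self.poll_count = 0

    def on_notify(self, object_id: str, version: int) -> None:
        """Handle a signal saying object_id is now at least at `version`."""
        cached = self._cache.get(object_id)
        if cached is not None and cached[0] >= version:
            return  # cache is already at or past the signaled version
        # The signal carries no data, so fetch through the polling path.
        self._cache[object_id] = self._poll(object_id)
        self.poll_count += 1

    def get(self, object_id: str) -> Optional[Versioned]:
        return self._cache.get(object_id)
```

Because the signal carries only an ID and a version, a late or duplicate notification costs nothing: the client compares versions and skips the fetch.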

API Without Failure Recovery

• Application backend to Thialfi Service: Publish(objectId, version)

• Client library to Thialfi Service: Register(objectId), Unregister(objectId)

• Thialfi Service to client library: Notify(objectId, version)
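
The four calls above can be sketched as a toy in-memory service. This is an illustration under stated assumptions, not the real service: `ToyNotificationService` and its `connect` method are hypothetical, and notifications are delivered as direct callbacks rather than over a network.

```python
from typing import Callable, Dict, Set

NotifyFn = Callable[[str, int], None]  # Notify(objectId, version)


class ToyNotificationService:
    """Hypothetical single-process stand-in for the service."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, NotifyFn] = {}      # client ID -> notify callback
        self._registered: Dict[str, Set[str]] = {}     # object ID -> client IDs
        self._latest: Dict[str, int] = {}              # object ID -> latest version

    def connect(self, client_id: str, notify: NotifyFn) -> None:
        # Assumed helper: attaches a client's Notify handler.
        self._callbacks[client_id] = notify

    def register(self, client_id: str, object_id: str) -> None:
        self._registered.setdefault(object_id, set()).add(client_id)

    def unregister(self, client_id: str, object_id: str) -> None:
        self._registered.get(object_id, set()).discard(client_id)

    def publish(self, object_id: str, version: int) -> None:
        # Versions increase monotonically, so older publishes are ignored.
        if version <= self._latest.get(object_id, -1):
            return
        self._latest[object_id] = version
        for client_id in sorted(self._registered.get(object_id, set())):
            self._callbacks[client_id](object_id, version)
```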

Talk Outline

• Thialfi’s abstraction: reliable signaling

• Delivering notifications in the common case

• Detecting and recovering from failures

• Evaluation and experience

Architecture

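• Matcher: Object ID → registered clients, version
• Registrar: Client ID → registered objects, notifications

[Diagram: Inside the data center, the Registrar is backed by the Client Bigtable and the Matcher by the Object Bigtable. The application backend sends notifications into the data center. The client library exchanges registrations, notifications, and acknowledgments with the Registrar.]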

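Life of a Notification

[Animated diagram: The backend issues Publish(x, v7). The Matcher updates the Object Bigtable entry for x from "x: v5; C1, C2" to "x: v7; C1, C2" and forwards x, v7 to the Registrar. The Registrar updates the Client Bigtable entries for C1 and C2 from "x, v5" to "x, v7", sends Notify: x, v7 to the clients (Client C2 shown), and receives Ack: x, v7.]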
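
A rough sketch of this flow, assuming a single process in which plain dictionaries stand in for the Object Bigtable and Client Bigtable. `ToyDataCenter`, `resend_unacked`, and the `sent` log are hypothetical names for illustration; the slides do not describe the retry mechanism in this detail.

```python
from typing import Dict, List, Set, Tuple


class ToyDataCenter:
    """Hypothetical model of the Matcher and Registrar tables in one data center."""

    def __init__(self) -> None:
        # Matcher side: object ID -> latest version, and object ID -> clients.
        self.object_versions: Dict[str, int] = {}
        self.object_clients: Dict[str, Set[str]] = {}
        # Registrar side: client ID -> {object ID: version not yet acknowledged}.
        self.unacked: Dict[str, Dict[str, int]] = {}
        # Log of Notify messages sent, as (client ID, object ID, version).
        self.sent: List[Tuple[str, str, int]] = []

    def register(self, client_id: str, object_id: str) -> None:
        self.object_clients.setdefault(object_id, set()).add(client_id)
        self.unacked.setdefault(client_id, {})

    def publish(self, object_id: str, version: int) -> None:
        if version <= self.object_versions.get(object_id, -1):
            return  # not newer than what the Matcher already stores
        self.object_versions[object_id] = version
        for client_id in sorted(self.object_clients.get(object_id, set())):
            self.unacked[client_id][object_id] = version
            self._send(client_id, object_id, version)

    def ack(self, client_id: str, object_id: str, version: int) -> None:
        # Clear the pending entry only if the ack matches the latest version.
        if self.unacked.get(client_id, {}).get(object_id) == version:
            del self.unacked[client_id][object_id]

    def resend_unacked(self) -> None:
        for client_id in sorted(self.unacked):
            for object_id, version in sorted(self.unacked[client_id].items()):
                self._send(client_id, object_id, version)

    def _send(self, client_id: str, object_id: str, version: int) -> None:
        self.sent.append((client_id, object_id, version))
```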

Talk Outline

• Thialfi’s abstraction: reliable signaling

• Delivering notifications in the common case

• Detecting and recovering from failures

• Evaluation and experience

Possible Failures

[Diagram: The Thialfi Service spans data centers 1 through n, each containing a Registrar with its Client Bigtable and a Matcher with its Object Bigtable. The client library and its client store sit outside the service, and the publish feed enters from the application side. Failure points marked on the diagram: client restart, client state loss, network failures, partial storage unavailability, server state loss/schema migration, and data center loss.]

Failures Addressed by Thialfi

• Client restart
• Client state loss
• Network failures
• Partial storage unavailability
• Server state loss / schema migration
• Publish feed loss
• Data center outage

Main Principle: No Hard State

• Thialfi remains correct even if all state is lost
  – All registrations
  – All object versions

• Detect and reconstruct after failures using:
  – ReissueRegistrations() client event
  – Registration Sync Protocol
  – NotifyUnknown() client event

Recovering Client Registrations

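[Diagram: The Registrar and the Matcher's Object Bigtable have lost the registrations for x and y. The service sends ReissueRegistrations() to the client, which holds x and y and responds with Register(x); Register(y).]

• ReissueRegistrations: Not a burden for applications
  – Application stores objects in its cache, or
  – Object list is implicit, e.g., bookmarks for user X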
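
A small sketch of recovery through reissued registrations, assuming the application can list the objects it cares about from its own cache. `SoftStateRegistrar`, `AppClient`, and `heartbeat` are hypothetical; in particular, the slides do not say how the service decides to raise the event, so the "unknown client" check here is an assumption.

```python
from typing import Dict, Iterable, Set


class SoftStateRegistrar:
    """Hypothetical registrar whose registrations can vanish at any time."""

    def __init__(self) -> None:
        self.registrations: Dict[str, Set[str]] = {}

    def register(self, client_id: str, object_id: str) -> None:
        self.registrations.setdefault(client_id, set()).add(object_id)

    def knows(self, client_id: str) -> bool:
        return client_id in self.registrations

    def lose_state(self) -> None:
        self.registrations.clear()  # simulates server state loss


class AppClient:
    """Hypothetical client; its cache doubles as the list of objects to register."""

    def __init__(self, client_id: str, registrar: SoftStateRegistrar,
                 cached_object_ids: Iterable[str]) -> None:
        self.client_id = client_id
        self.registrar = registrar
        self.cached_object_ids = set(cached_object_ids)

    def on_reissue_registrations(self) -> None:
        # Re-register everything the application currently caches.
        for object_id in sorted(self.cached_object_ids):
            self.registrar.register(self.client_id, object_id)


def heartbeat(client: AppClient, registrar: SoftStateRegistrar) -> None:
    # Assumed trigger: a registrar with no record of the client asks it to reissue.
    if not registrar.knows(client.client_id):
        client.on_reissue_registrations()
```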

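Syncing Client Registrations

• Goal: Keep client-registrar registration state in sync
• Every message contains a hash of the registered objects
• Registrar initiates the protocol when it detects that the two sides are out of sync
• Allows simpler reasoning about registration state

[Diagram: The client is registered for x and y and attaches Hash(x, y) to its messages. The Registrar, backed by the Matcher's Object Bigtable, compares this with the hash of its own registrations for the client and starts a Reg sync exchange on a mismatch, after which both sides hold Register: x, y.]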
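
One way to sketch the hash comparison, assuming SHA-256 over the sorted object IDs. The slides do not specify the digest function or the message format, so `registration_digest` and `SyncingRegistrar` are hypothetical.

```python
import hashlib
from typing import Iterable, Set


def registration_digest(object_ids: Iterable[str]) -> str:
    """Order-independent digest of a registration set (assumed: SHA-256)."""
    digest = hashlib.sha256()
    for object_id in sorted(set(object_ids)):
        data = object_id.encode("utf-8")
        # Length-prefix each ID so that ("ab", "c") and ("a", "bc") differ.
        digest.update(len(data).to_bytes(4, "big"))
        digest.update(data)
    return digest.hexdigest()


class SyncingRegistrar:
    """Hypothetical registrar that checks the digest carried by each message."""

    def __init__(self) -> None:
        self.registered: Set[str] = set()

    def needs_sync(self, client_digest: str) -> bool:
        # A mismatch means client and registrar disagree on the registrations.
        return client_digest != registration_digest(self.registered)

    def sync(self, client_object_ids: Iterable[str]) -> None:
        # Simplified sync: adopt the client's view of its registrations.
        self.registered = set(client_object_ids)
```

Sending a fixed-size digest on every message keeps the common case cheap; the full registration list is exchanged only after a mismatch.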

Recovering From Lost Versions

• Versions may be lost, e.g., during schema migration

• Refreshing from backend requires tight coupling

• Inform client with NotifyUnknown(objectId)
  – Client must refresh, regardless of its current state
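
The contrast between a versioned notification and NotifyUnknown can be sketched on the client side as follows. `RefreshOnUnknownCache` and its `fetch` callback are hypothetical names; the only behavior taken from the slide is that an unknown-version event forces a refresh.

```python
from typing import Callable, Dict, Tuple

Versioned = Tuple[int, str]  # (version, payload); an assumed shape


class RefreshOnUnknownCache:
    """Hypothetical client cache handling both kinds of event."""

    def __init__(self, fetch: Callable[[str], Versioned]) -> None:
        self._fetch = fetch
        self.cache: Dict[str, Versioned] = {}
        self.fetch_count = 0

    def _refresh(self, object_id: str) -> None:
        self.cache[object_id] = self._fetch(object_id)
        self.fetch_count += 1

    def on_notify(self, object_id: str, version: int) -> None:
        # Known version: refresh only if the cache is behind.
        cached = self.cache.get(object_id)
        if cached is None or cached[0] < version:
            self._refresh(object_id)

    def on_notify_unknown(self, object_id: str) -> None:
        # Version was lost at the service: refresh unconditionally.
        self._refresh(object_id)
```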

Talk Outline

• Thialfi’s abstraction: reliable signaling

• Delivering notifications in the common case

• Detecting and recovering from failures

• Evaluation and experience

Notification Latency Breakdown

[Figure: Stacked bar chart of notification latency (ms), axis from 0 to 300. Components: Matcher to Registrar RPC (Batched); Matcher Bigtable Read; Matcher Bigtable Write (Batched); Bridge to Matcher RPC (Batched); App Backend to Bridge.]

Batching accounts for significant fraction of latency

Thialfi Usage by Applications

Application          Language    Network Channel      Client Lines of Code (Semi-colons)
Chrome Sync          C++         XMPP                 535
Contacts             JavaScript  Hanging GET          40
Google+              JavaScript  Hanging GET          80
Android Application  Java        C2DM + Standard GET  300
Google BlackBerry    Java        RPC                  340

Some Lessons Learned

• Add complexity at the server, not the client
  – Deploy at server: minutes. Upgrade clients: years+

• Asynchronous events, not callbacks
  – Spontaneous events occur: need to handle them

• Initial applications have few objects per client
  – Earlier use of polling forces such a model

Thialfi Summary

• Fast, scalable notification service
• Reliable even when data centers fail
• Two key ideas simplify failure handling
  – Deliver a reliable signal, not data
  – No hard state: reconstruct after failure
• Deployed in Chrome Sync, Contacts, Google+