Toward A High-Performance JSON Protocol: Notes. JS.Conf, May 3rd, 2011. V 0.9. Presented By: Daniel Austin, Yahoo! Exceptional Performance

Notes on a High-Performance JSON Protocol


This is my presentation from JSConf 2011. I am proposing a new Web protocol to improve performance across the Internet. It's based on a dual-band protocol layered over TCP/IP and UDP and is backward compatible with existing HTTP-based systems.


Page 1: Notes on a High-Performance JSON Protocol

Toward A High-Performance JSON Protocol: Notes

JS.Conf

May 3rd, 2011

V 0.9

Presented By: Daniel Austin

Yahoo! Exceptional Performance

Page 2: Notes on a High-Performance JSON Protocol

AGENDA

1. Introduction: Starting From Scratch
2. Protocol Design
3. Results & Current State
4. Where Do We Go From Here?

Page 3: Notes on a High-Performance JSON Protocol

Exceptional Performance: What we do…

Create great Tools for users to optimize their pages, like YSlow.

Optimize the User Experience for Yahoo! users every day

Research on how to make the Web smarter and faster

Page 4: Notes on a High-Performance JSON Protocol


Goals for Today’s Talk

Explain the goals and design for SCRATCH, and why we are excited about using JSON to make the Web faster and smarter

Describe our Experiments and what we’ve learned about protocol design, and where we are thinking of going next

Request Feedback from our colleagues for ideas and improvements!

Page 5: Notes on a High-Performance JSON Protocol

Starting From SCRATCH

“We wanted to design a super-fast data protocol that would let us prioritize content and manage context while still working at scale… Initially we ended up more or less re-designing TCP… Then we tore it up and started all over again… That’s why we called it SCRATCH.”

Page 6: Notes on a High-Performance JSON Protocol


Elevator Pitch

SCRATCH is a new dual-band data protocol for the Web.

It’s designed to work together with HTTP/TCP as a control channel [a] and use SCRATCH/UDP as its data channel [D].

Page 7: Notes on a High-Performance JSON Protocol

Goals for Scratch Data Channel [Work in Progress!]

• Fast: bandwidth efficiency up by 2x, to 50%

• Smart: ‘semantic awareness’. Managed contexts for state, identity, etc. as first-class objects in the system

• Robust but lightweight: to target slow networks, mobile and tablet devices, low-bandwidth IoT chatter…

Page 8: Notes on a High-Performance JSON Protocol

AGENDA

1. Introduction: Starting From Scratch
2. Protocol Design
3. Results & Current State
4. Where Do We Go From Here?

Page 9: Notes on a High-Performance JSON Protocol

TCP & Bandwidth Efficiency

• Slow for small objects
• Parallelism not uniform
• No context = redundancy
• Trades performance for reliability
• Not designed for small incremental changes
• Typically ~25% bandwidth efficiency

Distribution of Web Objects By Size & TCP Efficiency

W. Shi et al. / J. Parallel Distrib. Comput. 63 (2003) 963–980

Page 10: Notes on a High-Performance JSON Protocol

Fellow Travellers

• EXI
• RakNet
• AVRO
• Scratch
• Protocol Buffers
• Thrift
• Argot/XPL
• RTP/RTCP
• YQL
• SCTP
• SPDY

Page 11: Notes on a High-Performance JSON Protocol

Why UDP?

• Need for Speed
• Need more flexible, multipoint architectures
• Small messages, transient data
• Consistent ordering not required
• Use resend-don’t-retransmit strategy
• Already a significant amount of prior art
• Simple as possible (but no simpler)

Page 12: Notes on a High-Performance JSON Protocol

The UDT Library

- Originally developed at UIUC
- Winner of multiple Supercomputing Challenge awards
- Provides full encapsulation, connection management, congestion control hooks
- 3rd generation code/design choice
- Code is robust, well-tested
- API similar to traditional BSD sockets
- Almost too much flexibility!

Page 13: Notes on a High-Performance JSON Protocol

JSON – The Good Parts

Scratch Uses JSON as Its Data Layer Format. Why?

- Easy to encode/decode
- Available on all platforms (mobile, desktop…)
- True to Web semantics, human-understandable
- Compact and lightweight

It makes everything else a whole lot easier…

Page 14: Notes on a High-Performance JSON Protocol

AGENDA

1. Intro: Starting From Scratch
2. Protocol Design
3. Results & Current State
4. Where Do We Go From Here?

Page 15: Notes on a High-Performance JSON Protocol

Looking at the Stack: UDP+JSON
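The UDP+JSON pairing named on this slide can be illustrated with a minimal sketch. It uses Python's standard `socket` and `json` modules over loopback, not the UDT library the talk mentions. The helper names (`send_json`, `recv_json`, `loopback_demo`) and the message fields are assumptions made for illustration, not part of SCRATCH.

```python
import json
import socket


def send_json(sock, addr, obj):
    """Serialize obj as compact JSON and send it as a single UDP datagram."""
    payload = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    sock.sendto(payload, addr)
    return len(payload)


def recv_json(sock, bufsize=2048):
    """Receive one datagram and decode its body as JSON."""
    data, sender = sock.recvfrom(bufsize)
    return json.loads(data.decode("utf-8")), sender


def loopback_demo():
    """Send one hypothetical message to ourselves over 127.0.0.1."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(2.0)         # UDP gives no delivery guarantee
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Hypothetical payload; the field names are illustrative only.
        message = {"type": "Cookie", "Name": "session", "Value": "abc123"}
        send_json(sender, receiver.getsockname(), message)
        received, _ = recv_json(receiver)
        return received
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(loopback_demo())
```

There is no handshake and no acknowledgement here. Anything like connection management or congestion control would have to come from a layer above plain UDP, which is the role the talk assigns to UDT.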

Page 16: Notes on a High-Performance JSON Protocol


Learnings from using AVRO/JSON

Pro

- Well-managed, current codebase
- Makes JSON more robust with well-defined types, grammar
- Self-contained schemas-as-metadata
- Hooks for SASL, lexical sorting

Con

- Code complexity, long learning curve
- Very RPC-centric (not bad, but not what we wanted)
- Not many cons!

[ { "type" : "record", "name" : "Cookie", "fields" : [ { "name" : "Name", "type" : "string" }, { "name" : "Value", "type" : "string" } ] …
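To make the "schemas-as-metadata" idea concrete, here is a small sketch that checks a record against the Cookie fragment shown above. It deliberately avoids the Avro library and uses only the standard `json` module. The `conforms` helper and the `PYTHON_TYPES` mapping are hypothetical, and only the visible part of the schema is used (the slide truncates the rest).

```python
import json

# Only the Cookie record visible on the slide; the remainder is elided there.
COOKIE_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Cookie",
  "fields": [
    {"name": "Name", "type": "string"},
    {"name": "Value", "type": "string"}
  ]
}
""")

# Hypothetical mapping from schema type names to Python types.
# It covers only the one type the fragment uses.
PYTHON_TYPES = {"string": str}


def conforms(record, schema):
    """Return True if record has exactly the schema's fields with matching types."""
    fields = schema["fields"]
    if set(record) != {field["name"] for field in fields}:
        return False
    return all(
        isinstance(record[field["name"]], PYTHON_TYPES[field["type"]])
        for field in fields
    )


if __name__ == "__main__":
    print(conforms({"Name": "session", "Value": "abc123"}, COOKIE_SCHEMA))
    print(conforms({"Name": "session"}, COOKIE_SCHEMA))
```

A real Avro implementation does much more (unions, defaults, binary encoding). This hand-rolled check only shows how a schema written in JSON can act as metadata for a JSON payload.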

Page 17: Notes on a High-Performance JSON Protocol


Scratchpad Performance – 1st Pass Results

Test Setup

- 5 AWS global locations (US-, US-W, AP-S, AP-T, EU)
- Circular buffer test: 1000 ‘Linkdef’ [D] objects (1470 bytes padded)
- Also tested 35k text buffer (size of Yahoo! Front Page base HTML)
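The 1470-byte padded object size in the test setup fits inside a single unfragmented datagram on a standard link. The sketch below shows one way such padding could work. It assumes a 1500-byte Ethernet MTU, a 20-byte IPv4 header without options, and an 8-byte UDP header. The `pad_object` helper and the sample fields are hypothetical; the slides do not describe the actual 'Linkdef' layout or padding method.

```python
import json

DATAGRAM_SIZE = 1470  # padded object size quoted in the test setup
ETHERNET_MTU = 1500   # assumption: standard Ethernet MTU
IPV4_HEADER = 20      # bytes, IPv4 header without options
UDP_HEADER = 8        # bytes


def max_udp_payload(mtu=ETHERNET_MTU):
    """Largest UDP payload that avoids IP fragmentation at this MTU."""
    return mtu - IPV4_HEADER - UDP_HEADER


def pad_object(obj, size=DATAGRAM_SIZE):
    """Encode obj as JSON and pad with trailing spaces to exactly `size` bytes.

    Trailing whitespace is legal after a JSON value, so the padded
    payload still parses with a normal JSON decoder.
    """
    payload = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    if len(payload) > size:
        raise ValueError("object does not fit in one padded datagram")
    return payload + b" " * (size - len(payload))


if __name__ == "__main__":
    # Hypothetical stand-in for a 'Linkdef' object; field names are invented.
    padded = pad_object({"id": 1, "href": "http://example.com/"})
    print(len(padded), max_udp_payload())
    print(json.loads(padded))
```

Under these assumptions the payload limit is 1472 bytes, so a 1470-byte object leaves two bytes to spare.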

Results

Test                   SCRATCH [D] (ms)   HTTP/TCP (ms)   Dropped %
Update 1               338                2240            0.11
Update all             1281               N/A             0.11
Send base file (35k)   217                675             N/A
Compress & Send        114                480             N/A

[Chart: SCRATCH [D] vs. HTTP/TCP, response time (ms), axis 0 to 800; series: Scratch/UDP, HTTP/TCP]
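The relative speedups implied by the first-pass numbers can be worked out directly. The snippet below uses only the figures reported in the results and skips the "Update all" row, which has no HTTP/TCP measurement. The names `RESULTS_MS`, `speedup`, and `SPEEDUPS` are illustrative.

```python
# (SCRATCH [D] ms, HTTP/TCP ms) for rows where both values were reported.
RESULTS_MS = {
    "Update 1": (338, 2240),
    "Send base file (35k)": (217, 675),
    "Compress & Send": (114, 480),
}


def speedup(scratch_ms, http_ms):
    """How many times faster SCRATCH [D] was than HTTP/TCP."""
    return http_ms / scratch_ms


SPEEDUPS = {name: speedup(s, h) for name, (s, h) in RESULTS_MS.items()}

if __name__ == "__main__":
    for name, ratio in SPEEDUPS.items():
        print(f"{name}: {ratio:.1f}x")
    # Update 1: 6.6x
    # Send base file (35k): 3.1x
    # Compress & Send: 4.2x
```

The largest gain is on the single small update, which matches the earlier point that TCP is slow for small objects.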

Page 18: Notes on a High-Performance JSON Protocol

Is SCRATCH Network-friendly?

- Fewer Packets vs. More Updates
- Throttling based on MTU, RTT
- Metadata as 1st Class Object?
- Well-defined endpoints and connection state establishment?
- Handles smaller MTU sizes?
- Nearest-node potential to reduce payloads

Page 19: Notes on a High-Performance JSON Protocol

AGENDA

1. Intro: Starting From Scratch
2. Protocol Design
3. Results & Current State
4. Where Do We Go From Here?

Page 20: Notes on a High-Performance JSON Protocol

Where Do We Go From Here?

When we first started, we were only trying to make things go faster…we soon realized that to really make the Web go faster, we had to make it smarter as well…

Page 21: Notes on a High-Performance JSON Protocol

Must Haves

Better Semantics

- Currently only 3 SCRATCH Schemas: Cookie, URI, HTTPHeader

Resource Caching Encapsulation

- Should dynamically update IP of nearest copy

Encryption with SASL/SSL/TLS

- Difficult to make any type of encryption work over a proxy

Native Compression (byte-pair, gzip)

- Byte-pair cheaper for mobile devices?

Node support
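For the compression item in the list above, this is a rough sketch of what a byte-pair scheme could look like, with `zlib` (the DEFLATE algorithm behind gzip) alongside for a size comparison. It assumes "byte-pair" means classic byte pair encoding, where the most frequent adjacent pair of bytes is repeatedly replaced by a byte value not present in the data. The function names, the `max_rounds` limit, and the sample payload are all illustrative, and the sketch says nothing about which method is cheaper on a real mobile device.

```python
import zlib
from collections import Counter


def bpe_compress(data, max_rounds=64):
    """Return (compressed bytes, substitution table) using byte pair encoding."""
    data = bytes(data)
    table = []  # list of (replacement byte value, original two-byte pair)
    for _ in range(max_rounds):
        unused = set(range(256)) - set(data)
        pairs = Counter(zip(data, data[1:]))
        if not unused or not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 3:  # too rare to pay for its table entry
            break
        new = min(unused)
        data = data.replace(bytes(pair), bytes([new]))
        table.append((new, bytes(pair)))
    return data, table


def bpe_decompress(data, table):
    """Undo the substitutions in reverse order."""
    for new, pair in reversed(table):
        data = data.replace(bytes([new]), pair)
    return data


if __name__ == "__main__":
    sample = b'{"name":"Name","type":"string"}' * 20
    packed, table = bpe_compress(sample)
    assert bpe_decompress(packed, table) == sample
    print("original:", len(sample))
    print("byte-pair:", len(packed), "plus", len(table), "table entries")
    print("zlib:", len(zlib.compress(sample)))
```

Decompression here is a handful of table-driven replacements, which is the usual argument for byte pair encoding on constrained devices. Whether that holds for SCRATCH payloads would need measurement.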

 

Page 22: Notes on a High-Performance JSON Protocol

Future Research

Improving Hypertext: Use SCRATCH to make links self-aware and self-healing, multi-homed and context-aware

Peer Caching: Use SCRATCH to update the browser cache incrementally in a stateful way

Merging with the Internet of Things: Everyday objects emitting SCRATCH objects and joining the Web…who knows?

Page 23: Notes on a High-Performance JSON Protocol

THANK YOU

Questions?

Daniel Austin

@daniel_b_austin

In building, architecture is a noun – in business, architecture is a verb.

R. Buckminster Fuller