Brian Bulkowski : what startups can learn from real-time bidding

DESCRIPTION

Presentation about the technical issues of scaling out that apply to startup CTOs. https://ti.to/startup-cto-summit/sf

Page 1: Brian Bulkowski : what startups can learn from real-time bidding

© 2014 Aerospike. All rights reserved. Confidential 1

What Startups Can Learn from Real-time Bidding

Or

“10 times faster, really?”

Brian Bulkowski, CTO and co-founder

Aerospike

Page 2: Brian Bulkowski : what startups can learn from real-time bidding

Who am I ?

■ TRS-80, PC, Apple II, Vax 11/70, Wang
  ■ First product: lightpen university teaching kiosk
  ■ Networks: computers without people are boring

■ Liberate / NetComputer through the boom
  ■ 10B market cap in 1999, employee 32

■ 2003-2007 “time off” ( startups )

■ Citrusleaf / Aerospike history
  ■ 42 year old first-time CEO (me)
  ■ 2008 Prototype
  ■ 2010 First sale, get the band back together
  ■ 2011+ 3 rounds of funding (Draper, ALP, NEA, CNTP)
  ■ 70 employees, 2 offices

[email protected] @aerospike.com @bbulkow

Page 3: Brian Bulkowski : what startups can learn from real-time bidding

Advertising Technology Stack

MILLIONS OF CONSUMERS, BILLIONS OF DEVICES

APP SERVERS

In-memory NoSQL

WRITE REAL-TIME CONTEXT, READ RECENT CONTENT

WRITE CONTEXT

DATA WAREHOUSE

INSIGHTS

PROFILE STORE: Cookies, email, deviceID, IP address, location, segments, clicks, likes, tweets, search terms...

REAL-TIME ANALYTICS: Best sellers, top scores, trending tweets

BATCH ANALYTICS: Discover patterns, segment data: location patterns, audience affinity

Page 4: Brian Bulkowski : what startups can learn from real-time bidding

Introduction to Advertising: Real-time Bidding

Page 5: Brian Bulkowski : what startups can learn from real-time bidding

North American RTB speeds & feeds

■ 1 to 6 billion cookies tracked
  ■ Some companies track 200M, some track 20B

■ Each bidder has their own data pool
  ■ Data is your weapon
  ■ Recent searches, behavior, IP addresses
  ■ Audience clusters (K-cluster, K-means) from offline Hadoop

■ “Remnant” from Google, Yahoo is about 0.6 million / sec

■ Facebook exchange: about 0.6 million / sec

■ “other” is 0.5 million / sec

Currently about 3.0M / sec in North America
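As a quick sanity check on these figures, the three itemized sources can be totaled and converted to a daily volume. This is only a sketch: the rates are the slide's approximate values, and the variable names are illustrative.

```python
# Approximate North American bid-request rates from the slide, in requests/sec.
rates_per_sec = {
    "remnant (Google, Yahoo)": 0.6e6,
    "Facebook exchange": 0.6e6,
    "other": 0.5e6,
}

SECONDS_PER_DAY = 24 * 60 * 60

# Total of the sources that the slide itemizes.
listed_total = sum(rates_per_sec.values())
listed_per_day = listed_total * SECONDS_PER_DAY

# Headline figure quoted on the slide.
headline_total = 3.0e6
headline_per_day = headline_total * SECONDS_PER_DAY

print(f"listed sources: {listed_total / 1e6:.1f}M/sec, {listed_per_day / 1e9:.0f}B/day")
print(f"headline:       {headline_total / 1e6:.1f}M/sec, {headline_per_day / 1e9:.0f}B/day")
```

The itemized sources add up to roughly 1.7M/sec, so the 3.0M/sec headline presumably includes traffic that the slide does not break out. Either way, the daily volume is in the hundreds of billions of requests.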

Page 6: Brian Bulkowski : what startups can learn from real-time bidding

Financial Services – Intraday Positions

LEGACY DATABASE (MAINFRAME)

Read/Write

Start of Day Data Loading

End of Day Reconciliation

Query

REAL-TIME DATA FEED

ACCOUNT POSITIONS

XDR

10M+ user records

Primary key access

1M+ TPS planned

Finance App

Records App

RT Reporting App

Page 7: Brian Bulkowski : what startups can learn from real-time bidding

Social Media

MYSQL or POSTGRES (ROTATIONAL DISK)

Recent user generated content

Java application tier

Data abstraction and sharding

MODIFIED REDIS (SSD ENABLED)

Content and Historical data

Page 8: Brian Bulkowski : what startups can learn from real-time bidding

Travel Portal

PRICING DATABASE (RATE LIMITED)

Poll for Pricing Changes

PRICING DATA

Store Latest Price

SESSION MANAGEMENT

Session Data

Read Price

XDR

Airlines forced interstate banking

Legacy mainframe technology

Multi-company reservation and pricing

Requirement: 1M TPS allowing overhead

Travel App
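The diagram labels describe a familiar pattern: poll the rate-limited pricing database, store the latest price in a fast store, and serve every read from that store. A minimal sketch of the idea follows, assuming an in-process dictionary as the fast store. `PriceCache`, `fake_pricing_db`, and the route key are hypothetical names, not taken from the talk.

```python
class PriceCache:
    """Serves reads from memory; only the poller touches the rate-limited source."""

    def __init__(self, fetch_changes, poll_interval_s):
        self._fetch_changes = fetch_changes      # callable returning {route: price}
        self._poll_interval_s = poll_interval_s
        self._latest = {}                        # "Store Latest Price"
        self._last_poll = None

    def poll_if_due(self, now):
        """Poll the source at most once per interval. Returns True if it polled."""
        if self._last_poll is not None and now - self._last_poll < self._poll_interval_s:
            return False
        self._latest.update(self._fetch_changes())
        self._last_poll = now
        return True

    def read_price(self, route):
        """'Read Price': never calls the source, so read load cannot hit the rate limit."""
        return self._latest.get(route)


calls = {"n": 0}

def fake_pricing_db():
    """Hypothetical stand-in for the rate-limited pricing database."""
    calls["n"] += 1
    return {"SFO-JFK": 420.0 + calls["n"]}

cache = PriceCache(fake_pricing_db, poll_interval_s=5.0)
cache.poll_if_due(now=0.0)       # first poll loads the price
for _ in range(1000):
    cache.read_price("SFO-JFK")  # reads never reach the source
cache.poll_if_due(now=1.0)       # too soon, skipped
cache.poll_if_due(now=6.0)       # due, polls again
print(calls["n"], cache.read_price("SFO-JFK"))
```

A thousand reads cost the pricing database nothing here; only the two polls reach it. In the architecture on the slide, the dictionary's role would be played by a shared store so that every app server sees the same latest price.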

Page 9: Brian Bulkowski : what startups can learn from real-time bidding

SOURCE DEVICE/ USER

QOS & Real-Time Billing for Telcos

■ In-switch Per HTTP request Billing
  ■ US Telcos: 200M subscribers, 50 metros

■ In-memory use case

Hot Standby

Execute Request

Real-time Checks

DESTINATION

Update Device User Settings

Request

XDR

Real-time Auth. QoS Billing

Config Module App

Page 10: Brian Bulkowski : what startups can learn from real-time bidding

Old Architecture ( scale out in 2000 )

Request routing and sharding

APP SERVERS

CACHE

DATABASE

STORAGE

CONTENT DELIVERY NETWORK

LOAD BALANCER
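In this older design, the "request routing and sharding" layer typically lives in application code. Below is a minimal sketch of what that code tends to look like, assuming a fixed list of database shards and SHA-1 as a stand-in hash. The shard names and the `shard_for` and `moved_fraction` helpers are hypothetical and not from the talk.

```python
import hashlib

# Hypothetical shard list; in the old architecture this mapping lives in the app tier.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]

def shard_for(key, shards=SHARDS):
    """Pick a shard by hashing the key and taking the digest modulo the shard count."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    bucket = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[bucket]

def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard list changes."""
    moved = sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))
    return moved / len(keys)

keys = [f"user:{i}" for i in range(10_000)]
print(shard_for("user:42"))
print(moved_fraction(keys, SHARDS, SHARDS + ["db-4"]))
```

With plain modulo placement, growing from four shards to five remaps about 80% of keys. That is one reason a hand-written routing layer becomes painful to operate as the cluster grows.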

Page 11: Brian Bulkowski : what startups can learn from real-time bidding

Modern Scale Out Architecture

Load balancer

Simple stateless

APP SERVERS

IN-MEMORY NoSQL

RESEARCH WAREHOUSE

CONTENT DELIVERY NETWORK

LOAD BALANCER

Long term cold storage

Fast stateless

HDFS BASED

Page 12: Brian Bulkowski : what startups can learn from real-time bidding

How Fast You Can Go

( a few graphs )

Page 13: Brian Bulkowski : what startups can learn from real-time bidding

YCSB Performance Comparison 2014

Page 14: Brian Bulkowski : what startups can learn from real-time bidding

Hot Analytics

■ High throughput Queries
  ■ 2 node cluster, 10 Indexes
  ■ Query returns 100 of 50M records

■ Predictable low latency

[Chart, QPS axis. Latency ranges shown: 128 – 300 ms, 70 – 760 ms, 7 – 10 ms. Annotation: UN-PREDICTABLE LATENCY]

Page 15: Brian Bulkowski : what startups can learn from real-time bidding

Amazon EC2 results

Page 16: Brian Bulkowski : what startups can learn from real-time bidding

Mo’ speed, mo’ problems

I don’t need that much speed ( you will ! )

“ferrari speed” is bad ( but with camry reliability? )

I don’t believe you ( simple benchmark tooling )

Amazon will save me ( multicloud )

( sell to API, platform companies )
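"Simple benchmark tooling" can be very small. The sketch below times an operation and reports throughput alongside tail latency, since predictability matters as much as the average. It assumes a single-threaded loop against an in-process dictionary. The `benchmark` and `percentile` helpers are illustrative, not Aerospike's tooling.

```python
import time

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list (pct is an integer)."""
    n = len(sorted_values)
    rank = max(1, (n * pct + 99) // 100)   # integer ceiling of n * pct / 100
    return sorted_values[rank - 1]

def benchmark(op, iterations=10_000):
    """Call op(i) repeatedly and report throughput and latency percentiles."""
    latencies = []
    start = time.perf_counter()
    for i in range(iterations):
        t0 = time.perf_counter()
        op(i)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "ops_per_sec": iterations / elapsed,
        "p50_us": percentile(latencies, 50) * 1e6,
        "p99_us": percentile(latencies, 99) * 1e6,
        "max_us": latencies[-1] * 1e6,
    }

store = {}

def put(i):
    # Stand-in for a database write; replace with a real client call.
    store[f"key:{i}"] = i

result = benchmark(put)
print(result)
```

Replacing `put` with a real client call turns this into a first-pass check of a vendor's claims on your own hardware. A wide gap between p50 and p99 is the kind of unpredictability worth investigating.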

Page 17: Brian Bulkowski : what startups can learn from real-time bidding

Lessons Learned

Page 18: Brian Bulkowski : what startups can learn from real-time bidding

Coding standards

( hiring is the obvious problem )

Page 19: Brian Bulkowski : what startups can learn from real-time bidding

Memory matters – the new coding style

CPU is free

Memory is expensive

Malloc is the ultimate enemy
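One common way to act on "malloc is the enemy" is to allocate buffers once at startup and reuse them per request. The sketch below shows the pattern in Python for readability. Python still allocates small objects internally, so treat it as an illustration of the shape of the idea; in C the same structure would replace per-request malloc/free. `BufferPool` and `handle_request` are hypothetical names.

```python
class BufferPool:
    """Fixed set of preallocated buffers, reused instead of allocated per request."""

    def __init__(self, count, size):
        self.size = size
        # All buffer allocation happens here, once.
        self._free = [bytearray(size) for _ in range(count)]

    def acquire(self):
        if not self._free:
            raise RuntimeError("pool exhausted")  # fail fast instead of growing
        return self._free.pop()

    def release(self, buf):
        self._free.append(buf)


def handle_request(pool, payload):
    """Copy the payload into a pooled buffer and compute a byte sum over it."""
    if len(payload) > pool.size:
        raise ValueError("payload larger than buffer")
    buf = pool.acquire()
    try:
        n = len(payload)
        buf[:n] = payload                  # same-length slice assignment, buffer is not resized
        return sum(memoryview(buf)[:n])    # view into the buffer, no copy of the bytes
    finally:
        pool.release(buf)                  # always hand the buffer back


pool = BufferPool(count=4, size=1024)
print(handle_request(pool, b"abc"))
```

A fixed pool also caps memory use: when it runs dry the server pushes back instead of growing without bound.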

Page 20: Brian Bulkowski : what startups can learn from real-time bidding

Multithreading and reference counting

“we multithread so you don’t have to”

Hire old embedded guys

Build reference counted libraries

Memory access is the enemy
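A reference-counted library lets many threads share an object and frees it exactly once, when the last holder lets go. The sketch below shows the core contract, assuming a lock-protected counter. The `RefCounted` class is hypothetical, and a C implementation would more likely use atomic increments than a lock.

```python
import threading

class RefCounted:
    """Wraps a resource and frees it when the last reference is released."""

    def __init__(self, resource, on_free):
        self._resource = resource
        self._on_free = on_free
        self._count = 1                  # the creator holds the first reference
        self._lock = threading.Lock()

    def retain(self):
        """Take an additional reference; returns self for convenience."""
        with self._lock:
            if self._count == 0:
                raise RuntimeError("retain after free")
            self._count += 1
        return self

    def release(self):
        """Drop one reference; the last release runs the free callback."""
        with self._lock:
            if self._count == 0:
                raise RuntimeError("release after free")
            self._count -= 1
            last = self._count == 0
        if last:
            self._on_free(self._resource)   # run the callback outside the lock


freed = []
record = RefCounted("record-1", freed.append)

def worker(ref):
    ref.release()                        # each worker drops the reference it was handed

# Retain on the owning thread before handing the object to each worker.
threads = [threading.Thread(target=worker, args=(record.retain(),)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
record.release()                         # the creator drops the last reference
print(freed)
```

The rule that makes this safe is to retain before handing the object to another thread, never after. That way the count cannot reach zero while a worker still holds it.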

Page 21: Brian Bulkowski : what startups can learn from real-time bidding

Clients are hard

Page 22: Brian Bulkowski : what startups can learn from real-time bidding

Creative corner cutting (opinionated)

Server restart time doesn’t matter if the code is reliable

Hash collisions don’t matter if the hash function hasn’t had a collision (RIPEMD-160)

Rotational disk is dead ( correct for analytics )

Data commit doesn’t matter if the app server crashed
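The hash-collision shortcut can be checked with the birthday bound: among n keys hashed to b bits, the chance of any collision is at most n(n-1)/2 divided by 2^b. The sketch below assumes uniformly distributed hash outputs and no adversarial keys. The key counts are the cookie counts quoted earlier in the talk, and `collision_bound` is an illustrative helper.

```python
def collision_bound(n_keys, hash_bits):
    """Upper bound on P(any collision): number of key pairs over number of hash values."""
    pairs = n_keys * (n_keys - 1) // 2
    return pairs / 2 ** hash_bits

# Cookie counts mentioned in the talk: 200M, 6B, and 20B keys, with a 160-bit digest.
for n in (200_000_000, 6_000_000_000, 20_000_000_000):
    print(f"{n:>14,} keys: P(collision) <= {collision_bound(n, 160):.2e}")
```

Even at 20 billion keys the bound is on the order of 1e-28, far below the chance of hardware failure. That is the argument for not writing collision-handling code.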

Page 23: Brian Bulkowski : what startups can learn from real-time bidding

Aerospike’s Flash Experience

■ Know your Flash
  ■ ACT benchmark http://github.com/aerospike/act
  ■ Read-write benchmark results back to 2011

■ All clouds support flash now
  ■ New EC2 instances
  ■ Google Compute
  ■ Internap, Softlayer, GoGrid…

■ Write durability usually not a problem with modern flash
  ■ Durability is high (5 “drive writes per day” for 5 years, etc)
  ■ Read performance suffers under write load anyway
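To see what "5 drive writes per day for 5 years" means in practice, the rating can be turned into total bytes written and a sustained write rate. This is a worked example, assuming an 800 GB drive (an illustrative capacity), decimal units, and a hypothetical `endurance` helper.

```python
def endurance(capacity_gb, dwpd, years):
    """Return (total data written in TB, sustained write rate in MB/s) for a DWPD rating."""
    total_written_tb = capacity_gb * dwpd * 365 * years / 1000
    # Writing the full capacity dwpd times every day, spread evenly over 86,400 seconds.
    sustained_mb_per_s = capacity_gb * 1000 * dwpd / 86_400
    return total_written_tb, sustained_mb_per_s

total_tb, rate_mb_s = endurance(capacity_gb=800, dwpd=5, years=5)
print(f"{total_tb:,.0f} TB written over the rated life")
print(f"{rate_mb_s:.1f} MB/s sustained, around the clock")
```

Under these assumptions the drive is rated for about 7,300 TB of writes. A workload would have to write roughly 46 MB/s continuously for five years to use that up.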

Page 24: Brian Bulkowski : what startups can learn from real-time bidding

Aerospike’s Flash Experience

■ Densities increasing
  ■ 100G 2 years ago, 800G today
  ■ SATA vs PCI-E
  ■ Appliances: 50T per 1U this year

■ Prices still dropping: perhaps $1/G next year

■ Intel P3700 results
  ■ 250K per device @ $2.5 / G
  ■ Old standard: Micron P320h 500K @ $8 / G

■ “Wide SATA”
  ■ 20 SATA drives
  ■ LSI “pass through mode”
  ■ 250K+ per server

Page 25: Brian Bulkowski : what startups can learn from real-time bidding

Use Open Source

Page 26: Brian Bulkowski : what startups can learn from real-time bidding