Managing Performance Globally with MySQL Daniel Austin, PayPal, Inc. MySQL Connect 2013 Sept 22nd, 2013

DESCRIPTION

This is my presentation at MySQL Connect for 2013. I describe a large-scale Big Data system and how it was built.

Page 1: Managing Performance Globally with MySQL

Managing Performance Globally with MySQL

Daniel Austin, PayPal, Inc.

MySQL Connect 2013

Sept 22nd, 2013

Page 2: Managing Performance Globally with MySQL

Why Are We Here?

We needed a comprehensive system for performance management at PayPal

Vision->Goals->Plan->Execution->Delighted User

“Anytime Anywhere” implies a significant commitment to the user experience, especially performance and service reliability.

So we designed a fast real-time analytics system for performance data using MySQL 5.1.

And then we built it.

Page 3: Managing Performance Globally with MySQL

Overture: Architecture Principles

1. Design and build for scale

2. Only build to differentiate

3. Everything we use or create must have a managed lifecycle

4. Design with systemic qualities in mind

5. Adopt industry standards

Page 4: Managing Performance Globally with MySQL

What Do You Mean ‘Web Performance’?

• Performance is response time
– In this case, we are scoping the discussion to include only end-user response time for PayPal activities

• Only outside the PayPal system boundary
– Inside, it’s monitoring, complementary but different
– We are concerned with real people, not machines

• For our purposes, we treat PayPal’s systems as a black box

Page 5: Managing Performance Globally with MySQL

The Vision: 3 Big Ideas

• Bake It In Up Front: Performance engineering is a design-time activity.

• One Consistent View: Establish one shared, consistent performance toolkit and testing methodology.

• End2End Performance for End Users: We are focused on the experiences of end users of PayPal, anywhere, anyway, anytime.

Page 6: Managing Performance Globally with MySQL

Who Needs Performance Data?

Page 7: Managing Performance Globally with MySQL

Architecture: Features

• Model Driven Architecture – no code!

• Data Driven
– Real data products
– Fast, efficient data model for HTTP

• Up-to-date global dataset provides low MTTR

• Flexible fast reporting for performance analytics

Page 8: Managing Performance Globally with MySQL

The Big Picture

Data Collection

Data Reporting

Data Storage

Page 9: Managing Performance Globally with MySQL

The Big Picture

Page 10: Managing Performance Globally with MySQL

Part I – Data Collection

Page 11: Managing Performance Globally with MySQL

The Data Collection Footprint

Page 12: Managing Performance Globally with MySQL

End to End Testing

Page 13: Managing Performance Globally with MySQL

The API Testing Challenge

Page 14: Managing Performance Globally with MySQL

Data Collection Summary

• Multiple sources for synthetic and RUM performance testing data

• Large-scale dataset with very long (10 yrs+) retention time
– Need to build for the ages

• Requires some effort to design a flexible methodology when devices and networks are changing quickly

Page 15: Managing Performance Globally with MySQL

Part II – Data Storage

Page 16: Managing Performance Globally with MySQL

Advanced ETL With Talend

• MODEL-DRIVEN = FAST DEVELOPMENT

• LETS US DEVELOP COMPONENTS FAST

• METADATA DRIVEN

• MODEL IN, JAVA OUT

Page 17: Managing Performance Globally with MySQL

GLeaM Data Products

• Level 0 – Raw data at measurement-level resolution
– Field-level Syntactic & Semantic Validation

• Level 1 – 3NF 5D Data Model
– Concrete aggregates while retaining record-level resolution

• Level 2 – User-defined and derived measures
– Time & Space-based aggregates
– Longitudinal and bulk reporting

A data product is a well-defined data set that has data types, a data dictionary, and validation criteria. It should be possible to rebuild the system from a functional viewpoint based entirely on the data product catalog.
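The definition above (data types, a data dictionary, and validation criteria) can be sketched as a small catalog entry. This is only an illustrative sketch: the class names `FieldSpec` and `DataProduct`, the field names `url` and `response_ms`, and the sample criterion are assumptions for the example, not part of GLeaM.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class FieldSpec:
    """One data dictionary entry: a field's name, type, and meaning."""
    name: str
    dtype: type
    description: str


@dataclass
class DataProduct:
    """A named data set with a data dictionary and validation criteria."""
    name: str
    level: int  # 0 = raw, 1 = normalized, 2 = derived, following the slide
    fields: List[FieldSpec]
    validators: List[Callable[[Dict[str, Any]], bool]] = field(default_factory=list)

    def validate(self, record: Dict[str, Any]) -> bool:
        # Every declared field must be present with the declared type.
        for spec in self.fields:
            if not isinstance(record.get(spec.name), spec.dtype):
                return False
        # Then every product-level criterion must hold.
        return all(check(record) for check in self.validators)


# Hypothetical Level 0 product; the field names are illustrative only.
raw_measurement = DataProduct(
    name="raw_measurement",
    level=0,
    fields=[
        FieldSpec("url", str, "Measured URL"),
        FieldSpec("response_ms", int, "End-user response time in milliseconds"),
    ],
    validators=[lambda r: r["response_ms"] >= 0],
)

ok = raw_measurement.validate({"url": "https://example.com/", "response_ms": 420})
bad = raw_measurement.validate({"url": "https://example.com/", "response_ms": -1})
print(ok, bad)
```

Keeping the dictionary and the criteria next to the data set name is what would make a catalog of such entries usable as a functional description of the system.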

Page 18: Managing Performance Globally with MySQL

Semantic v. Syntactic Validation

1. Syntactic Validation Step

2. Semantic Validation Step
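The two steps can be sketched as follows, assuming a record arrives as a dictionary of strings. The field names `url` and `response_ms` and the plausibility limit `max_ms` are hypothetical; the slide does not say which rules GLeaM applies.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def syntactic_errors(record: Dict[str, str]) -> List[str]:
    """Syntactic step: is each field well-formed on its own?"""
    errors = []
    value = record.get("response_ms", "")
    if not (value.isascii() and value.isdigit()):
        errors.append("response_ms is not a non-negative integer")
    parts = urlsplit(record.get("url", ""))
    if parts.scheme not in ("http", "https") or not parts.netloc:
        errors.append("url is not an absolute HTTP(S) URL")
    return errors


def semantic_errors(record: Dict[str, str], max_ms: int = 600_000) -> List[str]:
    """Semantic step: do the well-formed values make sense as a measurement?"""
    errors = []
    if int(record["response_ms"]) > max_ms:
        errors.append("response_ms exceeds the plausible maximum")
    return errors


def validate(record: Dict[str, str]) -> List[str]:
    # Semantic checks assume parseable fields, so they run only
    # when the syntactic step reports no errors.
    errors = syntactic_errors(record)
    if errors:
        return errors
    return semantic_errors(record)


print(validate({"url": "https://example.com/a", "response_ms": "420"}))
print(validate({"url": "example", "response_ms": "abc"}))
print(validate({"url": "https://example.com/a", "response_ms": "9999999"}))
```

Running the syntactic step first means the semantic rules never have to guard against unparseable input.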

Page 19: Managing Performance Globally with MySQL

GLeaM Data Storage

• Modeling HTTP in SQL

• MySQL 5.1, Master & multi-slave config

• 3rd Normal Form, Codd compliance

• Fast, efficient analytical data model for HTTP Sessions

Page 20: Managing Performance Globally with MySQL

3NF Level 1 Data Model for HTTP

• NO xrefs

• 5D User Narrative Model

• High levels of normalization are costly up front…

• …but pay for themselves later when you are making queries!
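The trade-off can be illustrated with a toy normalized schema. This sketch uses Python's built-in SQLite as a stand-in for MySQL, and the `host` and `measurement` tables are hypothetical; it does not reproduce the GLeaM 5D model, whose dimensions are not listed on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE host (
    host_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE measurement (
    measurement_id INTEGER PRIMARY KEY,
    host_id        INTEGER NOT NULL REFERENCES host(host_id),
    response_ms    INTEGER NOT NULL
);
""")


def host_id_for(name: str) -> int:
    """Up-front cost: each load does a lookup-or-insert on the host table."""
    conn.execute("INSERT OR IGNORE INTO host (name) VALUES (?)", (name,))
    row = conn.execute("SELECT host_id FROM host WHERE name = ?", (name,)).fetchone()
    return row[0]


samples = [("www.example.com", 400), ("www.example.com", 600), ("api.example.com", 150)]
for name, ms in samples:
    conn.execute(
        "INSERT INTO measurement (host_id, response_ms) VALUES (?, ?)",
        (host_id_for(name), ms),
    )

# Later payoff: the fact table holds only integers, so the aggregate
# joins on a compact key instead of comparing repeated host strings.
rows = conn.execute("""
SELECT h.name, COUNT(*), SUM(m.response_ms) / COUNT(*)
FROM measurement AS m JOIN host AS h USING (host_id)
GROUP BY h.name
ORDER BY h.name
""").fetchall()
print(rows)
```

Each host name is stored once, so the extra work happens at load time and the reporting query stays small.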

Page 21: Managing Performance Globally with MySQL

GLeaM Data Model

Page 22: Managing Performance Globally with MySQL

Level 1: The Boss Battle!

Page 23: Managing Performance Globally with MySQL

Managing URLs

• VARCHAR(4096)?

• Split at path segment

• We used a simple SHA-1 key to index secondary URL tables

• We need a defined URI data type in MySQL!
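A minimal sketch of the URL handling described on this slide, assuming the key is a SHA-1 digest of the full URL and that a secondary table stores one row per path segment. The function names and the `(url_key, position, segment)` row layout are assumptions for illustration; the slide does not describe the actual table design.

```python
import hashlib
from typing import List, Tuple
from urllib.parse import urlsplit


def url_key(url: str) -> str:
    """Fixed-width key: the 40-character hex SHA-1 digest of the URL."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def path_segments(url: str) -> List[str]:
    """Split the URL path into its non-empty segments."""
    return [seg for seg in urlsplit(url).path.split("/") if seg]


def segment_rows(url: str) -> List[Tuple[str, int, str]]:
    """Rows for a hypothetical secondary table: (url_key, position, segment)."""
    key = url_key(url)
    return [(key, i, seg) for i, seg in enumerate(path_segments(url))]


url = "https://www.example.com/checkout/review/confirm?step=2"
for row in segment_rows(url):
    print(row)
```

A fixed-width digest keeps the index small regardless of URL length, which is the practical alternative to indexing a `VARCHAR(4096)` column directly.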

Page 24: Managing Performance Globally with MySQL

Some Best Practices

– URIs: Handle with care
• Encode text strings in lexical order
• Use sequential bitfields for searching

– Integer arithmetic only

– Combined fields for per-row consistency checks in every table

– Don’t skip the supporting jobs – sharding, rollover, logging

– Don’t trade ETL time for integrity risk!
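Two of these practices can be sketched together: storing timings as integer milliseconds, and keeping a check value computed from a row's combined fields. This is one possible reading of the slide, not the GLeaM implementation; the use of CRC-32, the separator character, and the row layout are all assumptions.

```python
import zlib


def to_ms(seconds_text: str) -> int:
    """Convert a non-negative decimal seconds string to integer milliseconds without floats."""
    whole, _, frac = seconds_text.partition(".")
    frac = (frac + "000")[:3]  # pad or truncate to millisecond precision
    return int(whole or "0") * 1000 + int(frac)


def row_check(*fields) -> int:
    """CRC-32 over the row's fields, joined with a unit separator."""
    combined = "\x1f".join(str(f) for f in fields)
    return zlib.crc32(combined.encode("utf-8"))


def is_consistent(stored_row) -> bool:
    """Recompute the check from the data fields and compare with the stored value."""
    *fields, check = stored_row
    return row_check(*fields) == check


row = ("www.example.com", "/checkout", to_ms("1.25"))
stored = row + (row_check(*row),)
tampered = ("www.example.com", "/checkout", 1251, stored[3])

print(to_ms("0.1") + to_ms("0.2") == to_ms("0.3"))
print(is_consistent(stored), is_consistent(tampered))
```

With floating point, `0.1 + 0.2 == 0.3` is false; with integer milliseconds the sum is exact, so aggregates match across loads and reports.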

Page 25: Managing Performance Globally with MySQL

Part III – Data Reporting

Page 26: Managing Performance Globally with MySQL

GLeaM Data Reporting

• GLeaM is intended to be agnostic and flexible with respect to reporting tools

• We chose Tableau for dynamic analytics

• We also use several enterprise-level reporting tools to produce aggregate reports

Page 27: Managing Performance Globally with MySQL

Tableau Features

INTERACTIVE & FLEXIBLE

EXCEL-LIKE SIMPLICITY

WEB AND DESKTOP CLIENTS

FAST PERFORMANCE

Page 28: Managing Performance Globally with MySQL

GLeaM Reports

We designed initial reports for 3 sets of stakeholders:

• Executives: High-level overviews for busy decision-makers

• Operations: Diagnostic reports for operations teams to identify

• Analytics: Deep-dive analytical reports to identify opportunities for improvements

Page 29: Managing Performance Globally with MySQL

Global Performance Management

Page 30: Managing Performance Globally with MySQL

What We Learned

• Paying attention to design patterns pays off

• MySQL rewards detailed optimization

• Trade-offs around normalization can lead to 10x or even 100x query time reduction

• Sharding remains an issue

• We believe we can easily achieve petabyte scales with additional slaves

Page 31: Managing Performance Globally with MySQL

CODA: THE LAST ARCHITECTURE PRINCIPLE

SHIBUI

SIMPLE

ELEGANT

BALANCED

…A PLAYER IS SAID TO BE SHIBUI WHEN HE OR SHE MAKES NO SPECTACULAR PLAYS ON THE FIELD, BUT CONTRIBUTES TO THE TEAM IN AN UNOBTRUSIVE WAY.

Page 32: Managing Performance Globally with MySQL

Thank You!

Daniel Austin PayPal, Inc.

MySQL Connect 2013

Sept 22nd, 2013

@daniel_b_austin