[Gaming on AWS] Big Data Analysis in the Cloud

Big Data Analysis in the Cloud - AWS Korea (정윤진, Solutions Architect)

Page 1: [Gaming on AWS] Big Data Analysis in the Cloud

Cloud

Page 2: [Gaming on AWS] Big Data Analysis in the Cloud

Thank you

Page 3: [Gaming on AWS] Big Data Analysis in the Cloud

In the next 30 minutes

1. What is big data

2. Big data on AWS

3. How customers are using AWS

Page 4: [Gaming on AWS] Big Data Analysis in the Cloud
Page 5: [Gaming on AWS] Big Data Analysis in the Cloud

Where is this data coming from?

Page 6: [Gaming on AWS] Big Data Analysis in the Cloud

Human generated

Machine generated

Tweet

Surf the internet

Buy and sell products

Upload images and videos

Play games

Check in at restaurants

Search for cafes

Find deals

Watch content online

Look for directions

Use social media

Page 7: [Gaming on AWS] Big Data Analysis in the Cloud

Human generated

Machine generated

Networks and security devices

Mobile phones

Cell phone towers

Smart grids

Smart meters

Telematics from cars

Sensors on machines

Videos from traffic and security cameras

Page 8: [Gaming on AWS] Big Data Analysis in the Cloud

What is it used for?

Page 9: [Gaming on AWS] Big Data Analysis in the Cloud

Data for competitive advantage

Page 10: [Gaming on AWS] Big Data Analysis in the Cloud

Data for competitive advantage

Customer segmentation

Financial modeling

System analysis

Line-of-sight

Replacing human decisions

Business intelligence

Page 11: [Gaming on AWS] Big Data Analysis in the Cloud

Data for competitive advantage

Customer segmentation

Financial modeling

System analysis

Line-of-sight

Replacing human decisions

Business intelligence

Innovating new business and revenue models

Page 12: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Page 13: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Lower cost, increased throughput

Page 14: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Lower cost, increased throughput

Constraint

Page 15: [Gaming on AWS] Big Data Analysis in the Cloud

Very high barrier to turning data into information…

Page 16: [Gaming on AWS] Big Data Analysis in the Cloud

Very high barrier to turning data into information.

Infrastructure capacity

Technical Skills

Questions to ask

Cheap experimentation

Page 17: [Gaming on AWS] Big Data Analysis in the Cloud

Amazon Web Services Cloud

Page 18: [Gaming on AWS] Big Data Analysis in the Cloud

Elastic and highly scalable

+ No upfront capital expense

+ Only pay for what you use

+ Available on-demand

= Remove constraints

Page 19: [Gaming on AWS] Big Data Analysis in the Cloud

Remove constraints = More experimentation

More experimentation = More innovation

More Innovation = Competitive edge

Page 20: [Gaming on AWS] Big Data Analysis in the Cloud

Amazon Web Services

Removes constraints

Focus on your data

Leave undifferentiated heavy lifting to us

Page 21: [Gaming on AWS] Big Data Analysis in the Cloud

HOW

Page 22: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Page 23: [Gaming on AWS] Big Data Analysis in the Cloud

Page 24: [Gaming on AWS] Big Data Analysis in the Cloud

How to move your data into AWS

AWS Cloud

Corporate data center

Virtual Private Cloud

VPN

Internet

Direct Connect

Storage Gateway

AWS Import/Export

S3, EMR, Redshift

Page 25: [Gaming on AWS] Big Data Analysis in the Cloud

AWS Import/Export

Corporate data center

Amazon Elastic MapReduce

Amazon Simple Storage Service (S3)

BI users

Clickstream data from 500+ websites and VoD platform

Page 26: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Page 27: [Gaming on AWS] Big Data Analysis in the Cloud

More than 25 Million Streaming Members

50 Billion Events Per Day

30 Million plays every day

2 billion hours of video in 3 months

4 million ratings per day

3 million searches

Device location, time, day, week, etc.

Social data

Page 28: [Gaming on AWS] Big Data Analysis in the Cloud

10 TB of streaming data per day
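
To put the daily volumes above in perspective, the sketch below converts them into average per-second rates. It assumes the load is spread evenly over 24 hours and that 1 TB means 10^12 bytes; neither is stated on the slides, and real traffic will peak well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(daily_total: float) -> float:
    """Average per-second rate implied by a daily total (assumes even load)."""
    return daily_total / SECONDS_PER_DAY


events_per_sec = per_second(50e9)   # 50 billion events per day
plays_per_sec = per_second(30e6)    # 30 million plays per day
ratings_per_sec = per_second(4e6)   # 4 million ratings per day

# Assumption: 1 TB = 10**12 bytes, reported here in MB/s (10**6 bytes).
stream_mb_per_sec = per_second(10 * 10**12) / 10**6

print(f"events/sec:  {events_per_sec:,.0f}")
print(f"plays/sec:   {plays_per_sec:,.1f}")
print(f"ratings/sec: {ratings_per_sec:,.1f}")
print(f"stream MB/s: {stream_mb_per_sec:,.1f}")
```

Even as averages, these rates suggest why the collection and storage stages need to scale horizontally.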

Page 29: [Gaming on AWS] Big Data Analysis in the Cloud

What is S3?

Highly scalable data storage

Access via APIs

Fast (850K requests per sec)

Highly available & durable (99.999999999% durability)

Economical ($0.095 per GB)*

Web store

Page 30: [Gaming on AWS] Big Data Analysis in the Cloud
Page 31: [Gaming on AWS] Big Data Analysis in the Cloud

Velocity of data

Amazon DynamoDB

Page 32: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Page 33: [Gaming on AWS] Big Data Analysis in the Cloud

“Who buys video games?”

Page 34: [Gaming on AWS] Big Data Analysis in the Cloud

Per day:

3.5 billion records

13 TB of click stream logs

71 million unique cookies

Page 35: [Gaming on AWS] Big Data Analysis in the Cloud

Results:

500% return on ad spend

17,000% reduction in procurement time

Page 36: [Gaming on AWS] Big Data Analysis in the Cloud
Page 37: [Gaming on AWS] Big Data Analysis in the Cloud
Page 38: [Gaming on AWS] Big Data Analysis in the Cloud

What is EMR?

Map-Reduce engine

Integrated with tools

Hadoop-as-a-service

Massively parallel

Cost effective

AWS wrapper

Integrated with AWS services
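
As a reminder of what a Map-Reduce engine does, here is a minimal single-process sketch of the map, shuffle, and reduce phases, counting page views in a made-up clickstream. It is plain Python for illustration only and does not use the EMR or Hadoop APIs; the log format and function names are assumptions. On a real cluster each phase would run in parallel across many nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(log_lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (page, 1) pair for every 'user page' log line."""
    for line in log_lines:
        _user, page = line.split()
        yield page, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped values to get one total per key."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical sample data: "<user> <page>" per line.
sample_log = ["u1 /home", "u2 /store", "u1 /store", "u3 /home", "u2 /store"]
page_views = reduce_phase(shuffle_phase(map_phase(sample_log)))
print(page_views)
```

Because the map step handles each line independently and the reduce step handles each key independently, both can be spread over as many machines as the data requires.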

Page 39: [Gaming on AWS] Big Data Analysis in the Cloud

+

Source: http://nerds.airbnb.com/redshift-performance-cost

Table size        Query type            Hive                     Redshift
3 billion rows    Simple range query    1680 seconds (28 min)    360 seconds (6 min)
1 million rows    2 complex joins       182 seconds              8 seconds

$13.60/hour on Redshift versus $57/hour on Hive
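
The comparison above is easier to read as ratios. The short sketch below derives the speedup and hourly cost ratio from the quoted figures only; the variable names are illustrative, and the numbers describe that one benchmark rather than a general rule.

```python
# (Hive seconds, Redshift seconds) as quoted on the slide.
benchmarks = {
    "3 billion rows, simple range query": (1680, 360),
    "1 million rows, 2 complex joins": (182, 8),
}

# Speedup = Hive runtime divided by Redshift runtime.
speedups = {
    name: hive_seconds / redshift_seconds
    for name, (hive_seconds, redshift_seconds) in benchmarks.items()
}

# Hourly cost quoted on the slide: $57 for Hive versus $13.60 for Redshift.
cost_ratio = 57 / 13.60

for name, speedup in speedups.items():
    print(f"{name}: Redshift is {speedup:.2f}x faster")
print(f"The Hive cluster costs {cost_ratio:.2f}x more per hour")
```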

Page 40: [Gaming on AWS] Big Data Analysis in the Cloud

Every day is crucial and costly

Page 41: [Gaming on AWS] Big Data Analysis in the Cloud

Challenge: To run a virtual screen with a higher accuracy algorithm & 21 million compounds

Page 42: [Gaming on AWS] Big Data Analysis in the Cloud
Page 43: [Gaming on AWS] Big Data Analysis in the Cloud

Metric                   Count
Compute hours of work    109,927 hours
Compute days of work     4,580 days
Compute years of work    12.55 years
Ligand count             ~21 million ligands

Using Cycle Computing and Amazon Web Services

Page 44: [Gaming on AWS] Big Data Analysis in the Cloud

3 Hours for $4828.85/hr
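
The figures for this run can be cross-checked with a few lines of arithmetic. The sketch below converts the compute hours into days and years and multiplies out the hourly price. The last line is an estimate resting on assumptions not stated in the slides: that "compute hours" means hours of work per core, and that the work parallelizes perfectly over the 3-hour run.

```python
compute_hours = 109_927           # slide figure
compute_days = compute_hours / 24
compute_years = compute_days / 365

run_hours = 3                     # wall-clock duration from the slide
price_per_hour = 4828.85          # USD per hour from the slide
total_cost = round(run_hours * price_per_hour, 2)

# Assumption: perfectly parallel work, measured in hours per core.
implied_concurrency = compute_hours / run_hours

print(f"{compute_days:,.0f} days, or {compute_years:.2f} years, of compute")
print(f"Total cost of the run: ${total_cost:,.2f}")
print(f"Implied concurrency: about {implied_concurrency:,.0f} cores")
```

The conversions agree with the table (4,580 days and 12.55 years), and the whole run comes to roughly $14,500.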

Page 45: [Gaming on AWS] Big Data Analysis in the Cloud

Instead of $20+ million in infrastructure

Page 46: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Page 47: [Gaming on AWS] Big Data Analysis in the Cloud

Open web index.

3.4 billion records.

Available to all.

1000 Genomes project

Page 48: [Gaming on AWS] Big Data Analysis in the Cloud
Page 49: [Gaming on AWS] Big Data Analysis in the Cloud

Generation

Collect

Store

Collaboration & sharing

Analysis and Computation

Page 50: [Gaming on AWS] Big Data Analysis in the Cloud

Sample architecture

Users

Game traffic Analysis

Game instances

DB instances

Proxy farms

Amazon EMR

Amazon Glacier

Amazon Redshift

Amazon DynamoDB

Page 51: [Gaming on AWS] Big Data Analysis in the Cloud

Thank you!

aws.amazon.com/big-data

[email protected]

May 21st, COEX Intercontinental, Seoul

One-day free training

Walk-through of services

http://aws.amazon.com/apac/awsday/seoul/