Netflix running Presto in the AWS Cloud


Zhenxiao Luo

Senior Software Engineer @ Netflix

Outline

● Big Data Platform @ Netflix
● Use cases & requirements
● What we did

○ Reading/Writing from/to Amazon S3
○ Operations
○ Deployment
○ Performance

● What’s next?

Big Data Platform @ Netflix

Use Cases

● Big Batch Jobs

○ high throughput, fault tolerant, ETL
○ data spills to disk
○ Hive on Tez, Pig on Tez

● Ad hoc Queries
○ low latency, interactive, data exploration
○ in-memory, but limited data size
○ Impala, Redshift, Spark, Presto

Netflix Requirements

● SQL-like language
● Low latency for ad hoc queries
● Works well on the AWS cloud
● Good integration with the Hadoop stack
● Scales to 1000+ node clusters
● Open source with community support

What did Netflix do?

Reading/Writing to/from S3

● Option 1: Apache Hadoop NativeS3FileSystem

● Option 2: PrestoS3FileSystem
○ retry logic for read timeouts
○ write directly to the final S3 path

● Option 3: EmrFileSystem
○ disable Hadoop logging
○ disable the Hadoop FileSystem cache

Our Operations Environment

● Launch script on top of EMR

● Ganglia integration

● Usage graphs - concurrent queries & tasks

Current Deployment

● Presto in production @ Netflix
● 100+ node Presto cluster
● 1000+ queries running per day
● Presto queries run against the same petabyte-scale S3 data warehouse as Hive and Pig

Observed Performance @ Netflix

● Data in SequenceFile format
● One-MapReduce-job small table scan
○ MapReduce overhead dominates the query execution time
○ Presto is always ~10X faster than Hive
● One-MapReduce-job big table scan
○ MapReduce overhead is marginal compared with the big table scan time
○ Presto performs similarly to Hive
● Aggregation requiring multiple MapReduce jobs
○ Presto is always >10X faster than Hive
● Joins
○ Presto is always >2X faster than Hive
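
For context, a rough sketch of the query shapes these comparisons refer to; the table and column names are hypothetical and are not Netflix's actual benchmark queries:

-- Small table scan (one MapReduce job in Hive): job startup overhead dominates.
SELECT status, count(*) FROM small_dim_table GROUP BY status;

-- Big table scan (one MapReduce job in Hive): scan time dominates, so Presto is comparable to Hive.
SELECT count(*) FROM big_event_table WHERE dateint = 20140101;

-- Aggregation that Hive plans as multiple MapReduce jobs.
SELECT country, count(DISTINCT account_id)
FROM big_event_table
WHERE dateint = 20140101
GROUP BY country
ORDER BY 2 DESC
LIMIT 10;

-- Join of a big fact table to a smaller dimension table.
SELECT d.title_name, count(*)
FROM big_event_table e
JOIN title_dim d ON e.title_id = d.title_id
WHERE e.dateint = 20140101
GROUP BY d.title_name;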

What we are working on

● Support Parquet file format
○ https://github.com/facebook/presto/pull/1147
○ Parquet performs similarly to SequenceFile, but not as fast as RCFile

● ODBC/JDBC driver for Presto
○ Support MicroStrategy running on Presto

Some inconveniences ...

● Support server-side “Use Schema”
○ Workaround: client-side “Use Schema” or “schema.table” (sketched below)
● Recurse into the partition directory
○ Different behavior from Hive
● Metadata caching
○ Have to rerun the query a number of times to see a metadata change
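
A minimal sketch of the “Use Schema” workaround; the schema and table names are hypothetical:

-- Workaround A: set the schema on the client side
-- (e.g. start presto-cli with --catalog hive --schema dw),
-- then reference tables without a schema prefix:
SELECT count(*) FROM playback_events;

-- Workaround B: fully qualify each table as schema.table in the query text:
SELECT count(*) FROM dw.playback_events;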

● Extend the JSON extract functions to allow . notation
○ json_extract_scalar(mapColumn, '$.namePart1.namePart2')
○ Workaround: regexp_extract (sketched below)

● WebUI running slow
○ load query task info on demand
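
A sketch of the regexp_extract workaround next to the desired dot-notation call; the table name and the regular expression are illustrative only and assume the nested value is a simple string:

-- Desired: dot notation into a nested JSON value
-- SELECT json_extract_scalar(mapColumn, '$.namePart1.namePart2') FROM events;

-- Workaround: pull the nested value out with a regular expression
SELECT regexp_extract(mapColumn, '"namePart2"\s*:\s*"([^"]*)"', 1)
FROM events;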

Features we would like

● Big table join
● User Defined Functions
● Break down one column value into several tuples
○ In Hive: lateral view, explode, json_tuple (see the Hive sketch after this list)
● Decimal type
● Scheduler
● Writes
○ Insert overwrite
○ Alter table add partition
○ Parallel writes from workers (not client only)
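
For reference, a Hive-syntax sketch of the constructs listed above; table, column, and S3 path names are hypothetical:

-- Break one column value into several rows with lateral view / explode:
SELECT event_id, tag
FROM events
LATERAL VIEW explode(split(tags_csv, ',')) t AS tag;

-- Break one JSON column into several columns with json_tuple:
SELECT event_id, j.device, j.country
FROM events
LATERAL VIEW json_tuple(payload_json, 'device', 'country') j AS device, country;

-- Hive-style writes the team would like Presto to support:
INSERT OVERWRITE TABLE event_summary PARTITION (dateint = 20140101)
SELECT country, count(*) FROM events WHERE dateint = 20140101 GROUP BY country;

ALTER TABLE events ADD PARTITION (dateint = 20140102)
LOCATION 's3://my-bucket/warehouse/events/20140102/';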

Q & A

Thank you!
