Netflix running Presto in the AWS Cloud

Zhenxiao Luo

Senior Software Engineer @ Netflix

Outline

● BigDataPlatform @ Netflix
● Use cases & requirements
● What we did
    ○ Reading/Writing from/to Amazon S3
    ○ Operations
    ○ Deployment
    ○ Performance
● What’s next?

BigDataPlatform @ Netflix

Use Cases

● Big Batch Jobs
    ○ high throughput, fault tolerant, ETL
    ○ data spills to disk
    ○ Hive on Tez, Pig on Tez
● Adhoc Queries
    ○ low latency, interactive, data exploration
    ○ in-memory, but limited data size
    ○ Impala, Redshift, Spark, Presto

Netflix Requirements

● SQL-like language
● Low latency for adhoc queries
● Works well on the AWS cloud
● Good integration with the Hadoop stack
● Scales to 1000+ node clusters
● Open source with community support

What did Netflix do?

Reading/Writing to/from S3

● Option 1: Apache Hadoop NativeS3FileSystem
● Option 2: PrestoS3FileSystem
    ○ retry logic for read timeout
    ○ write directly to final S3 path
● Option 3: emrFileSystem
    ○ disable hadoop logging
    ○ disable hadoop FileSystem cache
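
The "retry logic for read timeout" item under Option 2 can be pictured as a bounded retry loop with exponential backoff around each read. The Python sketch below is illustrative only: PrestoS3FileSystem itself is written in Java, and the names `read_with_retry`, `make_flaky_reader`, `max_attempts`, and `base_delay` are assumptions made for this example, not Presto's API or its actual retry policy.

```python
import time


def read_with_retry(read_fn, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call read_fn(), retrying on TimeoutError with exponential backoff.

    All names here are hypothetical; this is not PrestoS3FileSystem's API.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return read_fn()
        except TimeoutError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the timeout to the caller
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_reader(failures, payload=b"data"):
    """Build a reader that times out `failures` times, then returns payload."""
    state = {"calls": 0}

    def reader():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise TimeoutError("simulated S3 read timeout")
        return payload

    return reader
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller would normally leave it at the default.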

Bug Fixes

● https://github.com/facebook/presto/commit/cf0b2d66f4050fb1959c832809fa76e323d6d46e
● https://github.com/facebook/presto/commit/594b06c3e93a482dc162d2c49c9bd265795efb86
● https://github.com/facebook/presto/pull/1147
● https://github.com/facebook/presto/pull/1300
● https://github.com/facebook/presto/issues/1285
● https://github.com/facebook/presto/issues/1264

Our Operations Environment

● Launch script on top of EMR

● Ganglia integration

● Usage graphs - concurrent queries & tasks

Current Deployment

● Presto in production @ Netflix
● 100+ node Presto cluster
● 1000+ queries running per day
● Presto queries the same petabyte-scale S3 data warehouse as Hive and Pig

Observed Performance @ Netflix

● Data in SequenceFile format
● Small table scan (one MapReduce job in Hive)
    ○ MapReduce overhead dominates the query execution time
    ○ Presto is always ~10X faster than Hive
● Big table scan (one MapReduce job in Hive)
    ○ MapReduce overhead is marginal compared with big table scan time
    ○ Presto performs similarly to Hive
● Aggregation (multiple MapReduce jobs in Hive)
    ○ Presto is always > 10X faster than Hive
● Joins
    ○ Presto is always > 2X faster than Hive
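
One way to read the two scan results above is with a simple additive model: Hive's runtime is a fixed MapReduce overhead plus the scan time, while Presto pays mostly the scan time. The sketch below is an assumption made for illustration, not a model from the talk, and the input numbers are invented to show the shape of the effect; they are not Netflix measurements.

```python
def speedup(scan_seconds, mr_overhead_seconds):
    """Ratio of (overhead + scan) to scan under a simple additive model.

    Assumption: both engines scan at the same rate and only Hive pays the
    fixed MapReduce overhead. Real engines differ in many other ways.
    """
    if scan_seconds <= 0:
        raise ValueError("scan_seconds must be positive")
    return (mr_overhead_seconds + scan_seconds) / scan_seconds


# Hypothetical inputs, chosen only to illustrate the two regimes.
small_scan = speedup(scan_seconds=5, mr_overhead_seconds=45)     # overhead dominates
big_scan = speedup(scan_seconds=3600, mr_overhead_seconds=45)    # overhead is marginal
```

With these made-up inputs the small scan shows a ratio of 10, while the large scan differs by roughly 1%, which mirrors the qualitative pattern reported for the two scan cases.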

What we are working on

● Support Parquet file format
    ○ https://github.com/facebook/presto/pull/1147
    ○ Parquet performs similarly to SequenceFile, but not as fast as RCFile
● ODBC/JDBC driver for Presto
    ○ Support MicroStrategy running on Presto



Some inconveniences ...

● Support Server Side “Use Schema”
    ○ Workaround: Client Side “Use Schema” or “Schema.Table”
● Recurse the partition directory
    ○ Different behavior from Hive
● Metadata caching
    ○ have to rerun the query a number of times to see the metadata change
● Extend JSON extract functions to allow . notation
    ○ json_extract_scalar(mapColumn, '$.namePart1.namePart2')
    ○ Workaround: regexp_extract
● WebUI running slow
    ○ load query task info on demand
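
To make the JSON item above concrete, the following Python sketch contrasts what a dotted path such as `'$.namePart1.namePart2'` is asking for with the regular-expression workaround. Both helpers (`extract_scalar`, `extract_with_regex`) are hypothetical illustrations written for this example; they are not Presto's `json_extract_scalar` or `regexp_extract`, and they cover only nested objects with string keys.

```python
import json
import re


def extract_scalar(json_text, path):
    """Follow a '$.a.b' style path through nested JSON objects.

    Returns the scalar as a string, or None if the path is missing or
    the value found is not a scalar. Hypothetical helper, not Presto code.
    """
    if not path.startswith("$."):
        raise ValueError("path must start with '$.'")
    value = json.loads(json_text)
    for key in path[2:].split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    if isinstance(value, (dict, list)) or value is None:
        return None
    return str(value)


def extract_with_regex(json_text, key):
    """Regex workaround: grab the first string value stored under `key`.

    Fragile by design: it ignores nesting and escaped quotes.
    """
    match = re.search(r'"%s"\s*:\s*"([^"]*)"' % re.escape(key), json_text)
    return match.group(1) if match else None


row = '{"namePart1": {"namePart2": "hello"}}'
```

The regex version finds the value without understanding the document structure, which is why it works as a stopgap but can pick up the wrong field when the same key appears at several nesting levels.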

Features we would like

● Big table join
● User Defined Functions
● Break down one column value into several tuples
    ○ In Hive: lateral view explode json_tuple
● Decimal type
● Scheduler
● Writes
    ○ Insert overwrite
    ○ Alter table add partition
    ○ Parallel writes from workers (not client only)
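
The "break down one column value into several tuples" request in the list above refers to the row-expanding behavior that Hive offers through lateral view explode. As a rough picture of the semantics only, the Python sketch below turns each row holding a list into one row per list element. The function name `explode` and the dictionary row representation are assumptions for illustration, not Hive or Presto code.

```python
def explode(rows, column):
    """Yield one output row per element of the list stored in `column`.

    Rows whose list is empty produce no output, similar to a plain
    (non-OUTER) lateral view in Hive.
    """
    for row in rows:
        for item in row[column]:
            out = dict(row)      # copy so the input rows are left untouched
            out[column] = item
            yield out


rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
]
exploded = list(explode(rows, "tags"))
```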

Q & A

Thank you!
