Netflix running Presto in the AWS Cloud
Zhenxiao Luo
Senior Software Engineer @ Netflix
Outline
● BigDataPlatform @ Netflix
● Use cases & requirements
● What we did
○ Reading/Writing from/to Amazon S3
○ Operations
○ Deployment
○ Performance
● What’s next?
BigDataPlatform @ Netflix
Use Cases
● Big Batch Jobs
○ high throughput, fault tolerant, ETL
○ data spills to disk
○ Hive on Tez, Pig on Tez
● Adhoc Queries
○ low latency, interactive, data exploration
○ in-memory, but limited data size
○ Impala, Redshift, Spark, Presto
Netflix Requirements
● SQL-like language
● Low latency for adhoc queries
● Works well on the AWS cloud
● Good integration with the Hadoop stack
● Scales to a 1000+ node cluster
● Open source with community support
What did Netflix do?
Reading/Writing to/from S3
● Option 1: Apache Hadoop NativeS3FileSystem
● Option 2: PrestoS3FileSystem
○ retry logic for read timeout
○ write directly to final S3 path
● Option 3: emrFileSystem
○ disable hadoop logging
○ disable hadoop FileSystem cache
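The "retry logic for read timeout" item can be illustrated with a minimal sketch. This is not the PrestoS3FileSystem code; it only shows the general pattern of retrying a read with exponential backoff. The names `ReadTimeout` and `read_with_retry`, and the default attempt count and delay, are hypothetical and chosen for illustration.

```python
import time


class ReadTimeout(Exception):
    """Hypothetical stand-in for a transient read-timeout error."""


def read_with_retry(read_fn, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call read_fn(); on ReadTimeout, back off exponentially and try again.

    The last failure is re-raised once max_attempts calls have been made.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_fn()
        except ReadTimeout:
            if attempt == max_attempts:
                raise
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a production version would likely also add jitter and cap the maximum delay.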
Bug Fixes
● https://github.com/facebook/presto/commit/cf0b2d66f4050fb1959c832809fa76e323d6d46e
● https://github.com/facebook/presto/commit/594b06c3e93a482dc162d2c49c9bd265795efb86
● https://github.com/facebook/presto/pull/1147
● https://github.com/facebook/presto/pull/1300
● https://github.com/facebook/presto/issues/1285
● https://github.com/facebook/presto/issues/1264
Our Operations Environment
● Launch script on top of EMR
● Ganglia integration
● Usage graphs - concurrent queries & tasks
Current Deployment
● Presto in production @ Netflix
● 100+ node Presto cluster
● 1000+ queries running per day
● Presto queries the same petabyte-scale S3 data warehouse as Hive and Pig
Observed Performance @ Netflix
● Data in Sequence File Format
● One MapReduce Job SmallTableScan
○ MapReduce overhead dominates the query execution time
○ Presto is always ~10X faster than Hive
● One MapReduce Job BigTableScan
○ MapReduce overhead is marginal compared with big table scan time
○ Presto performs similarly to Hive
● Multiple MapReduce Aggregation
○ Presto is always > 10X faster than Hive
● Joins
○ Presto is always > 2X faster than Hive
What we are working on
● Support Parquet File Format
○ https://github.com/facebook/presto/pull/1147
○ Parquet performs similarly to Sequence File, but not as fast as RCFile
● ODBC/JDBC driver for Presto
○ Support MicroStrategy running on Presto
Some inconveniences ...
● Support Server Side “Use Schema”
○ Workaround: Client Side “Use Schema” or “Schema.Table”
● Recurse the partition directory
○ Different behavior with Hive
● Metadata caching
○ have to rerun the query a number of times to see the metadata change
● Extend JSON extract functions to allow . notation
○ json_extract_scalar(mapColumn, '$.namePart1.namePart2')
○ Workaround: regexp_extract
● WebUI running slow
○ load query task info on demand
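To make the requested "." notation concrete, here is a minimal Python sketch of the lookup that a path such as '$.namePart1.namePart2' describes. It illustrates the intended semantics only and is not Presto's implementation; the function name `extract_scalar` is hypothetical, and array subscripts and quoted keys are deliberately not handled.

```python
import json


def extract_scalar(json_text, path):
    """Return the scalar found at a '$.a.b' style path.

    Returns None when a key is missing or the target is an object or array.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = json.loads(json_text)
    # Walk one object key per dot-separated path component.
    for key in filter(None, path[1:].split(".")):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    if isinstance(node, (dict, list)):
        return None
    return node
```

For example, `extract_scalar('{"namePart1": {"namePart2": "x"}}', '$.namePart1.namePart2')` walks two levels of nesting and yields the inner string, which is the lookup the regexp_extract workaround has to approximate with pattern matching.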
Features we would like
● Big table join
● User Defined Functions
● Break down one column value into several tuples
○ In Hive: lateral view explode json_tuple
● Decimal type
● Scheduler
● Writes
○ Insert overwrite
○ Alter table add partition
○ Parallel writes from workers (not client only)
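The "break down one column value into several tuples" request can be pictured with a small Python sketch of explode-style row expansion. It only mirrors the idea behind Hive's lateral view explode; the function name `explode` and the dict-per-row representation are assumptions made for illustration.

```python
def explode(rows, column):
    """Yield one output row per element of row[column], copying the other fields.

    Rows whose list is empty produce no output, as with a non-OUTER
    lateral view in Hive.
    """
    for row in rows:
        for item in row[column]:
            out = dict(row)      # shallow copy keeps the other columns
            out[column] = item   # replace the list with a single element
            yield out
```

Given a row with `id` 1 and `tags` holding two values, the generator emits two rows that share the same `id` and carry one tag each.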
Q & A
Thank you!