A Perfect Hybrid: Split Query Processing in Polybase
biaobiaoqi  [email protected]  2013/4/25
Outline
• Background
• Related Work
• PDW
• Polybase
• Performance Evaluation
Background
• Structured data & unstructured data
• RDBMS & Big Data
[Diagram: combining RDBMS and Hadoop data to gain insight]
Related Work
• Sqoop: transfers bulk data between Hadoop and structured data stores such as relational databases
• Teradata & Aster Data
• Greenplum & Vertica: external tables
• Oracle: external tables and OLH (Oracle Loader for Hadoop)
• IBM: a split mechanism that uses MapReduce to access the appliance
• Hadapt (HadoopDB): designed from the outset to support the execution of SQL-like queries across both unstructured and structured data sets
PDW Architecture
• Parallel Data Warehouse
• Shared-nothing system
Components in PDW
• Node
  o A SQL Server instance runs on each node
  o Data are hash-partitioned across the compute nodes
• Control node [runs the PDW Engine]
  o Query parsing
  o Optimization
  o Creating the distributed execution plan (DSQL) for the compute nodes
  o Tracking the execution steps of the plan on the compute nodes
• Compute node
  o Storage
  o Query processing
• DMS: Data Movement Service
  o (1) Repartitions rows of a table among the SQL Server instances on the PDW compute nodes
  o (2) Converts fields of rows being loaded into the appliance into the appropriate ODBC types
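The hash repartitioning that DMS performs can be sketched as follows. This is a minimal illustration, not PDW's actual implementation; the function and parameter names (`target_node`, `repartition`, `num_nodes`) are hypothetical.

```python
import hashlib

def target_node(key: str, num_nodes: int) -> int:
    # Stable hash so every DMS instance routes a given key to the same node.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

def repartition(rows, key_index, num_nodes):
    # Assign each row to a compute node by hashing its distribution column.
    partitions = {n: [] for n in range(num_nodes)}
    for row in rows:
        partitions[target_node(str(row[key_index]), num_nodes)].append(row)
    return partitions

rows = [(1, "alice"), (2, "bob"), (3, "carol"), (4, "dave")]
parts = repartition(rows, key_index=0, num_nodes=2)
# Every row lands on exactly one node.
assert sum(len(v) for v in parts.values()) == len(rows)
```

The key property is determinism: any DMS instance hashing the same key picks the same target node, so rows with equal join keys meet on one node.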
Overview of Polybase
• A new feature in PDW V2
• Uses standard SQL
• Handles both structured and unstructured data (in SQL Server and Hadoop)
• Split query processing paradigm
• Leverages the capabilities of SQL Server PDW, especially its cost-based parallel query optimizer and execution engine
Use case of Polybase
Assumptions in Polybase
• 1. Polybase makes no assumptions about where the HDFS data is stored
• 2. Nor about the OS of the data nodes
• 3. Nor about the format of the HDFS files (i.e. TextFile, RCFile, custom, …)
Core Components
• External tables
• HDFS Bridge in DMS
• Cost-based query optimizer (wrapping the one in V1)
External Table
• Create a cluster instance:
  CREATE HADOOP_CLUSTER GSL_CLUSTER
  WITH (namenode='hadoop-head', namenode_port=9000,
        jobtracker='hadoop-head', jobtracker_port=9010);
• Create a file format:
  CREATE HADOOP_FILEFORMAT TEXT_FORMAT
  WITH (INPUT_FORMAT='polybase.TextInputFormat',
        OUTPUT_FORMAT='polybase.TextOutputFormat',
        ROW_DELIMITER='\n', COLUMN_DELIMITER='|');
• Create the external table:
  CREATE EXTERNAL TABLE hdfsCustomer (
      c_custkey bigint not null,
      c_name varchar(25) not null,
      ……
      c_comment varchar(117) not null)
  WITH (LOCATION='/tpch1gb/customer.tbl',
        FORMAT_OPTIONS (EXTERNAL_CLUSTER=GSL_CLUSTER,
                        EXTERNAL_FILEFORMAT=TEXT_FORMAT));
HDFS Bridge
HDFS Bridge
• The HDFS Bridge is a component of DMS
• Goal: transfer data in parallel between the nodes of the Hadoop and PDW clusters
• HDFS shuffle phase (reading data from Hadoop):
  o 1. Communicate with the namenode to get file metadata
  o 2. Balance the number of bytes read by each DMS instance (based on the HDFS metadata and the DMS instance count)
  o 3. Invoke openRecordReader(); the RecordReader instance communicates directly with the datanodes
  o 4. Read the data and convert it into ODBC types (may be done in a MapReduce job)
  o 5. Apply a hash function to determine the target node for each record
• Writing to Hadoop is almost the same, invoking openRecordWriter()
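Step 2 of the shuffle, balancing bytes across DMS instances, can be sketched like this. This is an illustrative simplification with hypothetical names; the real bridge also considers HDFS block boundaries and locality.

```python
def balanced_splits(file_size: int, num_readers: int):
    # Divide a file's byte range into roughly equal (offset, length) splits,
    # one per DMS reader, so no instance reads much more than the others.
    base, extra = divmod(file_size, num_readers)
    splits, offset = [], 0
    for i in range(num_readers):
        length = base + (1 if i < extra else 0)  # spread the remainder
        splits.append((offset, length))
        offset += length
    return splits

splits = balanced_splits(file_size=1000, num_readers=3)
# → [(0, 334), (334, 333), (667, 333)]
```

Each split would then be handed to one RecordReader instance, which reads its byte range directly from the datanodes.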
Read Process
Optimizer & Compilation
• Parsing
  o Builds a Memo data structure of alternative serial plans
• Parallel optimization [in PDW V1]
  o A bottom-up optimizer inserts data movement operators into the serial plans
• Cost-based query optimizer [decides whether to push work to Hadoop]
  o Based on statistics, the relative sizes of the two clusters, and other factors
• Semantic compatibility
  o Data types
  o SQL semantics
  o Error handling
Statistics
• Define statistics on an external table:
  CREATE STATISTICS hdfsCustomerStats ON hdfsCustomer (c_custkey);
• Steps to obtain statistics on HDFS data:
  o 1. Read block-level sample data via DMS or map jobs
  o 2. Partition the samples across the compute nodes
  o 3. Each node calculates a histogram on its portion
  o 4. Merge all histograms and store the result in the database catalog
• An alternative implementation:
  o In Hadoop V2, let the Hadoop cluster calculate the histograms itself (at higher cost)
  o Makes the best use of the Hadoop cluster's computational resources
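The sample/histogram/merge pipeline above can be sketched as follows. This is a hedged illustration: the equi-width buckets and the `histogram`/`merge` helpers are hypothetical, not PDW's actual catalog format.

```python
def histogram(values, lo, hi, buckets):
    # Build an equi-width histogram over [lo, hi) for one node's sample.
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        idx = min(int((v - lo) / width), buckets - 1)  # clamp hi edge
        counts[idx] += 1
    return counts

def merge(histograms):
    # Merge per-node histograms bucket-by-bucket into a global histogram.
    return [sum(col) for col in zip(*histograms)]

node1 = histogram([1, 2, 8], lo=0, hi=10, buckets=2)   # [2, 1]
node2 = histogram([6, 7, 9], lo=0, hi=10, buckets=2)   # [0, 3]
global_hist = merge([node1, node2])                    # [2, 4]
```

Because the merge is a simple bucket-wise sum, each compute node can work on its portion independently, matching steps 3 and 4.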
Semantic Compatibility
• Data types
  o Java primitive types
  o Non-primitive types
  o Third-party types that can be implemented
  o Types that cannot be implemented in Java are marked [and can only be processed in PDW]
• SQL semantics
  o Expression evaluation is implemented in Java
  o Returning null, e.g. A + B becomes (A == null || B == null) ? null : (A + B)
  o Expressions that cannot be implemented in Java are marked [and can only be processed in PDW]
• Error handling
  o Exceptions that would be raised in SQL must also be thrown in Java
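The null-propagation rule for A + B shown above can be expressed compactly; here is a minimal sketch in Python, using None to stand in for SQL NULL (the function name `sql_add` is illustrative).

```python
def sql_add(a, b):
    # SQL three-valued semantics: if either operand is NULL, the sum is NULL,
    # mirroring the Java expression (A == null || B == null) ? null : (A + B).
    if a is None or b is None:
        return None
    return a + b

assert sql_add(3, 4) == 7
assert sql_add(None, 4) is None
```

Generated Java code must apply this check to every nullable operand so that results computed in Hadoop match what SQL Server would produce.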
Example
  SELECT count(*)
  FROM Customer
  WHERE acctbal < 0
  GROUP BY nationkey
Optimized Query Plan #1
Optimized Query Plan #2
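One way the pushed variant of this query's aggregation could be split, sketched under the assumption that the selection and a partial COUNT(*) run as a map-side job on Hadoop while PDW merges the partials (function names are hypothetical):

```python
from collections import Counter

def partial_counts(rows):
    # Hadoop-side step: apply the pushed selection (acctbal < 0) and compute
    # a partial COUNT(*) per nationkey over this task's rows.
    return Counter(nationkey for nationkey, acctbal in rows if acctbal < 0)

def final_merge(partials):
    # PDW-side step: sum the partial counts from all Hadoop tasks.
    total = Counter()
    for p in partials:
        total.update(p)
    return dict(total)

task1 = partial_counts([(1, -5.0), (1, 2.0), (2, -1.0)])
task2 = partial_counts([(1, -3.0), (2, 4.0)])
result = final_merge([task1, task2])   # {1: 2, 2: 1}
```

Pushing the selection and partial aggregation means only one small row per (task, nationkey) crosses the HDFS Bridge instead of every qualifying customer row.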
MapReduce Join
• Distributed hash join
  o Supports equi-joins
• Implementation:
  o Build side: the side with the smaller amount of data, materialized in HDFS
  o Probe side: the other side of the join
  o The build side is partitioned so each partition fits in memory, to speed up probing
  o The build side may also be replicated
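The build/probe phases described above can be sketched for a single partition; partitioning and replication across nodes are elided, and the helper name `hash_join` is illustrative.

```python
def hash_join(build_rows, probe_rows, build_key, probe_key):
    # Build phase: load the smaller side into an in-memory hash table
    # keyed on the join column.
    table = {}
    for row in build_rows:
        table.setdefault(row[build_key], []).append(row)
    # Probe phase: stream the larger side past the table, emitting
    # concatenated rows for each equi-join match.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append(match + row)
    return joined

build = [(1, "a"), (2, "b")]            # smaller (build) side
probe = [(10, 1), (20, 1), (30, 3)]     # larger (probe) side
out = hash_join(build, probe, build_key=0, probe_key=1)
# → [(1, 'a', 10, 1), (1, 'a', 20, 1)]
```

Keeping each build partition memory-resident is what makes the probe a constant-time hash lookup per row rather than a scan.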
Performance Evaluation
• Test configurations:
  o C-16/48: 16-node PDW cluster, 48-node Hadoop cluster
  o C-30/30: 30-node PDW cluster, 30-node Hadoop cluster
  o C-60: 60-node PDW cluster and 60-node Hadoop cluster
• Test database:
  o Two identical tables T1 and T2
    • 10 billion rows
    • 13 integer attributes and 3 string attributes (~200 bytes/row)
    • About 2 TB uncompressed
  o One copy of each table in HDFS
    • HDFS block size of 256 MB
    • Stored as a compressed RCFile
    • RCFiles store rows "column-wise" inside a block
  o One copy of each table in PDW
    • Block-wise compression enabled
Selection on HDFS Table
  SELECT u1, u2, u3, str1, str2, str4 FROM T1 (in HDFS) WHERE (u1 % 100) < sf
[Chart: execution time (secs.) vs. selectivity factor (%), breaking each bar into PDW, Import, and MR time for Polybase Phase 1 vs. Polybase Phase 2]
Crossover point: above a selectivity factor of ~80%, PB Phase 2 is slower.
Join HDFS Table with PDW Table
SELECT * from T1 (HDFS), T2 (PDW) where T1.u1 = T2.u2 and (T1.u2 % 100) < sf and (T2.u2 % 100) < 50
[Chart: execution time (secs.) vs. selectivity factor (%), breaking each bar into PDW, Import, and MR time for Polybase Phase 1 vs. Polybase Phase 2]
Join Two HDFS Tables
  SELECT * from T1 (HDFS), T2 (HDFS) where T1.u1 = T2.u2 and (T1.u2 % 100) < SF and (T2.u2 % 100) < 10
[Chart: execution time (secs.) vs. selectivity factor, breaking each bar into PDW, Import-Join, MR-Shuffle-J, MR-Shuffle, Import T2, Import T1, MR-Sel T2, and MR-Sel T1 time]
PB.1 – all operators on PDW
PB.2P – selections on T1 and T2 pushed to Hadoop; join performed on PDW
PB.2H – selections & join on Hadoop
Performance Wrap-up
• Split query processing really works!
• Up to 10X performance improvement!
• A cost-based optimizer is clearly required to decide when an operator should be pushed
• The optimizer must also incorporate relative cluster sizes in its decisions
References
• Split Query Processing in Polybase (SIGMOD '13, June 22-27, 2013, New York, USA), Microsoft Corporation
• Polybase: What, Why, How (slides), Microsoft Corporation
• Query Optimization in Microsoft SQL Server PDW (SIGMOD '12, May 20-24, 2012, Scottsdale, Arizona, USA), Microsoft Corporation
THANKS