Pivotal HAWQ Internals



This presentation was delivered at Hadoop Kitchen in Moscow on 27 September 2014. It describes the technology underlying Pivotal HAWQ.


Pivotal Confidential–Internal Use Only

Pivotal HAWQ

A.Grishchenko

HadoopKitchen @ Mail.ru, 27 Sep 2014

SQL-on-Hadoop Solutions

Hive (2008)

Developed by Facebook
– Hive is used for data analysis in their data warehouse
– DWH size is currently ~300 PB, with ~600 TB of data loaded daily. Data is compressed using ORCFile, compression ratio is ~8x

HiveQL language is not compatible with ANSI SQL-92

Has many limitations on subqueries

Cost-based optimizer (Optiq) is only in technical preview now

SQL-on-Hadoop Solutions

Impala (October 2012)

Developed by Cloudera
– Open-source solution
– Cloudera sells this solution to enterprise shops
– Was in beta until May 2013

Supports HiveQL, moving toward complete ANSI SQL-92 support

Written in C++, does not use Map-Reduce for running queries

Requires a lot of memory; joining big tables usually causes an OOM error

SQL-on-Hadoop Solutions

Stinger (February 2013)

Hortonworks initiative
– Consists of a number of steps to make Hive run 100x faster

Tez – makes Hive queries translate into Tez jobs, which are similar to Map-Reduce jobs but may have an arbitrary topology

Optiq – cost-based query optimizer for Hive (technical preview at the moment)

ORCFile – columnar storage format with adaptive compression and inline indexes

Hive-5317 – ACID and Update/Delete support (release expected around November 2014)
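The remark that Tez jobs "may have arbitrary topology" can be illustrated with a minimal Python sketch. This is not Tez code: the vertex names and the `run_order` helper are hypothetical, and the only point is the contrast between a fixed map-then-reduce chain and a general directed acyclic graph of processing steps.

```python
from graphlib import TopologicalSorter

# Hypothetical job graphs: each vertex maps to the set of vertices it depends on.
# A classic Map-Reduce job always has the same two-stage shape: map, then reduce.
mapreduce_job = {"reduce": {"map"}}

# A DAG-style job can join two scans and then aggregate and sort in one job,
# instead of chaining several separate Map-Reduce jobs.
dag_job = {
    "join": {"scan_orders", "scan_users"},
    "aggregate": {"join"},
    "sort": {"aggregate"},
}


def run_order(job):
    """Return one valid execution order for the vertices of a job graph."""
    return list(TopologicalSorter(job).static_order())
```

Calling `run_order(dag_job)` yields an order in which both scans come before the join. Which of the two scans comes first is not fixed, since they are independent of each other.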

SQL-on-Hadoop Solutions

HAWQ (February 2013)

Pivotal product
– Greenplum MPP DBMS, ported to store data in HDFS
– Written in C; the query optimizer was rewritten for this solution (ORCA)

Supports ANSI SQL-92 and analytic extensions from SQL-2003

Supports complex queries with correlated subqueries, window functions and different join types

Data is written to disk only if the process does not have enough memory
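The last point about HAWQ, that data goes to disk only when memory runs out, can be sketched as a buffer that spills lazily. This is an illustration only, not HAWQ's implementation: the `SpillBuffer` class, its row-count budget and the use of `pickle` with a temporary file are all assumptions made for the example.

```python
import pickle
import tempfile


class SpillBuffer:
    """Keeps rows in memory and spills them to a temp file once a row budget is exceeded."""

    def __init__(self, max_rows_in_memory):
        self.max_rows = max_rows_in_memory  # hypothetical memory budget, counted in rows
        self.rows = []
        self.spill_file = None              # created lazily, only if a spill is needed
        self.spilled = 0                    # number of rows written to disk so far

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) > self.max_rows:
            self._spill()

    def _spill(self):
        if self.spill_file is None:
            self.spill_file = tempfile.TemporaryFile()
        for row in self.rows:
            pickle.dump(row, self.spill_file)
        self.spilled += len(self.rows)
        self.rows = []

    def __iter__(self):
        # Spilled rows come first (in insertion order), then the rows still in memory.
        if self.spill_file is not None:
            self.spill_file.seek(0)
            for _ in range(self.spilled):
                yield pickle.load(self.spill_file)
            self.spill_file.seek(0, 2)  # return to end of file so later spills append
        yield from self.rows
```

A buffer that never exceeds its budget never creates a file, which mirrors the behaviour described on the slide: disk is touched only under memory pressure.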

SQL-on-Hadoop Solutions

Vertica, BigSQL (2014)

HP Vertica
– Supports only the MapR distribution, as it requires updatable storage
– Supports ANSI SQL-92, SQL-2003
– Supports UPDATE/DELETE
– Officially announced as available in July 2014, no implementations yet

IBM BigSQL v3
– IBM DB2 ported to store data in HDFS
– Federated queries, good query optimizer, etc.

Both solutions are similar to Pivotal HAWQ in general idea

Pivotal HAWQ Components

[Diagram: cluster layout. Server 1 hosts the Master and Server 2 the Standby Master. Segment servers follow: Server 3 with Segments 1 to K, Server 4 with Segments K+1 to 2*K, and so on up to Server M with Segment N.]

Pivotal HAWQ Components

[Diagram: HAWQ deployed alongside Hadoop services. Labels shown: HAWQ Master (Server 1), HAWQ Standby Master (Server 2), NameNode (Server 3), SNameNode (Server 4), three ZK/QJM nodes, and Servers 5, 6, ..., M, each running an HDFS Datanode together with a HAWQ Segment.]

Pivotal HAWQ Components

[Diagram: the HAWQ Master and the HAWQ Standby Master each contain a Query Parser, Query Optimizer, Query Executor, Transaction Manager, Process Manager and Metadata Catalog. The two are connected by WAL replication.]

Pivotal HAWQ Components

Metadata is stored only on master servers

Metadata is stored in a modified Postgres instance and replicated to the standby master with WAL

Metadata contains
– Table information: schema, names, files
– Statistics: number of unique values, value ranges, sample values, etc.
– Information about users, groups, priorities, etc.

A master server shutdown causes a switch to the standby, with the loss of running sessions

Pivotal HAWQ Components

[Diagram: a HAWQ Segment contains a Query Executor, libhdfs3 and PXF. It works with the HDFS Datanode, which holds the Segment Data Directory, and with the Local Filesystem (xfs), which holds the Spill Data Directory.]

Pivotal HAWQ Components

Both masters and segments are modified Postgres instances (to be precise, modified Greenplum instances)

When you open a connection to the master server, the postmaster forks a process that serves your session

When query execution starts, your session process connects to the segment instances, and they also fork processes to execute the query

The query execution plan is split into independent blocks (slices); each slice is executed as a separate OS process on the segment server, and data moves between them over UDP
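The splitting of a plan into slices can be sketched in a few lines of Python. The slide does not say where the cuts are made. The sketch assumes the Greenplum convention that a plan is cut at its Motion nodes (operators that move rows between processes). The `PlanNode` class, the operator names and the choice to keep each Motion in the receiving slice are illustrative assumptions, not HAWQ code.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    op: str
    children: list = field(default_factory=list)


def split_into_slices(root):
    """Cut the plan at every Motion node; return a list of slices (lists of operator names)."""
    slices = []

    def walk(node, current):
        current.append(node.op)
        for child in node.children:
            if node.op.endswith("Motion"):
                # Everything below a Motion runs in a different slice,
                # i.e. in separate OS processes that send rows upward.
                new_slice = []
                slices.append(new_slice)
                walk(child, new_slice)
            else:
                walk(child, current)

    top = []
    slices.append(top)
    walk(root, top)
    return slices


# Hypothetical plan for a join of two tables distributed on different keys.
plan = PlanNode("Gather Motion", [
    PlanNode("Hash Join", [
        PlanNode("Seq Scan orders"),
        PlanNode("Hash", [
            PlanNode("Redistribute Motion", [PlanNode("Seq Scan users")]),
        ]),
    ]),
])
```

For this plan `split_into_slices(plan)` returns three slices: the gather step at the top, the join with its local scan, and the scan that feeds the redistribution.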

Pivotal HAWQ Components

Tables can be stored as:
– Row-oriented (quicklz, zlib compression)
– Column-oriented (quicklz, zlib, rle compression)
– Parquet tables

Each segment has a separate directory on HDFS where it stores its data shard

In columnar storage, each column is stored as a separate file

Parquet stores the table by columns without loading the NameNode with many files / block location requests
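To see why the per-column files of columnar storage put pressure on the NameNode, a rough file count helps. The function below is a simplified model, not HAWQ behaviour: it assumes exactly one file per column per segment for column-oriented tables and one file per segment for row-oriented and Parquet tables. The function name and the example figures are made up for illustration.

```python
def hdfs_file_count(segments, columns, storage):
    """Estimate the number of HDFS files for one table under the simplified model above."""
    if storage == "column":
        # One file per column inside each segment's directory.
        return segments * columns
    if storage in ("row", "parquet"):
        # Assumption: a single file per segment holds all columns.
        return segments
    raise ValueError(f"unknown storage type: {storage}")
```

With a hypothetical cluster of 96 segments and a 200-column table, `hdfs_file_count(96, 200, "column")` gives 19200 files, while `hdfs_file_count(96, 200, "parquet")` gives 96.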

Query Execution in Pivotal HAWQ

[Animated diagram, slides 14-34: the HAWQ Master (Parser, Query Optimizer, Metadata, Transaction Manager, Process Manager, Query Executor) and the NameNode, plus two HAWQ Segments. Each segment has a Backend, a Local Spill Directory and an HDFS Datanode with the Segment Directory. In later frames a QE appears on each segment, then the labels S1, S2, S3 next to it.]

PXF Framework

Lets you read different data formats from HDFS
– Text files, both compressed and uncompressed
– Sequence files
– Avro files

Able to read data from external data sources
– HBase
– Cassandra
– Redis

Extensible API
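The extension points named in the PXF diagrams, the Fragmenter and the Accessor, can be modelled with a toy Python sketch. The real PXF API is Java and is not reproduced here. Every class, method and parameter name below is hypothetical, and the round-robin assignment of fragments to segments is an assumption made to keep the example small.

```python
from abc import ABC, abstractmethod


class Fragmenter(ABC):
    """Splits a data source into fragments that can be read independently."""

    @abstractmethod
    def get_fragments(self, source):
        ...


class Accessor(ABC):
    """Reads the records of a single fragment."""

    @abstractmethod
    def read_fragment(self, fragment):
        ...


class LineFragmenter(Fragmenter):
    def __init__(self, lines_per_fragment):
        self.lines_per_fragment = lines_per_fragment

    def get_fragments(self, source):
        n = self.lines_per_fragment
        return [source[i:i + n] for i in range(0, len(source), n)]


class CsvAccessor(Accessor):
    def read_fragment(self, fragment):
        for line in fragment:
            yield tuple(line.split(","))


def scan(source, fragmenter, accessor, num_segments):
    """Assign fragments to segments round-robin and return the rows each segment reads."""
    rows_by_segment = {s: [] for s in range(num_segments)}
    for i, fragment in enumerate(fragmenter.get_fragments(source)):
        rows_by_segment[i % num_segments].extend(accessor.read_fragment(fragment))
    return rows_by_segment
```

Supporting a new source in this model means writing one new `Fragmenter` and one new `Accessor`, which is the kind of extensibility the slide refers to.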

PXF Framework

[Animated diagram, slides 36-43: the NameNode; the HAWQ Master with a PXF Fragmenter and Process Manager; and two HAWQ Segments, each with a Query Executor, PXF Accessor, PXF Fragmenter, Local Spill Directory and an HDFS Datanode with the Segment Directory.]


Further Steps

Master server scaling – pool of master servers

New native data storage formats and new native compression algorithms

YARN as resource manager for HAWQ

Dynamic segment allocation / decommission


Questions?

