#TDPARTNERS16 GEORGIA WORLD CONGRESS CENTER

Aster. Now. Future. Why.

Michael McIntire, CTO Teradata Labs, Aster

Who is that McIntire guy anyway…

• Extreme Scale MPP platforms & complex data systems

• The “Seven year stretch…”
  • Seven years: Geographic Information Systems…
  • Seven years: Teradata in the 90’s – DB Architect…
  • Seven years: independent EDW consultant…
  • Seven years: eBay EDW Chief Architect

• And some between time... Yahoo, Sears, Prime, EDS...

Roles… Market…

Role One: Ideas – Business Process

• Algorithm and Knowledge Finding
• Business Process – Growth Objective
• Discover and Prove
• New: Classic and Citizen Data Scientists

• Speed of Ideation
• Dynamic Connectivity
• Broad analytic capability
• Scalable performance

[Figure: process cycle with the stages Design, Plan, Build, Test, Review, Drive, Evaluate]

Role Two: Operationalization – IT Process

• Given a Known Hypothesis…

• IT - “Fitting things together”
• Generalized Architecture
• Secure, Repeatable, predictable

• Platform for Production
• IT Process – Control for Cost
• Architecture Adherence
• Enterprise Features and Resiliency
• Fed by the Business process…

[Figure: Production Platform]

Platform view of the market…

• Expanding Compute Engines
  • Narrower Focus
  • More capable point solutions

• Diverging Storage and Compute
  • APIs are the Enabler

• Consolidating Storage Engines
  • Rise of “Good enough”

• Former Platforms as Engines...
  • “Pluggable” Infrastructures

[Figure: Compute layer over Storage layer (HDFS, Local, CEPH)]

Ecosystem is the Platform

Connecting Multiple Platforms at Runtime

• Aster Compute within Hadoop Ecosystem
  • First Class Citizen – YARN Resource mgmt
  • Native Read/Write HDFS Storage

• MPP Interoperability with External Engines
  • Native Aster and Spark integration

• Teradata + Aster + Hadoop

Accelerate Business Decision making with Platform Interoperability

Engine Workflow Integration

• Every Analytic Engine will have one
  • Just like visualization…
  • First Generation: AppCenter

• Workflows will Mix Paradigms
  • Set processing + Procedural

So What’s New:

• Command Language Implementation
  • Server Side implementation
  • Visual Tools Layered on Commands
  • Data “Cooking” tools already there…

• Tech issue: Exposing Logic inside a set statement across proc bounds.
  • Predicate Search for example
  • Optimization across loop/branch constructs

[Figure: example workflow drawn as a decision tree. It runs from Inputs through YES/NO and RIGHT/FALSE branches (“What kind of play?”, “Play Hard or Money?”, “What kind of Work?”) to an Outcome.]

Aster Next

Objective - Aster in the Cloud

• Aster 6.2x on Appliance
• Aster on Hadoop (AsterX 7.0)

• Aster 6.20 on AWS
• Aster 7.x on Cloud
  • Managed, Public

• Not likely to see:
  • Aster on Hadoop on the Cloud…

[Figure: Compute layer over Storage layer (HDFS, Local, CEPH)]

AsterX Evolution

• Aster Execution Engine
  • Aster is a Compute Engine
  • Spill to disk temp storage only

• Non-Persistent Storage
  • Access via Connectors
  • QueryGrid2

[Figure: Compute layer over Storage layer (HDFS, Local, CEPH)]

6.20 Worker – Aster Managed Storage

[Figure: Aster 6.20 worker node. Many vWorkers per node, each with Relational, Map Reduce, and Graph engines. Aster local storage (compression, replication) sits on OS-managed node-local storage. Cluster services run one per node. The user reaches the cluster through the Queen (Exec) on an edge node.]

6.50 Worker – 100% HDFS Storage

[Figure: Aster 6.50 on an HDFS cluster. The user reaches the Queen (Exec) on an edge node. Hadoop provides the NameNode, YARN, and the distributed file system (HDFS on HDP, CDH). Each Hadoop worker node runs m vWorkers with Relational, Map Reduce, and Graph engines. A cluster-wide storage interface (compression, replication, security, management) maps to cluster-wide storage at HDFS:/aster/vworkerX*Y. Cluster services run one per node.]

AsterX 7.0 Worker – Local Temp

[Figure: AsterX 7.0 on an HDFS cluster. The user reaches the Queen (Exec) on an edge node. Hadoop provides the NameNode and YARN. Each Hadoop worker node runs m vWorkers with Relational, Map Reduce, and Graph engines, and cluster services run one per node. Temp data lives in node-local storage at /aster/vworkerX*Y. Persistent data lives in a distributed persistent data system (Hive, Teradata, ...).]

Aster 7.x Architecture

AsterX 7.0 Cluster Architecture

• Internally: Daemon based implementation
  • Always on - not per job instantiation

• vWorker Deployment: Cluster Subset
  • vWorker count is Static per Instance
  • vWorkers can be moved
  • Expand / contract Hadoop Cluster without Aster intervention
  • Architected as a SUBSET

• Queen edge node (required)
  • Security and Connectivity (++ eliminate bridge)

• Libraries on all nodes
  • For Simplicity and Latency reasons

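[Figure: two Aster instances (A and B) on one Hadoop cluster. Each instance has its own edge node with a Queen, an application (App A, App B), and a user. On the worker nodes, Aster A and Aster B vWorkers run next to HDFS, Map/Reduce, and Hive. Hadoop services and the NameNode run on the head nodes.]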

YARN Managed Resource

• Full Hadoop Services Integration

• Injects Third Party Management

• Reverse order Worker setup/teardown

• Aster Cluster still manages “State”

• Consul implementation

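[Figure: YARN-managed startup, numbered steps 1 to 4. The YARN-managed edge node hosts the Aster Queen, the Aster Yarn Server, the Yarn Client, and the user. Hadoop services shown: YARN, Ambari, ZooKeeper. Each YARN-managed worker node hosts an Aster vWorker next to HDFS.]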

AsterX 7.0 – Consul State Management

• Consul: State management, configurations

  • Simple, always on key-value store
  • Similar to ZooKeeper (Dir/Key structure)

• Aster 7.0 use:
  • Common, resilient store port mapping
  • Future use dynamic mapping of ports
  • Dynamic worker movement…

• Consul is required
  • AX7.0 is a Private Implementation
  • Future use of existing Consul possible
  • If not available – Aster will not come up

[Figure: the user connects to the Queen; Aster workers each have temp storage; Consul runs alongside the cluster]
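To make the “Dir/Key structure” idea concrete, here is a minimal in-memory sketch of a directory-style key-value store holding port mappings. It is not the Consul API. The class `KeyValueStore`, the key layout under `aster/instanceA/`, and the port numbers are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List


class KeyValueStore:
    """In-memory stand-in for a directory/key style store (not the Consul API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]

    def list_prefix(self, prefix: str) -> List[str]:
        # Keys are slash-separated paths, so a prefix acts like a directory.
        return sorted(k for k in self._data if k.startswith(prefix))


store = KeyValueStore()
# Hypothetical key layout and port numbers, for illustration only.
store.put("aster/instanceA/queen/port", "30001")
store.put("aster/instanceA/vworker1/port", "30011")
store.put("aster/instanceA/vworker2/port", "30012")

# Look up every vWorker port entry for one instance.
vworker_keys = store.list_prefix("aster/instanceA/vworker")
```

Keeping the mapping in one shared store means a process that starts later can look up where its peers are listening instead of relying on fixed port numbers.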

AsterX 7.0 Cluster Configuration

• Subset of nodes: explicitly or system decides

• Exact # nodes will fit node capacity
  • i.e., if the nodes are powerful there will be fewer nodes used
  • Alternate maxusage yarn parm for temp/io heavy apps
  • Equivalent of “Prepared” state, still needs “activate”

• Port Configuration
  • Startup time: port conflicts can be resolved
  • Re-address when new Cluster SW is installed

• No add/remove node functionality
  • Stand up another cluster… Point to the data
  • No “data migration”...

[Figure: the user connects to the Queen; Aster workers each have temp storage]
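The sizing rule above (more powerful nodes means fewer nodes used) can be sketched as a simple ceiling division. This is only an illustration: the function `nodes_needed` and the example numbers are assumptions, and the real placement is negotiated with YARN.

```python
import math


def nodes_needed(total_vworkers: int, vworkers_per_node: int) -> int:
    """Number of nodes required to host a fixed vWorker count.

    vworkers_per_node is an assumed measure of node capacity.
    """
    if total_vworkers <= 0 or vworkers_per_node <= 0:
        raise ValueError("counts must be positive")
    return math.ceil(total_vworkers / vworkers_per_node)


# The same 24 vWorkers on small nodes versus powerful nodes.
small_nodes = nodes_needed(24, 2)
large_nodes = nodes_needed(24, 8)
```

With these assumed capacities the instance occupies 12 small nodes or 3 large ones, which is the sense in which the cluster is a subset of the Hadoop nodes.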

AsterX 7.0 Cluster Startup

• Install, startup - separate steps
  • Install libraries, basic directories

• Startup plumbs all connections
  • Setup vWorkers
  • Connect Queen and Workers

• Shutdown - cleans up the workers
  • Temp data removed
  • Reuse temp: Future Optimization Case

• All via Aster Yarn Client Commands
  • Equivalent to Aster “Activate”

[Figure: the user connects to the Queen; Aster workers each have temp storage; Consul runs alongside the cluster]

AsterX 7.0 Worker – Local Temp Only

• Persistence in HDFS
  • Hive, hCat…

• Access: Connectors + QueryGrid

• Read at script Start / Write at script End

• Objects managed by user
  • Same semantics as a Database
  • Persist for Duration of Cluster

• No Replication & Compression
  • Redistribution remains

[Figure: AsterX 7.0 with local temp storage only. The user reaches the Queen (Exec) on an edge node of the HDFS cluster, which also provides the NameNode and YARN. Each Hadoop node runs m vWorkers with SQL, M/R API, and Graph engines on AsterX node-local storage. Aster Connectors and Query Grid link the vWorkers to persistent data.]

Advanced Analytics Enabled by SQL (for Data/Business Analysts)

Once you know how to use one Aster SQL command, you have learned how to use them all!

CREATE TABLE complaints_nb_model (PARTITION KEY(token)) AS
SELECT token, SUM(crash) AS crash, SUM(no_crash) AS no_crash
FROM NaiveBayesText (
    ON complaints
    TEXT_COLUMN ('text_data')
    CATEGORY_COLUMN ('category')
    CATEGORIES ('crash', 'no_crash')
)
GROUP BY token;

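Callouts on the statement:
• ANSI SQL Statement: the outer CREATE TABLE … AS SELECT … GROUP BY
• SQL MR Statement: the NaiveBayesText ( … ) call
• Data Source: ON complaints
• SQL-MR Predicates: TEXT_COLUMN, CATEGORY_COLUMN, CATEGORIES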

AsterX 7.0 Storage

• Examples: Analytic Temp Tables and Hive Perm Tables

[Figure: AsterX storage layout. TEMP storage: AX workers under the Queen run SQL against local temp storage that spills to disk. Perm storage: distributed storage (Hive, hCat) holding a Hive table and the weblog, reached through foreign server read/write with SQL-H.]

AsterX Storage

• Before AX is running
  • Hive tables: Hive_t1, Hive_t2, Hive_t3
  • Flat Files: weblog.txt

[Figure: storage before AX is running. Perm storage holds HiveT1, HiveT2, HiveT3, and Weblog. TEMP storage is empty.]

AsterX Storage

• CTAS – analytic: Aster_analytic_t1
• CTAS – Temp: Aster_session_temp_t2

[Figure: storage after the two CTAS statements. TEMP storage now holds T1 and T2, labeled Session Lifetime and Uptime Lifetime. Perm storage still holds HiveT1, HiveT2, HiveT3, and Weblog.]

AsterX Storage

• SQL query – phase temp tables
  • Temp_phase_1 (the real name would be like _tmp_21398041237)
  • Temp_phase_2
  • Query_output

[Figure: storage while a query runs. TEMP storage holds T1 and T2 plus the phase temp tables and query output (labeled TP1, TP1, QO), marked Query Lifetime. Perm storage is unchanged.]

AsterX Storage

• CTAS – To Hive
  • Alan_dailyreport_06_24_2015

[Figure: storage after the CTAS to Hive. TEMP storage holds T1 and T2. Perm storage holds the Hive tables, Weblog, and the new AlanDR table.]

AsterX Storage

• After Aster shutdown

[Figure: storage after Aster shutdown. TEMP storage no longer shows T1 or T2. Perm storage still holds the Hive tables, Weblog, and AlanDR.]
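The walkthrough above distinguishes query, session, uptime, and permanent lifetimes. A minimal sketch of that model follows. The `Lifetime` enum, the `surviving` helper, and the mapping of each table to a lifetime are assumptions read off the slides (for example, that the analytic table has uptime lifetime and the session temp table has session lifetime), not an Aster API.

```python
from enum import IntEnum
from typing import Dict, List


class Lifetime(IntEnum):
    QUERY = 1      # phase temp tables and query output
    SESSION = 2    # session temp tables
    UPTIME = 3     # analytic tables, kept while the cluster is up
    PERMANENT = 4  # tables persisted in Hive


# Table names come from the slides; the lifetime assignment is an assumption.
tables: Dict[str, Lifetime] = {
    "_tmp_21398041237": Lifetime.QUERY,
    "Aster_session_temp_t2": Lifetime.SESSION,
    "Aster_analytic_t1": Lifetime.UPTIME,
    "Hive_t1": Lifetime.PERMANENT,
    "Alan_dailyreport_06_24_2015": Lifetime.PERMANENT,
}


def surviving(all_tables: Dict[str, Lifetime], ended: Lifetime) -> List[str]:
    """Tables that remain after a scope of the given lifetime has ended."""
    return sorted(name for name, life in all_tables.items() if life > ended)


after_query = surviving(tables, Lifetime.QUERY)
after_session = surviving(tables, Lifetime.SESSION)
after_shutdown = surviving(tables, Lifetime.UPTIME)
```

Under these assumptions only the Hive-resident tables remain after shutdown, which matches the last diagram in the sequence.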

AsterX Failure & Recovery

Metadata persistence

• Admin DDL Checkpoint
  • Saves checkpoint file to disk
  • Manually done via ncli command
  • Restart causes checkpoint to be replayed
  • Checkpoint files are valid on any AsterX instance*

• Checkpointed
  • Users, Roles, Databases, Schemas
  • Foreign server definitions
  • Packaged analytics models and functions
  • Grant privileges on above

• Not Checkpointed
  • Tables, views, constraints, indexes
  • R scripts installed on the server side
  • User-installed files and SQL/MR functions
  • User scripts for vacuum or daily jobs

AsterX 7.0 Failure Recovery

• Node Failure = loss of analytic tables

• Worker Node and/or vWorker
  • System will allocate new node
  • Conversation with YARN…
  • Move vWorkers/Node
  • Come to prepared state
  • Activate – automatically

• ALL TEMP Data is LOST.
  • A vWorker failure is treated as a node failure

[Figure: the cluster before the failure and after restart. Both views show the edge node with Queen and user, the Aster workers, and their temp storage.]

AsterX 7.0 Failure Recovery

• Queen Fails
  • Recovery is “Repair”
  • DDL Gen Unwind…

• If unrecoverable
  • Delete cluster
  • cluster create…

[Figure: two Aster instances (A and B) on one Hadoop cluster, each with its own edge node, Queen, application, and user; worker nodes host Aster vWorkers next to HDFS, Map/Reduce, and Hive; head nodes host Hadoop services and the NameNode]

AsterX 7.0 Failure Recovery

• Other issues
  • Same behaviors, different impact

• DDL Gen – SQL Script of the dictionary.

• State is lost – Temp data will be deleted on restart

[Figure: two Aster instances (A and B) on one Hadoop cluster, each with its own edge node, Queen, application, and user; worker nodes host Aster vWorkers next to HDFS, Map/Reduce, and Hive; head nodes host Hadoop services and the NameNode]

AsterX 7.0 Expansion

• New Instance. Got that?

• Create New Aster Instance
• Setup Foreign Server Constructs…
• Go …
• Reference Existing Persistent Data

[Figure: two Aster instances (A and B) on one Hadoop cluster, each with its own edge node, Queen, application, and user; worker nodes host Aster vWorkers next to HDFS, Map/Reduce, and Hive; head nodes host Hadoop services and the NameNode]

AsterX 7.x Configuration Options…

Many, many more options

• Single cluster per workload… • Or... Xmas sized Cluster...

• Monthly Term licensing???

• Internal HDFS Chargeback?

• LOB specific Aster Instance
  • Delegation of administration...
  • Simplified CapEx / OpEx administration

[Figure: two Aster instances (A and B) on one Hadoop cluster, each with its own edge node, Queen, application, and user; worker nodes host Aster vWorkers next to HDFS, Map/Reduce, and Hive; head nodes host Hadoop services and the NameNode]

Aster Persistent Storage & Access - Query Grid Two

Aster 7.10 QueryGrid Two - Next Gen

• High speed TD, Presto, Hadoop connectivity
  • Cluster to Cluster connectivity
  • Point to Point model – not hub (Kafka is a Hub)

• Common Framework included in each product
  • Communications, State, Error Management, Data Conversion
  • Network Protocol, Parallelism, Distribution … and more

• Single cost implementation
  • Simple set of Get/Set operations specific to the implementation

• Uses full matrix communications in first release
  • Blocks of Tuples are distributed round robin
  • Full Communication Matrix
  • Session Data is MultiPlexed
  • Multiple sessions use same communications channel

[Figure: links between Aster nodes and TD nodes, shown for the current connectors and for QG2]
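A minimal sketch of the two ideas named above: a full communication matrix (every sender node linked to every receiver node) and round-robin distribution of tuple blocks. The function names and node names are hypothetical; this illustrates the general technique and is not the QueryGrid implementation.

```python
from typing import Dict, List, Sequence, Tuple


def full_matrix(senders: Sequence[str], receivers: Sequence[str]) -> List[Tuple[str, str]]:
    """Every sender is linked to every receiver (point to point, no hub)."""
    return [(s, r) for s in senders for r in receivers]


def distribute_round_robin(blocks: Sequence[str], receivers: Sequence[str]) -> Dict[str, List[str]]:
    """Assign blocks of tuples to receivers in rotating order."""
    if not receivers:
        raise ValueError("at least one receiver is required")
    assignment: Dict[str, List[str]] = {r: [] for r in receivers}
    for i, block in enumerate(blocks):
        assignment[receivers[i % len(receivers)]].append(block)
    return assignment


# Hypothetical node names: two Aster nodes sending to three Teradata nodes.
senders = ["aster-1", "aster-2"]
receivers = ["td-1", "td-2", "td-3"]

links = full_matrix(senders, receivers)
assignment = distribute_round_robin([f"block-{i}" for i in range(7)], receivers)
```

Round robin spreads blocks evenly without inspecting their contents, so it balances transfer volume but does not co-locate rows by key.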

Aster - Foreign Server Syntax Support

• DML syntax - external objects
  • Teradata’s Foreign Server Syntax

• Aster & source: Bi-directional data movement
  • Load_from_Hcatalog, Teradata, etc
  • Load_to_Hcatalog, Teradata, etc

• Use: SEL, INS, Views and CTAS

• Query pushdown, Query time special & override of parameters also supported

• Grant & Revoke USAGE & EXECUTE privileges

CREATE FOREIGN SERVER name
USING server('1.1.1.1') port('1234') …
DO IMPORT WITH Load_from_XYZ USING …
DO EXPORT WITH Load_to_XYZ USING …

SELECT * FROM table@foreignServer;

INSERT INTO table@foreignServer
SELECT id, value FROM asterTable;

WITH FOREIGN SERVER fsAlias as (foreignServer using username('foo') password('bar'))
SELECT * FROM table1@fsAlias, table2@fsAlias…;

AsterX 7.0 – Scripting Pattern Changes

• Existing Customers - Implementing persistence in AX

• Best practices in script writing
  • Disable Failure mode until after DDL commands
  • Truncate Tables (delete from all)
  • Create Tables inline (keeps code in one place, enables operator to drop table and not have to change production code)
  • Cascading Insert/Selects/CTAS
  • Pour over tables for failure/locking latency

• Option of creating a cluster sized just for this workload

Aster 7.x Other Cool Stuff

AsterX 7.0 Planner Changes

• Improve plans for external table (ET) queries
  • External tables are the norm in AX (exception in AD)

• 7.0 Planner uses the Hive Meta-store to get table size

[Figure: planner view of the external tables Sales, Store, and Region. AD 6.50 planner view of ETs: all three sizes are unknown (?). AX 7.0 planner view of ETs: Sales 3,000,000,000, Store 6,000, Region 4.]

AsterX 7.0 Planner Changes

• Planner uses the Hive Meta-store to get base table stats
  • Only table rowcount and size. No columns/histograms

• Recognize small ETs and replicate as dim tables
  • Save on costly data repartitioning

• Improve join order optimization
  • Avoid early theta joins and dataflow multipliers

• Better skew avoidance
  • Avoid partitioning on low cardinality columns
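A minimal sketch of the “replicate small ETs as dim tables” decision, using the sizes from the planner-view slide (read here as row counts, which is an assumption). The threshold `REPLICATE_MAX_ROWS` and the function `placement` are hypothetical; the actual planner heuristics are not described in the deck.

```python
from typing import Dict

# Assumed cutoff: tables at or below this size are cheap to copy to every worker.
REPLICATE_MAX_ROWS = 1_000_000


def placement(row_counts: Dict[str, int], max_rows: int = REPLICATE_MAX_ROWS) -> Dict[str, str]:
    """Choose, per external table, between replicating it and repartitioning it."""
    return {
        name: "replicate" if rows <= max_rows else "repartition"
        for name, rows in row_counts.items()
    }


# Sizes shown in the AX 7.0 planner view of ETs.
row_counts = {"Sales": 3_000_000_000, "Store": 6_000, "Region": 4}
plan = placement(row_counts)
```

Without table sizes every table looks the same to the planner, so it cannot tell that copying Store and Region to each worker avoids moving the Sales rows.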

AsterX 7.0 Multi-Tenancy

• Hadoop is an Execution Environment
  • Aster must conform to Hadoop’s capabilities
  • Hadoop supports “Sessions” and Aster supports “Sessions”…
  • Ergo – how does Aster run inside Hadoop...
  • Aster is a Daemon based architecture...

• Multi-Tenancy in AX7.0
  • Co-exist with other Hadoop Applications
  • Port Mapping is the largest single problem

[Figure: two Aster instances (A and B) on one Hadoop cluster, each with its own edge node, Queen, application, and user; worker nodes host Aster A and Aster B vWorkers next to HDFS, Map/Reduce, and Hive; head nodes host Hadoop services and the NameNode]

Thank You

Questions/Comments
Email: michael.mcintire@teradata.com

Follow Me
Twitter @DataOcean

Rate This Session # 598 with the PARTNERS Mobile App

Remember To Share Your Virtual Passes

Aster 7.10 – What’s NEXT

(where’s the cool graphic???)

Aster 7.10 Containers (using Docker)

• Objective:
  • Architecture using Containers (what processes go where) - Hadoop
  • Major Impact to Startup, Distribution, Process Management

• Reality:
  • Extraordinarily difficult on Hadoop
  • Required complete rewrite of all process management

• Theory issue problems
  • Process Allocation and management
  • Foundation of AsterX on other platforms - GCP, AWS, Azure...
  • Open decision on long term Hadoop implementation

Aster 7.10 Planner Improvements

• Pushdown predicate to Hive
  • Automatic in 7.10 (manual in 7.0)
  • Reduces data movement
  • Utilizes store format filters
  • Cuts down IO & CPU

• Increase dependence on Stats

[Figure: Foreign System Get/Set. A Scan followed by a separate Filter, compared with a combined Scan + Filter.]
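A minimal sketch of why predicate pushdown reduces data movement: filtering at the source moves only matching rows, while filtering after the scan moves every row first. The data set, the predicate, and both function names are invented for illustration and do not reflect how Aster or Hive implement this.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]

# Hypothetical source table: 1000 rows, one quarter of them in region "east".
source: List[Row] = [
    {"id": i, "region": "east" if i % 4 == 0 else "west"} for i in range(1000)
]


def scan_then_filter(rows: List[Row], predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """No pushdown: every row is moved, then filtered on the receiving side."""
    moved = list(rows)
    return [r for r in moved if predicate(r)], len(moved)


def scan_with_pushdown(rows: List[Row], predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
    """Pushdown: the source applies the filter, so only matching rows move."""
    moved = [r for r in rows if predicate(r)]
    return moved, len(moved)


def is_east(row: Row) -> bool:
    return row["region"] == "east"


result_a, moved_a = scan_then_filter(source, is_east)
result_b, moved_b = scan_with_pushdown(source, is_east)
```

Both paths return the same rows; only the number of rows crossing between systems differs, which is where the IO and CPU savings come from.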

Aster 7.10 Planner Improvements

• Planner pushdown sub-queries to Hive
  • Intelligent push down (semantics, type, size)
  • Minimal data movement
  • Better stats & data distribution utilization

[Figure: two versions of a query plan over external tables ET1, ET2, ET3 and table T4, each with a GB step and an NPath step]

Aster 7.10 Aster Spark

• Utilize Spark execution framework for Aster
• Aster query operator as Spark functions / scripts
• Uses Spark MLlib analytics libraries
• Customers write functions in Spark using…

• Uses familiar SQL/MR language framework
• Support multiple Spark clusters (ex: same query)
• Parallel data transfer (Sockets or HDFS)
• Spark Job Monitoring

Aster 7.10 Spark Aster

• Aster table/queries use Spark Data frame API
• Read Aster tables/queries in parallel

• Can cache data on disk
sqlContext.readAsterTable("<table-name>", <cache-on-disk>, …)
sqlContext.readAsterUsingQuery("<query>", <cache-on-disk>, …)

• Write Data frames to Tables in parallel (overwrite / append mode)
<dataframe>.writeToAsterTable("table-name", <mode>, …)

Existing Framework (Analytic flow)

[Figure: batch-mode processing inside the Aster framework. Training Data feeds Analytics (Model Builder), which produces a Model. Queries (Test Data) and Prediction Requests feed Analytics (Predictor), which uses the Model to produce a Score, a Prediction Response, and an Appropriate Action.]

RealTime Platform - Asynchronous feedback between the two frameworks.

Aster Platform - Prediction analytics isolated from training (modeling)

Proposal: Split the processing… Real-time Scoring

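[Figure: real-time scoring split out from model building. Training Data feeds Analytics (Model Builder), which produces a Model. Analytics (Predictor) takes Queries (Test Data) and Prediction Requests and returns a Score, a Prediction Response, and an Appropriate Action.]

Generates AML File from Model Table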

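Aster Model Language Generator

[Figure: real-time scoring with the AML Generator. Analytics (Model Builder) turns Training Data into a Model. The AML Generator, called through a driver function, writes an AML File that is handed to your real-time framework. Analytics (Predictor) takes Queries (Test Data) and Prediction Requests and returns a Score, a Prediction Response, and an Appropriate Action.]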

Scorer Execution Flow

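[Figure: scorer execution flow. The scorer is transported as a Java JAR file. A Configurator supplies Model Type, Model Definition, Model Data, Request Parameters, and Request Definition. Prediction Requests arrive as a Request; the Score is returned as a Response, leading to a Prediction Response and an Appropriate Action.]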
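A minimal sketch of the split shown above: a model is exported once as a portable definition, then scored per request outside the training system. JSON is used here as a stand-in and is not the AML format. The logistic model, the feature names, and the functions `export_model` and `score` are hypothetical.

```python
import json
import math
from typing import Dict


def export_model(weights: Dict[str, float], bias: float) -> str:
    """Serialize a trained model to a portable text definition (JSON stand-in)."""
    definition = {
        "model_type": "logistic_regression",
        "model_data": {"weights": weights, "bias": bias},
    }
    return json.dumps(definition, sort_keys=True)


def score(definition: str, request: Dict[str, float]) -> float:
    """Score one prediction request using only the exported definition."""
    data = json.loads(definition)["model_data"]
    z = data["bias"] + sum(
        weight * request.get(name, 0.0) for name, weight in data["weights"].items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Batch side: export the model. Real-time side: score a single request.
definition = export_model({"clicks": 0.5, "visits": -0.25}, bias=0.0)
prediction = score(definition, {"clicks": 2.0, "visits": 4.0})
```

Because the scorer needs only the exported definition, it can run inside a separate real-time framework while model building stays in batch mode.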