Grab some coffee and enjoy the pre-show banter before the top of the hour!

Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


DESCRIPTION

The Briefing Room with Dr. Robin Bloor and Teradata. Live Webcast on May 20, 2014. Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=f09e84f88e4ca6e0a9179c9a9e930b82

Traditional data warehouses have been the backbone of corporate decision making for over three decades. With the emergence of Big Data and popular technologies like open-source Apache™ Hadoop®, some analysts question the lifespan of the data warehouse and the future role it will play in enterprise information management. But it is not practical to expect emerging technologies to replace existing technologies and corporate investments in data management wholesale. A better approach is for new innovations and technologies to complement and build upon existing solutions.

Register for this episode of The Briefing Room to hear veteran analyst Dr. Robin Bloor explain where tomorrow’s data warehouse fits in the information landscape. He’ll be briefed by Imad Birouty of Teradata, who will highlight the ways in which his company is evolving to meet the challenges presented by different types of data and applications. He will also tout Teradata’s recently announced Teradata® Database 15 and Teradata® QueryGrid™, an analytics platform that enables data processing across the enterprise. Visit InsideAnalysis.com for more information.


Page 1: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

Grab some coffee and enjoy the pre-show banter before the top of the hour!

Page 2: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

The Briefing Room

Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

Page 3: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

Twitter Tag: #briefr


Welcome

Host: Eric Kavanagh

@eric_kavanagh

Page 4: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Mission

• Reveal the essential characteristics of enterprise software, good and bad
• Provide a forum for detailed analysis of today’s innovative technologies
• Give vendors a chance to explain their product to savvy analysts
• Allow audience members to pose serious questions... and get answers!

Page 5: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Topics

This Month: DATABASE

June: ANALYTICS & MACHINE LEARNING

July: INNOVATIVE TECHNOLOGY

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

Page 6: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Database

Page 7: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group

@robinbloor

Page 8: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Teradata

• Teradata is known for its analytics data solutions with a focus on integrated data warehousing, big data analytics and business applications
• It offers a broad suite of technology platforms and solutions and a wide range of data management applications
• Teradata recently announced Database 15 and QueryGrid, an analytics platform that enables data processing across the enterprise

Page 9: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Guest: Imad Birouty

Imad Birouty holds the position of Manager of Teradata Product Marketing and is responsible for Teradata software and hardware products, including the Teradata Database, Teradata Platform Family, Teradata Unity, Tools and Utilities, and In-Database Analytics. Prior to this, Imad led the Product Management team responsible for the NCR/Teradata Platforms and was responsible for setting product strategy and direction.

Page 10: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

Imad Birouty Teradata

NOT YOUR FATHER’S DATABASE: BREAKING TRADITION WITH NEW INNOVATIONS

Page 11: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

11 5/19/14 Copyright Teradata

THE CHANGING DATABASE

• From Structured Data To Structured and Multi-Structured Data (XML, JSON, Weblogs)
• From SQL To SQL, Java, Perl, Ruby, Python, and R
• From Business Users To Business Users and Developers
• From Disk Data Storage To Disk Data Storage with Solid State Drives and in-memory
• From Query Single Database To Query Multi-Databases and Sources
• From Reporting/Ad Hoc Queries To Reporting/Ad Hoc Queries and 1,000s of In-database Analytics Algorithms
• From Row Data Storage To Hybrid Row/Column Data Storage

Page 12: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


JSON RUNS THE INTERNET OF THINGS – AND TERADATA RUNS JSON

Page 13: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Multi-structured Data → Data Warehouse

[Diagram: JSON, weblogs, and XML data flowing into the Teradata Data Warehouse]

Sample weblog record:

41521390 2013-01-01 00:25:42 2.111.94.18 Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4 "http://www.cokstate.edu/welcome/" "https://www.google.com/#sclient=psyab&hl=en&source=hp&q=oklahoma+state&pbx=1&oq”

Page 14: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Serving Two Perspectives

Business User

• New Data Elements
• New Data Sources
• Dynamic Data Sources
• Rapid Turnaround
• Agile Change
• Independence, Autonomy

IT Professional

• Consistency
• Stability
• IT Processes
• Governance and Security
• Test Cycles
• Smooth Application Interaction

Page 15: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Early and Late Binding in SQL

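[Diagram: two paths from source data to BI tools]

Early binding (load time): Source data → ETL → Schema in the Data Warehouse → BI tools

Late binding (runtime): Weblogs stored as CLOB in the Data Warehouse → SQL + parse/extract functions → BI tools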
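The late-binding path can be pictured with a minimal Python sketch. It assumes weblog lines are stored unparsed, as they would be in a CLOB column, and that fields are extracted only when a query runs. The record layout, the regular expression, and the `parse_line` and `hits_by_ip` helpers are illustrative assumptions, not Teradata functions, and the sample lines are made-up records loosely modeled on the weblog shown earlier.

```python
import re
from collections import Counter

# Raw weblog lines kept as-is at load time (late binding: no schema applied yet).
RAW_WEBLOGS = [
    "41521390 2013-01-01 00:25:42 2.111.94.18 Mozilla/5.0 (Macintosh) Safari/533.19.4",
    "41521391 2013-01-01 00:26:10 2.111.94.19 Mozilla/5.0 (Windows NT 6.1) Chrome/33.0",
    "41521392 2013-01-01 00:27:55 2.111.94.18 Mozilla/5.0 (Macintosh) Safari/533.19.4",
]

# Hypothetical layout: id, date, time, client IP, then the user-agent string.
LINE_PATTERN = re.compile(
    r"^(?P<id>\d+) (?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+) (?P<agent>.+)$"
)


def parse_line(line):
    """Extract named fields from one raw line, or return None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


def hits_by_ip(raw_lines):
    """Parse at query time and count requests per client IP."""
    counts = Counter()
    for line in raw_lines:
        row = parse_line(line)
        if row is not None:
            counts[row["ip"]] += 1
    return dict(counts)
```

The trade-off is visible in the sketch: loading is trivial because nothing is parsed, but every query pays the parsing cost, and malformed lines surface only at read time.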

Page 16: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Choice: Right Approach In Any Environment

Schema on Write

• Well understood data
• Relational integrity
• Storage efficiency

Schema on Read

• Dynamic data
• Reduced coordination
• Human readable

Teradata 15 now offers both
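As a rough illustration of the two approaches, the Python sketch below loads the same JSON documents both ways. The `SCHEMA` tuple, the helper names, and the sample documents are assumptions made for the example; they do not represent Teradata syntax.

```python
import json

# Schema on write: the columns are fixed before loading.
SCHEMA = ("device_id", "temperature")


def load_with_schema(documents):
    """Apply the schema at load time and keep only the declared columns."""
    return [tuple(json.loads(doc).get(col) for col in SCHEMA) for doc in documents]


def load_raw(documents):
    """Schema on read: store each document unchanged."""
    return list(documents)


def read_field(raw_rows, field):
    """Interpret the stored text only when a query asks for a field."""
    return [json.loads(doc).get(field) for doc in raw_rows]


docs = [
    '{"device_id": "a1", "temperature": 21.5}',
    '{"device_id": "b2", "temperature": 19.0, "humidity": 40}',  # a new field appears
]
```

In this sketch the schema-on-write path silently drops `humidity` until someone changes `SCHEMA` and reloads, while the schema-on-read path can return it immediately with `read_field(load_raw(docs), "humidity")`, at the cost of parsing on every read.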

Page 17: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Teradata table operators support C++, Java, Perl, Ruby – and Python.

ALL THE COOL KIDS CODE IN

WELCOME TO THE COOLEST ANALYTIC DBMS, DUDES.

Page 18: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Application Developers and BI SQL Programmers

Application Developers

• Flow logic control focus
• Procedural and script languages
• Data retrieved for use in application processing
• Work within IDE
• Object and custom build orientation

BI SQL Programmers

• Set-based data processing focus
• SQL language
• Data retrieval, processing, and presentation is the application
• Standalone or template-based query development
• RDBMS orientation

Page 19: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Choice of Languages

Run in-database, in parallel:

• Perl
• Python
• Ruby
• R
• Shell Scripts
• C/C++
• Java

• Embed parts of application logic in database
  > Separate presentation and processing layers
  > Eliminate “round trips”
  > Automatically process data in parallel
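One way to picture script logic running next to the data is a function that reads rows from an input stream and writes rows to an output stream, with one copy running per slice of the data. The Python sketch below assumes that stream-in/stream-out contract, a tab-delimited row layout, and a thread pool standing in for the database's parallel workers. All names, the threshold, and the sample rows are hypothetical.

```python
import io
from concurrent.futures import ThreadPoolExecutor


def score_rows(in_stream, out_stream):
    """Read tab-delimited (customer_id, amount) rows and write (customer_id, tier)."""
    for line in in_stream:
        customer_id, amount = line.rstrip("\n").split("\t")
        tier = "high" if float(amount) >= 1000.0 else "standard"
        out_stream.write(f"{customer_id}\t{tier}\n")


def run_partition(text):
    """Stand-in for one parallel worker handling its own slice of the data."""
    out = io.StringIO()
    score_rows(io.StringIO(text), out)
    return out.getvalue()


def run_parallel(partitions):
    """Run the same script logic over every partition; results keep partition order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_partition, partitions))


partitions = ["c1\t250.00\nc2\t1800.00\n", "c3\t1000.00\n"]
```

Because `score_rows` only sees streams, the same logic works for one partition or many, and only the scored rows leave the worker, which is the sense in which application "round trips" are avoided.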

Page 20: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


In-memory performance, spinning disk prices. ANY QUESTIONS?

IN A RECENT PETABYTE-SCALE BENCHMARK ON TERADATA TECHNOLOGY, 95% OF I/Os WERE SERVED FROM MEMORY

Page 21: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Data Temperature Profile – Typical DW

Improves Query Performance

• 43% of disk I/O against 1% of data
• Hottest data in memory/not all the data
• Integrated into Teradata system
• No need for separate appliance

Performance of in-memory databases without their cost

Page 22: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Leveraging Extended Memory Space

• Sophisticated algorithms to track usage, measure temperature, and rank data
• Complements FSG cache
• Dynamically adjusts to new query patterns

Intelligent Memory (most frequently used data): Hottest data placed and maintained in memory, aged out as it cools

FSG Cache (most recently used data): Temporarily stores data required for current queries, purges least recently used

[Diagram: data moves in as it becomes very hot and out as it cools]
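The difference between a most-recently-used cache and a most-frequently-used hot set can be sketched in a few lines of Python. This is a toy model, not Teradata's algorithm: the `LRUCache` and `HotSet` classes, the unit temperature increment, and the decay factor are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Recency-based cache: keeps the blocks touched most recently."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def touch(self, block_id):
        """Record an access; return True on a hit, False on a miss."""
        hit = block_id in self.blocks
        if hit:
            self.blocks.move_to_end(block_id)
        else:
            self.blocks[block_id] = True
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # purge least recently used
        return hit


class HotSet:
    """Frequency-based ranking: keeps the blocks with the highest temperature."""

    def __init__(self, capacity, decay=0.5):
        self.capacity = capacity
        self.decay = decay
        self.temperature = {}

    def record(self, block_id):
        """Each access warms the block by one unit."""
        self.temperature[block_id] = self.temperature.get(block_id, 0.0) + 1.0

    def cool(self):
        """Age every block so data that stops being read drifts out of the hot set."""
        for block_id in self.temperature:
            self.temperature[block_id] *= self.decay

    def hot_blocks(self):
        """Return the ids of the hottest blocks, up to capacity."""
        ranked = sorted(self.temperature, key=lambda b: (-self.temperature[b], b))
        return set(ranked[: self.capacity])
```

Calling `cool()` periodically is what lets the hot set follow new query patterns: a block that was read heavily last week loses temperature and is eventually outranked by blocks that are being read now.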

Page 23: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


1+ Petabyte Benchmark – Impact of TIM (Teradata Intelligent Memory)

• TIM set to 50% of total system memory
• TIM showed a cache effectiveness of 95%
• 60% reduction in total physical I/O

Clear reduction in physical I/O… but max CPU @ 100% for both benchmarks

Page 24: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Page 25: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Teradata QueryGrid™

Optimize, simplify, and orchestrate processing across and beyond the Teradata UDA

• Run the right analytic on the right platform
  > Take advantage of specialized processing engines while operating as a cohesive analytic environment
• Automated and optimized work distribution through “push-down” processing across platforms
  > Minimize data movement, process data where it resides
  > Minimize data duplication
  > Transparently automate analytic processing and data movement between systems
  > Bi-directional data movement
• Integrated processing, within and outside the UDA
• Easy access to data and analytics through existing SQL skills and tools

Page 26: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Teradata QueryGrid™

[Diagram: the Teradata Database (IDW) and the Teradata Aster Database (Discovery) connected through QueryGrid to:]

• Hadoop: remote, push-down processing in Hadoop
• Teradata Databases
• Teradata Aster Database: Aster functions such as SQL-MapReduce™, graph
• Other databases: RDBMS databases
• Languages: leverage languages such as SAS, Perl, Python, Ruby, R

When fully implemented, the Teradata Database or the Teradata Aster Database will be able to intelligently use the functionality and data of multiple heterogeneous processing engines

Page 27: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Teradata Database 15 – Teradata QueryGrid™

• Query through Teradata
• Sent to Hadoop through Hive
• Results returned to Teradata
• Additional processing joins data in Teradata
• Final results sent back to application/user

Leverage Hadoop resources, reduce data movement

• Bi-directional to Hadoop
• Query push-down
• Easy configuration of server connections
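The query flow listed on this slide can be mimicked with a small Python sketch in which the filter runs on the remote side and only matching rows are shipped back for the local join. The in-memory lists stand in for a Hadoop-side table and a warehouse table; `remote_query`, `federated_query`, and the sample rows are hypothetical and do not represent QueryGrid syntax.

```python
# Stand-in for a table that lives on the remote (Hadoop-side) system.
REMOTE_CLICKS = [
    {"customer_id": 1, "page": "/pricing", "year": 2014},
    {"customer_id": 2, "page": "/home", "year": 2013},
    {"customer_id": 1, "page": "/signup", "year": 2014},
    {"customer_id": 3, "page": "/pricing", "year": 2014},
]

# Stand-in for a table that lives in the local warehouse.
LOCAL_CUSTOMERS = {1: "Acme", 2: "Globex", 3: "Initech"}


def remote_query(predicate, columns):
    """Runs on the remote side: filter and project before anything is shipped."""
    return [{c: row[c] for c in columns} for row in REMOTE_CLICKS if predicate(row)]


def federated_query(year):
    """Push the filter down, then join the returned rows with local data."""
    # Push-down: only rows for the requested year, and only two columns, are moved.
    shipped = remote_query(lambda r: r["year"] == year, ["customer_id", "page"])
    # Local processing: join the shipped rows against the warehouse table.
    joined = [(LOCAL_CUSTOMERS[r["customer_id"]], r["page"]) for r in shipped]
    # Final result goes back to the caller, along with the number of rows moved.
    return joined, len(shipped)
```

Without push-down, all four remote rows would cross the network before filtering; here only the three rows that satisfy the predicate are moved.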

Page 28: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Customer Value Based on Social Influence Use Case

[Diagram: four numbered steps across Hadoop, Teradata Aster Database, and Teradata Database]

• Determine high value customers based on history
• Determine customer value based on social influence
• Determine customer sentiment
• Determine customer sphere of influence

Page 29: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


QUESTIONS?

Page 30: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Perceptions & Questions

Analyst: Robin Bloor

Page 31: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

To Go With The Flow

Robin Bloor, Ph.D.

Page 32: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

Everything in Flux

• Hardware (network, storage, servers)
• Data sources
• Data staging
• Data volumes
• Data flow
• Data governance
• Data usage
• Data structures
• Schema definition
• Ingest speeds
• Data workloads

Page 33: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

The “Pipeline” Data Architecture

Do we take the DATA TO THE PROCESSING… or the PROCESSING TO THE DATA?

This is not a simple question

Page 34: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

Data as “The New Oil”

The diagram illustrates the fractional distillation of crude oil

The DATA RESERVOIR/DATA HUB concept suggests something similar for data

The analogy is not perfect, but it is useful

Page 35: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

The Data Problem

Page 36: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

In a Data Pipeline Architecture…

The structure of the data reservoir cannot be independent of the structure of the logical data warehouse

In our view, the whole ensemble needs to be heuristic

Page 37: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

• How do you see the future of JSON and XML?
• What is the penalty, with Teradata, for late binding in SQL?
• What do you see as the fundamental division of workload between Hadoop/Asterdata and Teradata? Or, in fact, is there one?
• Why do you think Hadoop is important from a technical perspective?

Page 38: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation

• Does Teradata provide any special optimization between query and analytical workloads?
• Which specific components of the Hadoop ecosystem does Teradata recommend using?

Page 39: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Page 40: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


Upcoming Topics

www.insideanalysis.com

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

This Month: DATABASE

June: ANALYTICS & MACHINE LEARNING

July: INNOVATIVE TECHNOLOGY

Page 41: Not Your Father’s Data Warehouse: Breaking Tradition with Innovation


THANK YOU for your ATTENTION!