Teradata Listener & A Streaming Analytics Solution with the Teradata UDA


Page 1: Teradata Listener & A Streaming Analytics Solution with

Teradata Listener & A Streaming Analytics Solution with the Teradata UDA

Page 2: Teradata Listener & A Streaming Analytics Solution with

The Case for Real Time Data

• The Internet of Things (IoT)
• Smart electric grids
• Twitter, Facebook, etc.
• Financial markets
• News feed providers
• Web navigation tracking tools
• Weather and earthquake sensors
• Telecommunication networks
• Set-top boxes
• Fleets of moving vehicles
• Mobile location

Page 3: Teradata Listener & A Streaming Analytics Solution with

Business Use Cases: Ingest/Distribute Streaming Data

• IoT/sensors: Real-time streaming data
• Web-page activity: Web clicks in real time
• Security & surveillance: Logins, data access
• Product recommendations: Product offers feedback
• Email compliance: Agent emails or spam
• Retailer Cyber Monday: Sales activity
• Customer satisfaction index: Positive & negative events
• Reservations brokers: Events and resources
• Social media watch: RSS, tweets, PR, blogs
• Track and trace logistics: Vehicles and containers

Page 4: Teradata Listener & A Streaming Analytics Solution with

[Figure: architecture diagram with three columns: Acquisition, Analytics, Access. Recoverable labels:
  Sources: ERP, SCM, CRM, sensors, audio and video, machine logs, text, web and social
  Acquisition: real-time ingest
  Data engines (conventional and emerging): multi-genre data warehouse, in-memory, data lake, NoSQL, compute cluster, operational, virtual query
  Access: app framework, business intelligence, languages, integrated development environment
  Users: operational systems, customers, partners, engineers, data scientists, business analysts, knowledge workers, marketing, executives
  Platform services: development, data, operations
  Cloud deployment: private, hybrid, public]

Page 5: Teradata Listener & A Streaming Analytics Solution with

[Figure: the Acquisition / Analytics / Access architecture diagram from page 4, with products overlaid. Recoverable product labels: Listener (real-time ingest), QueryGrid (virtual query), AppCenter, Aster Analytics, R, Spark, Giraph, SAS, SPSS, KXEN, Teradata Database, Hadoop. The source, data engine, user, platform services and cloud deployment labels are unchanged.]

Page 6: Teradata Listener & A Streaming Analytics Solution with

6 © 2014 Teradata

Teradata Listener: A Brief Introduction

Page 7: Teradata Listener & A Streaming Analytics Solution with

Listener Distributes Many Sources to Many Targets

[Diagram: Listener sits between many sources and many targets. Targets shown: Hadoop1, Hadoop2, Teradata1, Teradata2, Aster1.]

Page 8: Teradata Listener & A Streaming Analytics Solution with

Predictive Parts Failure Use Case

Real-Time Analytics Pipeline Build

• Discover and build analytics against sensor data with Aster
• Create an ingestion stream with Listener, streaming sensor data in from machines in the field
  – Stream forks to Teradata and Spark
• Deploy the model onto Spark, where Listener streams the sensor data into the Operational Action Zone
• Alerts fire out of Spark based on the Aster model/rules
  – Email alert and database logging as new events
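To make the forked stream in this use case concrete, here is a minimal Python sketch, not the deck's implementation. It assumes a hypothetical `Reading` record, a hand-written threshold rule standing in for the Aster-built model, and in-memory lists standing in for Teradata and for the email/event log. None of these names come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reading:
    """Hypothetical sensor record; the field names are illustrative."""
    machine_id: str
    temperature_c: float
    vibration_mm_s: float


def at_risk(r: Reading) -> bool:
    # Assumed rule standing in for the model built during discovery.
    return r.temperature_c > 90.0 and r.vibration_mm_s > 7.0


@dataclass
class Fork:
    warehouse: List[Reading] = field(default_factory=list)  # stands in for Teradata
    alerts: List[str] = field(default_factory=list)         # stands in for email + event log

    def publish(self, r: Reading) -> None:
        self.warehouse.append(r)   # branch 1: persist every reading
        if at_risk(r):             # branch 2: score the reading and alert
            self.alerts.append(f"ALERT {r.machine_id}: possible part failure")


fork = Fork()
for reading in [Reading("pump-1", 71.5, 2.1), Reading("pump-2", 96.0, 8.4)]:
    fork.publish(reading)

print(len(fork.warehouse), fork.alerts)
```

Every reading reaches the persistence branch, while only readings that trip the rule produce an alert, which mirrors the "stream forks" and "alerts fire" bullets.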

Page 9: Teradata Listener & A Streaming Analytics Solution with

AppCenter

Self-service platform to build, deploy, manage, and run data-centric apps for the enterprise at scale across the UDA

Page 10: Teradata Listener & A Streaming Analytics Solution with

Single discovery platform for ALL users – Business, Analyst & Data Scientist

IDE

SELECT n.event_path, count(*)
FROM nPath(
  ON (
    SELECT *
    FROM telco_data td, profile p
    WHERE td.customer_id = p.customer_id
  )
  PARTITION BY customer_id
  ORDER BY timestamp
  MODE(OVERLAPPING)
  PATTERN('EVENT+.CANCEL_SERVICE_EARLY')
  SYMBOLS(
    action <> 'CANCEL SERVICE' AS EVENT,

[Diagram: client tools by user type. Recoverable labels: BI & open source visualization tools; AppCenter & Guided UI; SQL client; R client; users: business analysts, R users, data scientists.]

Enables Discovery, Development and Execution

Time to Value Acceleration (actionable insights in hours, days or weeks)

Page 11: Teradata Listener & A Streaming Analytics Solution with

The Guided UI: Unique Path & Pattern Analysis (nPath)

A user-friendly, form-based approach to Path & Pattern Analysis that requires no SQL knowledge

Leverages prebuilt and patented analytics functions

Results can be visualized, published and shared with others

Insights are truly now just a few clicks away

Page 12: Teradata Listener & A Streaming Analytics Solution with

AppCenter: Repeatable Analytics, Shareable Results

AppCenter provides your organization with the ability to store your most valuable analytics workflows as apps, execute them with ease, visualize the results and share the insights with others.

Store your packaged analytics apps in a single location.

Execute your apps on demand or use the built-in scheduler.

Visualize your results using open source visualizations and/or BI tools.

Share your results and visualizations throughout your organization and beyond.

Page 13: Teradata Listener & A Streaming Analytics Solution with

Predictive Parts Failure Discovery

• Discover and build analytics against sensor data with Aster
  – AppCenter is currently for Aster only; UDA AppCenter is on the roadmap
• Create an ingestion stream with Listener, streaming sensor data in from machines in the field
  – Stream forks to Teradata and Spark
• Deploy the model onto Spark, where Listener streams the sensor data into the Operational Action Zone
• Alerts fire out of Spark based on the Aster model/rules
  – Email alert and database logging as new events into Teradata

Page 14: Teradata Listener & A Streaming Analytics Solution with

Operationalization Overview with Aster 6.21

[Diagram. Aster framework: Analytics (Model Builder) takes training data and produces an Aster model; the AML Generator writes an AML file (the Aster model file). User framework: the Aster Scorer takes prediction requests (queries) and returns prediction responses. Caption: scoring in the customer's real-time environment. Callouts: rapid iteration on massive data at rest; easy deployment; no model re-create.]

Page 15: Teradata Listener & A Streaming Analytics Solution with

Predictive Parts Failure Discovery

• Discover and build analytics against sensor data with Aster
• Create an ingestion stream with Listener, streaming sensor data in from machines in the field
  – Stream forks to Teradata and Spark
• Deploy the model onto Spark, where Listener streams the sensor data into the Operational Action Zone
• Alerts fire out of Spark based on the Aster model/rules
  – Email alert and database logging as new events into Teradata

Page 16: Teradata Listener & A Streaming Analytics Solution with

Operationalization Overview: multiple scoring operations on a UDA Listener stream

[Diagram. Aster framework: Text Tagger and Single Decision Tree models, each exported as an AML file. User framework: a TT Scorer that uses the Text Tagger AML file and an SDT Scorer that uses the Single Decision Tree AML file. Other labels: text from Listener stream, tagged, scores, action, prediction requests 1, prediction requests 2.]

Page 17: Teradata Listener & A Streaming Analytics Solution with

Operationalization Overview: Packaging and Installation

• Package of Scorer JAR files
  – Implemented in Java
  – Compatible with JDK 1.6 onwards
  – Packaged into a JAR file along with all scoring libraries and documentation
  – Lightweight, thread-safe
  – Platform independent
• AML Generator in the Analytics Base package
  – The Analytics Base package will contain the AML generator function
  – The AML generator creates XML files describing data transformations and models built on the Aster cluster
  – See the Aster Analytics User Guide for documentation
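The idea of a model exported as an XML file and scored outside the cluster can be illustrated with a small Python sketch. The XML layout below is invented for illustration and is NOT the real AML schema. The actual scorer ships as a Java JAR, so this shows only the general pattern of loading a model description and evaluating a record.

```python
import xml.etree.ElementTree as ET

# Hypothetical model description. The element and attribute names are
# assumptions made for this sketch; they are not the AML format.
MODEL_XML = """
<model name="parts_failure">
  <node feature="temperature_c" threshold="90.0">
    <left><leaf label="ok"/></left>
    <right>
      <node feature="vibration_mm_s" threshold="7.0">
        <left><leaf label="ok"/></left>
        <right><leaf label="fail"/></right>
      </node>
    </right>
  </node>
</model>
"""


def score(node, record):
    """Walk the tree: go left when value <= threshold, otherwise right."""
    if node.tag == "leaf":
        return node.get("label")
    value = record[node.get("feature")]
    branch = "left" if value <= float(node.get("threshold")) else "right"
    # Each <left>/<right> element wraps exactly one child (<node> or <leaf>).
    return score(node.find(branch)[0], record)


root = ET.fromstring(MODEL_XML).find("node")
print(score(root, {"temperature_c": 96.0, "vibration_mm_s": 8.4}))
print(score(root, {"temperature_c": 71.5, "vibration_mm_s": 2.1}))
```

Because the model lives entirely in the file, the scoring code does not need to be rebuilt when the model changes, which is the property the slide's "no model re-create" callout points at.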

Page 18: Teradata Listener & A Streaming Analytics Solution with

Operationalization Overview

Customers separate deployment activities from data science. Roles can be performed independently by experts with appropriate skillsets.

Data Scientist

1. Transform the data on Aster
2. Train a model on Aster
3. Generate an AML file for each
4. Export the AML files and transfer them to the real-time environment
5. Refresh the model with a new AML file generated from the model

DevOps Engineer

1. Deploy the scorer JAR file
2. Deploy the AML files into the real-time environment
3. To refresh the model, get the latest AML file
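The "refresh the model" step on both role lists amounts to swapping in a newer model while scoring continues. A minimal Python sketch of that hand-off follows. `RefreshableScorer` is a hypothetical class for this sketch, and plain callables stand in for models parsed from AML files. It is not the Aster scorer's API.

```python
import threading


class RefreshableScorer:
    """Holds the current model and lets it be replaced while scoring continues."""

    def __init__(self, model):
        self._lock = threading.Lock()
        self._model = model   # any callable: record -> prediction
        self.version = 1

    def refresh(self, new_model):
        # DevOps side: install the model built from the latest exported file.
        with self._lock:
            self._model = new_model
            self.version += 1

    def score(self, record):
        # Take a reference under the lock, then score outside it so a slow
        # prediction does not block a refresh.
        with self._lock:
            model = self._model
        return model(record)


scorer = RefreshableScorer(lambda r: r["temp"] > 90)
before = scorer.score({"temp": 85})
scorer.refresh(lambda r: r["temp"] > 80)   # retrained model with a lower threshold
after = scorer.score({"temp": 85})
print(before, after, scorer.version)
```

The data scientist only produces a new model file and the engineer only calls the refresh step, so neither role needs to touch the other's tooling.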

Page 19: Teradata Listener & A Streaming Analytics Solution with

[Figure: the Acquisition / Analytics / Access architecture diagram from page 5 (Listener for real-time ingest, QueryGrid for virtual query, AppCenter, Aster Analytics, R, Spark, Giraph, SAS, SPSS, KXEN, Teradata Database, Hadoop), with an added DEPLOY label.]

Page 20: Teradata Listener & A Streaming Analytics Solution with


Radically Simplify Big Data Streaming

LISTENER

Page 21: Teradata Listener & A Streaming Analytics Solution with


Listener Makes It Simple

Teradata Listener is an intelligent, self-service software solution for ingesting and distributing fast-moving data streams


Page 22: Teradata Listener & A Streaming Analytics Solution with

Teradata Listener Features

• Enterprise-wide solution for ingesting high-volume real-time streams of data
• Automatic calculation of volume, latency & monitoring metrics
• Pre-built integration with Teradata UDA for persisting data in real time or batches
• Fully supported & enterprise grade
• Self-service & governance for users
• Build real-time streaming analytics, power real-time dashboards, generate alerts using Teradata Listener APIs
• Microservice cloud architecture

Page 23: Teradata Listener & A Streaming Analytics Solution with

Listener Data Flow & Engine Integration

• A source can be persisted to one or more systems, or multiple sources can be persisted to a single system
• Data can be persisted in near real time or in batches (by records or time)
• Data can also be streamed out of Listener to external third-party processing engines (e.g. Storm, Spark, TIBCO, IBM, VoltDB, etc.)
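The many-to-many routing and the "batches (records or time)" option can be sketched in a few lines of Python. This is an illustration under assumptions, not Listener's implementation: `BatchingWriter`, the `ROUTES` table and the thresholds are all hypothetical names and values chosen for the sketch.

```python
import time


class BatchingWriter:
    """Buffers records for one target and flushes by record count or by age."""

    def __init__(self, name, max_records=3, max_age_s=5.0, clock=time.monotonic):
        self.name = name
        self.max_records = max_records
        self.max_age_s = max_age_s
        self._clock = clock
        self._buffer = []
        self._opened_at = None
        self.flushed = []   # batches "written" to the target system

    def write(self, record):
        if not self._buffer:
            self._opened_at = self._clock()   # the batch window starts here
        self._buffer.append(record)
        self.flush_if_due()

    def flush_if_due(self):
        if not self._buffer:
            return
        full = len(self._buffer) >= self.max_records
        stale = self._clock() - self._opened_at >= self.max_age_s
        if full or stale:
            self.flushed.append(self._buffer)
            self._buffer = []


# One source may feed several systems, and several sources may feed one system.
ROUTES = {"sensor": ["teradata", "hdfs"], "twitter": ["hdfs"]}
writers = {name: BatchingWriter(name) for name in ("teradata", "hdfs")}


def route(source, record):
    for target in ROUTES[source]:
        writers[target].write(record)


for i in range(3):
    route("sensor", {"id": i})
route("twitter", {"text": "hello"})

print({name: len(w.flushed) for name, w in writers.items()})
```

A real deployment would also call `flush_if_due()` on a timer so that a quiet stream still flushes once its batch window expires; the sketch only checks on write.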

[Diagram: sources (IoT, Twitter, Facebook, iOS app, website, sensor) connect through Teradata Listener to systems (HBase, Aster, HDFS, Teradata).]

Page 24: Teradata Listener & A Streaming Analytics Solution with

Listener Data Flow

[Diagram. Recoverable stage labels: sources, ingest, firehose, streams, router, writers, systems, Stream API. Step captions: multiple sources; write tuples; write to firehose; write to streams; read mini-batches (shown twice).]
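The stage labels on this slide suggest a staged pipeline. The Python sketch below assumes one plausible ordering (ingest writes tuples to a shared firehose, the firehose is split into per-source streams, and a router reads mini-batches from a stream and hands them to writers). The slide text does not confirm that ordering, and every function name here is hypothetical.

```python
from collections import defaultdict, deque

firehose = deque()              # every ingested tuple, all sources mixed together
streams = defaultdict(deque)    # one queue per source
systems = defaultdict(list)     # mini-batches delivered to each target system


def ingest(source, payload):
    """Write a tuple to the firehose."""
    firehose.append((source, payload))


def split_firehose():
    """Drain the firehose into per-source streams."""
    while firehose:
        source, payload = firehose.popleft()
        streams[source].append(payload)


def route(source, targets, batch_size=2):
    """Read mini-batches from one stream and hand a copy to each writer target."""
    stream = streams[source]
    while stream:
        batch = [stream.popleft() for _ in range(min(batch_size, len(stream)))]
        for target in targets:
            systems[target].append(list(batch))


ingest("sensor", 1)
ingest("web", "click")
ingest("sensor", 2)
ingest("sensor", 3)
split_firehose()
route("sensor", ["teradata", "hdfs"])

print(dict(systems))
```

Keeping a per-source stream between the firehose and the router is what lets one source be paused or re-routed without touching the others.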

Page 25: Teradata Listener & A Streaming Analytics Solution with

Listener Capabilities

Real-time Ingest
• Scale and volume: Scale out multiple inputs and outputs
• Volume metrics: Monitor data movement with easy dashboards
• Continuous streams: Capture and exploit new data sources

Self-Service and Governance
• Easy setup and go: Add new sources and targets in minutes
• Pause / resume: Pause data streams for maintenance
• Security built in: Owners and admins

Enterprise Class
• Enterprise platform: For the entire enterprise
• Highly available: Failover and data protection built in
• API everywhere: Access microservices via RESTful APIs

Page 26: Teradata Listener & A Streaming Analytics Solution with

Predictive Parts Failure Discovery

• Discover and build analytics against sensor data with Aster
• Create an ingestion stream with Listener, streaming sensor data in from machines in the field
  – Stream forks to Teradata and Spark
  – Spark is currently a roadmap target
• Deploy the model onto Spark, where Listener streams the sensor data into the Operational Action Zone
• Alerts fire out of Spark based on the Aster model/rules
  – Email alert and database logging as new events into Teradata

Page 27: Teradata Listener & A Streaming Analytics Solution with

Real-Time Analytic Pattern Options

• There are multiple options for real-time analytics; we discussed only one today. Others include:

• Listener Stream -> Aster model operationalized on Spark or Storm

• Listener Stream -> SAS model operationalized on Teradata

• Listener Stream -> FuzzyLogix based model on Teradata

• Web Service Integration -> ARTIM next best action interaction

– ARTIM has a nice business UI and leverages Machine Learning algorithms to optimize the offers served

Page 28: Teradata Listener & A Streaming Analytics Solution with

Q&A

• Discussion

Page 29: Teradata Listener & A Streaming Analytics Solution with

What is Real Time?

• Real time (synchronous): 1-100 ms
• Near real time (asynchronous): 100 ms-1 minute
• Micro batch: 1-5 minutes
• Batch: minutes to hours

[Diagram labels: MQ, JMS; Listener; Apps; Hadoop1, Teradata1, Aster1; data-in-motion, data-at-rest.]
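The latency bands on this slide can be written down as a small lookup. In this Python sketch the band edges come from the slide, but treating each upper edge as inclusive, and treating anything over five minutes as batch, are assumptions made so that every latency falls into exactly one band.

```python
def latency_class(seconds: float) -> str:
    """Map an end-to-end latency in seconds to one of the slide's bands."""
    if seconds <= 0.1:        # 1-100 ms
        return "real time"
    if seconds <= 60:         # 100 ms-1 minute
        return "near real time"
    if seconds <= 300:        # 1-5 minutes
        return "micro batch"
    return "batch"            # minutes to hours


for s in (0.05, 5, 120, 3600):
    print(s, latency_class(s))
```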