Enabling the Real-Time Analytical Enterprise

Michael Ger, General Manager, Manufacturing & Automotive, Hortonworks
Jordan Martz, Director, Technology Solutions, Attunity
Chris Gambino, Solutions Engineer, Hortonworks


© Hortonworks Inc. 2011-2016. All Rights Reserved

Speakers

Michael Ger, GM, Industrial Manufacturing and Automotive Solutions, Hortonworks

Jordan Martz, Director of Technology Solutions, Attunity

Chris Gambino, Solutions Engineer, Hortonworks

Agenda

The Real-Time Analytics Challenge
Case Study: Real-Time Quality Analytics
The Hortonworks/Attunity Solution
Brief Demonstration
Q & A

Hortonworks: Powering the Future of Data

Connected World

Inventory, Factories, Devices, Vehicles, Consumers

30X connected device growth by 2020 (Gartner)
3X growth in connected vehicles by 2022 (PWC)
$12M to $209 billion growth in RFID by 2021 (McKinsey)
26.9% CAGR growth in manufacturing IoT through 2020 (Forbes)
7X growth in connected population within the last 5 years (United Nations)

Driving Digital, Real-Time Business Processes

Development (Connected Customers, Vehicles, Devices): socially crowd-sourced requirements; digital design and analysis; digital prototypes and tests (simulations)
Manufacturing (Connected Factories, Sensors, Devices): human-robotic interaction; 3D-printing on demand
Distribution (Connected Trucks, Inventory): location, traffic, weather-aware distribution; real-time inventory visibility; dynamic rerouting
Marketing/Sales (Connected Customers, Devices): omni-channel demand sensing; real-time recommendations
Service (Connected Assets): remote service monitoring & delivery; predictive maintenance; OTA updates

In a Connected World, Time to Insight Is Key

What if you could...
Leverage ALL information within your enterprise?
Access the latest, real-time information?
Analyze and respond immediately to opportunities and threats?

Typical Barriers to Success

Non-Integrated Systems
Data Variety
Data Volume
Data Management Complexity and Cost

The Opportunity: The Real-Time Analytical Organization

Collect and Aggregate Data in Real-Time
New data sources (IoT, Social, CX, etc.)
Enterprise Transactional Data

Enable Real-Time Insights
Instant visibility to trending issues
Instant access to data for analysis

CASE STUDY: REAL-TIME QUALITY ANALYTICS

Automotive Quality in the News

Customer Satisfaction: Quality Across Entire Customer Experience
Recalls: Recalls a Growing Occurrence
Government Regulation: The Watchful Eye

Early Visibility and Resolution Is Key

Exponential increases in warranty costs by lifecycle phase (The Rule of 10s*):

Initial Design: $1X
Simulation Testing: $10X
Prototype Testing: $100X
Initial Production: $1,000X
Warranty Repair: $10,000X
Recall: $100,000X

Early visibility minimizes units in field!

*Source: Warranty Week
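The Rule of 10s on this slide is simple arithmetic: each later lifecycle phase multiplies the relative cost of fixing a defect by ten. A minimal sketch of that progression follows; the function names are illustrative, and the multipliers are relative units taken from the slide, not actual dollar figures.

```python
# Lifecycle phases in the order listed on the slide.
PHASES = [
    "Initial Design",
    "Simulation Testing",
    "Prototype Testing",
    "Initial Production",
    "Warranty Repair",
    "Recall",
]


def relative_cost(phase: str) -> int:
    """Return the relative cost multiplier for a defect caught in `phase`."""
    # Each phase is ten times more expensive than the one before it.
    return 10 ** PHASES.index(phase)


def cost_table() -> dict:
    """Map every phase to its multiplier, preserving lifecycle order."""
    return {phase: relative_cost(phase) for phase in PHASES}


if __name__ == "__main__":
    for phase, multiplier in cost_table().items():
        print(f"{phase:<20} ${multiplier:,}X")
```

Read this way, a defect that escapes prototype testing and surfaces as a recall costs 1,000 times more in relative terms, which is the argument for early visibility.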

What Does It Take to Respond Quickly?

What Happened?
Why Did It Happen?
Contain the Problem

Answers Require Wide-Ranging Data

What Happened? Why Did It Happen? Contain the Problem

Voice of Customer (Social + Web: forums, surveys, etc.; Call Center; Service & Warranty): likes, dislikes, sentiment, complaints, claims
Voice of Supplier (Production and Operations): man, method, material, machine
Voice of Factory (Production and Operations): man, method, material, machine
Voice of Vehicle (Telematics): DTCs, performance parameters
Voice of Field (Installed Base): VINs, configurations, locations

From a Variety of Data Source Types

What Happened? Why Did It Happen? Contain the Problem

Voice of Customer: Social + Web (forums, surveys, etc.), Call Center, Service & Warranty
Voice of Vehicle: Telematics
Voice of Factory: Production and Operations
Voice of Supplier: Production and Operations
Voice of Field: Installed Base

Source types shown: social, web, transaction, IoT


Recommended Architecture

Hortonworks Data Platform: Reference Architecture (Detailed)

Sources: Vehicle; Transactional/Messaging Systems; IoT (Telematics, Social)
Ingest: CDC
Storage and resources: HDFS (Hadoop Distributed File System) / S3; YARN (Cluster Resource Management)
Processing: Data Cleansing, Data Transformation, Data Processing; Hive LLAP; Cubes
Data Science, Machine Learning: Data Exploration, Model Building; Tooling: R, SAS, Python
Visualization & Reporting: Dashboards; Analytics, BI, Ad-hoc Exploration; Data Exploration; Custom Applications
HCatalog: Shared Table & User Defined Metadata for All Workloads
Oozie: Orchestrate Processing
Security/Governance: Full Stack Authz Policy Definition, Enforcement, Audit
Cloudbreak: Provision via Ambari Blueprints an HDP Cluster on AWS; AutoScale nodes in Cluster based on Job Metrics

NiFi becomes the central ingest mechanism for all data coming into the environment. NiFi can run as its own web server listening for PUT/POST requests from different sources, and it can pull from sources such as S3, SFTP, HTTPS, RDBMS, and more. Once the data is inside NiFi, we can do simple event processing, launch parsers that PerkinElmer has already created, and perform data enrichment before landing the data in a Kafka queue. NOTE: throughout the data lifecycle within NiFi and within HDP, we track lineage, such as where the data came from and how it was manipulated.

Placing the data in a Kafka queue allows other engines, such as Spark, to pick it up easily and begin working with it. Spark handles the heavy-lifting ETL processing, along with the SQL and machine learning it provides. Running Spark on Hadoop allows Spark to adhere to security policies (file, folder, and column-level security). Writing the data out in ORC format keeps it flexible for use with other engines such as Hive.

Once the data is processed through Spark and stored through the HiveContext in ORC format (or just left in Spark), we can expose it to external tools such as Spotfire for analysts to work with.
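The ingest, enrich, and publish flow with lineage tracking described in these notes can be sketched in a few lines. This is a plain-Python illustration only, not NiFi or Kafka code: `FlowRecord`, `ingest`, `enrich`, `publish`, the plant lookup, and the sample VIN are all hypothetical, and a standard-library `queue.Queue` stands in for a Kafka topic.

```python
import json
import queue
from dataclasses import dataclass, field


@dataclass
class FlowRecord:
    """Hypothetical envelope: a payload plus the lineage gathered so far."""
    payload: dict
    lineage: list = field(default_factory=list)


def ingest(raw: str, source: str) -> FlowRecord:
    """Parse an incoming JSON event and note where it came from."""
    record = FlowRecord(payload=json.loads(raw))
    record.lineage.append(f"ingested from {source}")
    return record


def enrich(record: FlowRecord, plant_by_vin_prefix: dict) -> FlowRecord:
    """Add a plant name looked up from the first three VIN characters."""
    vin = record.payload.get("vin", "")
    record.payload["plant"] = plant_by_vin_prefix.get(vin[:3], "unknown")
    record.lineage.append("enriched with plant lookup")
    return record


def publish(record: FlowRecord, topic: queue.Queue) -> None:
    """Hand the record to downstream consumers (stand-in for a Kafka topic)."""
    record.lineage.append("published to queue")
    topic.put(record)


def run_demo() -> FlowRecord:
    topic = queue.Queue()  # stand-in for a Kafka topic
    raw_event = '{"vin": "ABC12345678901234", "dtc": "P0301"}'  # made-up event
    publish(enrich(ingest(raw_event, "https"), {"ABC": "Plant A"}), topic)
    return topic.get()  # what a downstream engine such as Spark would pick up


if __name__ == "__main__":
    consumed = run_demo()
    print(consumed.payload)
    print(consumed.lineage)
```

Carrying the lineage list with the record is one simple way to answer "where did this come from and how was it manipulated"; NiFi records provenance on its own, so this only illustrates the idea.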

Attunity Replicate

© 2016 Attunity

Replicate Product Architecture and Demo


Across All Major Platforms

Sources
RDBMS: Oracle, SQL Server, DB2 iSeries, DB2 z/OS, DB2 LUW, MySQL, Sybase ASE, Informix
SAP: Oracle, DB2 z/OS, DB2 LUW, SQL
Data Warehouse: Exadata, Teradata, Netezza, Vertica, Actian Vector, Actian Matrix
Hadoop: Hortonworks, Cloudera, MapR, Pivotal
Legacy: IMS/DB, SQL M/P, Enscribe, RMS, VSAM
Cloud: AWS RDS, Salesforce

Targets
RDBMS: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix, MariaDB
Data Warehouse: Exadata, Teradata, Netezza, Vertica, Pivotal DB (Greenplum), Pivotal HAWQ, Actian Vector, Actian Matrix, Sybase IQ
Hadoop: Hortonworks, Cloudera, MapR, Pivotal
NoSQL: MongoDB
Cloud: AWS RDS/Redshift/EC2, Google Cloud SQL, Google Cloud Dataproc, Azure SQL Data Warehouse, Azure SQL Database
Message Broker: Kafka

Feeding the Data Lake with Attunity Replicate

Fortune 100 auto maker

4500 applications: DB2 MF, SQL, Oracle
Hadoop Data Lake with Kafka
Consolidating massive data volumes for global analytics

Results
Minimizing labor and cost
Realizing faster insights and competitive advantage

Feeding the Data Lake with Attunity Replicate

Major Construction Equipment Manufacturer

Heterogeneous applications: DB2 MF, SQL, Oracle, DB2 LUW, IMS, VSAM, Teradata
Combining massive data volumes for global analytics
SAP HANA, Microsoft Azure
Standardize ingestion across the data lake
Initial use cases include warranty analysis, remanufacturing analysis

Data Lake Ingests with Attunity Replicate: On-Prem & Clouds

[Diagram: Attunity Replicate data flow]
Sources (on-prem or cloud): Hadoop, Files, RDBMS, Data Warehouse, Mainframe
Transfer: Batch, CDC, Incremental; In-Memory, File Channel; Transform, Filter; Persistent Store
Targets (on-prem or cloud): Hadoop, Files, RDBMS, Data Warehouse, Kafka

What the industry cares about:
Hadoop has moved out of test
Enterprise use case
Closer to production
Business impact
Enterprise + real-time vs. Sqoop + batch

Attunity Replicate:
High-performance connectivity to Hadoop through native APIs for data ingest and publication
Automated schema generation in HCatalog
Drag & drop configuration with Click-2-Replicate design
High-speed data load options: full reload with overwrite, insert-only appends, Change Data Capture (CDC)
In-memory data filtering and transformation
Monitoring dashboard with web-based metrics, alerts and log file management

Attunity Solutions for SAP

SAP Test Data Management
SAP HANA Data Integration
Enabling SAP Analytics

Replicate for SAP

Attunity Replicate for SAP Data Flow

1. Replicate for SAP UI: navigate and select SAP business objects; automated ABAP mapping and Change-Data-Capture for Pool and Cluster tables
2. Replicate for SAP RFC calls: extract relationships for Pool and Cluster tables

Source (on premises): RDBMS (Oracle, DB2, etc.), read through redo/archive logs or journal file; transparent tables
Transfer: Batch, CDC, Incremental; In-Memory, File Channel; Transform, Filter; Persistent Store
Targets: Hadoop, RDBMS, Data Warehouse, Kafka, Cloud

Attunity Replicate: Go Agile with Modern and Automated Integration

No manual coding or scripting
Automated end-to-end
Optimized and configurable

Sources: Hadoop, Files, RDBMS, EDW, Mainframe
Automated steps: target schema creation, heterogeneous data type mapping, batch to CDC transition, DDL change propagation, filtering, transformations
Targets: Hadoop, Files, RDBMS, EDW, Kafka

Zero-Footprint Architecture: Lower Impact on IT

No software agents on sources and targets for mainstream databases
Replicate data from 100s of source systems with easy configuration
No software upgrades required at each database source or target
Log based
Source-specific optimization

Sources: Hadoop, Files, RDBMS, EDW, Mainframe
Targets: Hadoop, Files, RDBMS, EDW, Kafka

Another advantage of Replicate is agentless data replication for mainstream database systems. Recently, a customer needed to ingest data from 4500 applications across hundreds of databases into Hadoop. With Replicate, they can do this without installing an agent on each source system, because Replicate extracts source logs remotely in an optimized manner and processes the data in memory on the Replicate server. This also simplifies maintenance, since there are no software agents to maintain and upgrade on each source or target system.

Hortonworks HDF+ Attunity Replicate

ATTU/CDC
Automated data ingest
Incremental updates with Change Data Capture (CDC)
Broad support for many enterprise data sources

HDP/HDF
Rapid deployments of HUGE data lakes
Continuous data refresh for RELEVANT analytics
COMPLETE datasets across databases, DWs and mainframes

The Connected Data Platform

HDF and HDP form the Connected Data Platform
Data in Motion (connected, real-time, tracked) and Data at Rest (massive scale analysis, retention, security)
Modern Data Applications are built on the Connected Data Platform
Metron, for example
Customer-built applications

The Connected Data Architecture & Attunity

Sources: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data

Data Integration & Ingests

Attunity Replicate for HDP and HDF
Accelerate time-to-insights by delivering solutions faster, with fresher data, from many sources
Automated data ingest
Incremental data ingest (CDC)
Broad support for many sources

In-Memory and File-Optimized Data Transport: Enterprise-Class CDC for Data at Rest and Data in Motion

[Diagram: Attunity Replicate change processing. CDC from many data sources (databases and data warehouses) is delivered in three modes: Transactional CDC (SQL statements 1, 2, ... n applied in order), Batch CDC (data warehouse ingest-merge), and Message Encoded CDC.]

One of the reasons several large technology companies trust and rely on Attunity for their own solutions is the robust CDC capability that Replicate provides.

Several options built into the product provide flexible and optimized ways to implement change data capture.

In addition to applying transactions in real time and in order, Replicate can handle varying volumes of changes on the source systems by applying the changes in optimized batches to improve throughput and latency.

To provide high-speed data loads into data warehouse appliances, Replicate integrates with native data warehouse loaders for fast data ingestion into the target, and changes are then merged in the target. It does not rely on suboptimal ODBC for loading data into the warehouse systems.

Recently, Attunity also added support for writing changes in a message-encoded format that can be published to Kafka message brokers.
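The difference between applying changes transactionally and applying them in optimized batches can be illustrated with a small sketch. It assumes each change carries a full row image and a primary key; the `Change` tuple and both function names are hypothetical, and this is not a description of how Replicate works internally.

```python
from typing import Optional

# (operation, primary key, full row image or None for deletes)
Change = tuple[str, int, Optional[dict]]


def apply_transactional(target: dict, changes: list) -> int:
    """Apply every change one by one, in commit order.

    Returns the number of operations executed against the target.
    """
    applied = 0
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:  # inserts and updates both carry a full row image
            target[key] = row
        applied += 1
    return applied


def apply_batched(target: dict, changes: list) -> int:
    """Collapse a batch to the last change per key, then apply the net result."""
    net = {}
    for change in changes:
        net[change[1]] = change  # a later change to the same key wins
    return apply_transactional(target, list(net.values()))


if __name__ == "__main__":
    batch = [
        ("insert", 1, {"status": "open"}),
        ("update", 1, {"status": "closed"}),
        ("insert", 2, {"status": "open"}),
        ("delete", 2, None),
    ]
    one_by_one, merged = {}, {}
    print(apply_transactional(one_by_one, batch), one_by_one)
    print(apply_batched(merged, batch), merged)
```

Both paths end in the same target state, but the batched path touches the target fewer times. The trade-off is that intermediate states within a batch are never visible in the target.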


Kafka and Real-Time Streaming

Demand
Easy ingest + CDC
Real-time processing
Real-time monitoring
Real-time Hadoop
Scalable to 1000s of applications
One publisher, multiple consumers

Attunity Replicate
Direct integration using Kafka APIs
In-memory optimized data streaming
Support for multi-topic and multi-partitioned data publication
Full load and CDC
Integrated management and monitoring via GUI
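Multi-topic, multi-partition publication comes down to two routing decisions per change record: which topic it goes to and which partition within that topic. The sketch below assumes one topic per source table and a hash of the record key for the partition. The naming rule, the `cdc` prefix, the message layout, and the CRC32 hash are all illustrative assumptions; Kafka's own default partitioner uses a different hash, and Replicate's actual message format is not shown in the slides.

```python
import json
import zlib


def topic_for(table: str, prefix: str = "cdc") -> str:
    """Hypothetical naming rule: one topic per source table."""
    return f"{prefix}.{table.lower()}"


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition; CRC32 is a stand-in hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def encode_change(table: str, op: str, key: str, row: dict,
                  num_partitions: int = 2) -> dict:
    """Build a message-encoded change record with its routing information."""
    value = json.dumps({"table": table, "op": op, "data": row}, sort_keys=True)
    return {
        "topic": topic_for(table),
        "partition": partition_for(key, num_partitions),
        "key": key.encode("utf-8"),
        "value": value.encode("utf-8"),
    }


if __name__ == "__main__":
    first = encode_change("CLAIMS", "insert", "VIN-001", {"amount": 120})
    second = encode_change("CLAIMS", "update", "VIN-001", {"amount": 150})
    # Same key, same partition: changes to one row keep their relative order.
    print(first["topic"], first["partition"], second["partition"])
```

Keying by the row identifier matters because Kafka only guarantees ordering within a partition, so all changes to one row should land in the same partition.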


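Attunity Replicate for Kafka - Architecture

[Diagram: topics (T) and partitions (P) spread across two brokers. Broker 1 hosts T1/P0, T2/P1, and T3/P0; Broker 2 hosts T1/P1 and T2/P0. Each partition holds its own ordered sequence of messages (M0, M1, M2, ...).]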


Demo: Data Streaming into Kafka, HDF, and HDP

[Diagram: an example database feeds Attunity Replicate through bulk load and CDC from transaction logs, with in-memory optimized metadata management and data transport. Changes are streamed as messages (MSG 1, 2, ... n) to message brokers.]

Data transport and integration: log data, database changes, sensors and device data, monitoring streams, call data records, stock ticker data
Real-time stream processing: monitoring, asynchronous applications, fraud and security


Thank You!

Q/A

Hortonworks.com
Attunity.com