Enabling the Real-Time Analytical Enterprise
Michael Ger, General Manager, Manufacturing & Automotive, Hortonworks
Jordan Martz, Director, Technology Solutions, Attunity
Chris Gambino, Solutions Engineer, Hortonworks
Hortonworks Inc. 2011-2016. All Rights Reserved
Speakers
Michael Ger, GM, Industrial Manufacturing and Automotive Solutions, Hortonworks
Jordan Martz, Director of Technology Solutions, Attunity
Chris Gambino, Solutions Engineer, Hortonworks
Agenda
The Real-Time Analytics Challenge
Case Study: Real-Time Quality Analytics
The Hortonworks/Attunity Solution
Brief Demonstration
Q & A
Hortonworks: Powering the Future of Data
Connected World
Inventory, Factories, Devices, Vehicles, Consumers
30X connected device growth by 2020 (Gartner)
3X growth in connected vehicles by 2022 (PWC)
$12M to $209 BILLION growth in RFID by 2021 (McKinsey)
26.9% CAGR growth in Manufacturing IoT through 2020 (Forbes)
7X growth in connected population within last 5 years (United Nations)
Driving Digital, Real-Time Business Processes
Development (Connected Customers, Vehicles, Devices): Socially crowd-sourced requirements; Digital design and analysis; Digital prototypes and tests (simulations)
Manufacturing (Connected Factories, Sensors, Devices): Human-robotic interaction; 3D-printing on demand
Distribution (Connected Trucks, Inventory): Location, traffic, weather-aware distribution; Real-time inventory visibility; Dynamic rerouting
Marketing/Sales (Connected Customers, Devices): Omni-channel demand sensing; Real-Time Recommendations
Service (Connected Assets): Remote service monitoring & delivery; Predictive maintenance; OTA Updates
In a Connected World, Time to Insight Is Key
What if You Could...
Leverage ALL information within your Enterprise?
Access the latest, real-time information?
Analyze and respond immediately to opportunities and threats?
Typical Barriers to Success
Non-Integrated Systems
Data Variety
Data Volume
Data Management Complexity and Cost
The Opportunity
The Real-Time Analytical Organization
Collect and Aggregate Data in Real-Time: New data sources (IOT, Social, CX, etc.); Enterprise Transactional Data
Enable Real-Time Insights: Instant visibility to trending issues; Instant access to data for analysis
CASE STUDY: REAL-TIME QUALITY ANALYTICS
Automotive Quality In The News
Recalls: Recalls a Growing Occurrence
Customer Satisfaction: Quality Across Entire Customer Experience
Government Regulation: The Watchful Eye
Early Visibility and Resolution is Key
Exponential increases in warranty costs by lifecycle phase (The Rule of 10s*)
Initial Design: $1X
Simulation Testing: $10X
Prototype Testing: $100X
Initial Production: $1,000X
Warranty Repair: $10,000X
Recall: $100,000X
Early Visibility Minimizes Units in Field!
*Source: Warranty Week
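The Rule of 10s above amounts to a cost multiplier of 10 raised to the position of the lifecycle phase. A minimal Python sketch of that arithmetic follows; the function names and the idea of scaling a design-phase cost are illustrative assumptions, not something taken from the slides.

```python
# Lifecycle phases in the order listed on the slide (Rule of 10s).
PHASES = [
    "Initial Design",
    "Simulation Testing",
    "Prototype Testing",
    "Initial Production",
    "Warranty Repair",
    "Recall",
]


def cost_multiplier(phase: str) -> int:
    """Return the relative cost of fixing an issue first found in `phase`."""
    return 10 ** PHASES.index(phase)


def cost_to_fix(phase: str, design_cost: float) -> float:
    """Scale a hypothetical design-phase fix cost to a later phase."""
    return design_cost * cost_multiplier(phase)


for name in PHASES:
    print(f"{name}: {cost_multiplier(name):,}X")
```

Each phase that an issue survives multiplies the cost by ten, which is why catching it one phase earlier matters so much.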
What Does it Take to Respond Quickly?
What Happened?
Why Did It Happen?
Contain The Problem
Answers Require Wide Ranging Data
What Happened? Why Did It Happen? Contain The Problem
Voice Of Customer (Social + Web: Forums, Surveys, etc.; Call Center; Service & Warranty): Likes, Dislikes, Sentiment, Complaints, Claims
Voice Of Supplier (Production and Operations): Man, Method, Material, Machine
Voice Of Factory (Production and Operations): Man, Method, Material, Machine
Voice Of Vehicle (Telematics): DTCs, Performance Parameters
Voice Of Field (Installed Base): VINs, Configurations, Locations
From a Variety of Data Source Types
What Happened? Why Did It Happen? Contain The Problem
Sources: Voice Of Customer (Social + Web: Forums, Surveys, etc.; Call Center; Service & Warranty), Voice Of Vehicle (Telematics), Voice Of Factory (Production and Operations), Voice Of Supplier (Production and Operations), Voice Of Field (Installed Base)
Data source types: Social, Web, Transaction, IoT
Recommended Architecture
Hortonworks Data Platform: Reference Architecture - Detailed
[Diagram; components shown:]
Sources: Vehicle; Transactional/Messaging Systems (CDC); IoT: Telematics, Social
Ingest
Data Cleansing, Data Transformation, Data Processing
Hive LLAP, Cubes
HDFS (Hadoop Distributed File System) / S3
YARN (Cluster Resource Management)
HCatalog: Shared Table & User Defined Metadata for All Workloads
Oozie: Orchestrate Processing
Security/Governance: Full Stack Authz Policy Definition, Enforcement, Audit
Data Science, Machine Learning: Data Exploration, Model Building, Custom Applications
Dashboards: Analytics, BI, Ad-hoc Exploration; Visualization & Reporting; Data Exploration
Tooling: R, SAS, Python
Cloudbreak: Provision via Ambari Blueprints and HDP Cluster on AWS, AutoScale nodes in Cluster based on Job Metrics
NiFi becomes the central ingest mechanism for all data coming into the environment. NiFi can be instantiated as its own web server listening for PUT/POST requests from different sources, and it can pull from sources such as S3, SFTP, HTTPS, RDBMS, and more. Once the data is inside NiFi, we can do simple event processing, launch parsers that PerkinElmer has already created, and perform data enrichment before landing the data in a Kafka queue. NOTE: throughout the data lifecycle within NiFi and within HDP, we track lineage, such as where the data came from and how it was manipulated.
Placing the data in a Kafka queue allows other engines, such as Spark, to easily pick it up and begin working with it. Spark does the heavy-lifting ETL processing, along with the SQL and machine learning it provides. Running Spark on Hadoop allows Spark to adhere to security policies (file, folder, and column-level security). Writing the data out in ORC file format keeps it flexible for use with other tools such as Hive.
Once the data has been processed by Spark and stored through the HiveContext in ORC file format (or just left in Spark), we can expose it to external tools such as Spotfire for analysts to work with.
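The ingest, enrich, and publish flow with lineage tracking described in these notes can be sketched in plain Python. This is only an illustration under stated assumptions: it does not use the NiFi or Kafka APIs, `queue.Queue` stands in for a Kafka topic, and the `Event` class, field names, and sample record are hypothetical.

```python
import json
import queue
from dataclasses import dataclass, field


@dataclass
class Event:
    payload: dict
    lineage: list = field(default_factory=list)  # ordered processing history


def ingest(raw: str, source: str) -> Event:
    """Accept a raw JSON record and note where it came from."""
    event = Event(payload=json.loads(raw))
    event.lineage.append(f"ingested from {source}")
    return event


def enrich(event: Event, plants: dict) -> Event:
    """Add a plant name looked up from a (hypothetical) reference table."""
    plant_id = event.payload.get("plant_id")
    event.payload["plant_name"] = plants.get(plant_id, "unknown")
    event.lineage.append("enriched with plant_name")
    return event


def publish(event: Event, topic: queue.Queue) -> None:
    """Hand the event to downstream consumers (stand-in for a Kafka topic)."""
    event.lineage.append("published to topic")
    topic.put(event)


topic = queue.Queue()
raw = '{"vin": "TESTVIN0000000001", "dtc": "P0301", "plant_id": 7}'
publish(enrich(ingest(raw, "https-post"), {7: "Plant A"}), topic)

# A downstream engine (Spark in the notes above) would consume from here.
consumed = topic.get()
print(consumed.payload["plant_name"], consumed.lineage)
```

The point of the sketch is that every step appends to the lineage record, so a consumer can always see where a record came from and how it was changed.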
Attunity Replicate 2017
2016 Attunity
Replicate Product Architecture and Demo
Across All Major Platforms
Sources
RDBMS: Oracle, SQL Server, DB2 iSeries, DB2 z/OS, DB2 LUW, MySQL, Sybase ASE, Informix
SAP: Oracle, DB2 z/OS, DB2 LUW, SQL
Data Warehouse: Exadata, Teradata, Netezza, Vertica, Actian Vector, Actian Matrix
Hadoop: Hortonworks, Cloudera, MapR, Pivotal
Legacy: IMS/DB, SQL M/P, Enscribe, RMS, VSAM
Cloud: AWS RDS, Salesforce
Targets
RDBMS: Oracle, SQL Server, DB2 LUW, MySQL, PostgreSQL, Sybase ASE, Informix, MariaDB
Data Warehouse: Exadata, Teradata, Netezza, Vertica, Pivotal DB (Greenplum), Pivotal HAWQ, Actian Vector, Actian Matrix, Sybase IQ
Hadoop: Hortonworks, Cloudera, MapR, Pivotal
NoSQL: MongoDB
Cloud: AWS RDS/Redshift/EC2, Google Cloud SQL, Google Cloud Dataproc, Azure SQL Data Warehouse, Azure SQL Database
Message Broker: Kafka
Feeding the Data Lake with Attunity Replicate
Fortune 100 auto maker
4500 applications
Sources: DB2 MF, SQL, Oracle
Consolidating massive data volumes for global analytics
Hadoop Data Lake with Kafka
Results: Minimizing labor and cost; Realizing faster insights and competitive advantage
Feeding the Data Lake with Attunity Replicate
Major Construction Equipment Manufacturer
Heterogeneous applications
Sources: DB2 MF, SQL, Oracle, DB2 LUW, IMS, VSAM, Teradata
Combining massive data volumes for global analytics
SAP HANA, Microsoft Azure
Standardize ingestion across the data lake
Initial use cases include warranty analysis, remanufacturing analysis
Data Lake Ingests with Attunity Replicate: On-Prem & Clouds
[Diagram; components shown:]
Sources (cloud or on-prem): Hadoop, Files, RDBMS, Data Warehouse, Mainframe
Attunity Replicate: Batch, CDC, Incremental; Transfer (In-Memory, File Channel); Transform; Filter; Persistent Store
Targets (cloud or on-prem): Hadoop, Files, RDBMS, Data Warehouse, Kafka
What the industry cares about:
Hadoop has moved out of test
Enterprise Use Case
Closer to Production
Business impact
Enterprise + Real-time VS Sqoop + Batch
Attunity Replicate
High-performance connectivity to Hadoop through native APIs for data ingest and publication
Automated schema generation in HCatalog
Drag & drop configuration with Click-2-Replicate design
High-speed data load options: Full reload with overwrite; Insert-only appends; Change Data Capture (CDC)
In-memory data filtering and transformation
Monitoring dashboard with web-based metrics, alerts and log file management
Attunity Solutions for SAP
SAP Test Data Management
SAP HANA Data Integration
Enabling SAP Analytics
Replicate for SAP
Attunity Replicate for SAP Data Flow
1. Replicate for SAP UI
2. Replicate for SAP RFC Calls
Navigate and select SAP business objects
Extract relationships for Pool and Cluster Tables
Automated ABAP Mapping and Change-Data-Capture for Pool and Cluster tables
[Diagram; components shown:]
Source (on premises): RDBMS (Oracle, DB2, etc.); Redo/Archive logs or Journal File; Transparent Tables
Attunity Replicate: Batch, CDC, Incremental; Transfer (In-Memory, File Channel); Transform; Filter; Persistent Store
Targets (on premises or cloud): Hadoop, RDBMS, Data Warehouse, Kafka
Attunity Replicate
Go agile with modern and automated integration
No manual coding or scripting
Automated end-to-end
Optimized and configurable
[Diagram; components shown:]
Sources: Hadoop, Files, RDBMS, EDW, Mainframe
Automated steps: Target schema creation; Heterogeneous data type mapping; Batch to CDC transition; DDL change propagation; Filtering; Transformations
Targets: Hadoop, Files, RDBMS, EDW, Kafka
Zero-footprint Architecture
Lower impact on IT
No software agents on sources and targets for mainstream databases
Replicate data from 100s of source systems with easy configuration
No software upgrades required at each database source or target
[Diagram; components shown:]
Sources: Hadoop, Files, RDBMS, EDW, Mainframe
Capture: Log based; Source specific optimization
Targets: Hadoop, Files, RDBMS, EDW, Kafka
Another advantage of Replicate is agentless data replication for mainstream database systems. Recently, a customer needed to ingest data from 4500 applications across hundreds of databases into Hadoop. With Replicate, they can do this without installing an agent on each source system, because Replicate extracts source logs remotely in an optimized manner and processes the data in memory on the Replicate server. This also simplifies maintenance of the product, since there are no software agents to maintain and upgrade on each source or target system.
Hortonworks HDF + Attunity Replicate
ATTU/CDC
Automated data ingest
Incremental updates with Change Data Capture (CDC)
Broad support for many enterprise data sources
HDP/HDF
Rapid deployments of HUGE data lakes
Continuous data refresh for RELEVANT analytics
COMPLETE datasets across databases, DWs and mainframes
The Connected Data Platform
HDF and HDP form the Connected Data Platform
Data in Motion (connected, real-time, tracked) and Data at Rest (massive scale analysis, retention, security)
Modern Data Applications are built on the Connected Data Platform
Metron, for example
Customer-built applications
The Connected Data Architecture & Attunity
SOURCES: OLTP, ERP, CRM Systems; Documents, Emails; Web Logs, Click Streams; Social Networks; Machine Generated; Sensor Data; Geolocation Data
Data Integration & Ingests
Attunity Replicate for HDP and HDF
Accelerate time-to-insights by delivering solutions faster, with fresher data, from many sources
Automated data ingest
Incremental data ingest (CDC)
Broad support for many sources
In Memory and File Optimized Data Transport
Enterprise-class CDC for Data At Rest and Data In Motion
[Diagram; components shown:]
Data Sources
Attunity Replicate Change Processing (CDC)
Transactional CDC (SQL 1, 2, ... n)
Batch CDC (Data Warehouse Ingest-Merge)
Message Encoded CDC
Many Databases and Data Warehouses
One reason several large technology companies trust and rely on Attunity for their own solutions is the robust CDC capability that Replicate provides.
There are several options that are built into the product that provide flexible and optimized ways to implement change data capture.
In addition to applying transactions in real time and in order, Replicate can handle varying volumes of changes on the source systems by applying the changes in optimized batches to improve throughput and latency.
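One generic way to apply changes in batches is to collapse all changes to the same row into a single net change before touching the target. The Python sketch below illustrates that idea only; it is not Replicate's actual algorithm, the `coalesce` function is hypothetical, and it assumes each change carries a full row image.

```python
from collections import OrderedDict


def coalesce(changes):
    """Collapse an ordered list of (op, key, row) changes into one net
    change per key. op is "insert", "update", or "delete"."""
    net = OrderedDict()  # key -> (op, row), in first-seen order
    for op, key, row in changes:
        prev = net.get(key)
        if prev is None:
            net[key] = (op, row)
        elif op == "delete":
            if prev[0] == "insert":
                del net[key]  # row was created and removed within the batch
            else:
                net[key] = ("delete", None)
        elif prev[0] == "insert":
            net[key] = ("insert", row)  # still a new row, with latest values
        else:
            net[key] = ("update", row)  # update, or delete then re-insert
    return [(op, key, row) for key, (op, row) in net.items()]


batch = [
    ("insert", 1, {"qty": 5}),
    ("update", 1, {"qty": 7}),
    ("update", 2, {"qty": 1}),
    ("insert", 3, {"qty": 9}),
    ("delete", 3, None),
    ("delete", 2, None),
]
net_changes = coalesce(batch)
print(net_changes)
```

Six source changes reduce to two target operations here. The trade-off is that per-transaction ordering across different rows is no longer preserved inside a batch, which is why this is a separate mode from in-order transactional apply.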
To provide high-speed data loads into data warehouse appliances, Replicate integrates with native data warehouse loaders for fast ingestion into the target, and the changes are then merged in the target. It does not rely on suboptimal ODBC for loading data into the warehouse systems.
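The load-then-merge pattern can be sketched with a staging table and set-based SQL. In the Python example below, SQLite is only a stand-in for a data warehouse, and the table and column names are hypothetical; a real warehouse would use its native bulk loader and its own merge syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (id INTEGER PRIMARY KEY, qty INTEGER);
    CREATE TABLE parts_changes (op TEXT, id INTEGER, qty INTEGER);
    INSERT INTO parts VALUES (1, 5), (2, 8);
""")

# Step 1: bulk-load the net changes into a staging table in one call.
changes = [("update", 1, 7), ("delete", 2, None), ("insert", 3, 4)]
conn.executemany("INSERT INTO parts_changes VALUES (?, ?, ?)", changes)

# Step 2: merge the staged changes into the target with set-based SQL,
# instead of issuing one statement per changed row.
conn.executescript("""
    DELETE FROM parts
     WHERE id IN (SELECT id FROM parts_changes WHERE op = 'delete');
    INSERT OR REPLACE INTO parts (id, qty)
     SELECT id, qty FROM parts_changes WHERE op IN ('insert', 'update');
    DELETE FROM parts_changes;
""")
conn.commit()

rows = conn.execute("SELECT id, qty FROM parts ORDER BY id").fetchall()
print(rows)
```

The target is touched by two set-based statements regardless of how many rows changed, which is the reason this approach tends to scale better than row-by-row statements over a generic driver.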
Recently, Attunity also added support for writing changes in a message-encoded format that can be published to Kafka message brokers.
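To make "message-encoded" concrete, here is a minimal Python sketch that turns one change record into a key and a value, both as JSON bytes. The envelope fields (`table`, `operation`, `key`, `data`) and the sample table name are assumptions for illustration only; they are not Replicate's actual message format.

```python
import json
from typing import Optional, Tuple


def encode_change(table: str, op: str, key: dict,
                  after: Optional[dict]) -> Tuple[bytes, bytes]:
    """Return (message_key, message_value) as UTF-8 JSON bytes.
    The envelope fields are hypothetical, chosen for illustration."""
    message_key = json.dumps(key, sort_keys=True).encode("utf-8")
    envelope = {"table": table, "operation": op, "key": key, "data": after}
    message_value = json.dumps(envelope, sort_keys=True).encode("utf-8")
    return message_key, message_value


def decode_change(message_value: bytes) -> dict:
    """Inverse of the value encoding, as a consumer would apply it."""
    return json.loads(message_value.decode("utf-8"))


k, v = encode_change(
    "warranty_claims", "update",
    {"claim_id": 42}, {"claim_id": 42, "status": "approved"},
)
print(k, decode_change(v)["operation"])
```

Using the row's primary key as the message key means all changes to the same row share a key, which matters for ordering once messages are spread across partitions.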
Kafka and Real-time Streaming
Demand
Easy ingest + CDC
Real-time processing
Real-time monitoring
Real-time Hadoop
Scalable to 1000s applications
One publisher, multiple consumers
Attunity Replicate
Direct integration using Kafka APIs
In-memory optimized data streaming
Support for multi-topic and multi-partitioned data publication
Full load and CDC
Integrated management and monitoring via GUI
Attunity Replicate for Kafka - Architecture
[Diagram: Broker 1 holds topic/partitions T1/P0, T2/P1, and T3/P0; Broker 2 holds T1/P1 and T2/P0. Each partition is an ordered sequence of messages M0, M1, M2, ...]
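The topic/partition layout in this architecture can be illustrated with a small Python model of keyed publication. This is a sketch under assumptions: it does not use a Kafka client, the `choose_partition` and `publish` helpers are hypothetical, and `zlib.crc32` stands in for the hash function (Kafka's Java client uses a murmur2 hash for keyed records by default).

```python
import zlib
from collections import defaultdict


def choose_partition(message_key: bytes, num_partitions: int) -> int:
    """Map a key to a partition; equal keys always map to the same one."""
    return zlib.crc32(message_key) % num_partitions


def publish(log, topic, num_partitions, message_key, value):
    """Append value to the chosen partition and return (partition, offset)."""
    partition = choose_partition(message_key, num_partitions)
    messages = log[(topic, partition)]
    messages.append(value)
    return partition, len(messages) - 1


log = defaultdict(list)  # (topic, partition) -> ordered messages
first = publish(log, "T1", 2, b"claim-42", "insert")
second = publish(log, "T1", 2, b"claim-42", "update")
print(first, second)
```

Because both changes to `claim-42` land in the same partition, a consumer reads them in the order they were published, while unrelated keys can spread across partitions and brokers for throughput.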
Demo: Data Streaming into Kafka, HDF, HDP
[Diagram; components shown:]
Example DB
Transaction logs (CDC) and Bulk Load
In-memory optimised metadata management and data transport
Data Streaming (MSG 1, 2, ... n)
Message broker
Data transport and integration: Log data; Database changes; Sensors and device data; Monitoring streams; Call data records; Stock ticker data
Real-time stream processing: Monitoring; Asynchronous applications; Fraud and security
Thank You!
Q/A
Hortonworks.com
Attunity.com