Open Big Data Platforms – The New Frontier October 2015





Atoms to Bits…

An irrevocable, unstoppable and exponential trend *

* N. Negroponte

Amplification of end users


Five Dimensions of a Big Data System

Go Deep

• Complex questions

• Interactive exploration

Cover Breadth

• Large amounts of data

• Many types of data

• All necessary data

In Real-time

• Recent data

• Real-time data

Without Pre-processing

• No pre-aggregates

• No tuning

Within a Window of Opportunity

• Fast response

• Interactive

What Do Customers Really Want?

Enable discovery

Enable self-service

Process all forms of data

Maximize correlation identification

Operationalization of decision

Reduce ‘time-to-insights’

Unstructured Data 80%

Trusted Data 20%

Real-time Analytics

• Dashboards

• Complex event processing

Moore’s Law and Hardware Cost

Our Success Stories

Singapore-based Communications Major Delights Customers

A Singapore-based info-communication company

• Understand customer behavior, buying patterns and sentiments and delight customers by delivering the right content at the right time

• Develop and sell insights from the wealth of Deep Packet Inspection (DPI) data

• A customer behavior analytics solution integrated and correlated structured and unstructured data from geographic information systems, social and enterprise data sources to deliver analytics on customer lifestyle, purchase criteria, usage patterns, browsing behavior, location and time

• Generated automated personalized offers using a recommendation engine

• New revenue streams from selling anonymized data

• Increased ARPU and reduced churn

• Personalized promotions delivered with improved success rate

Opportunity

Data and Analytics

Results

Bio-pharmaceutical Major Improves Clinical Trials

World’s premier biopharmaceutical company

• Need to optimize clinical trial monitoring process with risk-based monitoring

• Improve process visibility, efficiency and time to market, lower costs without quality compromise

• Scalable and extensible risk-based monitoring solution with predictive analytics driven by pattern matching, machine learning and clustering algorithms

• Text mining (NLP-based components) used to extract relevant data from past reports, both structured and unstructured

• Early diagnosis of risks during design, conduct and close-out of clinical interventional studies

• 360-degree view of risks with data and metrics from trial management systems, CDM systems as well as safety management systems

• Risk recalibration during conduct of trial

Opportunity

Data and Analytics

Results

Apparel Major Learns About its Customers’ Sentiments

• Social listening tool lacked the ability to analyze unstructured customer inputs on discussion forums

• Text analysis model gauges qualitative inputs, picture emoticons, checks spam, and delivers powerful insights into customer sentiments

• Interactive visualization of word clouds and corresponding insights

• Improved customer perception from better social listening

• Sentiment analytics prior to product launches improved success rates

• Improved inventory management from better planning

Opportunity

Data and Analytics

Results

Telecom Services Company Prevents Network Faults Using Predictive Maintenance

• Network faults in the ADSL backbone disrupt data services to customers and lead to lower customer satisfaction

• Ingested 15GB of ADSL data from customers facing no faults into Hadoop (Infosys Information Platform, IIP) to find a ‘Control Signature’ using signal processing and statistical techniques on connection properties such as attenuation and upload/download rate

• Computed a ‘Fault Signature’ from connection properties during faults

• Developed a predictive model that gives the probability that the current connection state will result in a fault in the next 7 days. This helped us predict faults by connection, DSLAM and geographic area using 16.5 million records in 5 sec on an 8-core, 64GB RAM, 5-node cluster

• Impending network faults predicted a week in advance with a high degree of accuracy to fix potential network failure points

• Solution based on open source stack including Apache Spark and prebuilt Infosys components was delivered in just 5 days at an impressive price-performance ratio

Opportunity

Data and Analytics

Results
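The slide does not say which signal processing or statistical techniques produced the two signatures. As a minimal sketch of the idea only, the Python below assumes a signature is the per-property mean and standard deviation, and that a connection is flagged when it sits closer (by mean absolute z-score) to the fault signature than to the control signature. The property names and sample values are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical connection properties; the deck names attenuation and
# upload/download rate but gives no schema or sample values.
HEALTHY = [
    {"attenuation_db": 20.0, "down_mbps": 8.0},
    {"attenuation_db": 22.0, "down_mbps": 7.5},
    {"attenuation_db": 21.0, "down_mbps": 7.8},
    {"attenuation_db": 19.0, "down_mbps": 8.2},
]
DURING_FAULT = [
    {"attenuation_db": 35.0, "down_mbps": 3.0},
    {"attenuation_db": 38.0, "down_mbps": 2.5},
    {"attenuation_db": 36.0, "down_mbps": 2.8},
    {"attenuation_db": 40.0, "down_mbps": 2.0},
]


def signature(rows):
    """Assumed signature: (mean, standard deviation) for each property."""
    return {
        key: (mean(r[key] for r in rows), stdev(r[key] for r in rows))
        for key in rows[0]
    }


def deviation(row, sig):
    """Mean absolute z-score of one connection against a signature."""
    return mean(abs(row[key] - mu) / sd for key, (mu, sd) in sig.items())


def closer_to_fault(row, control_sig, fault_sig):
    """True when the connection resembles the fault signature more."""
    return deviation(row, fault_sig) < deviation(row, control_sig)


control_sig = signature(HEALTHY)
fault_sig = signature(DURING_FAULT)

degrading = {"attenuation_db": 34.0, "down_mbps": 3.5}
stable = {"attenuation_db": 20.5, "down_mbps": 7.9}
print(closer_to_fault(degrading, control_sig, fault_sig))
print(closer_to_fault(stable, control_sig, fault_sig))
```

A production model would output a calibrated 7-day fault probability rather than a yes/no flag; the comparison above only illustrates how a control and a fault signature can be used together.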

Chocolate Maker Validates Business Hypothesis Quickly with Advanced Analytics

• A global chocolate manufacturer wanted to develop out-of-stock analysis for their products across retail superstores

• Infosys Information Platform (IIP) deployed on a 5-node AWS cluster

• Ingested 5 weeks’ sample data (5.7 million rows) into HDFS. The IIP toolset enabled creation of on-demand schemas (models) to access data as well as data wrangling

• Calculated the historical probability at the item, store and register level and predicted out-of-stock using R-based binomial and geometric distribution models. A visual dashboard was developed in Tableau to show out-of-stock probability by store, hour, day and item

• From no capability to business insights in just 3 weeks along with the required environment to run multiple experiments quickly

• Out-of-stock analysis demonstrated ability to drill down to details at store, item and register level


Opportunity

Data and Analytics

Results
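The project used R; the Python below is only a stand-in sketch of how binomial and geometric distributions turn a historical out-of-stock rate into a prediction. It assumes each observation period (for example, one store-hour for one item) is an independent trial with the same out-of-stock probability. The counts in the usage lines are hypothetical, not figures from the engagement.

```python
from math import comb


def historical_rate(oos_periods, observed_periods):
    """Historical out-of-stock probability for one item/store/register."""
    return oos_periods / observed_periods


def binomial_pmf(k, n, p):
    """P(exactly k out-of-stock periods among the next n periods)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(the first out-of-stock happens in period k), for k = 1, 2, ..."""
    return (1 - p) ** (k - 1) * p


def prob_out_of_stock_within(n, p):
    """P(at least one out-of-stock within n periods): the geometric CDF."""
    return 1 - (1 - p) ** n


# Hypothetical history: 6 out-of-stock hours in 120 observed hours.
p = historical_rate(6, 120)
print(prob_out_of_stock_within(8, p))  # chance of a stock-out in an 8-hour day
print(binomial_pmf(2, 8, p))           # chance of exactly 2 stock-out hours
```

The two models answer different questions: the binomial gives how many stock-outs to expect in a window, while the geometric gives how long until the first one.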


Freight Railroad Network Major Speeds Up Business by Reducing Unnecessary Stopping of Trains

• Excessive, unwarranted braking events on locomotives of a freight railroad network generated from their PTC platform were resulting in unnecessary stopping of trains

• Examined locomotive brake data along with engineer characteristics, wayside data streams, etc., and PTC signal data after ingesting into Infosys Information Platform on AWS Cloud

• Used R for the prediction model and Tableau for visualization

• Analysis of locomotives by PTC profile and attributes involved in braking events provided a 360-degree view; Pareto analysis of target type of braking events

• Basic text mining (N-Gram)/word cloud on delay comments and locomotive delay prediction

• Real-time prediction with higher accuracy helped increase the average velocity of freight trains by 1 MPH. Asset utilization to increase by 45%; potential to yield additional $200 million revenue

Opportunity

Data and Analytics

Results
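The N-gram step mentioned above can be sketched in a few lines. The project used R; this Python version is an illustration only, and the delay comments are invented examples, since the deck shows no actual comment text.

```python
import re
from collections import Counter


def ngrams(text, n=2):
    """Lower-case the text, split it into word tokens, return its n-grams."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(comments, n=2, k=3):
    """Count n-grams across all comments and return the k most frequent."""
    counts = Counter()
    for comment in comments:
        counts.update(ngrams(comment, n))
    return counts.most_common(k)


# Hypothetical delay comments.
comments = [
    "Signal failure at junction",
    "Crew change delay",
    "signal failure near yard",
    "Waiting on crew change",
]
print(top_ngrams(comments, n=2, k=2))
```

The resulting counts are what a word cloud would size its terms by; frequent bigrams such as a recurring failure type point to the delay causes worth modeling.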


ATM Manufacturer Improves ATM Service Levels

• A leading ATM manufacturing and service company wanted to reduce the cost of maintaining ATMs while improving customer service and SLAs

• Ability to predict which ATM is going to fail next week (with 80% accuracy) based on alert and incident data from XMS and machine data

• Ingested ticket data from 8500 ATMs (4 million records) into Infosys Information Platform (IIP) and cleaned it (dates, spaces, null fields, etc.) in 27 sec on a 10-node AWS cluster (32 CPU, 64GB RAM, 640 GB SSD storage)

• Enriched data to create required fields for analysis and executed prediction in 60 msec using logistic regression in Spark

• Data ingested into Oracle and visualized using Tableau

• Reduced downtime of ATMs by 10%, enabling an increase of 15% in transactions that would have been lost because of faulty ATMs

Opportunity

Data and Analytics

Results
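The prediction itself ran as logistic regression in Spark. To show what such a model does without a cluster, here is a plain-Python stand-in trained by batch gradient descent. The two features (alerts in the last week, incidents in the last month) and all sample rows are hypothetical; the deck does not list the fields that were actually engineered.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x, weights, bias):
    """Probability that the ATM described by feature vector x fails."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, weights, bias) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical rows: [alerts_last_week, incidents_last_month]; 1 = failed.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [5, 2], [6, 3], [7, 2], [8, 4]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic(X, y)
print(predict_proba([6, 3], weights, bias))  # noisy ATM
print(predict_proba([0, 0], weights, bias))  # quiet ATM
```

Scoring one ATM is a single dot product and a sigmoid, which is why prediction can be fast once the features are prepared.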


Mining Major Uses Streaming Data from Trucks to Enable Predictive Maintenance

• The failure of autonomous and unmanned trucks in mining affects the entire supply chain and has critical consequences for business

• Close to 200 sensors are embedded in each of these vehicles to feed information about both the vehicle and the terrain which is sent back to a remote operations center

• Data derived from the sensors was streamed into Infosys Information Platform (IIP) at the rate of 27,000 messages/sec through Kafka. Mathematical model developed in Apache Spark was applied on this data to derive the probability of equipment failure, equipment replacement requirement, or if the trucks are in good shape

• Overlaid results on a world map in a native HTML5 application on top of IIP, which supports drill-down by clicking on the color codes

• With the elastic and scalable IIP, we are supporting the client’s business plan to increase the trucks by at least 300% with more sensor data loads

• Real-time analytics to get a view of adjustments to the production schedule, spare part order release, etc.

Opportunity

Data and Analytics

Results


Predictive Maintenance in Pharma Manufacturing

• Unexpected equipment breakdown in pharmaceutical manufacturing plants impacts process quality and drug safety apart from issues such as down-time and maintenance costs

• Identified a specific set of equipment (reactors and the upstream de-gasifier) as a logical sub-process for analytics

• Ingested 18 months’ SAP PM data, PLC data and alarm patterns in Infosys Information Platform (IIP) to train a logistic model and then validated the model with 4 months of data

• Built 1-day and 2-day prediction models and model score cut-off was chosen to balance the capture rate vs. false alarm percentage as these two represent a trade-off

• Predicted major breakdown well ahead (1 or more days vs. just hours) with 80% accuracy and reduced false alarms

• Maintenance teams appreciated the value and the pharma major is planning to scale implementation across multiple plants

Opportunity

Data and Analytics

Results
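The cut-off trade-off described above can be made concrete with a small sweep. This is a sketch under stated assumptions: "capture rate" is taken to mean the share of real breakdowns that raise an alarm, and "false alarm percentage" the share of raised alarms that are not followed by a breakdown. The scores, labels and the 25% limit are hypothetical.

```python
def rates_at_cutoff(scores, labels, cutoff):
    """Return (capture_rate, false_alarm_share) when alarming at score >= cutoff."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    capture = tp / sum(labels)
    false_alarm = fp / (tp + fp) if tp + fp else 0.0
    return capture, false_alarm


def choose_cutoff(scores, labels, max_false_alarm):
    """Pick the cut-off with the highest capture rate whose false alarm
    share stays within the limit; returns (cutoff, capture, false_alarm)."""
    best = None
    for cutoff in sorted(set(scores)):
        capture, false_alarm = rates_at_cutoff(scores, labels, cutoff)
        if false_alarm > max_false_alarm:
            continue
        if best is None or capture > best[1]:
            best = (cutoff, capture, false_alarm)
    return best


# Hypothetical validation set: model score per day, 1 = breakdown followed.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]

print(choose_cutoff(scores, labels, max_false_alarm=0.25))
```

Lowering the cut-off captures more breakdowns but raises more false alarms, so the limit passed to `choose_cutoff` encodes how much alarm fatigue the maintenance team will tolerate.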

© 2015 Infosys Limited, Bangalore, India. All Rights Reserved. Infosys believes the information in this document is accurate as of its publication date; such information is subject to change without notice. Infosys acknowledges the proprietary rights of other companies to the trademarks, product names and such other intellectual property rights mentioned in this document. Except as expressly permitted, neither this documentation nor any part of it may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, printing, photocopying, recording or otherwise, without the prior permission of Infosys Limited and/or any named intellectual property rights holders under this document.

Thank You