Houston Energy Data Science Meet up_TIBCO Slides

Page 1: Houston Energy Data Science Meet up_TIBCO Slides

TIBCO Advanced Analytics

Houston Energy Data Science Meetup

Michael O’Connell

Chief Data Scientist

[email protected]

@moc_tib

August 2015

Page 2: Houston Energy Data Science Meet up_TIBCO Slides

TIBCO Analytics – Insight to Action

• Data Science Process
• Data Analysis Pipeline
• Understand – Anticipate – Act
• Advanced Analytics
• TIBCO’s R engine
• GeoLocation Analytics
• Real-Time Analytics
• Remote Monitoring – the Digital Nervous System
• Software & APIs
• Wrap-Up / Questions

[Diagram labels: Increase Productivity, Grow Revenue, Reduce Risk, Value, ROI]

© Copyright 2000-2015 TIBCO Software Inc.

Page 3: Houston Energy Data Science Meet up_TIBCO Slides

What is a Data Scientist

“Data Science”

• Engineer/Marketeer (“Address the business issue”): Knows the business problem but doesn’t know how to prepare data or build models.
• Statistician (“Build the best model”): Knows how to develop appropriate models to address business problems, but is in short supply and can’t deploy to IT or business systems.
• IT / Developer (“Manage my infrastructure”): Knows databases, application provisioning and development tools, but isn’t familiar with data meaning or analytical workflow purpose.

Page 4: Houston Energy Data Science Meet up_TIBCO Slides

Data → Insight → Action

[Diagram: data analysis pipeline; grouping of labels is approximate.]
• Stages: Business Case; Data Access & Prep; Prepare Data; Explore Data; Exploratory Data Analysis; Features; Model & Predict; Deploy Champion Model; Challenger Models; Test & Learn; Visual Dashboard; Dashboard Updates
• Data preparation: Filter, Map, Merge, Shape, Aggregate
• Exploration and modeling: Visualize, Segment, Regression, Additive Models, Forest, Ensemble, Propensity, Affinity
• Applications: Channel, Social, Loyalty, Campaign, Pricing, Promotion
• Deployment: Guided, Deploy, In-Line, Improve
• Data: At Rest, In Motion
• Value Theses: Increase Productivity, Grow Revenue, Reduce Risk, ROI, Value

Page 5: Houston Energy Data Science Meet up_TIBCO Slides

TIBCO Analytics Stack

[Diagram label: Spotfire Desktop]

Page 6: Houston Energy Data Science Meet up_TIBCO Slides

Enterprise Data Access

[Diagram: data sources and access paths; grouping of labels is approximate.]
• Local data sources (drag-and-drop): Access, Excel, STDF
• Information Services (join, transform, reusable, parameterized, dynamic query for in-memory use): databases via JDBC/ODBC, including MySQL, SQL Server, Oracle, PostgreSQL, Teradata, Netezza, Hadoop, SFDC, etc.
• Direct Query (dynamically query and retrieve data for visualization and analysis): direct connection to databases, including Oracle, Teradata, Aster, MS SSAS, MySQL, Netezza, Hadoop, OBIEE, etc.
• Connectors: ODBC, OLE DB, SqlClient
• Custom GUI-driven data access via SDK
• Data fabric: Siebel eBusiness, Oracle E-Business, SAP BW, SAP R/3, Salesforce, XML, web services, RDBMS, flat files, spreadsheets

Page 7: Houston Energy Data Science Meet up_TIBCO Slides

Analytics Spectrum

TIBCO is the only analytics platform that provides business value across the Analytics Spectrum

• Value to the Organization: Immediate to Long-Term Competitive Advantage
• Analytics Maturity: Measure, Diagnose, Predict, Optimize, Operationalize, Automate
• Self-service Dashboards; Predictive and Prescriptive Analytics; Event Processing

Page 8: Houston Energy Data Science Meet up_TIBCO Slides

Analytics Spectrum

Page 9: Houston Energy Data Science Meet up_TIBCO Slides

Analytics Spectrum

Page 10: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Spotfire

Page 11: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Spotfire

[Chart examples: 3D rotate, Surface, Polar, Contour, Network, Funnel]

Page 12: Houston Energy Data Science Meet up_TIBCO Slides

Spotfire Extensions – d3 and JS

[Chart examples: Sankey, Venn, Chord, Donut, Dials, Gantt]

Page 13: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Dashboards

Page 14: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Dashboards

Page 15: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Dashboards

Page 16: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Dashboards

Page 17: Houston Energy Data Science Meet up_TIBCO Slides

Visual Analytics – Dashboards

Page 18: Houston Energy Data Science Meet up_TIBCO Slides

Dashboards and Themes

Page 19: Houston Energy Data Science Meet up_TIBCO Slides

Dashboards and Themes

Page 20: Houston Energy Data Science Meet up_TIBCO Slides

Dashboards and Themes

Page 21: Houston Energy Data Science Meet up_TIBCO Slides

Jaspersoft Pixel-Perfect Embedded Reports

Page 22: Houston Energy Data Science Meet up_TIBCO Slides

Analytic Workspaces & Analytic Fabric

[Diagram labels: Business Analysts; Report Developers; Analytic Workspaces; Analytic Fabric; APIs; Search, Sharing etc.; Data Discovery; Analytics; Dashboards; Reports]

Page 23: Houston Energy Data Science Meet up_TIBCO Slides

Spotfire is Super Simple to Use

US Homeless Analysis: Step-by-Step YouTube Playlist
• Dashboards
• Predictive
• GeoLocation

Page 24: Houston Energy Data Science Meet up_TIBCO Slides

Analytics Spectrum

Page 25: Houston Energy Data Science Meet up_TIBCO Slides

Advanced Analytics Ecosystem

Page 26: Houston Energy Data Science Meet up_TIBCO Slides

TIBCO Enterprise Runtime for R (TERR)

• TIBCO has rewritten R as a Commercial Compute Engine
  • Latest statistics scripting engine: S → S-PLUS® → R → TERR
  • Runs R code including CRAN packages

• Engine internals rebuilt from scratch at low-level
  • Redesigned data objects, memory management
  • High performance + Big Data

• TERR is licensed from TIBCO
  • TERR installs (free) with Spotfire Analyst / Desktop and other TIBCO products (CEP, Stats)
  • Spotfire Server can manage all TERR / R scripts, artifacts for reuse
  • Standalone Developer Edition: www.TIBCOmmunity.com
  • Supported by TIBCO

Page 27: Houston Energy Data Science Meet up_TIBCO Slides

TERR Performance

[Chart: Model Fitting, 5 Million Rows: TERR 7X faster. Model Scoring, 20 Million Rows: 84X.]

Page 28: Houston Energy Data Science Meet up_TIBCO Slides

Spotfire-TERR – Local and Server

• Build models on data using local TERR engine embedded in Spotfire

• Build models on big data directly in TERR on server and display results in Spotfire

• Run TERR as parallel sessions on Hadoop cluster, controlled and visualized in Spotfire

Both Spotfire and TERR can load data from any ODBC or JDBC compliant source or from Spotfire Data Connections (SDC) or Spotfire Information Links stored in the Spotfire library.

[Diagram "Spotfire and TERR local": Data Source; ODBC, JDBC, SDC, File; Data; Spotfire; Local TERR; Data Function]

[Diagram "TERR on server": Data Source; ODBC, JDBC, SDC, File; Larger Data; TERR; TSSS; Modeling; Results; Spotfire]

Page 29: Houston Energy Data Science Meet up_TIBCO Slides

Simple Predictive Analytics – Forecasting & Modeling

• Contextual Analytics: Forecasting
• Contextual Analytics: Machine Learning

Page 30: Houston Energy Data Science Meet up_TIBCO Slides

Extensible Predictive Analytics – Analysis Workflows

Interactive Spotfire Analytics with R
- Data Function
- Robust Cluster Analysis
- Any Analysis in R / CRAN

Variables driving segments
- Random Forest

Revenue by product
- Color by segment

Page 31: Houston Energy Data Science Meet up_TIBCO Slides

Free Scripts - GeoCluster [kmeans(x,y)]
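
The slide title points at R's kmeans applied to x and y coordinates. As a rough illustration of what such a geo-clustering step computes, here is a minimal Python sketch of Lloyd's k-means algorithm. It is not the TERR script from the slide: the function name `kmeans_xy`, the sample `well_locations`, and the choice to treat coordinates as planar (plain Euclidean distance) are all assumptions made for the example.

```python
import random


def kmeans_xy(points, k, iterations=50, seed=0):
    """Cluster (x, y) points with Lloyd's algorithm; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k randomly chosen input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: (px - centers[c][0]) ** 2 + (py - centers[c][1]) ** 2)
            for px, py in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[c])  # leave an empty cluster's center in place
        if new_labels == labels and new_centers == centers:
            break  # nothing changed, so the clustering has converged
        centers, labels = new_centers, new_labels
    return centers, labels


# Hypothetical planar coordinates for two groups of sites (illustrative only).
well_locations = [(0, 0), (0, 1), (1, 0), (1, 1),
                  (10, 10), (10, 11), (11, 10), (11, 11)]
centers, labels = kmeans_xy(well_locations, k=2)
```

The returned labels can be used to color points on a map, which is the role the cluster assignment plays in a geo-clustering view. For real longitude/latitude data over large areas, a projected coordinate system or a great-circle distance would be a safer choice than the planar distance assumed here.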

Page 32: Houston Energy Data Science Meet up_TIBCO Slides

Free Scripts - Contours [contourLines(x,y,z)]

Page 33: Houston Energy Data Science Meet up_TIBCO Slides

Spotfire-TERR : Data Types, Analyses

Spotfire data functions support any type of data as input and output parameters to and from TERR.

TERR data functions are used for data prep, integration, predictive & prescriptive analytics, …

TERR data functions can output content metadata to Spotfire:
• formatting of fields
• handling of binary data including images and geospatial objects.

[Diagram labels: Spotfire; Data Function; TERR; Rows, Columns, Values, Tables, Metadata, Blobs, Geometries, Images]

Page 34: Houston Energy Data Science Meet up_TIBCO Slides

Trade Areas

Page 35: Houston Energy Data Science Meet up_TIBCO Slides

Smart Routing

Page 36: Houston Energy Data Science Meet up_TIBCO Slides

Smart Routing

Page 37: Houston Energy Data Science Meet up_TIBCO Slides

Smart Routing

Page 38: Houston Energy Data Science Meet up_TIBCO Slides

Production Forecasting

Forecast Production – Set Expected Production for Wells

• Resource Play
  • Repeatable distribution for EUR
  • Offset not reliable predictor
  • Continuous hydrocarbon system
  • Free hydrocarbon not held in place by hydrodynamics

• Geologic Subset
  • Analogous Wells
  • Geology, completion, spacing, vintage

• Analysis and Data
  • Production forecasting (EUR)
  • Probability of production
  • Proven (P90), Probable (P50), Possible (P10)
  • Cluster and Regression Analysis
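
To make the P90 / P50 / P10 labels concrete, the sketch below reads them off an empirical distribution of EUR (estimated ultimate recovery) values from analogous wells. It assumes the exceedance convention common in reserves reporting, where P90 is the value that 90% of outcomes are expected to exceed, i.e. the 10th percentile. The function name `reserve_percentiles` and the `analog_eur` figures are hypothetical and not taken from the slides.

```python
import statistics


def reserve_percentiles(eur_values):
    """Return P90/P50/P10 estimates from EUR values of analogous wells.

    Assumes the exceedance convention: P90 is exceeded by 90% of the
    values, so it is the 10th percentile of the distribution.
    """
    # quantiles(..., n=10) returns the 9 decile cut points (10th to 90th percentile).
    deciles = statistics.quantiles(eur_values, n=10, method="inclusive")
    return {
        "P90": deciles[0],  # proven: low estimate
        "P50": deciles[4],  # probable: median
        "P10": deciles[8],  # possible: high estimate
    }


# Hypothetical EUR values for a geologic subset of analogous wells (arbitrary units).
analog_eur = [120, 95, 210, 160, 300, 140, 180, 250, 110, 400, 170]
estimates = reserve_percentiles(analog_eur)
```

In practice the distribution would be built per geologic subset (geology, completion, spacing, vintage), so that each new well is compared against wells that are genuinely analogous.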

Page 39: Houston Energy Data Science Meet up_TIBCO Slides

Proven, Probable and Possible Production

Probability: Proven & Probable Production

Page 40: Houston Energy Data Science Meet up_TIBCO Slides

Completions Optimization

• Business Opportunities
  • Completions optimization by well
  • Production prediction for new wells
  • Identify factors driving production vs expected production, e.g. operator

• Analysis and Data
  • Subsurface (e.g. Spectra)
  • Location
  • Completions
  • Production

• Value and Financial Impact
  • Optimal completions
  • Operations management
  • Asset valuation & “where to drill”

Optimize Completions – Location, Subsurface

Page 41: Houston Energy Data Science Meet up_TIBCO Slides

Maintenance Optimization

Equipment Reliability - Refining

• Business Opportunities
  • Maintenance optimization

• Analysis and Data
  • Failure times and locations
  • Maintenance and failure costs
  • Root cause analysis

• Value and Financial Impact
  • Visibility into maintenance expenses and root causes
  • Optimal maintenance scheduling

Page 42: Houston Energy Data Science Meet up_TIBCO Slides

Winner of 2014 Strata Cloudera Award for Best Advanced Analytics Application

Big Data Analytics with Spotfire and TERR

Page 43: Houston Energy Data Science Meet up_TIBCO Slides

Big Data Analytics with TERR

TERR on the nodes of Hadoop Cluster

TERR in Action

• Hadoop cluster compute
• TIBCO Cloud Compute Grid
• TIBCO StreamBase
• TIBCO BusinessEvents
• KNIME
• Lavastorm
• RStudio
• Teradata
• TIBCO Statistics Services
• TIBCO Spotfire

Page 44: Houston Energy Data Science Meet up_TIBCO Slides

Predictive & Collaborative Analytics

Library of Data Functions – everyone shares
• Analysts use functions – no code
• Coders develop new functions – R

Data Function Samples
• Ship with Spotfire Server
• Geospatial
  • Computations with polygons on a map
  • Computing optimal routes in logistics
• Machine Learning
  • Fitting models and making predictions
• Applications
  • Customers, Finance, Machines, …

[Screenshot labels: IT View - Governance; User View - Functions]

Page 45: Houston Energy Data Science Meet up_TIBCO Slides

Analytics Spectrum

Page 46: Houston Energy Data Science Meet up_TIBCO Slides

Insight to Action

BIG DATA AT REST

FAST DATA IN MOTION

Page 47: Houston Energy Data Science Meet up_TIBCO Slides

Analyze And Act On “Critical Business Moments”

• Optimize pricing
• Check for fraud
• Make offer to customer
• Restock inventory
• Reroute transport
• Give customer service
• Proactively maintain machines

Page 48: Houston Energy Data Science Meet up_TIBCO Slides

Managing Supply Chain

Big Data
– Analysis of production
– Analysis of contracts and product inventory

Fast Data
– Location data from ships and trains, weather and tides
– Manage product supply
– Optimize fuel use

Benefits
– Optimize product contracts
– Maximize product shipped
– Minimize logistics cost

Page 49: Houston Energy Data Science Meet up_TIBCO Slides

Managing Supply Chain

Page 50: Houston Energy Data Science Meet up_TIBCO Slides

Managing Industrial Equipment

Big Data
– Analysis of production
– Failure analytics

Fast Data
– Real-time sensor data
– Leading indicator for shutdowns
– Drilling: kick detection
– Flow monitoring

Benefits
– Reduced NPT: Big $$s
– System reliability
– Efficient drilling

Page 51: Houston Energy Data Science Meet up_TIBCO Slides

Equipment Monitoring & Management

Data Monitoring
• Motor temperature
• Motor vibration
• Current
• Intake pressure
• Intake temperature
• Flow

Pump Components
• Electrical power cable
• Pump
• Intake
• Protector
• ESP motor
• Pump monitoring unit

Video: https://youtu.be/vIVepQRl5SY

Page 52: Houston Energy Data Science Meet up_TIBCO Slides

Equipment Monitoring & Management

• Business Opportunities
  • Pump health & performance surveillance
  • Condition-based maintenance

• Analysis and Data
  • Effects of operating conditions on performance
  • Effects of suppliers on reliability
  • Component faults and failure analysis

• Value and Financial Impact
  • Prioritization of engineering and retrofit
  • Supplier involvement in system reliability
  • ID systems for Engineering focus
  • Warranty cost recovery

Video: https://youtu.be/vIVepQRl5SY

Page 53: Houston Energy Data Science Meet up_TIBCO Slides

Equipment Monitoring & Management

Video: https://youtu.be/vIVepQRl5SY

Page 54: Houston Energy Data Science Meet up_TIBCO Slides

Sensor Analytics

Trend Analysis (Combination of Rules, CUSUM Analysis)
– Location Change: Variable moves up or down
– Slope Change: Variable changes trend
– Variance Change: Variable becomes more/less volatile

Statistical Analysis (Statistical Process Control)
– Process Threshold: Shewhart control chart

Machine Learning
– Failure Model: y (0/1) = f(X, b) + e; f = logistic regression, trees, svm, nnet, ...
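
Two of the rules named on this slide can be sketched in a few lines: a Shewhart-style process threshold and a CUSUM check for a location change. The Python below is an illustrative sketch only. It assumes 3-sigma limits estimated from a baseline window and a tabular two-sided CUSUM; the function names, the `slack` and `threshold` parameters and the sample readings are hypothetical and not taken from the presentation. Slope and variance changes are not covered.

```python
import statistics


def shewhart_violations(values, baseline, sigmas=3.0):
    """Indices of values outside mean +/- sigmas * stdev of the baseline window."""
    center = statistics.mean(baseline)
    spread = statistics.stdev(baseline)
    upper = center + sigmas * spread
    lower = center - sigmas * spread
    return [i for i, v in enumerate(values) if v > upper or v < lower]


def cusum_alarm(values, target, slack, threshold):
    """Tabular two-sided CUSUM.

    Returns the first index where the upward or downward cumulative sum
    exceeds the threshold, or None if no alarm is raised.
    """
    high = low = 0.0
    for i, v in enumerate(values):
        high = max(0.0, high + (v - target - slack))  # accumulates upward drift
        low = max(0.0, low + (target - v - slack))    # accumulates downward drift
        if high > threshold or low > threshold:
            return i
    return None


# Hypothetical motor-temperature readings: a stable baseline, then a small sustained shift.
baseline = [70, 71, 69, 70, 71, 69, 70, 70]
live = [70, 71, 69, 71.5, 71.5, 71.5, 71.5, 71.5, 71.5]

spikes = shewhart_violations(live, baseline)                       # no single point breaches the limits
shift_at = cusum_alarm(live, target=70, slack=0.5, threshold=3.5)  # the sustained shift is caught
```

The example shows why the two rules complement each other: the shift to 71.5 stays inside the 3-sigma band, so the threshold rule is silent, while the CUSUM accumulates the small excess over the target and raises an alarm after a few readings.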

Page 55: Houston Energy Data Science Meet up_TIBCO Slides

1. Analytics models

2. Data streams

3. Calculations on live data

4. Analysis notifications

Fast Data Analytics
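
The four numbered steps can be sketched end to end in plain Python. This is a toy stand-in and not TIBCO's streaming products: the logistic coefficients are invented rather than fitted, the generator `sensor_stream` replaces a real live feed, and the pump name, field names and alert threshold are hypothetical.

```python
import math

# 1. Analytics model: a logistic failure model with hypothetical, pre-fitted coefficients.
COEFFICIENTS = {"intercept": -8.0, "motor_temp": 0.05, "vibration": 1.5}


def failure_probability(reading):
    """Score one sensor reading: p = 1 / (1 + exp(-(b0 + b1*temp + b2*vibration)))."""
    z = COEFFICIENTS["intercept"]
    z += COEFFICIENTS["motor_temp"] * reading["motor_temp"]
    z += COEFFICIENTS["vibration"] * reading["vibration"]
    return 1.0 / (1.0 + math.exp(-z))


# 2. Data stream: a generator standing in for a live sensor feed.
def sensor_stream():
    yield {"pump": "ESP-1", "motor_temp": 90.0, "vibration": 0.5}
    yield {"pump": "ESP-1", "motor_temp": 110.0, "vibration": 1.0}
    yield {"pump": "ESP-1", "motor_temp": 130.0, "vibration": 2.5}


# 3. Calculations on live data, and 4. analysis notifications.
def monitor(stream, alert_threshold=0.8):
    """Score each reading as it arrives and collect an alert when risk is high."""
    alerts = []
    for reading in stream:
        risk = failure_probability(reading)
        if risk >= alert_threshold:
            alerts.append((reading["pump"], round(risk, 3)))
    return alerts


alerts = monitor(sensor_stream())
```

Only the last reading crosses the alert threshold here. In a real deployment the model would be fitted on historical failure data, and the notification step would publish to an alerting channel instead of returning a list.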

Video: https://youtu.be/vIVepQRl5SY

Page 56: Houston Energy Data Science Meet up_TIBCO Slides

Live Data

Video: https://youtu.be/vIVepQRl5SY

Page 57: Houston Energy Data Science Meet up_TIBCO Slides

Alerting In The Field

Page 58: Houston Energy Data Science Meet up_TIBCO Slides

Crowdsourcing Solutions

Page 59: Houston Energy Data Science Meet up_TIBCO Slides

Industrial Equipment Management Improves Operations

Page 60: Houston Energy Data Science Meet up_TIBCO Slides

IT & Governance

• Library Services
  • Centralized management of Spotfire analysis files, metadata, information links, TERR scripts, …

• User Services
  • User authentication, role-based authorization

• Audit Services
  • Content access, modification, deletion
  • User authentication, data access, library operations

• Usage Log Analytics
  • Sessions, Users, Admin, Local Files
  • Library, Information Links, Admin, Detailed Logs

• Analysis Profiler
  • Automate every analysis file during upgrade / migration

Page 61: Houston Energy Data Science Meet up_TIBCO Slides

TIBCO’s Fast Data Platform Architecture

Page 62: Houston Energy Data Science Meet up_TIBCO Slides

Learn how some of the major players in the energy industry are using Spotfire to revolutionize their business:

• How to minimize risks by better understanding exposure to asset integrity issues

• Using analytics to control margins and conduct customer profiling

• Leveraging forensics to reduce NPT and monitor production

• Production optimization techniques

http://energyforum.tibco.com/

Energy Forum

September 1st – 2nd | Norris Conference Center | Houston, TX

Page 63: Houston Energy Data Science Meet up_TIBCO Slides

spotfire.tibco.com/demos

spotfire.tibco.com/tips/

tibco.com/blog/tag/trends-and-outliers/

www.tibcommunity.com

Resources spotfire.tibco.com

Page 64: Houston Energy Data Science Meet up_TIBCO Slides

LinkedIn

• Monthly Knowledge Share: hosted by Quintus
• LinkedIn: hosted by Syntelli

Page 65: Houston Energy Data Science Meet up_TIBCO Slides

Webcasts

Insight and Action - Analyzing Your OSIsoft PI System Data

Tuesday, July 7, 2015, 1 PM EST

Presenters: Michael O'Connell & Dave Leigh

Predictive Analytics in the Energy Sector: Asset Valuation

Tuesday, July 28, 2015, 1 PM EST

Presenters: Michael O'Connell & Peter Shaw with Haas Engineering and R Lacy

Seeing Stars: the Gartner BI Bakeoff

Recording, May 27, 2015

Presenters: Anna Nowakowska & Michael O'Connell

Events spotfire.tibco.com/about-us/events

Page 66: Houston Energy Data Science Meet up_TIBCO Slides

Spotfire Ecosystem

Page 67: Houston Energy Data Science Meet up_TIBCO Slides

Thank you!

Michael O’Connell, PhD
Chief Data Scientist
TIBCO
[email protected]
@moc_tib
http://about.me/moconnell
+1-919-7401560

First to Insight, First to Action
