TIBCO Advanced Analytics
Houston Energy Data Science Meetup
Michael O’Connell
Chief Data Scientist
@moc_tib
August 2015
• Data Science Process
• Data Analysis Pipeline
• Understand – Anticipate – Act
• Advanced Analytics
• TIBCO’s R engine
• GeoLocation Analytics
• Real-Time Analytics
• Remote Monitoring – the Digital Nervous System
• Software & APIs
• Wrap-Up / Questions
TIBCO Analytics – Insight to Action
• Value: Increase Productivity, Grow Revenue, Reduce Risk, ROI
© Copyright 2000-2015 TIBCO Software Inc.
What is a Data Scientist
“Data Science”
• Engineer/Marketeer (“Address the business issue”): Knows the business problem but doesn’t know how to prepare data or build models.
• Statistician (“Build the best model”): Knows how to develop appropriate models to address business problems but is in short supply and can’t deploy IT or business systems.
• IT / Developer (“Manage my infrastructure”): Knows databases, application provisioning and development tools but isn’t familiar with data meaning or analytical workflow purpose.
Data → Insight → Action
[Figure: analysis pipeline from business case to deployment]
• Stages: Business Case, Data Access & Prep, Exploratory Data Analysis, Features, Model & Predict, Deploy Champion Model, Test & Learn, Visual Dashboard, Dashboard Updates, Improve
• Value theses: Increase Productivity, Grow Revenue, Reduce Risk, ROI, Value
• Prepare Data: Filter, Map, Merge, Shape, Aggregate
• Explore Data: Segment, Visualize
• Models: Regression, Additive Models, Forest, Ensemble; Champion and Challenger Models
• Applications: Propensity, Affinity, Pricing, Promotion; Channel, Social, Loyalty, Campaign
• Deployment: Guided, In-Line; At Rest, In Motion
TIBCO Analytics Stack
[Figure: data access paths into Spotfire Desktop]
• Local data sources (drag-and-drop): Access, Excel, STDF, flat files, spreadsheets, XML
• Information Services (join, transform, reusable, parameterized, dynamic query for in-memory use): databases via JDBC/ODBC, including MySQL, SQL Server, Oracle, PostgreSQL, Teradata, Netezza, Hadoop, SFDC, etc.
• Direct Query (dynamically query and retrieve data for visualization and analysis): direct connection to databases including Oracle, Teradata, Teradata Aster, MS SSAS, MySQL, Netezza, Hadoop, OBIEE, etc.; ODBC, OLE DB, SqlClient
• Enterprise Data Access (custom GUI-driven data access via SDK): Siebel eBusiness, Oracle E-Business, SAP BW, SAP R/3, Salesforce, web services, RDBMS
• Data Fabric
Analytics Spectrum
TIBCO is the only analytics platform that provides business value across the Analytics Spectrum
• Capabilities: Self-service Dashboards, Predictive and Prescriptive Analytics, Event Processing
• Analytics Maturity: Measure → Diagnose → Predict → Optimize → Operationalize → Automate
• Value to the Organization: Immediate → Long-Term Competitive Advantage
Visual Analytics – Spotfire
• Chart types: 3D rotate, Surface, Polar, Contour, Network, Funnel
Spotfire Extensions – d3 and JS
• Chart types: Sankey, Venn, Chord, Donut, Dials, Gantt
Visual Analytics – Dashboards
Dashboards and Themes
Jaspersoft Pixel-Perfect Embedded Reports
Analytic Workspaces & Analytic Fabric
• Analytic Workspaces: Business Analysts, Report Developers
• Analytic Fabric: Data Discovery, Analytics, Dashboards, Reports
• APIs: Search, Sharing, etc.
Spotfire is Super Simple to Use
US Homeless Analysis: Step-by-Step YouTube Playlist
• Dashboards
• Predictive
• GeoLocation
Advanced Analytics Ecosystem
TIBCO Enterprise Runtime for R (TERR)
• TIBCO has rewritten R as a Commercial Compute Engine
  • Latest statistics scripting engine: S → S-PLUS® → R → TERR
  • Runs R code including CRAN packages
• Engine internals rebuilt from scratch at low level
  • Redesigned data objects, memory management
  • High performance + Big Data
• TERR is licensed from TIBCO
  • TERR installs (free) with Spotfire Analyst / Desktop and other TIBCO products (CEP, Stats)
  • Spotfire Server can manage all TERR / R scripts, artifacts for reuse
  • Standalone Developer Edition: www.TIBCOmmunity.com
  • Supported by TIBCO
TERR Performance
• Model Fitting, 5 Million Rows: TERR 7X faster
• Model Scoring, 20 Million Rows: TERR 84X faster
Spotfire-TERR – Local and Server
• Build models on data using local TERR engine embedded in Spotfire
• Build models on big data directly in TERR on server and display results in Spotfire
• Run TERR as parallel sessions on Hadoop cluster, controlled and visualized in Spotfire
[Figure: two configurations. Spotfire and local TERR: Data Source (ODBC, JDBC, SDC, File) → Data → Spotfire with local TERR. TERR on server: Data Source (ODBC, JDBC, SDC, File) → TERR on TSSS (larger data, modeling) → Data Function → Results → Spotfire]
Both Spotfire and TERR can load data from any ODBC- or JDBC-compliant source, from Spotfire Data Connections (SDC), or from Spotfire Information Links stored in the Spotfire library.
Simple Predictive Analytics – Forecasting & Modeling
Contextual Analytics – Forecasting
Contextual Analytics – Machine Learning
Extensible Predictive Analytics – Analysis Workflows
Interactive Spotfire Analytics with R
• Data Function
• Robust Cluster Analysis
• Any Analysis in R / CRAN
Variables driving segments
• Random Forest
Revenue by product
• Color by segment
Free Scripts - GeoCluster [kmeans(x,y)]
Free Scripts - Contours [contourLines(x,y,z)]
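The GeoCluster script named above groups locations by their x/y coordinates with k-means. The following is a minimal sketch of that idea in Python rather than R/TERR; the function name `kmeans_xy`, the sample well coordinates, and the use of plain Euclidean distance are illustrative assumptions, not the contents of the actual Spotfire script.

```python
import math
import random


def kmeans_xy(points, k, iterations=50, seed=0):
    """Lloyd's k-means on 2-D points; returns (centers, labels).

    Hypothetical helper for illustration only. Distances are plain
    Euclidean, which is only an approximation for raw lat/long values.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centers[j] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return centers, labels


# Made-up coordinates forming two well-separated groups.
wells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans_xy(wells, k=2)
```

In Spotfire the resulting labels would typically be returned as a new column and used to color map markers; here they are just a Python list.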
Spotfire-TERR : Data Types, Analyses
Spotfire data functions support any type of data as input and output parameters to and from TERR.
TERR data functions used for data prep, integration, predictive & prescriptive analytics, …
TERR data functions can output content metadata to Spotfire
• formatting of fields
• handling of binary data including images and geospatial objects
[Figure: Spotfire and the TERR Data Function exchange Rows, Columns, Values, Tables, Metadata, Blobs, Geometries, Images]
Trade Areas
Smart Routing
Production Forecasting
Forecast Production – Set Expected Production for Wells
• Resource Play
  • Repeatable distribution for EUR
  • Offset not reliable predictor
  • Continuous hydrocarbon system
  • Free hydrocarbon not held in place by hydrodynamics
• Geologic Subset
  • Analogous Wells
  • Geology, completion, spacing, vintage
• Analysis and Data
  • Production forecasting (EUR)
  • Probability of production
  • Proven (P90), Probable (P50), Possible (P10)
  • Cluster and Regression Analysis
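The Proven (P90), Probable (P50) and Possible (P10) figures can be read off the EUR distribution of a set of analogous wells. The sketch below assumes the usual exceedance convention, in which P90 is the value that 90% of wells exceed (the 10th percentile of the sorted data). The function names and the sample EUR values are hypothetical and are not taken from the presentation.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics; q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def reserves_summary(eur_values):
    """Summarize an EUR sample as P90 / P50 / P10.

    Assumption: exceedance convention, so P90 (Proven) is the low
    estimate and P10 (Possible) is the high estimate.
    """
    return {
        "P90_proven": percentile(eur_values, 10),
        "P50_probable": percentile(eur_values, 50),
        "P10_possible": percentile(eur_values, 90),
    }


# Made-up EUR values for eleven analogous wells (arbitrary units).
analog_eur = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
summary = reserves_summary(analog_eur)
```

A real workflow would first restrict the sample to the geologic subset (similar geology, completion, spacing, vintage) before computing the percentiles.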
Proven, Probable and Possible Production
Probability: Proven & Probable Production
Completions Optimization
• Business Opportunities
  • Completions optimization by well
  • Production prediction for new wells
  • Identify factors driving production vs expected production, e.g. operator
• Analysis and Data
  • Subsurface (e.g. Spectra)
  • Location
  • Completions
  • Production
• Value and Financial Impact
  • Optimal completions
  • Operations management
  • Asset valuation & “where to drill”
Optimize Completions – Location, Subsurface
Maintenance Optimization
• Business Opportunities
  • Maintenance optimization
• Analysis and Data
  • Failure times and locations
  • Maintenance and failure costs
  • Root cause analysis
• Value and Financial Impact
  • Visibility into maintenance expenses and root causes
  • Optimal maintenance scheduling
Equipment Reliability - Refining
Winner of 2014 Strata Cloudera Award for Best Advanced Analytics Application
Big Data Analytics with Spotfire and TERR
Big Data Analytics with TERR
TERR on the nodes of Hadoop Cluster
TERR in Action
• Hadoop cluster compute
• TIBCO Cloud Compute Grid
• TIBCO StreamBase
• TIBCO BusinessEvents
• KNIME
• Lavastorm
• RStudio
• Teradata
• TIBCO Statistics Services
• TIBCO Spotfire
Predictive & Collaborative Analytics
Library of Data Functions – everyone shares
• Analysts use functions – no code
• Coders develop new functions – R
Data Function Samples
• Ship with Spotfire Server
• Geospatial
  • Computations with polygons on a map
  • Computing optimal routes in logistics
• Machine Learning
  • Fitting models and making predictions
• Applications
  • Customers, Finance, Machines, …
User View - Functions
IT View - Governance
Insight to Action
• BIG DATA AT REST
• FAST DATA IN MOTION
Analyze And Act On “Critical Business Moments”
• Optimize pricing
• Check for fraud
• Make offer to customer
• Restock inventory
• Reroute transport
• Give customer service
• Proactively maintain machines
Managing Supply Chain
• Big Data
  – Analysis of production
  – Analysis of contracts and product inventory
• Fast Data
  – Location data from ships and trains, weather and tides
  – Manage product supply
  – Optimize fuel use
• Benefits
  – Optimize product contracts
  – Maximize product shipped
  – Minimize logistics cost
Managing Industrial Equipment
• Big Data
  – Analysis of production
  – Failure analytics
• Fast Data
  – Real-time sensor data
  – Leading indicator for shutdowns
  – Drilling: kick detection
  – Flow monitoring
• Benefits
  – Reduced NPT: Big $$s
  – System reliability
  – Efficient drilling
Equipment Monitoring & Management
Data Monitoring
• Motor temperature
• Motor vibration
• Current
• Intake pressure
• Intake temperature
[Figure: Pump Components, with flow direction: Electrical power cable, Pump, Intake, Protector, ESP motor, Pump monitoring unit]
Video: https://youtu.be/vIVepQRl5SY
• Business Opportunities
  • Pump health & performance surveillance
  • Condition-based maintenance
• Analysis and Data
  • Effects of operating conditions on performance
  • Effects of suppliers on reliability
  • Component faults and failure analysis
• Value and Financial Impact
  • Prioritization of engineering and retrofit
  • Supplier involvement in system reliability
  • ID systems for Engineering focus
  • Warranty cost recovery
Sensor Analytics
• Trend Analysis (Combination of Rules, CUSUM Analysis)
  – Location Change: variable moves up or down
  – Slope Change: variable changes trend
  – Variance Change: variable becomes more/less volatile
• Statistical Analysis (Statistical Process Control)
  – Process Threshold: Shewhart control chart
• Machine Learning
  – Failure Model: y (0/1) = f(X, b) + e; f = logistic regression, trees, svm, nnet, ...
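Two of the rules listed for sensor analytics, the Shewhart process threshold and CUSUM detection of a location change, can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions: a baseline window of in-control readings is available, the Shewhart limits are set at 3 sigma, and the CUSUM slack `k` and decision threshold `h` are the common textbook defaults of 0.5 and 5 sigma. The function names and temperature values are made up.

```python
from statistics import mean, stdev


def shewhart_alarms(baseline, readings, n_sigma=3.0):
    """Indices of readings outside center +/- n_sigma * sigma of the baseline."""
    center = mean(baseline)
    sigma = stdev(baseline)
    upper = center + n_sigma * sigma
    lower = center - n_sigma * sigma
    return [i for i, x in enumerate(readings) if x > upper or x < lower]


def cusum_alarms(baseline, readings, k=0.5, h=5.0):
    """Two-sided tabular CUSUM; k (slack) and h (threshold) are in sigma units."""
    center = mean(baseline)
    sigma = stdev(baseline)
    hi = lo = 0.0
    alarms = []
    for i, x in enumerate(readings):
        z = (x - center) / sigma
        hi = max(0.0, hi + z - k)  # accumulates evidence of an upward shift
        lo = max(0.0, lo - z - k)  # accumulates evidence of a downward shift
        if hi > h or lo > h:
            alarms.append(i)
            hi = lo = 0.0  # restart accumulation after signalling
    return alarms


# Made-up motor temperatures: a stable baseline, then a small sustained
# upward drift followed by one large spike.
baseline = [70.0, 71.0, 69.0, 70.5, 69.5, 70.0, 70.2, 69.8]
live = [70.0, 70.9, 71.0, 71.1, 70.9, 71.0, 71.1, 71.0, 75.0]

spike_alarms = shewhart_alarms(baseline, live)
drift_alarms = cusum_alarms(baseline, live)
```

The design point is that the two rules complement each other: the Shewhart limits react to single large excursions, while CUSUM accumulates small deviations and can flag a sustained drift before any individual reading crosses a limit.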
Fast Data Analytics
1. Analytics models
2. Data streams
3. Calculations on live data
4. Analysis notifications
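The four steps above (a model, a data stream, a calculation on each live record, a notification) can be sketched as a small scoring loop. The example below assumes a logistic-regression failure model that has already been fitted offline; the coefficient values, field names, pump identifiers and the 0.8 alert threshold are all hypothetical, and a production system would use an event-processing engine rather than a Python generator.

```python
import math


def failure_probability(reading, coefficients, intercept):
    """Logistic model: P(failure) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    score = intercept + sum(
        coefficients[name] * reading[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-score))


def monitor(stream, coefficients, intercept, threshold=0.8):
    """Score each live reading and yield a notification when risk is high."""
    for reading in stream:
        p = failure_probability(reading, coefficients, intercept)
        if p >= threshold:
            yield {"pump_id": reading["pump_id"], "probability": round(p, 3)}


# 1. Analytics model: hypothetical coefficients from an offline fit.
coefficients = {"motor_temp": 0.08, "vibration": 1.5}
intercept = -12.0

# 2. Data stream: made-up sensor records, one per pump.
stream = [
    {"pump_id": "ESP-1", "motor_temp": 90.0, "vibration": 0.5},
    {"pump_id": "ESP-2", "motor_temp": 140.0, "vibration": 2.5},
]

# 3 and 4. Calculations on live data, then notifications.
notifications = list(monitor(stream, coefficients, intercept))
```

Keeping the model as plain coefficients makes the scoring step cheap enough to run on every incoming record, which is the point of separating model fitting (data at rest) from scoring (data in motion).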
Live Data
Alerting In The Field
Crowdsourcing Solutions
Industrial Equipment Management Improves Operations
IT & Governance
• Library Services
  • Centralized management of Spotfire analysis files, metadata, information links, TERR scripts, …
• User Services
  • User authentication, role-based authorization
• Audit Services
  • Content access, modification, deletion
  • User authentication, data access, library operations
• Usage Log Analytics
  • Sessions, Users, Admin, Local Files
  • Library, Information Links, Admin, Detailed Logs
• Analysis Profiler
  • Automate every analysis file during upgrade / migration
TIBCO’s Fast Data Platform Architecture
Energy Forum
September 1st – 2nd | Norris Conference Center | Houston, TX
Learn how some of the major players in the energy industry are using Spotfire to revolutionize their business:
• How to minimize risks by better understanding exposure to asset integrity issues
• Using analytics to control margins and conduct customer profiling
• Leveraging forensics to reduce NPT and monitor production
• Production optimization techniques
http://energyforum.tibco.com/
Resources
• spotfire.tibco.com
• spotfire.tibco.com/demos
• spotfire.tibco.com/tips/
• tibco.com/blog/tag/trends-and-outliers/
• www.tibcommunity.com
• Monthly Knowledge Share, hosted by Quintus
• LinkedIn, hosted by Syntelli
Events: spotfire.tibco.com/about-us/events
Webcasts
• Insight and Action - Analyzing Your OSIsoft PI System Data
  Tuesday, July 7, 2015, 1 PM EST
  Presenters: Michael O'Connell & Dave Leigh
• Predictive Analytics in the Energy Sector: Asset Valuation
  Tuesday, July 28, 2015, 1 PM EST
  Presenters: Michael O'Connell & Peter Shaw with Haas Engineering and R Lacy
• Seeing Stars: the Gartner BI Bakeoff
  Recording, May 27, 2015
  Presenters: Anna Nowakowska & Michael O'Connell
Spotfire Ecosystem
Thank you!
Michael O’Connell, PhD
Chief Data Scientist
TIBCO [email protected]
@moc_tib
http://about.me/moconnell
+1-919-7401560
First to Insight, First to Action