TIBCO Analytics Meetup
TIBCO Data Science Team
January 22nd 2019
The following information is confidential information of TIBCO Software Inc. Use, duplication, transmission, or republication for any purpose without the prior written consent of TIBCO is expressly prohibited.
CONFIDENTIALITY
© Copyright 2000-2019 TIBCO Software Inc.
This document (including, without limitation, any product roadmap or statement of direction data) illustrates the planned testing, release and availability dates for TIBCO products and services. This document is provided for informational purposes only and its contents are subject to change without notice. TIBCO makes no warranties, express or implied, in or relating to this document or any information in it, including, without limitation, that this document, or any information in it, is error-free or meets any conditions of merchantability or fitness for a particular purpose. This document may not be reproduced or transmitted in any form or by any means without our prior written permission.
The material provided is for informational purposes only, and should not be relied on in making a purchasing decision. The information is not a commitment, promise or legal obligation to deliver any material, code, or functionality. The development, release, and timing of any features or functionality described for our products remains at our sole discretion.
During the course of this presentation TIBCO or its representatives may make forward-looking statements regarding future events, TIBCO’s future results or our future financial performance. These statements are based on management’s current expectations. Although we believe that the expectations reflected in the forward-looking statements contained in this presentation are reasonable, these expectations or any of the forward-looking statements could prove to be incorrect and actual results or financial performance could differ materially from those stated herein. TIBCO does not undertake to update any forward-looking statement that may be made from time to time or on its behalf.
DISCLAIMER
TIBCO Analytics Meetup – opportunities to learn and network
Meetup group keeps growing and now has 670+ members!
Join this TIBCO Analytics Meetup group and receive automatic invites to future TIBCO Analytics Meetups
http://www.meetup.com/TIBCOSpotfireOnlineusergroup/
https://bit.ly/2Jm4iOn
• Welcome and TIBCO update by Michael O’Connell
• New Statistica Data Function in Spotfire by Tomáš Jurczyk
• Anomaly Detection with Deep Learning Autoencoder by David Katz
• TIBCO Community update by Heleen Snelting
• Live Q&A
Please submit your questions at any time via Q&A option
We will answer at the end or will get back to you via email
Agenda
Connected Intelligence Portfolio Analytics & Data Science
Michael O’Connell, Chief Analytics Officer
Value = Find + Act on Critical Business Moments
Critical business moments occur in every facet of enterprise operations.
They drive competitive differentiation, customer satisfaction and business success.
smart cross-sell offers
predict impending equipment failure
anomaly detection and risk management
optimize routes
anticipate and handle disruptions
optimize pricing
prevent fraud
deliver proactive customer service
TIBCO Connected Intelligence
Data Visualization
Data Science
Data Management
Integration and API Management
Messaging and Events Processing
Digital Process Automation
Portfolio Approach
Best-in-class data, analytics & integration
Tightly integrated but loosely coupled
Available anywhere and everywhere
Closed loop, continuous learning apps
Visual Analytics
Data Virtualization
Data Science
Master Data
Streaming
Low-Code Dev
Data Catalog
Edge Integration
On-Prem, Cloud, Hybrid
Accelerators
Cloud Starters
Applications
Applications: Cloud Starters, Accelerators
Business-focused TIBCO Apps using Data Science
Configurable visuals, data science, low code component apps
Anomaly Detection Risk Management Customer Engagement
Modeling
+ Visual composition
+ Notebook
+ Native ML/DL & OS integrations
Operations Deployment
+ Model lifecycle mgt
+ Visual analytics & BI
+ Batch automation
+ Real-time event processing
Data Ingest / Data Prep
+ Distributed compute
+ Dedicated host
+ Feature Engineering
BI, e.g. medics for epidemic monitoring
Engineer, e.g. yield optimization
Quant, e.g. trading desk reconciliation
Data Engineer, Data Scientist, Citizen Data Scientist
Data Scientist, Citizen Data Scientist
Analytics Operations, IT / Software Engineer
FUNCTION
USER
Business User, IT / Administration
Business Applications
+ Predictive maintenance
+ Engineering/IoT/IIoT
+ Customer Analytics
+ Supply Chain ...
TIBCO Data Science
Comprehensive Data Access, Framework Integration
Native Cloud Authoring / Integration || AWS/SageMaker, GCP/TensorFlow, MSFT Azure Services || SAS, MatLab
TIBCO Data Science – Author
Visual | Notebook | Code | Automation | Recommendations
TIBCO Data Science (formerly Alpine) Statistica Unified Author
Licenses
Ecosystem, Core, Distributed
TIBCO Data Science – Operations
Model Management | Collaboration | Governance | Automation
Distributed compute, Model Management
Governance, Job Scheduler
APIs, Portable Format
Collaboration, Project Mgt
Audit Trail DB, Model Management
Governance, Job Scheduler
APIs, Portable Format
Scoring, Monitoring & Alerting
Federated Data Science Services
TIBCO R (TERR) Service, 3rd Party Engines (SAS, Matlab, OS R, Python)
Job Management, APIs
Unified Services Licenses – based on capacity
Swap & Expand as needs evolve
TIBCO Data Science – formerly Alpine
Connected Teams
• Collaborate on data science projects with business
Scalable Algorithms
• Transform and model across data sources without moving data
• In-database and in-lake data prep, analytics and machine learning
• Python Notebooks, PySpark
• Spark Auto-Tuning
Web Visual & Notebook Composition
• Rapid deployment in Cloud and on-premise
• Amazon, Microsoft Azure
• EMR, Amazon Redshift / HDInsights, Azure SQL
TIBCO Data Science – Statistica
Statistica – Data Science Workbench
• Data ingest, blending, in-db and in-lake processing
• 1000’s of stats, machine and deep learning
• Supervised Learning – models, ensembles
• Unsupervised Learning – anomaly detection
• Marketplaces – Azure ML, Algorithmia, Apervita
• Open source – R, Python, C#, H2O, CNTK Deep NN
Model & Rule Lifecycle Management
• Create workspace, manage, version, deploy, embed
• Repeatable, GXP validation, audit, version control
Models in Operations
Connect ML pipelines to business processes and applications
Impact daily decision making
• Embed predictive insights in business applications
• Visualize analysis results in Spotfire and provide access across the organization
• Create self-service web interfaces
Deploy models to production
• Push real-time engines to AWS, Azure, or Cloud Foundry with PFA model formats
• Connect to streaming data (e.g. StreamBase) with PMML model exports
• Schedule batch runs of Workflows, Python Notebooks and SQL files
SAP HANA
Java
Teradata
Models in Operations Streaming
TIBCO Data Science – AMS - StreamBase
TIBCO Data Science – PFA – StreamBase / TIBCO Cloud™ Live Apps
AWS and TIBCO Data Science
TIBCO Anomaly Detection
ML
ETL
Spotfire DS
In-DB ETL
In-DB ML
Data Science + Visual Analytics
TIBCO Data Science : REST API to Algos
Spotfire Data Function
Data : never moves
Visual Analytics
Spotfire X Analytics Experience
Augmented: Search & AI-Powered Insights
Start in seconds, instant insights
Automated: Automagical Dataflows
Author & audit with automatically recorded dataflow steps
Agile: Reimagined User Interface
Agile exploration made even easier
Accelerated: Real-time Insights
Real-time awareness and action
Spotfire Visual Analytics Apps
Spotfire X: AI Recommendations
Automated, Augmented insight discovery & display
Variable Relationships Algorithm
• User selects target variable
• AI algorithm discovers variable relationships to target
• Spotfire displays in order of strength - best practices graphics
• 4 clicks to brush-linked dashboard
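The slide does not say which measure Spotfire uses to rank variable relationships. As a rough illustration only, the sketch below ranks candidate variables by absolute Pearson correlation with a user-selected target. The measure, the function names and the sample columns are all assumptions made for this example, not Spotfire's algorithm.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def rank_by_strength(columns, target):
    """Return (name, |r|) pairs sorted from strongest to weakest."""
    y = columns[target]
    scores = [(name, abs(pearson(xs, y)))
              for name, xs in columns.items() if name != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical data table: "yield" is the user-selected target.
data = {
    "yield": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temperature": [2.1, 3.9, 6.2, 8.0, 9.8],  # nearly linear in yield
    "humidity": [5.0, 1.0, 4.0, 2.0, 3.0],     # weakly related
    "line_id": [7.0, 7.0, 7.0, 7.0, 7.0],      # constant
}
ranking = rank_by_strength(data, "yield")
```

Correlation only captures linear relationships; a production ranking would likely need a measure that also handles categorical and non-linear effects.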
Spotfire X: NLQ Search
Augmented NLQ search & display
Search = NLQ
• User searches (with text input) multiple data tables + relationships
• NLQ displays appropriate (best-practice) graphs
• Brush-linked dashboard constructed entirely from chat interaction
Data Mashup & Data Wrangling
Auto-magical Dataflows
Simple, powerful expressions & functions
• Author (point-click and code) with automatically recorded dataflow steps
• Edit from data canvas – including upstream
Automatic data lineage
Edit Transforms Upstream
Predictive & Machine Learning
Automation of Data Science and ML for business users
• Inbuilt menus: for regression, classification, trees, cluster analysis, forecast
• Data functions – running models from R, Python, Statistica
Many vertical specific apps
• Templates and data functions
• TIBCO Community
Spotfire Data Function
Spotfire Expression
Spotfire X: Data Streams
Real-Time Awareness and Action
Real Time Data Visualization
• User selects time window
• Data Streams shows live-update visualizations
• Calculations e.g. forecast available on live viz
Users and Data
• Many real-time data sources supported
• Simple, unified data source connector
• Brush-linked – like all Spotfire marking
Spotfire Differentiation
Brush-Linked Analytic Apps
Data Mashup / Data Wrangling
AI-Powered Insights + NLQ (*New* Spotfire X)
Real-Time Data Streams (*New* Spotfire X)
One click Web Deployment
Geo-Analytics
Predictive Analytics and Machine Learning
Server-side Scalability, Governance, Security
Configurability and APIs
Visual Analytics Apps
Enterprise Class
Geo & Predictive Analytics
Visual & Geo Analytics
Data Science
Data Streams
Data Wrangling
Anomaly Detection
MANUFACTURING: Anomaly Detection
Applications of Anomaly Detection
IoT & Engineering
• Formula One
• Energy - Production Surveillance, Drilling Optimization
• Predictive Maintenance (PdM, CbM)
• Manufacturing - Yield Optimization
Financial Services
• Trade Surveillance
• Fraud Detection
Healthcare & Pharmaceutical
• Patient risk – cardiac arrest, sepsis, surgery infection
Customer Analytics
• Churn Prevention
• Cross-Sell, Up-Sell
Key Issue – Understand Variability
Anomaly Detection
Techniques
Deep Learning dimension reduction – non-linear
• Reconstruction error
Cluster Analysis / PCA
• Distance to closest centroid
Features show root cause
Models can be used for scoring new event stream
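The "distance to closest centroid" score, applied to a new event stream, can be sketched as follows. The centroids, the threshold and the event values are hypothetical; in practice the centroids would come from a fitted cluster model and the threshold from the distribution of distances on normal data.

```python
import math

def nearest_centroid_distance(point, centroids):
    """Euclidean distance from `point` to the closest centroid."""
    return min(math.dist(point, c) for c in centroids)

def score_events(points, centroids, threshold):
    """Return (distance, is_anomaly) for each incoming point."""
    results = []
    for p in points:
        d = nearest_centroid_distance(p, centroids)
        results.append((d, d > threshold))
    return results

# Hypothetical cluster centres from an earlier fit, and three new events.
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_events = [(0.0, 1.0), (9.0, 10.0), (5.0, 5.0)]
scores = score_events(new_events, centroids, threshold=3.0)
```

The first two events sit close to a centroid; the third lies between clusters and is flagged because its distance exceeds the threshold.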
TIBCO Templates, Accelerators
Autoencoder
• Spotfire and TIBCO R (TERR) Template & Data Function*
• TIBCO Data Science – Teams
Cluster Analysis
• TIBCO Data Science – Statistica*
Data Science / TIBCO Combinations
• Risk Management Accelerator
• High-Tech Manufacturing Accelerator
* Demoing Today
Anomaly Detection Action: TIBCO + F1
Tomáš Jurczyk, Data Scientist
Statistica Data Function in Spotfire - Demo
Statistica Data function for Spotfire
https://community.tibco.com/wiki/statistica-data-function
Statistica
Spotfire
Objective: Demonstrate - via a build from scratch example - how this Spotfire-Statistica integration empowers Analysts and Citizen Data Scientists
Steps of demo:
1. Build a workspace in Statistica
2. Call this Statistica workspace as a data function in Spotfire on a different data source
3. Build visualisations in Spotfire based on the new information about assigned clusters and additional outputs
4. Optional: Add action controls to run the data function according to user’s choice of parameters
Anomaly Detection with Deep Learning Autoencoder
David Katz, Principal Consultant
Topics to cover
Deep Learning Autoencoder
• Autoencoders and Anomaly Detection
• Software Tools
Demo
Notes on Setup
Autoencoders and Anomaly Detection
• Create an identity transformation with constraints
• Analogy to Principal Components – but much more flexible/accurate.
• Anomalies – the output is the reconstructed input, but it does not fully match the original input => Reconstruction Error
• Reconstruction Error:
  • Overall
  • By component
  • By sample
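The three views of reconstruction error listed above can be computed along these lines. This is a minimal sketch assuming mean squared error, with a hypothetical `reconstructed` matrix standing in for the autoencoder's output; the slides do not state which error measure the demo uses.

```python
def squared_errors(original, reconstructed):
    """Element-wise squared differences, same shape as the inputs."""
    return [[(x - r) ** 2 for x, r in zip(row, rec_row)]
            for row, rec_row in zip(original, reconstructed)]

def reconstruction_error(original, reconstructed):
    """Return (overall, by_component, by_sample) mean squared errors."""
    sq = squared_errors(original, reconstructed)
    n_rows, n_cols = len(sq), len(sq[0])
    by_sample = [sum(row) / n_cols for row in sq]  # one value per row
    by_component = [sum(row[j] for row in sq) / n_rows
                    for j in range(n_cols)]        # one value per column
    overall = sum(by_sample) / n_rows              # single summary value
    return overall, by_component, by_sample

# Rows are samples, columns are components (input variables).
original = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
reconstructed = [[1.0, 2.0], [3.0, 3.0], [5.0, 9.0]]  # stand-in for model output
overall, by_component, by_sample = reconstruction_error(original, reconstructed)
```

Here the by-sample view singles out the third row as the most anomalous, and the by-component view shows that the second variable accounts for all of the error.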
Deep Learning Software
H2O DeepLearning
• Simple Structure of networks – just specify number of fully-connected layers (and optionally dropout)
• Settings for Sparse data can outperform GPU
• H2O Deep Water Project –
  • uses GPU but no longer being developed
  • H2O recommends Keras for new projects
Keras
• Front end for Tensorflow, CNTK, Theano, MXNet
• Specify complex network topologies
• Use different types of layers – CNN, RNN,…
• Can leverage GPU
TIBCO Interfaces to Deep Learning Software
Time Based Multivariate data in Spotfire
• TIBCO Community Exchange Template using R/TERR data functions with H2O – available now – showing in this presentation
• Python data functions with TensorFlow
• Without time based features – available now
• With time based features – coming soon
Time Based Multivariate data in TIBCO Data Science
• Available now on AWS Marketplace using TensorFlow / Sagemaker
• Runs on Clusters
• Some post-processing features shown today not yet integrated
TIBCO Community page on Anomaly Detection: https://bit.ly/2SVUkY6
Industrial Plant: Raw Time series Data
Demo
Case Study - Manufacturing
Validation Error has clear minimum
Model Configuration & Evaluation
Note problems in convergence here.
Minimum error looks like a random variation.
Build & Evaluate Model
Convergence Prevented by Severe Outlier in Validation Sample
Another way to spot these outliers – excessive variance for these variables
Severe Outliers Can Cause Failure to Converge
Especially in Validation Sample
Here we Mark Rows to Omit from Analysis
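In the demo the outlier rows are marked interactively. A comparable screen could be approximated in code; the sketch below is one possible approach, assuming a median/MAD rule with an arbitrary cutoff of 10, neither of which comes from the presentation. The function name and sample rows are hypothetical.

```python
import statistics

def mark_rows_to_omit(rows, cutoff=10.0):
    """Return indices of rows with any value far from its column median.

    Distance is measured in units of the column's median absolute
    deviation (MAD), which a single extreme value cannot inflate.
    """
    n_cols = len(rows[0])
    omit = set()
    for j in range(n_cols):
        column = [row[j] for row in rows]
        med = statistics.median(column)
        mad = statistics.median([abs(v - med) for v in column])
        if mad == 0:
            continue  # constant column: skip rather than divide by zero
        for i, v in enumerate(column):
            if abs(v - med) / mad > cutoff:
                omit.add(i)
    return sorted(omit)

# Hypothetical sensor readings; row 4 holds one severe outlier.
rows = [
    [1.0, 10.0],
    [1.2, 11.0],
    [0.9, 9.5],
    [1.1, 10.5],
    [1.0, 500.0],
    [0.8, 10.2],
]
omit = mark_rows_to_omit(rows)
kept = [row for i, row in enumerate(rows) if i not in omit]
```

A median-based rule is used here because a plain mean and standard deviation would themselves be distorted by the outlier being searched for.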
Without outlier points, we get good convergence:
TIBCO Community Update
Heleen Snelting, Director Data Science
TIBCO Community, the platform for our users! community.tibco.com
Statistica and Python Data Functions - Spotfire
https://community.tibco.com/wiki/statistica-data-function
Statistica
Spotfire
https://bit.ly/2EPx4Z8
Spotfire Data Function Tips & Tricks section
https://bit.ly/2CQAI2B
Statistica Workspace - A Graphical UI
https://bit.ly/2Ua1M2b
Anomaly Detection and Autoencoder ML
https://bit.ly/2Walf4G
Video: https://youtu.be/24F_Rx5IlHM
Slides: https://www.slideshare.net/AmazonWebServices/tibco-ai-and-data-science-innovation-with-amazon-sagemaker-ant329s-aws-reinvent-2018
TIBCO Labs - Participate and Innovate in TIBCO Connected Intelligence Cloud
https://community.tibco.com/wiki/tibco-labs
Or start with TIBCO Community Exchange https://bit.ly/2sDvTmI
Spotfire X, Spotfire Data Streams - learn more
What’s New in Spotfire TIBCO Community Page
https://community.tibco.com/wiki/whats-new-tibco-spotfire
Spotfire X Webinar Series
https://www.tibco.com/events/series/spotfire-x-webinar-series
Exploring NYC Traffic Accidents with Spotfire X blog: https://bit.ly/2FPZVeP
Wikipedia Spotfire X and Spotfire Data Streams
Blog: https://bit.ly/2R6ajRT
Demo: https://bit.ly/2Wac2cj
TIBCO Community how-to info: https://bit.ly/2Wac2cj
TIBCO Community Spirit! Some tips
Also use search engines such as Google to easily find relevant content on the TIBCO Community
Add #DataScience as a tag to promote review by the TIBCO Data Science team
Help expedite answers by following these Tips on Asking and Answering Questions
https://community.tibco.com/wiki/tips-asking-and-answering-questions
Search Answers before posting a question - with 15,000 questions your question may have been answered already!
“We are using the TIBCO Community all the time - we have been answering sometimes even our own questions by referring to existing content and answers” Spotfire Customer, UK
Don’t forget to give feedback to answers
What’s new in… for example TIBCO Data Virtualization
https://community.tibco.com/wiki/whats-new-tibco-data-virtualization
Customer Orientation and Customer Success Center - ideal for on-boarding and staying up to date
https://community.tibco.com/wiki/tibco-analytics-new-customer-orientation - feedback appreciated!
https://community.tibco.com/wiki/tibco-spotfire-customer-success-center
TIBCO Geo-Analytics capabilities
https://community.tibco.com/wiki/tibco-spotfire-location-analytics-mapping-geoanalytics-and-spatial-statistics
TIBCO Analytics Meetup pages with recordings and presentations
https://community.tibco.com/wiki/tibco-analytics-meetup
Data Literacy
https://community.tibco.com/wiki/data-literacy
AI on Demand - TIBCO Data Science Meetup Tour 2019 - dates and locations to be published soon
https://community.tibco.com/wiki/ai-demand-data-science-operations
Live TIBCO Spotfire and other Meetups - next Feb 13 (London and Houston) and Feb 26 (Aberdeen)!
https://www.meetup.com/pro/tibco/
Top TIBCO Community links to bookmark
TIBCO NOW now.tibco.com
Questions & Contact
Thank you!
Michael O’Connell [email protected]
@MichOConnellH
Heleen Snelting [email protected] @HeleenSnelting
TIBCO Community: community.tibco.com
TIBCO Exchange: community.tibco.com/exchange
Spotfire Trial: spotfire.tibco.com/trial