Applying the R Language in Streaming Applications and Business Intelligence
Lou Bajuk-Yorgan, Sr. Dir., Product Management, TIBCO Analytics
lbajuk@tibco.com, @loubajuk
Analytic Challenges for Enterprises
• Big Data
  – More and more data, and the expectation to do something with it
• Competitive Pressures
  – Deeper insights into data: apply Advanced Analytics
  – Smarter decisions: broaden analytic usage to a wider community beyond Data Scientists
  – Faster decisions, both human and automated
• Agile response to evolving opportunities and threats
  – Answers (and the questions to ask) change rapidly
R can help…
• Agile
  – Easy prototyping of new models and analysis
• Deeper insights
  – Huge array of analytic methods available
  – The “best” method to solve a given problem is likely available
…but has its own challenges
• Performance
  – Not designed for real time or Big Data applications
• Broader usage
  – Hard for non-Data Scientists to use directly
  – Challenging to integrate into enterprise applications
  – Performance, commercial support and Intellectual Property concerns
• Compromises which impact Agility
  – Recode in a new, less agile environment
  – Rewrite, use specialized R packages to solve one problem better
What would the ideal solution look like?
• A single environment that would allow you to prototype in R, and deploy to production in R
  – Without recoding, without delay, without compromises
  – Enable agile response to changing opportunities and threats
Requires
• Analytic flexibility, power and breadth of R
• High performance, scalable, robust platform
• Easy to embed in Business Intelligence, Real time and custom applications
• Fully supported for mission critical applications
• Allows R users to continue to work in their preferred development environments (e.g., RStudio)
TIBCO Enterprise Runtime for R (TERR)
• Unique, enterprise-grade statistics engine, architected from the ground up by TIBCO
  – Based on TIBCO’s long history and expertise with S+
  – Better performance and memory management than open source R
• Designed for R language compatibility
  – Wide range of built-in analytic methods
  – Extensible through R community packages
• Designed for commercial embeddability
  – TIBCO licensed & supported product
  – Not GPL, not a repackaging of the open source R engine
• TERR extends the reach of R in the enterprise
  – Develop code in open source R
  – Deploy on a commercially-supported and robust platform
  – Without the delay and cost of rewriting your code
  – Embed in Data Discovery, BI and real time applications
TERR Performance
Better performance and memory management than open source R
– Handles much larger data sets in memory
– Designed and architected for 64-bit platforms
– Linear, predictable performance as data set sizes increase
Summary
• Small to moderate size data sets
  – Many common operations
  – TERR: 2-10x as fast as open source (OS) R
• Larger data sets
  – Common operations (e.g., model scoring) or complex, real-world scripts
  – TERR: 10-100x as fast as OS R
Predictions using SVMs from the e1071 package
Fitting and Scoring Generalized Linear Models
Task                        OS R        TERR       Speedup
Model fitting on 5M rows    107.1 sec   17.5 sec   6.1x
Model scoring on 20M rows   84.2 sec    1 sec      84.2x
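The GLM fitting and scoring comparison above can be tried at small scale with a sketch like the following. It is not the benchmark script behind the slide: the data are simulated, the column names (x1, x2, y) and row counts are assumptions chosen so the example runs quickly, and it uses only base R functions (glm, predict, system.time), so the same script could in principle be timed under both open source R and TERR.

```r
# Minimal sketch: time GLM fitting and scoring in base R.
# Row counts are deliberately small; the slide reports 5M and 20M rows.
set.seed(42)

n_fit   <- 50000   # assumed size, for illustration only
n_score <- 200000  # assumed size, for illustration only

# Simulate a binary response driven by two numeric predictors
# (hypothetical columns, not from the original benchmark).
make_data <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  p  <- plogis(-1 + 0.8 * x1 - 0.5 * x2)
  data.frame(x1 = x1, x2 = x2, y = rbinom(n, size = 1, prob = p))
}

train <- make_data(n_fit)
new   <- make_data(n_score)

# Fit a logistic regression and record elapsed wall-clock time.
fit_time <- system.time(
  fit <- glm(y ~ x1 + x2, data = train, family = binomial())
)["elapsed"]

# Score the new rows and record elapsed wall-clock time.
score_time <- system.time(
  scores <- predict(fit, newdata = new, type = "response")
)["elapsed"]

cat(sprintf("fit: %.2f sec, score: %.2f sec\n", fit_time, score_time))
```

Timings depend heavily on hardware and data shape, so any speedup measured this way is indicative only.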
© Copyright 2000-2015 TIBCO Software Inc.
All Users: Business Analysts, Data Scientists, App Developers, Sys Admins
All Data: Historical & Real-Time; Internal & External; Structured, Unstructured & Semi-Structured
Visual analytics empowering you to make strong decisions using your data
• Descriptive & Diagnostic Analytics
• Predictive & Prescriptive Analytics
• Content Analytics
• Location Analytics
• Event Analytics
• Fast Data Analytics
Self-Service Analytics without sacrificing strong Central Governance
Predictive Analytics Ecosystem
Leverage existing analytic investments in a unified framework
Create guided analytic applications
Rapid start with easy-to-use tools
Native scripting in R
TIBCO Enterprise Runtime for R (TERR)
Open Source R
MATLAB®
SAS®
SQL/In-database Analytics
Hadoop/Spark for Big Data
S+
KNIME®
Lavastorm Analytics®
Example 1: Embedded TERR in Spotfire
• Spotfire: Data Discovery and Visualization platform for Business Users and Analysts
  – Separate analytics platform, independent of TERR/R
• Easily enhance Spotfire analyses and applications with R language scripts
  – Extend the impact of the Data Scientist/R by making their analytic insights available to a wider audience
Write R code directly in Spotfire; TERR executes locally or on server
Manage TERR analytics locally or in Server to reuse across community
Deploy TERR-powered applications to the web
Power of embedded Advanced Analytics
Advanced Analytic Applications in Spotfire
Customer Churn:
• Retain your most profitable customers
• Increase upsell, decrease churn
Fraud Detection:
• Reduce losses due to fraudulent transactions
Supply Chain Optimization:
• Anticipate peaks and lulls
• Optimize distribution centers
HR Planning:
• Predict employee attrition and optimize retention
Example 2: TERR in TIBCO’s Complex Event Processing
• TERR powers real-time advanced analytics in TIBCO “Fast Data”
• When an event is identified, the CEP application applies a predictive model, and then can trigger an automated business process
  – E.g., extend a mobile offer to a customer; stop a fraudulent transaction in process
Model
• Develop model
• Deploy via TERR in TIBCO Streambase or Business Events
Act
• Automatically monitor real-time transactions
• Automatically trigger action
Analyze
• Analyze data in Spotfire
• Uncover patterns, trends & correlations
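The "apply a predictive model to an event, then trigger an action" step can be illustrated with a small base R sketch. Everything here is an assumption made for illustration: the simulated transaction history, the column names (amount, foreign, fraud), the handle_event function, the 0.5 threshold and the action labels are all hypothetical. It does not use any Streambase or Business Events API; in a real deployment the CEP application would supply the event and act on the returned decision.

```r
# Minimal sketch: score one incoming event with a previously fitted model
# and map the score to an action. All names and data are hypothetical.
set.seed(1)

# "Model" step: fit a fraud model on simulated historical transactions.
n <- 5000
history <- data.frame(
  amount  = rexp(n, rate = 1 / 100),      # transaction amount
  foreign = rbinom(n, size = 1, prob = 0.1)  # 1 if foreign transaction
)
history$fraud <- rbinom(
  n, size = 1,
  prob = plogis(-4 + 0.01 * history$amount + 2 * history$foreign)
)
model <- glm(fraud ~ amount + foreign, data = history, family = binomial())

# "Act" step: score a single event and decide what to do with it.
handle_event <- function(event, model, threshold = 0.5) {
  p <- predict(model, newdata = as.data.frame(event), type = "response")
  action <- if (p >= threshold) "block_transaction" else "allow"
  list(probability = unname(p), action = action)
}

# Two example events, as a streaming application might pass them in.
low_risk  <- handle_event(list(amount = 20,  foreign = 0), model)
high_risk <- handle_event(list(amount = 900, foreign = 1), model)

cat("low-risk event:",  low_risk$action,  "\n")
cat("high-risk event:", high_risk$action, "\n")
```

Keeping the model fit separate from the per-event function mirrors the Model/Act split on the slide: the model is developed offline and only the lightweight scoring call runs for each event.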
Logistics Optimization
• Port Congestion Detection
  – Real time system triggers TERR
  – Analyzes port congestion
  – Recommends reduction of speed if no berths available
• Maritime Abnormality Detection
  – Based on Automatic Identification System info, TERR calculates likelihood of deviation from normal sailing routes
  – Alerts carrier & operator
Predictive Maintenance for Oil & Gas
• Oil & Gas Extraction
  – Maintenance downtime and equipment failures are costly
  – Engineers track sensor data to find leading indicators (temperature, vibration, etc.)
• Engineers usually use ad hoc rules on leading indicators
  – R/TERR used to develop predictive models for preventative maintenance
  – Deployed in real-time systems, alert when maintenance recommended
TERR Ecosystem
• TIBCO
  – Spotfire: BI and Data Discovery
  – Jaspersoft: pixel perfect reporting
  – Streambase: real time, streaming applications
• Lavastorm Analytics
  – Visual workflow tool for data management and analysis
  – Embedding TERR for R scripting and predictive tools
• RStudio IDE
  – Free, open source IDE widely used by the R Community
  – Fully compatible with TERR Developer Edition
• KNIME
  – Free, open source workflow tool for data management and analysis
  – TERR fully compatible with KNIME Interactive R Statistics Integration nodes
TERR for individual R users
• Empower R users
  – Enterprise platform for the deployment and integration of your work, without having to rewrite it!
• TERR Developer Edition
  – Full version of TERR engine for testing code prior to deployment
  – Compatible with RStudio & ESS Emacs
  – Free for non-production use
  – Supported through Community site
  – Available at Tap.tibco.com
TERR is R for the Enterprise
• Develop code in open source R, deploy on commercially-supported, robust platforms
  – Without recoding, without compromises
  – Save time & money, quickly respond to new threats and opportunities
• Tightly & efficiently embed R language functionality
  – Extend the power of R to a wider audience, more applications
Learn more and Try it yourself
• TERR Community at community.tibco.com
  – Resources, Documentation, FAQs, Forums
• More info at spotfire.tibco.com/terr
• TERR Developer Edition
  – Full version of TERR engine for testing code prior to deployment
  – Supported through TIBCOmmunity, download via tap.tibco.com
• Spotfire Free Trial: http://spotfire.tibco.com/trial
• Presentations: http://www.slideshare.net/loubajukyorgan/presentations
  – Slides @loubajuk
• We’re hiring Data Scientists! Contact me at lbajuk@tibco.com
• R Consortium Founding Member www.r-consortium.org