Easier, Faster, Smarter
© 2013 Datameer, Inc. All rights reserved.
View Recording
You can view the recording of this webinar at:
http://info.datameer.com/Online-Slideshare-Instant-Visualizations-in-Every-Step-of-Analysis-OnDemand.html
Instant Visualization in Every Step of Analysis
#datameer @karenhsumar @bigdata
About Our Speaker
Karen Hsu @karenhsumar
With over 15 years of experience in enterprise software, Karen Hsu has co-authored 4 patents and worked in a variety of engineering, marketing, and sales roles.

Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B, and data security solutions to market. Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.
Agenda
• Data Scientist Challenges
• Lean Analytics Process
• Technology
• Demonstration
• Q&A
Data Scientist Challenges in Analysis Process
• Multiple tools
• Unable to reproduce results
• Not business friendly
Lean Analytics Process and Metrics
Data Scientist Workflow
Identify Use Case, Integrate, Prepare, Analyze, Visualize, Deploy
Lean Analytics Process
Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy
Identify Use Case
Funnel Optimization
Behavioral Analytics
Fraud Prevention
EDW Optimization
Customer Segmentation
Integrate
• Codeless Data Integration
• Big Data Management
Prepare
• Uniqueness
• Accuracy
• Consistency
• Completeness
• Duplicates
Prepare
• Data Profiling
• Transformation
• Enrichment
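The data quality dimensions named under Prepare (completeness, uniqueness, duplicates) can be illustrated with a small profiling sketch. This is not Datameer functionality; it is a plain Python illustration, and the `profile` function, the field names, and the sample records are all hypothetical.

```python
from collections import Counter


def profile(rows, key_field):
    """Return simple completeness, uniqueness, and duplicate metrics.

    rows is a list of dict records; key_field is the (assumed) field
    that should identify a record uniquely.
    """
    total = len(rows)
    if total == 0:
        raise ValueError("cannot profile an empty data set")

    fields = sorted({field for row in rows for field in row})

    # Completeness: share of records with a non-empty value, per field.
    completeness = {
        field: sum(1 for row in rows if row.get(field) not in (None, "")) / total
        for field in fields
    }

    # Uniqueness: distinct key values divided by the number of records.
    key_counts = Counter(row.get(key_field) for row in rows)
    uniqueness = len(key_counts) / total

    # Duplicates: key values that occur more than once.
    duplicates = {key: n for key, n in key_counts.items() if n > 1}

    return {
        "completeness": completeness,
        "uniqueness": uniqueness,
        "duplicates": duplicates,
    }


# Hypothetical sample records, for illustration only.
records = [
    {"email": "a@example.com", "country": "US"},
    {"email": "b@example.com", "country": ""},
    {"email": "a@example.com", "country": "US"},
    {"email": "c@example.com", "country": None},
]
report = profile(records, "email")
```

Accuracy and consistency are left out of the sketch because they normally require a reference data set or business rules to check against.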
Analyze
• Interactive Spreadsheet
• Collaboration + Governance
• Smart Analytics
Visualize
• Freeform Visualization
• Visualize Anywhere
Predictive Deployment: Lost in Translation
• Scientist’s Desktop: SAS, R, IBM SPSS, Perl, Python
• Production Environment: Java, .NET, C, SQL
• SAS, R, IBM SPSS …: great for model building but not for scoring, even more so when it comes to Hadoop
Deploy
• Reproducing Projects
• Security
• Scheduling
• Monitoring
Predictive Deployment
Model Building
• Angoss
• BigML
• FICO Model Builder
• IBM SPSS
• KNIME
• KXEN
• MicroStrategy
• Open Data
• Pervasive DataRush
• RapidMiner
• R / Rattle
• SAS
• SAP Business Objects
• Salford Systems
• StatSoft STATISTICA
• SQL Server
• TIBCO Spotfire
• Custom Code, etc.
Model Deployment and Execution
• PMML (models)
• Universal PMML Plug-in (UPPI)
• Datameer Server
Deploy in minutes ...
Demonstration
Demonstration Flow
Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy
Identify Use Case
• What are the trends linking website behavior to lead activity to revenue?
• How does website behavior affect churn?
Integrate
Prepare
• Profile
• Identify Outliers
• Enrich
• Transform
• Convert
Analyze
Visualize
Deploy
Addressing Data Scientist Challenges in Workflow
• Multiple tools → One tool
• Reproduce results → Collaborate + Track
• Not for business → Ease of Use
For more information
Learn more: http://www.datameer.com/solutions/use-cases.html
Contact: @Datameer
Predictive Model Markup Language (PMML)
• PMML is an XML-based language used to define statistical and data mining models and to share them between compliant applications.
• It is a mature standard developed by the Data Mining Group (DMG) to avoid proprietary issues and incompatibilities and to deploy models.
• PMML eliminates the need for custom model deployment and ensures reliability.
• PMML defines a standard for representing not only data mining models but also data handling and data transformations (pre- and post-processing).
[Diagram: Models, Data, Transformations]
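To make "an XML-based language for models" concrete, here is a minimal sketch that reads a tiny PMML regression model with Python's standard library and scores one record. It is not the Universal PMML Plug-in or any Datameer API. The model, the `visits` and `churn_score` field names, and the coefficients are invented for illustration, and the scorer handles only an intercept plus `NumericPredictor` terms with the default exponent of 1.

```python
import xml.etree.ElementTree as ET

# PMML 4.1 namespace; other PMML versions use a different URI.
NS = {"pmml": "http://www.dmg.org/PMML-4_1"}

# Hypothetical model document, for illustration only.
PMML_DOC = """
<PMML xmlns="http://www.dmg.org/PMML-4_1" version="4.1">
  <Header description="Hypothetical churn score"/>
  <DataDictionary numberOfFields="2">
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="churn_score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression" modelName="churn_sketch">
    <MiningSchema>
      <MiningField name="visits"/>
      <MiningField name="churn_score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="visits" coefficient="-0.02"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def load_linear_model(pmml_text):
    """Extract the intercept and coefficients from a simple RegressionModel."""
    root = ET.fromstring(pmml_text.strip())
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    if table is None:
        raise ValueError("no RegressionTable found in the PMML document")
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, row):
    """Apply intercept + sum(coefficient * value) to one input record."""
    intercept, coefficients = model
    return intercept + sum(c * row[name] for name, c in coefficients.items())


model = load_linear_model(PMML_DOC)
result = score(model, {"visits": 10.0})
```

Because the model lives in a document rather than in tool-specific code, the same file could in principle be produced by one of the model-building tools and consumed by a separate scoring engine, which is the portability argument the slide makes.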