DESCRIPTION
Surveys reveal that concerns about data quality can create barriers for companies deploying analytics and BI initiatives. How can you readily identify and correct data quality issues at every step of your big data analysis to ensure accurate insights into customer behavior? In this webcast, we'll discuss how IT and business users can leverage self-service visualizations to quickly spot and correct data anomalies throughout the analytic process. You will learn how to:
- Continuously visualize a profile of your data to identify inconsistencies, incompleteness and duplicates
- Visualize machine learning and data mining, including clustering, decision tree analysis, column correlations and recommendations
- Create self-service visualizations for business and IT users
Easier, Faster, Smarter
© 2013 Datameer, Inc. All rights reserved.
View Recording
You can view the recording of this webinar at: http://info.datameer.com/Online-Slideshare-Instant-Visualizations-in-Every-Step-of-Analysis-OnDemand.html
Instant Visualization in Every Step of Analysis
#datameer @karenhsumar @bigdata
About Our Speaker
Karen Hsu @karenhsumar
With over 15 years of experience in enterprise software, Karen Hsu has co-authored 4 patents and worked in a variety of engineering, marketing and sales roles.

Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B and data security solutions to market. Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.
Agenda
• Data Scientist Challenges
• Lean Analytics Process
• Technology
• Demonstration
• Q&A
Data Scientist Challenges in Analysis Process
• Multiple tools
• Unable to reproduce results
• Not business friendly
Lean Analytics Process and Metrics
Data Scientist Workflow
Identify Use Case, Integrate, Prepare, Analyze, Visualize, Deploy
Lean Analytics Process
Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy
Identify Use Case
Funnel Optimization
Behavioral Analytics
Fraud Prevention
EDW Optimization
Customer Segmentation
Integrate
• Codeless Data Integration
• Big Data Management
Prepare
• Uniqueness
• Accuracy
• Consistency
• Completeness
• Duplicates
Prepare
• Data Profiling
• Transformation
• Enrichment
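To make the data profiling step concrete, here is a minimal Python sketch of the kind of checks the Prepare slides name (completeness, uniqueness, duplicates). It is not Datameer's implementation; the function name `profile_column`, the metric definitions, and the sample values are assumptions chosen for illustration only.

```python
from collections import Counter


def profile_column(values):
    """Return simple quality metrics for one column of raw values.

    Hypothetical helper for illustration; the metric definitions below
    are assumptions, not taken from the slides.
    """
    total = len(values)
    # Treat None and blank strings as missing values.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    # Every repeat beyond the first occurrence counts as a duplicate.
    duplicates = sum(c - 1 for c in counts.values() if c > 1)
    return {
        "rows": total,
        "completeness": len(present) / total if total else 0.0,
        "distinct": len(counts),
        "uniqueness": len(counts) / len(present) if present else 0.0,
        "duplicates": duplicates,
    }


# Hypothetical sample column with one repeated and two missing values.
emails = ["a@example.com", "b@example.com", "a@example.com", None, ""]
report = profile_column(emails)
print(report)
```

Running a profile like this after each preparation step is one way to notice when a transformation introduces gaps or repeated records.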
Analyze
• Interactive Spreadsheet
• Collaboration + Governance
• Smart Analytics
Visualize
• Freeform Visualization
• Visualize Anywhere
Predictive Deployment: Lost in Translation
• Scientist’s Desktop: SAS, R, IBM SPSS, Perl, Python
• Production Environment: Java, .NET, C, SQL
• SAS, R, IBM SPSS …: great for model building but not for scoring, even more so when it comes to Hadoop
Deploy
• Reproducing Projects
• Security
• Scheduling
• Monitoring
Predictive Deployment
Model Building
• Angoss
• BigML
• FICO Model Builder
• IBM SPSS
• KNIME
• KXEN
• MicroStrategy
• Open Data
• Pervasive DataRush
• RapidMiner
• R / Rattle
• SAS
• SAP Business Objects
• Salford Systems
• StatSoft STATISTICA
• SQL Server
• TIBCO Spotfire
• Custom Code, etc.
PMML (models)
Model Deployment and Execution
• Universal PMML Plug-in (UPPI)
• Datameer Server
Deploy in minutes ...
Demonstration
Demonstration Flow
Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy
Identify Use Case
• What are the trends linking website behavior to lead activity to revenue?
• How does website behavior affect churn?
Integrate
Prepare
• Profile
• Identify Outliers
• Enrich
• Transform
• Convert
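The slides do not say how outliers were identified in the demonstration. As one common possibility, the sketch below flags values outside the interquartile-range (IQR) fences. The function name `iqr_outliers`, the multiplier `k=1.5`, and the sample page-view counts are all assumptions for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the IQR fences.

    Hypothetical helper: the IQR rule is an assumed technique, not one
    confirmed by the deck.
    """
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily page-view counts with one suspicious spike.
page_views = [12, 15, 14, 13, 16, 15, 14, 240]
outliers = iqr_outliers(page_views)
print(outliers)
```

A rule like this only surfaces candidates; whether a flagged value is an error or a genuine event is still a judgment made while looking at the data.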
Analyze
Visualize
Deploy
Addressing Data Scientist Challenges in Workflow
• Multiple tools → One tool
• Reproduce results → Collaborate + Track
• Not for business → Ease of Use
For more information
Learn more: http://www.datameer.com/solutions/use-cases.html
Contact: @Datameer
Predictive Model Markup Language
• PMML is an XML-based language used to define statistical and data mining models and to share them between compliant applications.
• It is a mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.
• PMML eliminates the need for custom model deployment and ensures reliability.
• PMML defines a standard to represent not only data-mining models but also data handling and data transformations (pre- and post-processing).
Covers: Models, Data Transformations
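To illustrate what "an XML-based language used to define models" looks like in practice, here is a small, hand-written PMML regression model and a Python sketch that reads and applies it. The field names (`page_views`, `churn_score`), the coefficients, and the `score` helper are invented for illustration. The reader below handles only this one model shape and is not a compliant PMML engine, nor the Universal PMML Plug-in.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMML 4.1 document: one linear regression with one input.
PMML_DOC = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Illustrative model, not from the webinar"/>
  <DataDictionary numberOfFields="2">
    <DataField name="page_views" optype="continuous" dataType="double"/>
    <DataField name="churn_score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="page_views"/>
      <MiningField name="churn_score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.9">
      <NumericPredictor name="page_views" coefficient="-0.02"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""

NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def score(pmml_text, inputs):
    """Apply the single RegressionTable in pmml_text to a dict of inputs."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    result = float(table.get("intercept"))
    # Add coefficient * value for each numeric predictor in the table.
    for pred in table.findall("pmml:NumericPredictor", NS):
        result += float(pred.get("coefficient")) * inputs[pred.get("name")]
    return result


prediction = score(PMML_DOC, {"page_views": 10.0})
print(round(prediction, 4))
```

Because the model is plain XML, the tool that builds it and the system that scores it only need to agree on the PMML schema, which is the portability argument the slide makes.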