Instant Visualizations in Every Step of Analysis


DESCRIPTION

Surveys reveal that concerns about data quality can create barriers for companies deploying analytics and BI initiatives. How can you readily identify and correct data quality issues at every step of your big data analysis to ensure accurate insights into customer behavior? In this webcast, we'll discuss how IT and business users can leverage self-service visualizations to quickly spot and correct data anomalies throughout the analytic process.

In this webinar, you will learn how to:
- Continuously visualize a profile of your data to identify inconsistencies, incompleteness and duplicates
- Visualize machine learning and data mining, including clustering, decision tree analysis, column correlations and recommendations
- Create self-service visualizations for business and IT users


Easier, Faster, Smarter

© 2013 Datameer, Inc. All rights reserved.

View Recording

You can view the recording of this webinar at: http://info.datameer.com/Online-Slideshare-Instant-Visualizations-in-Every-Step-of-Analysis-OnDemand.html


Instant Visualization in Every Step of Analysis

#datameer @karenhsumar @bigdata


About Our Speaker

Karen Hsu @karenhsumar

With over 15 years of experience in enterprise software, Karen Hsu has co-authored 4 patents and worked in a variety of engineering, marketing and sales roles. Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B and data security solutions to market. Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.

Agenda
•  Data Scientist Challenges
•  Lean Analytics Process
•  Technology
•  Demonstration
•  Q&A

Data Scientist Challenges in Analysis Process
•  Multiple tools
•  Unable to reproduce results
•  Not business friendly

Lean Analytics Process and Metrics

Data Scientist Workflow
•  Identify Use Case
•  Integrate
•  Prepare
•  Analyze
•  Visualize
•  Deploy

Lean Analytics Process

Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy

Identify Use Case
•  Funnel Optimization
•  Behavioral Analytics
•  Fraud Prevention
•  EDW Optimization
•  Customer Segmentation

Integrate
•  Codeless Data Integration
•  Big Data Management

Prepare
•  Uniqueness
•  Accuracy
•  Consistency
•  Completeness
•  Duplicates
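The data quality dimensions listed on this slide can be illustrated with a small profiling sketch. This is not Datameer's implementation; it is a minimal standard-library Python example, and the function name `profile_rows`, the sample records, and the field names are all hypothetical. It measures completeness per field, counts distinct key values (uniqueness), and lists key values that occur more than once (duplicates).

```python
from collections import Counter


def profile_rows(rows, key_field):
    """Summarise completeness, uniqueness and duplicates for a list of dict records.

    Hypothetical helper for illustration only.
    """
    total = len(rows)
    fields = sorted({name for row in rows for name in row})

    # Completeness: share of rows where the field is present and non-empty.
    completeness = {}
    for name in fields:
        filled = sum(1 for row in rows if row.get(name) not in (None, ""))
        completeness[name] = filled / total if total else 0.0

    # Uniqueness and duplicates: count how often each key value occurs.
    key_counts = Counter(row.get(key_field) for row in rows)
    duplicate_keys = [key for key, count in key_counts.items() if count > 1]

    return {
        "rows": total,
        "completeness": completeness,
        "distinct_keys": len(key_counts),
        "duplicate_keys": duplicate_keys,
    }


# Made-up sample records with a blank email, a missing country and a repeated id.
sample = [
    {"id": 1, "email": "a@example.com", "country": "US"},
    {"id": 2, "email": "", "country": "US"},
    {"id": 2, "email": "b@example.com", "country": "usa"},
    {"id": 3, "email": "c@example.com", "country": None},
]

report = profile_rows(sample, "id")
print(report)
```

Consistency and accuracy usually need domain rules (for example, that "US" and "usa" denote the same country), so they are left out of this sketch.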


Prepare
•  Data Profiling
•  Transformation
•  Enrichment

Analyze
•  Interactive Spreadsheet
•  Collaboration + Governance
•  Smart Analytics

Visualize
•  Freeform Visualization
•  Visualize Anywhere

Predictive Deployment

Scientist’s Desktop: SAS, R, IBM SPSS, Perl, Python
Production Environment: Java, .NET, C, SQL
Lost in Translation

SAS, R, IBM SPSS …
Great for model building but not for scoring, even more so when it comes to Hadoop

Deploy
•  Reproducing Projects
•  Security
•  Scheduling
•  Monitoring

Predictive Deployment

Model Building
•  Angoss
•  BigML
•  FICO Model Builder
•  IBM SPSS
•  KNIME
•  KXEN
•  MicroStrategy
•  Open Data
•  Pervasive DataRush
•  RapidMiner
•  R / Rattle
•  SAS
•  SAP Business Objects
•  Salford Systems
•  StatSoft STATISTICA
•  SQL Server
•  TIBCO Spotfire
•  Custom Code, etc.

PMML (models)

Model Deployment and Execution
•  Universal PMML Plug-in (UPPI)
•  Datameer Server

Deploy in minutes ...

Demonstration

Demonstration Flow

Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy

Identify Use Case
•  What are the trends linking website behavior to lead activity to revenue?
•  How does website behavior affect churn?

Integrate

Prepare
•  Profile
•  Identify Outliers
•  Enrich
•  Transform
•  Convert
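The slide does not say how outliers are identified during the demonstration. As one common approach, and purely as an assumption, the sketch below flags values that fall outside 1.5 times the interquartile range (the Tukey rule). The function name `iqr_outliers` and the page-view numbers are made up for illustration.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Hypothetical helper; the threshold k=1.5 is the conventional Tukey rule.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [value for value in values if value < low or value > high]


# Made-up daily page views for one visitor; the last value looks suspicious.
page_views = [12, 15, 14, 13, 16, 15, 14, 240]
outliers = iqr_outliers(page_views)
print(outliers)
```

A flagged value is only a candidate for review: it may be a data entry error or a real but unusual customer.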


Analyze

Visualize


Deploy

Addressing Data Scientist Challenges in Workflow
•  Multiple tools → One tool
•  Reproduce results → Collaborate + Track
•  Not for business → Ease of Use

For more information

Learn more: http://www.datameer.com/solutions/use-cases.html
Contact: @karenhsumar, khsu@datameer.com
@Datameer

Predictive Model Markup Language

•  PMML is an XML-based language used to define statistical and data mining models and to share these between compliant applications.
•  It is a mature standard developed by the DMG (Data Mining Group) to avoid proprietary issues and incompatibilities and to deploy models.
•  PMML eliminates the need for custom model deployment and ensures reliability.

PMML defines a standard not only to represent data-mining models, but also data handling and data transformations (pre- and post-processing).

•  Models
•  Data Transformations
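To make the idea of sharing a model as XML concrete, here is a minimal sketch that reads a tiny PMML document and scores one record. It does not show how the Universal PMML Plug-in or Datameer Server work. The PMML fragment is hand-written for illustration and has not been validated against the DMG schema; the field names, coefficients, and the `score` function are hypothetical. It uses only Python's standard `xml.etree.ElementTree` module and handles just a single-table linear regression.

```python
import xml.etree.ElementTree as ET

# Hand-written, illustrative PMML 4.1 fragment (hypothetical model and fields).
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Hypothetical churn score"/>
  <DataDictionary numberOfFields="2">
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="visits"/>
      <MiningField name="score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="0.5">
      <NumericPredictor name="visits" coefficient="0.25"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned namespace, so lookups need a prefix map.
NS = {"p": "http://www.dmg.org/PMML-4_1"}


def score(pmml_text, record):
    """Evaluate intercept + sum(coefficient * value) from the regression table."""
    root = ET.fromstring(pmml_text)
    table = root.find("p:RegressionModel/p:RegressionTable", NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("p:NumericPredictor", NS):
        result += float(predictor.get("coefficient")) * record[predictor.get("name")]
    return result


root = ET.fromstring(PMML_DOC)
field_names = [f.get("name") for f in root.findall("p:DataDictionary/p:DataField", NS)]
prediction = score(PMML_DOC, {"visits": 4})
print(field_names, prediction)
```

Because the model is plain XML, the tool that builds it and the tool that scores it only need to agree on the PMML vocabulary, not on a programming language.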