23
Mark Tabladillo Ph.D. MarkTab Consulting June 25, 2011 Atlanta Code Camp – Marietta, GA

Data Mining with Excel 2010 and PowerPivot 201106

Embed Size (px)

DESCRIPTION

Delivered to Atlanta Code Camp, Marietta, GA. Excel 2010, SQL Server Analysis Services, and PowerPivot can enable desktop data mining.

Citation preview

Page 1: Data Mining with Excel 2010 and PowerPivot 201106

Mark Tabladillo Ph.D.MarkTab Consulting

June 25, 2011

Atlanta Code Camp – Marietta, GA

Page 2: Data Mining with Excel 2010 and PowerPivot 201106

AbstractExcel 2010 provides a world‐class platform for enterprise decision making. This presentation introduces the decision‐making ability possible with the free Excel add‐ins for PowerPivot and SQL Server Data Mining. The data mining is a service hosted in Microsoft's Analysis Services, but the Excel add‐in exposes the enterprise‐strength server technology to analysts and executives. This talk introduces some functions from the "Analyze" and "Data Mining" tabs, including data mining model creation from Excel data. Having seen this presentation some had thought it was all Excel since the functions seem so seamless. The Excel interface provides a familiar way for professionals to make actionable decisions.

2

Page 3: Data Mining with Excel 2010 and PowerPivot 201106

About MarkTab20 Years in AtlantaConsulting since 1998; Incorporated 2003

Part‐Time Faculty at University of PhoenixSAS and Microsoft Expert

Presenter since 1998 at conferences like TechEd and SAS Global Forum

http://marktab.com @MarkTabNet

3

Page 4: Data Mining with Excel 2010 and PowerPivot 201106

4

Page 5: Data Mining with Excel 2010 and PowerPivot 201106

5

Page 6: Data Mining with Excel 2010 and PowerPivot 201106

Outline

What is Data Mining

What is PowerPivot Demos

6

Page 7: Data Mining with Excel 2010 and PowerPivot 201106

Outline

What is Data Mining

What is PowerPivot Demos

7

Page 8: Data Mining with Excel 2010 and PowerPivot 201106

Data Mining Definitions

8

Phrase Goal

“Data Mining” Inform actionable decisions

“Machine Learning”

Determine best performingalgorithm

Page 9: Data Mining with Excel 2010 and PowerPivot 201106

Data Mining >> Drill‐DownDrill‐down can only uncover the knownData Mining can map not only unknown outcomes but also unknown inputsBlog Postings on:

DevExpresshttp://www.marktab.net/datamining/index.php/2011/02/04/devexpress‐data‐mining‐controls/Tableau Software http://www.marktab.net/datamining/index.php/2011/04/25/tableau‐software‐data‐mining‐visualizations/

9

Page 10: Data Mining with Excel 2010 and PowerPivot 201106

Data Mining TasksSupervised

Answer known, what is correlated?Unsupervised

Answer unknown (unspecified), what are the groups?Forecasting

Given a trend, what is next?

10

Value Slide

Page 11: Data Mining with Excel 2010 and PowerPivot 201106

Data Mining Tab

11

Many

Page 12: Data Mining with Excel 2010 and PowerPivot 201106

The Analyze Tab

12

Page 13: Data Mining with Excel 2010 and PowerPivot 201106

The Analyze Tab

13

Menu Option Data Mining Algorithm

Analyze Key Influencers Naïve Bayes

Detect Categories Clustering

Fill from Example Logistic Regression

Forecast Time Series

Highlight Exceptions Clustering

Scenario Analysis (Goal Seek) Logistic Regression

Scenario Analysis (What If) Logistic Regression

Prediction Calculator Logistic Regression

Shopping Basket Analysis Association Rules

Page 14: Data Mining with Excel 2010 and PowerPivot 201106

Outline

What is Data Mining

What is PowerPivot Demos

14

Page 15: Data Mining with Excel 2010 and PowerPivot 201106

PowerPivot for ExcelTake advantage of familiar Excel tools and featuresProcess massive amounts of data in secondsLoad even the largest data sets from virtually any sourceUse powerful new analytical capabilities, such as Data Analysis Expressions (DAX)Make the most of multi‐core processors and gigabytes of memory

15

Page 16: Data Mining with Excel 2010 and PowerPivot 201106

PowerPivot Referencehttp://www.powerpivot.com (Product Site)http://www.powerpivotpro.com (Blog Site)

16

Page 17: Data Mining with Excel 2010 and PowerPivot 201106

Outline

What is Data Mining

What is PowerPivot Demos

17

Page 18: Data Mining with Excel 2010 and PowerPivot 201106

Data Mining Capacities

18

SQL Server 2008 R2 Analysis Services Object Maximum sizes/numbers

Maximum data mining models per structure 2^31‐1 = 2,147,483,647

Maximum data mining structures per solution 2^31‐1 = 2,147,483,647

Maximum data mining structures per Analysis Services database 2^31‐1 = 2,147,483,647

Maximum data mining attributes (variables) per structure 2^31‐1 = 2,147,483,647

Reference:http://www.marktab.net/datamining/index.php/2010/08/01/sql-server-data-mining-capacities-2008-r2/

Page 19: Data Mining with Excel 2010 and PowerPivot 201106

Data Mining Add‐In for ExcelRequires Analysis ServicesVersion 10.00.2531.00 (April 2009)32‐Bit Add‐InMicrosoft .NET Framework 2.0 (32‐bit)Office 2007 (Professional, Professional Plus, Ultimate, Enterprise)SQL Server Enterprise or Standard (or Developer) 2008 or higher

19

Page 20: Data Mining with Excel 2010 and PowerPivot 201106

Getting Started

20

Also See http://marktab.net/About/GettingStarted.aspx

Page 21: Data Mining with Excel 2010 and PowerPivot 201106

MarkTab System ResourcesMarkTab.NETBlog, links, video for data miningBlog:   http://marktab.net/dataminingTwitter:   @MarkTabNetBusiness: http://marktab.com

21

Page 22: Data Mining with Excel 2010 and PowerPivot 201106

22

Page 23: Data Mining with Excel 2010 and PowerPivot 201106

ConclusionExcel 2010 provides a world‐class platform for enterprise decision making. We saw the decision‐making ability possible with the free Excel add‐ins for PowerPivot and SQL Server Data Mining. The Excel interface provides a familiar way for professionals to make actionable decisions.

23