Data Mining with Excel 2010 and PowerPivot
Mark Tabladillo Ph.D.
MTabladillo <(at)> solidq.com
September 18, 2010

SQL Saturday 46 -- Raleigh NC
#sqlsat46

© 2010 Mark Tabladillo Ph.D.

MarkTab & Data Mining

Outline
• What is Data Mining
• What is PowerPivot
• Demos

Data Mining as a Service

Data Mining Definitions
• Data mining
• Machine Learning
• Data mining algorithms -- typically use estimation or optimization to achieve results (as opposed to only calculations).
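To make the last bullet concrete, here is a minimal sketch contrasting a plain calculation (a mean) with an estimate reached by iterative optimization (a one-feature logistic regression fitted by gradient descent). The data, the function names, the learning rate and the step count are all hypothetical choices made for illustration; this is not a description of how the Analysis Services algorithms are implemented.

```python
import math


def mean(values):
    # A calculation: one closed-form pass over the data, no searching.
    return sum(values) / len(values)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    # An estimation: start from a guess (0, 0) and repeatedly improve it
    # by stepping against the gradient of the average log-loss.
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(weight * x + bias) - y
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


# Hypothetical data: hours of product use and whether the customer renewed.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
renewed = [0, 0, 0, 1, 1, 1]

average_hours = mean(hours)                  # exact answer in one step
weight, bias = fit_logistic(hours, renewed)  # approximate answer after many steps
p_low = sigmoid(weight * 1.0 + bias)         # estimated renewal probability at 1 hour
p_high = sigmoid(weight * 6.0 + bias)        # estimated renewal probability at 6 hours
```

The mean is the same every time and needs no tuning, while the fitted weight and bias depend on the starting guess, the learning rate and how long the search runs.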

Data Mining Tasks
• Supervised
  • Answer known, what is correlated?
• Unsupervised
  • Answer unknown (unspecified), what are the groups?
• Forecasting
  • Given a trend, what is next?
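The three questions above can be sketched with toy stand-ins, one per task: a correlation against a known answer (supervised), a two-group split with no answer given (unsupervised), and a straight-line extrapolation (forecasting). All function names and data are hypothetical, and these simple techniques were picked for brevity; they are not the algorithms the add-in uses.

```python
from statistics import mean


def pearson(xs, ys):
    # Supervised-style question: how strongly does an input track a known answer?
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def two_groups(values, rounds=10):
    # Unsupervised-style question: 1-D k-means with k=2, seeded at min and max.
    # Assumes the values are not all identical.
    low, high = min(values), max(values)
    for _ in range(rounds):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(near_low), mean(near_high)
    return sorted(near_low), sorted(near_high)


def next_value(series):
    # Forecasting-style question: fit a least-squares line and extend it one step.
    n = len(series)
    ts = list(range(n))
    mt, ms = mean(ts), mean(series)
    slope = (sum((t - mt) * (s - ms) for t, s in zip(ts, series))
             / sum((t - mt) ** 2 for t in ts))
    return ms + slope * (n - mt)


# Hypothetical data for each task.
ages = [22, 25, 47, 52, 46, 56]
bought = [0, 0, 1, 1, 1, 1]                      # the known answer
correlation = pearson(ages, bought)

groups = two_groups([1, 2, 3, 10, 11, 12])       # no answer supplied

forecast = next_value([10, 12, 14, 16])          # trend continues to 18
```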

Value Slide

Data Mining Add-In for Excel
• Requires Analysis Services instance
• Version 10.00.2531.00 (April 2009)
• 32-Bit Add-In
• Microsoft .NET Framework 2.0 (32-bit)
• Office 2007 (Professional, Professional Plus, Ultimate, Enterprise)
• SQL Server Enterprise or Standard (or Developer) 2008 or higher

The Analyze Tab

The Analyze Tab

Menu Option                     Data Mining Algorithm
Analyze Key Influencers         Naïve Bayes
Detect Categories               Clustering
Fill from Example               Logistic Regression
Forecast                        Time Series
Highlight Exceptions            Clustering
Scenario Analysis (Goal Seek)   Logistic Regression
Scenario Analysis (What If)     Logistic Regression
Prediction Calculator           Logistic Regression
Shopping Basket Analysis        Association Rules
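As a quick way to read the menu-to-algorithm table, the sketch below stores the same pairs in a dictionary and groups the menu options by algorithm. The variable and function names are hypothetical; only the pairs themselves come from the slide.

```python
from collections import defaultdict

# Menu option -> algorithm, copied from the slide's table.
ANALYZE_TAB_ALGORITHMS = {
    "Analyze Key Influencers": "Naïve Bayes",
    "Detect Categories": "Clustering",
    "Fill from Example": "Logistic Regression",
    "Forecast": "Time Series",
    "Highlight Exceptions": "Clustering",
    "Scenario Analysis (Goal Seek)": "Logistic Regression",
    "Scenario Analysis (What If)": "Logistic Regression",
    "Prediction Calculator": "Logistic Regression",
    "Shopping Basket Analysis": "Association Rules",
}


def options_by_algorithm(mapping):
    # Invert the table: for each algorithm, list the menu options that use it.
    grouped = defaultdict(list)
    for option, algorithm in mapping.items():
        grouped[algorithm].append(option)
    return dict(grouped)


grouped = options_by_algorithm(ANALYZE_TAB_ALGORITHMS)
```

Grouped this way, the nine menu options rest on five algorithms, with Logistic Regression behind four of them.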

Data Mining Tab

Data Mining Tab

Many

Data Mining Capacities

SQL Server 2008 R2 Analysis Services Object                             Maximum sizes/numbers
Maximum data mining models per structure                                2^31-1 = 2,147,483,647
Maximum data mining structures per solution                             2^31-1 = 2,147,483,647
Maximum data mining structures per Analysis Services database           2^31-1 = 2,147,483,647
Maximum data mining attributes (variables) per structure                2^31-1 = 2,147,483,647

Reference: http://www.marktab.net/datamining/index.php/2010/08/01/sql-server-data-mining-capacities-2008-r2/
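The figure 2^31-1 can be checked directly, and it happens to be the largest value a signed 32-bit integer can hold. The sketch below verifies both points; the names are hypothetical, and the 32-bit reading is an observation about the number itself, not a statement about how Analysis Services stores these counts.

```python
import struct

# The limit quoted on the slide.
MAX_OBJECTS = 2**31 - 1


def fits_signed_32bit(value):
    # struct's "<i" format is a 4-byte signed integer; packing fails out of range.
    try:
        struct.pack("<i", value)
        return True
    except struct.error:
        return False


at_limit = fits_signed_32bit(MAX_OBJECTS)        # True
past_limit = fits_signed_32bit(MAX_OBJECTS + 1)  # False
```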

Data Mining Tab

PowerPivot for Excel
• Take advantage of familiar Excel tools and features
• Process massive amounts of data in seconds
• Load even the largest data sets from virtually any source
• Use powerful new analytical capabilities, such as Data Analysis Expressions (DAX)
• Make the most of multi-core processors and gigabytes of memory

PowerPivot for Excel Sources
• SQL Server
• SQL Azure
• Oracle, Teradata, Sybase, Informix, IBM DB2
• OLEDB/ODBC
• Analysis Services (SSAS)
• Reporting Services (SSRS)
• Excel, Text File

PowerPivot Reference
• http://www.powerpivot.com (Product Site)
• http://www.powerpivotpro.com (Blog Site)

Resources
• MarkTab.NET
  Blog, links, video resources and information for data mining
• Blog: http://marktab.net/datamining
• Twitter: @MarkTabNet

Regroup and Conclusion
• Main Points from this Presentation

Contact Information
• Mark Tabladillo
  mtabladillo <{at}> solidq.com
• Also on: Twitter, LinkedIn
