Data Mining with Excel 2010 and PowerPivot
Mark Tabladillo Ph.D.
MTabladillo <(at)> solidq.com
September 18, 2010
SQL Saturday 46 -- Raleigh NC
#sqlsat46
© 2010 Mark Tabladillo Ph.D.
MarkTab & Data Mining
Outline
• What is Data Mining
• What is PowerPivot
• Demos
Data Mining as a Service
Outline
• What is Data Mining
• What is PowerPivot
• Demos
Data Mining Definitions
• Data mining
• Machine Learning
• Data mining algorithms: typically use estimation or optimization to achieve results (as opposed to only calculations).
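The contrast between "only calculations" and "estimation or optimization" can be made concrete with a minimal sketch. The function names and the data below are made up for illustration and are not from the presentation; gradient descent stands in for whichever optimizer a real data mining algorithm would use.

```python
# Illustrative sketch: a direct calculation versus an iterative optimization.
# All names and data here are hypothetical, not taken from the slides.

def mean_by_calculation(values):
    """Plain calculation: one pass over the data gives the exact answer."""
    return sum(values) / len(values)


def fit_slope_by_optimization(xs, ys, learning_rate=0.01, steps=2000):
    """Estimate w in y ~ w * x by gradient descent on mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * sum((w*x - y)^2) with respect to w.
        grad = (2.0 / n) * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= learning_rate * grad
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # made-up data, roughly y = 2x

average_y = mean_by_calculation(ys)
slope = fit_slope_by_optimization(xs, ys)
print(average_y, slope)
```

The mean is obtained in a single step, while the slope is approached by repeatedly improving a guess until the error stops shrinking.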
Data Mining Tasks
• Supervised
  • Answer known, what is correlated?
• Unsupervised
  • Answer unknown (unspecified), what are the groups?
• Forecasting
  • Given a trend, what is next?
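The three task types above can be sketched in a few lines each. This is a toy illustration under stated assumptions: the helper names are hypothetical, the data is invented, and the simple techniques used here (Pearson correlation, two-group k-means, a least-squares trend line) are stand-ins, not the algorithms the Excel add-in runs.

```python
from statistics import mean


def pearson(xs, ys):
    """Supervised flavor: how strongly does a feature track a known answer?"""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def two_means(values, iterations=10):
    """Unsupervised flavor: split 1-D values into two groups (k-means, k=2).

    Assumes the input contains at least two distinct values.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        group_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        group_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(group_lo), mean(group_hi)
    return sorted(group_lo), sorted(group_hi)


def forecast_next(series):
    """Forecasting flavor: extend a least-squares trend line by one step."""
    n = len(series)
    ts = list(range(n))
    mt, my = mean(ts), mean(series)
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, series))
             / sum((t - mt) ** 2 for t in ts))
    return my + slope * (n - mt)


print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
print(two_means([1, 2, 3, 10, 11, 12]))
print(forecast_next([10, 12, 14, 16]))
```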
Value Slide
Data Mining Add-In for Excel
• Requires Analysis Services instance
• Version 10.00.2531.00 (April 2009)
• 32-Bit Add-In
• Microsoft .NET Framework 2.0 (32-bit)
• Office 2007 (Professional, Professional Plus, Ultimate, Enterprise)
• SQL Server Enterprise or Standard (or Developer) 2008 or higher
The Analyze Tab
The Analyze Tab
Menu Option                      Data Mining Algorithm
Analyze Key Influencers          Naïve Bayes
Detect Categories                Clustering
Fill from Example                Logistic Regression
Forecast                         Time Series
Highlight Exceptions             Clustering
Scenario Analysis (Goal Seek)    Logistic Regression
Scenario Analysis (What If)      Logistic Regression
Prediction Calculator            Logistic Regression
Shopping Basket Analysis         Association Rules
Data Mining Tab
Data Mining Tab
Many
Data Mining Capacities
SQL Server 2008 R2 Analysis Services Object                      Maximum sizes/numbers
Maximum data mining models per structure                         2^31-1 = 2,147,483,647
Maximum data mining structures per solution                      2^31-1 = 2,147,483,647
Maximum data mining structures per Analysis Services database    2^31-1 = 2,147,483,647
Maximum data mining attributes (variables) per structure         2^31-1 = 2,147,483,647
Reference: http://www.marktab.net/datamining/index.php/2010/08/01/sql-server-data-mining-capacities-2008-r2/
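The value 2^31-1 is the largest number a signed 32-bit integer can hold. Reading the limits above as a 32-bit counter ceiling is an inference, not something the slide states. A short check of the arithmetic, with a hypothetical constant name:

```python
import struct

# Largest value representable as a signed 32-bit integer.
INT32_MAX = 2**31 - 1
print(f"{INT32_MAX:,}")

# The value packs into exactly four bytes as a signed little-endian int.
packed = struct.pack("<i", INT32_MAX)
print(len(packed))

# One more than the maximum no longer fits in a signed 32-bit slot.
try:
    struct.pack("<i", INT32_MAX + 1)
    overflowed = False
except struct.error:
    overflowed = True
print(overflowed)
```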
Data Mining Tab
Outline
• What is Data Mining
• What is PowerPivot
• Demos
PowerPivot for Excel
• Take advantage of familiar Excel tools and features
• Process massive amounts of data in seconds
• Load even the largest data sets from virtually any source
• Use powerful new analytical capabilities, such as Data Analysis Expressions (DAX)
• Make the most of multi-core processors and gigabytes of memory
PowerPivot for Excel Sources
• SQL Server
• SQL Azure
• Oracle, Teradata, Sybase, Informix, IBM DB2
• OLEDB/ODBC
• Analysis Services (SSAS)
• Reporting Services (SSRS)
• Excel, Text File
PowerPivot Reference
• http://www.powerpivot.com (Product Site)
• http://www.powerpivotpro.com (Blog Site)
Outline
• What is Data Mining
• What is PowerPivot
• Demos
Resources
• MarkTab.NET
  Blog, links, video resources and information for data mining
• Blog: http://marktab.net/datamining
• Twitter: @MarkTabNet
Regroup and Conclusion
• Main Points from this Presentation
Contact Information
• Mark Tabladillo
  mtabladillo <{at}> solidq.com
• Also on: Twitter, LinkedIn