Disclaimer:
• All images such as logos, photos, etc. used in this presentation are the property of their respective copyright
owners and are used here for educational purposes only
• Some material adapted from: Sorger, “Marketing Analytics: Strategic Models and Metrics”
© Stephan Sorger 2016; www.StephanSorger.com; Data Science: Excel Tools; 1
Introduction to
Data Science and Analytics
Stephan Sorger www.StephanSorger.com
Unit 3. Excel Tools
Outline/ Learning Objectives
Topic              Description
Basic Statistics   Mean, Median, Variance, Standard deviation, RMS
Pivot Tables       Extract significance from large data sets
Solver             Maximize/Minimize criteria subject to constraints
ToolPak            Analysis add-in functionality of Excel
Basic Statistics
Basic Statistics: Overview
Data set summarized by basic statistics:
Mean (average)
Median (half-way point)
RMS (Root Mean Square)
Standard Deviation (degree of variability)
Basic Statistics: Example
Single-season home run records: Barry Bonds, Mark McGwire, and Sammy Sosa
Each player wanted to break the record held by Roger Maris
Home run counts per season: Barry Bonds
1 | 6 9
2 | 4
2 | 5 5
3 | 3 3 4 4
3 | 7 7
4 | 0 2
4 | 6 9
5 |
5 |
6 |
6 |
7 | 3
Stemplot
-Separate each observation into a “stem” (left digits) and “leaf” (right digit)
So, 16 would be: 1 | 6 with “1” as the stem and “6” as the leaf
-Write stems vertically in increasing order from top to bottom
-Draw a vertical line to the right of the stems, from top to bottom
-Write each leaf to the right of its stem, in increasing order
-“Split” the stems for greater clarity by entering each stem twice (two “2” rows, two “3” rows, etc.): leaves 0-4 on the first row, leaves 5-9 on the second
-Interpret the stemplot: Study the distribution; Outlier at 73?
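The stemplot steps above can be sketched in Python. This is only an illustration: the list `home_runs` holds the sixteen values as read off the stemplot, and the function name `split_stemplot` is a hypothetical helper, not something from the slides. Unlike the slide, the sketch also prints the empty first row for stem 1 and the empty last row for stem 7.

```python
from collections import defaultdict

# Barry Bonds home runs per season, as read off the stemplot
home_runs = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

def split_stemplot(values):
    """Return stemplot rows, with each stem split into leaves 0-4 and 5-9."""
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)          # 16 -> stem 1, leaf 6
        leaves[(stem, leaf >= 5)].append(leaf)
    lowest, highest = min(values) // 10, max(values) // 10
    rows = []
    for stem in range(lowest, highest + 1):
        for upper_half in (False, True):    # first row: 0-4, second row: 5-9
            row = " ".join(str(leaf) for leaf in leaves.get((stem, upper_half), []))
            rows.append(f"{stem} | {row}".rstrip())
    return rows

rows = split_stemplot(home_runs)
print("\n".join(rows))
```

The long run of empty rows between the 40s and 73 is what makes the outlier easy to spot.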
Basic Statistics: Mean
Mean = (Sum of all observation values) / (Number of observations)
Xbar = Mean = (16 + 25 + 24 + … + 73) / (16) = 35.4375
What if we did not count the outlier (73 home runs in 2001)?
Mean = (16 + 25 + 24 + … + 49) / 15 = 32.93; one exceptional season raised his average by about 2.5 home runs!
In statistics, we say that the mean is not a resistant measure of center,
because it cannot resist the influence of one extreme observation
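As a quick check of the arithmetic above, here is a minimal Python sketch using the standard `statistics` module. The list `home_runs` holds the sixteen values from the stemplot; the variable names are illustrative only.

```python
from statistics import mean

# Barry Bonds home runs per season, as read off the stemplot
home_runs = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

mean_all = mean(home_runs)                            # 567 / 16
without_outlier = [x for x in home_runs if x != 73]   # drop the 2001 season
mean_without = mean(without_outlier)                  # 494 / 15

print(mean_all)                            # 35.4375
print(round(mean_without, 2))              # 32.93
print(round(mean_all - mean_without, 2))   # 2.5
```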
Basic Statistics: Median
M = Median = Center (middle) of set of observations
To find the median, arrange the observations from smallest to largest (as in the stemplot)
For an odd number of observations, the process is easy: just pick the middle one
But we have 16 observations, which is an even number
So we take the “center pair,” observations #8 and #9 (both of these are 34), and average them: M = 34
What if we remove the extreme observation of 73?
The median is still 34; therefore, we say that the median is a resistant measure of center
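The same comparison can be sketched for the median, again with Python's standard `statistics` module. The data come from the stemplot; the variable names are illustrative.

```python
from statistics import median

# Barry Bonds home runs per season, as read off the stemplot
home_runs = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

ordered = sorted(home_runs)
center_pair = (ordered[7], ordered[8])    # observations #8 and #9 (1-based)
median_all = median(home_runs)            # average of the center pair

without_outlier = [x for x in home_runs if x != 73]
median_without = median(without_outlier)  # 15 values: the middle one is #8

print(center_pair, median_all, median_without)
```

Removing the outlier moves the mean by about 2.5 but leaves the median unchanged, which is what "resistant" means here.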
Basic Statistics: RMS
RMS = Root Mean Square
A kind of average used in statistics and engineering
Used as a component of the calculation of the standard deviation (which takes the root of the averaged squared deviations from the mean)
To compute, square all the numbers in the set, find the mean of the squares, and take the square root
RMS = SQRT( ((a1)^2 + (a2)^2 + (a3)^2 + …) / n )
where a1, a2, a3, … = observations
n = number of observations
Similar in size to the average
(for the baseball data, the average was 35.4375)
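A minimal sketch of the RMS recipe above, applied to the baseball data from the stemplot. The function name `rms` is an illustrative choice; for these sixteen values it works out to roughly 37.8, a little above the mean.

```python
import math

# Barry Bonds home runs per season, as read off the stemplot
home_runs = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

def rms(values):
    """Square each value, take the mean of the squares, then the square root."""
    mean_of_squares = sum(v * v for v in values) / len(values)
    return math.sqrt(mean_of_squares)

rms_value = rms(home_runs)
average = sum(home_runs) / len(home_runs)
print(round(rms_value, 2), average)
```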
Basic Statistics: Standard deviation
s = Standard Deviation
Measures the spread by examining how far the observations are from their mean
To compute, first calculate the variance, then take its square root:
Variance = s^2 = [ (x1 - xbar)^2 + (x2 - xbar)^2 + … + (xn - xbar)^2 ] / (n - 1)
s = SQRT(Variance)
For our previous baseball example, recall that the mean (xbar) = 35.4375:
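The worked numbers for this slide did not survive extraction, so the following is a sketch of the calculation rather than a reproduction of the original. It applies the variance formula above (dividing by n - 1) to the stemplot data and cross-checks against Python's `statistics.variance` and `statistics.stdev`, which use the same sample formula.

```python
from statistics import mean, stdev, variance

# Barry Bonds home runs per season, as read off the stemplot
home_runs = [16, 19, 24, 25, 25, 33, 33, 34, 34, 37, 37, 40, 42, 46, 49, 73]

xbar = mean(home_runs)                             # 35.4375
squared_deviations = [(x - xbar) ** 2 for x in home_runs]
var_manual = sum(squared_deviations) / (len(home_runs) - 1)
s_manual = var_manual ** 0.5

print(round(var_manual, 4))   # sample variance s^2
print(round(s_manual, 2))     # sample standard deviation s
print(abs(var_manual - variance(home_runs)) < 1e-9)
```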
Pivot Tables
Original Data Set
Name            Sales    Date of Sale   Product     Channel
Alex Alpha      $1,100   January        Product A   Store
Betty Beta      $100     February       Product B   Internet
Debbie Delta    $300     February       Product B   Store
Edie Epsilon    $200     January        Product B   Internet
Gary Gamma      $1,300   January        Product A   Store
Extract significant data points from large data set
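At its core, a pivot table groups rows by a field and sums a value within each group. Here is a minimal Python sketch of that idea for the data set above. This is not how Excel implements it; the tuple layout and the helper name `sum_sales_by` are assumptions made for illustration.

```python
from collections import defaultdict

# Rows of the original data set: (name, sales in $, date of sale, product, channel)
sales_data = [
    ("Alex Alpha",   1100, "January",  "Product A", "Store"),
    ("Betty Beta",    100, "February", "Product B", "Internet"),
    ("Debbie Delta",  300, "February", "Product B", "Store"),
    ("Edie Epsilon",  200, "January",  "Product B", "Internet"),
    ("Gary Gamma",   1300, "January",  "Product A", "Store"),
]

def sum_sales_by(rows, key_index):
    """Group rows by the field at key_index and sum the Sales column."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[1]
    return dict(totals)

by_product = sum_sales_by(sales_data, 3)   # column 3 = Product
grand_total = sum(by_product.values())
print(by_product, grand_total)
```

Passing a different column index (for example 4 for Channel) gives the corresponding one-field report.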
Pivot Tables
Launching Pivot Table in Excel
Excel ribbon: Insert tab > Pivot Table
Create Pivot Table dialog:
-Choose the data set to analyze: “Select a table or range” (Table/Range: Sheet1!$A$1:$E$6) or “Use an external data source”
-Choose where you want the Pivot Table report: “New Worksheet” or “Existing Worksheet”
-Click OK
Pivot Tables
Pivot Table Field List
Choose fields to add to report: Customer, Sales, Date, Product, Channel
Drag fields between areas below: Report Filter, Column Labels, Row Labels, Values
Excel’s Pivot Table Field List, based on the original input data set; select “Sales” and “Product” to get a basic table of sales by product
Different versions of Excel look different (PC vs. Mac)
Pivot Tables
Pivot Tables: Basic Report: Sales by Product
Fields selected: Sales, Product (Row Labels: Product; Values: Sum of Sales)
Row Labels     Sum of Sales
Product A      2400
Product B      600
Grand Total    3000
Select “Date” to see how sales vary over time
Pivot Tables
Pivot Tables: Sales by Product and Date
Fields selected: Sales, Date, Product (Row Labels: Product, Date; Values: Sum of Sales)
Row Labels     Sum of Sales
Product A      2400
  January      2400
Product B      600
  January      200
  February     400
Grand Total    3000
Select “Channel” to see how sales vary with type of channel (store or Internet)
Pivot Tables
Pivot Tables: Sales by Product, Date, and Channel (Added Date, and then Channel)
Row Labels          Sum of Sales
Product A           2400
  January           2400
    Retail Store    2400
Product B           600
  January           200
    Internet        200
  February          400
    Internet        100
    Retail Store    300
Grand Total         3000
What if we had added Channel, and then Date?
Pivot Tables
Pivot Tables: Sales by Product, Date, and Channel (Added Channel, and then Date)
Row Labels          Sum of Sales
Product A           2400
  Retail Store      2400
    January         2400
Product B           600
  Internet          300
    January         200
    February        100
  Retail Store      300
    February        300
Grand Total         3000
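The effect of field order can be sketched in Python: the order in which fields are added decides which subtotals the report shows, while the innermost cells stay the same. The `FIELDS` mapping and the helper `nested_totals` are hypothetical names for illustration, and the channel is spelled "Store" as in the original data set (the report screenshots label it "Retail Store").

```python
from collections import defaultdict

FIELDS = {"name": 0, "sales": 1, "date": 2, "product": 3, "channel": 4}

sales_data = [
    ("Alex Alpha",   1100, "January",  "Product A", "Store"),
    ("Betty Beta",    100, "February", "Product B", "Internet"),
    ("Debbie Delta",  300, "February", "Product B", "Store"),
    ("Edie Epsilon",  200, "January",  "Product B", "Internet"),
    ("Gary Gamma",   1300, "January",  "Product A", "Store"),
]

def nested_totals(rows, field_order):
    """Sum sales at every nesting level implied by field_order."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[FIELDS[f]] for f in field_order)
        for depth in range(1, len(key) + 1):   # one subtotal per level
            totals[key[:depth]] += row[FIELDS["sales"]]
    return dict(totals)

date_then_channel = nested_totals(sales_data, ["product", "date", "channel"])
channel_then_date = nested_totals(sales_data, ["product", "channel", "date"])

print(date_then_channel[("Product B", "February")])   # 400: subtotal by month
print(channel_then_date[("Product B", "Internet")])   # 300: subtotal by channel
```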
Pivot Tables
Adding Field to Report Filter
Add “Channel” to the Report Filter area; Product and Date remain in Row Labels
Row Labels     Sum of Sales
Product A      2400
  January      2400
Product B      600
  January      200
  February     400
Grand Total    3000
Pivot Tables
Selecting Reports using Report Filter
Report Filter: Channel (All)
Pull-down choices: (All), Internet, Retail Store; select one and click OK
Row Labels: Product, Date; Values: Sum of Sales
Pivot Tables
Sales by Product and Date, with Channel in Column Labels
Sum of Sales      Column Labels
Row Labels        Internet    Retail Store    Grand Total
Product A                     2400            2400
  January                     2400            2400
Product B         300         300             600
  January         200                         200
  February        100         300             400
Grand Total       300         2700            3000
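Moving Channel to Column Labels turns the report into a cross-tabulation: one row per product, one column per channel, with totals along both edges. A minimal Python sketch follows, limited to the Product level for brevity. The helper name `cross_tab` is hypothetical, and the channel is spelled "Store" as in the original data set.

```python
from collections import defaultdict

sales_data = [
    ("Alex Alpha",   1100, "January",  "Product A", "Store"),
    ("Betty Beta",    100, "February", "Product B", "Internet"),
    ("Debbie Delta",  300, "February", "Product B", "Store"),
    ("Edie Epsilon",  200, "January",  "Product B", "Internet"),
    ("Gary Gamma",   1300, "January",  "Product A", "Store"),
]

def cross_tab(rows):
    """Sum sales by product (rows) and channel (columns), with grand totals."""
    table = defaultdict(lambda: defaultdict(int))
    for _name, sales, _date, product, channel in rows:
        table[product][channel] += sales
        table[product]["Grand Total"] += sales
        table["Grand Total"][channel] += sales
        table["Grand Total"]["Grand Total"] += sales
    return {row: dict(cols) for row, cols in table.items()}

table = cross_tab(sales_data)
for row_label, cols in table.items():
    print(row_label, cols)
```

A missing key, such as Internet sales for Product A, corresponds to a blank cell in the Excel report.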
Microsoft Excel Solver
Excel Solver: Maximization under Constraints
Linear Optimization Model
INPUTS: Objective Function; Constraints
OUTPUTS: Maximized Objective or Minimized Objective
Common Terms for Process:
-Linear Optimization
-Linear Programming
-Maximization/ Minimization
Solver: Maximization under Constraints: Process
Vehicle Contribution: Determine effectiveness of campaigns, based on historical data
Promotion Objective: Declare promotion objective in equation form
Promotion Constraints: Specify promotion constraints in equation form
Optimization Model: Execute model
Process flow: Promotion Data -> Vehicle Contribution -> Promotion Objective -> Promotion Constraints -> Optimization Model -> Budget Allocation
Excel Solver: Constraints
Typical Promotion Constraints
-Financial Constraints: Budget NTE (not to exceed) $XXX/yr
-Legal Constraints: Must follow legal regulations
-Contractual Constraints: Contracts with outside agencies
-Company Policy Constraints: Observe company policies
Excel Solver
Promotion Vehicle      Audience/Ad     Cost/Ad   Maximum Quantity
D: Direct Marketing    30 Viewers/Ad   $30/Ad    30
P: Pay Per Click       30 Viewers/Ad   $40/Ad    20
S: Social Media        40 Viewers/Ad   $60/Ad    10
Direct Marketing: Emails sent directly to individuals within target market
Pay Per Click: Campaigns displaying ads during relevant Internet searches
Social Media: Paid advertisements on social media platforms
Excel Solver
Linear Optimization Element                   Equation
Objective Function                            Z = 30*D + 30*P + 40*S
Constraint #1: Budget                         B = 30*D + 40*P + 60*S <= $2,000
Constraint #2: Maximum campaigns/month: D     D <= 30
Constraint #3: Maximum campaigns/month: P     P <= 20
Constraint #4: Maximum campaigns/month: S     S <= 10
Objective Function
Z = 30 * D + 30 * P + 40 * S
Z is the total audience (viewers) reached; D, P, and S are the numbers of ads (campaigns) per month, and the coefficients are the Audience/Ad values
Excel Solver
Promotion Constraints: Budget
B = 30 * D + 40 * P + 60 * S ≤ $2,000
Excel Solver
Promotion Constraints
D ≤ 30: Cannot exceed 30 direct marketing campaigns per month
P ≤ 20: Cannot exceed 20 pay per click campaigns per month
S ≤ 10: Cannot exceed 10 social media campaigns per month
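The objective and constraints can be written down as plain Python functions, which makes it easy to test whether a candidate plan is allowed before any optimization is run. This is only a sketch: the dictionary and function names are illustrative, and the two example plans are made up.

```python
VIEWERS_PER_AD = {"D": 30, "P": 30, "S": 40}   # objective coefficients
COST_PER_AD = {"D": 30, "P": 40, "S": 60}      # budget coefficients ($)
MAX_ADS = {"D": 30, "P": 20, "S": 10}          # monthly limits
BUDGET = 2000                                  # $ not to exceed

def audience(plan):
    """Objective function Z = 30*D + 30*P + 40*S."""
    return sum(VIEWERS_PER_AD[v] * n for v, n in plan.items())

def spending(plan):
    """Budget equation B = 30*D + 40*P + 60*S."""
    return sum(COST_PER_AD[v] * n for v, n in plan.items())

def is_feasible(plan):
    """True when the plan satisfies the budget and all quantity limits."""
    within_limits = all(0 <= n <= MAX_ADS[v] for v, n in plan.items())
    return within_limits and spending(plan) <= BUDGET

modest = {"D": 10, "P": 10, "S": 10}      # hypothetical plan
everything = {"D": 30, "P": 20, "S": 10}  # every vehicle at its maximum

print(audience(modest), spending(modest), is_feasible(modest))  # 1000 1300 True
print(spending(everything), is_feasible(everything))            # 2300 False
```

Running every vehicle at its maximum costs $2,300, which is over budget; that is why an optimizer is needed to decide where to cut.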
Excel Solver: Execution
Optimization Model: Setup: Specific format
Optimization Model: Execution: Excel Solver function
Optimization Model: Interpretation: Identify limiting factors
Excel Solver: Execution: Setup
      A                         B     C     D     E     F
1                               D     P     S
2     Changing Cells            a     b     c
3
4     Target Cell               30    30    40    d
5
6     Constraint #1: Budget     30    40    60    e     f
7     Constraint #2: D ≤ 30     1                 g     h
8     Constraint #3: P ≤ 20           1           i     j
9     Constraint #4: S ≤ 10                 1     k     l
Row 1, columns B to D: Columns for D, P, and S parameters
a, b, c: Changing cells for D, P, and S
d: Target Cell (contains objective equation)
e, g, i, k: Constraint, Left Side (contains constraint equation)
f, h, j, l: Constraint, Right Side (contains constraint value)
Excel Solver: Execution: Launch
Excel ribbon: Data tab > Solver
Excel Solver: Execution: Add Constraints
Solver Parameters dialog:
-Set Target Cell: $E$4
-Equal To: Max (options: Max, Min)
-By Changing Cells: $B$2:$D$2
-Subject to the Constraints:
 $E$6 <= $F$6
 $E$7 <= $F$7
 $E$8 <= $F$8
 $E$9 <= $F$9
-Click Solve
Add Constraint dialog (click Add): Cell Reference: $E$6; relation: <=; Constraint: =$F$6; click OK
Non-negative integer constraint:
To ensure our answers are non-negative integers:
-Select box “Make variables non-negative”
-Add constraint: Changing Cells > Int (pull-down menu)
Excel Solver: Execution
      A                         B     C     D     E       F
1                               D     P     S
2     Changing Cells            30    20    5
3
4     Target Cell               30    30    40    1700
5
6     Constraint #1: Budget     30    40    60    2000    2000
7     Constraint #2: D ≤ 30     1                 30      30
8     Constraint #3: P ≤ 20           1           20      20
9     Constraint #4: S ≤ 10                 1     5       10
Row 2: Changing cells for D, P, and S (Solver result: 30, 20, 5)
E4: Target Cell (contains objective equation; result: 1700)
Column E, rows 6 to 9: Constraint, Left Side (contains constraint equation)
Column F, rows 6 to 9: Constraint, Right Side (contains constraint value)
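Because the problem is small (31 x 21 x 11 whole-number combinations), the Solver result can be cross-checked by simply trying every plan. The sketch below does that in Python; it is brute-force enumeration, not the algorithm Excel Solver uses, and the names are illustrative.

```python
from itertools import product

VIEWERS = (30, 30, 40)    # audience per ad for D, P, S
COSTS = (30, 40, 60)      # cost per ad ($) for D, P, S
MAX_ADS = (30, 20, 10)    # maximum quantity for D, P, S
BUDGET = 2000

def solve_by_enumeration():
    """Try every non-negative integer plan and keep the largest audience."""
    best_plan, best_audience = None, -1
    for plan in product(*(range(limit + 1) for limit in MAX_ADS)):
        cost = sum(c * n for c, n in zip(COSTS, plan))
        if cost > BUDGET:
            continue                      # violates the budget constraint
        reach = sum(v * n for v, n in zip(VIEWERS, plan))
        if reach > best_audience:
            best_plan, best_audience = plan, reach
    return best_plan, best_audience

best_plan, best_audience = solve_by_enumeration()
print(best_plan, best_audience)   # (30, 20, 5) 1700
```

The result matches the changing cells (30, 20, 5) and the target cell (1700) in the solved worksheet.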
Excel Solver: Interpretation
Solver Results: Summary
Promotion Vehicle      Solver Result   Cost/Ad   Total Cost per Vehicle
D: Direct Marketing    30 (30 max.)    $30/Ad    $900
P: Pay Per Click       20 (20 max.)    $40/Ad    $800
S: Social Media        5 (10 max.)     $60/Ad    $300
Total Spending                                   $2,000
Excel Solver: Interpretation
Solver Results: Constraints
Promotion Vehicle      Solver Result   Max. Allowable   Status
D: Direct Marketing    30              30               Binding
P: Pay Per Click       20              20               Binding
S: Social Media        5               10               Not Binding
Budget                 $2,000          $2,000           Binding
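A constraint is binding when the solution uses up its full limit, and not binding when some slack remains. The status column above can be reproduced with a short Python sketch; the variable and function names are illustrative.

```python
COSTS = {"D": 30, "P": 40, "S": 60}      # cost per ad ($)
MAX_ADS = {"D": 30, "P": 20, "S": 10}    # maximum allowable quantity
BUDGET = 2000
solver_result = {"D": 30, "P": 20, "S": 5}

def status(used, limit):
    """Binding when the limit is fully used, otherwise Not Binding."""
    return "Binding" if used == limit else "Not Binding"

report = {v: status(n, MAX_ADS[v]) for v, n in solver_result.items()}
spent = sum(COSTS[v] * n for v, n in solver_result.items())
report["Budget"] = status(spent, BUDGET)
slack = {v: MAX_ADS[v] - n for v, n in solver_result.items()}

print(report)
print(slack)   # unused quantity per vehicle
```

Only Social Media has slack (5 unused ads); the budget runs out before the Social Media limit is reached, so the budget and the D and P limits are the limiting factors.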
Microsoft Excel
Analysis ToolPak
Analysis ToolPak
Purpose: Provides data tools for financial, statistical, and engineering data analysis
PC Installation:
1. Click on upper left:
-Called “Office Button” in Excel 2007
-Called “File Tab” in Excel 2010 and later
2. Excel Options
Analysis ToolPak
3. Click “Add-Ins”
Analysis ToolPak
4. Click “Add-Ins” again
Analysis ToolPak
5. Select:
-“Analysis ToolPak”
-“Solver Add-in”
Click OK
Analysis ToolPak
6. Access Add-ins through Data tab
Apple Macintosh users:
The ToolPak is not available in some Mac versions of Excel
Instead, download the StatPlus (free) add-in
Go to StephanSorger.com and follow instructions
Analysis ToolPak
Several analysis tools available.
Typically use the following:
-Descriptive statistics
-Exponential smoothing
-Moving average
-Regression
-t-test; z-test