One-Factor Experiments
Andy Wang
CIS 5930
Computer Systems Performance Analysis
2
Characteristics of One-Factor Experiments
• Useful if there's only one important categorical factor with more than two interesting alternatives
– Methods reduce to $2^1$ factorial designs if only two choices
• If single variable isn’t categorical, should use regression instead
• Method allows multiple replications
3
Comparing Truly Comparable Options
• Evaluating single workload on multiple machines
• Trying different options for single component
• Applying single suite of programs to different compilers
4
When to Avoid It
• Incomparable "factors"
– E.g., measuring vastly different workloads on single system
• Numerical factors
– Won't predict any untested levels
– Regression usually better choice
• Related entries across level
– Use two-factor design instead
5
An Example One-Factor Experiment
• Choosing authentication server for single-sized messages
• Four different servers are available
• Performance measured by response time
– Lower is better
6
The One-Factor Model
• $y_{ij} = \mu + \alpha_j + e_{ij}$
• $y_{ij}$ is ith response with factor set at level j
– Each level is a server type
• $\mu$ is mean response
• $\alpha_j$ is effect of alternative j, with $\sum_j \alpha_j = 0$
• $e_{ij}$ is error term
7
One-Factor Experiments With Replications
• Initially, assume r replications at each alternative of factor
• Assuming a alternatives, we have a total of ar observations
• Model, summed over all observations, is thus
$$\sum_{i=1}^{r}\sum_{j=1}^{a} y_{ij} = ar\mu + r\sum_{j=1}^{a}\alpha_j + \sum_{i=1}^{r}\sum_{j=1}^{a} e_{ij}$$
8
Sample Data for Our Example
• Four server types, with four replications each (measured in seconds)
A B C D
0.96 0.75 1.01 0.93
1.05 1.22 0.89 1.02
0.82 1.13 0.94 1.06
0.94 0.98 1.38 1.21
9
Computing Effects
• Need to figure out $\mu$ and $\alpha_j$
• We have various $y_{ij}$'s
• Errors should add to zero:
$$\sum_{i=1}^{r}\sum_{j=1}^{a} e_{ij} = 0$$
• Similarly, effects should add to zero:
$$\sum_{j=1}^{a} \alpha_j = 0$$
10
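Calculating $\mu$
• By definition, sum of errors and sum of effects are both zero,
$$\sum_{i=1}^{r}\sum_{j=1}^{a} y_{ij} = ar\mu + 0 + 0$$
• And thus, $\mu$ is equal to grand mean of all responses
$$\mu = \frac{1}{ar}\sum_{i=1}^{r}\sum_{j=1}^{a} y_{ij} = \bar{y}_{..}$$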
11
Calculating $\mu$ for Our Example
$$\mu = \frac{1}{4 \times 4}\sum_{i=1}^{4}\sum_{j=1}^{4} y_{ij} = \frac{1}{16} \times 16.29 = 1.018$$
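As a quick check of the arithmetic, the grand mean can be recomputed from the sample data. This is a minimal Python sketch; the dictionary layout and variable names are illustrative choices, not part of the original slides.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

a = len(data)           # number of alternatives (servers)
r = len(data["A"])      # replications per alternative

total = sum(sum(col) for col in data.values())
mu = total / (a * r)    # grand mean of all ar responses

print(f"sum = {total:.2f}, mu = {mu:.3f}")  # sum = 16.29, mu = 1.018
```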
12
Calculating $\alpha_j$
• The $\alpha_j$ form a vector of effects
– One for each alternative of the factor
• To find vector, find column means
$$\bar{y}_{.j} = \frac{1}{r}\sum_{i=1}^{r} y_{ij}$$
• Separate mean for each j
• Can calculate directly from observations
13
Calculating Column Mean
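• We know that $y_{ij}$ is defined to be
$$y_{ij} = \mu + \alpha_j + e_{ij}$$
• So,
$$\bar{y}_{.j} = \frac{1}{r}\sum_{i=1}^{r}(\mu + \alpha_j + e_{ij}) = \frac{1}{r}\left(r\mu + r\alpha_j + \sum_{i=1}^{r} e_{ij}\right)$$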
14
Calculating Parameters
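• Sum of errors for any given alternative (column) is zero, so
$$\bar{y}_{.j} = \frac{1}{r}(r\mu + r\alpha_j + 0) = \mu + \alpha_j$$
• So we can solve for $\alpha_j$:
$$\alpha_j = \bar{y}_{.j} - \mu = \bar{y}_{.j} - \bar{y}_{..}$$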
15
Parameters for Our Example
Server A B C D
Col. Mean .9425 1.02 1.055 1.055
Subtract $\mu$ from column means to get parameters
Parameters -.076 .002 .037 .037
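The column means and effects can be reproduced with a few lines of Python. This is a sketch under the model's definitions (effect = column mean minus grand mean); the variable names are assumptions made for illustration.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

n = sum(len(col) for col in data.values())
mu = sum(sum(col) for col in data.values()) / n           # grand mean

# Column mean for each server, then effect = column mean - grand mean.
col_means = {s: sum(col) / len(col) for s, col in data.items()}
effects = {s: m - mu for s, m in col_means.items()}

for s in data:
    print(f"{s}: mean = {col_means[s]:.4f}, effect = {effects[s]:+.3f}")
```

With equal replications the effects sum to zero (up to floating-point rounding), which is a handy sanity check.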
16
Estimating Experimental Errors
• Estimated response is $\hat{y}_j = \mu + \alpha_j$
• But we measured actual responses
– Multiple ones per alternative
• So we can estimate amount of error in estimated response
• Use methods similar to those used in other types of experiment designs
17
Sum of Squared Errors
• SSE estimates variance of the errors:
$$SSE = \sum_{i=1}^{r}\sum_{j=1}^{a} e_{ij}^2$$
• We can calculate SSE directly from model and observations
• Also can find indirectly from its relationship to other error terms
18
SSE for Our Example
• Calculated directly:
SSE = (.96 - (1.018 - .076))^2 + (1.05 - (1.018 - .076))^2 + . . .
+ (.75 - (1.018 + .002))^2 + (1.22 - (1.018 + .002))^2 + . . .
+ (.93 - (1.018 + .037))^2 + . . .
= .3425
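The direct calculation is tedious by hand but short in code. The sketch below, with illustrative variable names, uses the fact that the estimated response $\mu + \alpha_j$ is simply the column mean.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

sse = 0.0
for col in data.values():
    predicted = sum(col) / len(col)      # mu + alpha_j for this server
    for y in col:
        sse += (y - predicted) ** 2      # squared residual e_ij^2

print(f"SSE = {sse:.4f}")
```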
19
Allocating Variation
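• To allocate variation for model, start by squaring both sides of model equation
$$y_{ij}^2 = \mu^2 + \alpha_j^2 + e_{ij}^2 + 2\mu\alpha_j + 2\mu e_{ij} + 2\alpha_j e_{ij}$$
$$\sum_{i,j} y_{ij}^2 = \sum_{i,j}\mu^2 + \sum_{i,j}\alpha_j^2 + \sum_{i,j} e_{ij}^2 + \text{cross-products}$$
• Cross-product terms add up to zero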
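The claim that the cross-product terms vanish can be checked numerically on the sample data. This is an illustrative sketch (names are assumptions); each of the three sums should come out as zero up to floating-point rounding.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

n = sum(len(col) for col in data.values())
mu = sum(sum(col) for col in data.values()) / n

cross_mu_alpha = cross_mu_e = cross_alpha_e = 0.0
for col in data.values():
    alpha = sum(col) / len(col) - mu         # effect of this alternative
    for y in col:
        e = y - (mu + alpha)                 # residual for this observation
        cross_mu_alpha += 2 * mu * alpha
        cross_mu_e += 2 * mu * e
        cross_alpha_e += 2 * alpha * e

print(cross_mu_alpha, cross_mu_e, cross_alpha_e)
```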
20
Variation In Sum of Squares Terms
• SSY = SS0 + SSA + SSE
$$SSY = \sum_{i,j} y_{ij}^2$$
$$SS0 = \sum_{i=1}^{r}\sum_{j=1}^{a}\mu^2 = ar\mu^2$$
$$SSA = \sum_{i=1}^{r}\sum_{j=1}^{a}\alpha_j^2 = r\sum_{j=1}^{a}\alpha_j^2$$
• Gives another way to calculate SSE
21
Sum of Squares Terms for Our Example
• SSY = 16.9615
• SS0 = 16.58526
• SSA = .03377
• So SSE must equal 16.9615 - 16.58526 - .03377
– I.e., 0.3425
– Matches our earlier SSE calculation
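The indirect route to SSE can be sketched in Python as follows. The code applies the definitions SSY = sum of squared responses, SS0 = $ar\mu^2$, and SSA = $r\sum_j \alpha_j^2$ to the sample data; the variable names are illustrative assumptions.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

a = len(data)                 # alternatives
r = len(data["A"])            # replications per alternative
mu = sum(sum(col) for col in data.values()) / (a * r)
effects = [sum(col) / r - mu for col in data.values()]

ssy = sum(y ** 2 for col in data.values() for y in col)
ss0 = a * r * mu ** 2
ssa = r * sum(alpha ** 2 for alpha in effects)
sse = ssy - ss0 - ssa         # SSE obtained indirectly

print(f"SSY = {ssy:.4f}, SS0 = {ss0:.5f}, SSA = {ssa:.5f}, SSE = {sse:.4f}")
```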
22
Assigning Variation
• SST is total variation
• SST = SSY - SS0 = SSA + SSE
• Part of total variation comes from model
• Part comes from experimental errors
• A good model explains a lot of variation
23
Assigning Variation in Our Example
• SST = SSY - SS0 = 0.376244
• SSA = .03377
• SSE = .3425
• Percentage of variation explained by server choice
$$100 \times \frac{.03377}{.3762} = 8.97\%$$
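The split of total variation between the factor and the errors can be computed in the same way. This is a small sketch with illustrative names; it works from unrounded sums, so the last digit may differ slightly from hand calculations done with rounded inputs.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

a, r = len(data), len(data["A"])
mu = sum(sum(col) for col in data.values()) / (a * r)

ssa = r * sum((sum(col) / r - mu) ** 2 for col in data.values())
sse = sum((y - sum(col) / r) ** 2 for col in data.values() for y in col)
sst = ssa + sse                       # total variation

pct_servers = 100 * ssa / sst         # explained by server choice
pct_errors = 100 * sse / sst          # left to experimental error

print(f"servers: {pct_servers:.1f}%, errors: {pct_errors:.1f}%")
```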
24
Analysis of Variance
• Percentage of variation explained can be large or small
• Regardless of which, may or may not be statistically significant
• To determine significance, use ANOVA procedure
– Assumes normally distributed errors
25
Running ANOVA
• Easiest to set up tabular method
• Like method used in regression models
– Only slight differences
• Basically, determine ratio of Mean Squared of A (parameters) to Mean Squared Errors
• Then check against F-table value for number of degrees of freedom
26
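ANOVA Table for One-Factor Experiments
Component | Sum of Squares | % of Variation | Degrees of Freedom | Mean Square | F-Computed | F-Table
$y$ | $SSY = \sum y_{ij}^2$ | | $ar$ | | |
$\bar{y}_{..}$ | $SS0 = ar\mu^2$ | | 1 | | |
$y - \bar{y}_{..}$ | $SST = SSY - SS0$ | 100 | $ar - 1$ | | |
A | $SSA = r\sum \alpha_j^2$ | $100 \times SSA/SST$ | $a - 1$ | $MSA = SSA/(a-1)$ | $MSA/MSE$ | $F[1-\alpha;\ a-1,\ a(r-1)]$
e | $SSE = SST - SSA$ | $100 \times SSE/SST$ | $a(r-1)$ | $MSE = SSE/(a(r-1))$ | |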
27
ANOVA Procedure for Our Example
Component | Sum of Squares | % of Variation | Degrees of Freedom | Mean Square | F-Computed | F-Table
$y$ | 16.96 | | 16 | | |
$\bar{y}_{..}$ | 16.58 | | 1 | | |
$y - \bar{y}_{..}$ | .376 | 100 | 15 | | |
A | .034 | 8.97 | 3 | .011 | .394 | 2.61
e | .342 | 91.0 | 12 | .028 | |
28
Interpretation of Sample ANOVA
• Done at 90% level
• F-computed is .394
• Table entry at 90% level with n=3 and m=12 is 2.61
• Thus, servers are not significantly different
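The F-test itself is a short computation once SSA and SSE are known. In the sketch below, the critical value 2.61 is hard-coded from the slide rather than computed, and all variable names are illustrative assumptions.

```python
# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

a, r = len(data), len(data["A"])
mu = sum(sum(col) for col in data.values()) / (a * r)

ssa = r * sum((sum(col) / r - mu) ** 2 for col in data.values())
sse = sum((y - sum(col) / r) ** 2 for col in data.values() for y in col)

msa = ssa / (a - 1)            # mean square of the factor, a-1 degrees of freedom
mse = sse / (a * (r - 1))      # mean square of errors, a(r-1) degrees of freedom
f_computed = msa / mse

F_TABLE = 2.61                 # F[0.90; 3, 12], as quoted on the slide
significant = f_computed > F_TABLE

print(f"MSA = {msa:.3f}, MSE = {mse:.3f}, F = {f_computed:.3f}, significant: {significant}")
```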
29
One-Factor Experiment Assumptions
• Analysis of one-factor experiments makes the usual assumptions:
– Effects of factors are additive
– Errors are additive
– Errors are independent of factor alternatives
– Errors are normally distributed
– Errors have same variance at all alternatives
• How do we tell if these are correct?
30
Visual Diagnostic Tests
• Similar to those done before
– Residuals vs. predicted response
– Normal quantile-quantile plot
– Residuals vs. experiment number
31
Residuals vs. Predicted for Example
[Figure: scatter plot of residuals (vertical axis, -0.3 to 0.4) against predicted response (horizontal axis, 0.9 to 1.1)]
32
Residuals vs. Predicted, Slightly Revised
[Figure: revised scatter plot of residuals (vertical axis, -0.3 to 0.4) against predicted response (horizontal axis, 0.9 to 1.1)]
33
What Does The Plot Tell Us?
• Analysis assumed size of errors was unrelated to factor alternatives
• Plot tells us something entirely different
– Very different spread of residuals for different factors
• Thus, one-factor analysis is not appropriate for this data
– Compare individual alternatives instead
– Use pairwise confidence intervals
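The uneven spread of residuals can also be seen numerically, without a plot. The sketch below computes the range and sample standard deviation of the residuals for each server; using the standard deviation as the spread measure is an illustrative choice, not something prescribed by the slides.

```python
from statistics import stdev

# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

spread = {}
for server, col in data.items():
    mean = sum(col) / len(col)                 # predicted response for this server
    residuals = [y - mean for y in col]
    spread[server] = stdev(residuals)          # sample standard deviation
    res_range = max(residuals) - min(residuals)
    print(f"{server}: range = {res_range:.3f}, stdev = {spread[server]:.3f}")
```

Servers B and C show roughly twice the residual spread of A and D, which is the pattern the residual plot reveals.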
34
Could We Have Figured This Out Sooner?
• Yes!
• Look at original data
• Look at calculated parameters
• Model says C & D are identical
• Even cursory examination of data suggests otherwise
35
Looking Back at the Data
A B C D
0.96 0.75 1.01 0.93
1.05 1.22 0.89 1.02
0.82 1.13 0.94 1.06
0.94 0.98 1.38 1.21
Parameters
-.076 .002 .037 .037
36
Quantile-Quantile Plot for Example
[Figure: normal quantile-quantile plot of residuals (vertical axis, -0.4 to 0.4) against normal quantiles (horizontal axis, -2.5 to 2.5)]
37
What Does This Plot Tell Us?
• Overall, errors are normally distributed
• If we only did quantile-quantile plot, we'd think everything was fine
• The lesson - test ALL assumptions, not just one or two
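The points of a normal quantile-quantile plot can be generated with the Python standard library. This sketch assumes the common plotting position $(i - 0.5)/n$ for the ith smallest residual; other conventions exist, and the variable names are illustrative.

```python
from statistics import NormalDist

# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

# Residual = observation minus its column mean.
residuals = sorted(
    y - sum(col) / len(col) for col in data.values() for y in col
)
n = len(residuals)

# Pair each sorted residual with the matching standard normal quantile.
std_normal = NormalDist()
pairs = [
    (std_normal.inv_cdf((i - 0.5) / n), res)
    for i, res in enumerate(residuals, start=1)
]

for quantile, res in pairs:
    print(f"{quantile:+.3f}  {res:+.4f}")
```

If the errors are close to normal, these pairs fall near a straight line when plotted.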
38
One-Factor Confidence Intervals
• Estimated parameters are random variables
– Thus, can compute confidence intervals
• Basic method is same as for confidence intervals on $2^k r$ design effects
• Find standard deviation of parameters
– Use that to calculate confidence intervals
– Typo in book, pg 336, example 20.6, in formula for calculating $\alpha_j$
– Also typo on pg. 335: degrees of freedom is a(r-1), not r(a-1)
39
Confidence Intervals For Example Parameters
• $s_e$ = .169
• Standard deviation of $\mu$ = .042
• Standard deviation of $\alpha_j$ = .073
• 95% confidence interval for $\mu$ = (.932, 1.10)
• 95% CI for $\alpha_A$ = (-.225, .074)
• 95% CI for $\alpha_B$ = (-.148, .151)
• 95% CI for $\alpha_C$ = (-.113, .186)
• 95% CI for $\alpha_D$ = (-.113, .186)
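The three standard deviations can be reproduced from the sample data. The sketch assumes the usual formulas for this design, $s_e = \sqrt{MSE}$, $s_\mu = s_e/\sqrt{ar}$, and $s_{\alpha_j} = s_e\sqrt{(a-1)/(ar)}$; the variable names are illustrative. Each confidence interval is then the parameter estimate plus or minus the appropriate t-quantile, with a(r-1) degrees of freedom, times the standard deviation.

```python
from math import sqrt

# Response times in seconds, copied from the sample data slide.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13, 0.98],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02, 1.06, 1.21],
}

a, r = len(data), len(data["A"])
sse = sum((y - sum(col) / r) ** 2 for col in data.values() for y in col)

s_e = sqrt(sse / (a * (r - 1)))           # standard deviation of errors
s_mu = s_e / sqrt(a * r)                  # standard deviation of mu
s_alpha = s_e * sqrt((a - 1) / (a * r))   # standard deviation of each alpha_j

print(f"s_e = {s_e:.3f}, s_mu = {s_mu:.3f}, s_alpha = {s_alpha:.3f}")
```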
40
Unequal Sample Sizes in One-Factor Experiments
• Don't really need identical replications for all alternatives
• Only slight extra difficulty
• See book example for full details
41
Changes To Handle Unequal Sample Sizes
• Model is the same
• Effects are weighted by number of replications for that alternative:
$$\sum_{j=1}^{a} r_j \alpha_j = 0$$
• Slightly different formulas for degrees of freedom
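The weighted constraint can be illustrated with a small sketch. The data set below is hypothetical: it is the sample data with a few observations dropped to create unequal replications, and it is not from the slides or the book. The sketch assumes $\mu$ is estimated as the mean of all observations, which is what the weighted constraint implies.

```python
# Hypothetical unequal-replication data (seconds): the sample data with
# some observations removed purely for illustration.
data = {
    "A": [0.96, 1.05, 0.82, 0.94],
    "B": [0.75, 1.22, 1.13],
    "C": [1.01, 0.89, 0.94, 1.38],
    "D": [0.93, 1.02],
}

n = sum(len(col) for col in data.values())            # total observations
mu = sum(sum(col) for col in data.values()) / n       # mean of all observations

# Effect of each alternative: its own mean minus the overall mean.
effects = {s: sum(col) / len(col) - mu for s, col in data.items()}

weighted_sum = sum(len(data[s]) * effects[s] for s in data)   # sum of r_j * alpha_j
unweighted_sum = sum(effects.values())                        # plain sum of alpha_j

print(f"weighted sum = {weighted_sum:.6f}, unweighted sum = {unweighted_sum:.6f}")
```

The weighted sum is zero up to rounding, while the plain sum of effects generally is not once the replication counts differ.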