Model Fitting
Jean-Yves Le Boudec
Virus Infection Data
We would like to capture the growth of infected hosts (explanatory model).
An exponential model seems appropriate.
How can we fit the model? In particular, what is the value of the growth rate α?
Slide 4
Least Square Fit of Virus Infection Data
Least square fit: α = 0.5173
Mean doubling time: 1.34 hours
Prediction at +6 hours: 100 000 hosts
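The doubling time follows from the fitted rate by simple arithmetic. The sketch below assumes the exponential model grows like exp(α t), so that the doubling time is ln(2)/α; the function names are illustrative and not part of the slides.

```python
import math

def doubling_time(alpha):
    """Time needed for exp(alpha * t) to double: ln(2) / alpha."""
    return math.log(2) / alpha

def growth_factor(alpha, hours):
    """Multiplicative growth of exp(alpha * t) over the given number of hours."""
    return math.exp(alpha * hours)

alpha = 0.5173  # fitted rate quoted on the slide (per hour, rounded)
print(f"doubling time: {doubling_time(alpha):.2f} hours")
print(f"growth factor over 6 hours: {growth_factor(alpha, 6):.1f}")
```

The same function applies to any other fitted rate; since the rates on the slides are rounded, the last digit of a recomputed doubling time can differ slightly.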
Slide 5
Least Square Fit of Virus Infection Data In Log Scale
Least square fit: α = 0.39
Mean doubling time: 1.77 hours
Prediction at +6 hours: 39 000 hosts
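A least square fit in log scale can be sketched as an ordinary straight-line fit of log(y) against t. The code below assumes a model of the form y = a exp(α t) and uses synthetic, noise-free data (not the virus data from the slides); the function name is hypothetical.

```python
import math

def fit_log_scale(t, y):
    """Least squares fit of log(y) = log(a) + alpha * t; returns (a, alpha)."""
    z = [math.log(v) for v in y]          # work on the log of the response
    n = len(t)
    t_bar = sum(t) / n
    z_bar = sum(z) / n
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    s_tz = sum((ti - t_bar) * (zi - z_bar) for ti, zi in zip(t, z))
    alpha = s_tz / s_tt                    # slope of the fitted line
    a = math.exp(z_bar - alpha * t_bar)    # intercept, mapped back to natural scale
    return a, alpha

# Synthetic data generated from a = 10, alpha = 0.4 (illustration only).
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [10.0 * math.exp(0.4 * ti) for ti in t]
a_hat, alpha_hat = fit_log_scale(t, y)
print(a_hat, alpha_hat)
```

With noisy data this fit and a fit in natural scale generally return different estimates, because each one minimizes a different sum of squares.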
Slide 6
Compare the Two
LS fit in natural scale
LS fit in log scale
Slide 7
Which Fitting Method Should I Use?
Which optimization criterion should I use?
The answer is in a statistical model.
Model not only the interesting part, but also the noise.
For example, the model behind the fit α = 0.5173
Slide 8
How can I tell which is correct?
α = 0.39
Slide 9
Look at Residuals = Validate the Model
Slide 10
Slide 11
Least Square Fit = Gaussian iid Noise
Assume the model has Gaussian iid noise with constant variance (homoscedasticity).
The theorem says: minimizing least squares = computing the MLE for this model.
This is how we computed the estimates for the virus example.
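The equivalence can be checked with a short derivation. The notation is an assumption, since the slide's own formula is not in the transcript: f_i(β) is the noise-free response for data point i, σ² the noise variance and I the number of data points.

```latex
% Model: response = deterministic part + Gaussian iid noise
Y_i = f_i(\beta) + \epsilon_i, \qquad \epsilon_i \sim \text{iid } N(0, \sigma^2), \quad i = 1, \dots, I

% Log-likelihood of the observations y_1, ..., y_I
\ell(\beta, \sigma) = -\frac{I}{2} \ln(2 \pi \sigma^2)
                      - \frac{1}{2 \sigma^2} \sum_{i=1}^{I} \bigl( y_i - f_i(\beta) \bigr)^2

% For any fixed sigma, maximizing over beta is the same as minimizing the sum of squares
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{I} \bigl( y_i - f_i(\beta) \bigr)^2
```

The first term does not depend on β, so only the sum of squares matters for the estimate of β.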
Slide 12
Least Square and Projection
Write on the board what these are for the virus example: data point, predicted response and estimated parameter.
Figure labels: data point; predicted response; estimated parameter; manifold (where the data point would lie if there were no noise)
Slide 13
Confidence Intervals
Slide 14
Slide 15
Robustness to Outliers
Slide 16
A Simple Example
Least Square
L1 Norm Minimization
Slide 17
Mean Versus Median
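For a model with a single constant parameter, the least square fit is the sample mean and the L1 norm fit is a sample median. The sketch below checks this numerically on a made-up data set with one outlier; the data and helper names are illustrative.

```python
import statistics

def sum_of_squares(data, m):
    """Least square criterion for the constant model y_i = m + noise."""
    return sum((x - m) ** 2 for x in data)

def sum_of_abs(data, m):
    """L1 norm criterion for the same constant model."""
    return sum(abs(x - m) for x in data)

data = [1.0, 2.0, 3.0, 4.0, 100.0]   # made-up sample, last value is an outlier
mean = statistics.mean(data)
median = statistics.median(data)

print("mean:", mean, "median:", median)
print("sum of squares at mean / median:",
      sum_of_squares(data, mean), sum_of_squares(data, median))
print("sum of |.| at mean / median:",
      sum_of_abs(data, mean), sum_of_abs(data, median))
```

The outlier pulls the mean far from the bulk of the data while the median stays put, which is the robustness property discussed on the previous slides.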
Slide 18
2. Linear Regression
Also called ANOVA (Analysis of Variance)
= least square + linear dependence on parameter
A special case where computations are easy
Slide 19
Example 4.3
What is the parameter?
Is it a linear model?
How many degrees of freedom?
What do we assume on ε_i?
What is the matrix X?
Slide 20
Slide 21
Does this model have full rank?
Slide 22
Some Terminology
x_i are called explanatory variables
  Assumed fixed and known
y_i are called response variables
  They are the data
  Assumed to be one sample output of the model
Slide 23
Least Square and Projection
Figure labels: data point; predicted response; estimated parameter; manifold (where the data point would lie if there were no noise)
Slide 24
Solution of the Linear Regression Model
Slide 25
Least Square and Projection
The theorem gives H and K
Figure labels: data; residuals; predicted response; estimated parameter; manifold (where the data point would lie if there were no noise)
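The theorem itself is not reproduced in this transcript. Assuming the usual least squares definitions for a model y = Xβ + ε with X of full rank, the two matrices and the projection picture can be written as:

```latex
% K maps the data to the estimated parameter, H maps the data to the predicted response
K = (X^{T} X)^{-1} X^{T}, \qquad H = X K = X (X^{T} X)^{-1} X^{T}

% Estimated parameter, predicted response and residuals
\hat{\beta} = K y, \qquad \hat{y} = X \hat{\beta} = H y, \qquad e = y - \hat{y} = (\mathrm{Id} - H)\, y

% H is an orthogonal projection onto the manifold \{ X \beta \}
H^{2} = H, \qquad H^{T} = H
```

Under these assumptions the predicted response is the orthogonal projection of the data onto the manifold, and the residuals are orthogonal to it.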
Slide 26
The Theorem Gives the Estimated Parameter with Confidence Interval
Slide 27
SSR
Confidence intervals use the quantity s
s² is called the Sum of Squared Residuals (SSR)
Figure labels: data; residuals; predicted response
Slide 28
Validate the Assumptions with Residuals
Slide 29
Residuals
Residuals are given by the theorem
Figure labels: data; residuals; predicted response
Slide 30
Standardized Residuals
The residuals e_i are an estimate of the noise terms ε_i
They are not (exactly) normal iid
The variance of e_i is? A: σ²(1 - H_{i,i})
Standardized residuals are not exactly normal iid either, but their variance is 1
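As a minimal sketch, the computation can be carried out by hand for a straight-line model y_i = b0 + b1 x_i + ε_i, where the diagonal of H has the closed form H_{i,i} = 1/n + (x_i - mean(x))² / S_xx. The data are made up, and estimating σ² by SSR/(n - 2) is an assumption; the slides may use a different estimator.

```python
import math

def standardized_residuals(x, y):
    """Residuals, leverages H_ii and standardized residuals for a line fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Diagonal of the projection matrix H for the straight-line model.
    leverage = [1.0 / n + (xi - x_bar) ** 2 / s_xx for xi in x]
    # Assumed noise estimate: SSR divided by (n - 2) degrees of freedom.
    sigma_hat = math.sqrt(sum(e * e for e in residuals) / (n - 2))
    standardized = [e / (sigma_hat * math.sqrt(1.0 - h))
                    for e, h in zip(residuals, leverage)]
    return residuals, leverage, standardized

x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]          # made-up explanatory variable
y = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0]          # made-up responses
residuals, leverage, standardized = standardized_residuals(x, y)
print(standardized)
```

Points at the ends of the x range have the largest H_{i,i}, so their raw residuals have the smallest variance; dividing by sqrt(1 - H_{i,i}) puts all residuals on the same scale before plotting them.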
Slide 31
Which of these two models could be a linear regression model?
A: both
Linear regression does not mean that y_i is a linear function of x_i
Caution: there is a hidden assumption
Noise is iid Gaussian -> homoscedasticity
Slide 32
Slide 33
3. Linear Regression with L1 norm minimization
= L1 norm minimization + linear dependency on parameter
More robust
Less traditional
Slide 34
This is convex programming
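One standard way to see the convexity is to rewrite the L1 norm problem as a linear program. The slide's own formulation is not in the transcript; the auxiliary variables u_i below are an assumption introduced for illustration.

```latex
% L1 norm regression: minimize the sum of absolute residuals
\min_{\beta} \; \sum_{i=1}^{I} \bigl| y_i - (X \beta)_i \bigr|

% Equivalent linear program with one auxiliary variable u_i per data point
\min_{\beta, u} \; \sum_{i=1}^{I} u_i
\quad \text{subject to} \quad
-u_i \le y_i - (X \beta)_i \le u_i, \qquad i = 1, \dots, I
```

At an optimum each u_i equals |y_i - (Xβ)_i|, so the two problems have the same solutions in β, and any linear programming solver can be used.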
Slide 35
Slide 36
Confidence Intervals
No closed form
Compare to median!
Bootstrap: How?
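One possible answer, sketched here for the median rather than for a full L1 regression: resample the data with replacement, recompute the estimate each time, and read the interval off the empirical quantiles (percentile bootstrap). The slides do not say which bootstrap variant is meant; the function name, the number of resamples and the data are assumptions.

```python
import random
import statistics

def bootstrap_median_ci(data, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)               # fixed seed for reproducibility
    n = len(data)
    medians = sorted(
        statistics.median(rng.choices(data, k=n))   # resample with replacement
        for _ in range(n_resamples)
    )
    lo_idx = int((1.0 - level) / 2.0 * n_resamples)
    hi_idx = int((1.0 + level) / 2.0 * n_resamples) - 1
    return medians[lo_idx], medians[hi_idx]

# Made-up sample with one outlier.
data = [2.1, 2.4, 2.5, 2.7, 3.0, 3.1, 3.3, 3.6, 3.8, 4.0, 50.0]
low, high = bootstrap_median_ci(data)
print("median:", statistics.median(data), "CI:", (low, high))
```

For L1 regression the same loop would resample the data pairs and refit the model at each iteration, at the cost of one optimization per resample.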
Slide 37
Slide 38
4. Choosing a Distribution
Know a catalog of distributions, guess a fit
  Shape
  Kurtosis, Skewness
  Power laws
  Hazard Rate
Fit
Verify the fit visually or with a test (see later)
Slide 39
Distribution Shape
Distributions have a shape
By definition: the shape is what remains the same when we
  Shift
  Rescale
Example: normal distribution: what is the shape parameter?
Example: exponential distribution: what is the shape parameter?
Slide 40
Standard Distributions
In a given catalog of distributions, we give only the distributions with different shapes.
For each shape, we pick one particular distribution, which we call standard.
Standard normal: N(0,1)
Standard exponential: Exp(1)
Standard uniform: U(0,1)
Slide 41
Log-Normal Distribution
Slide 42
Slide 43
Skewness and Kurtosis
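As a hedged illustration, sample skewness and kurtosis can be computed from central moments. The definitions on the slide are not in the transcript; the sketch below uses the common moment-based estimators and reports excess kurtosis (kurtosis minus 3, so that a normal distribution gives 0), which may differ from the slide's convention.

```python
def skewness_and_excess_kurtosis(data):
    """Moment-based sample skewness and excess kurtosis."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n   # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis

symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]           # made-up symmetric sample
right_tailed = [1.0, 1.0, 2.0, 2.0, 3.0, 20.0]    # made-up sample with a long right tail
print(skewness_and_excess_kurtosis(symmetric))
print(skewness_and_excess_kurtosis(right_tailed))
```

Both quantities are unchanged by shifting and rescaling the data, which is why they describe the shape of a distribution.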
Slide 44
Power Laws and Pareto Distribution
Slide 45
Complementary Distribution Functions
Log-log Scales
Panels: Pareto, Lognormal, Normal
Slide 46
Zipf's Law
Slide 47
Slide 48
Hazard Rate
Interpretation: probability that a flow dies in the next dt seconds given that it is still alive
Used to classify distributions
  Aging
  Memoryless
  Fat tail
Ex: Normal? Exponential? Pareto? Lognormal?
Slide 49
The Weibull Distribution
Standard Weibull CDF: F(x) = 1 - exp(-x^c) for x ≥ 0
Aging for c > 1
Memoryless for c = 1
Fat tailed for c < 1
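The classification can be checked numerically through the hazard rate, defined as density divided by survival probability. The sketch assumes the usual standard Weibull CDF F(x) = 1 - exp(-x^c), for which the hazard rate simplifies to c x^(c-1); the function name is illustrative.

```python
import math

def weibull_hazard(x, c):
    """Hazard rate f(x) / (1 - F(x)) of the standard Weibull with shape c."""
    survival = math.exp(-(x ** c))                 # 1 - F(x)
    density = c * x ** (c - 1) * survival          # derivative of F
    return density / survival                      # equals c * x**(c - 1)

for c in (2.0, 1.0, 0.5):
    rates = [weibull_hazard(x, c) for x in (0.5, 1.0, 2.0)]
    print(f"c = {c}: hazard at x = 0.5, 1, 2 ->", rates)
```

An increasing hazard rate corresponds to aging, a constant one to the memoryless (exponential) case, and a decreasing one to a fat tail.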