Introduction to Linear Model
Statistical Methods in Finance
Lecture 4
Ta-Wei Huang
December 7, 2016
Ta-Wei Huang Introduction to Linear Model December 7, 2016 1 / 29
Table of Contents
Regression analysis is almost certainly the most important tool at the
econometrician’s disposal. But what is regression? Let’s see what we’ll
cover in today’s lecture.
1 Basic Idea in Regression
2 Matrix Representation
3 The Least Square Estimator
4 Next Lecture
Basic Idea in Regression
What is Linear Model 1
In very general terms, regression (linear model) is concerned with
describing and evaluating the linear relationship between a given
variable and one or more other variables.
More specifically, a linear model is an attempt to explain movements in
a variable by reference to movements in one or more other variables
(a causal relationship).
Basic Idea in Regression
What is Linear Model 2
Causal Relationship: Y = f(X1, · · · , Xk) + ε
Y : Output Variable (Response, Effect)
Xi: Input Variables (Causes)
f : a function representing the causal relationship
ε: a random error term
The causal relationship f is deterministic but unknown. Can we
approximate it by
f(X1, · · · , Xk) ≈ ∑_{i} βi gi(X1, · · · , Xk),
where each gi is known (and chosen by ourselves)?
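As a rough illustration of this idea, here is a minimal Python sketch; the target f = sin, the polynomial basis g_i, and the data range are all assumptions made for illustration, not part of the lecture.

```python
import numpy as np

# Sketch: approximate an unknown f(x) = sin(x) locally by a linear
# combination of chosen basis functions g_i (here: 1, x, x^2, x^3).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = np.sin(x) + rng.normal(0.0, 0.05, 200)   # Y = f(X) + eps

# Design matrix whose columns are g_i(x)
G = np.column_stack([np.ones_like(x), x, x**2, x**3])
beta, *_ = np.linalg.lstsq(G, y, rcond=None)

# The fitted linear combination tracks f on the data range
approx = G @ beta
print(np.max(np.abs(approx - np.sin(x))))    # small approximation error
```

The point is only that a handful of known g_i with unknown linear weights can capture a smooth nonlinear f locally.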
Basic Idea in Regression
Definition of Linear Model
Definition (Linear Model)
The linear model of an output variable Y with input variables X1, · · · , Xk
has the general form
Y = ∑_{i=1}^{p} βi gi(X1, · · · , Xk) + ε,
where X1, · · · , Xk are accurately measured deterministic variables, gi is a
known function of X1, · · · , Xk for i = 1, . . . , p, βi is an unknown
parameter that enters linearly for i = 1, . . . , p, and ε is a random error
term.
Basic Idea in Regression
Explanation of Linear Model
Y = ∑_{i=1}^{p} βi gi(X1, · · · , Xk) + ε
The definition implies that we know the form gi of effects on Y , but
we don’t know the magnitude βi of effects on Y . ⇒ signal
In a linear model, we assume that variation due to random error ε
only occurs on the output Y . ⇒ error
Basic Idea in Regression
Rationale of Linear Model 1
A general model is given by Y = f(X1, · · · , Xk) + ε, where f is unknown
and arbitrary.
By Taylor’s theorem, f(X1, · · · , Xk) = ∑_{i=0}^{∞} (1/i!) D^i f(X0) · (X − X0)^i, which
implies that there are infinitely many parameters to be estimated.
Usually we don’t have enough data to estimate f directly, and so we
have to assume that it has some more restricted form.
A local approximation of f (over the range of the data X1, · · · , Xk)
may be achievable by a linear model.
Because the predictors can be transformed and combined in any way,
linear models are actually very flexible.
Basic Idea in Regression
Rationale of Linear Model 2
Let f̂(X) = ∑_{i=1}^{p} β̂i gi(X). Then the predicted value is Ŷ = f̂(X).
The mean squared error (MSE) is
E(Y − Ŷ)² = E(f(X) + ε − f̂(X))² = (f(X) − f̂(X))² + Var(ε),
where (f(X) − f̂(X))² is the reducible error and Var(ε) is the
irreducible error.
An intuitive approach is to locally minimize the reducible error
(f(X) − f̂(X))², but for several reasons we do not always do that.
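The decomposition of the MSE into a reducible and an irreducible part can be checked by simulation. In this minimal sketch, the true f, the deliberately mis-specified fitted f̂, and σ are all illustrative assumptions.

```python
import numpy as np

# Sketch of E(Y - Yhat)^2 = (f - fhat)^2 + Var(eps) at a fixed point x0.
rng = np.random.default_rng(1)
sigma = 0.5
x0 = 1.0
f = lambda x: 2.0 + 3.0 * x      # true (unknown in practice) signal
fhat = lambda x: 2.3 + 2.9 * x   # some fitted approximation

# Simulate many responses Y = f(x0) + eps at the same x0
eps = rng.normal(0.0, sigma, 100_000)
y = f(x0) + eps

mse = np.mean((y - fhat(x0)) ** 2)    # E(Y - Yhat)^2, estimated
reducible = (f(x0) - fhat(x0)) ** 2   # (f - fhat)^2
irreducible = sigma ** 2              # Var(eps)
print(mse, reducible + irreducible)   # approximately equal
```

No matter how well f̂ is chosen, the simulated MSE never drops below σ², which is exactly the irreducible part.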
Basic Idea in Regression
When to Use Linear Model 1
Examine the causal relationship between an output variable Y
(effect) and some input variables X = (X1, · · · , Xk) (causes).
Question
Let Yi,t = Ri,t − Rf,t and Xi,t = Rm,t − Rf,t. Suppose we find that the
linear model Yi,t = αi,t + βi,tXi,t + εi,t fits well. Can we conclude that
the reason for a higher return on one stock is the higher market premium?
From the above question, we know that sometimes the causal
relationship is not quite clear, especially in financial data. Can we still
use linear models to model behaviours in financial markets?
Basic Idea in Regression
When to Use Linear Model 2
Even where no sensible causal relationship exists between X and Y ,
we may wish to relate them by some sort of mathematical equation
(rationale: a sample from a multivariate normal population), since there
is a strong association between X and Y .
It can be shown that if (X, Y ) follows a joint normal distribution,
then Y |X has the form of a linear model. (See: Sampling Model)
Basic Idea in Regression
When to Use Linear Model 3
Question
To investigate the behaviour of returns, which models should be used?
1 Ri,t −Rf,t = αi,t + βi,t(Rm,t −Rf,t) + εi,t
2 Rm,t −Rf,t = αi,t + βi,t(Ri,t −Rf,t) + εi,t
Domain knowledge is important for determining the input and output
variables, and that is why theoretical models remain important even if
they are not realistic. In machine learning and prediction, however, an
association relationship is enough!
Basic Idea in Regression
Data Type in Linear Model
A response/output/dependent variable Y is modeled or explained by
predictor/input/independent/regressor variables that are functions of
X = (X1, · · · , Xk).
Y : an "approximately" continuous random variable
X: continuous/discrete/categorical deterministic variables
Note that if X is a random vector, then conditioning on X (working
with Y |X) allows us to treat X as a deterministic vector.
Basic Idea in Regression
Types of Linear Model
X = (X1, · · · , Xk): quantitative ⇒ multiple regression
X = (X1, · · · , Xk): qualitative + quantitative ⇒ analysis of
covariance
X = (X1, · · · , Xk): qualitative ⇒ analysis of variance (ANOVA)
multiple Y ’s ⇒ multivariate regression
qualitative Y ⇒ generalized linear model (logistic regression)
Basic Idea in Regression
Statistical Procedure of Linear Model
[Figure: flowchart of the statistical procedure of the linear model]
Matrix Representation
The Data Structure
The data structure of n records with one output variable Y and k input
variables X1, ..., Xk is
Y    X1   X2   · · ·  Xk
y1   x11  x12  · · ·  x1k
y2   x21  x22  · · ·  x2k
...  ...  ...  ...    ...
yn   xn1  xn2  · · ·  xnk
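The table above maps directly onto arrays. A minimal Python sketch with made-up numbers (k = 2 inputs, n = 4 records) builds the response vector Y and the design matrix X with a leading column of ones:

```python
import numpy as np

# Each row is one record (y_i, x_i1, x_i2); the values are illustrative.
data = np.array([
    # y,   x1,  x2
    [10.0, 1.0, 2.0],
    [12.0, 1.5, 1.0],
    [15.0, 2.0, 3.0],
    [11.0, 0.5, 2.5],
])
Y = data[:, 0]                                          # n-vector of responses
X = np.column_stack([np.ones(len(data)), data[:, 1:]])  # n x p, p = k + 1
print(Y.shape, X.shape)                                 # (4,) (4, 3)
```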
Matrix Representation
Matrix Representation
The functional form of a linear model is
Yi = β0 + β1g1(xi1, · · · , xik) + · · ·+ βp−1gp−1(xi1, · · · , xik) + εi,
for i = 1, 2, . . . , n.
The matrix representation of that model is
Y = Xβ + ε,
where Y = (y1, . . . , yn)′ is n×1, X is the n×p matrix whose i-th row is
(1, gi1, · · · , gi,p−1), β = (β0, . . . , βp−1)′ is p×1, and
ε = (ε1, . . . , εn)′ is n×1.
Matrix Representation
Common Linear Models
Definition
A linear model Y = Xβ + ε is said to be
a least squares model if there is no assumption on ε. The parameter
space is Θ = {β : β ∈ Rp}.
a Gauss–Markov model if E(ε) = 0 and cov(ε) = σ2I. The
parameter space is Θ = {(β, σ2) : β ∈ Rp, σ2 ∈ R+}.
an Aitken model if E(ε) = 0 and cov(ε) = σ2V, where V is known.
The parameter space is Θ = {(β, σ2) : β ∈ Rp, σ2 ∈ R+}.
a general linear mixed model if E(ε) = 0 and cov(ε) = Σ ≡ Σ(θ).
The parameter space is Θ = {(β, θ) : β ∈ Rp, θ ∈ Ω}, where Ω is
the set of all values of θ such that Σ(θ) is positive definite.
Matrix Representation
Gauss Markov Model
Definition
A linear model Y = Xβ + ε is said to be a Gauss–Markov model if
E(ε) = 0 and cov(ε) = σ2I. The parameter space of this model is
Θ = {(β, σ2) : β ∈ Rp, σ2 ∈ R+}.
Common Gauss–Markov Models
One-sample Problem
Simple Linear Regression
Multiple Linear Regression
ANOVA and ANCOVA
Matrix Representation
Example 1 (One-sample Problem)
Assume that Y1, · · · , Yn is an iid sample with mean µ and variance
σ2 > 0. If ε1, · · · , εn are iid with mean E(εi) = 0 and common variance
σ2, then the functional form of the GM model is Yi = µ + εi. The matrix
form of this model is Y = Xβ + ε, where Y = (Y1, . . . , Yn)′ is n×1,
X = (1, . . . , 1)′ is n×1, β = µ is 1×1, and ε = (ε1, . . . , εn)′ is n×1.
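A quick numerical check of Example 1 with illustrative data: when X is a column of ones, the least squares estimate of µ is just the sample mean.

```python
import numpy as np

# One-sample GM model Y_i = mu + eps_i; the y values are made up.
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])
X = np.ones((len(y), 1))                       # design matrix: a column of ones

# Least squares fit of Y = X mu + eps
mu_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(mu_hat[0], y.mean())                     # both equal the sample mean
```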
Matrix Representation
Example 2 (Simple Linear Regression)
Consider the model where a response variable Y is linearly related to an
independent variable x via Yi = β0 + β1xi + εi for i = 1, 2, . . . , n, where
εi are uncorrelated random variables with mean 0 and common variance
σ2 > 0. The matrix form of this model is Y = Xβ + ε, where
Y = (Y1, . . . , Yn)′ is n×1, X is the n×2 matrix whose i-th row is
(1, xi), β = (β0, β1)′ is 2×1, and ε = (ε1, . . . , εn)′ is n×1.
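A minimal Python sketch of Example 2 with made-up data, comparing the textbook closed form for (β0, β1) against the matrix form of the normal equations:

```python
import numpy as np

# Illustrative data for Y_i = beta0 + beta1 * x_i + eps_i
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed form: beta1 = Sxy / Sxx, beta0 = ybar - beta1 * xbar
beta1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0 = y.mean() - beta1 * x.mean()

# Same answer from the matrix form Y = X beta + eps
X = np.column_stack([np.ones_like(x), x])
beta_mat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta0, beta1, beta_mat)   # the two routes agree
```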
Matrix Representation
Example 3 (Multiple Linear Regression)
Consider the model where a response variable Y is linearly related to
several independent variables, say x1, · · · , xk via
Yi = β0 + β1xi1 + · · · + βkxik + εi for i = 1, 2, . . . , n, where εi are
uncorrelated with mean 0 and common variance σ2 > 0. The matrix form
of this model is Y = Xβ + ε, where Y = (Y1, . . . , Yn)′ is n×1, X is the
n×p matrix (p = k + 1) whose i-th row is (1, xi1, · · · , xik),
β = (β0, . . . , βk)′ is p×1, and ε = (ε1, . . . , εn)′ is n×1.
The Least Square Estimator
Introduction
Consider the GM linear model Y = Xβ + ε, where Y is an n×1 vector of
observed responses, X is an n×p matrix of functions of the input variables,
β is a p×1 vector of unknown parameters to be estimated, and ε is an n×1
vector of random errors.
If Y is a random vector but the input variables are fixed
constants, then E(Y) = Xβ and cov(Y) = σ2I.
If Y and the input variables X1, · · · , Xk are all random, then
E(Y|X1, · · · , Xk) = Xβ and cov(Y|X1, · · · , Xk) = σ2I.
The Least Square Estimator
A Geometric Viewpoint: Simple Case 1
Now, consider a simple regression model Yi = β0 + β1xi + εi, i = 1, 2, 3.
In matrix form,
(Y1, Y2, Y3)′ = β0 (1, 1, 1)′ + β1 (x1, x2, x3)′ + (ε1, ε2, ε3)′.
Then the random vector Y ∈ R3 has
two dimensions coming from β0 (1, 1, 1)′ + β1 (x1, x2, x3)′, and
one dimension coming from (ε1, ε2, ε3)′.
Let β̂ = (β̂0, β̂1)′ be an estimator for β = (β0, β1)′.
The Least Square Estimator
A Geometric Viewpoint: Simple Case 2
Let Ω = {β0 (1, 1, 1)′ + β1 (x1, x2, x3)′ : β0, β1 ∈ R}. Then dim(Ω) = 2.
Finding β̂ is equivalent to finding a vector Xβ̂ in the two-dimensional Ω.
Our target is to find β̂ such that Y = Xβ + ε = Xβ̂ + ε̂.
Question
Which Ŷ = Xβ̂ captures the information in Y best?
Answer
Find β̂ such that Ŷ = Xβ̂ is close to Y. ⇒ What is the meaning of
"close"? The Euclidean distance?
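The picture behind "close in Euclidean distance" can be verified numerically. A minimal Python sketch with illustrative data (n = 3, p = 2) shows that the least squares choice makes the residual Y − Xβ̂ orthogonal to the subspace Ω spanned by the columns of X:

```python
import numpy as np

# Illustrative simple-regression design with n = 3 observations
x = np.array([1.0, 2.0, 4.0])
X = np.column_stack([np.ones(3), x])   # columns span the 2-dim subspace Omega
Y = np.array([1.0, 3.0, 4.0])

# Least squares via the normal equations X'X beta = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
residual = Y - X @ beta_hat

print(X.T @ residual)   # ~ [0, 0]: residual is perpendicular to Omega
```

This orthogonality is the geometric content of the next slides: Xβ̂ is the projection of Y onto Ω.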
The Least Square Estimator
A Geometric Viewpoint: Simple Case 3
[Figure: geometric illustration of projecting Y onto the plane Ω]
The Least Square Estimator
A General Case
Consider the GM linear model Y = Xβ + ε, where Y is an n×1 vector of
observed responses, X is an n×p matrix of functions of the input variables,
β is a p×1 vector of unknown parameters to be estimated, and ε is an n×1
vector of random errors.
Question
The squared Euclidean distance is (Y − Xβ̂)′(Y − Xβ̂) = ε̂′ε̂. Under what
assumptions is this a good measure of closeness?
The Least Square Estimator
Assumptions
Assumption: cov(ε) = σ2I, i.e.,
homoskedasticity (equal error variances), and
uncorrelated errors.
The Least Square Estimator
The Ordinary Least Square Estimator
Definition (Least Square Estimator)
An estimator β̂ is a least squares estimate of β if
β̂ = arg min_{β∈Rp} (Y − Xβ)′(Y − Xβ).
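A minimal Python sketch of this definition under simulated data (the true coefficients and noise level are assumptions for illustration): when X has full column rank, the minimizer solves the normal equations X′Xβ̂ = X′Y.

```python
import numpy as np

# Simulate a GM model Y = X beta + eps with assumed true beta
rng = np.random.default_rng(2)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -1.0])
Y = X @ beta_true + rng.normal(0.0, 0.1, n)

# Minimizer of (Y - X b)'(Y - X b): solve X'X beta = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)   # close to [1, 2, -1]
```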
Next Lecture
The Next Lecture
In the next lecture, I will introduce a classical optimization method, the
method of least squares, and then discuss the geometric interpretation of
the ordinary least squares estimator.