Lecture 8: Linear algebra
DS GA 1002 Statistical and Mathematical Models
http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall15
Carlos Fernandez-Granda
11/9/2015
Linear models
Many phenomena are (approximately) linear
Linear models are interpretable
Linear models are (often) computationally tractable
Vector spaces
Inner product and norm
Orthogonality
Vector spaces
A vector space consists of a set V and two operations + and ·, such that
- For any x, y ∈ V the vector sum x + y ∈ V
- For any x ∈ V and any scalar α ∈ R the scalar multiple α · x ∈ V
- There exists a zero vector 0 such that x + 0 = x for any x ∈ V
- For any x ∈ V there exists an additive inverse −x such that x + (−x) = 0
In addition, the operations + and · satisfy
- For all x, y, z ∈ V
x + y = y + x, (x + y) + z = x + (y + z)
- For all α, β ∈ R and x ∈ V
α · (β · x) = (αβ) · x
- For all α, β ∈ R and x, y ∈ V
(α + β) · x = α · x + β · x, α · (x + y) = α · x + α · y
A subspace of V is a subset that is itself a vector space
Examples
R^n
Infinite sequences
Polynomials of degree at most d, for a fixed d
Zero-mean random variables
Linear dependence/independence
x_1, x_2, ..., x_m ∈ V are linearly dependent if there exist α_1, ..., α_m ∈ R,
not all equal to zero, such that
$$\sum_{i=1}^{m} \alpha_i x_i = 0.$$
Otherwise, they are linearly independent
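For vectors in R^n, linear independence can be checked numerically: the vectors are independent exactly when Gaussian elimination on the matrix whose rows are the vectors leaves no zero row. The following is a minimal sketch under that assumption; the helper names `rank` and `linearly_independent` are illustrative and do not come from the lecture, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors in R^n, via Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(v) for v in vec] for vec in vectors]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a row at or below the pivot position with a nonzero entry in this column.
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Eliminate this column from every row below the pivot row.
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots

def linearly_independent(vectors):
    """True if no nontrivial linear combination of the vectors equals zero."""
    return rank(vectors) == len(vectors)

# The third vector is the sum of the first two, so this set is dependent.
print(linearly_independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))  # False
print(linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True
```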
Span
The span of x_1, ..., x_m is the set of all possible linear combinations
$$\operatorname{span}(x_1, \ldots, x_m) := \left\{ y \;\middle|\; y = \sum_{i=1}^{m} \alpha_i x_i \text{ for some } \alpha_1, \alpha_2, \ldots, \alpha_m \in \mathbb{R} \right\}$$
The span is a vector space
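A short justification of this claim: the span is closed under vector sums and scalar multiples, because both operations produce another linear combination of the same vectors. The coefficients β_i and the scalar γ below are introduced only for this sketch.

```latex
\sum_{i=1}^{m} \alpha_i x_i + \sum_{i=1}^{m} \beta_i x_i
  = \sum_{i=1}^{m} (\alpha_i + \beta_i)\, x_i,
\qquad
\gamma \sum_{i=1}^{m} \alpha_i x_i
  = \sum_{i=1}^{m} (\gamma \alpha_i)\, x_i .
```

Taking all coefficients equal to zero shows that the zero vector belongs to the span, and the remaining properties are inherited from V.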
Basis
A basis of V is a set of linearly independent vectors {x_1, ..., x_m} such that
V = span(x_1, ..., x_m).
All bases in a vector space have the same cardinality
The dimension of a vector space is the cardinality of its bases
Inner product
Operation 〈·, ·〉 that maps pairs of vectors to R
- It is symmetric: for any x, y ∈ V
〈x, y〉 = 〈y, x〉
- It is linear in its first argument: for any α ∈ R and any x, y, z ∈ V
〈α x, y〉 = α 〈x, y〉, 〈x + y, z〉 = 〈x, z〉 + 〈y, z〉
- It is positive definite: for any x ∈ V
〈x, x〉 ≥ 0, and 〈x, x〉 = 0 implies x = 0
Examples
Dot product of vectors x, y ∈ R^n
$$x \cdot y := \sum_{i=1}^{n} x[i]\, y[i]$$
Covariance E (XY ) of zero-mean random variables X and Y
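Both examples can be tried out numerically. The sketch below assumes vectors are given as Python lists and, for the covariance, that the random variables live on a finite sample space where outcome k has probability `probs[k]`; the function names are illustrative and not part of the lecture.

```python
def dot(x, y):
    """Dot product of two vectors in R^n, given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

def expectation_of_product(x_vals, y_vals, probs):
    """E(XY) on a finite sample space (an assumption made for illustration).

    Outcome k has probability probs[k]; X and Y take the values
    x_vals[k] and y_vals[k] on that outcome.
    """
    return sum(p * a * b for p, a, b in zip(probs, x_vals, y_vals))

x, y, z = [1, 2, 3], [4, -1, 0], [2, 2, 2]
print(dot(x, y) == dot(y, x))                       # symmetry
print(dot([3 * a for a in x], y) == 3 * dot(x, y))  # homogeneity in the first argument
print(dot([a + b for a, b in zip(x, y)], z) == dot(x, z) + dot(y, z))  # additivity

# Two zero-mean random variables on four equally likely outcomes.
probs = [0.25, 0.25, 0.25, 0.25]
X = [1, -1, 1, -1]
Y = [1, 1, -1, -1]
print(expectation_of_product(X, Y, probs))  # covariance of X and Y
print(expectation_of_product(X, X, probs))  # variance of X
```

In this example X and Y have zero covariance, so they are orthogonal with respect to this inner product.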
Norm
Function ||·|| from V to R such that
- It is homogeneous: for all α ∈ R and x ∈ V
||α x|| = |α| ||x||
- It satisfies the triangle inequality
||x + y|| ≤ ||x|| + ||y||
In particular, it is nonnegative (set y = −x)
- ||x|| = 0 implies that x is the zero vector 0
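The nonnegativity remark can be spelled out as follows. Homogeneity with α = 0 gives ||0|| = 0, and with α = −1 it gives ||−x|| = ||x||; the triangle inequality with y = −x then yields:

```latex
0 = \|x + (-x)\| \le \|x\| + \|{-x}\| = 2\,\|x\|
\quad\Longrightarrow\quad
\|x\| \ge 0 .
```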
Distance
The distance between vectors in a normed space is
d (x , y) := ||x − y ||
Inner-product norm
The norm induced by an inner product is
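$$\|x\|_{\langle\cdot,\cdot\rangle} := \sqrt{\langle x, x\rangle}$$
The Euclidean or ℓ2 norm is induced by the dot product in R^n,
$$\|x\|_2 := \sqrt{x \cdot x} = \sqrt{\sum_{i=1}^{n} x[i]^2}$$
The standard deviation of a zero-mean random variable X is the norm induced by the covariance
$$\sigma_X = \sqrt{\mathrm{E}(X^2)}$$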
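As a small illustration of these induced norms and of the distance d(x, y) = ||x − y||, here is a sketch in plain Python. The function names are illustrative, and the standard deviation is computed under the assumption of a zero-mean random variable on a finite sample space with given outcome probabilities.

```python
import math

def norm2(x):
    """Euclidean (l2) norm: square root of the dot product of x with itself."""
    return math.sqrt(sum(a * a for a in x))

def distance(x, y):
    """Distance induced by the l2 norm: the norm of the difference x - y."""
    return norm2([a - b for a, b in zip(x, y)])

def std_zero_mean(values, probs):
    """sqrt(E(X^2)) for a zero-mean X on a finite sample space (illustrative assumption)."""
    return math.sqrt(sum(p * v * v for p, v in zip(probs, values)))

print(norm2([3, 4]))                       # 5.0
print(distance([1, 1], [4, 5]))            # 5.0
print(std_zero_mean([2, -2], [0.5, 0.5]))  # 2.0
```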
Cauchy-Schwarz inequality
For any two vectors x and y in an inner-product space
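$$|\langle x, y\rangle| \le \|x\|_{\langle\cdot,\cdot\rangle} \, \|y\|_{\langle\cdot,\cdot\rangle}$$
Assume $\|x\|_{\langle\cdot,\cdot\rangle} \neq 0$. Then
$$\langle x, y\rangle = -\|x\|_{\langle\cdot,\cdot\rangle} \, \|y\|_{\langle\cdot,\cdot\rangle} \iff y = -\frac{\|y\|_{\langle\cdot,\cdot\rangle}}{\|x\|_{\langle\cdot,\cdot\rangle}} \, x$$
$$\langle x, y\rangle = \|x\|_{\langle\cdot,\cdot\rangle} \, \|y\|_{\langle\cdot,\cdot\rangle} \iff y = \frac{\|y\|_{\langle\cdot,\cdot\rangle}}{\|x\|_{\langle\cdot,\cdot\rangle}} \, x$$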
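The inequality and its two equality cases can be checked numerically for the dot product in R^n. This is only a sanity check on a few hand-picked vectors, not a proof; the vectors and helper names are chosen for illustration, and floating-point comparisons use `math.isclose`.

```python
import math

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def norm2(x):
    """Norm induced by the dot product."""
    return math.sqrt(dot(x, x))

x = [1.0, 2.0, 3.0]
y = [4.0, -1.0, 0.5]

# Generic pair: the absolute inner product is at most the product of the norms.
print(abs(dot(x, y)) <= norm2(x) * norm2(y))

# y is a negative multiple of x: the lower bound is attained.
y_neg = [-2.0 * a for a in x]
print(math.isclose(dot(x, y_neg), -norm2(x) * norm2(y_neg)))

# y is a positive multiple of x: the upper bound is attained.
y_pos = [0.5 * a for a in x]
print(math.isclose(dot(x, y_pos), norm2(x) * norm2(y_pos)))
```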
Corollary: Inner-product norms satisfy the triangle inequality
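A sketch of why the corollary holds: expand the squared norm of the sum using symmetry and linearity of the inner product, bound the cross term with the Cauchy-Schwarz inequality, and take square roots. All norms below are the norm induced by the inner product.

```latex
\|x + y\|^2
  = \langle x + y, x + y\rangle
  = \|x\|^2 + 2\,\langle x, y\rangle + \|y\|^2
  \le \|x\|^2 + 2\,\|x\|\,\|y\| + \|y\|^2
  = \left(\|x\| + \|y\|\right)^2 .
```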
Orthogonality
x and y are orthogonal if 〈x , y〉 = 0
x is orthogonal to a set S, if
〈x , s〉 = 0, for all s ∈ S.
Two sets S_1, S_2 are orthogonal if
〈x, y〉 = 0, for any x ∈ S_1, y ∈ S_2
The orthogonal complement of a subspace S is
S⊥ := {x | 〈x , y〉 = 0 for all y ∈ S} .
Pythagorean theorem
If x and y are orthogonal
$$\|x + y\|_{\langle\cdot,\cdot\rangle}^2 = \|x\|_{\langle\cdot,\cdot\rangle}^2 + \|y\|_{\langle\cdot,\cdot\rangle}^2$$
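The identity follows in one line from the definition of the induced norm: expanding the inner product by linearity and symmetry leaves a cross term that vanishes when x and y are orthogonal.

```latex
\|x + y\|^2
  = \langle x + y, x + y\rangle
  = \|x\|^2 + 2\,\langle x, y\rangle + \|y\|^2
  = \|x\|^2 + \|y\|^2
  \quad \text{when } \langle x, y\rangle = 0 .
```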
Orthogonality between vector and subspace
If b_1, b_2, ..., b_n is a basis of V and
〈x, b_i〉 = 0, 1 ≤ i ≤ n,
then x is orthogonal to V
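The reasoning behind this statement: every v ∈ V is a linear combination of the basis vectors, so linearity (together with symmetry) of the inner product reduces 〈x, v〉 to the inner products with the basis. The coefficients α_i are the coordinates of v in the basis.

```latex
v = \sum_{i=1}^{n} \alpha_i b_i
\quad\Longrightarrow\quad
\langle x, v\rangle
  = \sum_{i=1}^{n} \alpha_i \,\langle x, b_i\rangle
  = 0 .
```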
Orthonormal basis
Basis of mutually orthogonal vectors with norm equal to one
If {u_1, ..., u_n} is an orthonormal basis of V, then
$$x = \sum_{i=1}^{n} \langle u_i, x\rangle \, u_i$$
for any vector x ∈ V
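The expansion can be illustrated in R^2 with the dot product. The orthonormal basis and the vector x below are chosen only for this sketch: the coefficients are the inner products with the basis vectors, and summing the scaled basis vectors recovers x up to floating-point error.

```python
import math

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

# An orthonormal basis of R^2 (the standard basis rotated by 45 degrees).
s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]

x = [3.0, 1.0]

# Coefficients <u_i, x> of x in the orthonormal basis.
coeffs = [dot(u, x) for u in basis]

# Reconstruct x as the sum of coefficient times basis vector.
reconstruction = [
    sum(c * u[k] for c, u in zip(coeffs, basis)) for k in range(len(x))
]

print(all(math.isclose(a, b) for a, b in zip(reconstruction, x)))
```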
Gram-Schmidt
Every finite-dimensional vector space has an orthonormal basis
Input: A set of linearly independent vectors {x_1, ..., x_m} ⊆ R^n
Output: An orthonormal basis {u_1, ..., u_m} for span(x_1, ..., x_m).
Initialization: Set u_1 := x_1 / ||x_1||_2.
For i = 2, ..., m compute
$$v_i := x_i - \sum_{j=1}^{i-1} \langle u_j, x_i\rangle \, u_j$$
and set u_i := v_i / ||v_i||_2
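The procedure above can be sketched in plain Python for vectors in R^n with the dot product. This is a minimal illustration of the classical Gram-Schmidt steps as stated, not a numerically robust implementation; the function name, the `tol` parameter, and the dependence check are assumptions added for the example.

```python
import math

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize linearly independent vectors in R^n (classical Gram-Schmidt)."""
    basis = []
    for x in vectors:
        # Subtract from x its components along the previously computed u_j.
        v = list(x)
        for u in basis:
            coeff = dot(u, x)
            v = [vi - coeff * ui for vi, ui in zip(v, u)]
        norm_v = math.sqrt(dot(v, v))
        # Hypothetical safeguard: a (near-)zero v means the inputs were dependent.
        if norm_v < tol:
            raise ValueError("input vectors are linearly dependent (up to tol)")
        basis.append([vi / norm_v for vi in v])
    return basis

us = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

# Check orthonormality: <u_i, u_j> should be 1 if i == j and 0 otherwise.
for i, ui in enumerate(us):
    for j, uj in enumerate(us):
        expected = 1.0 if i == j else 0.0
        assert abs(dot(ui, uj) - expected) < 1e-9
print("orthonormal basis computed")
```

The first pass through the loop has no earlier vectors to subtract, so it reduces to the initialization step u_1 := x_1 / ||x_1||_2.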