Sketching as a Tool for Numerical Linear Algebra
David P. Woodruff
presented by Sepehr Assadi
o(n) Big Data Reading Group
University of Pennsylvania
February, 2015
Sepehr Assadi (Penn) Sketching for Numerical Linear Algebra Big Data Reading Group 1 / 25
Goal

New survey by David Woodruff:
- Sketching as a Tool for Numerical Linear Algebra

Topics:
- Subspace Embeddings
- Least Squares Regression
- Least Absolute Deviation Regression
- Low Rank Approximation
- Graph Sparsification
- Sketching Lower Bounds
Introduction

You have "Big" data!
- Computationally expensive to deal with
- Excessive storage requirements
- Hard to communicate
- ...

Summarize your data:
- Sampling
  - A representative subset of the data
- Sketching
  - An aggregate summary of the whole data
Model

Input:
- matrix A ∈ R^{n×d}
- vector b ∈ R^n

Output: function F(A, b, ...)
- e.g. least squares regression

Different goals:
- Faster algorithms
- Streaming
- Distributed
Linear Sketching

Input:
- matrix A ∈ R^{n×d}

Let r ≪ n and let S ∈ R^{r×n} be a random matrix.
Let S·A be the sketch.
Compute F(S·A) instead of F(A).
Linear Sketching (cont.)

Pros:
- Compute on an r×d matrix instead of n×d
- Smaller representation and faster computation
- Linearity:
  - S·(A + B) = S·A + S·B
  - We can compose linear sketches!

Cons:
- F(S·A) is an approximation of F(A)
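The linearity property is easy to verify numerically. A minimal numpy sketch (not from the survey; the dimensions and the Gaussian sketching matrix are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 1000, 20, 100

A = rng.standard_normal((n, d))
B = rng.standard_normal((n, d))
# Gaussian sketching matrix with i.i.d. N(0, 1/r) entries
S = rng.standard_normal((r, n)) / np.sqrt(r)

# Linearity: the sketch of a sum equals the sum of the sketches,
# so sketches of separately held data can be combined after the fact.
lhs = S @ (A + B)
rhs = S @ A + S @ B
print(np.allclose(lhs, rhs))
```

This is what makes linear sketches composable in streaming and distributed settings: each party sketches its own matrix, and the sketches are simply added.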
Least Squares Regression (ℓ2-regression)

Input:
- matrix A ∈ R^{n×d} (full column rank)
- vector b ∈ R^n

Output x* ∈ R^d:

    x* = argmin_x ‖Ax − b‖_2

Closed-form solution:

    x* = (A^T A)^{-1} A^T b

Θ(nd^2)-time algorithm using naive matrix multiplication
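The closed form can be checked against a standard solver. A minimal numpy sketch (dimensions arbitrary); note that in practice QR/SVD-based solvers such as `np.linalg.lstsq` are preferred over forming A^T A explicitly, for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 10
A = rng.standard_normal((n, d))   # full column rank with high probability
b = rng.standard_normal(n)

# Closed form via the normal equations: x* = (A^T A)^{-1} A^T b
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Reference solution from an SVD-based solver
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(np.allclose(x_normal, x_lstsq))
```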
Approximate ℓ2-regression

Input:
- matrix A ∈ R^{n×d} (full column rank)
- vector b ∈ R^n
- parameter 0 < ε < 1

Output x̃ ∈ R^d:

    ‖Ax̃ − b‖_2 ≤ (1 + ε) min_x ‖Ax − b‖_2
Approximate ℓ2-regression (cont.)

A sketching algorithm:
- Sample a random matrix S ∈ R^{r×n}
- Compute S·A and S·b
- Output x̃ = argmin_x ‖(SA)x − (Sb)‖_2

Which randomized family of matrices S, and what value of r?
Approximate ℓ2-regression (cont.)

An introductory construction:
- Let r = Θ(d/ε^2)
- Let S ∈ R^{r×n} be a matrix of i.i.d. normal random variables with mean zero and variance 1/r

Proof Sketch.
On the board.
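The Gaussian construction can be tried out end to end. A numpy illustration, not the survey's analysis: the constant in r = Θ(d/ε^2) is chosen ad hoc, and the comparison is between the exact least-squares residual and the residual of the sketch-and-solve output:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, eps = 5000, 10, 0.5
r = 4 * int(d / eps**2)      # r = Theta(d / eps^2); constant 4 is ad hoc

A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# i.i.d. N(0, 1/r) sketching matrix
S = rng.standard_normal((r, n)) / np.sqrt(r)

x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)          # exact: Theta(n d^2)
x_skt, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)  # sketched: r x d problem

res_opt = np.linalg.norm(A @ x_opt - b)
res_skt = np.linalg.norm(A @ x_skt - b)
print(res_skt / res_opt)  # at most 1 + eps with high probability
```

Only the sketched r×d problem is solved exactly; the expensive n-dimensional work is reduced to the single multiplication S·A.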
Approximate ℓ2-regression (cont.)

Problems:
- Computing S·A takes Θ(nrd) time
- Constructing S requires Θ(nr) space

Different constructions for S:
- Fast Johnson-Lindenstrauss transforms:
  O(nd log d) + poly(d/ε) time [Sarlos, FOCS '06]
- Optimal O(nnz(A)) + poly(d/ε) time algorithm [Clarkson, Woodruff, STOC '13]
- Random sign matrices with Θ(d)-wise independent entries:
  O((d^2/ε) log(nd))-space streaming algorithm [Clarkson, Woodruff, STOC '09]
Subspace Embedding

Definition (ℓ2-subspace embedding)
A (1 ± ε) ℓ2-subspace embedding for a matrix A ∈ R^{n×d} is a matrix S for which, for all x ∈ R^d,

    ‖SAx‖_2^2 = (1 ± ε) ‖Ax‖_2^2
This is actually a subspace embedding for the column space of A.

Oblivious ℓ2-subspace embedding:
- The distribution from which S is chosen is oblivious to A

One very common tool for (oblivious) ℓ2-subspace embeddings is the Johnson-Lindenstrauss transform (JLT).
Johnson-Lindenstrauss Transform

Definition (JLT(ε, δ, f))
A random matrix S ∈ R^{r×n} forms a JLT(ε, δ, f) if, with probability at least 1 − δ, for any f-element subset V ⊆ R^n, it holds that:

    ∀ v, v' ∈ V:  |⟨Sv, Sv'⟩ − ⟨v, v'⟩| ≤ ε ‖v‖_2 ‖v'‖_2
Usual statement (i.e. the original Johnson-Lindenstrauss Lemma):

Lemma (JLL)
Given N points q_1, ..., q_N ∈ R^n, there exists a matrix S ∈ R^{t×n} (a linear map) with t = Θ(log N / ε^2) such that, with high probability, simultaneously for all pairs q_i and q_j,

    ‖S(q_i − q_j)‖_2 = (1 ± ε) ‖q_i − q_j‖_2
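The JLL is easy to illustrate empirically with a Gaussian map. A numpy sketch (the constant in t = Θ(log N / ε^2) is ad hoc, and the check is over all pairwise distances):

```python
import numpy as np

rng = np.random.default_rng(3)
n, N, eps = 1000, 50, 0.25
t = 8 * int(np.ceil(np.log(N) / eps**2))   # t = Theta(log N / eps^2), ad hoc constant

Q = rng.standard_normal((N, n))            # N points in R^n
S = rng.standard_normal((t, n)) / np.sqrt(t)

# Ratio of sketched to original distance for every pair of points
ratios = []
for i in range(N):
    for j in range(i + 1, N):
        diff = Q[i] - Q[j]
        ratios.append(np.linalg.norm(S @ diff) / np.linalg.norm(diff))
print(min(ratios), max(ratios))
```

Note the dimension t depends only on the number of points N and on ε, not on the ambient dimension n.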
Johnson-Lindenstrauss Transform (cont.)

A simple construction of JLT(ε, δ, f):

Theorem
Let 0 < ε, δ < 1 and S = (1/√r)·R ∈ R^{r×n}, where the entries R_{i,j} are independent standard normal random variables. Assuming r = Ω(ε^{-2} log(f/δ)), then S is a JLT(ε, δ, f).

Other constructions:
- Random sign matrices
  [Achlioptas, '03], [Clarkson, Woodruff, STOC '09]
- Random sparse matrices
  [Dasgupta, Kumar, Sarlos, STOC '10], [Kane, Nelson, J. ACM '14]
- Fast Johnson-Lindenstrauss transforms
  [Ailon, Chazelle, STOC '06]
JLT results in ℓ2-subspace embedding

Claim
S = JLT(ε, δ, f) is an oblivious ℓ2-subspace embedding for A ∈ R^{n×d}.

Challenge:
- JLT(ε, δ, f) provides a guarantee only for a single finite set in R^n
- An ℓ2-subspace embedding requires the guarantee for an infinite set, i.e. the column space of A
JLT results in ℓ2-subspace embedding (cont.)

Let 𝒮 be the unit sphere in the column space of A:

    𝒮 = { y ∈ R^n | y = Ax for some x ∈ R^d and ‖y‖_2 = 1 }

We seek a finite subset N ⊆ 𝒮 so that if

    ∀ w, w' ∈ N:  ⟨Sw, Sw'⟩ = ⟨w, w'⟩ ± ε

then

    ∀ y ∈ 𝒮:  ‖Sy‖_2 = (1 ± ε) ‖y‖_2
JLT results in ℓ2-subspace embedding (cont.)

Lemma (1/2-net for 𝒮)
It suffices to choose any N such that

    ∀ y ∈ 𝒮  ∃ w ∈ N  s.t.  ‖y − w‖_2 ≤ 1/2

Proof.
1. Decompose y as

       y = y^(0) + y^(1) + y^(2) + ...

   where ‖y^(i)‖_2 ≤ 1/2^i and y^(i)/‖y^(i)‖_2 ∈ N

2. Then ‖Sy‖_2^2 = ‖S(y^(0) + y^(1) + y^(2) + ...)‖_2^2 = 1 ± O(ε)
1/2-net of 𝒮

Lemma
There exists a 1/2-net N of 𝒮 for which |N| ≤ 5^d.

Proof.
1. Find a maximal set N' of points on the unit sphere in R^d such that no two points are within distance 1/2 of each other
2. Let U be the orthonormal basis matrix of the column space of A
3. N = { y ∈ R^n | y = Ux for some x ∈ N' and ‖y‖_2 = 1 }
Subspace Embedding via JLT

Theorem
Let 0 < ε, δ < 1 and S = JLT(ε, δ, 5^d). For any fixed matrix A ∈ R^{n×d}, with probability 1 − δ, S is a (1 ± ε) ℓ2-subspace embedding for A, i.e.

    ∀ x ∈ R^d:  ‖SAx‖_2 = (1 ± ε) ‖Ax‖_2

Results in:
- O(nnz(A) · ε^{-1} log d) time algorithm using the column-sparsity transform of Kane and Nelson [Kane, Nelson, J. ACM '14]
- O(nd log n) time algorithm using the Fast Johnson-Lindenstrauss transform of Ailon and Chazelle [Ailon, Chazelle, STOC '06]
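The subspace-embedding guarantee can be probed numerically: since it must hold for all x ∈ R^d simultaneously, a single Gaussian sketch should preserve ‖Ax‖_2 for many random test directions at once. A numpy illustration (constants are ad hoc, and this uses a plain Gaussian sketch rather than the column-sparsity or FJLT constructions above):

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, eps = 2000, 5, 0.25
r = 400                          # comfortably above d / eps^2 for this illustration

A = rng.standard_normal((n, d))
S = rng.standard_normal((r, n)) / np.sqrt(r)
SA = S @ A                       # sketch once, reuse for every direction x

# Probe the (infinite) column space with many random directions x
X = rng.standard_normal((d, 200))
ratios = np.linalg.norm(SA @ X, axis=0) / np.linalg.norm(A @ X, axis=0)
print(ratios.min(), ratios.max())
```

The key point is that S is sampled once, without looking at A, and the distortion bound holds over the whole subspace, not just the probed directions.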
Other Subspace Embedding Algorithms

Non-JLT-based subspace embeddings:
- O(nnz(A)) + poly(d/ε) time algorithm [Clarkson, Woodruff, STOC '13]

Non-oblivious subspace embeddings:
- Based on leverage score sampling [Drineas, Mahoney, Muthukrishnan, SODA '06]
ℓ2-regression via Oblivious Subspace Embedding

Theorem
Let S ∈ R^{r×n} be any oblivious subspace embedding matrix and x̃ = argmin_x ‖SAx − Sb‖_2; then

    ‖Ax̃ − b‖_2 ≤ (1 + ε) min_x ‖Ax − b‖_2

Proof.
1. Let the matrix U ∈ R^{n×(d+1)} be an orthonormal basis for the columns of A together with the vector b
2. Suppose S is an ℓ2-subspace embedding for U
Questions?