Four Talks on Tensor Computations
1. Connecting to Matrix Computations
Charles F. Van Loan
Cornell University
SCAN Seminar
October 27, 2014
⊗ Four Talks on Tensor Computations ⊗ 1. Connecting to Matrix Computations 1 / 56
Overview
Preparation for the Next Big Thing...
Scalar-Level Thinking
    ⇓ 1960's   ⇐ The factorization paradigm: LU, LDL^T, QR, UΣV^T, etc.
Matrix-Level Thinking
    ⇓ 1980's   ⇐ Cache utilization, parallel computing, LAPACK, etc.
Block Matrix-Level Thinking
    ⇓ 2000's   ⇐ New applications, factorizations, data structures, nonlinear analysis, optimization strategies, etc.
Tensor-Level Thinking
The Matrix Factorizations Paradigm
PA = LU    A = QR    A = GG^T    PAP^T = LDL^T    Q^TAQ = D
X^{-1}AX = J    U^TAU = T    AP = QR    A = ULV^T    PAQ^T = LU    A = UΣV^T

(repeated across the whole slide as a backdrop)
It’s a Language
Some Ways They Are Used
Conversion of a “hard” matrix problem into an easier one.
Ax = b ≡ Ly = Pb, Ux = y (where PA = LU)
Low-rank approximation.
(10^7-by-10^5) ≈ (10^7-by-10) × (10-by-10^5)
Reasoning about Data Re-Use via Block Factorizations
Aij ← Aij − AikAkj not aij ← aij − aikakj
Discovery and Insight
min_{Q,x,y} ‖ A1 − (A2 + xy^T)Q ‖_F ≤ 10^{-4}
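The low-rank idea above is easy to sketch; the following NumPy illustration uses a small made-up size (the 10^7-by-10^5 matrix on the slide is just an order-of-magnitude example) and the truncated SVD as the factorization.

```python
import numpy as np

# Approximate a matrix by the product of two thin factors via the SVD.
# Sizes here are small stand-ins for the slide's 10^7-by-10^5 example.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 8)) @ rng.standard_normal((8, 60))  # rank 8 by construction

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 8
B = U[:, :r] * s[:r]   # 100-by-8 factor (columns scaled by singular values)
C = Vt[:r, :]          # 8-by-60 factor

# Storing B and C costs 100*8 + 8*60 = 1280 numbers instead of 100*60 = 6000.
err = np.linalg.norm(A - B @ C)   # tiny here, since rank(A) = 8
```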
Tensor Factorizations and Decompositions
The same story is playing out with tensors:
A = σ1 w1 ◦ v1 ◦ u1 + σ2 w2 ◦ v2 ◦ u2 + · · ·
The Big Data Theme
A Changing Definition of “Big”
In Matrix Computations, to say that A ∈ IR^{n1×n2} is "big" is to say that both n1 and n2 are big.
In Tensor Computations, to say that A ∈ IR^{n1×···×nd} is "big" is to say that n1 n2 · · · nd is big, and this need not require big nk. E.g., n1 = n2 = · · · = n1000 = 2.
Algorithms that scale with d will induce a transition...
Matrix-Based Scientific Computation
⇓
Tensor-Based Scientific Computation
What is a Tensor?
Definition
An order-d tensor A ∈ IR^{n1×···×nd} is a real d-dimensional array A(1:n1, . . . , 1:nd) where the index range in the k-th mode is from 1 to nk. A tensor A ∈ IR^{n1×···×nd} is cubical if n1 = · · · = nd.
Low-Order Tensors
A scalar is an order-0 tensor.
A vector is an order-1 tensor.
A matrix is an order-2 tensor.
We use calligraphic font to designate tensors that have order 3 or greater, e.g., A, B, C, etc.
Where Might They Come From?
Discretization
A(i, j, k, ℓ) might house the value of f(w, x, y, z) at (w, x, y, z) = (wi, xj, yk, zℓ).
Multiway Analysis
A(i, j, k, ℓ) is a value that captures an interaction between four variables/factors.
You Have Seen Them Before
Images
A color picture is an m-by-n-by-3 tensor:
A(:, :, 1) = red pixel values
A(:, :, 2) = green pixel values
A(:, :, 3) = blue pixel values
Obvious extension of the “colon notation”. More later.
You Have Seen Them Before
Block Matrices
A =
    [ a11 a12 | a13 a14 | a15 a16 ]
    [ a21 a22 | a23 a24 | a25 a26 ]
    [---------+---------+---------]
    [ a31 a32 | a33 a34 | a35 a36 ]
    [ a41 a42 | a43 a44 | a45 a46 ]
    [---------+---------+---------]
    [ a51 a52 | a53 a54 | a55 a56 ]
    [ a61 a62 | a63 a64 | a65 a66 ]

a 3-by-3 block matrix with 2-by-2 blocks.
Matrix entry a45 is the (2,1) entry of the (2,3) block:
a45 ⇔ A(2, 3, 2, 1)
You Have Seen Them Before
Kronecker Products
The m1m2m3-by-n1n2n3 matrix
A = A1(1:m1, 1:n1)⊗ A2(1:m2, 1:n2)⊗ A3(1:m3, 1:n3)
is a reshaping of the m1-by-m2-by-m3-by-n1-by-n2-by-n3 tensor
A(i1, i2, i3, j1, j2, j3) = A1(i1, j1)A2(i2, j2)A3(i3, j3)
The Kronecker Product
Definition
B ⊗ C is a block matrix whose ij-th block is bijC .
An Example...

    [ b11 b12 ]         [ b11 C   b12 C ]
    [ b21 b22 ] ⊗ C  =  [ b21 C   b22 C ]
Kronecker products have a replicated block structure.
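The definition is easy to check numerically; here is a small NumPy sketch (the slides themselves are library-agnostic), using made-up 2-by-2 data:

```python
import numpy as np

# Check the definition: B ⊗ C is the block matrix whose (i,j) block is b_ij * C.
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])
C = np.array([[0.0, 5.0],
              [6.0, 7.0]])

A = np.kron(B, C)          # 4-by-4: a 2-by-2 block matrix of 2-by-2 blocks
block_21 = A[2:4, 0:2]     # the (2,1) block, which should equal b21 * C = 3*C
```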
The Kronecker Product
There are three ways to regard A = B ⊗ C ...
If

    A = [ b11 · · · b1n ]     [ c11 · · · c1q ]
        [  .    .    .  ]  ⊗  [  .    .    .  ]
        [ bm1 · · · bmn ]     [ cp1 · · · cpq ]

then

(i). A is an m-by-n block matrix with p-by-q blocks.

(ii). A is an mp-by-nq matrix of scalars.

(iii). A is an unfolded 4-th order tensor A ∈ IR^{p×q×m×n}:

    A(i1, i2, i3, i4) = B(i3, i4) C(i1, i2)
Kronecker Product: All Possible Entry-Entry Products
4 × 9 = 36 entry-entry products:

    [ b11 b12 ]   [ c11 c12 c13 ]
    [ b21 b22 ] ⊗ [ c21 c22 c23 ] =
                  [ c31 c32 c33 ]
b11c11 b11c12 b11c13 b12c11 b12c12 b12c13
b11c21 b11c22 b11c23 b12c21 b12c22 b12c23
b11c31 b11c32 b11c33 b12c31 b12c32 b12c33
b21c11 b21c12 b21c13 b22c11 b22c12 b22c13
b21c21 b21c22 b21c23 b22c21 b22c22 b22c23
b21c31 b21c32 b21c33 b22c31 b22c32 b22c33
Kronecker Products of Kronecker Products
Hierarchical
B ⊗ C ⊗ D =

    [ b11 b12 ]   [ c11 c12 c13 c14 ]   [ d11 d12 d13 ]
    [ b21 b22 ] ⊗ [ c21 c22 c23 c24 ] ⊗ [ d21 d22 d23 ]
                  [ c31 c32 c33 c34 ]   [ d31 d32 d33 ]
                  [ c41 c42 c43 c44 ]

A 2-by-2 block matrix whose entries are 4-by-4 block matrices whose entries are 3-by-3 matrices.
Bridging the Gap to Matrix Computations
Tensor computations are typically disguised matrix computations, and that is because of
The Kronecker Product
A = A1 ⊗ A2 ⊗ A3 (an order-6 tensor)
Tensor Unfoldings
Rubik's Cube −→ 3-by-9 matrix
Alternating Least Squares
Multilinear optimization via component-wise optimization
The Decomposition Paradigm in Matrix Computations
Typical...
Convert the given problem into an equivalent easy-to-solve problemby using the “right” matrix decomposition.
PA = LU, Ly = Pb, Ux = y =⇒ Ax = b
Also Typical...
Uncover hidden relationships by computing the “right” decompositionof the data matrix.
A = UΣV^T   =⇒   A ≈ Σ_{i=1}^{r} σi ui vi^T
Two Natural Questions
Question 1
Can we solve tensor problems by converting them to equivalent,easy-to-solve problems using a tensor decomposition?
Question 2
Can we uncover hidden patterns in tensor data by computing anappropriate tensor decomposition?
Let’s explore the decomposition issue for 2-by-2-by-2 tensors...
The Singular Value Decomposition
The 2-by-2 case...

    [ a11 a12 ]   [ u11 u12 ] [ σ1  0 ] [ v11 v12 ]^T
    [ a21 a22 ] = [ u21 u22 ] [ 0  σ2 ] [ v21 v22 ]

                     [ u11 ] [ v11 ]^T        [ u12 ] [ v12 ]^T
              = σ1 · [ u21 ] [ v21 ]    + σ2 ·[ u22 ] [ v22 ]

A reshaped presentation of the same thing...

    [ a11 ]      [ v11 u11 ]      [ v12 u12 ]
    [ a21 ]      [ v11 u21 ]      [ v12 u22 ]
    [ a12 ] = σ1 [ v21 u11 ] + σ2 [ v22 u12 ]
    [ a22 ]      [ v21 u21 ]      [ v22 u22 ]

The reshaped rank-1 matrices have special structure...
The Kronecker Product (KP)
The KP of a Pair of 2-vectors

    [ x1 ]   [ y1 ]   [ x1 y1 ]
    [ x2 ] ⊗ [ y2 ] = [ x1 y2 ]
                      [ x2 y1 ]
                      [ x2 y2 ]

2-by-2 SVD in KP Terms...

    [ a11 ]        [ v11 ]   [ u11 ]        [ v12 ]   [ u12 ]
    [ a21 ] = σ1 · [ v21 ] ⊗ [ u21 ] + σ2 · [ v22 ] ⊗ [ u22 ]
    [ a12 ]
    [ a22 ]
Reshaping
Appreciate the Duality...

    [ a11 a12 ]
    [ a21 a22 ] = σ1 · u1 v1^T + σ2 · u2 v2^T

    [ a11 ]
    [ a21 ]
    [ a12 ] = σ1 · (v1 ⊗ u1) + σ2 · (v2 ⊗ u2)
    [ a22 ]

where U = [ u1 u2 ] and V = [ v1 v2 ].
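The duality is easy to verify numerically. A NumPy sketch, using a random 2-by-2 matrix and a column-major (`order="F"`) reshape for vec:

```python
import numpy as np

# Verify the duality: stacking the columns of A = s1*u1*v1' + s2*u2*v2'
# gives s1*(v1 kron u1) + s2*(v2 kron u2).
rng = np.random.default_rng(1)
A = rng.standard_normal((2, 2))
U, s, Vt = np.linalg.svd(A)
u1, u2 = U[:, 0], U[:, 1]
v1, v2 = Vt[0, :], Vt[1, :]

vecA = A.reshape(-1, order="F")   # column-major stacking = vec(A)
kp_sum = s[0]*np.kron(v1, u1) + s[1]*np.kron(v2, u2)
```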
A ∈ IR^{2×2×2} as a minimal sum of rank-1 tensors.
What is a rank-1 tensor?
If R ∈ IR^{2×2×2} has unit rank, then there exist f, g, h ∈ IR^2 such that

    [ r111 ]
    [ r211 ]
    [ r121 ]   [ h1 ]   [ g1 ]   [ f1 ]
    [ r221 ] = [ h2 ] ⊗ [ g2 ] ⊗ [ f2 ]
    [ r112 ]
    [ r212 ]
    [ r122 ]
    [ r222 ]

This means that if R = (rijk), then

    rijk = hk · gj · fi,   i = 1:2, j = 1:2, k = 1:2
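The same rank-1 structure can be built and checked directly in NumPy; the vectors f, g, h below are arbitrary made-up data:

```python
import numpy as np

# Build a rank-1 2-by-2-by-2 tensor from made-up vectors f, g, h and check
# that vec(R) = h kron g kron f reshapes to r_ijk = f_i * g_j * h_k.
f = np.array([1.0, 2.0])
g = np.array([3.0, -1.0])
h = np.array([0.5, 4.0])

vecR = np.kron(h, np.kron(g, f))          # length 8, first index varies fastest
R = vecR.reshape((2, 2, 2), order="F")    # column-major reshape back to a tensor
```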
A ∈ IR^{2×2×2} as a minimal sum of rank-1 tensors.
Challenge: Find the thinnest possible X, Y, Z ∈ IR^{2×r} so that

    [ a111 ]
    [ a211 ]
    [ a121 ]    r
    [ a221 ] =  Σ   zk ⊗ yk ⊗ xk
    [ a112 ]   k=1
    [ a212 ]
    [ a122 ]
    [ a222 ]

where X = [ x1 | · · · | xr ], Y = [ y1 | · · · | yr ], and Z = [ z1 | · · · | zr ] are column partitionings.
The minimizing r is the tensor rank.
A ∈ IR^{2×2×2} as a minimal sum of rank-1 tensors.
A Surprising Fact
If

    [ a111, a211, a121, a221, a112, a212, a122, a222 ]^T = randn(8,1),

then

    rank = 2 with probability 79%
    rank = 3 with probability 21%
This is Different from the Matrix Case...
If A = randn(n,n), then rank(A) = n with prob 100%.
What are the “full rank” 2-by-2-by-2 tensors?
The 79/21 Property
The 2-by-2 Generalized Eigenvalue Problem (GEV)
The generalized eigenvalues of F − λG are the zeros of

    det( [ f11 f12 ]     [ g11 g12 ] )
       ( [ f21 f22 ] − λ [ g21 g22 ] ) = 0

Coincidence?

If the aijk are randn, then

    det( [ a111 a121 ]     [ a112 a122 ] )
       ( [ a211 a221 ] − λ [ a212 a222 ] ) = 0

has real distinct roots with probability 79% and complex conjugate roots with probability 21%.
What is the connection between the 2-by-2 GEV and the 2-by-2-by-2 tensorrank problem?
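The 79/21 split is easy to reproduce by Monte Carlo. A NumPy sketch that draws the two frontal slices and tests the discriminant of the quadratic det(A(:,:,1) − λ·A(:,:,2)) = 0 (the trial count and seed are arbitrary choices):

```python
import numpy as np

# Monte Carlo check of the 79/21 split for randn 2-by-2-by-2 tensors.
rng = np.random.default_rng(0)
trials = 200_000
A1 = rng.standard_normal((trials, 2, 2))   # frontal slices A(:,:,1)
A2 = rng.standard_normal((trials, 2, 2))   # frontal slices A(:,:,2)

# det(A1 - lambda*A2) = a*lambda^2 + b*lambda + c
a = np.linalg.det(A2)
c = np.linalg.det(A1)
b = -(A1[:, 0, 0]*A2[:, 1, 1] + A1[:, 1, 1]*A2[:, 0, 0]
      - A1[:, 0, 1]*A2[:, 1, 0] - A1[:, 1, 0]*A2[:, 0, 1])

# fraction of trials with real distinct roots (positive discriminant)
frac_real = np.mean(b*b - 4*a*c > 0)
```

With a couple hundred thousand trials, `frac_real` lands near 0.79, matching the slide.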
Nearness Problems
The Nearest Rank-1 Matrix Problem
If A ∈ IR^{n1×n2} has SVD A = UΣV^T, then Bopt = σ1 u1 v1^T minimizes

    φ(B) = ‖ A − B ‖_F   subject to rank(B) = 1.

The Nearest Rank-1 Tensor Problem for A ∈ IR^{2×2×2}

Find unit 2-norm vectors u, v, w ∈ IR^2 and a scalar σ so that ‖ a − σ · w ⊗ v ⊗ u ‖_2 is minimized, where

    a = [ a111, a211, a121, a221, a112, a212, a122, a222 ]^T
A Highly Structured Nonlinear Optimization Problem
It Depends on Four Parameters...
    φ(σ, θ1, θ2, θ3) = ‖ a − σ [ cos(θ3) ] ⊗ [ cos(θ2) ] ⊗ [ cos(θ1) ] ‖
                       ‖       [ sin(θ3) ]   [ sin(θ2) ]   [ sin(θ1) ] ‖2

                       ‖ [ a111 ]       [ c3c2c1 ] ‖
                       ‖ [ a211 ]       [ c3c2s1 ] ‖
                       ‖ [ a121 ]       [ c3s2c1 ] ‖
                     = ‖ [ a221 ] − σ · [ c3s2s1 ] ‖
                       ‖ [ a112 ]       [ s3c2c1 ] ‖
                       ‖ [ a212 ]       [ s3c2s1 ] ‖
                       ‖ [ a122 ]       [ s3s2c1 ] ‖
                       ‖ [ a222 ]       [ s3s2s1 ] ‖2

    ci = cos(θi), si = sin(θi), i = 1:3
A Highly Structured Nonlinear Optimization Problem
And It Can Be Reshaped...
        ‖ [ a111 ]       [ c3c2c1 ] ‖       ‖ [ a111 ]   [ c3c2   0   ]        ‖
        ‖ [ a211 ]       [ c3c2s1 ] ‖       ‖ [ a211 ]   [  0    c3c2 ]        ‖
        ‖ [ a121 ]       [ c3s2c1 ] ‖       ‖ [ a121 ]   [ c3s2   0   ]        ‖
    φ = ‖ [ a221 ] − σ · [ c3s2s1 ] ‖   =   ‖ [ a221 ] − [  0    c3s2 ] [ x1 ] ‖
        ‖ [ a112 ]       [ s3c2c1 ] ‖       ‖ [ a112 ]   [ s3c2   0   ] [ y1 ] ‖
        ‖ [ a212 ]       [ s3c2s1 ] ‖       ‖ [ a212 ]   [  0    s3c2 ]        ‖
        ‖ [ a122 ]       [ s3s2c1 ] ‖       ‖ [ a122 ]   [ s3s2   0   ]        ‖
        ‖ [ a222 ]       [ s3s2s1 ] ‖2      ‖ [ a222 ]   [  0    s3s2 ]        ‖2

where

    [ x1 ]       [ c1 ]       [ cos(θ1) ]
    [ y1 ] = σ · [ s1 ] = σ · [ sin(θ1) ]

Idea: Improve σ and θ1 by minimizing with respect to x1 and y1, holding θ2 and θ3 fixed.
A Highly Structured Nonlinear Optimization Problem
And It Can Be Reshaped...
The same φ as on the previous slide, now grouped to isolate the θ1-free part:

        ‖ [ a111 ]   [ c3c1   0   ]        ‖
        ‖ [ a211 ]   [ c3s1   0   ]        ‖
        ‖ [ a121 ]   [  0    c3c1 ]        ‖
    φ = ‖ [ a221 ] − [  0    c3s1 ] [ x2 ] ‖
        ‖ [ a112 ]   [ s3c1   0   ] [ y2 ] ‖
        ‖ [ a212 ]   [ s3s1   0   ]        ‖
        ‖ [ a122 ]   [  0    s3c1 ]        ‖
        ‖ [ a222 ]   [  0    s3s1 ]        ‖2

where

    [ x2 ]       [ c2 ]       [ cos(θ2) ]
    [ y2 ] = σ · [ s2 ] = σ · [ sin(θ2) ]

Idea: Improve σ and θ2 by minimizing with respect to x2 and y2, holding θ1 and θ3 fixed.
A Highly Structured Nonlinear Optimization Problem
And It Can Be Reshaped...
And once more, grouped to isolate the θ3 dependence:

        ‖ [ a111 ]   [ c2c1   0   ]        ‖
        ‖ [ a211 ]   [ c2s1   0   ]        ‖
        ‖ [ a121 ]   [ s2c1   0   ]        ‖
    φ = ‖ [ a221 ] − [ s2s1   0   ] [ x3 ] ‖
        ‖ [ a112 ]   [  0    c2c1 ] [ y3 ] ‖
        ‖ [ a212 ]   [  0    c2s1 ]        ‖
        ‖ [ a122 ]   [  0    s2c1 ]        ‖
        ‖ [ a222 ]   [  0    s2s1 ]        ‖2

where

    [ x3 ]       [ c3 ]       [ cos(θ3) ]
    [ y3 ] = σ · [ s3 ] = σ · [ sin(θ3) ]

Idea: Improve σ and θ3 by minimizing with respect to x3 and y3, holding θ1 and θ2 fixed.
Componentwise Optimization
A Common Framework for Tensor-Related Optimization
Choose a subset of the unknowns such that if they are (temporarily) fixed, then we are presented with some standard matrix problem in the remaining unknowns.
By choosing different subsets, cycle through all the unknowns.
Repeat until converged.
In tensor computations, the "standard matrix problem" that we end up solving is usually the linear least squares problem. In that case, the overall solution process is referred to as alternating least squares.
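For the nearest rank-1 problem above, the componentwise scheme can be sketched in NumPy. Each inner step below is the closed-form solution of the linear least squares problem obtained by freezing two of the unit vectors: a modal unfolding times a Kronecker product of the frozen vectors. The helper name, fixed iteration count, and starting vectors are arbitrary choices for this sketch:

```python
import numpy as np

# Alternating scheme for the nearest rank-1 problem on an n1-by-n2-by-n3
# tensor.  Freezing two unit vectors reduces the fit to a linear LS problem
# solved in closed form via a modal unfolding.
def nearest_rank1(A, iters=50):
    n1, n2, n3 = A.shape
    A1 = A.reshape(n1, n2*n3, order="F")                     # mode-1 unfolding
    A2 = np.moveaxis(A, 1, 0).reshape(n2, n1*n3, order="F")  # mode-2 unfolding
    A3 = np.moveaxis(A, 2, 0).reshape(n3, n1*n2, order="F")  # mode-3 unfolding
    u = np.ones(n1) / np.sqrt(n1)
    v = np.ones(n2) / np.sqrt(n2)
    w = np.ones(n3) / np.sqrt(n3)
    for _ in range(iters):
        x = A1 @ np.kron(w, v); u = x / np.linalg.norm(x)   # improve u, freezing v, w
        x = A2 @ np.kron(w, u); v = x / np.linalg.norm(x)   # improve v, freezing u, w
        x = A3 @ np.kron(v, u); w = x / np.linalg.norm(x)   # improve w, freezing u, v
    return np.linalg.norm(x), u, v, w

# Demo on an exactly rank-1 tensor (made-up data), where the fit is exact.
u0, v0, w0 = np.array([0.6, 0.8]), np.array([0.8, 0.6]), np.array([0.0, 1.0])
T = 3.0 * np.einsum('i,j,k->ijk', u0, v0, w0)
sigma, u, v, w = nearest_rank1(T)
```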
Bridging the Gap via Tensor Unfoldings
A Common Framework for Tensor Computations...
1. Turn tensor A into a matrix A.
2. Through matrix computations, discover things about A.
3. Draw conclusions about tensor A based on what is learned about matrix A.
Tensor Unfoldings
Some Tensor Unfoldings of A ∈ IR^{4×3×2}

    A(1) = [ a111 a121 a131 a112 a122 a132 ]
           [ a211 a221 a231 a212 a222 a232 ]
           [ a311 a321 a331 a312 a322 a332 ]
           [ a411 a421 a431 a412 a422 a432 ]

    A(2) = [ a111 a211 a311 a411 a112 a212 a312 a412 ]
           [ a121 a221 a321 a421 a122 a222 a322 a422 ]
           [ a131 a231 a331 a431 a132 a232 a332 a432 ]

    A(3) = [ a111 a211 a311 a411 a121 a221 a321 a421 a131 a231 a331 a431 ]
           [ a112 a212 a312 a412 a122 a222 a322 a422 a132 a232 a332 a432 ]

A "tensor unfolding" is sometimes referred to as a "tensor flattening" or a "tensor matricization."
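In NumPy, a mode-k unfolding matching the layouts above (the remaining indices ordered first-index-fastest) is one `moveaxis` plus a column-major reshape; the helper name `modal_unfold` is made up for this sketch, and k is 0-based, unlike the slides:

```python
import numpy as np

# Mode-k unfolding: mode-k fibers become the columns of an n_k-by-(N/n_k)
# matrix, remaining indices ordered first-index-fastest.
def modal_unfold(A, k):
    nk = A.shape[k]
    return np.moveaxis(A, k, 0).reshape(nk, -1, order="F")

# The 4-by-3-by-2 example: with this column-major arange the entry values
# encode their own subscripts, A[i1,i2,i3] = i1 + 4*i2 + 12*i3 (0-based).
A = np.arange(24, dtype=float).reshape(4, 3, 2, order="F")
A1 = modal_unfold(A, 0)   # 4-by-6
A2 = modal_unfold(A, 1)   # 3-by-8
A3 = modal_unfold(A, 2)   # 2-by-12
```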
Tensor Unfoldings: How They Might Show Up
Gradients of Multilinear Forms
If f : IR^{n1} × IR^{n2} × IR^{n3} → IR is defined by

    f(u, v, w) = Σ_{i1=1}^{n1} Σ_{i2=1}^{n2} Σ_{i3=1}^{n3} A(i1, i2, i3) u(i1) v(i2) w(i3)

and A ∈ IR^{n1×n2×n3}, then its gradient is given by

    ∇f(u, v, w) = [ A(1) (w ⊗ v) ]
                  [ A(2) (w ⊗ u) ]
                  [ A(3) (v ⊗ u) ]
where A(1), A(2), and A(3) are the modal unfoldings of A.
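The gradient formula can be checked against finite differences; a NumPy sketch with arbitrary sizes and random data:

```python
import numpy as np

# Numerical check of the u-component of the gradient of the trilinear form
# f(u,v,w) = sum A(i1,i2,i3) u(i1) v(i2) w(i3), which should be A_(1)(w kron v).
rng = np.random.default_rng(2)
n1, n2, n3 = 3, 4, 2
A = rng.standard_normal((n1, n2, n3))
u = rng.standard_normal(n1)
v = rng.standard_normal(n2)
w = rng.standard_normal(n3)

def f(uu):
    return np.einsum('ijk,i,j,k->', A, uu, v, w)

A1 = A.reshape(n1, n2*n3, order="F")   # mode-1 unfolding, columns i2-fastest
grad_u = A1 @ np.kron(w, v)

# central finite differences in each coordinate direction
h = 1e-6
fd = np.array([(f(u + h*e) - f(u - h*e)) / (2*h) for e in np.eye(n1)])
```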
More General Unfoldings
Example: A = A(1:2, 1:3, 1:2, 1:2, 1:3)
Unfold so that modes (1,2,3) index the rows and modes (4,5) index the columns:

    B =          (1,1)   (2,1)   (1,2)   (2,2)   (1,3)   (2,3)
    (1,1,1) [ a11111  a11121  a11112  a11122  a11113  a11123 ]
    (2,1,1) [ a21111  a21121  a21112  a21122  a21113  a21123 ]
    (1,2,1) [ a12111  a12121  a12112  a12122  a12113  a12123 ]
    (2,2,1) [ a22111  a22121  a22112  a22122  a22113  a22123 ]
    (1,3,1) [ a13111  a13121  a13112  a13122  a13113  a13123 ]
    (2,3,1) [ a23111  a23121  a23112  a23122  a23113  a23123 ]
    (1,1,2) [ a11211  a11221  a11212  a11222  a11213  a11223 ]
    (2,1,2) [ a21211  a21221  a21212  a21222  a21213  a21223 ]
    (1,2,2) [ a12211  a12221  a12212  a12222  a12213  a12223 ]
    (2,2,2) [ a22211  a22221  a22212  a22222  a22213  a22223 ]
    (1,3,2) [ a13211  a13221  a13212  a13222  a13213  a13223 ]
    (2,3,2) [ a23211  a23221  a23212  a23222  a23213  a23223 ]
A Block Matrix is an Unfolded Order-4 Tensor
Some Natural Identifications
A block matrix with uniformly sized blocks
    A = [ A11    · · ·  A1,c2  ]
        [  .      .       .    ]     A_{i2,j2} ∈ IR^{r1×c1}
        [ Ar2,1  · · ·  Ar2,c2 ]

has several natural reshapings:

    A_{i2,j2}(i1, j1) ↔ A(i1, i2, j1, j2),   A ∈ IR^{r1×r2×c1×c2}
    A_{i2,j2}(i1, j1) ↔ A(i1, j1, i2, j2),   A ∈ IR^{r1×c1×r2×c2}
Kronecker Products are Unfolded Tensors
The m1m2m3-by-n1n2n3 matrix
A = A1(1:m1, 1:n1)⊗ A2(1:m2, 1:n2)⊗ A3(1:m3, 1:n3)
is a particular unfolding of the m1-by-m2-by-m3-by-n1-by-n2-by-n3
tensor
A(i1, i2, i3, j1, j2, j3) = A1(i1, j1)A2(i2, j2)A3(i3, j3)
Parts of a Tensor
Fibers
A fiber of a tensor A is a vector obtained by fixing all but one of A's indices. For example, if A = A(1:3, 1:5, 1:4, 1:7), then

    A(2, :, 4, 6) = A(2, 1:5, 4, 6) = [ A(2,1,4,6) ]
                                      [ A(2,2,4,6) ]
                                      [ A(2,3,4,6) ]
                                      [ A(2,4,4,6) ]
                                      [ A(2,5,4,6) ]

is a fiber. We adopt the convention that a fiber is a column vector.
The vec Operation
Turns matrices into vectors by stacking columns...
    X = [ 1 10 ]                [  1 ]
        [ 2 20 ]  ⇒  vec(X) =  [  2 ]
        [ 3 30 ]                [  3 ]
                                [ 10 ]
                                [ 20 ]
                                [ 30 ]
The vec Operation
Turns tensors into vectors by stacking mode-1 fibers...
A ∈ IR^{2×3×2}  ⇒

    vec(A) = [ A(:,1,1); A(:,2,1); A(:,3,1); A(:,1,2); A(:,2,2); A(:,3,2) ]
           = [ a111, a211, a121, a221, a131, a231, a112, a212, a122, a222, a132, a232 ]^T
We need to specify the order of the stacking...
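The stacking order above is exactly column-major order, so in NumPy vec is a single `order="F"` flatten (MATLAB's `A(:)` behaves the same way); a small sketch with made-up data:

```python
import numpy as np

# vec by stacking mode-1 fibers is a column-major flatten in NumPy.
# With this column-major arange, A[i,j,k] = i + 2*j + 6*k (0-based).
A = np.arange(12).reshape(2, 3, 2, order="F")
v = A.reshape(-1, order="F")   # vec(A)
```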
Parts of a Tensor
Slices
A slice of a tensor A is a matrix obtained by fixing all but two of A's indices. For example, if A = A(1:3, 1:5, 1:4, 1:7), then

    A(:, 3, :, 6) = [ A(1,3,1,6)  A(1,3,2,6)  A(1,3,3,6)  A(1,3,4,6) ]
                    [ A(2,3,1,6)  A(2,3,2,6)  A(2,3,3,6)  A(2,3,4,6) ]
                    [ A(3,3,1,6)  A(3,3,2,6)  A(3,3,3,6)  A(3,3,4,6) ]

is a slice. We adopt the convention that the first unfixed index in the tensor is the row index of the slice and the second unfixed index in the tensor is the column index of the slice.
Tensor Notation: Subscript Vectors
Referencing a Tensor Entry
If A ∈ IR^{n1×···×nd} and i = (i1, . . . , id) with 1 ≤ ik ≤ nk for k = 1:d, then

    A(i) ≡ A(i1, . . . , id)

We say that i is a subscript vector. Bold font will be used to designate subscript vectors.
Bounding Subscript Vectors
If L and R are subscript vectors having the same dimension, thenL ≤ R means that Lk ≤ Rk for all k.
Special Subscript Vectors
The subscript vector of all ones is denoted by 1. (Dimension clear from context.) If N is an integer, then N = N · 1.
Subtensors
Specification of Subtensors
Suppose A ∈ IR^{n1×···×nd} with n = [ n1, . . . , nd ]. If 1 ≤ L ≤ R ≤ n, then A(L:R) denotes the subtensor
B = A(L1:R1, . . . , Ld :Rd)
The Index-Mapping Function col
Definition
If i and n are length-d subscript vectors, then the integer-valued function col(i, n) is defined by
col(i,n) = i1 + (i2 − 1)n1 + (i3 − 1)n1n2 + · · ·+ (id − 1)n1 · · · nd−1
The Formal Specification of Vec
If A ∈ IR^{n1×···×nd} and v = vec(A), then

    v(col(i, n)) = A(i),   1 ≤ i ≤ n
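A 0-based sketch of the col map and the vec identity (with 0-based indices the "−1"s in the slide's formula disappear; data and the test subscript are arbitrary):

```python
import numpy as np

# col(i,n) = i1 + i2*n1 + i3*n1*n2 + ...  (0-based version)
def col(i, n):
    c, stride = 0, 1
    for ik, nk in zip(i, n):
        c += ik * stride
        stride *= nk
    return c

n = (3, 5, 4)
A = np.random.default_rng(3).standard_normal(n)
v = A.reshape(-1, order="F")   # vec(A)
i = (2, 4, 1)                  # an arbitrary subscript vector
```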
The Importance of Notation
Contractions
    A(i1, i2, j1, j2, j3) = Σ_{k1} Σ_{k2} B(i1, i2, k1, k2) C(k1, k2, j1, j2, j3)

is better written as

    A(i, j) = Σ_k B(i, k) C(k, j)
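With the multi-indices grouped, the contraction is a plain matrix multiply of unfoldings; in NumPy, `einsum` expresses the index-level version and reshapes give the grouped version (sizes are arbitrary, and the grouping below uses NumPy's default row-major reshape, so the index blocks just need to be grouped consistently):

```python
import numpy as np

# A(i,j) = sum_k B(i,k) C(k,j) with i=(i1,i2), k=(k1,k2), j=(j1,j2,j3).
rng = np.random.default_rng(4)
B = rng.standard_normal((2, 3, 4, 5))       # B(i1,i2,k1,k2)
C = rng.standard_normal((4, 5, 2, 2, 3))    # C(k1,k2,j1,j2,j3)

A = np.einsum('abkl,klcde->abcde', B, C)    # index-level contraction

Bm = B.reshape(6, 20)    # rows = grouped (i1,i2), cols = grouped (k1,k2)
Cm = C.reshape(20, 12)   # rows = grouped (k1,k2), cols = grouped (j1,j2,j3)
Am = Bm @ Cm             # the same contraction as one matrix product
```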
Software: The Matlab Tensor Toolbox
Why?
It provides an excellent environment for learning about tensorcomputations and for building a facility with multi-index reasoning.
Who?
Brett W. Bader and Tamara G. Kolda, Sandia National Laboratories
Where?
http://csmr.ca.sandia.gov/∼tkolda/TensorToolbox/
Matlab Tensor Toolbox: A Tensor is a Structure
>> n = [3 5 4 7];
>> A_array = randn(n);
>> A = tensor(A_array );
>> fieldnames(A)
ans =
’data’
’size’
The .data field is a multi-dimensional array.
The .size field is an integer vector that specifies the modal dimensions.
A.data and A_array have the same value.
A.size and n have the same value.
Matlab Tensor Toolbox: Order and Dimension
>> n = [3 5 4 7];
>> A_array = randn(n);
>> A = tensor(A_array)
>> d = ndims(A)
d =
4
>> n = size(A)
n =
3 5 4 7
Matlab Tensor Toolbox: Tensor Operations
n = [3 5 4 7];
% Familiar initializations ...
X = tenzeros(n); Y = tenones(n);
A = tenrand(n); B = tenrand(n);
% Extracting Fibers and Slices
a_Fiber = A(2,:,3,6); a_Slice = A(3,:,2,:);
% These operations are legal ...
C = 3*A; C = -A; C = A+1; C = A.^2;
C = A + B; C = A./B; C = A.*B; C = A.^B;
% Frobenius norm ...
err = norm(A);
% Applying a Function to Each Entry ...
F = tenfun(@sin ,A);
G = tenfun(@(x) sin(x)./exp(x),A);
The Matlab Tensor Toolbox
Problem 1. Suppose A = A(1:n1, 1:n2, 1:n3). We say A(i) is an interior entry if 1 < i < n; otherwise, we say A(i) is an edge entry. If A(i1, i2, i3) is an interior entry, then it has six neighbors: A(i1 ± 1, i2, i3), A(i1, i2 ± 1, i3), and A(i1, i2, i3 ± 1). Implement the following function so that it performs as specified:
function B = Smooth(A)
% A is a third order tensor
% B is a third order tensor with the property that each
% interior entry B(i) is the average of A(i)’s six
% neighbors. If B(i) is an edge entry, then B(i) = A(i).
Strive for an implementation that does not have any loops.
Problem 2. Formulate and solve an order-d version of Problem 1.
Modal Unfoldings Using tenmat
Example: A Mode-2 Unfolding of A ∈ IR^{4×3×2}

tenmat(A,2) sets up A(2):

    [ a111 a211 a311 a411 a112 a212 a312 a412 ]
    [ a121 a221 a321 a421 a122 a222 a322 a422 ]
    [ a131 a231 a331 a431 a132 a232 a332 a432 ]
     (1,1) (2,1) (3,1) (4,1) (1,2) (2,2) (3,2) (4,2)
Notice how the fibers are ordered.
Modal Unfoldings Using tenmat
Precisely how are the fibers ordered?
If A ∈ IR^{n1×···×nd}, N = n1 · · · nd, and B = tenmat(A,k), then B is the matrix A(k) ∈ IR^{nk×(N/nk)} with

    A(k)(ik, col(i′, n′)) = A(i)

where

    i′ = [ i1, . . . , ik−1, ik+1, . . . , id ]
    n′ = [ n1, . . . , nk−1, nk+1, . . . , nd ]

Recall that the col function maps multi-indices to integers. For example, if n = [2 3 2] then

    i          (1,1,1) (2,1,1) (1,2,1) (2,2,1) (1,3,1) (2,3,1) (1,1,2) (2,1,2) ...
    col(i, n)     1       2       3       4       5       6       7       8    ...
Matlab Tensor Toolbox: A Tenmat Is a Structure
>> n = [3 5 4 7];
>> A = tenrand(n);
>> A2 = tenmat(A,2);
>> fieldnames(A2)
ans =
’tsize ’
’rindices ’
’cindices ’
’data’
A2.tsize    = size of the unfolded tensor = [3 5 4 7]
A2.rindices = mode indices that define A2's rows = [2]
A2.cindices = mode indices that define A2's columns = [1 3 4]
A2.data     = the matrix that is the unfolded tensor
Other Modal Unfoldings Using tenmat
Mode-k Unfoldings of A ∈ IR^{n1×···×nd}

A particular mode-k unfolding is defined by a permutation v of [ 1:k−1 k+1:d ]. Tensor entries get mapped to matrix entries as follows:

    A(i) → A(ik, col(i(v), n(v)))

    Name                v                       How to get it...
    A(k) (Default)      [ 1:k−1 k+1:d ]         tenmat(A,k)
    Forward Cyclic      [ k+1:d 1:k−1 ]         tenmat(A,k,'fc')
    Backward Cyclic     [ k−1:−1:1 d:−1:k+1 ]   tenmat(A,k,'bc')
    Arbitrary           v                       tenmat(A,k,v)
Using tenmat
Problem 3. Write a Matlab function alpha = MaxFnorm(A) that returns the maximum value of ‖ B ‖_F where B is a slice of the input tensor A ∈ IR^{n1×···×nd}.
Summary
Key Words
Reshaping is about turning a matrix into a differently sized matrix, or into a tensor, or into a vector.
The Kronecker product is an operation between two matricesthat produces a highly structured block matrix.
A rank-1 tensor can be reshaped into a multiple Kronecker product of vectors.
A tensor decomposition expresses a given tensor in terms ofsimpler tensors.
The alternating least squares approach to multilinear LS is basedon solving a sequence of linear LS problems, each obtained byfreezing all but a subset of the unknowns.
Summary
Key Words
A block matrix is a matrix whose entries are matrices.
A fiber of a tensor is a column vector defined by fixing all butone index and varying what’s left. A slice of a tensor is a matrixdefined by fixing all but two indices and varying what’s left.
A mode-k unfolding of a tensor is obtained by assembling all themode-k fibers into a matrix.
tensor is a Tensor Toolbox command used to construct tensors.
tenmat is a Tensor Toolbox command that is used to constructunfoldings of a tensor.
Summary
Recurring themes...
Given a tensor A ∈ IR^{n1×···×nd}, there are many ways to unfold it into a matrix A ∈ IR^{N1×N2} where N1 N2 = n1 · · · nd.
For example, A could be a block matrix whose entries are A-slices.
A facility with block matrices and tensor indexing is required tounderstand the layout possibilities.
Computations with the unfolded tensor frequently involve theKronecker product.
References
L. De Lathauwer and B. De Moor (1998). "From Matrix to Tensor: Multilinear Algebra and Signal Processing," in Mathematics in Signal Processing IV, J. McWhirter and I. Proudler, eds., Clarendon Press, Oxford, 1–15.
G.H. Golub and C. Van Loan (2013). Matrix Computations, 4th Edition, JohnsHopkins University Press, Baltimore, MD.
T.G. Kolda and B.W. Bader (2009). “Tensor Decompositions and Applications,”SIAM Review 51, 455–500.
P.M. Kroonenberg (2008). Applied Multiway Data Analysis, Wiley.
A. Smilde, R. Bro, and P. Geladi (2004). Multi-Way Analysis: Applications in theChemical Sciences, Wiley, West Sussex, England.
B.W. Bader and T.G. Kolda (2006). “Algorithm 862: MATLAB Tensor Classes forFast Algorithm Prototyping,” ACM Transactions on Mathematical Software,32, 635–653.