Factorization Machine
Weike Pan
College of Computer Science and Software Engineering, Shenzhen University
W.K. Pan (CSSE, SZU) FM 1 / 23
Outline
1 Notations and Problem Definition
2 Method
    Factorization Machine for Ratings
    Factorization Machine for Ratings and Content
3 Discussion
4 Conclusion
5 References
Notations and Problem Definition
Reference
libFM [Rendle, 2012]
Notations (1/4)
Table: Some notations.
$n$                                                   number of users
$m$                                                   number of items
$u \in \{1, 2, \ldots, n\}$                           user ID
$i, i' \in \{1, 2, \ldots, m\}$                       item ID
$r_{ui}$                                              observed rating of user $u$ on item $i$
$\mathbb{G}$, e.g., $\mathbb{G} = \{1, 2, 3, 4, 5\}$  grade score set (or rating range)
$R \in \{\mathbb{G} \cup ?\}^{n \times m}$            rating matrix (or explicit feedback)
$y_{ui} \in \{0, 1\}$                                 $y_{ui} = 1$ if $(u, i, r_{ui})$ is observed, and $y_{ui} = 0$ if it is not observed
$\mathcal{R} = \{(u, i, r_{ui})\}$                    observed rating records (training data)
$p = \sum_{u,i} y_{ui} = |\mathcal{R}|$               number of observed ratings
$p/(nm)$                                              density (sometimes called sparsity)
Notations (2/4)
Table: Some notations.
$\mu \in \mathbb{R}$                          global average rating value
$b_u \in \mathbb{R}$                          user bias
$b_i \in \mathbb{R}$                          item bias
$d \in \mathbb{N}$                            number of latent dimensions
$U_{u\cdot} \in \mathbb{R}^{1 \times d}$      user-specific latent feature vector
$U \in \mathbb{R}^{n \times d}$               user-specific latent feature matrix
$V_{i\cdot} \in \mathbb{R}^{1 \times d}$      item-specific latent feature vector
$V \in \mathbb{R}^{m \times d}$               item-specific latent feature matrix
$\mathcal{R}^{te} = \{(u, i, r_{ui})\}$       rating records of test data
$\hat{r}_{ui}$                                predicted rating of user $u$ on item $i$
$T$                                           iteration number in the algorithm
Notations (3/4)
Table: Some notations.
$f_i \in \mathbb{R}^{f \times 1}$             description of item $i$
Notations (4/4)
Table: Some notations.
$w_0 \in \mathbb{R}$                          model parameter for zero-order interaction
$w \in \mathbb{R}^{z \times 1}$               model parameter for first-order interaction
$P \in \mathbb{R}^{z \times d}$               model parameter for second-order interaction
Method Factorization Machine for Ratings
Representation (1/3)
Representation (2/3)
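The original rating matrix $R \in \{\mathbb{G} \cup ?\}^{n \times m}$ is represented by a design matrix $X$ and a rating vector $r$,
\[ X \in \{0,1\}^{p \times (n+m)}, \quad r \in \mathbb{G}^{p \times 1} \tag{1} \]
where $p = \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui}$ is the number of ratings in $R$.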
Representation (3/3)
For a rating record $(u, i, r_{ui})$:
The corresponding row of $X$ is
\[ x = [\, 0 \;\cdots\; 0 \;\underbrace{1}_{u}\; 0 \;\cdots\; 0 \;\underbrace{1}_{n+i}\; 0 \;\cdots\; 0 \,] \in \{0,1\}^{1 \times (n+m)} \]
where the $u$th and $(n+i)$th entries are 1s, i.e., $x_u = x_{n+i} = 1$, and all other entries are 0s.
The corresponding entry of $r$ is $r_{ui}$.
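The representation above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original slides: the function name `build_design_matrix` and the toy rating records are assumptions made for the example, and IDs are taken to be 1-based as in the notation table.

```python
import numpy as np

def build_design_matrix(records, n, m):
    """Turn rating records (u, i, r_ui) with 1-based IDs into (X, r)."""
    p = len(records)
    X = np.zeros((p, n + m), dtype=np.int8)
    r = np.zeros(p)
    for row, (u, i, r_ui) in enumerate(records):
        X[row, u - 1] = 1        # the u-th entry (1-based) marks the user
        X[row, n + i - 1] = 1    # the (n+i)-th entry (1-based) marks the item
        r[row] = r_ui
    return X, r

# Toy data, made up for illustration: n = 3 users, m = 4 items, p = 3 ratings.
records = [(1, 2, 5), (3, 1, 2), (2, 4, 4)]
X, r = build_design_matrix(records, n=3, m=4)
print(X)
print(r)
```

Each row of `X` contains exactly two 1s, one in the user block and one in the item block. A dense array is used only for readability; for realistic n and m a sparse format would be the natural choice.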
Prediction Rule (1/3)
The prediction rule for user $u$ on item $i$ is
\[ \hat{r}_{ui} = w_0 + \sum_{j=1}^{n+m} w_j x_j + \sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} w_{jj'} \tag{2} \]
where $w_0, w_j, w_{jj'} \in \mathbb{R}$.
The above prediction rule includes zero-, first- and second-order interactions.
Prediction Rule (2/3)
The second-order interaction is usually approximated via the inner product of two vectors,
\[ w_{jj'} = P_{j\cdot} P_{j'\cdot}^T \tag{3} \]
where $P_{j\cdot}, P_{j'\cdot} \in \mathbb{R}^{1 \times d}$.
Prediction Rule (3/3)
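For the rating data only (without content information), we have
\[
\begin{aligned}
\hat{r}_{ui} &= w_0 + \sum_{j=1}^{n+m} w_j x_j + \sum_{j=1}^{n+m} \sum_{j'=j+1}^{n+m} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T \\
&= w_0 + w_u + w_{n+i} + P_{u\cdot} P_{n+i,\cdot}^T \\
&\Rightarrow \mu + b_u + b_i + U_{u\cdot} V_{i\cdot}^T
\end{aligned}
\]
where $P = [U^T \; V^T]^T \in \mathbb{R}^{(n+m) \times d}$.
Observation: with ratings only, FM is equivalent to RSVD.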
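The equivalence can be checked numerically. The sketch below is illustrative only: the sizes, the random parameters and the helper name `fm_predict_naive` are assumptions made for the example. It evaluates the double sum of Eq. (2) with the factorization of Eq. (3) on a one-hot row and compares the result with the bias-plus-inner-product form.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d = 3, 4, 2                  # toy sizes, assumed for illustration

w0 = rng.normal()
w = rng.normal(size=n + m)
P = rng.normal(size=(n + m, d))

def fm_predict_naive(x, w0, w, P):
    """Eq. (2) with w_jj' = P_j. P_j'.^T, using the explicit double sum."""
    z = len(x)
    pred = w0 + w @ x
    for j in range(z):
        for j2 in range(j + 1, z):
            pred += x[j] * x[j2] * (P[j] @ P[j2])
    return pred

u, i = 2, 3                        # 1-based user and item IDs
x = np.zeros(n + m)
x[u - 1] = 1.0
x[n + i - 1] = 1.0

# Read the biases and latent factors off the FM parameters: P = [U^T V^T]^T.
mu, b_user, b_item = w0, w[:n], w[n:]
U, V = P[:n], P[n:]
rsvd = mu + b_user[u - 1] + b_item[i - 1] + U[u - 1] @ V[i - 1]

fm = fm_predict_naive(x, w0, w, P)
print(np.isclose(fm, rsvd))
```

Only the pair (u, n+i) contributes to the double sum, because every other product x_j x_j' is zero.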
Question
Why do we need FM?
Method Factorization Machine for Ratings and Content
Representation
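When we have some content information about each item or each user (e.g., an item’s description or a user’s profile), the corresponding features are appended to the row $x$; for example, appending an item description $f_i \in \mathbb{R}^{f \times 1}$ gives a row of length $n + m + f$.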
Prediction Rule
The prediction rule for user $u$ on item $i$ is
\[ \hat{r}_{ui} = w_0 + \sum_{j=1}^{z} w_j x_j + \sum_{j=1}^{z} \sum_{j'=j+1}^{z} x_j x_{j'} P_{j\cdot} P_{j'\cdot}^T \tag{4} \]
where $z = n + m + f$.
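Evaluated literally, the double sum in Eq. (4) costs O(z²d) per prediction. A standard algebraic identity, not stated on the slides, rewrites the pairwise term as ½ Σ_k [(Σ_j x_j P_jk)² − Σ_j x_j² P_jk²], which costs O(zd). The sketch below compares both forms on a row that also carries content features; the sizes, the random item description `f_i` and the function names are assumptions made for the example.

```python
import numpy as np

def fm_predict_naive(x, w0, w, P):
    """Eq. (4) evaluated literally with the double sum."""
    z = len(x)
    pred = w0 + w @ x
    for j in range(z):
        for j2 in range(j + 1, z):
            pred += x[j] * x[j2] * (P[j] @ P[j2])
    return pred

def fm_predict_fast(x, w0, w, P):
    """Same value via the O(z d) rewriting of the pairwise term."""
    s = x @ P                      # shape (d,): sum_j x_j P_jk
    s_sq = (x ** 2) @ (P ** 2)     # shape (d,): sum_j x_j^2 P_jk^2
    return w0 + w @ x + 0.5 * np.sum(s ** 2 - s_sq)

rng = np.random.default_rng(1)
n, m, f, d = 3, 4, 5, 2            # toy sizes, assumed for illustration
z = n + m + f
w0 = rng.normal()
w = rng.normal(size=z)
P = rng.normal(size=(z, d))

u, i = 1, 4                        # 1-based user and item IDs
f_i = rng.random(f)                # hypothetical content description of item i
x = np.zeros(z)
x[u - 1] = 1.0
x[n + i - 1] = 1.0
x[n + m:] = f_i                    # content features occupy the last f entries

naive = fm_predict_naive(x, w0, w, P)
fast = fm_predict_fast(x, w0, w, P)
print(np.isclose(naive, fast))
```

With real-valued content features, interactions between the user, the item and each feature all enter the pairwise term, which is what distinguishes this case from the ratings-only model.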
Objective Function
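We have the objective function,
\[ \min_{\Theta} \sum_{u=1}^{n} \sum_{i=1}^{m} y_{ui} \left[ \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(w, P) \right] \]
where $\mathrm{reg}(w, P) = \frac{\alpha_w}{2} \sum_{j=1}^{z} \delta(x_j \neq 0)\, w_j^2 + \frac{\alpha_p}{2} \sum_{j=1}^{z} \delta(x_j \neq 0)\, \|P_{j\cdot}\|^2$ is the regularization term used to avoid overfitting, and $\Theta = \{w_0, w, P\}$ are the model parameters to be learned.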
Gradient
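Denoting $f_{ui} = \frac{1}{2} (r_{ui} - \hat{r}_{ui})^2 + \mathrm{reg}(w, P)$, for each $(u, i, r_{ui}) \in \mathcal{R}$, we have the gradients,
\[ \nabla w_0 = \frac{\partial f_{ui}}{\partial w_0} = -e_{ui} \tag{5} \]
\[ \nabla w_j = \frac{\partial f_{ui}}{\partial w_j} = -e_{ui} x_j + \alpha_w w_j, \quad \forall x_j \neq 0 \tag{6} \]
\[ \nabla P_{j\cdot} = \frac{\partial f_{ui}}{\partial P_{j\cdot}} = -e_{ui} x_j \sum_{j'=1,\, j' \neq j}^{z} x_{j'} P_{j'\cdot} + \alpha_p P_{j\cdot}, \quad \forall x_j \neq 0 \tag{7} \]
where $e_{ui} = r_{ui} - \hat{r}_{ui}$.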
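One way to gain confidence in Eqs. (5)-(7) is to compare them with central finite differences of f_ui. The sketch below does this for one entry of w and one entry of P; all names, sizes and values are assumptions made for the example, and the prediction is computed directly from Eq. (4) over the strict upper triangle.

```python
import numpy as np

def predict(x, w0, w, P):
    """Eq. (4): pairwise terms summed over the strict upper triangle."""
    pair = np.outer(x, x) * (P @ P.T)
    return w0 + w @ x + np.sum(np.triu(pair, k=1))

def loss(x, r, w0, w, P, alpha_w, alpha_p):
    """f_ui: squared error plus regularization on the active features only."""
    active = x != 0
    reg = 0.5 * alpha_w * np.sum(w[active] ** 2) \
        + 0.5 * alpha_p * np.sum(P[active] ** 2)
    return 0.5 * (r - predict(x, w0, w, P)) ** 2 + reg

def gradients(x, r, w0, w, P, alpha_w, alpha_p):
    """Eqs. (5)-(7); entries with x_j = 0 are left at zero."""
    e = r - predict(x, w0, w, P)
    s = x @ P                                  # sum over all j' of x_j' P_j'.
    g_w0 = -e
    g_w = np.zeros_like(w)
    g_P = np.zeros_like(P)
    for j in np.flatnonzero(x):
        g_w[j] = -e * x[j] + alpha_w * w[j]
        # s - x_j P_j. is the sum over j' != j required by Eq. (7)
        g_P[j] = -e * x[j] * (s - x[j] * P[j]) + alpha_p * P[j]
    return g_w0, g_w, g_P

rng = np.random.default_rng(2)
z, d = 6, 3                                    # toy sizes (assumed)
x = np.array([1.0, 0.0, 1.0, 0.0, 0.5, 0.0])   # made-up feature row
r, w0 = 4.0, 0.1
w = rng.normal(size=z)
P = rng.normal(size=(z, d))
alpha_w, alpha_p = 0.01, 0.01

g_w0, g_w, g_P = gradients(x, r, w0, w, P, alpha_w, alpha_p)

eps = 1e-6
j, k = 4, 1                                    # an active feature and a latent dimension
P_hi, P_lo = P.copy(), P.copy()
P_hi[j, k] += eps
P_lo[j, k] -= eps
num_P = (loss(x, r, w0, w, P_hi, alpha_w, alpha_p)
         - loss(x, r, w0, w, P_lo, alpha_w, alpha_p)) / (2 * eps)

w_hi, w_lo = w.copy(), w.copy()
w_hi[j] += eps
w_lo[j] -= eps
num_w = (loss(x, r, w0, w_hi, P, alpha_w, alpha_p)
         - loss(x, r, w0, w_lo, P, alpha_w, alpha_p)) / (2 * eps)

print(abs(num_P - g_P[j, k]), abs(num_w - g_w[j]))
```

Both printed differences should be close to zero, since f_ui is quadratic in any single parameter and the central difference is then exact up to rounding.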
Update Rule
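For each $(u, i, r_{ui}) \in \mathcal{R}$, we have the update rules,
\[ w_0 = w_0 - \gamma \nabla w_0 \tag{8} \]
\[ w_j = w_j - \gamma \nabla w_j, \quad \forall x_j \neq 0 \tag{9} \]
\[ P_{j\cdot} = P_{j\cdot} - \gamma \nabla P_{j\cdot}, \quad \forall x_j \neq 0 \tag{10} \]
where $\gamma$ is the learning rate.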
Algorithm
Initialize model parameters Θ
for t = 1, . . . , T do
    for t2 = 1, . . . , p do
        Randomly pick a rating from R
        Calculate the gradients in Eqs. (5)-(7)
        Update the parameters in Eqs. (8)-(10)
    end for
    Decrease the learning rate γ ← γ × 0.9
end for
Figure: The SGD algorithm for FM.
Note that the above algorithm is slightly different from that in libFM [Rendle, 2012].
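The SGD loop above could be written as follows. This is a minimal sketch rather than the libFM implementation: the initialization (w0 set to the mean rating, small random P), the hyperparameter values and the toy rating records are all assumptions made for the example.

```python
import numpy as np

def predict(x, w0, w, P):
    """Eq. (4): pairwise terms summed over the strict upper triangle."""
    pair = np.outer(x, x) * (P @ P.T)
    return w0 + w @ x + np.sum(np.triu(pair, k=1))

def rmse(X, r, w0, w, P):
    preds = np.array([predict(x, w0, w, P) for x in X])
    return np.sqrt(np.mean((r - preds) ** 2))

def sgd_fm(X, r, d=2, T=50, gamma=0.1, alpha_w=0.01, alpha_p=0.01, seed=0):
    rng = np.random.default_rng(seed)
    p, z = X.shape
    w0 = r.mean()                        # assumed initialization
    w = np.zeros(z)
    P = 0.01 * rng.normal(size=(z, d))   # assumed initialization
    for _ in range(T):
        for _ in range(p):
            row = rng.integers(p)        # randomly pick a rating record
            x = X[row]
            e = r[row] - predict(x, w0, w, P)
            s = x @ P                    # computed before any update
            w0 -= gamma * (-e)                                            # Eqs. (5), (8)
            for j in np.flatnonzero(x):
                grad_w = -e * x[j] + alpha_w * w[j]                       # Eq. (6)
                grad_P = -e * x[j] * (s - x[j] * P[j]) + alpha_p * P[j]   # Eq. (7)
                w[j] -= gamma * grad_w                                    # Eq. (9)
                P[j] -= gamma * grad_P                                    # Eq. (10)
        gamma *= 0.9                     # decrease the learning rate
    return w0, w, P

# Toy ratings, made up for illustration: n = 3 users, m = 3 items, 1-based IDs.
n, m = 3, 3
records = [(1, 1, 5), (1, 2, 4), (2, 1, 2), (2, 3, 1), (3, 2, 4), (3, 3, 3)]
X = np.zeros((len(records), n + m))
r = np.zeros(len(records))
for row, (u, i, r_ui) in enumerate(records):
    X[row, u - 1] = 1.0
    X[row, n + i - 1] = 1.0
    r[row] = r_ui

rmse_before = np.sqrt(np.mean((r - r.mean()) ** 2))   # mean-rating baseline
w0, w, P = sgd_fm(X, r)
rmse_after = rmse(X, r, w0, w, P)
print(rmse_before, rmse_after)
```

The error and the sum `s` are computed once per sampled record, before any parameter is changed, so all gradients in one step refer to the same parameter values.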
Discussion
Can FM incorporate an item’s taxonomy information?
Can FM incorporate a user’s profile?
Can FM incorporate a user’s social connections?
Can FM incorporate context information such as location and time?
Conclusion
FM can incorporate auxiliary data seamlessly.
Homework
Use the libFM software at http://www.libfm.org/
Read the libFM paper [Rendle, 2012]
Read chapter 3 of Recommender Systems: An Introduction
References
Rendle, S. (2012). Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology, 3(3):57:1–57:22.