Relational learning using bilinear models and its application in E-commerce

Shenghuo Zhu
Alibaba Group
May 2, 2015

Collaborators: Rong Jin, Qi Qian, Lijun Zhang, Mehrdad Mahdavi, Tianbao Yang



Recommendation in e-commerce

From http://ju.taobao.com/


Personalized search in e-commerce

From http://tw.taobao.com/


Recommendation to sellers

From Taobao’s seller interface.


Relation and ranking

In the above tasks, we consider the relationship between

  • buyer and item
  • buyer/query and item
  • seller and item
  • seller and buyer
  • ...

and rank items (or buyers) conditioned on a buyer (or a seller, a query-buyer pair).


Relation as function

Buyer and item: u is a buyer, v is an item.

  • Scoring function: $y(v; u)$

Ranking by scores:

  • $y(v_i; u) > y(v_j; u) \implies$ u prefers $v_i$ over $v_j$.


Ranking by segmentation

Assume there are K underlying user segments, and that all users u belonging to segment k share the same scoring function:

  $y(v; u) = g_k(v)$

Example: user u bought item w. Let all users who bought item w form segment k, and let $g_k(v)$ be the preference score of item v in segment k, taken from the segment's purchase history. This is item-based collaborative filtering.
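As a rough illustration of the segment construction above, the sketch below scores items by purchase counts among the buyers of a given item. The toy `purchases` matrix and the helper `segment_scores` are hypothetical, and using raw counts as $g_k(v)$ is an assumption; the slides do not say how the preference scores are computed.

```python
import numpy as np

# Hypothetical toy purchase matrix: rows are users, columns are items.
purchases = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
])

def segment_scores(purchases, w):
    """Return g_k(v) for the segment k made of all users who bought item w.

    Assumption: g_k(v) is the number of segment members who bought item v.
    """
    segment = purchases[:, w] == 1           # users who bought item w
    return purchases[segment].sum(axis=0)    # purchase counts inside the segment

scores = segment_scores(purchases, w=0)
scores[0] = 0                                # skip the item the user already bought
ranking = np.argsort(-scores, kind="stable") # items sorted by decreasing score
```

Here every user who bought item 0 is ranked against the same scores, which is exactly the strict segmentation assumption $y(v; u) = g_k(v)$.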


Ranking using mixture

When user u has bought more than one item, the strict segmentation assumption above has to be relaxed. A common approach is to use similarity between users.

In general terms, the scoring function of u is a linear combination of the $g_k$:

  $y(v; u) = \sum_{k=1}^{K} \beta_k g_k(v),$

where the segmentation is latent.


Ranking as matrix factorization

For each user u,

  $y(v; u) = \sum_k \beta_k g_k(v)$

Each user has its own coefficients $\beta_k = f_k(u)$, so

  $y(v; u) = \sum_k f_k(u) g_k(v)$

Writing $y(v; u)$, $f_k(u)$ and $g_k(v)$ as matrices,

  $Y = G F^\top \approx T$

Convex: use a low-rank constraint on $Y$ [YLZG09].
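The matrix form can be checked numerically. The sketch below is only an illustration with random factors: the sizes and the names `F`, `G`, `n_users`, `n_items` are assumptions, with `F[u, k]` standing for $f_k(u)$ and `G[v, k]` for $g_k(v)$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, K = 5, 7, 2            # hypothetical sizes

F = rng.standard_normal((n_users, K))    # F[u, k] = f_k(u)
G = rng.standard_normal((n_items, K))    # G[v, k] = g_k(v)

Y = G @ F.T                              # Y[v, u] = sum_k f_k(u) g_k(v)

# Compare one entry against the explicit sum over latent segments.
u, v = 3, 4
y_vu = sum(F[u, k] * G[v, k] for k in range(K))
```

Because $Y$ is a product of two factors with K columns each, its rank is at most K, which is why a low-rank constraint on $Y$ is a natural replacement for the explicit factorization.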


Ranking in bilinear model

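User feature $x_u$ and item feature $z_v$:

  $f_k(u) = \langle a_k, x_u \rangle, \quad g_k(v) = \langle b_k, z_v \rangle$

  $y(v; u) = \sum_k f_k(u) g_k(v) = \langle x_u, W z_v \rangle$

where $W = \sum_k a_k b_k^\top$.

Writing $y(v; u)$, $x_u$ and $z_v$ as matrices (the columns of $X$ are the $x_u$, the columns of $Z$ are the $z_v$),

  $Y = X^\top W Z$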
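The identity behind the bilinear form follows from $\langle a_k, x_u \rangle \langle b_k, z_v \rangle = x_u^\top a_k b_k^\top z_v$. A small numerical check, with hypothetical dimensions and random features, might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, K = 6, 4, 2                    # user feature dim, item feature dim, latent segments

a = rng.standard_normal((K, p))      # a[k] plays the role of a_k
b = rng.standard_normal((K, q))      # b[k] plays the role of b_k
W = a.T @ b                          # W = sum_k a_k b_k^T, shape (p, q)

X = rng.standard_normal((p, 5))      # columns are user features x_u
Z = rng.standard_normal((q, 3))      # columns are item features z_v
Y = X.T @ W @ Z                      # Y[u, v] = <x_u, W z_v>

# Compare one entry against the sum of products f_k(u) g_k(v).
u, v = 2, 1
y_uv = sum((a[k] @ X[:, u]) * (b[k] @ Z[:, v]) for k in range(K))
```

Note that $W$ built this way has rank at most K, so the number of latent segments shows up as the rank of the bilinear weight matrix.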


Issues

How to control the complexity of the learning space?

  • Rank of $W$, or nuclear norm $\|W\|_*$.

When the features are high-dimensional, can we take advantage of the low complexity of $W$ to reduce the computational complexity?

  • The model is essentially a linear model:

      $y \equiv \mathrm{vec}(Y) = (Z \otimes X)^\top \mathrm{vec}(W) \equiv x^\top w.$

  • A projection approach for linear models is presented in this talk.
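The vectorized form relies on the standard identity $\mathrm{vec}(X^\top W Z) = (Z \otimes X)^\top \mathrm{vec}(W)$ with column-major $\mathrm{vec}$. The sketch below checks it on random matrices; the sizes and the helper `vec` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 4, 3                                # hypothetical feature dimensions
X = rng.standard_normal((p, 5))            # columns are user features x_u
Z = rng.standard_normal((q, 6))            # columns are item features z_v
W = rng.standard_normal((p, q))

def vec(M):
    """Stack the columns of M into one vector (column-major order)."""
    return M.reshape(-1, order="F")

lhs = vec(X.T @ W @ Z)
rhs = np.kron(Z, X).T @ vec(W)

# For a single (user, item) pair the linear feature is z_v (x) x_u.
u, v = 1, 2
x_pair = np.kron(Z[:, v], X[:, u])
y_pair = x_pair @ vec(W)
```

The pair feature has dimension p·q, which is what makes the linear representation expensive when both feature spaces are large.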


Solve via random projection

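To solve:

  $w_* = \arg\min_w \frac{\lambda}{2}\|w\|^2 + \sum_i \ell(x_i^\top w, y_i).$

Approach

  • Generate a random projection $R$ of rank $m$, and let $\hat{x}_i = R x_i$.
  • Solve:

      $\hat{v} = \arg\min_v \frac{\lambda}{2}\|v\|^2 + \sum_i \ell(\hat{x}_i^\top v, y_i).$

  • Recover: $\hat{w} = R^\top \hat{v}$.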


Issue

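The recovered solution $\hat{w} = R^\top \hat{v}$ is confined to the subspace spanned by the rows of $R$.

Theorem 3 of [ZMJ+13]

For any $0 < \epsilon \le 1/2$, with probability at least $1 - \exp(-(d-r)/32) - \exp(-m/32) - \delta$, we have

  $\|\hat{w} - w_*\|_2 \ge \frac{1}{2}\sqrt{\frac{d-r}{m}}\left(1 - \frac{\epsilon\sqrt{2(1+\epsilon)}}{1-\epsilon}\right)\|w_*\|_2.$

Here $w_*$ is the solution of the original problem, $d$ is the dimension of $x_i$, and $r$ is the rank of the data matrix.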


Dual space

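Dual variable and function

  $\ell_*(\alpha_i, y_i) = \sup_\xi \; \alpha_i \xi - \ell(\xi, y_i)$

Dual problem

  $\alpha_* = \arg\min_\alpha \frac{1}{2\lambda}\alpha^\top X^\top X \alpha + \sum_i \ell_*(\alpha_i, y_i).$

Dual problem after random projection

  $\hat{\alpha} = \arg\min_\alpha \frac{1}{2\lambda}\alpha^\top X^\top R^\top R X \alpha + \sum_i \ell_*(\alpha_i, y_i).$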
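A worked instance may help make the dual concrete. The derivation below assumes the squared loss $\ell(\xi, y) = \frac{1}{2}(\xi - y)^2$, which the slides do not single out, and writes $X = [x_1, \dots, x_n]$ with one column per example.

```latex
\begin{align*}
\ell_*(\alpha_i, y_i)
  &= \sup_\xi \; \alpha_i \xi - \tfrac{1}{2}(\xi - y_i)^2
   = \alpha_i y_i + \tfrac{1}{2}\alpha_i^2
   && \text{(attained at } \xi = y_i + \alpha_i\text{)} \\
0 &= \tfrac{1}{\lambda} X^\top X \alpha_* + \alpha_* + y
   && \text{(stationarity of the dual objective)} \\
\alpha_* &= -\lambda \,(X^\top X + \lambda I)^{-1} y \\
w_* &= -\tfrac{1}{\lambda} X \alpha_*
     = X (X^\top X + \lambda I)^{-1} y
     = (X X^\top + \lambda I)^{-1} X y
\end{align*}
```

The last line is the usual ridge regression solution, and it also satisfies $\alpha_{*,i} = \ell'(x_i^\top w_*, y_i) = x_i^\top w_* - y_i$, the relation between primal and dual solutions that dual recovery uses.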


Proposition

For any $0 < \epsilon \le 1/2$, with probability at least $1 - \delta$, we have

  $\|\hat{\alpha} - \alpha_*\|_K \le \frac{\epsilon}{1-\epsilon}\|\alpha_*\|_K,$

provided $m = \Omega(\epsilon^{-2}\log\delta^{-1})$.


Random Projection Dual Recovery (DuRP)

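  • Generate a random projection $R$ of rank $m$, and let $\hat{x}_i = R x_i$.

  • Solve:

      $\hat{v} = \arg\min_v \frac{\lambda}{2}\|v\|^2 + \sum_i \ell(\hat{x}_i^\top v, y_i).$

  • Obtain dual variables: $\hat{\alpha}_i = \ell'(\hat{x}_i^\top \hat{v}, y_i)$

  • Recover primal solution: $\tilde{w} = -\frac{1}{\lambda}\sum_i \hat{\alpha}_i x_i$.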
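The steps above can be sketched for one concrete loss. The example assumes the squared loss $\ell(\xi, y) = \frac{1}{2}(\xi - y)^2$ (so that every subproblem is a ridge regression with a closed form), a Gaussian projection matrix, and synthetic low-rank data; none of these choices come from the slides, and the helper `ridge` is hypothetical.

```python
import numpy as np

def ridge(X, y, lam):
    """Solve min_w lam/2 ||w||^2 + sum_i 0.5 (x_i^T w - y_i)^2.

    X holds one example per column, so the solution is (X X^T + lam I)^{-1} X y.
    """
    d = X.shape[0]
    return np.linalg.solve(X @ X.T + lam * np.eye(d), X @ y)

rng = np.random.default_rng(0)
d, n, r, m, lam = 300, 200, 3, 60, 1.0     # hypothetical sizes, with m << d

# Synthetic rank-r data matrix (d x n) and noisy linear targets.
X = rng.standard_normal((d, r)) @ rng.standard_normal((r, n))
y = X.T @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

w_star = ridge(X, y, lam)                  # solution of the original problem

# Random projection and the low-dimensional problem.
R = rng.standard_normal((m, d)) / np.sqrt(m)
X_hat = R @ X                              # columns are x_hat_i = R x_i
v_hat = ridge(X_hat, y, lam)

# Naive recovery: stays in the row space of R.
w_naive = R.T @ v_hat

# Dual recovery: alpha_hat_i = l'(x_hat_i^T v_hat, y_i) = x_hat_i^T v_hat - y_i.
alpha_hat = X_hat.T @ v_hat - y
w_durp = -(X @ alpha_hat) / lam            # w = -(1/lam) sum_i alpha_hat_i x_i

def rel_err(w):
    """Relative distance to the solution of the original problem."""
    return np.linalg.norm(w - w_star) / np.linalg.norm(w_star)

print("naive recovery error:", rel_err(w_naive))
print("dual recovery error: ", rel_err(w_durp))
```

On data of this kind the dual recovery lands much closer to $w_*$ than $R^\top \hat{v}$, because it is a combination of the original $x_i$ rather than a vector confined to the random subspace.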


Approximation error of DuRP

Theorem 2 of [ZMJ+13]

For any $0 < \epsilon \le 1/2$, with probability at least $1 - \delta$, we have

  $\|\tilde{w} - w_*\|_2 \le \frac{\epsilon}{1-\epsilon}\|w_*\|_2,$

provided $m \ge \frac{(r+1)\log(2r/\delta)}{c\,\epsilon^2}$.


Dual variables in bilinear model

The dual variables in $\alpha$ can be reshaped into a matrix $A$ whose nonzero entries correspond to the user-item pairs that have an interaction.

The recovered matrix is then written as

  $W = X A Z^\top.$
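To see why the matrix form holds, note that each interacting pair contributes a rank-one term $\alpha_{uv} x_u z_v^\top$. The sketch below compares the two forms on made-up data; the pair list, the sizes, and the dense storage of `A` are assumptions, and the constant factor from the recovery step is left out, as on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n_users, n_items = 4, 3, 6, 5        # hypothetical sizes
X = rng.standard_normal((p, n_users))      # columns are user features x_u
Z = rng.standard_normal((q, n_items))      # columns are item features z_v

# Hypothetical observed (user, item) interactions and their dual variables.
pairs = [(0, 1), (2, 4), (5, 0)]
alpha = rng.standard_normal(len(pairs))

# Reshape the dual variables into A: nonzero only at interacting pairs.
A = np.zeros((n_users, n_items))
for (u, v), a_uv in zip(pairs, alpha):
    A[u, v] = a_uv

W = X @ A @ Z.T                            # matrix form of the recovery

# Same quantity as an explicit sum of rank-one terms alpha_uv * x_u z_v^T.
W_sum = sum(a_uv * np.outer(X[:, u], Z[:, v]) for (u, v), a_uv in zip(pairs, alpha))
```

In practice `A` would be stored as a sparse matrix, since only observed pairs carry nonzero dual variables.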


DuRP for high-dimensional bilinear model

The linear representation $Z \otimes X$ has very high dimension.

  • Random projection: $R = R_2 \otimes R_1$ [QJZL13].

The recovered matrix $W = X A Z^\top$ is high-dimensional and usually dense, so it is difficult to use in an online service.

  • Approximate it by the product of two low-rank matrices, using approximate SVD [HMT11].
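Both points can be illustrated briefly. The first part of the sketch uses the mixed-product property $(R_2 \otimes R_1)(z \otimes x) = (R_2 z) \otimes (R_1 x)$, so user and item features can be projected separately. The second part is a basic randomized range finder followed by a small SVD, in the spirit of [HMT11]; the function `randomized_low_rank`, the oversampling value, and all sizes are assumptions, not the implementation used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Part 1: a Kronecker-structured projection never needs the full pair feature.
p, q, m1, m2 = 8, 6, 3, 2                      # hypothetical dimensions
R1 = rng.standard_normal((m1, p)) / np.sqrt(m1)
R2 = rng.standard_normal((m2, q)) / np.sqrt(m2)
x_u = rng.standard_normal(p)
z_v = rng.standard_normal(q)

full = np.kron(R2, R1) @ np.kron(z_v, x_u)     # project the (p*q)-dim feature
cheap = np.kron(R2 @ z_v, R1 @ x_u)            # project x_u and z_v separately

# Part 2: approximate a matrix by a product of two thin factors.
def randomized_low_rank(W, k, oversample=5, rng=None):
    """Return (left, right) with W ~= left @ right.T and k columns each."""
    rng = np.random.default_rng() if rng is None else rng
    omega = rng.standard_normal((W.shape[1], k + oversample))
    Q, _ = np.linalg.qr(W @ omega)             # orthonormal basis for the sampled range
    U_small, s, Vt = np.linalg.svd(Q.T @ W, full_matrices=False)
    U = Q @ U_small
    return U[:, :k] * s[:k], Vt[:k].T

k = 3
W_true = rng.standard_normal((40, k)) @ rng.standard_normal((k, 30))  # exactly rank k
left, right = randomized_low_rank(W_true, k, rng=rng)
```

With the two factors in hand, a score $x_u^\top W z_v$ can be served as an inner product between `left.T @ x_u` and `right.T @ z_v`, which avoids storing the dense matrix.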


Summary

  • Many applications use relational data.

  • A bilinear model is a straightforward approach.

  • To learn from massive data, use random projection with dual recovery (DuRP).


References

[HMT11] N. Halko, P. Martinsson, and J. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217–288, 2011.

[QJZL13] Qi Qian, Rong Jin, Shenghuo Zhu, and Yuanqing Lin. An integrated framework for high dimensional distance metric learning and its application to fine-grained visual categorization. Technical Report 2013-TR102, NEC Laboratories America, 2013. arXiv:1402.0453.

[YLZG09] Kai Yu, John Lafferty, Shenghuo Zhu, and Yihong Gong. Large-scale collaborative prediction using a nonparametric random effects model. In ICML'09: Proceedings of the 26th Annual International Conference on Machine Learning, pages 1185–1192, New York, NY, USA, 2009. ACM.

[ZMJ+13] Lijun Zhang, Mehrdad Mahdavi, Rong Jin, Tianbao Yang, and Shenghuo Zhu. Recovering optimal solution by dual random projection. In COLT'13: The 26th Annual Conference on Learning Theory, 2013.
