Extending Learning to Rank with User Dynamic

8th Italian Information Retrieval Workshop - IIR 2017, June 6, 2017

Nicola Ferro, Claudio Lucchese, Maria Maistro, Raffaele Perego
University of Padua, Padua, Italy
ISTI-CNR, Pisa, Italy

Extended abstract of N. Ferro, C. Lucchese, M. Maistro, and R. Perego. On Including the User Dynamic in Learning to Rank. In SIGIR, ACM, 2017.


Page 1: Extending Learning to Rank with User Dynamic


Page 2: Extending Learning to Rank with User Dynamic

Outline

• Motivations and Goal

• Modeling User Dynamic

• Integrating User Dynamic in LtR

• Experimental Results

• Conclusion and Future Work

Page 3: Extending Learning to Rank with User Dynamic

Click Log Data

Query logs have proven to be a valuable and informative source of implicit user feedback:

• they can be easily collected by search engines;

• they are available in real time;

• they represent personalized user preferences.

Page 4: Extending Learning to Rank with User Dynamic

Incorporating User Features in LtR

To show the importance of user interaction features, we trained LambdaMART on the MSLR-WEB10K LtR dataset:

• NDCG with user features: 0.4636

• NDCG without user features: 0.4410

E. Agichtein, E. Brill, and S. Dumais. Improving Web Search Ranking by Incorporating User Behavior Information. In SIGIR, pages 19–26, ACM, 2006.

Page 8: Extending Learning to Rank with User Dynamic

A Complementary Approach

Embedding the user interaction dynamic into LambdaMART: we model the user dynamic with a Markov chain trained on query log data, and we modify the LambdaMART loss function accordingly.

Page 9: Extending Learning to Rank with User Dynamic

Measuring Effectiveness

Effectiveness is often measured as the inner product J · D of:

• a relevance vector J, accounting for the quality of the ranked documents;

• a discounting vector D, accounting for the position at which each document is ranked.

Example: Discounted Cumulated Gain (DCG)

J_i = 2^{l_i} - 1        D_i = 1 / log_2(i + 1)

E. Yilmaz, M. Shokouhi, N. Craswell, and S. Robertson. Expected Browsing Utility for Web Search Evaluation. In CIKM, pages 1561–1565, ACM, 2010.
Page 11: Extending Learning to Rank with User Dynamic

Navigational vs. Informational Queries

Observation: user behavior in visiting a SERP differs depending on the query type and on the number of relevant results retrieved.

• A SERP with a single highly relevant result in the first position elicits navigational behavior;

• A SERP with several relevant results, or with no relevant results, elicits informational behavior.

Page 14: Extending Learning to Rank with User Dynamic

Markovian User Model

We assume that each user decides, independently of the random time spent on the current document, to move forward or backward to another document in the list.

X_1, X_2, X_3, ... ∈ R = {1, 2, ..., R}

is the random sequence of document ranks visited by the user.

M. Ferrante, N. Ferro, and M. Maistro. Injecting User Models and Time into Precision via Markov Chains. In SIGIR, pages 597–606, ACM, 2014.

Page 20: Extending Learning to Rank with User Dynamic

Markov Chain

Discrete-time homogeneous Markov chain: (X_n)_{n ≥ 0}

Transition matrix: P = (p_ij : i, j ∈ R)

Transition probability: p_ij = P[X_{n+1} = j | X_n = i]

Invariant distribution: π = πP

p_ij^{(n)} → π_j as n → ∞, for all i, j
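The invariant distribution π = πP can be approximated by simply iterating the chain from a uniform start, since p_ij^{(n)} → π_j. A small sketch in plain Python (the toy 2-state matrix is ours, not from the click data):

```python
def invariant_distribution(P, iters=10_000):
    """Approximate pi = pi * P by power iteration on the transition matrix P."""
    n = len(P)
    pi = [1.0 / n] * n                 # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Toy 2-state chain: the user mostly stays on rank 1, sometimes moves to rank 2.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = invariant_distribution(P)   # converges to [5/6, 1/6]
```

For a row-stochastic P the iterate stays a probability vector, so no renormalization is needed.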

Page 25: Extending Learning to Rank with User Dynamic

Parameter Estimation

• We used the click log dataset provided by Yandex in the context of the Relevance Prediction Challenge;

• We classify queries on the basis of the number of relevant documents returned, defining different classes of queries;

• We aggregate the dynamics of different users and adopt a maximum likelihood estimator to calibrate the transition matrix and compute the invariant distribution.

P. Serdyukov, N. Craswell, G. Dupret. WSCD2012: Workshop on Web Search Click Data 2012. In WSDM, pages 771–772, ACM, 2012.
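The maximum likelihood estimate of each transition probability is just the observed transition frequency: p_ij = (# of i → j transitions) / (# of transitions leaving i). A sketch with made-up click sequences (not the Yandex data):

```python
from collections import Counter

def estimate_transition_matrix(sequences, R):
    """MLE of the transition matrix from observed sequences of visited ranks.

    p_ij = (# transitions i -> j) / (# transitions out of i); ranks are 1..R.
    """
    counts = Counter()
    for seq in sequences:
        for i, j in zip(seq, seq[1:]):      # consecutive (from, to) rank pairs
            counts[(i, j)] += 1
    totals = Counter()
    for (i, _), c in counts.items():
        totals[i] += c
    return [[counts[(i, j)] / totals[i] if totals[i] else 0.0
             for j in range(1, R + 1)]
            for i in range(1, R + 1)]

# Hypothetical sequences of rank positions visited by two users.
seqs = [[1, 2, 1, 3], [1, 1, 2]]
P = estimate_transition_matrix(seqs, 3)
```

Rows with no observed outgoing transition are left at zero here; in practice they would need smoothing or removal before computing the invariant distribution.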

Page 26: Extending Learning to Rank with User Dynamic

Different Query Types

[Figure: estimated stationary probability over rank positions 1–10 (y-axis 0–0.25), one curve per query class with 0, 1, 3, 5, 7, or 9 relevant documents.]

• Queries with one relevant document exhibit navigational behavior;

• Queries with many or no relevant documents exhibit informational behavior.

Page 29: Extending Learning to Rank with User Dynamic

User Dynamic

The user dynamic can be described as a mixture of the navigational and the informational behavior:

φ(i) = α i^{-1} + β i + γ

• Navigational component α i^{-1}: inverse of the rank position;

• Informational component β i: linear in the rank position.

Page 32: Extending Learning to Rank with User Dynamic

Fitted Curves

φ(i) = α i^{-1} + β i + γ, fitted against the stationary distribution over rank positions 1–10:

• Navigational queries: α = 0.2601, β = 0.0112, γ = -0.0378

• Informational queries: α = 0.0848, β = 0.0045, γ = 0.0502

[Figure: stationary distribution and fitted user dynamic for the two query classes.]
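Plugging the fitted parameters into φ(i) = α i^{-1} + β i + γ reproduces the two shapes: the navigational dynamic decays with rank, while the informational one flattens out and then grows again toward the bottom of the SERP. A quick check in Python (parameter values taken from the slide; the code itself is ours):

```python
def user_dynamic(i, alpha, beta, gamma):
    """Mixture of a navigational (1/i) and an informational (linear) component."""
    return alpha / i + beta * i + gamma

NAV  = dict(alpha=0.2601, beta=0.0112, gamma=-0.0378)   # navigational queries
INFO = dict(alpha=0.0848, beta=0.0045, gamma=0.0502)    # informational queries

nav  = [user_dynamic(i, **NAV)  for i in range(1, 11)]
info = [user_dynamic(i, **INFO) for i in range(1, 11)]
```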

Page 33: Extending Learning to Rank with User Dynamic

Learning to Rank

A LtR algorithm exploits a ground-truth set of training examples in order to learn a document scoring function. The training set is composed of:

• a collection of queries q ∈ Q;

• for each query, an associated set of documents D = {d_0, d_1, ...};

• for each query-document pair, a feature vector x.

Page 34: Extending Learning to Rank with User Dynamic

LambdaMART

Let d_i and d_j be two candidate documents for the same query q, with relevance labels l_i and l_j respectively. The lambda gradient is

λ_ij = ∂Q_ij / ∂(s_i - s_j) = sgn(l_i - l_j) |ΔQ_ij · 1 / (1 + e^{s_i - s_j})|

where:

• s_i and s_j are the document scores;

• the sign is determined by the document labels only;

• ΔQ_ij is the variation of the quality function Q obtained by swapping the scores s_i and s_j;

• 1 / (1 + e^{s_i - s_j}) is the derivative of the RankNet cost.

The lambda gradient for a document d_i is computed by marginalizing over all possible pairs in the result list: λ_i = Σ_j λ_ij.
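The lambda computation above can be sketched for an nDCG quality function: ΔQ_ij comes from swapping the two documents' positions, and the RankNet factor damps pairs the model already orders confidently. A plain-Python sketch (ours, not the paper's implementation; labels are integer relevance grades, scores are the current model outputs):

```python
import math

def lambdas(scores, labels):
    """Per-document lambda_i = sum_j lambda_ij, with
    lambda_ij = sgn(l_i - l_j) * |dQ_ij * 1 / (1 + e^{s_i - s_j})|
    and dQ_ij the nDCG variation obtained by swapping documents i and j."""
    n = len(scores)
    # Rank positions (1-based) induced by the current scores.
    order = sorted(range(n), key=lambda d: -scores[d])
    rank = {d: r + 1 for r, d in enumerate(order)}
    ideal = sum((2 ** l - 1) / math.log2(r + 2)
                for r, l in enumerate(sorted(labels, reverse=True)))
    lam = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j or labels[i] == labels[j]:    # sgn(0) = 0: no contribution
                continue
            gain = (2 ** labels[i] - 1) - (2 ** labels[j] - 1)
            disc = 1 / math.log2(rank[i] + 1) - 1 / math.log2(rank[j] + 1)
            dQ = abs(gain * disc) / ideal            # |nDCG swap variation|
            sgn = 1.0 if labels[i] > labels[j] else -1.0
            lam[i] += sgn * abs(dQ / (1 + math.exp(scores[i] - scores[j])))
    return lam
```

A relevant document currently scored below a non-relevant one receives a positive lambda (pushing it up), and vice versa; nMCG-MART replaces the nDCG swap variation with the nMCG one.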

Page 43: Extending Learning to Rank with User Dynamic

nMCG-MART

We propose a new measure called Normalized Markov Cumulated Gain (nMCG), an extension of nDCG where the discount function is defined by the user dynamic:

nMCG@k = Σ_{i ≤ k} (2^{l_i} - 1) · φ_c(i)  /  Σ_{h ≤ k, sorted by l_h} (2^{l_h} - 1) · φ_c(h)

where:

• l_i is the relevance label of the document ranked at position i;

• φ_c(i) is the user dynamic at rank i for the query class c;

• the denominator is the nMCG score of the ideal ranking.
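nMCG@k differs from nDCG@k only in the discount: the logarithmic discount is replaced by the class-specific user dynamic φ_c. A sketch in plain Python (ours; φ_c is passed in as a function, here instantiated with the navigational parameters from the fitted-curves slide):

```python
def nmcg_at_k(labels, phi, k):
    """Normalized Markov Cumulated Gain at cutoff k.

    labels: relevance labels in rank order; phi: user dynamic for the query class.
    """
    gain = lambda l: 2 ** l - 1
    num = sum(gain(l) * phi(i) for i, l in enumerate(labels[:k], start=1))
    ideal = sorted(labels, reverse=True)                    # ideal ranking
    den = sum(gain(l) * phi(h) for h, l in enumerate(ideal[:k], start=1))
    return num / den if den > 0 else 0.0

# Navigational user dynamic with the parameters fitted on the click log.
phi_nav = lambda i: 0.2601 / i + 0.0112 * i - 0.0378
```

With φ(i) = 1/log_2(i + 1) the same function reduces to plain nDCG@k.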

Page 48: Extending Learning to Rank with User Dynamic

Experimental Setup

We evaluate the proposed model on three public datasets:

• MSLR-WEB30K, provided by Microsoft [1]

• MSLR-WEB10K, provided by Microsoft [1]

• Istella, provided by the Tiscali Istella Web search engine [2]

1. T. Qin and T. Liu. Introducing LETOR 4.0 Datasets. In CoRR, 2013.
2. D. Dato, C. Lucchese, F. M. Nardini, S. Orlando, R. Perego, N. Tonellotto, and R. Venturini. Fast Ranking with Additive Ensembles of Oblivious and Non-Oblivious Regression Trees. In TOIS, 35(2):15:1–15:31, 2016.
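The MSLR datasets are distributed in a libsvm-like text format, one query-document pair per line: `label qid:<id> <fid>:<value> ...`. A minimal reader for one such line (the parser is our sketch, not an official loader; real files are large and should be streamed):

```python
def parse_letor_line(line):
    """Parse one 'label qid:Q f:v f:v ...' line into (label, qid, features)."""
    parts = line.split()
    label = int(parts[0])
    qid = parts[1].split(":")[1]
    features = {int(f): float(v)
                for f, v in (tok.split(":") for tok in parts[2:])}
    return label, qid, features

label, qid, feats = parse_letor_line("2 qid:10 1:0.5 2:0.0 3:1.2")
```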

Page 49: Extending Learning to Rank with User Dynamic

Experimental Results

[Figures: nDCG and nMCG as a function of the number of trees, with statistical significance reported at p = 0.05 and p = 0.01.]

Page 57: Extending Learning to Rank with User Dynamic

Take Away Message

• We modeled the user dynamic through Markov chains and showed that user behavior differs across query types;

• By integrating the user dynamic into LambdaMART, we improve on the state of the art in terms of both nDCG and nMCG.

Page 58: Extending Learning to Rank with User Dynamic

Future Work

• Analyze the properties of nMCG and its correlation with state-of-the-art evaluation measures;

• Investigate whether nMCG correlates with the quality of a ranking as perceived by the user;

• Validate the study of the user dynamic beyond the first result page.

Page 59: Extending Learning to Rank with User Dynamic

Thank you for your attention! Any questions?
