Dynamic User Profiling for Search Personalisation

Thanh Vu

Computing and Communications Department

The Open University

Classical Search Systems

AOL and AltaVista return search results based on the user's input query, regardless of the user's search preferences

Different users who submit the same input query get the same result list

Queries are usually short and ambiguous, e.g., Michael Jordan, Java, etc.

Different users have different information needs with the same input query

Search Personalisation

Return search results based on:

The input query

The user's search interests

Different users who submit the same input query will probably get different search result lists

Even an individual user will get different search results at different search times (e.g., Open US)

Part I: Dynamic group formation

The performance of search personalisation depends on the richness of a user profile

J. Teevan, M. R. Morris, and S. Bush. Discovering and using groups to improve personalized search. In WSDM’2009

Topic-based user profiles

Use a human-generated ontology (ODP, dmoz.org) to extract topics from all clicked/relevant documents of a specific user to build her profile

1. R. W. White, et al., Enhancing Personalized Search by Mining and Modeling Task Behavior. In WWW’2013

2. P. N. Bennett, et al., Modeling the impact of short- and long-term behavior on search personalization. In SIGIR’2012

Challenges for Human Generated Ontology

New topics that are not covered in the ontology will possibly emerge over time

Expensive human effort is needed to classify documents into the correct categories and to maintain them

Enriching a user profile

Use information from the group of users who share common interests

R. W. White, W. Chu, A. Hassan, X. He, Y. Song, and H. Wang. Enhancing personalized search by mining and modeling task behavior. WWW '13, pages 1411-1420, Switzerland, 2013. ACM.

Challenges for grouping methods

Groups are constructed statically using predetermined criteria such as common clicked documents

Users in a group may have different interests on different topics w.r.t. the input query

Z. Dou, R. Song, and J.-R. Wen. A large-scale evaluation and analysis of personalized search strategies. WWW '07, pages 581-590, NY, USA, 2007. ACM.

Research Question

How can we enrich user profiles with dynamic group formation?

1. How can we dynamically group users who share common interests?

2. How can we enrich user profiles with group information?

3. Can enriched user profiles help to improve search performance?

Dynamic group formation

The groups should be dynamically constructed in response to the user’s input query

Applying Latent Dirichlet Allocation

Constructing a user profile

Average the relevant documents over topics

Query-dependent user grouping

Construct shared user profiles

Use the input query as an indicator for grouping users

Constructing a shared user profile

Query-dependent user grouping

P(q|z) =

Query-dependent user grouping

The 2-nearest users

0.45, 0.35, 0.20

Enriching a user profile

Average all users in the group over topics
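The grouping and enrichment steps can be sketched as follows. The slides do not state the distance measure or how P(q|z) enters it, so this sketch assumes a query-weighted L1 distance between topic profiles. The function names, the toy profiles, and the query topic weights are all hypothetical.

```python
def query_weighted_distance(p, r, query_topics):
    """L1 distance between two topic profiles, weighted per topic by the
    query's topic weight (an assumed measure, not taken from the slides)."""
    return sum(w * abs(p.get(z, 0.0) - r.get(z, 0.0))
               for z, w in query_topics.items())


def nearest_users(user, profiles, query_topics, k=2):
    """Return the k users closest to `user` for this particular query."""
    others = [u for u in profiles if u != user]
    others.sort(key=lambda u: query_weighted_distance(
        profiles[user], profiles[u], query_topics))
    return others[:k]


def enrich(user, profiles, query_topics, k=2):
    """Average the user's profile with the profiles of the k nearest users."""
    group = [user] + nearest_users(user, profiles, query_topics, k)
    topics = set().union(*(profiles[u] for u in group))
    return {z: sum(profiles[u].get(z, 0.0) for u in group) / len(group)
            for z in topics}


# Hypothetical topic profiles for four users.
profiles = {
    "u1": {"sport": 0.60, "tech": 0.30, "law": 0.10},
    "u2": {"sport": 0.55, "tech": 0.05, "law": 0.40},
    "u3": {"sport": 0.10, "tech": 0.35, "law": 0.55},
    "u4": {"sport": 0.20, "tech": 0.30, "law": 0.50},
}
sport_query = {"sport": 0.90, "tech": 0.05, "law": 0.05}
tech_query = {"sport": 0.05, "tech": 0.90, "law": 0.05}

group_for_sport = nearest_users("u1", profiles, sport_query)
group_for_tech = nearest_users("u1", profiles, tech_query)
enriched = enrich("u1", profiles, sport_query)
```

The same user ends up in a different group for each query, which is the point of forming groups dynamically rather than once in advance.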

Re-ranking search results

For each input query q:

Download the top n ranked search results from the search engine

Compute a personalised score for each web page d given the current user u – p(d|u)

Combine the personalised score p(d|u) and the original rank r(q,d) to get a final score f(d|u,q)

Re-ranking search results Query: MU

Dataset

Query logs from the Bing search engine for 15 days, from 1st to 15th July 2012, 106 anonymous users

A relevant document is a click with dwell time of at least 30 seconds or the last click in a session (SAT click)
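The SAT-click rule above can be sketched as follows. The `Click` record and its field names are hypothetical, and the clicks of one session are assumed to be given in chronological order.

```python
from dataclasses import dataclass


@dataclass
class Click:
    url: str
    dwell_seconds: float


def sat_clicks(session_clicks, min_dwell=30.0):
    """Keep clicks with dwell time >= min_dwell, plus the last click
    of the session regardless of its dwell time."""
    last = len(session_clicks) - 1
    return [c for i, c in enumerate(session_clicks)
            if c.dwell_seconds >= min_dwell or i == last]


# Hypothetical session: only "b" is long enough, "c" is the last click.
session = [Click("a", 5), Click("b", 45), Click("c", 10)]
relevant_urls = [c.url for c in sat_clicks(session)]
```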

Evaluation metrics

Inverse Average Rank (IAR)

Personalisation Gain (P-Gain)

Baseline and Personalisation Strategies

Baseline: The original ranked results from Bing

S_Profile: Use only the current user profile

S_Group: Enrich the profile with static group

D_Group: Enrich the profile with dynamic group

Overall Performance

Part II: Temporal User Profiles

Challenges for Time-awareness

Previous methods use all the clicked/relevant documents of a user to build her search profile

The documents are treated equally without considering temporal features (i.e., the time at which documents were clicked and viewed)

The profile is too broad

It cannot fully express the current interest of the user

1. T. T. Vu, et al., Improving search personalisation with dynamic group formation. In SIGIR’2014

2. K. Raman, et al., Toward whole-session relevance: Exploring intrinsic diversity in web search. In SIGIR’2013

Research Question

How can we build user profiles with time-awareness?

1. How can we build temporal user profiles?

2. Can the time-aware profiles help improve search performance?

Building temporal user profiles (1)

Non-temporal method

Clicked documents (1st to 4th) and their distributions over topics:

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Football 0.55, Law 0.27, OS 0.10, Health 0.08

Law 0.41, OS 0.37, Health 0.12, Football 0.10

OS 0.65, Law 0.21, Football 0.10, Health 0.04

Means over topics (the topic-based user profile):

Football 0.32, Law 0.30, OS 0.29, Health 0.09
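The non-temporal profile is the plain mean of the document topic distributions. A minimal sketch using the four distributions from this slide (the function and variable names are hypothetical):

```python
def mean_profile(doc_topics):
    """Average a list of topic distributions, topic by topic."""
    topics = set().union(*doc_topics)
    n = len(doc_topics)
    return {z: sum(d.get(z, 0.0) for d in doc_topics) / n for z in topics}


# Topic distributions of the four clicked documents on the slide.
clicked = [
    {"Football": 0.51, "Law": 0.33, "Health": 0.11, "OS": 0.05},
    {"Football": 0.55, "Law": 0.27, "OS": 0.10, "Health": 0.08},
    {"Law": 0.41, "OS": 0.37, "Health": 0.12, "Football": 0.10},
    {"OS": 0.65, "Law": 0.21, "Football": 0.10, "Health": 0.04},
]
profile = mean_profile(clicked)
```

Up to rounding, the result matches the means shown on the slide (Football 0.32, Law 0.30, OS 0.29, Health 0.09). Every document carries the same weight, whenever it was clicked.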

Building temporal user profiles (2)

Our method

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Football 0.51, Law 0.33, Health 0.11, OS 0.05

The temporal topic user profile

Football 0.53, Law 0.30, Health 0.09, OS 0.08

Building temporal user profiles (2)

Clicked documents: 1st, 2nd

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Football 0.55, Law 0.27, OS 0.10, Health 0.08

0.91 0.90

Football 0.37, Law 0.34, OS 0.19, Health 0.10

Clicked documents: 1st, 2nd, 3rd

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Football 0.55, Health 0.27, OS 0.10, Law 0.08

Law 0.41, OS 0.37, Health 0.12, Football 0.10

OS 0.32, Law 0.30, Football 0.29, Health 0.09

Clicked documents: 1st, 2nd, 3rd, 4th

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Football 0.55, Health 0.27, OS 0.10, Law 0.08

Law 0.41, OS 0.37, Health 0.12, Football 0.10

OS 0.65, Law 0.21, Football 0.10, Health 0.04

Temporal topic profile

Non-temporal topic profile: Football 0.32, Law 0.30, OS 0.29, Health 0.09

Building temporal user profiles (3)

D_u = {d_1, d_2, …, d_n} is the set of relevant documents of the user u

The user profile of u is a distribution over the topics Z (extracted by LDA)

t_{d_i} = n indicates that d_i is the nth most relevant/clicked document of u

α is the decay parameter; K is the normalisation factor
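The profile formula itself is not present in the slide text. A minimal sketch, assuming an exponentially decayed weighted average of the document topic distributions, p(z|u) = (1/K) Σ_i α^(t_{d_i} − 1) p(z|d_i), with K equal to the sum of the weights. This form, the default α, and the function name are assumptions, not taken from the slides.

```python
def temporal_profile(doc_topics, alpha=0.9):
    """Decay-weighted mean of topic distributions.

    doc_topics is ordered by t_{d_i}: the first element has t = 1 and
    weight alpha**0, the second has t = 2 and weight alpha**1, and so on.
    (Assumed form; the slide text does not include the formula.)
    """
    weights = [alpha ** t for t in range(len(doc_topics))]
    K = sum(weights)  # normalisation factor
    topics = set().union(*doc_topics)
    return {z: sum(w * d.get(z, 0.0) for w, d in zip(weights, doc_topics)) / K
            for z in topics}


# The four example distributions used on the earlier slides.
docs = [
    {"Football": 0.51, "Law": 0.33, "Health": 0.11, "OS": 0.05},
    {"Football": 0.55, "Law": 0.27, "OS": 0.10, "Health": 0.08},
    {"Law": 0.41, "OS": 0.37, "Health": 0.12, "Football": 0.10},
    {"OS": 0.65, "Law": 0.21, "Football": 0.10, "Health": 0.04},
]
uniform = temporal_profile(docs, alpha=1.0)   # no decay: plain mean
decayed = temporal_profile(docs, alpha=0.5)   # strong decay
```

With α = 1 the weights are all equal and the profile reduces to the non-temporal mean. With α < 1 the documents with small t dominate, so here the Football weight rises.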

Building temporal user profiles (4)

Long-term user profile

Use relevant documents extracted from the user’s whole search history

Daily user profile

Use relevant documents extracted from the search history of the user in the current searching day

Session user profile

Use relevant documents extracted from the search history of the user in the current search session

Re-ranking search results (1)

Original rank:

Health 0.51, Law 0.33, Football 0.11, OS 0.05

Football 0.55, Law 0.27, Health 0.13, OS 0.05

Football 0.41, OS 0.37, Health 0.12, Law 0.10

After re-ranking:

Health 0.51, Law 0.33, Football 0.11, OS 0.05

Football 0.55, Law 0.27, Health 0.13, OS 0.05

Football 0.41, OS 0.37, Health 0.12, Law 0.10

The user profile (p):

Football 0.47, Law 0.24, OS 0.16, Health 0.12

Re-ranking search results (2)

Personalised scores

Use Jensen-Shannon divergence (D_JS[d||p])

Returned documents (d):

Health 0.51, Law 0.33, Football 0.11, OS 0.05

Football 0.55, Law 0.27, Health 0.13, OS 0.05

Football 0.41, OS 0.37, Health 0.12, Law 0.10

The user profile (p):

Football 0.47, Law 0.24, OS 0.16, Health 0.12
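A minimal sketch of the Jensen-Shannon divergence between each returned document and the user profile, using the distributions from this slide. The slides do not say how the divergence is turned into a personalised score, so the sketch only sorts documents by ascending divergence. The document names and the renormalisation of the rounded figures are assumptions.

```python
import math


def normalise(dist):
    """Scale values so they sum to 1 (the slide figures are rounded)."""
    total = sum(dist.values())
    return {z: v / total for z, v in dist.items()}


def kl(p, q):
    """Kullback-Leibler divergence in bits; topics with p(z) = 0 add 0."""
    return sum(pv * math.log2(pv / q[z]) for z, pv in p.items() if pv > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    topics = set(p) | set(q)
    p = normalise({z: p.get(z, 0.0) for z in topics})
    q = normalise({z: q.get(z, 0.0) for z in topics})
    m = {z: 0.5 * (p[z] + q[z]) for z in topics}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


profile = {"Football": 0.47, "Law": 0.24, "OS": 0.16, "Health": 0.12}
documents = {
    "doc_a": {"Health": 0.51, "Law": 0.33, "Football": 0.11, "OS": 0.05},
    "doc_b": {"Football": 0.55, "Law": 0.27, "Health": 0.13, "OS": 0.05},
    "doc_c": {"Football": 0.41, "OS": 0.37, "Health": 0.12, "Law": 0.10},
}

# Smaller divergence means the document is closer to the user's interests.
ranked = sorted(documents,
                key=lambda name: js_divergence(documents[name], profile))
```

The Football-heavy documents end up ahead of the Health-heavy one, because the profile puts most of its weight on Football.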

Re-ranking search results (3)

Re-ranking Features

Re-ranking algorithm: LambdaMART [1]

1. C. J. Burges, et al., Learning to rank with non-smooth cost functions. In NIPS’2007.

Feature: Description

Personalised Features

LongTermScore: Personalised score between document and long-term profile

DailyScore: Personalised score between document and daily profile

SessionScore: Personalised score between document and session profile

Non-personalised Features

DocRank: Rank of document on original returned list

QuerySim: Cosine similarity score between current and previous queries

QueryNo: Total number of queries that have been submitted in the current search session (including the current query)
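The QuerySim feature can be sketched as a cosine similarity between two queries. The slides do not say how queries are represented, so this sketch assumes lower-cased whitespace tokens with raw term counts. The function name is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(query_a, query_b):
    """Cosine similarity between the term-count vectors of two queries."""
    a = Counter(query_a.lower().split())
    b = Counter(query_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing terms
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


sim = cosine_similarity("open university", "Open US")
```

Here the two queries share one of two terms each, so the similarity is 0.5.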

Evaluation

Dataset

The query logs of 1166 anonymous users over four weeks, from 1st to 28th July 2012

A log entry consists of an anonymous user identifier, a query, the top-10 returned URLs, and clicked documents along with the user’s dwell time

Download all the URLs’ content for learning topics

A search session is demarcated by 30 minutes of user inactivity

A relevant document is a click with dwell time of at least 30 seconds or the last click in a session (SAT click)
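Session demarcation by 30 minutes of inactivity can be sketched as follows. The function name and the example timestamps are hypothetical, and a gap of exactly 30 minutes is assumed to start a new session.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, gap=timedelta(minutes=30)):
    """Group event times into sessions; a pause of `gap` or longer
    between consecutive events starts a new session."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


start = datetime(2012, 7, 1, 9, 0)
events = [start,
          start + timedelta(minutes=10),
          start + timedelta(minutes=45),   # 35 minutes after the previous event
          start + timedelta(minutes=50)]
sessions = split_sessions(events)
```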

Evaluation methodology

Assign a positive (relevant) label to a returned URL if

it is a SAT click in the current query

it is a SAT click in one of the other repeated queries in the same search session

Assign negative (irrelevant) labels to the rest of the URLs

Personalisation Methods and Baselines

Personalisation Methods

LON uses only LongTermScore from long-term profile

DAI uses only DailyScore from daily profile

SES uses only SessionScore from session profile

ALL uses all personalised scores from the three profiles

Baselines

Default is the default ranking returned by the search engine

Static uses the LongTermScore from long-term profile without time-awareness (i.e., not using decay function)

Results

Evaluation metrics

Mean Average Precision (MAP)

Precision (P@k)

Mean Reciprocal Rank (MRR)

Normalized Discounted Cumulative Gain (nDCG@k)

For each evaluation metric, a higher value indicates a better ranking
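The four metrics can be sketched for a single ranked list with binary labels (1 = relevant, 0 = irrelevant). MAP and MRR are then the means of the per-query values. The function names are hypothetical, and average precision is normalised here by the number of relevant items in the list, which is an assumption about the exact variant used.

```python
import math


def precision_at_k(labels, k):
    """Fraction of the top k results that are relevant."""
    return sum(labels[:k]) / k


def average_precision(labels):
    """Mean of precision values at the positions of relevant results."""
    hits, total = 0, 0.0
    for i, rel in enumerate(labels, start=1):
        if rel:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0


def reciprocal_rank(labels):
    """1 / position of the first relevant result, or 0 if there is none."""
    for i, rel in enumerate(labels, start=1):
        if rel:
            return 1.0 / i
    return 0.0


def dcg_at_k(labels, k):
    """Discounted cumulative gain with a log2(position + 1) discount."""
    return sum(rel / math.log2(i + 1)
               for i, rel in enumerate(labels[:k], start=1))


def ndcg_at_k(labels, k):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal else 0.0


# Hypothetical ranked list: relevant results at positions 2 and 4.
labels = [0, 1, 0, 1]
```

For this list, P@2, average precision, and reciprocal rank are all 0.5, while nDCG@4 is below 1 because both relevant results sit lower than they would in the ideal ordering.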

Overall Performance

All the improvements over the baselines are significant (paired t-test, p < 0.001)

Overall Performance

Takeaways

Dynamic Grouping

Grouping improves search performance

Dynamic grouping outperforms static grouping

Temporal profiles

The three temporal profiles help to improve search performance over the default ranking and the use of a non-temporal profile

Using all features (ALL) achieves the highest performance

The short-term profile achieves better performance than the longer-term profile

Thank you!

Any questions?

Dataset (2)

Example of query logs

Click Entropies

P(d|q) is the percentage of the clicks on document d among all the clicks for q

A smaller query click entropy value indicates more agreement between users on clicking a small number of web pages
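The entropy formula is not present in the slide text. A minimal sketch, assuming the usual definition ClickEntropy(q) = −Σ_d P(d|q) log2 P(d|q) over the documents clicked for q. The function name and the example click lists are hypothetical.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Entropy (in bits) of the click distribution for one query.

    clicked_urls lists every click recorded for the query, one URL per click.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


agreed = click_entropy(["a", "a", "a", "a"])   # everyone clicks the same page
split = click_entropy(["a", "a", "b", "c"])    # clicks spread over three pages
```

When all users click the same page the entropy is 0. It grows as the clicks spread over more pages, which is why queries with high click entropy are the natural candidates for personalisation.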

Click entropies

Query Positions in Search Session

Aim to study whether the position of a query has any effect on the performance of the temporal latent topic profiles

Label the queries by their positions during the search

Clicked documents and their distributions over topics:

Football 0.51, Law 0.33, Health 0.11, OS 0.05

Football 0.55, Health 0.27, OS 0.13, Law 0.05

Law 0.41, OS 0.37, Health 0.12, Football 0.10

OS 0.65, Law 0.15, Football 0.11, Health 0.09

Means over topics (the topic-based user profile):

Football 0.32, Law 0.29, OS 0.28, Health 0.11

Re-ranking search results (1) Query: MU

Pre-processing

Remove the queries whose positive label set is empty from the dataset

Discard the domain-related queries (e.g., Facebook, Youtube)

Overall Performance
