Post-Ranking query suggestion by diversifying search Chao Wang

Page 1: Post-Ranking query suggestion by diversifying search Chao Wang

Post-Ranking query suggestion by diversifying search

Chao Wang

Page 2: Post-Ranking query suggestion by diversifying search Chao Wang

Mission

Diversify the content of the search results returned by suggested queries while keeping the suggestions relevant.

Random walk: a mathematical formalization of a path that consists of a succession of random steps.

Examples: a stock price, the path of a molecule traveling in a liquid.

Page 3: Post-Ranking query suggestion by diversifying search Chao Wang

Suggested queries

After a user submits a query, a set of relevant queries is suggested to the user.

If not satisfied with the results on the page, the user may choose to click on the suggested queries.

Research indicates that query suggestion greatly improves user satisfaction.

Page 4: Post-Ranking query suggestion by diversifying search Chao Wang

Existing work and improvement

Existing work focuses on discovering relevant queries from search engine logs (co-clicked URLs and session information).

It does not address the diversification of query suggestions. When a user clicks on a suggested query, he/she expects to gain additional information.

SERP diversification between two queries is defined as the difference between their top-returned search results. Example: Delta airline

Page 5: Post-Ranking query suggestion by diversifying search Chao Wang

Related work

Random walk model: queries and URLs are represented as nodes in a bipartite graph, where each edge connects one query with one URL and indicates a click.

Entropy model: user clicks differ in importance. A click on a more specific URL is weighted higher than a click on a general URL.

Rare queries: combine information from clicked URLs and skipped URLs by constructing two bipartite graphs.

Rare queries: apply a random walk model to the query-URL bipartite graph and calculate the query hitting time, which can encourage diversity.
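The bipartite click graph described above can be illustrated with a short sketch. It is not the method of any of the cited works: it only takes two random-walk steps (query to URL, then URL back to query), with transition probabilities assumed to be proportional to click counts. The click log and the function name `two_step_walk` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical click log: (query, clicked URL, click count).
clicks = [
    ("delta airline", "delta.com", 8),
    ("delta airline", "en.wikipedia.org/wiki/Delta_Air_Lines", 2),
    ("delta flights", "delta.com", 5),
    ("delta flights", "kayak.com", 5),
    ("cheap flights", "kayak.com", 10),
]

def two_step_walk(clicks, start):
    """Probability of landing on each query after query -> URL -> query."""
    q2u = defaultdict(dict)  # query -> {url: clicks}
    u2q = defaultdict(dict)  # url -> {query: clicks}
    for q, u, c in clicks:
        q2u[q][u] = q2u[q].get(u, 0) + c
        u2q[u][q] = u2q[u].get(q, 0) + c

    probs = defaultdict(float)
    q_total = sum(q2u[start].values())
    for u, c_qu in q2u[start].items():
        u_total = sum(u2q[u].values())
        for q2, c_uq in u2q[u].items():
            # Step 1: start -> u, step 2: u -> q2 (assumed click-proportional).
            probs[q2] += (c_qu / q_total) * (c_uq / u_total)
    return dict(probs)

scores = two_step_walk(clicks, "delta airline")
# Every query reached other than the start is a suggestion candidate.
candidates = sorted((q for q in scores if q != "delta airline"),
                    key=scores.get, reverse=True)
print(candidates)
```

Queries that share no clicked URL with the start query ("cheap flights" here) are not reached in two steps; longer walks would reach them with lower probability.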

Page 6: Post-Ranking query suggestion by diversifying search Chao Wang

Mission

Mission: rather than focusing on improving the relevance of documents by re-ranking them, we aim at re-ranking suggested queries that help users refine their intent.

Previous limitation: existing work on diversifying search results focused only on ambiguous queries, that is, queries with more than one user intent.

Previous limitation: existing work on query suggestion focuses only on relevance and does not consider diversification.

Page 7: Post-Ranking query suggestion by diversifying search Chao Wang

Generate suggestion candidates

Random walk model: candidates are collected by applying the model to the query-click logs.

User sessions: examine user activity within a certain period of time to extract relevant queries.
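As a rough sketch of the session-based source of candidates, the code below groups a query log into sessions and collects the queries that share a session with a given query. The 30-minute inactivity cutoff, the log layout, and both function names are assumptions made for this example; the slides only say "a certain period of time".

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed cutoff, not taken from the slides

def split_sessions(events, gap=SESSION_GAP):
    """Group (user, timestamp, query) events into per-user sessions."""
    sessions = []
    last_seen = {}  # user -> (time of last query, index of that user's open session)
    for user, ts, query in sorted(events, key=lambda e: (e[0], e[1])):
        prev = last_seen.get(user)
        if prev is None or ts - prev[0] > gap:
            sessions.append([query])       # start a new session
            idx = len(sessions) - 1
        else:
            idx = prev[1]
            sessions[idx].append(query)    # continue the open session
        last_seen[user] = (ts, idx)
    return sessions

def session_candidates(sessions, query):
    """Queries that appear in at least one session together with `query`."""
    found = set()
    for session in sessions:
        if query in session:
            found.update(q for q in session if q != query)
    return found

# Hypothetical log entries.
events = [
    ("u1", datetime(2010, 9, 1, 10, 0), "delta airline"),
    ("u1", datetime(2010, 9, 1, 10, 5), "delta baggage fees"),
    ("u1", datetime(2010, 9, 1, 15, 0), "pizza near me"),
    ("u2", datetime(2010, 9, 1, 10, 1), "delta airline"),
    ("u2", datetime(2010, 9, 1, 10, 3), "delta flight status"),
]
sessions = split_sessions(events)
print(session_candidates(sessions, "delta airline"))
```

The query issued five hours later falls into a separate session, so it is not proposed as a candidate.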

Page 8: Post-Ranking query suggestion by diversifying search Chao Wang

Ranking Function

Page 9: Post-Ranking query suggestion by diversifying search Chao Wang

Feature 1

Open Directory Project (ODP): https://www.dmoz.org/

Built using a binary tree

Paper example: (next page)

Page 10: Post-Ranking query suggestion by diversifying search Chao Wang
Page 11: Post-Ranking query suggestion by diversifying search Chao Wang

Features 2, 3, 4

Features 2 and 3 check the similarity between URL strings and between domain names. The value is 1 if the two strings are the same and 0 otherwise.

Feature 4 computes the correlation between two ordered SERP lists. A pair is concordant if both URLs are identical and ranked at the same position.

Similarity calculation: not the main focus of this paper.
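A minimal sketch of these three checks is given below. The exact-match rules follow the slide text. Averaging the 0/1 values over all cross pairs of results and reporting the concordant share per position are assumptions made for illustration, not the paper's formulas. The function names and the two result lists are hypothetical.

```python
from urllib.parse import urlsplit

def same_url(a, b):
    """Feature 2 style check: 1 if the URL strings are identical, else 0."""
    return 1 if a == b else 0

def same_domain(a, b):
    """Feature 3 style check: 1 if the host names are identical, else 0."""
    return 1 if urlsplit(a).hostname == urlsplit(b).hostname else 0

def mean_pairwise(serp_a, serp_b, match):
    """Average of `match` over every cross pair of results (assumed aggregation)."""
    pairs = [(a, b) for a in serp_a for b in serp_b]
    return sum(match(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

def concordant_fraction(serp_a, serp_b):
    """Share of rank positions holding the identical URL in both lists."""
    k = min(len(serp_a), len(serp_b))
    return sum(same_url(a, b) for a, b in zip(serp_a, serp_b)) / k if k else 0.0

# Hypothetical top-3 results for a query and for one suggested query.
serp_query = [
    "https://www.delta.com/",
    "https://en.wikipedia.org/wiki/Delta_Air_Lines",
    "https://www.delta.com/flights",
]
serp_suggestion = [
    "https://www.delta.com/",
    "https://www.delta.com/baggage",
    "https://en.wikipedia.org/wiki/Delta_Air_Lines",
]

url_sim = mean_pairwise(serp_query, serp_suggestion, same_url)
domain_sim = mean_pairwise(serp_query, serp_suggestion, same_domain)
rank_sim = concordant_fraction(serp_query, serp_suggestion)
print(url_sim, domain_sim, rank_sim)
```

Lower similarity values mean the suggestion leads to more diverse results; one simple way to turn them into diversity features would be to use `1 - similarity`.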

Page 12: Post-Ranking query suggestion by diversifying search Chao Wang

Training labels and learning algorithms

Ask people to evaluate the relevance between a query and its suggestions (score between 0 and 3).

Classification: support vector machines classify instances into one of the four classes with a detailed ranking score. Example.

The research is based on the LambdaSMART algorithm because of its superior performance.

Page 13: Post-Ranking query suggestion by diversifying search Chao Wang

When the data is very informative, shrinkage is zero; it moves toward 1 as the data becomes less informative.

Page 14: Post-Ranking query suggestion by diversifying search Chao Wang

Data acquisition

Randomly sampled 13,421 queries between Sep 2010 and Nov 2010. These are queries that trigger at least one related search on the search result page.

Page 15: Post-Ranking query suggestion by diversifying search Chao Wang

Performance for different query types

Average query length: 2.51. Average suggestion length.

Long: length > 4; medium: 2 <= length <= 4; short: length < 2

Navigational queries and informational queries

Normalized discounted cumulative gain (NDCG): a measure of ranking quality used to measure the effectiveness of web search engine algorithms. Its value lies between 0 and 1.
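To make the metric concrete, here is a small sketch of NDCG over graded labels such as the 0 to 3 scores mentioned earlier. It assumes the common exponential-gain form with a log2 position discount; the slides do not say which variant the paper uses, and the function names are illustrative.

```python
import math

def dcg(gains):
    """Discounted cumulative gain with gain 2^g - 1 and a log2(rank + 1) discount."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg(gains, k=None):
    """DCG of the given order divided by the DCG of the ideal order, at cutoff k."""
    k = len(gains) if k is None else k
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0

# Hypothetical human labels for a ranked list of four suggestions.
perfect = ndcg([3, 2, 1, 0])   # already in the ideal order
reversed_order = ndcg([0, 1, 2, 3])
print(perfect, reversed_order)
```

A list already sorted by label scores 1, and any other order of the same labels scores lower, which is what keeps the value between 0 and 1.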

Page 16: Post-Ranking query suggestion by diversifying search Chao Wang

Performance for different query types

Page 17: Post-Ranking query suggestion by diversifying search Chao Wang

Conclusion

First gather a set of suggestion candidates, then rank the suggestions based on their diversification scores.

The diversification score is based on features: ODP category, URL string difference, and domain difference.

Important discovery: the similarity between queries and suggested queries indeed drops.

There is a lot of room for improvement; future work will explore more features.