Tuning Before Feedback: Combining Ranking Discovery and Blind Feedback for Robust Retrieval*
Weiguo Fan, Ming Luo, Li Wang, Wensi Xi, and Edward A. Fox
Digital Library Research Laboratory, Virginia Tech
*This research is supported by the National Science Foundation under Grant Numbers IIS-0325579, DUE-0136690 and DUE-0333531
Outline
- Introduction
- Research Questions
- Approach: Ranking Tuning + Blind Feedback
- Experiment Results
- Conclusion
Introduction
- Ranking functions play an important role in IR performance.
- Blind feedback (pseudo-relevance feedback) has been found very useful for ad hoc retrieval.
- Why not combine ranking function optimization with blind feedback to improve robustness?
Research Questions
- Does blind feedback work even better on fine-tuned ranking functions than on traditional ranking functions such as Okapi BM25?
- Does the type of query (very short vs. very long) have any impact on the combination approach?
- Can the discovered ranking function, in combination with blind feedback, extrapolate well to new, unseen queries?
Our Approach
- Use ARRANGER, a Genetic Programming-based discovery engine, to perform the ranking function tuning [Fan 2003tkde, Fan 2004ip&m, Fan 2004jasist]
- Combine ranking tuning and feedback
- Test on different types of queries
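RF Discovery Problem
[Figure: Input (Training Data, Feedback) -> Ranking Function Discovery -> Output (Ranking Function f). The figure shows the two document rankings below.]

| Order | Doc. | Relevance |
|---|---|---|
| 1 | A | 1 |
| 2 | D | 1 |
| 3 | F | 1 |
| 4 | G | 1 |
| 5 | B | 0 |
| 6 | C | 0 |
| 7 | E | 0 |

| Order | Doc. | Relevance |
|---|---|---|
| 1 | A | 1 |
| 2 | B | 0 |
| 3 | C | 0 |
| 4 | D | 1 |
| 5 | E | 0 |
| 6 | F | 1 |
| 7 | G | 1 |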
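To make the difference between the two rankings concrete, here is a minimal sketch that scores each one with average precision. The slides do not say which fitness measure the discovery engine optimizes, so using average precision here is an assumption, and the function name `average_precision` is hypothetical.

```python
def average_precision(relevance):
    """Average precision of a ranked list of 0/1 relevance labels.

    Assumption: every relevant document appears somewhere in the list,
    so we normalize by the number of relevant documents seen.
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / hits if hits else 0.0


# Relevance labels in rank order, copied from the two rankings in the figure.
ranking_relevant_first = [1, 1, 1, 1, 0, 0, 0]  # A D F G B C E
ranking_alphabetical = [1, 0, 0, 1, 0, 1, 1]    # A B C D E F G

ap_first = average_precision(ranking_relevant_first)
ap_alpha = average_precision(ranking_alphabetical)
```

Under this measure the ranking that places all four relevant documents first scores higher, which is the kind of signal a ranking function discovery process could use to prefer one candidate function over another.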
Ranking Function Optimization
- "Ranking function tuning is an art!" (Paul Kantor)
- Why not adaptively discover the RF by Genetic Programming?
  - Huge search space
  - Discrete objective function
  - Modeling advantage
What is GP?
- A problem-solving system designed on the principles of evolution and heredity
- Widely used for structure discovery, functional form discovery, and other data mining and optimization tasks
Genetic Algorithms/Programming
- Representation:
  - Vector of bit strings or real numbers for GA
  - Complex data structures (trees, arrays) for GP
- Genetic transformation:
  - Reproduction
  - Crossover
  - Mutation
- IR applications: [Gordon’88, ’91], [Chen’98a, ’98b], [Pathak’00], etc.
Essential GP Components

| Component | Meaning |
|---|---|
| Terminals | Leaf nodes in the tree structure (e.g., x, y). |
| Functions | Non-leaf nodes used to combine the leaf nodes. Commonly numerical operations: +, -, *, /, log, sqrt. |
| Fitness function | The objective function GP aims to optimize. |
| Reproduction | A genetic operator that copies the individuals with the best fitness values directly into the population of the next generation, without going through the crossover operation. |
| Crossover | A genetic operator that aims to improve the diversity as well as the genetic fitness of the population. See the next slide for details. |
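As an illustration of terminals and functions, the following sketch represents a candidate ranking function as a nested tuple and evaluates it on one set of term statistics. The tuple encoding, the `evaluate` helper, the protected division and logarithm, and the sample values for tf, df, and N are all assumptions made for this example, not details taken from ARRANGER.

```python
import math

# Function set: non-leaf nodes. Division and log are "protected" so that
# any randomly generated tree can be evaluated (an assumption of this sketch).
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b if b != 0 else 1.0,
    "log": lambda a: math.log(abs(a)) if a != 0 else 0.0,
    "sqrt": lambda a: math.sqrt(abs(a)),
}


def evaluate(tree, terminals):
    """Evaluate an expression tree.

    A tree is either a terminal name (str) or a tuple
    (function_name, child, ...). `terminals` maps names to numbers.
    """
    if isinstance(tree, str):
        return terminals[tree]
    op, *children = tree
    return FUNCTIONS[op](*(evaluate(child, terminals) for child in children))


# tf * (N / df), one of the expressions that appears on the crossover slide.
tree = ("*", "tf", ("/", "N", "df"))
stats = {"tf": 3.0, "df": 10.0, "N": 1000.0}  # hypothetical statistics
score = evaluate(tree, stats)
```

Keeping trees as plain tuples makes them immutable and easy to copy, which is convenient for the genetic operators that recombine them.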
Example of Crossover in GP
[Figure: two parent trees (Parent1, Parent2) in generation N are crossed over to produce two child trees (Child1, Child2) in generation N+1. Expression labels appearing in the figure: tf*(tf+df), tf*(N/df), N/df+df, (tf*df)+df.]
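Subtree crossover can be sketched as swapping one subtree of each parent. The example below uses nested tuples for trees and fixed crossover points so the result is reproducible; in practice the points are chosen at random. The helper names and the choice of parents are illustrative assumptions and are not claimed to match the figure exactly.

```python
def get_subtree(tree, path):
    """Follow a path of child indexes from the root and return that subtree."""
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new_subtree):
    """Return a copy of `tree` with the subtree at `path` replaced."""
    if not path:
        return new_subtree
    index = path[0]
    replaced = replace_subtree(tree[index], path[1:], new_subtree)
    return tree[:index] + (replaced,) + tree[index + 1:]


def crossover(parent1, path1, parent2, path2):
    """Swap the subtrees found at path1 in parent1 and path2 in parent2."""
    sub1 = get_subtree(parent1, path1)
    sub2 = get_subtree(parent2, path2)
    return (replace_subtree(parent1, path1, sub2),
            replace_subtree(parent2, path2, sub1))


# Trees are (function, child, child); index 0 is the function name.
parent1 = ("*", "tf", ("+", "tf", "df"))   # tf * (tf + df)
parent2 = ("+", ("/", "N", "df"), "df")    # (N / df) + df

# Swap the right child of parent1 with the left child of parent2.
child1, child2 = crossover(parent1, (2,), parent2, (1,))
```

Here `child1` is tf * (N / df) and `child2` is (tf + df) + df: each child inherits structure from both parents, and the parents themselves are left unchanged.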
The ARRANGER Engine
1. Split the training data into training and validation sets.
2. Generate an initial population of random “ranking functions”.
3. Evaluate the fitness of each “ranking function” in the population and record the 10 best ones.
4. If the stopping criterion is not met, generate the next generation of the population by genetic transformation and go to Step 3.
5. Validate the recorded best “ranking functions” and select the best one as the RF.
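[Figure: ARRANGER flowchart with the boxes Start, Initialize Population, Evaluate Fitness, Apply Crossover, Stop?, Validate and Output, End. The figure also shows a sample training ranking (Order, Doc., Relevance: 1 A 1, 2 B 0, 3 C 0, 4 D 1, 5 E 0, 6 F 1, 7 G 1) and a population of individuals numbered 1, 2, 3, ..., 48, 49, 50 with the values 0.4, 0.3, 0.4, 0.8, 0.3, 0.4.]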
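The five steps can be sketched as a generic evolutionary loop. This is not ARRANGER's code: the function `discover`, its parameters, the fixed generation budget used as the stopping criterion, and the toy problem (evolving a number toward a target instead of evolving a ranking function tree) are all assumptions chosen to keep the example short and runnable. Step 1 is represented by passing separate training and validation fitness functions.

```python
import random


def discover(random_individual, fitness_train, fitness_valid, transform,
             pop_size=50, generations=30, keep=10, rng=None):
    """Generic discovery loop following the five steps listed above."""
    rng = rng or random.Random(0)
    # Step 2: initial population of random individuals.
    population = [random_individual(rng) for _ in range(pop_size)]
    recorded = []
    # Step 4: the stopping criterion here is a fixed number of generations.
    for _ in range(generations):
        # Step 3: evaluate fitness on training data, record the best ones.
        ranked = sorted(population, key=fitness_train, reverse=True)
        recorded.extend(ranked[:keep])
        # Genetic transformation: copy the best individuals unchanged
        # (reproduction) and fill the rest by recombining fitter parents.
        parents = ranked[: pop_size // 2]
        children = [transform(rng.choice(parents), rng.choice(parents), rng)
                    for _ in range(pop_size - keep)]
        population = ranked[:keep] + children
    # Step 5: choose among the recorded candidates on validation data.
    return max(recorded, key=fitness_valid)


# Toy usage: individuals are floats and the "fitness" peaks at 3.0.
target = 3.0
best = discover(
    random_individual=lambda rng: rng.uniform(-10.0, 10.0),
    fitness_train=lambda x: -(x - target) ** 2,
    fitness_valid=lambda x: -(x - target) ** 2,
    transform=lambda a, b, rng: (a + b) / 2.0 + rng.gauss(0.0, 0.5),
)
```

Selecting the final candidate on validation data rather than training data is what guards against a function that merely memorizes the training queries.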
Blind Feedback
- Automatically adds more terms to a user’s query to enhance the performance of search engines, by assuming the top-ranked docs are relevant
- Some examples:
  - Rocchio (performs best in our experiment)
  - Dec-Hi
  - Kullback-Leibler Divergence (KLD)
  - Chi-Square
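A minimal sketch of Rocchio-style blind feedback follows. It treats the top-ranked documents as relevant, adds a weighted centroid of their term vectors to the query, and keeps only a few new terms. The slides give no parameter settings, so `alpha`, `beta`, `n_expand`, and the function name `rocchio_expand` are assumptions, and the negative (non-relevant) component of the full Rocchio formula is omitted because blind feedback has no non-relevant judgments.

```python
from collections import defaultdict


def rocchio_expand(query, top_docs, alpha=1.0, beta=0.5, n_expand=1):
    """Expand a weighted query using the centroid of the top-ranked docs.

    query:    dict term -> weight
    top_docs: list of dicts term -> weight, assumed relevant
    """
    # Centroid of the pseudo-relevant documents.
    centroid = defaultdict(float)
    for doc in top_docs:
        for term, weight in doc.items():
            centroid[term] += weight / len(top_docs)

    # Reweight the original query terms.
    new_query = {term: alpha * weight + beta * centroid.get(term, 0.0)
                 for term, weight in query.items()}

    # Add only the n_expand strongest terms that are not already in the query.
    candidates = sorted((t for t in centroid if t not in query),
                        key=lambda t: centroid[t], reverse=True)
    for term in candidates[:n_expand]:
        new_query[term] = beta * centroid[term]
    return new_query


query = {"robust": 1.0}
top_docs = [
    {"robust": 1.0, "retrieval": 2.0},
    {"retrieval": 2.0, "feedback": 1.0},
]
expanded = rocchio_expand(query, top_docs)
```

With these inputs the query gains the term "retrieval", the strongest term in the top documents, while the weaker candidate "feedback" is dropped by the `n_expand` cutoff.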
An Integrated Model
[Figure: two-stage pipeline with the boxes Ranking Tuning and Blind Feedback. Labels: Multiple user queries with relevance information; New Ranking Function; User Queries; New Search Results.]
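The two-stage idea can be sketched as: rank with the (tuned) ranking function, assume the top documents are relevant, expand the query, then rank again with the same function. Everything concrete below is an assumption for illustration: `tf_score` is a trivial stand-in for a discovered ranking function, expansion picks the most frequent new terms rather than using Rocchio weights, and the four documents are invented.

```python
from collections import Counter


def tf_score(query, doc_terms):
    """Stand-in ranking function: summed term frequency of the query terms."""
    return sum(doc_terms.count(term) for term in query)


def rank(query, docs, score):
    """Return document ids sorted by descending score (ties keep input order)."""
    return sorted(docs, key=lambda doc_id: score(query, docs[doc_id]),
                  reverse=True)


def two_stage_search(query, docs, score, top_k=2, n_expand=1):
    """First-pass ranking, blind feedback expansion, then a second ranking."""
    first_pass = rank(query, docs, score)
    # Blind feedback: treat the top_k documents as relevant.
    counts = Counter()
    for doc_id in first_pass[:top_k]:
        counts.update(t for t in docs[doc_id] if t not in query)
    expansion = [term for term, _ in counts.most_common(n_expand)]
    expanded_query = list(query) + expansion
    return rank(expanded_query, docs, score), expanded_query


docs = {
    "d1": ["robust", "retrieval", "feedback"],
    "d2": ["robust", "feedback", "feedback"],
    "d3": ["feedback", "ranking"],
    "d4": ["cooking", "recipes"],
}
results, expanded_query = two_stage_search(["robust"], docs, tf_score)
```

In this toy collection, d3 does not contain the original query term but moves ahead of d4 after expansion, which is the effect blind feedback is meant to produce. The quality of the first-pass ranking determines which documents feed the expansion, which is why tuning comes before feedback.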
Experiment Setting
- Data: 2003 Robust Track data (from TREC 6, 7, 8)
- Training queries: 150 old queries from TREC 6, 7, 8
- Test queries: 50 very hard queries + 50 new queries
The Results on 150 Training Queries

| Run | Desc | Short |
|---|---|---|
| Okapi without BF (baseline) | 0.1880 | 0.2194 |
| Okapi with BF | 0.2076 (+10.4%) | 0.2385 (+8.7%) |
| RF 1 without BF | 0.2173 (+15.6%) | 0.2394 (+9.1%) |
| RF 1 with BF | 0.2422 (+28.8%) | 0.2661 (+21.3%) |

Percentages are relative improvements over the baseline.
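The percentages in the training results can be reproduced from the raw scores as relative improvement over the Okapi baseline without blind feedback. The sketch below only redoes that arithmetic; the dictionary layout and the helper name `relative_gain` are assumptions.

```python
# Scores copied from the training results (Desc and Short query types).
baseline = {"Desc": 0.1880, "Short": 0.2194}  # Okapi without BF
runs = {
    "Okapi with BF": {"Desc": 0.2076, "Short": 0.2385},
    "RF 1 without BF": {"Desc": 0.2173, "Short": 0.2394},
    "RF 1 with BF": {"Desc": 0.2422, "Short": 0.2661},
}


def relative_gain(value, base):
    """Relative improvement over the baseline, in percent, to one decimal."""
    return round((value / base - 1.0) * 100.0, 1)


gains = {
    name: {qtype: relative_gain(value, baseline[qtype])
           for qtype, value in scores.items()}
    for name, scores in runs.items()
}

for name, by_type in gains.items():
    print(name, by_type)
```

On both query types the combined run (RF 1 with BF) gains more than either the tuned function alone or blind feedback alone.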
Results on Test Queries (1)
Results on Test Queries (2)
Conclusions
- Blind feedback works well on GP-trained queries.
- The ranking function combined with blind feedback works with new queries.
- The two-stage model responds differently to Desc queries (slightly better) and Long queries.
Thank You!
Q&A?