
Page 1

Using Pertainyms to Improve Passage Retrieval for Questions Requesting Information About a Location

Mark A. Greenwood

Natural Language Processing Group

Department of Computer Science

University of Sheffield, UK

Page 2

July 29th 2004 IR4QA: Information Retrieval for Question Answering

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 3

Introduction to the Problem

• When performing IR as a filter between a large text collection and a QA system we would like to:
  - Maximize the number of questions for which at least one relevant document is retrieved.
  - Minimize the noise in the text passed to the QA system by returning a small set of documents.

• There are a number of ways in which we can attempt to solve this problem:
  - Novel indexing methods, e.g. indexing relationships, named entities…
  - Weighting/passaging schemes specifically tailored to QA.
  - Query formulation/expansion specific to the needs of QA.

• The approach outlined here is concerned with the selective query expansion of questions requesting information about a location.

Page 4

Introduction to the Problem

• Expanding questions to form IR queries can be approached from two directions:
  - Form the query from the question words and related terms or concepts (i.e. synonyms or morphological variants).
  - Form the query from the question words and terms likely to co-occur with instances of the expected answer type.

• Both approaches have been shown to be useful but both also have their own problems:
  - (Monz, 2003) expanded questions for which the expected answer type was a measurement by including the measurement units in the query. While this method improves the retrieved documents, not all questions have terms which can be used for expansion in this way.
  - (Hovy, 2000) expanded the question using synonyms from WordNet, while noting that this approach has a number of associated problems: word sense disambiguation and overly common words.

Page 5

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 6

Questions & Relevant Documents

• Many questions ask for information about a given location:
  - Q1447: What is the capital of Syria?
  - Q1517: What is the state bird of Alaska?

• Unfortunately, in many cases while the question contains the location as a noun, the relevant documents refer to the location using the associated adjective:
  - “It was along this road, too, that we saw our first beavers, moose, trumpeter swans, willow ptarmigans (the quail-like Alaskan state bird).”
  - “The talks will start either in the Turkish capital, Ankara, or in the Syrian one, Damascus, the Middle East News Agency said.”

• IR engines are going to struggle to retrieve relevant documents if the question words do not appear in the answer texts.
  - Even IR systems which use stemming will struggle, as few location nouns and adjectives stem to the same term. A rare exception is Philippines and Philippine, which both stem to philippin (using the Porter stemmer).
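
As a quick illustration of this point (my own sketch, not from the original slides; NLTK's Porter implementation includes minor extensions to the published algorithm but behaves the same on these words):

```python
from nltk.stem import PorterStemmer  # pip install nltk

stemmer = PorterStemmer()

# Most location noun/adjective pairs stem to different terms;
# Philippines/Philippine is one of the few pairs that collapse together.
pairs = [("Syria", "Syrian"), ("Alaska", "Alaskan"), ("Philippines", "Philippine")]
for noun, adjective in pairs:
    print(f"{noun} -> {stemmer.stem(noun.lower())}, "
          f"{adjective} -> {stemmer.stem(adjective.lower())}")

# Syria -> syria, Syrian -> syrian
# Alaska -> alaska, Alaskan -> alaskan
# Philippines -> philippin, Philippine -> philippin
```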

Page 7

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 8

Pertainym Relationships

• One of the many types of relationship in WordNet (Miller, 1995) is the pertainym relation. Pertainym relationships link adjectives (and adverbs) with the nouns they relate to:

  - abdominal → abdomen
  - Alaskan → Alaska
  - conical → cone
  - impossibly → impossible
  - Syrian → Syria

• Clearly such relationships exist for many varied terms. For this study we will focus solely on those concerning locations.

• Extracting these relationships from WordNet allows us to also determine the inverse mapping:

  - Alaska → Alaskan
  - Syria → Syrian
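
A minimal sketch (mine, not the author's code) of how these mappings can be read out of WordNet via NLTK; lemma capitalisation in WordNet varies, so keys are lower-cased here:

```python
from collections import defaultdict
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def pertainym_nouns(adjective):
    """Nouns that an adjective pertains to, e.g. 'syrian' -> ['syria']."""
    return sorted({pert.name().lower()
                   for lemma in wn.lemmas(adjective, pos=wn.ADJ)
                   for pert in lemma.pertainyms()})

def inverse_pertainym_map():
    """Invert the relation: noun -> set of adjectives pertaining to it."""
    inverse = defaultdict(set)
    for synset in wn.all_synsets(pos=wn.ADJ):
        for lemma in synset.lemmas():
            for pert in lemma.pertainyms():
                inverse[pert.name().lower()].add(lemma.name().lower())
    return inverse

print(pertainym_nouns("syrian"))                  # e.g. ['syria']
print(sorted(inverse_pertainym_map()["alaska"]))  # e.g. ['alaskan']
```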

Page 9

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 10

Evaluation Metrics

• To show an improvement between two approaches we require an accepted evaluation metric.

• Coverage (Roberts, 2004) is used to evaluate the IR component:
  - Coverage gives the proportion of the question set for which a correct answer can be found within the top n documents retrieved by IR system S for each question.
  - A document is deemed to contain a correct answer only if it contains a known answer string and the document itself has been judged to contain the answer (this is known as strict evaluation).

• MRR (Voorhees, 2001) is used to evaluate the QA component:
  - Each question's score is the reciprocal of the rank of the first correct answer (up to rank 5); MRR is simply the mean of these scores.

• For full rigorous mathematical definitions see the paper and referenced works.
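
As a rough sketch of the two metrics (my own code and data layout, not the paper's; see the cited works for the precise definitions):

```python
def coverage(first_relevant_rank, n):
    """Proportion of questions with a relevant document in the top n.
    first_relevant_rank: question id -> 1-based rank of the first
    relevant document, or None if none was retrieved."""
    hits = sum(1 for r in first_relevant_rank.values()
               if r is not None and r <= n)
    return hits / len(first_relevant_rank)

def mrr(first_answer_rank, cutoff=5):
    """Mean reciprocal rank; answers below the cutoff score zero."""
    scores = [1.0 / r if r is not None and r <= cutoff else 0.0
              for r in first_answer_rank.values()]
    return sum(scores) / len(scores)

# Tiny worked example with three questions:
ranks = {"Q1447": 2, "Q1507": None, "Q1517": 1}
print(coverage(ranks, n=5))  # 2/3 ≈ 0.667
print(mrr(ranks))            # (1/2 + 0 + 1) / 3 = 0.5
```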

Page 11

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Question/Document Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 12

Question Test Set

• Two separate question sets were compiled from the questions used in the TREC 11 and 12 evaluations.
  - The first set consists of 57 questions containing the noun form of a country or state for which a relationship to an adjective can be found in WordNet. Examples are:
    • Q1507: “What is the national anthem in England?”
    • Q1585: “What is the chief religion for Peru?”

  - The second set consists of 31 questions which contain a country or state adjective and for which a pertainym relationship exists in WordNet. Examples are:
    • Q1710: “What are the colors of the Italian flag?”
    • Q2313: “What does an English stone equal?”

• Neither set contains questions in which the location is part of a compound term, for example:
  - Q1753: “When was the Vietnam Veterans Memorial in Washington, D.C. built?”

Page 13

Document Collection

• All these experiments were carried out using the AQUAINT collection:
  - Approximately 1,033,000 documents in 3 gigabytes of text from:
    • AP newswire, 1998-2000
    • New York Times newswire, 1998-2000
    • Xinhua News Agency, 1996-2000

• The collection is indexed using the Lucene search engine:
  - Each document is split into passages before indexing.
  - Each passage has stop words removed and all remaining words are stemmed using the Porter stemmer (Porter, 1980).
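
The slides don't include the indexing code; as a hedged sketch, the per-passage analysis step might look like the following, with NLTK stand-ins for Lucene's analysis chain (the tokeniser and stop word list here are illustrative choices):

```python
import re
from nltk.corpus import stopwords     # requires nltk.download('stopwords')
from nltk.stem import PorterStemmer

STOP_WORDS = set(stopwords.words("english"))
stemmer = PorterStemmer()

def analyse_passage(text):
    """Lowercase, tokenise, drop stop words, Porter-stem the rest."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stemmer.stem(t) for t in tokens if t not in STOP_WORDS]

print(analyse_passage("What is the capital of Syria?"))
# ['capit', 'syria']
```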

Page 14

Possible Expansion Methods

• There are two main ways of combining multiple versions of a term in an IR query:
  - or expansion, i.e. A or B: the standard boolean operator, which will match documents containing either or both terms. Documents containing both A and B will rank higher than those containing a single term.
  - alt expansion, i.e. alt(A, B): this operator treats the terms as alternative versions of the same term. The terms are all given the same score (that of the first term), so documents containing a single instance of either term rank in the same way, while documents containing multiple instances of either term still rank higher.

• We conducted experiments using both operators to determine which is better suited to the current task.
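
A minimal sketch of building the two query variants (my own; the alt(...) syntax mirrors the paper's notation rather than Lucene's actual query language, where the closest modern equivalent is a SynonymQuery, which scores several terms as if they were one):

```python
def or_expand(terms, noun, adjective):
    """Boolean expansion: documents matching both forms rank higher."""
    return [f"({noun} OR {adjective})" if t == noun else t for t in terms]

def alt_expand(terms, noun, adjective):
    """Alternative-form expansion: either form contributes one score."""
    return [f"alt({noun}, {adjective})" if t == noun else t for t in terms]

question = ["capital", "Syria"]
print(" ".join(or_expand(question, "Syria", "Syrian")))
# capital (Syria OR Syrian)
print(" ".join(alt_expand(question, "Syria", "Syrian")))
# capital alt(Syria, Syrian)
```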

Page 15

Noun Expansion

• Three runs using the first test set were carried out, in which the IR queries were:
  - The unaltered questions (baseline).
  - The questions with the nouns expanded using the or operator.
  - The questions with the nouns expanded using the alt operator.

• The alt expansion was as good as or better than the baseline at all ranks:
  - Significantly better at higher ranks.

• The or expansion was significantly worse than the baseline at all ranks:
  - 99% confidence using the paired t-test at all but rank 1, which was 95%.

[Chart: coverage against document rank for the plain questions, or expansion, and alt expansion runs]
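
For reference, a paired t-test of this kind can be run in a few lines with SciPy (a sketch with synthetic per-question scores, not the paper's data):

```python
from scipy import stats

# Synthetic per-question scores for two retrieval runs (illustrative only).
baseline = [1.0, 0.0, 0.5, 0.0, 1.0, 0.2, 0.0, 1.0]
alt_run  = [1.0, 0.5, 0.5, 0.33, 1.0, 0.2, 0.5, 1.0]

t_stat, p_value = stats.ttest_rel(alt_run, baseline)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
# A p-value below 0.01 would correspond to the 99% confidence level
# quoted on this slide.
```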

Page 16

Adjective Expansion

• Three runs using the second test set were carried out, in which the IR queries were:
  - The unaltered questions.
  - The questions with the adjectives expanded using the or operator.
  - The questions with the adjectives expanded using the alt operator.

• As with the noun experiments, expanding using the or operator gives significantly worse results than the baseline.

• Expanding using the alt operator also gives slightly worse results than the baseline, although the differences are not statistically significant.

[Chart: coverage against document rank for the plain questions, or expansion, and alt expansion runs]

Page 17

Noun Replacement

• The previous experiments seem to show that:
  - Expanding nouns with their adjective forms improves IR performance.
  - Expanding adjectives with their noun forms decreases IR performance.
  - Hence it may be that the adjectives are solely responsible for good IR performance.

• A third experiment was carried out to see the effect of simply replacing nouns with their adjective forms:
  - It should be clear that simply replacing the nouns decreases the IR performance (the difference is only statistically significant at rank 200).

[Chart: coverage against document rank for the plain questions and noun replacement runs]

Page 18

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Question/Document Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 19

Effect on a QA System

• Any evaluation of IR techniques, such as the one just discussed, should also look at their effect on the performance of QA systems:
  - All QA systems are different, making the evaluation difficult.

• An experiment using a relatively simple QA system (Greenwood, 2004) showed:
  - MRR of 0.1947 when using 30 passages retrieved using the questions as IR queries.
  - MRR of 0.1988 when using 30 passages retrieved using the questions in which the nouns had been expanded using the alt operator.

• The results show a very small improvement; two possible reasons for this could be:
  - The very small test set.
  - The QA system has not been updated to make use of the knowledge that certain adjectives are alternate forms of nouns and should be treated as such in any scoring function.

[Chart: MRR for the plain and expanded question runs]

Page 20

Outline of Talk

• Introduction to the Problem

• Example Questions and Their Relevant Documents

• Pertainym Relationships

• Evaluation Metrics

• Experimental Results
  - Question/Document Test Set
  - Possible Expansion Methods
  - Noun Expansion
  - Adjective Expansion
  - Noun Replacement

• Effect of Expansion on a QA System

• Conclusions and Future Work

Page 21

Conclusions

• The results of the experiments show that:
  - The original premise appears to be true: when a location noun appears in a question, the adjective form tends to appear with the answer instead of the noun form.
  - The inverse of the premise does not seem to follow:
    • The results were not significant and require further investigation.
  - Using the alt operator gives much better results than using the or operator:
    • This is possibly only true for the situation discussed here and may not hold for query expansion in general.
  - While these experiments have shown an increase in the coverage of the retrieved documents, due consideration must also be given to updating answer extraction components to take full advantage of this.

Page 22

Future Work

• Future work in this area should include:
  - Building a larger test set to confirm the results obtained in this paper.
  - Determining whether the results apply to expanding all nouns in a question or only the location nouns. If all nouns were being expanded then the following are example expansions:
    • abdomen → abdominal
    • volcano → volcanic
  - Investigating whether the pertainym relationships WordNet also contains between adverbs and their stem adjectives could provide similar improvements in performance:
    • abnormally → abnormal
  - Updating answer extraction components to take full advantage of improvements in IR performance.
  - Investigating whether there is a linguistic reason for the apparent asymmetry in the results of the reported experiments.

Page 23

Any Questions?

Copies of these slides can be found at:

http://www.dcs.shef.ac.uk/~mark/phd/work/

Page 24

Bibliography

• Mark A. Greenwood and Horacio Saggion. A Pattern Based Approach to Answering Factoid, List and Definition Questions. In Proceedings of the 7th RIAO Conference (RIAO 2004), Avignon, France, 26-28 April, 2004.

• George A. Miller. WordNet: A Lexical Database. Communications of the ACM, 38(11):39-41, Nov. 1995.

• Christof Monz. From Document Retrieval to Question Answering. PhD thesis, Institute for Logic, Language and Computation, University of Amsterdam, 2003. Available, April 2004, from http://www.illc.uva.nl/Publications/Dissertations/DS-2003-04.text.pdf.

• Martin Porter. An Algorithm for Suffix Stripping. Program, 14(3):130-137, 1980.

• Ian Roberts and Robert Gaizauskas. Evaluating Passage Retrieval Approaches for Question Answering. In Proceedings of 26th European Conference on Information Retrieval, 2004.

• Ellen M. Voorhees. Overview of the TREC 2001 Question Answering Track. In Proceedings of the 10th Text REtrieval Conference, 2001.