
Question Answering
Approaches towards better human question answering

Tomasz Jurczyk, Emory NLP Group Meeting
February 16th, 2015

Information Overload
“Getting information off the Internet is like taking a drink from a fire hydrant.”
~ Mitchell Kapor

Question Answering
● Intersection of Information Retrieval and Natural Language Processing
● Queries a structured database of knowledge (a knowledge base)
● Able to pull an answer from an unstructured collection of natural language documents
● Variety of question types (open-domain, closed-domain, factual, etc.)

Some challenges in QA
● Question types
● Processing & context
● Data sources & answer extraction
● Specific needs for QA systems (real-time question answering, multilingual, etc.)
● Information clustering

Existing projects
● Watson (IBM)
● START Natural Language Question Answering System (MIT)
● Google Search

Watson
● Won Jeopardy! on February 16, 2011

START

Google Search

Mapping Dependencies Trees
● Evaluating the distance between a question and an answer candidate
● Distance is calculated by an approximate tree matching algorithm
  ○ Distance is the cost of the sequence of operations (add/delete/modify) needed to transform one tree into the other

Vasin Punyakanok, Dan Roth, Wen-tau Yih, Mapping Dependencies Trees: An Application to Question Answering
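The edit-cost idea can be sketched in a few lines. This is a minimal illustration over (label, children) tuples, not the paper's approximate tree-matching algorithm: adding or deleting a whole subtree costs its node count, relabeling a node costs 1, and the two child sequences are aligned with a standard edit-distance DP.

```python
def size(tree):
    """Number of nodes in a (label, children) tree."""
    label, children = tree
    return 1 + sum(size(c) for c in children)

def tree_distance(a, b):
    """Cost of transforming tree a into tree b (simplified sketch)."""
    la, ca = a
    lb, cb = b
    relabel = 0 if la == lb else 1          # "modify" operation
    return relabel + forest_distance(ca, cb)

def forest_distance(xs, ys):
    """Edit-distance DP over two child sequences."""
    m, n = len(xs), len(ys)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(xs[i - 1])      # delete subtree
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(ys[j - 1])      # add subtree
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(xs[i - 1]),                          # delete
                d[i][j - 1] + size(ys[j - 1]),                          # add
                d[i - 1][j - 1] + tree_distance(xs[i - 1], ys[j - 1]),  # modify
            )
    return d[m][n]

# Identical trees have distance 0; one mismatched label costs 1.
q = ("is", [("car", [("fastest", [])]), ("world", [])])
c = ("is", [("car", [("fastest", [])]), ("world", [])])
```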

Bag of words?
Q: What is the fastest car in the world?

CA1: The Jaguar XJ220 is the dearest (415000 pounds), fastest (217mph) and most sought after car in the world.
CA2: (...) will stretch Volkswagen’s lead in the world’s fastest growing vehicle market.
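The failure mode this slide points at is easy to reproduce: under a bag-of-words view, the incorrect candidate CA2 still shares many words with the question. A quick sketch, assuming a simple lowercase letters-only tokenizer:

```python
import re

def tokens(text):
    """Crude tokenizer: lowercase runs of letters."""
    return set(re.findall(r"[a-z]+", text.lower()))

def overlap(question, candidate):
    """Bag-of-words score: count of shared word types."""
    return len(tokens(question) & tokens(candidate))

q = "What is the fastest car in the world?"
ca1 = ("The Jaguar XJ220 is the dearest (415000 pounds), fastest (217mph) "
       "and most sought after car in the world.")
ca2 = "(...) will stretch Volkswagen's lead in the world's fastest growing vehicle market."

# CA2 shares "the", "in", "world", "fastest" with the question, so plain
# word overlap gives the wrong candidate a competitive score even though
# it says nothing about the fastest car — hence the move to structure.
```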

Dependency trees matching distance

Measurements
● MAP (Mean Average Precision)
  ○ The mean of the average precision scores over a set of queries
● MRR (Mean Reciprocal Rank)
  ○ The mean of the reciprocal ranks of the first correct result over a sample of queries
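Both metrics can be stated precisely in a few lines. A sketch over binary relevance judgments (1 = correct answer) listed in ranked order:

```python
def average_precision(relevance):
    """relevance: list of 0/1 judgments in ranked order for one query."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank        # precision at this correct result
    return total / hits if hits else 0.0

def mean_average_precision(queries):
    """MAP: mean of per-query average precision."""
    return sum(average_precision(r) for r in queries) / len(queries)

def mean_reciprocal_rank(queries):
    """MRR: mean of 1/rank of the first correct result per query."""
    def rr(relevance):
        for rank, rel in enumerate(relevance, start=1):
            if rel:
                return 1.0 / rank
        return 0.0
    return sum(rr(r) for r in queries) / len(queries)
```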

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494

Passage Retrieval Using Dependency Relations
● Fuzzy relation matching based on statistical models
● Two methods for learning relation mapping scores from past QA pairs:
  1. Mutual information
  2. Expectation maximization

Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, Tat-Seng Chua, Question Answering Passage Retrieval Using Dependency Relations
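The mutual-information option can be illustrated with pointwise mutual information over relation co-occurrence counts. The counts below are made up for the example, and the paper actually scores relation *paths*; this only sketches the co-occurrence-statistics idea for single dependency relations:

```python
import math
from collections import Counter

# Hypothetical counts of question/answer dependency relations that
# co-occurred on aligned paths in past QA pairs (assumed toy data).
pair_counts = Counter({
    ("nsubj", "nsubj"): 30, ("nsubj", "nsubjpass"): 10,
    ("dobj", "dobj"): 25, ("dobj", "nsubjpass"): 15,
})

total = sum(pair_counts.values())
q_counts, a_counts = Counter(), Counter()
for (q_rel, a_rel), c in pair_counts.items():
    q_counts[q_rel] += c
    a_counts[a_rel] += c

def pmi(q_rel, a_rel):
    """Pointwise mutual information of a question/answer relation pair."""
    p_joint = pair_counts[(q_rel, a_rel)] / total
    p_q = q_counts[q_rel] / total
    p_a = a_counts[a_rel] / total
    return math.log(p_joint / (p_q * p_a))

# Identical relations end up with a higher mapping score than
# mismatched ones, which is what fuzzy relation matching exploits.
```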

Extracting and Pairing Relation Paths

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526

Jeopardy Model - A Quasi-Synchronous Grammar for QA
● Used a probabilistic quasi-synchronous grammar
● Parameterized by mixtures of a robust non-lexical syntax/alignment model
  ○ 3 adjustments in their model:
    ■ Bayes’ rule
    ■ Labeled, structured dependency trees
    ■ Alignment between question and answer words

Mengqiu Wang, Noah A. Smith, Teruko Mitamura, What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA

Alignment relations

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526
Jeopardy Model (2007)       0.603  0.685

Tree Edit Models
● Tree edit models for representing sequences of tree transformations
● Similar to Mapping Dependencies Trees, but more advanced
  ○ Used 6 main operations that combine move, delete, merge, relabel, etc.
● Greedy best-first search used to find sensible edit sequences (using a tree kernel heuristic)
● Defined constraints on the search space
● Trained a logistic regression classification model
  ○ 33 features consisting of the number/type of edits, node types, etc.

Michael Heilman, Noah A. Smith, Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions
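The search component can be illustrated generically. The sketch below runs a heuristic-guided best-first search, as the model does, but over flat label sequences with a crude mismatch heuristic; the paper searches over dependency-tree edits with a tree-kernel heuristic, so everything here beyond the search loop itself is an assumption for illustration:

```python
import heapq

def heuristic(state, goal):
    """Rough distance estimate: labels appearing in only one side."""
    return len(set(state) ^ set(goal))

def find_edit_sequence(source, target):
    """Best-first search for insert/delete edits turning source into target."""
    source, target = tuple(source), tuple(target)
    frontier = [(heuristic(source, target), source, [])]
    seen = {source}
    while frontier:
        _, state, edits = heapq.heappop(frontier)
        if state == target:
            return edits
        successors = []
        for i in range(len(state)):                       # delete any label
            successors.append((state[:i] + state[i + 1:], ("delete", state[i])))
        for i in range(len(state) + 1):                   # insert a goal label
            for label in set(target):
                successors.append(
                    (state[:i] + (label,) + state[i:], ("insert", label)))
        for nxt, edit in successors:
            # Bound the state length to keep the search space finite.
            if nxt not in seen and len(nxt) <= len(source) + len(target):
                seen.add(nxt)
                heapq.heappush(
                    frontier, (heuristic(nxt, target), nxt, edits + [edit]))
    return None
```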

A Tree Edit Sequence

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526
Jeopardy Model (2007)       0.603  0.685
Tree Edit Models (2010)     0.609  0.692

Probabilistic Tree-Edit Models with Structured Latent Variables

Recognizing Textual Entailment example:
Text: Gabriel Garcia Marquez is a novelist and winner of the Nobel prize for literature.
Hypothesis: Gabriel Garcia Marquez won the Nobel for Literature.

Mengqiu Wang, Christopher D. Manning, Probabilistic Tree-Edit Models with Structured Latent Variables for Textual Entailment and Question Answering

Text Edits Technique
● Similar idea to the previous approach, with text edits
  ○ 45 edit operations (12 delete, 12 insert, 21 substitute)
● Designed a finite-state machine (each edit operation is mapped to a unique state, and an edit sequence is mapped to a transition sequence)
● The probability of an edit sequence is calculated from a mix of features (word-matching features, tree-structure features)
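The finite-state view can be made concrete: each edit-operation type is a state, and an edit sequence scores as the product of transition probabilities. The three operation types and every probability below are invented for the sketch; the actual model has 45 operation states and derives its probabilities from features:

```python
# Made-up transition probabilities between edit-operation states.
transition_prob = {
    ("START", "match"): 0.7, ("START", "substitute"): 0.2, ("START", "delete"): 0.1,
    ("match", "match"): 0.6, ("match", "substitute"): 0.2, ("match", "delete"): 0.2,
    ("substitute", "match"): 0.5, ("substitute", "substitute"): 0.3, ("substitute", "delete"): 0.2,
    ("delete", "match"): 0.5, ("delete", "substitute"): 0.2, ("delete", "delete"): 0.3,
}

def sequence_probability(edits):
    """Probability of an edit sequence as a walk through the FSM."""
    prob, state = 1.0, "START"
    for op in edits:
        prob *= transition_prob[(state, op)]
        state = op
    return prob
```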

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526
Jeopardy Model (2007)       0.603  0.685
Tree Edit Models (2010)     0.609  0.692
Probabilistic TEM (2010)    0.595  0.695

Answer Extraction as Sequence Tagging with Tree Edit Distance
● Extended work of Tree Edit Models
  ○ Added synonyms, entailment and causing verbs, parts-of/member-of entities

Xuchen Yao, Benjamin Van Durme, Chris Callison-Burch, Peter Clark, Answer Extraction as Sequence Tagging with Tree Edit Distance

Answer extraction using Conditional Random Fields
● Sequence tagging with three states: start/middle/end
● Features used by the CRF:
  ○ Chunking (“kind of silly” is unlikely to be an answer, while “in 90 days” is)
  ○ Question type (“how many” questions expect numerical answer types)
  ○ Edit script (during sequencing, words are deleted/renamed, and they could be an answer)
  ○ Alignment distance (a candidate answer often appears close to an aligned word)
● Then a voting mechanism is applied to find an answer
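The final two steps, span extraction from start/middle/end tags and voting across candidate sentences, can be sketched as follows. The tagged example sentences are assumed here; the real tags would come from the trained CRF:

```python
from collections import Counter

def extract_span(words, tags):
    """Join the words tagged start/middle/end into one answer string."""
    span = [w for w, t in zip(words, tags) if t in ("start", "middle", "end")]
    return " ".join(span) if span else None

def vote(tagged_sentences):
    """Each candidate sentence votes for its extracted answer span."""
    votes = Counter()
    for words, tags in tagged_sentences:
        answer = extract_span(words, tags)
        if answer:
            votes[answer] += 1
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical CRF output for "How many days does the race last?"
tagged = [
    ("the race lasts 90 days".split(),
     ["other", "other", "other", "start", "end"]),
    ("it takes 90 days to finish".split(),
     ["other", "other", "start", "end", "other", "other"]),
    ("kind of silly answer here".split(), ["other"] * 5),
]
```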

Example Prediction Trace

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526
Jeopardy Model (2007)       0.603  0.685
Tree Edit Models (2010)     0.609  0.692
Probabilistic TEM (2010)    0.595  0.695
Sequence Tagging (2013)     0.631  0.748

Enhanced Lexical Semantic Models
● Designed lexical semantic models
  ○ Synonymy and antonymy
  ○ Hypernymy and hyponymy
    ■ Class-inclusion or Is-A relation (What color is Saturn? → Saturn is a giant gas planet with brown and beige clouds.)
  ○ Semantic word similarity
● Learning QA matching models
  ○ Bag of words
  ○ Learning latent structures
    ■ Similar to a latent SVM (different learning formulations and a replaced decision function)

Wen-tau Yih, Ming-Wei Chang, Christopher Meek, Andrzej Pastusiak, Question Answering Using Enhanced Lexical Semantic Models
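A toy version of lexical-semantic matching shows how these relations enrich plain word overlap. The tiny synonym/hypernym lexicons and the relation weights are made up for the example; the paper learns these relations from large resources rather than hand-coding them:

```python
# Made-up toy lexicons (the real models draw on WordNet-scale resources).
synonyms = {("buy", "purchase"), ("fast", "quick")}
hypernyms = {("saturn", "planet"), ("jaguar", "car")}  # (hyponym, hypernym)

def related(w1, w2):
    """Return the lexical-semantic relation linking two words, if any."""
    if w1 == w2:
        return "identical"
    if (w1, w2) in synonyms or (w2, w1) in synonyms:
        return "synonym"
    if (w1, w2) in hypernyms or (w2, w1) in hypernyms:
        return "hypernym"   # class-inclusion / Is-A relation
    return None

def match_score(question_words, answer_words):
    """Average, over question words, of the best-related answer word."""
    weight = {"identical": 1.0, "synonym": 0.8, "hypernym": 0.6}
    score = 0.0
    for q in question_words:
        best = max((weight.get(related(q, a), 0.0) for a in answer_words),
                   default=0.0)
        score += best
    return score / max(len(question_words), 1)

# "fast car" now matches "quick jaguar", which pure bag-of-words misses.
```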

Relations Between Text

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526
Jeopardy Model (2007)       0.603  0.685
Tree Edit Models (2010)     0.609  0.692
Probabilistic TEM (2010)    0.595  0.695
Sequence Tagging (2013)     0.631  0.748
Lexical Semantic M. (2013)  0.709  0.770

Automatic Feature Engineering for Answer Selection and Extraction
● Trained an SVM with tree kernels as an answer sentence classifier
● Trained a kernel-based classifier to select the best answer

Aliaksei Severyn, Alessandro Moschitti, Automatic Feature Engineering for Answer Selection and Extraction
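The "automatic feature engineering" comes from the kernel: an SVM with a tree kernel never enumerates features, it only needs a similarity function between trees. A much-simplified kernel that counts shared grammar productions between two (label, children) trees gives the flavor; the actual work uses richer, recursively weighted subset-tree kernels:

```python
def productions(tree):
    """Collect label -> child-labels productions of a (label, children) tree."""
    label, children = tree
    out = [(label, tuple(c[0] for c in children))] if children else []
    for c in children:
        out.extend(productions(c))
    return out

def tree_kernel(t1, t2):
    """Similarity = number of matching production pairs (simplified sketch)."""
    p1, p2 = productions(t1), productions(t2)
    return sum(1 for a in p1 for b in p2 if a == b)

# Two parses sharing the same S and NP expansions score 2.
t1 = ("S", [("NP", [("DT", []), ("NN", [])]), ("VP", [])])
t2 = ("S", [("NP", [("DT", []), ("NN", [])]), ("VP", [("VB", [])])])
```

An SVM trained with such a kernel implicitly operates in the space of all tree fragments, which is why no hand-designed feature set is needed.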

Method                      MAP    MRR
Mapping DT (2004)           0.419  0.494
Passage Retrieval (2005)    0.427  0.526
Jeopardy Model (2007)       0.603  0.685
Tree Edit Models (2010)     0.609  0.692
Probabilistic TEM (2010)    0.595  0.695
Sequence Tagging (2013)     0.631  0.748
Lexical Semantic M. (2013)  0.709  0.770
AFE (2013)                  0.678  0.736

Summary

● The presented SOTA reaches roughly 0.70 MAP / 0.77 MRR

● A great need remains for more accurate QA systems

Thanks!

Questions?
