Talk Schedule Question Answering from Email
Bryan Klimt, July 28, 2005


Page 1

Talk Schedule Question Answering from Email

Bryan Klimt
July 28, 2005

Page 2

Project Goals

• To build a practical, working question answering system for personal email
• To learn about the technologies that go into QA (IR, IE, NLP, MT)
• To discover which techniques work best, and when

Page 3

System Overview

Page 4

Dataset

• 18 months of email (Sept 2003 to Feb 2005)

• 4799 emails in total
• 196 are talk announcements
  – hand-labelled and annotated

• 478 questions and answers

Page 5

A new email arrives…

• Is it a talk announcement?

• If so, we should index it.

Page 6

Email Classifier

[Diagram: Email Data → Logistic Regression → Decision]
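
As a concrete illustration, here is a minimal sketch of this stage using scikit-learn (an assumed library; the original system used its own implementation): bag-of-words features feed a logistic regression whose output is the talk/non-talk decision.

    # Minimal sketch of the classifier stage (scikit-learn is an
    # assumption; not the original code). Bag-of-words features feed a
    # logistic regression that outputs the talk/non-talk decision.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    emails = ["Talk: QA from email. 5:30pm, 4513 Newell Simon Hall.",
              "Reminder: seminar on speech translation, abstract below.",
              "Lunch on Friday?",
              "Re: homework 3 grading"]
    labels = [1, 1, 0, 0]  # 1 = talk announcement, 0 = other

    clf = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    clf.fit(emails, labels)
    print(clf.predict(["Distinguished lecture: machine translation, Thu 3pm"]))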

Page 7

Classification Performance

• precision = 0.81
• recall = 0.66
• (previous work reported better performance)
• Top features:
  – abstract, bio, speaker, copeta, multicast, esm, donut, talk, seminar, cmtv, broadcast, speech, distinguish, ph, lectur, ieee, approach, translat, professor, award

Page 8

Annotator

• Use Information Extraction techniques to identify certain types of data in the emails:
  – speaker names and affiliations
  – dates and times
  – locations
  – lecture series and titles

Page 9

Annotator

Page 10

Rule-based Annotator

• Combine regular expressions and dictionary lookups

• defSpanType date =: ...[re('^\d\d?$') ai(dayEnd)? ai(month)]... ;

• matches “23rd September”
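
The same idea in plain Python regular expressions plus dictionary lookups (illustrative only; the rule above is written in the annotator's own span-rule language):

    # Regex + dictionary sketch of the date rule above (illustrative;
    # not the annotator's actual rule language).
    import re

    MONTHS = {"january", "february", "march", "april", "may", "june", "july",
              "august", "september", "october", "november", "december"}

    # one- or two-digit day, optional ordinal suffix, then a month word
    DATE_RE = re.compile(r"\b(\d\d?)(st|nd|rd|th)?\s+([A-Za-z]+)\b")

    def find_dates(text):
        """Return spans that look like '23rd September'."""
        return [m.span() for m in DATE_RE.finditer(text)
                if m.group(3).lower() in MONTHS]

    print(find_dates("The talk is on 23rd September."))  # [(15, 29)]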

Page 11

Conditional Random Fields

• Probabilistic framework for labelling sequential data

• Known to outperform HMMs (relaxed independence assumptions) and MEMMs (avoids the “label bias” problem)

• Allow for multiple output features at each node in the sequence
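
A minimal sketch of CRF tagging with the sklearn-crfsuite library (an assumption; this tooling postdates the project). The per-token feature dicts are the multiple features at each node that the slide describes:

    # CRF sequence-labelling sketch with sklearn-crfsuite (an assumed
    # library; not what the project used). Each node in the sequence
    # carries a dict of features, as the slide describes.
    import sklearn_crfsuite

    def features(tokens, i):
        t = tokens[i]
        return {"lower": t.lower(), "is_title": t.istitle(),
                "has_digit": any(c.isdigit() for c in t),
                "prev": tokens[i - 1].lower() if i > 0 else "<s>"}

    sents = [["Frank", "Lin", "speaks", "at", "5:30pm"]]
    X = [[features(s, i) for i in range(len(s))] for s in sents]
    y = [["B-name", "I-name", "O", "O", "B-time"]]  # hand-labelled tags

    crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
    crf.fit(X, y)
    print(crf.predict(X))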

Page 12

Rule-based vs. CRFs

Page 13

Rule-based vs. CRFs

• Both results are much higher than in a previous study

• For dates, times, and locations, rules are easy to write and perform extremely well

• For names, titles, affiliations, and series, rules are very difficult to write, and CRFs are preferable

Page 14

Template Filler

• Creates a database record for each talk announced in the email
• This database is used by the NLP answer extractor
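
A sketch of what that record might look like, with SQLite standing in for the database (the slides do not name the actual store):

    # Sketch of the template filler's output: one record per talk,
    # stored in a database (SQLite here is an assumption).
    import sqlite3
    from dataclasses import dataclass, astuple

    @dataclass
    class Seminar:
        title: str
        name: str
        time: str
        date: str
        location: str
        affiliation: str = ""
        series: str = ""

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE seminar_templates "
               "(title, name, time, date, location, affiliation, series)")
    rec = Seminar("Keyword Translation from English to Chinese for Multilingual QA",
                  "Frank Lin", "5:30pm", "Thursday, Sept. 23",
                  "4513 Newell Simon Hall")
    db.execute("INSERT INTO seminar_templates VALUES (?,?,?,?,?,?,?)",
               astuple(rec))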

Page 15

Filled Template

Seminar {
  title = “Keyword Translation from English to Chinese for Multilingual QA”
  name = Frank Lin
  time = 5:30pm
  date = Thursday, Sept. 23
  location = 4513 Newell Simon Hall
  affiliation =
  series =
}

Page 16

Search Time

• Now the email is indexed
• The user can ask questions

Page 17

IR Answer Extractor

“Where is Frank Lin’s talk?”

0.5055 3451.txt
  search[468:473]: "frank"
  search[2025:2030]: "frank"
  search[474:477]: "lin"

0.1249 2547.txt
  search[580:583]: "lin"

0.0642 2535.txt
  search[2283:2286]: "lin"

• Performs a traditional IR (TF-IDF) search using the question as a query

• Determines the answer type from simple heuristics (“Where”->LOCATION)
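
A sketch of both steps together (scikit-learn for the TF-IDF index is an assumption, and the heuristic table is illustrative):

    # Sketch of the IR extractor: TF-IDF ranking with the question as
    # the query, plus a question-word heuristic for the answer type.
    # (scikit-learn is an assumption; names are illustrative.)
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    ANSWER_TYPE = {"where": "LOCATION", "when": "DATE", "who": "NAME"}

    emails = ["Frank Lin speaks at 5:30pm in 4513 Newell Simon Hall.",
              "Reading group: Lin et al., keyword translation."]
    vec = TfidfVectorizer()
    index = vec.fit_transform(emails)

    def answer(question):
        qtype = ANSWER_TYPE.get(question.split()[0].lower(), "ANY")
        scores = cosine_similarity(vec.transform([question]), index)[0]
        # A real system would return the span annotated with qtype in the
        # top-ranked email; here we return the type and the email itself.
        return qtype, emails[scores.argmax()]

    print(answer("Where is Frank Lin's talk?"))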

Page 18

IR Answer Extractor

Page 19

NL Question Analyzer

• Uses the Tomita parser to fully parse questions and translate them into a structured query language
• “Where is Frank Lin’s talk?”
• ((FIELD LOCATION) (FILTER (NAME “FRANK LIN”)))
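
The real analyzer runs a full parse; as a toy stand-in, here is a pattern that emits the same structured query for this one question shape:

    # Toy stand-in for the question analyzer (the real system runs a
    # full Tomita parse; this handles exactly one question shape).
    import re

    def analyze(question):
        m = re.match(r"Where is (.+?)'s talk\?", question)
        if m:
            return (("FIELD", "LOCATION"),
                    ("FILTER", ("NAME", m.group(1).upper())))
        raise ValueError("question not understood")

    print(analyze("Where is Frank Lin's talk?"))
    # (('FIELD', 'LOCATION'), ('FILTER', ('NAME', 'FRANK LIN')))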

Page 20

NL Answer Extractor

• Simply executes the structured query produced by the Question Analyzer

• ((FIELD LOCATION) (FILTER (NAME “FRANK LIN”)))

• select LOCATION from seminar_templates where NAME=“FRANK LIN”;
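
A sketch of that translation step, run against a toy SQLite table (SQLite and the helper names are assumptions; field names are whitelisted and the filter value is a bound parameter):

    # Sketch of the NL answer extractor: translate the structured query
    # into SQL over the seminar table. SQLite and the helper names are
    # assumptions; field names are whitelisted, values are parameters.
    import sqlite3

    FIELDS = {"LOCATION": "location", "NAME": "name",
              "DATE": "date", "TIME": "time"}

    def execute(db, query):
        (_, field), (_, (fkey, fval)) = query
        sql = ("SELECT %s FROM seminar_templates WHERE %s = ? COLLATE NOCASE"
               % (FIELDS[field], FIELDS[fkey]))
        return [row[0] for row in db.execute(sql, (fval,))]

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE seminar_templates (name, location, date, time)")
    db.execute("INSERT INTO seminar_templates VALUES "
               "('Frank Lin', '4513 Newell Simon Hall', 'Sept. 23', '5:30pm')")
    q = (("FIELD", "LOCATION"), ("FILTER", ("NAME", "FRANK LIN")))
    print(execute(db, q))  # ['4513 Newell Simon Hall']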

Page 21

Results

• NL Answer Extractor -> 0.870
• IR Answer Extractor -> 0.755

[Bar chart: Answer Accuracy (0 to 1) for the NL and IR Answer Extractors]

Page 22

Results

• Both answer extractors have similar (good) performance
• IR-based extractor
  – easy to implement (1-2 days)
  – better on questions w/ titles and names
  – very bad on yes/no questions
• NLP-based extractor
  – more difficult to implement (4-5 days)
  – better on questions w/ dates and times

Page 23

Examples

• “Where is the lecture on dolphin language?”
  – NLP Answer Extractor: fails to find any talk
  – IR Answer Extractor: finds the correct talk
  – Actual title: “Natural History and Communication of Spotted Dolphin, Stenella Frontalis, in the Bahamas”
• “Who is speaking on September 10?”
  – NLP Extractor: finds the correct record(s)
  – IR Extractor: extracts the wrong answer
  – A talk at “10 am, November 10” ranks higher than one on “Sept 10th”

Page 24

Future Work

• Add an annotation “feedback loop” for the classifier
• Add a planner module to decide which answer extractor to apply to each individual question (a first-cut sketch follows this list)

• Tune parameters for classifier and TF-IDF search engine

• Integrate into a mail client!
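
Purely as illustration of the proposed planner (the slides leave its design open): a keyword heuristic that routes date/time questions to the NLP extractor and everything else to the IR extractor, matching the strengths observed on the Examples slide.

    # Illustrative first cut at the proposed planner (not from the
    # slides): route questions mentioning dates or times to the NLP
    # extractor, everything else to the IR extractor.
    import re

    DATEISH = re.compile(
        r"\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\w*\b"
        r"|\b\d\d?(st|nd|rd|th)?\b|\d\d?:\d\d")

    def choose_extractor(question):
        return "NLP" if DATEISH.search(question.lower()) else "IR"

    print(choose_extractor("Who is speaking on September 10?"))          # NLP
    print(choose_extractor("Where is the lecture on dolphin language?")) # IR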

Page 25

Conclusions

• Overall performance is good enough for the system to be helpful to end users

• Both rule-based and automatic annotators should be used, but for different types of annotations

• Both IR-based and NLP-based answer extractors should be used, but for different types of questions

Page 26

DEMO