
Lecture: Question Answering


Page 1: Lecture: Question Answering

Semantic Analysis in Language Technology http://stp.lingfil.uu.se/~santinim/sais/2016/sais_2016.htm

Question Answering

Marina Santini [email protected]

 

Department of Linguistics and Philology

Uppsala  University,  Uppsala,  Sweden  

 

Spring  2016  

 

 1  

Page 2: Lecture: Question Answering

Previous Lecture: IE – Named Entity Recognition (NER)

2  

Page 3: Lecture: Question Answering

• A very important sub-task: find and classify names in text, for example:

• The decision by the independent MP Andrew Wilkie to withdraw his support for the minority Labor government sounded dramatic but it should not further threaten its stability. When, after the 2010 election, Wilkie, Rob Oakeshott, Tony Windsor and the Greens agreed to support Labor, they gave just two guarantees: confidence and supply.

Named Entity Recognition (NER)

Person, Date, Location, Organization, etc.

Page 4: Lecture: Question Answering

NER  pipeline  

4  

[Pipeline diagram: representative documents → human annotation → annotated documents → feature extraction → training data → sequence classifiers → NER system]

Page 5: Lecture: Question Answering

Encoding  classes  for  sequence  labeling  

            IO      IOB
Fred        PER     B-PER
showed      O       O
Sue         PER     B-PER
Mengqiu     PER     B-PER
Huang       PER     I-PER
's          O       O
new         O       O
painting    O       O

Page 6: Lecture: Question Answering

Features  for  sequence  labeling  

• Words • Current word (essentially like a learned dictionary) • Previous/next word (context)

• Other kinds of inferred linguistic classification • Part-of-speech tags

•  Other  features  •  Word  shapes  •  etc.  

6  
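A minimal illustration (not part of the original slides) of how such per-token features could be collected in Python; the function name and the exact feature set are invented for this sketch:

```python
def token_features(tokens, pos_tags, i):
    """Illustrative feature dictionary for the i-th token of a sentence."""
    word = tokens[i]
    return {
        "word": word.lower(),                                       # current word (learned dictionary)
        "prev_word": tokens[i - 1].lower() if i > 0 else "<S>",     # previous word (context)
        "next_word": tokens[i + 1].lower() if i + 1 < len(tokens) else "</S>",  # next word (context)
        "pos": pos_tags[i],                                         # inferred linguistic class
        "is_capitalized": word[:1].isupper(),                       # crude word-shape cue
        "has_digit": any(c.isdigit() for c in word),                # another shape cue
    }

# Toy sentence from the IOB slide, with hand-assigned POS tags:
tokens = ["Fred", "showed", "Sue", "Mengqiu", "Huang", "'s", "new", "painting"]
pos    = ["NNP", "VBD", "NNP", "NNP", "NNP", "POS", "JJ", "NN"]
print(token_features(tokens, pos, 0))
```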

Page 7: Lecture: Question Answering

Features: Word shapes

•  Word Shapes •  Map words to simplified representation that encodes attributes

such as length, capitalization, numerals, Greek letters, internal punctuation, etc.

Varicella-zoster Xx-xxx

mRNA xXXX

CPA1 XXXd

• Varicella zoster is a virus
• Messenger RNA (mRNA) is a large family of RNA molecules
• CPA1 (Carboxypeptidase A1 (Pancreatic)) is a protein-coding gene.

Page 8: Lecture: Question Answering

Inspiration figure. Task: Develop a set of regular expressions to recognize the character shape features.

• Possible set of REs matching the inspiration figure (syntax depends on the programming language):

   

8  

No  need  to  remember  things  by  heart:  once  you  know  what  you  have  to  do,  find  the  correct  syntax  on  the  web!  
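For instance, one possible rendering in Python of the character-class substitutions typically used for word shapes; the optional collapsing of repeated shape characters is just one variant, and the exact shapes on the slide may come from a different scheme:

```python
import re

def word_shape(word, collapse=True):
    """Map a word to a simplified shape: X = uppercase, x = lowercase, d = digit;
    other characters (hyphens, Greek letters, ...) are kept as they are."""
    shape = re.sub(r"[A-Z]", "X", word)
    shape = re.sub(r"[a-z]", "x", shape)
    shape = re.sub(r"[0-9]", "d", shape)
    if collapse:
        # collapse runs of the same shape character so that length is abstracted away
        shape = re.sub(r"(.)\1+", r"\1", shape)
    return shape

print(word_shape("Varicella-zoster"))        # Xx-x   (the slide's variant shows Xx-xxx)
print(word_shape("mRNA", collapse=False))    # xXXX
print(word_shape("CPA1", collapse=False))    # XXXd
```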

Page 9: Lecture: Question Answering

The gold standard corpus. There are always many solutions to a research question! You had to make your choice… Basic steps:
1. Analyse the data (you must know your data well!!!)
2. Get an idea of the patterns
3. Choose the way to go…
4. Report your results

9  

Page 10: Lecture: Question Answering

Proposed solutions

• (Xx*)* regardless of the NE type

• Complex patterns that could identify approx. 900 lines out of 1316 entities (regardless of NE type)

• etc.

Page 11: Lecture: Question Answering

Some alternatives: create patterns per NE type… (divide-and-conquer approach :-) ) Ex: person names (283): most person names have the shape (Xx*){2} (presumably you would get high accuracy). Miles Sindercombe p:person Armand de Pontmartin p:person Alicia Gorey p:person Kim Crosby (singer) p:person Edmond Roudnitska p:person Shobha Gurtu p:person Bert Greene p:person Danica McKellar p:person

Sheila O'Brien p:person Martin Day p:person Clive Matthew-Wilson p:person Venugopal Dhoot p:person Clifford Berry p:person Munir Malik p:person Mary Sears p:person Charles Wayne "Chuck" Day p:person Michael Formanek p:person Felix Carlebach p:person Alexander Keith, Jr. p:person Omer Vanaudenhove p:person
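Read as a pattern over raw strings, the shape (Xx*){2} is roughly "two capitalized tokens". A sketch of such a pattern, added here for illustration (it is not the course's reference solution), together with the kinds of names it misses:

```python
import re

# "Two capitalized tokens": a rough approximation of the (Xx*){2} shape for person names.
person_shape = re.compile(r"^[A-Z][a-z]*\s+[A-Z][a-z]*$")

for name in ["Alicia Gorey", "Clifford Berry", "Armand de Pontmartin", "Kim Crosby (singer)"]:
    print(name, "->", bool(person_shape.match(name)))
# The first two match; the last two show why the pattern misses name particles ("de")
# and parenthesized disambiguators, i.e. why accuracy is high but not perfect.
```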

Page 12: Lecture: Question Answering

What’s  the  mathema$cal  formalism  underlying  REs?  

12  

Page 13: Lecture: Question Answering

DFA  

13  

Page 14: Lecture: Question Answering

Converting the regular expression (a|b)* to a DFA

14  

Page 15: Lecture: Question Answering

Converting the regular expression (a*|b*)* to a DFA

15  

Page 16: Lecture: Question Answering

Converting the regular expression ab(a|b)* to a DFA

16  
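A small sanity check, added as an illustration, that these regular expressions describe the languages their DFAs accept; Python's re.fullmatch decides membership of the whole string, just like running the corresponding automaton:

```python
import re

patterns = {
    "(a|b)*":   re.compile(r"(a|b)*"),
    "(a*|b*)*": re.compile(r"(a*|b*)*"),
    "ab(a|b)*": re.compile(r"ab(a|b)*"),
}

for s in ["", "ab", "ba", "abba", "aac"]:
    print(repr(s), {name: bool(p.fullmatch(s)) for name, p in patterns.items()})
# (a|b)* and (a*|b*)* accept exactly the same strings (so they share the same minimal DFA);
# ab(a|b)* additionally requires the prefix "ab".
```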

Page 17: Lecture: Question Answering

Chomsky  hierarchy  

•  Regular  expressions  help  solve  problems  that  are  tractable  by  ”regular  grammars”.      

17  

For example, it is not possible to write an FSM (and consequently a regular expression) that generates the language aⁿbⁿ, i.e. the set of all strings that consist of a (possibly empty) block of a's followed by a (possibly empty) block of b's of exactly the same length.

Areas where finite-state methods have been shown to be particularly useful in NLP are phonological and morphological processing.

In our case, we must explore and experiment with the NE corpus and see if there are sequences that cannot be captured by a regular language.
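To make the point concrete, here is a one-counter check for aⁿbⁿ, sketched for illustration only: deciding this language needs an unbounded count of the a's, which no finite-state machine can keep.

```python
def is_anbn(s):
    """Accept strings of the form a^n b^n by counting, something a DFA cannot do."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

print(is_anbn("aabb"), is_anbn("aab"), is_anbn("abab"))  # True False False
```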

Page 18: Lecture: Question Answering

For  some  problems,    

•  …  the  expressive  power  of  REs  is  exactly  what    is  needed  

• For some other problems, the expressive power of REs is too weak… • Additionally, since REs are basically hand-written rules, it is easy to get entangled with the rules… at some point you no longer know how the rules interact with each other, so results might become unpredictable :-)

18  

Page 19: Lecture: Question Answering

End  of  previous  lecture  

19  

Page 20: Lecture: Question Answering

Question Answering

What is Question Answering?

Page 21: Lecture: Question Answering

Acknowledgements Most  slides  borrowed  or  adapted  from:  

Dan  Jurafsky  and  Christopher  Manning,  Coursera  

Dan Jurafsky and James H. Martin (2015)

   

 

J&M (2015, draft): https://web.stanford.edu/~jurafsky/slp3/

 

     

Page 22: Lecture: Question Answering

22  

Question Answering

[Figure: dependency trees for the question "What do worms eat?" (what ← eat ← worms) and the potential answers "Worms eat grass", "Grass is eaten by worms", "Birds eat worms", and "Horses with worms eat grass"; candidate answers are matched against the dependency structure of the question.]

One of the oldest NLP tasks (punched card systems in 1961). Simmons, Klein, McConlogue. 1964. Indexing and Dependency Logic for Answering English Questions. American Documentation 15:30, 196-204.

Page 23: Lecture: Question Answering

Question Answering: IBM's Watson • Won Jeopardy on February 16, 2011!

• IBM's Watson is a Question Answering system.

•  What  is  Jeopardy?  

23  

Page 24: Lecture: Question Answering

Jeopardy!    

• Jeopardy! is an American television quiz competition in which contestants are presented with general knowledge clues in the form of answers, and must phrase their responses in the form of questions.

• The original daytime version debuted on NBC on March 30, 1964.

24  

Page 25: Lecture: Question Answering

Watson’s  performance  

•  With  the  answer:  “You  just  need  a  nap.  You  don’t  have  this  sleep  disorder  that  can  make  sufferers  nod  off  while  standing  up,”  Watson  replied,  “What  is  narcolepsy?”  

25  

Page 26: Lecture: Question Answering

Question Answering: IBM's Watson • The winning reply!

26  

WILLIAM WILKINSON'S "AN ACCOUNT OF THE PRINCIPALITIES OF WALLACHIA AND MOLDOVIA" INSPIRED THIS AUTHOR'S MOST FAMOUS NOVEL

Bram  Stoker  

Page 27: Lecture: Question Answering

Apple’s  Siri  

27  

Page 28: Lecture: Question Answering

Wolfram  Alpha  

28  

Page 29: Lecture: Question Answering

29  

Types of Questions in Modern Systems

• Factoid questions
  • Who wrote "The Universal Declaration of Human Rights"?
  • How many calories are there in two slices of apple pie?
  • What is the average age of the onset of autism?
  • Where is Apple Computer based?

• Complex (narrative) questions:
  • In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?
  • What do scholars think about Jefferson's position on dealing with pirates?

Page 30: Lecture: Question Answering

Commercial systems: mainly factoid questions

Where is the Louvre Museum located?  →  In Paris, France
What's the abbreviation for limited partnership?  →  L.P.
What are the names of Odin's ravens?  →  Huginn and Muninn
What currency is used in China?  →  The yuan
What kind of nuts are used in marzipan?  →  almonds
What instrument does Max Roach play?  →  drums
What is the telephone number for Stanford University?  →  650-723-2300

Page 31: Lecture: Question Answering

Paradigms  for  QA  

• IR-based approaches • TREC; IBM Watson; Google

• Knowledge-based • Apple Siri; Wolfram Alpha

• Hybrid approaches • IBM Watson; True Knowledge Evi

31  

Page 32: Lecture: Question Answering

Many questions can already be answered by web search


32  

Page 33: Lecture: Question Answering

IR-based Question Answering


33  

Page 34: Lecture: Question Answering

Things change all the time… :-)
• Google was a pure IR-based QA system, but in 2012 the Knowledge Graph was added to Google's search engine.

• The Knowledge Graph is a knowledge base used by Google to enhance its search engine's search results with semantic-search information gathered from a wide variety of sources.

• Wikipedia: The goal of KGraph is that users would be able to use this information to resolve their query without having to navigate to other sites and assemble the information themselves. [...] According to some news websites, the implementation of Google's Knowledge Graph has played a role in the page view decline of various language versions of Wikipedia.

Page 35: Lecture: Question Answering

35  

IR-based Factoid QA

[Architecture diagram: Question → Question Processing (Query Formulation, Answer Type Detection) → Document Retrieval over indexed documents → Passage Retrieval (relevant docs → passages) → Answer Processing → Answer]

Page 36: Lecture: Question Answering

IR-based Factoid QA • QUESTION PROCESSING

• Detect question type, answer type, focus, relations • Formulate queries to send to a search engine

•  PASSAGE  RETRIEVAL  •  Retrieve  ranked  documents  •  Break  into  suitable  passages  and  rerank  

•  ANSWER  PROCESSING  •  Extract  candidate  answers  •  Rank  candidates    

•  using  evidence  from  the  text  and  external  sources  

Page 37: Lecture: Question Answering

Knowledge-based approaches (Siri)

• Build a semantic representation of the query • Times, dates, locations, entities, numeric quantities

• Map from these semantics to query structured data or resources • Geospatial databases • Ontologies (Wikipedia infoboxes, dbPedia, WordNet, Yago) • Restaurant review sources and reservation services • Scientific databases

37  

Page 38: Lecture: Question Answering

SIRI's main tasks, at a high level, involve:
• Using ASR (automatic speech recognition) to transcribe human speech (in this case, short utterances of commands, questions, or dictations) into text.
• Using natural language processing (part-of-speech tagging, noun-phrase chunking, dependency & constituent parsing) to translate transcribed text into "parsed text".
• Using question & intent analysis to analyze parsed text, detecting user commands and actions ("Schedule a meeting", "Set my alarm", ...).
• Using data technologies to interface with 3rd-party web services such as OpenTable and WolframAlpha to perform actions, search operations, and question answering.
• Utterances SIRI has identified as a question that it cannot directly answer are forwarded to more general question-answering services such as WolframAlpha.
• Transforming output of 3rd-party web services back into natural language text (e.g., today's weather report -> "The weather will be sunny").
• Using TTS (text-to-speech) technologies to transform the natural language text from the previous step into synthesized speech.

38  

Page 39: Lecture: Question Answering

Hybrid  approaches  (IBM  Watson)  

• Build a shallow semantic representation of the query • Generate answer candidates using IR methods

• Augmented with ontologies and semi-structured data

• Score each candidate using richer knowledge sources • Geospatial databases • Temporal reasoning • Taxonomical classification

39  

Page 40: Lecture: Question Answering

Question Answering

Answer Types and Query Formulation

Page 41: Lecture: Question Answering

Factoid  Q/A  

41  

[IR-based factoid QA pipeline diagram, repeated from above: Question Processing → Document/Passage Retrieval → Answer Processing]

Page 42: Lecture: Question Answering

Question Processing: things to extract from the question

• Answer Type Detection • Decide the named entity type (person, place) of the answer

• Query Formulation • Choose query keywords for the IR system

• Question Type classification • Is this a definition question, a math question, a list question?

• Focus Detection • Find the question words that are replaced by the answer

• Relation Extraction • Find relations between entities in the question

Page 43: Lecture: Question Answering

Question Processing They’re the two states you could be reentering if you’re crossing Florida’s northern border

• Answer Type: US state • Query: two states, border, Florida, north • Focus: the two states • Relations: borders(Florida, ?x, north)

43  
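Concretely, the output of question processing for this clue could be represented by a small structure like the following (the field names are illustrative, not a fixed format from the slides):

```python
question = ("They're the two states you could be reentering "
            "if you're crossing Florida's northern border")

analysis = {
    "answer_type": "US_STATE",                                  # named-entity type of the expected answer
    "query": ["two", "states", "border", "Florida", "north"],   # keywords sent to the IR engine
    "focus": "the two states",                                  # question words the answer replaces
    "relations": [("borders", "Florida", "?x", "north")],       # extracted relation(s)
}
print(analysis)
```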

Page 44: Lecture: Question Answering

Answer Type Detection: Named Entities

• Who founded Virgin Airlines? • PERSON

• What Canadian city has the largest population? • CITY.

Page 45: Lecture: Question Answering

Answer  Type  Taxonomy  

• 6 coarse classes • ABBREVIATION, ENTITY, DESCRIPTION, HUMAN, LOCATION, NUMERIC

• 50 finer classes • LOCATION: city, country, mountain… • HUMAN: group, individual, title, description • ENTITY: animal, body, color, currency…

45  

Xin Li, Dan Roth. 2002. Learning Question Classifiers. COLING'02

Page 46: Lecture: Question Answering

46  

Part  of  Li  &  Roth’s  Answer  Type  Taxonomy  

[Figure: part of the Li & Roth answer type taxonomy. LOCATION: country, city, state; NUMERIC: date, percent, money, size, distance; HUMAN: individual, title, group; ENTITY: food, currency, animal; DESCRIPTION: definition, reason; ABBREVIATION: abbreviation, expression]

Page 47: Lecture: Question Answering

47  

Answer  Types  

Page 48: Lecture: Question Answering

48  

More  Answer  Types  

Page 49: Lecture: Question Answering

Answer  types  in  Jeopardy  

• 2500 answer types in a 20,000-question Jeopardy sample • The most frequent 200 answer types cover < 50% of the data • The 40 most frequent Jeopardy answer types: he, country, city, man, film, state, she, author, group, here, company, president, capital, star, novel, character, woman, river, island, king, song, part, series, sport, singer, actor, play, team, show, actress, animal, presidential, composer, musical, nation, book, title, leader, game

49  

Ferrucci et al. 2010. Building Watson: An Overview of the DeepQA Project. AI Magazine. Fall 2010. 59-79.

Page 50: Lecture: Question Answering

Answer Type Detection

• Hand-written rules • Machine Learning • Hybrids

Page 51: Lecture: Question Answering

Answer Type Detection

• Regular expression-based rules can get some cases: • Who {is|was|are|were} PERSON • PERSON (YEAR – YEAR)

• Other rules use the question headword (the headword of the first noun phrase after the wh-word):

• Which  city  in  China  has  the  largest  number  of  foreign  financial  companies?  

• What  is  the  state  flower  of  California?  
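A sketch of such hand-written rules in Python; the rule set and type labels below are illustrative only, and a real system would combine many more patterns with the headword heuristic:

```python
import re

RULES = [
    (re.compile(r"^who\s+(is|was|are|were)\b", re.I), "PERSON"),
    (re.compile(r"^(where|what\s+city)\b", re.I),     "LOCATION"),
    (re.compile(r"^(when|what\s+year)\b", re.I),      "DATE"),
    (re.compile(r"^how\s+(many|much)\b", re.I),       "NUMERIC"),
]

def answer_type(question):
    for pattern, atype in RULES:
        if pattern.search(question):
            return atype
    return "UNKNOWN"   # fall back to the question headword or a learned classifier

print(answer_type("Who was Queen Victoria's second son?"))   # PERSON
print(answer_type("Which city in China has the largest number of foreign financial companies?"))
# UNKNOWN here: the headword "city" would be needed to get a LOCATION/CITY answer type
```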

Page 52: Lecture: Question Answering

Answer Type Detection

• Most often, we treat the problem as machine learning classification • Define a taxonomy of question types • Annotate training data for each question type • Train classifiers for each question class using a rich set of features • features include those hand-written rules!

52  
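A minimal sketch of this classification setup using scikit-learn; the library choice and the tiny training set are assumptions made for illustration, and a real classifier would be trained on thousands of labeled questions with richer features (headwords, named entities, etc.):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set labeled with coarse Li & Roth classes.
questions = [
    "Who founded Virgin Airlines?",
    "Who wrote The Universal Declaration of Human Rights?",
    "Where is Apple Computer based?",
    "What Canadian city has the largest population?",
    "How many calories are there in two slices of apple pie?",
    "What is the average age of the onset of autism?",
]
labels = ["HUMAN", "HUMAN", "LOCATION", "LOCATION", "NUMERIC", "NUMERIC"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(questions, labels)
print(clf.predict(["Where is the Louvre Museum located?"]))   # likely ['LOCATION']
```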

Page 53: Lecture: Question Answering

Features for Answer Type Detection

• Question words and phrases • Part-of-speech tags • Parse features (headwords) • Named Entities • Semantically related words

53  

Page 54: Lecture: Question Answering

Factoid  Q/A  

54  

[IR-based factoid QA pipeline diagram, repeated from above: Question Processing → Document/Passage Retrieval → Answer Processing]

Page 55: Lecture: Question Answering

Keyword Selection Algorithm

1. Select all non-stop words in quotations
2. Select all NNP words in recognized named entities
3. Select all complex nominals with their adjectival modifiers
4. Select all other complex nominals
5. Select all nouns with their adjectival modifiers
6. Select all other nouns
7. Select all verbs
8. Select all adverbs
9. Select the QFW word (skipped in all previous steps)
10. Select all other words

Dan Moldovan, Sanda Harabagiu, Marius Pasca, Rada Mihalcea, Richard Goodrum, Roxana Girju and Vasile Rus. 1999. Proceedings of TREC-8.
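A simplified sketch of this ordering, added for illustration: it collapses the complex-nominal rules 3-6 into a single noun rule, skips rules 9-10, and assumes the tokens have already been POS-tagged, quotation-marked and NE-tagged upstream.

```python
def select_keywords(tokens, pos_tags, in_quotes, is_named_entity, stopwords):
    """Assign each selected token the number of the first rule that picks it
    (lower number = chosen earlier / more important)."""
    selected = {}
    for tok, pos, quoted, ne in zip(tokens, pos_tags, in_quotes, is_named_entity):
        if quoted and tok.lower() not in stopwords:
            rule = 1                      # non-stop word inside quotation marks
        elif ne and pos == "NNP":
            rule = 2                      # NNP word in a recognized named entity
        elif pos.startswith("NN"):
            rule = 4                      # nominals / nouns (rules 3-6 collapsed here)
        elif pos.startswith("VB"):
            rule = 7                      # verbs
        elif pos.startswith("RB"):
            rule = 8                      # adverbs
        else:
            continue                      # rules 9-10 not modelled in this sketch
        selected.setdefault(tok, rule)
    return sorted(selected.items(), key=lambda kv: kv[1])

tokens = ["Who", "coined", "the", "term", "cyberspace", "in", "his", "novel", "Neuromancer", "?"]
pos    = ["WP",  "VBD",    "DT",  "NN",   "NN",         "IN", "PRP$", "NN",   "NNP",         "."]
quoted = [False, False,    False, False,  True,         False, False, False,  True,          False]
ne     = [False, False,    False, False,  False,        False, False, False,  True,          False]
print(select_keywords(tokens, pos, quoted, ne, {"the", "of"}))
# -> [('cyberspace', 1), ('Neuromancer', 1), ('term', 4), ('novel', 4), ('coined', 7)],
#    matching the worked example on the next slide
```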

Page 56: Lecture: Question Answering

Choosing keywords from the query

56

Who coined the term “cyberspace” in his novel “Neuromancer”?


cyberspace/1 Neuromancer/1 term/4 novel/4 coined/7

Slide  from  Mihai  Surdeanu  

Page 57: Lecture: Question Answering

Question Answering

Passage Retrieval and Answer Extraction

Page 58: Lecture: Question Answering

Factoid  Q/A  

58  

[IR-based factoid QA pipeline diagram, repeated from above: Question Processing → Document/Passage Retrieval → Answer Processing]

Page 59: Lecture: Question Answering

59  

Passage  Retrieval  

•  Step  1:  IR  engine  retrieves  documents  using  query  terms  •  Step  2:  Segment  the  documents  into  shorter  units  

•  something  like  paragraphs  

•  Step  3:  Passage  ranking  •  Use  answer  type  to  help  rerank  passages  

Page 60: Lecture: Question Answering

Features  for  Passage  Ranking  

• Number of Named Entities of the right type in the passage • Number of query words in the passage • Number of question N-grams also in the passage • Proximity of query keywords to each other in the passage • Longest sequence of question words • Rank of the document containing the passage

Either in rule-based classifiers or with supervised machine learning
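For instance, a few of these features could be computed per passage as in the sketch below (an illustration with simplified proximity features and made-up argument names; entities are assumed to arrive as (text, type) pairs from the NER step):

```python
def passage_features(passage_tokens, passage_entities, query_terms, answer_type, doc_rank):
    """Feature dictionary for one candidate passage; a rule-based scorer or a
    supervised ranker would combine these into a single passage score."""
    p_tokens = [t.lower() for t in passage_tokens]
    p_set = set(p_tokens)
    q_tokens = [t.lower() for t in query_terms]
    p_text = " ".join(p_tokens)
    return {
        "n_right_type_entities": sum(1 for _, etype in passage_entities if etype == answer_type),
        "n_query_words": sum(1 for t in set(q_tokens) if t in p_set),
        "n_query_bigrams": sum(1 for a, b in zip(q_tokens, q_tokens[1:]) if f"{a} {b}" in p_text),
        "document_rank": doc_rank,                # rank of the source document
    }

feats = passage_features(
    passage_tokens=["The", "Louvre", "Museum", "is", "located", "in", "Paris"],
    passage_entities=[("Louvre Museum", "ORGANIZATION"), ("Paris", "LOCATION")],
    query_terms=["Louvre", "Museum", "located"],
    answer_type="LOCATION",
    doc_rank=1,
)
print(feats)   # {'n_right_type_entities': 1, 'n_query_words': 3, 'n_query_bigrams': 1, 'document_rank': 1}
```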

Page 61: Lecture: Question Answering

Factoid  Q/A  

61  

[IR-based factoid QA pipeline diagram, repeated from above: Question Processing → Document/Passage Retrieval → Answer Processing]

Page 62: Lecture: Question Answering

Answer Extraction

• Run an answer-type named-entity tagger on the passages • Each answer type requires a named-entity tagger that detects it • If the answer type is CITY, the tagger has to tag CITY • Can be full NER, simple regular expressions, or a hybrid

• Return the string with the right type:
  • Who is the prime minister of India? (PERSON)  Manmohan Singh, Prime Minister of India, had told left leaders that the deal would not be renegotiated.
  • How tall is Mt. Everest? (LENGTH)  The official height of Mount Everest is 29035 feet.
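A sketch of this extraction step; the tagger interface and the toy LENGTH regex below are assumptions made for the example, standing in for whatever NER component (full, regex-based, or hybrid) the system actually uses:

```python
import re

def extract_candidates(passages, answer_type, tag_entities):
    """Return strings of the expected answer type found in the retrieved passages.
    `tag_entities` is any callable that yields (string, type) pairs for a passage."""
    candidates = []
    for passage in passages:
        for text, etype in tag_entities(passage):
            if etype == answer_type:
                candidates.append(text)
    return candidates

def toy_length_tagger(passage):
    """Minimal regex 'tagger' for LENGTH expressions such as '29035 feet'."""
    return [(m.group(0), "LENGTH")
            for m in re.finditer(r"\b\d[\d,]*\s*(?:feet|foot|metres|meters)\b", passage)]

passages = ["The official height of Mount Everest is 29035 feet."]
print(extract_candidates(passages, "LENGTH", toy_length_tagger))   # ['29035 feet']
```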

Page 63: Lecture: Question Answering

Ranking  Candidate  Answers  

• But what if there are multiple candidate answers?

Q: Who was Queen Victoria's second son?
• Answer Type: Person

• Passage: The Marie biscuit is named after Marie Alexandrovna, the daughter of Czar Alexander II of Russia and wife of Alfred, the second son of Queen Victoria and Prince Albert

Apposition is a grammatical construction in which two elements, normally noun phrases, are placed side by side, with one element serving to identify the other in a different way.

Page 64: Lecture: Question Answering

Use  machine  learning:  Features  for  ranking  candidate  answers  

Answer type match: Candidate contains a phrase with the correct answer type.
Pattern match: A regular expression pattern matches the candidate.
Question keywords: Number of question keywords in the candidate.
Keyword distance: Distance in words between the candidate and query keywords.
Novelty factor: A word in the candidate is not in the query.
Apposition features: The candidate is an appositive to question terms.
Punctuation location: The candidate is immediately followed by a comma, period, quotation marks, semicolon, or exclamation mark.
Sequences of question terms: The length of the longest sequence of question terms that occurs in the candidate answer.

Page 65: Lecture: Question Answering

Candidate  Answer  scoring  in  IBM  Watson  

• Each candidate answer gets scores from >50 components • (from unstructured text, semi-structured text, triple stores)

• logical form (parse) match between question and candidate • passage source reliability • geospatial location • California is "southwest of Montana"

• temporal relationships • taxonomic classification

Page 66: Lecture: Question Answering

66  

Common Evaluation Metrics

1. Accuracy (does the answer match the gold-labeled answer?)  2. Mean Reciprocal Rank

•  For  each  query  return  a  ranked  list  of  M  candidate  answers.  •  Its  score  is  1/Rank  of  the  first  right  answer.  •  Take  the  mean  over  all  N  queries  

MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i

Page 67: Lecture: Question Answering

67  

Common Evaluation Metrics

1. Accuracy (does the answer match the gold-labeled answer?)
2. Mean Reciprocal Rank:
  • The reciprocal rank of a query response is the inverse of the rank of the first correct answer.
  • The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q

MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i

Page 68: Lecture: Question Answering

Common Evaluation Metrics: MRR
• The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q.
• (example adapted from Wikipedia)

• Suppose the system returns a ranked list of answers for each of 3 sample queries, with the first item in each list being the one it thinks is most likely correct, and the first correct answer appears at rank 3, rank 2 and rank 1 respectively.

• Given those 3 samples, we could calculate the mean reciprocal rank as (1/3 + 1/2 + 1)/3 = 11/18, or about 0.61.

68  
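The same computation, written out as a small helper (an illustrative sketch; the ranks below are the ones from the example above):

```python
def mean_reciprocal_rank(first_correct_ranks):
    """first_correct_ranks: for each query, the rank of its first correct answer
    (use None when no returned answer is correct, which contributes 0)."""
    reciprocal = [0.0 if r is None else 1.0 / r for r in first_correct_ranks]
    return sum(reciprocal) / len(reciprocal)

# First correct answer at rank 3, rank 2 and rank 1 for the three sample queries:
print(mean_reciprocal_rank([3, 2, 1]))   # 0.6111... = 11/18
```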

Page 69: Lecture: Question Answering

69  

Common Evaluation Metrics

1. Mean  Reciprocal  Rank  •  For  each  query  return  a  ranked  list  of  M  candidate  answers.  •  Query  score  is  1/Rank  of  the  first  correct  answer    •  If  first  answer  is  correct:  1    •  else  if  second  answer  is  correct:  ½  •  else  if  third  answer  is  correct:    ⅓,    etc.  •  Score  is  0  if  none  of  the  M  answers  are  correct  

• Take the mean over all N queries:

MRR = (1/N) · Σ_{i=1}^{N} 1/rank_i

Page 70: Lecture: Question Answering

Use  of  this  metric  

• Mean reciprocal rank is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness.

• Machine translation • Question answering • Etc.

70  

Page 71: Lecture: Question Answering

Question Answering

Advanced: Answering Complex Questions

Page 72: Lecture: Question Answering

Answering harder questions
Q: What is water spinach?
A: Water spinach (Ipomoea aquatica) is a semi-aquatic leafy green plant with long hollow stems and spear- or heart-shaped leaves, widely grown throughout Asia as a leaf vegetable. The leaves and stems are often eaten stir-fried flavored with salt or in soups. Other common names include morning glory vegetable, kangkong (Malay), rau muong (Viet.), ong choi (Cant.), and kong xin cai (Mand.). It is not related to spinach, but is closely related to sweet potato and convolvulus.

Page 73: Lecture: Question Answering

Answering harder questions
Q: In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?
A: Ibuprofen provided greater temperature decrement and longer duration of antipyresis than acetaminophen when the two drugs were administered in approximately equal doses. (PubMed ID: 1621668, Evidence Strength: A)

Page 74: Lecture: Question Answering

Answering harder questions via query-focused summarization

• The (bottom-up) snippet method • Find a set of relevant documents • Extract informative sentences from the documents (using tf-idf, MMR) • Order and modify the sentences into an answer

• The (top-down) information extraction method • build specific answerers for different question types: • definition questions • biography questions • certain medical questions

Page 75: Lecture: Question Answering

The Information Extraction method

• a good biography of a person contains: • the person's birth/death, fame factor, education, nationality and so on

• a good definition contains: • a genus or hypernym • The Hajj is a type of ritual

• a medical answer about a drug's use contains: • the problem (the medical condition), • the intervention (the drug or procedure), and • the outcome (the result of the study).

Page 76: Lecture: Question Answering

Information that should be in the answer for 3 kinds of questions

Page 77: Lecture: Question Answering

[Figure: architecture for answering the definition question "What is the Hajj?" (Ndocs=20, Len=8). Document Retrieval returns 11 web documents (1127 total sentences); Predicate Identification finds 9 genus-species sentences (e.g. "The Hajj, or pilgrimage to Makkah (Mecca), is the central duty of Islam.", "The Hajj is a milestone event in a Muslim's life.", "The hajj is one of five pillars that make up the foundation of Islam."); Data-Driven Analysis yields 383 non-specific definitional sentences, grouped into sentence clusters and ordered by importance; Definition Creation then produces the answer below.]

The Hajj, or pilgrimage to Makkah [Mecca], is the central duty of Islam. More than two million Muslims are expected to take the Hajj this year. Muslims must perform the hajj at least once in their lifetime if physically and financially able. The Hajj is a milestone event in a Muslim's life. The annual hajj begins in the twelfth month of the Islamic year (which is lunar, not solar, so that hajj and Ramadan fall sometimes in summer, sometimes in winter). The Hajj is a week-long pilgrimage that begins in the 12th month of the Islamic lunar calendar. Another ceremony, which was not connected with the rites of the Ka'ba before the rise of Islam, is the Hajj, the annual pilgrimage to 'Arafat, about two miles east of Mecca, toward Mina…

Architecture for complex question answering: definition questions. S. Blair-Goldensohn, K. McKeown and A. Schlaikjer. 2004.

Answering Definition Questions: A Hybrid Approach.

Page 78: Lecture: Question Answering

The end