“Question answering using structured and unstructured data”

Denis Savenkov, PhD student @ Emory University

WSDM’16 Doctoral Consortium

[1] “Questions vs. Queries in Informational Search Tasks”, Ryen W. White et al., WWW 2015

The percentage of question-like search queries is growing [1]

Question types: factoid vs. non-factoid


QA system architecture

Data Sources

Unstructured data, semi-structured data, and structured data

Structured and unstructured data sources for QA

Unstructured data vs. structured data

Factoid questions

Text collections

+ easy to match against question text

+ cover a variety of different information types

- each text phrase encodes a limited amount of information about mentioned entities

Knowledge Bases (KB)

+ aggregate all the information about entities

+ allow complex queries over this data using special languages (e.g. SPARQL; see the sketch below)

- hard to translate natural language questions into special query languages

- KBs are incomplete (missing entities, facts, and properties)
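As an illustration of the kind of structured query a KB supports, here is a minimal sketch in Python, assuming the third-party SPARQLWrapper package and DBpedia's public endpoint; the specific query is illustrative, not part of the proposed system.

from SPARQLWrapper import SPARQLWrapper, JSON

# Ask DBpedia (a public KB endpoint) which entity is the capital of Italy.
sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    PREFIX dbr: <http://dbpedia.org/resource/>
    SELECT ?capital WHERE { dbr:Italy dbo:capital ?capital . }
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["capital"]["value"])  # e.g. http://dbpedia.org/resource/Rome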

Non-factoid questions

Text collections

+ contain information relevant to a large share of user needs

- hard to extract the semantic meaning of a paragraph and match it against the question (lexical gap)

Question-Answer pairs

+ easy to find a relevant answer by matching the new question against the archived ones (see the sketch below)

- cover a smaller subset of user information needs
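A minimal sketch of this question-matching idea, assuming scikit-learn is installed; the toy archive and the similarity measure are illustrative only.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy archive of question-answer pairs (illustrative).
archive = [
    ("What is the capital of Italy?", "Rome"),
    ("Who wrote Hamlet?", "William Shakespeare"),
]

questions = [q for q, _ in archive]
vectorizer = TfidfVectorizer().fit(questions)
archive_vectors = vectorizer.transform(questions)

def answer(new_question):
    # Rank archived questions by cosine similarity and reuse the best answer.
    scores = cosine_similarity(vectorizer.transform([new_question]), archive_vectors)
    return archive[scores.argmax()][1]

print(answer("What city is the capital of Italy?"))  # -> Rome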

Combining text and KB for factoid QA

Using the structure of web documents for non-factoid QA

Research Questions

1. What types of questions can be answered using text, a KB, or a combination of both?

2. How does semantic annotation of unstructured data compare to information extraction for question answering?

○ information extraction for KB construction vs. open information extraction vs. unstructured data annotation

3. How does a combination of structured and unstructured data sources improve each of the main QA system components: question analysis, candidate generation, evidence extraction, and answer selection? (A toy pipeline sketch follows.)
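To make the component breakdown in question 3 concrete, here is a toy end-to-end skeleton; the tiny corpus, the KB triples, and every scoring heuristic are hypothetical placeholders, not the proposed system.

# Illustrative skeleton of the four QA components from research question 3.
TEXT = ["Rome is the capital of Italy.", "Shakespeare wrote Hamlet."]
KB = {("Italy", "capital"): "Rome", ("Hamlet", "author"): "Shakespeare"}

def analyze_question(question):
    # Question analysis: here, just a bag of lowercased tokens.
    return set(question.lower().strip("?").split())

def generate_candidates(tokens):
    # Candidate generation: draw candidates from both text and the KB.
    candidates = [s.split()[0] for s in TEXT]  # naive: first word of each sentence
    candidates += [v for (entity, _), v in KB.items() if entity.lower() in tokens]
    return candidates

def extract_evidence(candidate, tokens):
    # Evidence extraction: count sentences that mention the candidate
    # and share vocabulary with the question.
    return sum(candidate in s and bool(tokens & set(s.lower().split())) for s in TEXT)

def select_answer(candidates, tokens):
    # Answer selection: rank candidates by their evidence score.
    return max(candidates, key=lambda c: extract_evidence(c, tokens), default=None)

tokens = analyze_question("What is the capital of Italy?")
print(select_answer(generate_candidates(tokens), tokens))  # -> Rome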

Experiments: factoid QA

➔ Datasets

○ TREC QA
○ WebQuestions
○ new factoid QA dataset

✓ derived from the Yahoo! Answers CQA archive
✓ filter factoid, non-subjective questions (heuristics + Mechanical Turk; a toy filtering sketch follows this list)
✓ Mechanical Turk to label answers (KB entities mentioned in the CQA user answer)
✓ Examples:

· What is the name of Sponge Bob's pet snail?
· What is the boot-shaped country found in Southern Europe?
· Who was the first model to ever appear on the cover of Sports Illustrated?

➔ Comparison

○ state-of-the-art KBQA systems (WebQuestions benchmark)
○ Open QA approach based on OpenIE extractions
○ collection-based and hybrid techniques: AskMSR+, QuASE
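As a toy illustration of the heuristic filtering step mentioned above (the patterns and markers below are hypothetical, not the ones used to build the actual dataset):

import re

# Wh-words that often signal factoid questions (hypothetical heuristic).
FACTOID_PATTERN = re.compile(r"^(what|who|where|when|which)\b")
# Surface markers that often signal subjective questions (hypothetical).
SUBJECTIVE_MARKERS = ["best", "favorite", "should i", "do you think"]

def looks_factoid(question):
    q = question.lower().strip()
    if any(m in q for m in SUBJECTIVE_MARKERS):
        return False  # likely opinion-seeking; route to human labeling instead
    return bool(FACTOID_PATTERN.search(q))

print(looks_factoid("What is the name of Sponge Bob's pet snail?"))  # True
print(looks_factoid("What is the best pizza topping?"))              # False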

Experiments: non-factoid QA

➔ TREC LiveQA shared task (http://trec-liveqa.org/): answering real user questions in real time

○ systems need to respond to questions posted to Yahoo! Answers

➔ Comparison:

○ baseline system developed for last year's shared task
○ participants of TREC LiveQA 2016

Expected results

1. a new factoid question answering dataset

2. a new approach to combining unstructured text and structured KB data that:

✓ improves QA precision

✓ improves recall, including on questions that can only be answered with a combination of text and KB data

3. a new system for non-factoid question answering, with improved performance due to better utilization of the information provided in web documents

Open Questions

→ How to incorporate other available sources of factual information, including semi-structured ones, e.g. tables, diagrams, etc.?

→ Non-factoid questions often include unique context, which makes reusing existing information ineffective. How can a system generate an answer given all potentially useful extracted information?

→ How to construct KBs for non-factoid information needs?

Thank you!
