Page 1: Provably Reliable QA Interfaces

Provably Reliable QA Interfaces

http://cs.washington.edu/research/nli

Page 2: Provably Reliable QA Interfaces

Etzioni, 473 2

Outline

  I. Motivation
  II. Reliable NLIs
  III. Semantic tractability theory
  IV. Implementation and experiments

Joint work with Ana-Maria Popescu, Alex Yates, Henry Kautz, and Dan Weld.

Page 3: Provably Reliable QA Interfaces

I. The Dominant UI Paradigm

  • Menu buttons
  • Hyperlinks
  • Remote controls

Just Click it!

Page 4: Provably Reliable QA Interfaces

What to click?

Page 5: Provably Reliable QA Interfaces

Clicking breaks when...

  • Click-challenged: hands busy (driving), disabled.
  • Limited screen/keyboard real estate: cell phones, microwave, ubiquitous computing.
  • Complexity: SQL, shell scripts, menu hell.

Page 6: Provably Reliable QA Interfaces

Alternative Interface Paradigm

Speech + NL Understanding + Agents.

  • “Where is Lord of the Rings showing?”
  • “Defrost my corn.”
  • “Delete all my old messages except the ones from Mom.”

Substantial Research Challenges!

Page 7: Provably Reliable QA Interfaces

State of the Art

  • Commercial speech systems allow single-word input.
  • NLIDBs (natural language interfaces to databases) are unreliable (try Microsoft’s English Query).
  • Nontrivial autonomous agents that respond to complex requests?!

Page 8: Provably Reliable QA Interfaces

II. Our Focus

  • Not speech.
  • 1991 – 1997: built softbots.
  • Today: Reliable Natural Language Interfaces

Page 9: Provably Reliable QA Interfaces

Why Reliable?

Imagine “intelligent” interfaces that...

  • Sometimes delete the wrong file.
  • Can report incorrect flight times.

AI cannot be an excuse for incompetence (Norman, Shneiderman).

Page 10: Provably Reliable QA Interfaces

Some Common Objections

  • Can’t the interface just confirm?
    Yes, but will users attend?
  • Speech understanding isn’t reliable.
    That will gradually change. Still need a reliable language module.
  • NL understanding is AI-complete!

Page 11: Provably Reliable QA Interfaces

III. Semantics is hard

  • Syntactic, scopal ambiguity
  • Word-sense ambiguity
  • Time, events, liquids, holes,…
  • Shakespeare, Faulkner,…
  • Discourse, Pragmatics (speech acts)

We have to think about tractable classes!

Page 12: Provably Reliable QA Interfaces

Our Basic Hypothesis

There are common situations where Semantic Interpretation is tractable.

Sentence → Target expression

Page 13: Provably Reliable QA Interfaces

Our Research Strategy

1. Identify easy-to-understand NL sentences.

2. Develop Taxonomic Theory of Semantic Tractability.

3. Build Reliable NLIs.

4. Test NLIs experimentally.

Page 14: Provably Reliable QA Interfaces

Semantic Tractability

Easy to understand questions:

  • What are the Chinese restaurants in Seattle?
  • What Microsoft jobs require 2 years of experience?
  • What rivers run through Texas?

Semantically tractable questions are quite common: 77.5% - 97%

Page 15: Provably Reliable QA Interfaces

Semantic Intractability

Q: What is the second highest mountain in the US?

A: The word ‘second’ is unknown; please rephrase your query.

Q: What are the states bordering the states bordering the states bordering Montana?

A: huh?

Page 16: Provably Reliable QA Interfaces

Semantically Tractable Qs

Given a lexicon and a parse, IF

  1. Q contains at least one wh-word.
  2. There exists a valid mapping from Q to a set of database elements.
  3. Q maps to a nonrecursive datalog clause.

THEN Q is Semantically Tractable.
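
As a rough illustration of conditions 1 and 3, the sketch below represents a datalog clause as plain Python data and checks that it is nonrecursive. The wh-word list, the `Atom` and `Clause` types, and the `river` and `traverses` predicates are assumptions made for this example and are not taken from Precise; condition 2 (the valid mapping) is not modeled here.

```python
from dataclasses import dataclass

# Assumed wh-word list; the slides do not enumerate one.
WH_WORDS = {"what", "which", "where", "who", "when", "how"}


@dataclass(frozen=True)
class Atom:
    predicate: str
    args: tuple


@dataclass(frozen=True)
class Clause:
    head: Atom
    body: tuple  # tuple of Atom


def has_wh_word(question: str) -> bool:
    """Condition 1: the question contains at least one wh-word."""
    tokens = question.lower().replace("?", " ").split()
    return any(token in WH_WORDS for token in tokens)


def is_nonrecursive(clause: Clause) -> bool:
    """Condition 3 for a single clause: the head predicate never appears in the body."""
    return all(atom.predicate != clause.head.predicate for atom in clause.body)


# Hypothetical clause for "What rivers run through Texas?":
#   answer(R) :- river(R), traverses(R, "Texas").
question = "What rivers run through Texas?"
clause = Clause(
    head=Atom("answer", ("R",)),
    body=(Atom("river", ("R",)), Atom("traverses", ("R", "Texas"))),
)
```

Under these assumptions, `has_wh_word(question)` and `is_nonrecursive(clause)` both hold, while a clause whose head predicate reappears in its own body would fail the second check.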

Page 17: Provably Reliable QA Interfaces

Valid Mapping

Words (phrases) → DB elements.

  • Lexical constraints: Lord of the Rings.
  • Attachment constraints: What is the population of Atlanta?
  • Semantic constraints:
      Attributes need values (e.g., Cuisine needs Chinese).
      Values can have implicit attributes.
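
One possible reading of the two semantic constraints can be sketched as a small check. The tuple encoding of database elements and the function name are hypothetical, chosen only for this illustration; the actual constraint checking in Precise may differ.

```python
from collections import Counter


def semantic_constraints_hold(elements):
    """Check that every explicit attribute has a value to pair with.

    Each element is ("attribute", name) or ("value", name, value), where
    `name` is the attribute the element belongs to (assumed encoding).
    A value with no explicit attribute is allowed: its attribute is implicit.
    """
    attributes = Counter(e[1] for e in elements if e[0] == "attribute")
    values = Counter(e[1] for e in elements if e[0] == "value")
    return all(values[name] >= count for name, count in attributes.items())


# "cuisine" and "Chinese" both mentioned: attribute paired with its value.
explicit = [("attribute", "Cuisine"), ("value", "Cuisine", "Chinese")]
# "Chinese restaurants in Seattle": values only, attributes are implicit.
implicit = [("value", "Cuisine", "Chinese"), ("value", "City", "Seattle")]
# An attribute with no matching value violates "attributes need values".
dangling = [("attribute", "Cuisine"), ("value", "City", "Seattle")]
```

Here `explicit` and `implicit` pass the check and `dangling` fails it.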

Page 18: Provably Reliable QA Interfaces

Guarantees on ST Qs.

Precise = NLIDB implementation.

  • Precise can detect ST Qs.
  • Precise is sound for ST Qs.
  • Precise is complete for ST Qs.

Will Precise capture “user intent?”

Page 19: Provably Reliable QA Interfaces

IV. Precise Implementation

  • Lexicon extracted from DB + WordNet.
  • Parser plug-in (Charniak 2000).
  • Semantic constraints via graph match.

[Diagram: tokens Word1 and Phrase2 matched against database elements DB_EL1, DB_EL2, DB_EL3]

Match is computed via maxflow.
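
To make the maxflow idea concrete: matching tokens to database elements one-to-one is a unit-capacity max-flow problem, which can be solved by repeatedly searching for augmenting paths. The sketch below is a generic bipartite matcher, not the Precise matcher itself, and the candidate edges between `Word1`, `Phrase2`, and the `DB_EL` nodes are assumed for illustration, since the diagram's edges did not survive.

```python
def max_matching(candidates):
    """Maximum bipartite matching via augmenting paths (unit-capacity max flow).

    candidates: dict mapping each token to the database elements it may map to.
    Returns a dict token -> database element for the matched tokens.
    """
    match = {}  # database element -> token currently assigned to it

    def augment(token, seen):
        # Try to place `token`, re-routing earlier assignments if needed.
        for element in candidates[token]:
            if element in seen:
                continue
            seen.add(element)
            if element not in match or augment(match[element], seen):
                match[element] = token
                return True
        return False

    for token in candidates:
        augment(token, set())
    return {token: element for element, token in match.items()}


# Assumed edges: Phrase2 could map to any element, Word1 only to DB_EL1.
candidates = {
    "Phrase2": ["DB_EL1", "DB_EL2", "DB_EL3"],
    "Word1": ["DB_EL1"],
}
matching = max_matching(candidates)
every_token_mapped = len(matching) == len(candidates)
```

In this example `Phrase2` first takes `DB_EL1`, then gets re-routed to `DB_EL2` so that `Word1` can be matched as well, which is the augmenting step that a greedy assignment would miss.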

Page 20: Provably Reliable QA Interfaces

PRECISE graph representation

“What are the Paul Mazursky films with Woody Allen?”

[Figure: graph with node columns Value Tokens (What, Woody Allen, Paul Mazursky), DB Values (Movies = what, Actor = Woody Allen, Director = Woody Allen, Director = Paul Mazursky), DB Attributes (Movies, Actor, Director), and Attribute Tokens (film); additional node labels S, E, I, K2]

Page 21: Provably Reliable QA Interfaces

PRECISE Architecture

[Diagram components: Database, Lexicon, Tokenizer, Matcher, Query Generator, Equivalence Checker, Parser. Input: English Question. Output: SQL Query + Answer Set.]

Page 22: Provably Reliable QA Interfaces

Ambiguity meets Reliability

  • Precise computes all valid mappings.
  • Many possibilities → clarifying Q.

What is the population of New York?

  • The population of New York City is…
  • The population of New York State is...
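
A minimal sketch of this policy, assuming each token comes with a list of candidate database elements: enumerate every combination, then answer, ask for clarification, or decline depending on how many interpretations remain. The lexicon entries and function names are hypothetical, and the sketch omits the constraint checks that would make a mapping "valid" in the sense used earlier.

```python
from itertools import product


def all_mappings(candidates):
    """Enumerate every assignment of one database element per token."""
    tokens = list(candidates)
    choices = product(*(candidates[token] for token in tokens))
    return [dict(zip(tokens, choice)) for choice in choices]


def respond(candidates):
    """Decide how to react based on the number of interpretations."""
    mappings = all_mappings(candidates)
    if not mappings:
        return "reject", mappings   # no interpretation: ask the user to rephrase
    if len(mappings) == 1:
        return "answer", mappings   # unambiguous: answer directly
    return "clarify", mappings      # ambiguous: present the alternatives


# Hypothetical lexicon entries for "What is the population of New York?"
candidates = {
    "population": ["Population"],
    "New York": ["City = New York", "State = New York"],
}
action, interpretations = respond(candidates)
```

With the assumed entries above, the two readings of "New York" yield two interpretations, so the sketch returns `"clarify"` rather than guessing.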

Page 23: Provably Reliable QA Interfaces

Experiments

  • Systems: Mooney, PRECISE, EnglishQuery
  • Datasets: sets of NL questions labeled with SQL queries
      Geography (846)
      Jobs (577)
      Restaurants (224)
  • Measure:
      PRECISE’s performance
      Prevalence of semantically tractable questions

Page 24: Provably Reliable QA Interfaces

[Bar chart: Fraction Answered (axis ticks 30 to 100) for Mooney, Precise, and EQ on the Restaurants and Jobs datasets]

Recall = Qanswered/Q

Recall > 75 %

Page 25: Provably Reliable QA Interfaces

Error Rate

[Bar chart: Mooney, Precise, and EQ on the Restaurants and Jobs datasets (axis ticks 40 to 100)]

Precision = Qcorrect/Qanswered

PRECISE made no mistakes on semantically tractable questions
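
The two measures as defined on these slides (recall as the fraction of questions answered, precision as the fraction of answered questions that are correct) can be written out directly. The counts below are made-up illustrative numbers, not results from the experiments.

```python
def recall(answered: int, total: int) -> float:
    """Recall as defined on the slides: Qanswered / Q."""
    return answered / total


def precision(correct: int, answered: int) -> float:
    """Precision as defined on the slides: Qcorrect / Qanswered."""
    return correct / answered


# Illustrative counts only (hypothetical, not experimental data):
# a system that declines 40 of 200 questions but never answers wrongly.
total, answered, correct = 200, 160, 160
system_recall = recall(answered, total)
system_precision = precision(correct, answered)
```

A system that refuses questions it cannot interpret trades recall for precision: in this made-up case recall is 0.8 while precision stays at 1.0.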

Page 26: Provably Reliable QA Interfaces

Exact – Case Study

  • Exact is an NLI to the Panasonic KX-TC1040W telephone/answering machine.
  • It uses Precise to formulate goals for the Blackbox planner (Kautz & Selman).
  • The database model for the phone has 5 relations (between 2 and 11 attributes each) and the action set includes 37 actions.

Page 27: Provably Reliable QA Interfaces

Exact – Diagram

[Diagram labels: English Input, NLIA, NLIDB, SQL statements, Translator, Goal, Planner, Plan, DB, Devices]

Page 28: Provably Reliable QA Interfaces

Conclusion

  • Click-it UI paradigm is fraying.
  • Need reliable NLIs.
  • Semantically Tractable sentences.
  • Taxonomic Theory.
  • Precise and Exact implementations.
  • Promising, preliminary experiments.