Page 1: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Emergence of Gricean Maxims from Multi-agent Decision Theory

Adam Vogel
Stanford NLP Group

Joint work with Max Bodoia, Chris Potts, and Dan Jurafsky

Page 2: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Decision-Theoretic Pragmatics

Gricean cooperative principle:

Make your contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.

Page 3: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Decision-Theoretic Pragmatics

Gricean Maxims:
• Be truthful: speak with evidence
• Be relevant: speak in accordance with goals
• Be clear: be brief and avoid ambiguity
• Be informative: say exactly as much as needed

Page 4: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Emergence of Gricean Maxims

Co-operative principle

• Be truthful
• Be relevant
• Be clear
• Be informative

???

Approach: Operationalize the co-operative principle
Tool: Multi-agent decision theory
Goal: Maxims emerge from rational behavior

Joint utility Rationality

Page 5: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Related Work

• One-shot reference tasks
– Generating spatial referring expressions [Golland et al. 2010]
– Predicting pragmatic reasoning in language games [Stiller et al. 2011]
• Interpreting natural language instructions
– Learning to read help guides [Branavan et al. 2009]
– Learning to follow navigational directions [Vogel and Jurafsky 2010] [Artzi and Zettlemoyer 2013] [Chen and Mooney 2011] [Tellex et al. 2011]

Page 6: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

CARDS Task

Page 7: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Outline

• Spatial semantics
• ListenerBot: single-agent advice taker
– Can accept advice, never gives it
• DialogBot: multi-agent decision maker
– Gives advice by tracking the other player’s beliefs

Page 8: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Spatial Semantics

“in the top left of the board” → BOARD(top;left)
“on the left side” → BOARD(left)
“right in the middle” → BOARD(middle)

MaxEnt Classifier w/ Bag of Words

Estimated from Corpus Data
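The mapping from utterances to BOARD(...) meanings can be sketched as a small maximum-entropy (multinomial logistic regression) classifier over bag-of-words features. This is a minimal illustration only: the training pairs, the function names, and the learning rate are assumptions made up for the example, whereas the classifier on the slide was estimated from corpus data.

```python
import math
from collections import defaultdict

# Hypothetical toy data; the slide's classifier was estimated from a corpus.
TRAIN = [
    ("in the top left of the board", "BOARD(top;left)"),
    ("top left corner", "BOARD(top;left)"),
    ("on the left side", "BOARD(left)"),
    ("left side of the board", "BOARD(left)"),
    ("right in the middle", "BOARD(middle)"),
    ("in the middle of the board", "BOARD(middle)"),
]
LABELS = sorted({label for _, label in TRAIN})


def features(utterance):
    """Bag of words: each distinct lowercase token is a binary feature."""
    return set(utterance.lower().split())


def predict_proba(weights, utterance):
    """Softmax over the per-label sums of word weights."""
    words = features(utterance)
    scores = {y: sum(weights[(w, y)] for w in words) for y in LABELS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def train(data, epochs=200, rate=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for utterance, gold in data:
            probs = predict_proba(weights, utterance)
            for w in features(utterance):
                for y in LABELS:
                    target = 1.0 if y == gold else 0.0
                    weights[(w, y)] += rate * (target - probs[y])
    return weights


def classify(weights, utterance):
    """Return the most probable meaning for an utterance."""
    probs = predict_proba(weights, utterance)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    model = train(TRAIN)
    print(classify(model, "on the left side"))
```

The probabilities from `predict_proba` are what a belief update would consume when a message is treated as a noisy observation of the card location.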

Page 9: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Complexity Ahoy

• Approximate decision making only feasible for problems with <10k states!

[Chart: log-scale axis with ticks from 100 to 10,000,000,000]

Page 10: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Semantic State Representation
• Divide board into 16 regions
• Cluster squares based on meanings
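A rough sketch of why clustering helps with the state-count limit. The board size (20 x 20) and the band-based grouping are hypothetical stand-ins: the slide clusters squares by their learned meanings, not by fixed bands, and it is an assumption here that both the card and the player location are tracked at region granularity.

```python
from collections import defaultdict

WIDTH, HEIGHT = 20, 20  # hypothetical board size, not given on the slide
BANDS = 4               # 4 x 4 bands gives the 16 regions named on the slide


def region_of(x, y):
    """Stand-in for a meaning-based cluster: the band a square falls in."""
    return (x * BANDS // WIDTH, y * BANDS // HEIGHT)


def cluster_squares():
    """Group every square of the board under its region."""
    regions = defaultdict(list)
    for x in range(WIDTH):
        for y in range(HEIGHT):
            regions[region_of(x, y)].append((x, y))
    return regions


regions = cluster_squares()

# A state pairs a card location with a player location.
raw_states = (WIDTH * HEIGHT) ** 2
clustered_states = len(regions) ** 2

if __name__ == "__main__":
    print(len(regions), raw_states, clustered_states)
```

Under these assumptions the square-level model is well above the 10k-state limit, while the region-level model is far below it.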

Page 11: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Outline

• Spatial semantics
• ListenerBot: single-agent advice taker
– Can accept advice, never gives it
• DialogBot: multi-agent decision maker
– Gives advice by tracking the other player’s beliefs

Page 12: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Partially Observable Markov Decision Process (POMDP)

Or: An HMM you get to drive!

Page 13: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

State space S: hidden configuration of the world
• Location of card
• Location of player

Page 14: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Action space A: what we can do
• Move around the board
• Search for the card

Page 15: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Observations: sensor information + messages
• Whether we are on top of the card
• BOARD(right;top) etc.

Page 16: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Observation Model: sensor model
• We see the card if we search for it and are on it
• For messages

Page 17: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Reward R(s,a): value of an action in a state
• Large reward if in the same square as the card
• Every action adds small negative reward

Page 18: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Transition T(s’|a,s): dynamics of the world
• Travel actions change player location
• Card never moves

Page 19: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Initial belief state: distribution over S
• Uniform distribution over card location
• Known initial player location
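The POMDP components listed on the preceding slides can be put together in a toy form. This is a minimal sketch under stated assumptions: the 2 x 2 board, the action names other than SEARCH, and the reward magnitudes (100 and -1) are all hypothetical, and messages are left out of the observation.

```python
from itertools import product

# Hypothetical 2 x 2 board; the real task uses a much larger board.
SQUARES = [(x, y) for x in range(2) for y in range(2)]

# State space S: (player location, card location).
STATES = list(product(SQUARES, SQUARES))

# Action space A: travel actions plus SEARCH.
MOVES = {"UP": (0, 1), "DOWN": (0, -1), "LEFT": (-1, 0), "RIGHT": (1, 0)}
ACTIONS = list(MOVES) + ["SEARCH"]


def transition(state, action):
    """T(s'|a,s), deterministic here: moves shift the player, the card stays."""
    player, card = state
    if action in MOVES:
        dx, dy = MOVES[action]
        target = (player[0] + dx, player[1] + dy)
        if target in SQUARES:  # moving off the board leaves the player in place
            player = target
    return (player, card)


def reward(state, action):
    """R(s,a): small cost per action, large bonus when on the card's square."""
    player, card = state
    step_cost = -1.0                           # assumed magnitude
    bonus = 100.0 if player == card else 0.0   # assumed magnitude
    return step_cost + bonus


def observe(state, action):
    """Sensor model: the card is seen only when searching on its square."""
    player, card = state
    return action == "SEARCH" and player == card


def initial_belief(player_start):
    """Known player location, uniform distribution over the card location."""
    return {(player_start, card): 1.0 / len(SQUARES) for card in SQUARES}
```

Because the player's own location is known and the card never moves, all of the agent's uncertainty sits in the card component of the state.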

Page 20: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Belief Update:
Action: SEARCH
Observation: (Card not here, )

Page 21: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Belief Update:

Page 22: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Belief Update:
Action: SEARCH
Observation: (Card not here, “left side”)

Page 23: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Belief Update:
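The two belief updates above can be sketched as one Bayes filter step over the card location. Assumptions in this sketch: a 2 x 2 board, a known player location, a message likelihood of 0.9 / 0.1 standing in for the learned spatial semantics, and conditional independence of the sensor reading and the message given the card location. None of these numbers come from the slides.

```python
SQUARES = [(x, y) for x in range(2) for y in range(2)]  # hypothetical board


def sensor_likelihood(card_seen, player, card):
    """After SEARCH, the card is seen exactly when the player stands on it."""
    return 1.0 if card_seen == (player == card) else 0.0


def message_likelihood(message, card):
    """Assumed P(message | card location); None means no message was heard."""
    if message is None:
        return 1.0
    if message == "left side":
        return 0.9 if card[0] == 0 else 0.1  # column 0 is the left side
    raise ValueError(f"no semantics for message: {message!r}")


def update_belief(belief, player, observation):
    """Reweight each card location by the observation likelihood, then normalize."""
    card_seen, message = observation
    unnormalized = {
        card: prob
        * sensor_likelihood(card_seen, player, card)
        * message_likelihood(message, card)
        for card, prob in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {card: prob / total for card, prob in unnormalized.items()}


uniform = {card: 1.0 / len(SQUARES) for card in SQUARES}
player = (0, 0)

# SEARCH, card not here, no message: only the searched square is ruled out.
after_search = update_belief(uniform, player, (False, None))

# SEARCH, card not here, "left side": mass shifts to the other left square.
after_message = update_belief(uniform, player, (False, "left side"))
```

With no message, the remaining three squares stay equally likely; with "left side", most of the mass moves to the unsearched square in the left column.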

Page 24: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Decision Making

Choose policy

Goal: Maximize expected reward

Solution: Perseus, an approximate value iteration algorithm [Spaan et al. 2005]

Computational complexity: PSPACE!

Expected: immediate reward + future reward
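For reference, the quantity that point-based solvers such as Perseus approximate can be written as the Bellman equation over beliefs. This is the textbook form, not copied from the slide; the discount factor \gamma, the observation set \Omega, and the notation b^{a,o} for the belief after taking action a and observing o are assumptions.

```latex
V^{*}(b) \;=\; \max_{a \in A} \Big[
    \underbrace{\sum_{s \in S} b(s)\, R(s,a)}_{\text{expected immediate reward}}
    \;+\;
    \underbrace{\gamma \sum_{o \in \Omega} P(o \mid b, a)\, V^{*}\!\big(b^{a,o}\big)}_{\text{expected future reward}}
\Big]
```

The policy then picks, for each belief b, an action that attains the maximum.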

Page 25: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Outline

• Spatial semantics
• ListenerBot: single-agent advice taker
– Can accept advice, never gives it
• DialogBot: multi-agent decision maker
– Gives advice by tracking the other player’s beliefs

Page 26: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

DialogBot

• (Approximately) tracks beliefs of other player
• Speech actions change beliefs of other player
• Model: Decentralized POMDP (Dec-POMDP)
– Problem: NEXP-hard!!

Top!

Page 27: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Each agent selects its own action

Page 28: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Each agent receives its own observation

Page 29: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Transition depends on both actions

Page 30: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Reward is shared between agents
Formalization of the co-operative principle
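The four Dec-POMDP properties on the preceding slides (own actions, own observations, joint transition, shared reward) can be sketched as a single joint step function. The 1-D corridor, the action names other than SEARCH, the reward magnitudes, and the choice to pay the bonus only when both players stand on the card are assumptions for illustration; speech actions are left out.

```python
CORRIDOR = range(4)  # hypothetical 1-D board with squares 0..3
MOVES = {"LEFT": -1, "RIGHT": 1}


def move(position, action):
    """Apply one agent's action; SEARCH and off-board moves leave it in place."""
    target = position + MOVES.get(action, 0)
    return target if target in CORRIDOR else position


def joint_step(state, actions):
    """One Dec-POMDP step for two agents.

    state   = (player 1 location, player 2 location, card location)
    actions = (action of player 1, action of player 2)
    Returns the next state, one observation per agent, and one shared reward.
    """
    p1, p2, card = state
    a1, a2 = actions

    # The transition depends on both actions; the card never moves.
    n1, n2 = move(p1, a1), move(p2, a2)
    next_state = (n1, n2, card)

    # Each agent receives only its own observation.
    observations = (
        a1 == "SEARCH" and n1 == card,
        a2 == "SEARCH" and n2 == card,
    )

    # A single reward shared by both agents (assumed magnitudes).
    step_cost = -1.0 * len(actions)
    bonus = 100.0 if n1 == card and n2 == card else 0.0
    return next_state, observations, step_cost + bonus
```

Because the reward is a single number for the pair, an agent can only raise its own return by raising its partner's, which is the sense in which the shared reward formalizes the co-operative principle.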

Page 31: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Exact Multi-agent Belief Update

Time

Page 32: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Approximate Multi-agent Belief Update

Time

Page 33: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Single-agent POMDP Approximation

Other agent belief transition model

World transition model

Resulting POMDP has states

Page 34: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

What to say?

Page 35: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

“Top”

Page 36: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

“Middle”

Page 37: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

“Right”

Page 38: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

“Right”

Page 39: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Return to Grice

• Be truthful
• Be relevant
• Be clear
• Be informative

Page 40: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Cooperating DialogBots

Middle of the board

Page 41: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Cooperating DialogBots

Middle of the board

Page 42: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Adolescent DialogBots

Top

Page 43: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Return to Grice

• Be truthful: DialogBot speaks with evidence
• Be relevant: DialogBot gives advice to help win the game
• Be clear
• Be informative

Page 44: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Experimental Results
• Evaluate pairs of agents from 197 random initial states
• Agents have 50 high-level moves to find the card

Bots                        % Success   Average High-Level Actions
ListenerBot & ListenerBot   84.4%       19.8
ListenerBot & DialogBot     87.2%       17.5
DialogBot & DialogBot       90.6%       16.6

Page 45: Emergence of  Gricean  Maxims from Multi-agent Decision Theory

Emergent Gricean Behavior

• Be truthful: DialogBot speaks with evidence
• Be relevant: DialogBot gives advice to help win
• Be clear: need variable costs on messages
• Be informative: requires levels of specificity

ACL 2013: Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs

From joint reward, not hard coded

Future Work: intentions, joint plans, deeper belief nesting

Thanks!