Can Negotiation Breakdown Probabilities of Laissez-Faire Agents Be Derived A Priori?


28 September 2006 p1 r.p.loui AIVR UIUC

Can Negotiation Breakdown Probabilities of Laissez-Faire Agents Be Derived A Priori?

R. Loui, R. Ratkowski, J. Rosen
Department of Computer Science and Engineering
Washington University
St. Louis, USA

Trailers/Previews

• Data Mining on OC192 Data Streams (10 Gbps)
– A.k.a. "Streaming AI"
– With John Lockwood (from UIUC), PI
– 1M x speedup of classified intelligence task
– Better performance than available software
– Better performance than SVM methods
– Related to FPgrep, FPsed, FPawk patent US 7093023
  • Methods, systems, and devices using reprogrammable hardware for high-speed processing of streaming data to find a redefinable pattern and respond thereto
– With Moshe Looks, papers on hierarchical streaming clustering

Trailers/Previews

• Scripting Languages and The New Programming Pragmatics
– http://www.cse.wustl.edu/~loui/praiseieee.txt
– The real shock is that academia continues to reject the sea change in programming practices brought about by scripting.
– Scripting was not enervating but was actually renewing: programmers who viewed code generation as tedious and tiresome … viewed scripting as rewarding self-expression or recreation.
– I personally believe that CS1 Java is the greatest single mistake in the history of computing curricula.
– Linguists recognize something above syntax and semantics, and they call it "pragmatics". We are entering an era of comparative programming language study when the issues are higher-level, social, and cognitive too.

Trailers/Previews

• Moshe Looks, D.Sc. 06 (expected)
– Externally advised by David Goldberg (UIUC) and Martin Pelikan (formerly UIUC)
– COMPETENT PROGRAM EVOLUTION
– My thesis is that the properties of programs and program spaces can be leveraged as inductive bias to reduce the burden of manual representation-building, leading to competent program evolution.
– The central contributions of this dissertation are
  • a view of the requirements for competent program evolution, and
  • the design of a procedure, meta-optimizing semantic evolutionary search (MOSES)

Trailers/Previews

• Dynamics of Rule Revision and Strategy Revision
– With M. Looks, B. Cynamon (U Chicago), A. Schiller (Princeton)
– A.k.a. Legislature vs. Population games
– H.L.A. Hart: There is a limit, inherent in the nature of language, to the guidance which general language can provide.
– Abridgement = Projection of a veridical utility function
– Scenario extinction
– Sheep, Weasels, and Gorillas

Today's Abstract

We present a different AI model of negotiation where agents are driven by dynamic expectations (there is no solution concept and there is no recursive modeling of beliefs). We require two assumptions to paint a new picture:

(1) there is an empirically observable objective probability of breakdown that is monotonic (at some granularity) in elapsed time since last progress;

(2) there is a nonstandard utility attached to the act of unilateral breakdown: a process utility that models the satisfaction of breaking down on a non-cooperative negotiating partner. This is a procedural fairness adjustment, not the substantive distributive fairness effect that has been trendy in the economics literature.

We observe the variety of behaviors that can be generated by agents constructing action according to such Pessimism-Punishment (PP) negotiation models. We define a laissez-faire path for two PP agents starting in a given position and the proper calibration of their breakdown probabilities conditioned only on position. Finally, we discuss what iterative process could be used to reduce a priori miscalibration of breakdown probabilities.

Negotiation Behavior

• Social Psychology: Dean Pruitt
– Logrolling issues
• Management Sci: Howard Raiffa / Max Bazerman - Margaret Neale
– Integrative agreement
• Law: Roger Fisher – William Ury – Bruce Patton
– Principled negotiation
• Artificial Intelligence:
– Problem solving: R Davis – R Smith / V Lesser
– Shared planning dialogue: S Carberry / G Ferguson – J Allen
– Non-ideal game theory: E Durfee / S Rosenschein / T Sandholm
– Argumentation: K Sycara / S Parsons – C Sierra – N Jennings
• Economics: John Nash / Ariel Rubinstein
– Solution concept
– Equilibrium

Negotiation Behavior: Equilibrium

• Mathematical curiosity (cf. Axelrod)
• Starts with "Solution concept (I or II)": reduction of uncertainty to a distinct outcome
• Epistemologically far-fetched
• Empirically ridiculous
• Philosophically indefensible
• Useless in the design of negotiating agents


Negotiation Behavior: AI

• A place for language / argument
• A place for introspection on utilities
• A range of interesting & reasonable behaviors
• Participation in the process of negotiation:
– Exchanging proposals
– Reacting to proposals


My Theory Pt. I

• Observation: parties to a negotiation (can) construct a probability distribution over potential settlements

[Figure sequence: potential settlements between Party 1 and Party 2. Labels, in order of appearance:]
• Party 1's aspiration; Party 2's aspiration
• Party 2's proposals at t; Party 1's proposals at t
• Inadmissible (dominated) at t
• In black: admissible settlements at t (probability of agreement is non-zero)
• 1's aspiration; 2's aspiration
• Breakdown (BATNA); breakdown row; breakdown column
• Breakdown would occur here (BATNA)
• 1's security level; 2's security level
• 2 would rather break down; 1 would rather break down

$Eu_1 \mid s = 51$

$Eu_2 \mid s = 49\alpha + 54(1 - \alpha)$

Prob(bd) = ?


My Theory Pt. I

• Observation: parties to a negotiation (can) construct a probability distribution over potential settlements

• What kind of probability?

My Theory Pt. I

• Observation: parties to a negotiation (can) construct an objective, empirical or epistemological* (NOT SUBJECTIVE*) probability distribution over potential settlements from past experience in similar settings
• OBJECTIVE:
– Constructed from data
– Agree on P, given K
– Not committed to P(a) until queried about P(a)
• SUBJECTIVE:
– "Bayesian" (but not necessarily what AI people mean)
– Can change by new prior, new conditioning, or shift in feelings
– Total and consistent prior to querying


My Theory Pt. I

• Observation: from a probability distribution over potential settlements, there is an expected utility given settlement

• Observation: there is a probability of breakdown p(bd)

[Figure labels: probability of breakdown P(bd); gap]

My Theory Pt. I

• Observation: from a probability distribution (at t) over potential settlements, there is an expected utility given settlement (at t)

• Observation: there is a probability of breakdown $p_t(bd)$

My Theory Pt. I

• Observation: At t, calculate
1. An expected utility given settlement, $(Eu_t \mid s)$, and
2. An expected utility given continued negotiation, $Eu_t = (Eu_t \mid s)\,(1 - p_t(bd)) + u(bd)\,p_t(bd)$

• Definition: Rationality requires the agent, at t, to:
1. Extend an offer, o, if $Eu_t < u(o)$
2. Accept an offer, a, if $Eu_t < u(a)$, $a \in$ offers-to-you(t)
3. Break down unilaterally if $Eu_t < u(bd)$
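A minimal Python sketch of the expected-utility calculation and the three rationality conditions above, assuming utilities are plain floats. The names `expected_utility`, `licensed_actions`, and `Decision` are illustrative and do not come from the talk. Because the definition is stated with "if" rather than "iff", the function only reports which actions are licensed at t, not which one the agent must take.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Decision:
    offers_to_extend: List[float]  # utilities u(o) of own candidate offers with Eu_t < u(o)
    offers_to_accept: List[float]  # utilities u(a) of received offers with Eu_t < u(a)
    break_down: bool               # True when Eu_t < u(bd)


def expected_utility(eu_given_settlement: float, p_bd: float, u_bd: float) -> float:
    """Eu_t = (Eu_t | s) * (1 - p_t(bd)) + u(bd) * p_t(bd)."""
    if not 0.0 <= p_bd <= 1.0:
        raise ValueError("p_bd must be a probability in [0, 1]")
    return eu_given_settlement * (1.0 - p_bd) + u_bd * p_bd


def licensed_actions(eu_t: float, u_bd: float,
                     candidate_offers: Sequence[float],
                     offers_to_you: Sequence[float]) -> Decision:
    """Apply the three 'if' conditions; all utilities are from this agent's view."""
    return Decision(
        offers_to_extend=[u for u in candidate_offers if eu_t < u],
        offers_to_accept=[u for u in offers_to_you if eu_t < u],
        break_down=eu_t < u_bd,
    )


# Hypothetical numbers: (Eu_t | s) = 51, p_t(bd) = 0.2, u(bd) = 40.
eu_t = expected_utility(51.0, 0.2, 40.0)
print(eu_t, licensed_actions(eu_t, 40.0, [48.0, 52.0], [45.0, 50.0]))
```

With these hypothetical inputs the agent would extend the offer worth 52 to itself, accept the received offer worth 50, and not break down.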

My Theory Pt. I

• Why not iff?
• Extend an offer, o, if $Eu_t < u(o)$
• Withhold an offer?, o, if $Eu_t > u(o)$
• There may be other reasons for acting earlier
• Constructivism:
– Multiple ways of constructing probability
– Multiple ways of deriving/justifying/motivating action

My Theory Pt. I

• Empirical Observation: At sufficient granularity, p(bd) is increasing in the time since last progress

My Theory Pt. I
Pessimism

For sufficiently large Δ, where LP(t0) denotes last progress at t0:

$p_{t+\Delta}(bd \mid LP(t_0)) > p_t(bd \mid LP(t_0))$

What is progress? A non-trivial offer by the other party.

What does this mean? (At some granularity, the past record implies that:) if there are no offers, the probability of breakdown rises.

My Theory Pt. I
Pessimism

Linear Pessimism: $p(bd \mid NP(t)) = \pi t$

Exponential Pessimism: $p(bd \mid NP(t)) = 1 - e^{-\pi t}$

Delayed Linear Pessimism: $p(bd \mid NP(t)) = \pi \max(0, t - t_0)$

Whatever fits the empirical record!
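The three pessimism forms can be sketched in Python as follows. The function names are illustrative, and clipping the linear forms at 1 is an added assumption so that the result stays a probability; the slide does not say how large values of πt are handled.

```python
import math


def linear_pessimism(t: float, pi: float) -> float:
    """p(bd | NP(t)) = pi * t, clipped at 1 (the clipping is an assumption)."""
    return min(1.0, pi * t)


def exponential_pessimism(t: float, pi: float) -> float:
    """p(bd | NP(t)) = 1 - exp(-pi * t); stays below 1 for finite t >= 0."""
    return 1.0 - math.exp(-pi * t)


def delayed_linear_pessimism(t: float, pi: float, t0: float) -> float:
    """p(bd | NP(t)) = pi * max(0, t - t0), clipped at 1 (assumption)."""
    return min(1.0, pi * max(0.0, t - t0))


# Hypothetical parameters: pi = 0.1 per time unit, delay t0 = 5.
for t in (0.0, 2.0, 7.0, 50.0):
    print(t, linear_pessimism(t, 0.1), exponential_pessimism(t, 0.1),
          delayed_linear_pessimism(t, 0.1, 5.0))
```

Each function takes t as the length of the current period of non-progress, so a caller would reset t to 0 whenever the other party makes a non-trivial offer.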

[Figure: Eu over time]
Pessimism causes Eu to fall
Next offer is made at this time and prob(bd) resets to 0
Expectation starts to fall again

[Figure labels: reciprocated offers; offers]


Agreement reached as Eu < u1

[Figure labels: security; best offer received]

Whenever u(acc) > security, acceptance occurs before breakdown!
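One way to see why acceptance precedes breakdown under pure pessimism is the short derivation below. It is a sketch under two assumptions that the slide does not state: the expected utility given settlement is at least the security level (admissible settlements are not dominated by breakdown), and $p_t(bd)$ rises toward 1 while there is no progress. The symbol $\sigma$ for the security level is introduced here only to avoid a clash with the "s" in $Eu_t \mid s$.

```latex
\begin{align*}
Eu_t &= (Eu_t \mid s)\,\bigl(1 - p_t(bd)\bigr) + \sigma\, p_t(bd),
  \qquad \sigma = u(bd) = \text{security} \\
Eu_t - \sigma &= \bigl((Eu_t \mid s) - \sigma\bigr)\,\bigl(1 - p_t(bd)\bigr) \;\ge\; 0
  \qquad \text{(assuming } (Eu_t \mid s) \ge \sigma\text{)} \\
\text{so } Eu_t < u(bd) &\text{ never holds, while } Eu_t \to \sigma < u(acc)
  \text{ as } p_t(bd) \to 1 .
\end{align*}
```

The breakdown condition $Eu_t < u(bd)$ is therefore never met, whereas the acceptance condition $Eu_t < u(acc)$ is eventually met for any offer worth more than security, however slightly.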

[Figure labels: security; best offer received]

Would you accept an 11-cent offer if your security were 10 cents?

My Theory Pt. II

• Observation: You wouldn't accept 11¢ over 10¢ security, nor 51¢ over 50¢ security

• Observation: You wouldn't let your kid do it

• Observation: Your mother wouldn't let you do it

• Observation: Your lawyer wouldn't let you do it

• Observation: Your accountant wouldn't let you do it

• Proposition: We shouldn't automate our agents to do it

My Theory Pt. II

• Question: Isn't this an issue of distributive justice?

• Answer: Substantive fairness is trivial to model by transforming utilities

• Observation: There may be a procedural fairness issue

My Theory Pt. II

• Procedural fairness:
– The more the other party withholds progress, the more you will punish
– When the other party resumes cooperation, you are willing to forgo punishment

My Theory Pt. II
Resentment

u(bd) = security + resentment(t)

What is resentment?
1. Dignity
2. Pride
3. Investment in society
4. Protection against non-progressive manipulators
5. A GENUINE source of satisfaction: non-material, transactional, personal(?), transitory(?)

My Theory Pt. II
Resentment

$u_t(bd) = \text{security} + \text{resentment}(t) = u(bd) + r(t)$

for NP(t), non-progress for a period t

What is resentment?
6. Attached to a speech/dialogue act: BATNA through breaking down vs. BATNA through agreement
7. A nonstandard utility (process utility)
8. Specific or indifferent (I-bd-you vs. you-bd-me)

My Theory Pt. II
Resentment

Linear resentment: $r(t) = \rho t$

Sigmoid resentment: $r(t) = r_{\max}\left(\dfrac{2}{1 + e^{-\rho t}} - 1\right)$

You either feel it or you don't – you can't fake it!
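A short Python sketch of the two resentment forms and the resulting breakdown utility u_t(bd) = security + r(t). The function names are illustrative, and t is assumed to be a non-negative period of non-progress.

```python
import math


def linear_resentment(t: float, rho: float) -> float:
    """r(t) = rho * t."""
    return rho * t


def sigmoid_resentment(t: float, rho: float, r_max: float) -> float:
    """r(t) = r_max * (2 / (1 + exp(-rho * t)) - 1); 0 at t = 0, approaches r_max."""
    return r_max * (2.0 / (1.0 + math.exp(-rho * t)) - 1.0)


def breakdown_utility(security: float, resentment: float) -> float:
    """u_t(bd) = security + r(t)."""
    return security + resentment


# Hypothetical parameters: security = 10, rho = 2 (linear), rho = 1 and r_max = 5 (sigmoid).
for t in (0.0, 1.0, 3.0, 10.0):
    print(t,
          breakdown_utility(10.0, linear_resentment(t, 2.0)),
          breakdown_utility(10.0, sigmoid_resentment(t, 1.0, 5.0)))
```

The sigmoid form is bounded by r_max, so it caps how much satisfaction the agent can take from punishing, while the linear form grows without limit.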

[Figure captions, in order of appearance:]

Eu never falls to u1

Actually accepts because resentment resets with progress
Resentment resets to zero each time there is progress
Nontrivial progress

Resentment might not reset to zero if there is memory
Agent breaks down before accepting

P&P Agents

Pessimism + Punishment
"purely" probabilistic

Variety of Behaviors

• Agent can make a series of offers, respond to offers
• Agent can wait, then offer, accept, or break down
• Agent can accept, offer, or break down immediately
• Agent can offer before accepting and vice versa
• Agent can break down before accepting and vice versa
• Agent can offer before breaking down and vice versa
• Agent can be on path to breakdown, then on path to acceptance
– because received offer changes Eu or resentment
– because extended offer changes Eu
• I wouldn't use this as my agent on eBay quite yet…

[Figures, each assuming no progress; the first three compare low-valued ρ with high-valued ρ:]
• Linear pessimism / linear specific punishment
• Linear pessimism / linear indifferent punishment
• Exponential pessimism / linear indifferent punishment
• Exponential pessimism / sigmoidal specific punishment: rare alternation between breakdown and acceptance
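The low-ρ versus high-ρ contrast for the linear pessimism / linear specific punishment case can be illustrated with a small no-progress simulation. This is a sketch under stated assumptions, not the procedure behind the original figures: time is stepped in increments of `dt`, resentment is treated as specific so that it raises only the utility of the agent's own unilateral breakdown and not the u(bd) term inside Eu_t, and breakdown is checked before acceptance when both become licensed in the same step. The function name and all numbers are hypothetical.

```python
from typing import Tuple


def first_action_without_progress(eu_given_settlement: float, security: float,
                                  best_offer: float, pi: float, rho: float,
                                  dt: float = 0.01, t_max: float = 100.0) -> Tuple[str, float]:
    """Return the first licensed action ('break down' or 'accept') and its time.

    Uses linear pessimism p = min(1, pi * t) and linear resentment r = rho * t.
    Returns ('none', t_max) if neither condition is met by t_max.
    """
    steps = int(t_max / dt)
    for k in range(steps + 1):
        t = k * dt
        p_bd = min(1.0, pi * t)
        # Expected utility of continuing, with plain security as u(bd) (assumption).
        eu_t = eu_given_settlement * (1.0 - p_bd) + security * p_bd
        if eu_t < security + rho * t:   # Eu_t < u_t(bd): unilateral breakdown
            return "break down", t
        if eu_t < best_offer:           # Eu_t < u(a): accept the best offer received
            return "accept", t
    return "none", t_max


# Hypothetical agent: (Eu | s) = 60, security = 10, best offer received = 30, pi = 0.1.
print(first_action_without_progress(60.0, 10.0, 30.0, pi=0.1, rho=1.0))
print(first_action_without_progress(60.0, 10.0, 30.0, pi=0.1, rho=5.0))
```

With these numbers Eu_t falls as 60 − 5t, so acceptance becomes licensed just after t = 6; a low ρ of 1 leaves breakdown unlicensed until after t = 8.3, whereas a high ρ of 5 licenses breakdown just after t = 5, before the agent would have accepted.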


Laissez-Faire Paths

What happens when two P&P agents interact?

[Figure sequence. Labels, in order of appearance:]
• Dominated by BATNA; 1's offers in this round; 2's offer in this round; Eu2; 2's aspiration; BATNA = <u1(bd), u2(bd)>; 1's aspiration; Eu1
• Eu2; Eu1(t=2); Eu1(t=1)
• 1's security + resentment; 2's security + resentment; 1's offers in this round
• 1 breaks down; amount of (specific) resentment

Laissez-faire path is <Eu1, Eu2> through time

Does the starting offer affect the laissez-faire path?

[Figure sequence. Captions, in order of appearance:]
• Both are generous at the start; 1 is generous at start, 2 is not; 2 is generous at start, 1 is not
• Breakdown at t=2 (pure pessimism)
• Different laissez-faire paths
• Breakdown at t=5 with resentment
• All paths lead to breakdown
• In a different negotiation, some paths lead to acceptance, some to breakdown (fixed agent characteristics, varied acceleration of offers)
• A third example where player 1 can guarantee an acceptance outcome with the right initial offers


Controlling the Path

• Definition of rationality – Does not preclude accelerating offers

• You do not have to accept the laissez-faire outcome– Steer

– Estimate

– Control

• You can compensate for high aspiration or low security

• You can avoid gaps in the timing or density of offers

An Envelope of Normalcy

Can you keep the path in a narrow envelope? (The axis passes through <uA(bd), uB(bd)>.)

If so, then agreement is possible.

Is the model reasonable?

• Probability of Reaching an Agreement as a function of pessimism and punishment: [figure]
• Substantive Utility as a function of pessimism and punishment: [figure]
• Substantive+Procedural Utility as a function of pessimism and punishment: [figure]
• Where are the laissez-faire states, in terms of agents' relative power? When both parties do not have power, negotiation ends. [figure]

$\text{power} = \dfrac{u_t(bd) - u_1}{Eu_t - u_1}$


Some Questions

• Can you estimate a ppagent's parameters and manipulate? (answer = mildly)


Some Questions

• Does ppagent parameter selection matter if there is meta-utility on fairness, time, % aspiration, % security?

Start with a game in strategic form, study the laissez-faire outcome for each pair of and

Payoffs adjusted for meta-utility on fairness, time, concessions, aspiration, and security

Some Questions

Can we watch / play?
http://www.cs.wustl.edu/~loui/313f04/project1/select.cgi
http://k9.cs.wustl.edu/~cs313/04loui/select.cgi


Review of Main Points

• Mathematical model can conceive of negotiation as a process to be controlled

• Simple probabilistic approach is appealing but requires non-standard process utility to make sense

• Also requires probability logic– Constructive– Objective

• Toward rational emotional agents (Lesser)

Thanks to M. Foltz, V. Reddy, D. Weisberger, I. Figelman, D. Moore, K. Hashimoto, A. Jump, F. Tohme, K. Larson, S. Braynov, J. Nachbar, B. Dheeravongkit, L. Cai, K. Chin, R. Bujans, T. Shen, M. Looks, E. Wofsey, R. Ratkowski, J. Rosen, S. Grubor, K. Ormsby, J. Badino, R. Pless, M.A. Clark.

Research Funding from:

• NSF Information Technology and Organizations Program: Multi-Agent Negotiation, as co-PI with T. Sandholm, July 1997 to July 1999. 9610122.

• NSF Office of Cross-Disciplinary Affairs and Interactive Systems Program: Summer Undergraduate Research Assistants, March 1995 to February 1996. 9415573.

• NSF Office of Cross-Disciplinary Affairs: REU Continuing Award, April 1992 to September 1994. 9123643.


Can Negotiation Breakdown Probabilities of Laissez-Faire Agents Be Derived A Priori?

P(bd) Miscalibration?

What is pA(bd) in this state?

Is the empirical record detailed enough to permit conditioning on
• <EuA,EuB> state?
• <EuA,EuB,tA,tB> state?
• <EuA,EuB,tA,tB,parmsA,parmsB>?

P(bd) Miscalibration?

What is %(bd) in this state?
When simulated, averaging over parmsB?

RECALIBRATE w.r.t. population of parmsB

Is p(bd) self-confirming?
No: high p(bd) pushes agent to agreement

Iterative Procedure

1. Given parmsA, including pA(bd)(t)(uA,uB)
2. Start with a population O of parmsB
3. For each <uA,uB,tA,tB>:
   1. Sample randomly over O
      1. Simulate outcome(parmsA, parmsB, <uA,uB,tA,tB>)
   2. Calculate %(bd)(uA,uB,tA,tB)
4. Set pA(bd)(tB)(uA,uB) 1/2= %(bd)(uA,uB,tA,tB)
5. Enforce monotonicity of p(bd) in t
6. Repeat at 3
7. Hope for convergence
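The loop structure of the iterative procedure can be sketched in Python. Everything below is a toy stand-in chosen to keep the example short and runnable, not the authors' simulator: the state is collapsed to the elapsed no-progress time alone (no uA, uB grid), agent B is reduced to a single hypothetical "patience" parameter after which it walks away, step 4 is read as a plain assignment of the simulated breakdown frequency, and monotonicity is enforced with a running maximum. All names and numbers are illustrative.

```python
import random
from typing import List, Sequence


def simulate_breakdown(p_a: Sequence[float], t_start: int, patience_b: int,
                       eu_settle: float, security: float, offer: float) -> bool:
    """Toy 'simulate outcome': True if the run from t_start ends in breakdown."""
    for t in range(t_start, len(p_a)):
        eu_t = eu_settle * (1.0 - p_a[t]) + security * p_a[t]
        if eu_t < offer:        # A accepts B's standing offer
            return False
        if t >= patience_b:     # B walks away after too long without progress
            return True
    return True                 # horizon reached without agreement


def recalibrate(p_a: Sequence[float], population_b: Sequence[int],
                eu_settle: float, security: float, offer: float,
                samples: int = 200, rounds: int = 10, seed: int = 0) -> List[float]:
    """Repeat: simulate against sampled parmsB, set p(bd) to %(bd), enforce monotonicity."""
    rng = random.Random(seed)
    p = list(p_a)
    for _ in range(rounds):
        freq = []
        for t_start in range(len(p)):
            draws = [rng.choice(population_b) for _ in range(samples)]
            n_bd = sum(simulate_breakdown(p, t_start, b, eu_settle, security, offer)
                       for b in draws)
            freq.append(n_bd / samples)          # %(bd) in this state
        new_p, running = [], 0.0
        for f in freq:                           # enforce monotonicity of p(bd) in t
            running = max(running, f)
            new_p.append(running)
        if new_p == p:                           # fixed point reached
            break
        p = new_p
    return p


# Hypothetical starting belief: linear pessimism with pi = 0.1 on t = 0..10.
p0 = [min(1.0, 0.1 * t) for t in range(11)]
print(recalibrate(p0, population_b=[2, 4, 6, 8], eu_settle=60.0, security=10.0, offer=30.0))
```

Nothing in this sketch guarantees a fixed point. A high p(bd) pushes the agent toward earlier acceptance, which lowers the simulated breakdown frequency in the next round, so the values can oscillate; the loop simply stops after `rounds` iterations.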
