Upload
miles-barrett
View
216
Download
0
Embed Size (px)
Citation preview
Myopic and non-myopic agent optimization in game theory, economics, biology and
artificial intelligence
Michael J Gagen
Institute of Molecular Bioscience
University of Queensland
Email: [email protected]
Kae Nemoto
Quantum Information Science
National Institute of Informatics, Japan
Email: [email protected]
Overview: Functional Optimization in Strategic Economics (and AI)
Mathematics / Physics (minimize action)
Formalized by von Neumann and Morgenstern, Theory of Games and Economic Behavior (1944)
Overview: Functional Optimization in Strategic Economics (and AI)
Strategic Economics (maximize expected payoff)
Formalized by von Neumann and Morgenstern, Theory of Games and Economic Behavior (1944)
Functionals: Fully general
Not necessarily continuous
Not necessarily differentiable
Nb: Implicit Assumption of Continuity !!
von Neumann’s “myopic” assumption
Overview: Functional Optimization in Strategic Economics (and AI)
Strategic Economics (maximize expected payoff)
Evidence:
von Neumann & Nash used fixed point theorems in probability simplex equivalent to a convex subset of a real vector space
von Neumann and Morgenstern, Theory of Games and Economic Behavior (1944)J. F. Nash, Equilibrium points in n-person games. PNAS, 36(1):48–49 (1950)
Non-myopic Optimization
Correlations Constraints and forbidden regions
Overview: Functional Optimization in Strategic Economics (and AI)
No communications between players
∞ correlations & ∞ different trees constraint sets
Non-myopic Optimization
Overview: Functional Optimization in Strategic Economics (and AI)
Myopic “The” Game Tree lists All play options
Myopic One Constraint = One Tree
“Myopic” Economics (= Physics)
Myopic = Missing Information!
Correlation = Information
Chess:
“Chunking” or pattern recognition in human chess play
Experts: Performance in speed chess doesn’t degrade much Rapidly direct attention to good moves Assess less than 100 board positions per move Eye movements fixate only on important positions Re-produce game positions after brief exposure better than novices, but random positions only as well as novices
Learning Strategy = Learning information to help win game
Nemoto: “It is not what they are doing, its what they are thinking!”
What Information?
Optimization and Correlations are Non-Commuting!
Complex Systems Theory
Emergence of Complexity via correlated signals higher order structure
Optimization and Correlations are Non-Commuting!
Life Sciences (Evolutionary Optimization)
Selfish Gene Theory
Mayr: Incompatibility between biology and physicsRosen: “Correlated” Components in biology, rather than “uncorrelated” partsMattick: Biology informs information science
6 Gbit DNA program more complex than any human program, implicating RNA as correlating signals allowing multi-tasking and developmental control of complex organisms.
Mattick: RNA signals in molecular networksProkaryotic gene
mRNA
protein
Eukaryotic gene
mRNA & eRNA
protein
networking functions
Hidden layer
Optimization and Correlations are Non-Commuting!
Economics
Selfish independent agents: “homo economicus”
Challenges: Japanese Development Economics,Toyota “Just-In-Time” Production System
Optimization and Correlations are Non-Commuting!
o = F(i)
= F(t,d)
= Ft (d) {F(x,y,z), … ,F(x,x,z),…}
Functional Programming, Dataflow computation, re-write architectures, …
o
i
1 Player Evolving / Learning Machines (neural and molecular networks)endogenously exploit correlations to alter own decision tree, dynamics and optima
Discrepancies: Myopic Agent Optimization and Observation
Heuristic statistics
Iterated Prisoner’s Dilemma Iterated Ultimatum Game
Chain Store Paradox (Incumbent never fights new market entrants)
Sum-Over-Histories or
Path Integral formulation
Myopic Agent Optimization
Normal FormStrategic Form
Px
Py
von Neumann and Morgenstern (1944): All possible information = All possible move combinations for all histories and all futures
? ?
Optimization
Sum over all stages
Probability of each path
Payoff from each stage for each path
Sum over all paths to nth stage
Myopic Agent Optimization
Myopic Agent Optimization
Myopic agents ( probability distributions) uncorrelated no additional constraints
Backwards Induction & Minimax
x1 y1
1-p p
0 ≤ p ≤ 1/2
Non-Myopic Agent Optimization
Fully general, notationally emphasized by:
Optimization
Sum over all correlation strategies
Constraint set of each strategy
Payoff for each path
Sum over all paths given strategy
Probability of each strategy
Conditioned path probability
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
In 1950 Melvin Dresher and Merrill Flood devised a game later called the Prisoner’s Dilemma
Two prisoners are in separate cells and must decide to cooperate or defect
CooperationDefect
CKR: Common Knowledge of Rationality
Payoff Matrix
Py
Px
C D
C (2, 2) (0, 3)
D (3,0) (1,1)
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Myopic agent assumption
max
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Myopic agents: N max constraints
= 0
> 0 PNx,HN-1(1) = 1
=1 > 0 PN-1,x,HN-2(1) = 1
Simultaneous solution Backwards Induction myopic agents always defect
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Correlated Constraints: (deriving Tit For Tat)
< 0 P1x(1) = 0, so Px cooperates
< 0 P1y(1) = 0, so Py cooperates
2 max constraints
Non-Myopic Agent Optimization in the Iterated Prisoner’s Dilemma
Families of correlation constraints: k, j index
Change of notation: “dot N” = N, “dot dot N” = 2N, “dot dot N-2” = 2N-2, etc
Optimize via game theory techniques
Many constrained equilibria involving cooperation
Cooperation is rational in IPD
Further Reading and Contacts
Michael J Gagen
Email: [email protected]
URL: http://research.imb.uq.edu.au/~m.gagen/
See:
Cooperative equilibria in the finite iterated prisoner's dilemma, K. Nemoto and M. J. Gagen, EconPapers:wpawuwpga/0404001 (http://econpapers.hhs.se/paper/wpawuwpga/0404001.htm)
Kae Nemoto
Email: [email protected]
URL: http://www.qis.ex.nii.ac.jp/knemoto.html