Competition between adaptive agents: learning and collective efficiency

Damien Challet

Oxford University

Matteo Marsili

ICTP-Trieste (Italy)

challet@thphys.ox.ac.uk

● My definition of the Minority Game

● Simple worlds (M = 0)

● Markovian behavior

● Neural networks

● Reinforcement learning

● Multistate worlds (M > 0)

● Cause of large inefficiencies

● Remedies

● From El Farol to MG and back

'Truth is always in the minority'

Kierkegaard

Zig-Zag-Zoug

● Game played by Swiss children

● 3 players, 3 feet, 3 magic

● “Ziiig” ... “Zaaag” ... “ZOUG!”

Minority Game

● Zig-Zag-Zoug with N players

● Aim: to be in the minority

● Outcome = #UP - #DOWN = #A - #B

● Model of competition between adaptive players

Challet and Zhang (1997), from El Farol's bar problem (Arthur 1994)

Initial goals of the MG

El Farol (1994): impossible to understand

Drastic simplification, keeping key ingredients

Bounded rationality

Reinforcement learning

Symmetrize the problem: 60/100 -> 50/50

Understand the symmetric problem

Generalize results to the asymmetric problem

Repeated games

Why play again?

Frustration

Losers in majority

How to play?

Deduction

Rationality

Best answer

All lose!

Induction

Limited capabilities

Beliefs, strategies, personality

Trial and error

Learning

Minority Game

N agents, $i = 1, \dots, N$

Choice of agent i: $a_i(t) = \pm 1$

Aggregate outcome: $A(t) = \sum_{i=1}^{N} a_i(t)$

Payoff to player i: $-a_i(t)\,A(t)$

Total losses = $A(t)^2$
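The definitions on this slide can be checked with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `play_round`, the seed, and the choice N = 5 are illustrative assumptions. It shows that summing the individual payoffs $-a_i A$ over all agents gives $-A^2$, so the total loss is $A^2$.

```python
import random

def play_round(choices):
    """Return (A, payoffs) for one round; choices is a list of +1/-1."""
    A = sum(choices)                      # aggregate outcome A(t)
    payoffs = [-a * A for a in choices]   # payoff of agent i is -a_i * A
    return A, payoffs

rng = random.Random(0)                    # fixed seed, illustrative only
N = 5                                     # odd N avoids ties (assumption)
choices = [rng.choice((-1, 1)) for _ in range(N)]
A, payoffs = play_round(choices)
total_loss = -sum(payoffs)                # equals A**2 for any choices
```

Agents on the minority side have $a_i A < 0$ and therefore a positive payoff; the majority pays for it.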

Markovian learning

'If it ain't broken, don't fix it' (Reents et al., Physica A 2000):

If I won, I stick to my previous choice

If I lost, I change to the other choice with prob p

Results ($\sigma^2 = \langle A^2 \rangle$):

● $pN = x = \mathrm{const}$ (small p): $\sigma^2 = 1 + 2x(1 + x/6)$

● $p \sim N^{-1/2}$: $\sigma^2 \sim N$

● $p \sim 1$: $\sigma^2 \sim N^2$
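The stick-or-switch rule above is simple enough to simulate directly. The following is a rough sketch under stated assumptions: the function name `simulate_markov`, the burn-in of half the run, the seed, and the sizes are all illustrative choices, and N is taken odd so that A is never zero. It estimates the mean of $A^2$ for a small switching probability ($p = x/N$) and for $p = 1$, and evaluates the small-p expression $1 + 2x(1 + x/6)$ quoted on the slide for comparison.

```python
import random

def simulate_markov(N, p, steps, seed=0):
    """Estimate <A^2> for the stochastic 'stick or switch' rule.

    N should be odd so that A is never zero (assumption of this sketch).
    """
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(N)]
    burn_in = steps // 2          # discard the transient (arbitrary choice)
    total, count = 0, 0
    for t in range(steps):
        A = sum(a)
        if t >= burn_in:
            total += A * A
            count += 1
        for i in range(N):
            # a_i * A > 0 means agent i was in the majority, i.e. lost
            if a[i] * A > 0 and rng.random() < p:
                a[i] = -a[i]
    return total / count

N = 201
x = 1.0
sigma2_small_p = simulate_markov(N, x / N, steps=2000)
sigma2_large_p = simulate_markov(N, 1.0, steps=200)
predicted_small_p = 1 + 2 * x * (1 + x / 6)
```

With $p = 1$ every loser switches, so after the first step the whole population jumps from one side to the other and $A^2 = N^2$ at every step. With $p = x/N$ only about one agent switches per step and the fluctuations stay of order one.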

Markovian learning II

Problem: if N is unknown, p = ?

Try: $p = f(t)$, e.g. $p = t^{-k}$

Convergence for any N

Freezing

When to stop?
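A decaying switching probability can be sketched as follows. This is an illustration only: it assumes the schedule is $p(t) = t^{-k}$ with t counted from 1, and the function name `run_decaying_p`, the seed, and the parameter values are hypothetical. The code records how many agents switch at each step, which makes the freezing visible: switches become rare as p shrinks.

```python
import random

def run_decaying_p(N, k, steps, seed=0):
    """Stick-or-switch rule with p(t) = t**(-k); returns switches per step."""
    rng = random.Random(seed)
    a = [rng.choice((-1, 1)) for _ in range(N)]
    switches = []
    for t in range(1, steps + 1):
        p = t ** (-k)             # assumed schedule, p = 1 at t = 1
        A = sum(a)
        n = 0
        for i in range(N):
            # only losers (majority side) may switch
            if a[i] * A > 0 and rng.random() < p:
                a[i] = -a[i]
                n += 1
        switches.append(n)
    return switches

switches = run_decaying_p(N=101, k=1.0, steps=1000)
early = sum(switches[:500])       # switches during the first half
late = sum(switches[500:])        # switches during the second half
```

Comparing `early` with `late` gives a crude handle on the "when to stop" question: once switches are rare, the configuration is effectively frozen.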

Neural networks

Simple perceptrons, learning rate R (Metzler ++ 1999)

$\sigma^2 = N + N(N-1)\,F(N,R)$

$\sigma^2_{\min} = N\,(1 - 2/\pi) = 0.363\ldots\,N$
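The numerical prefactor 0.363... quoted above matches $1 - 2/\pi$. The short check below assumes that reading of the formula; the function name `min_sigma2` is illustrative. For reference, N agents choosing independently at random give a variance of N, so the minimum is a bit more than a third of the random-choice value.

```python
import math

def min_sigma2(N):
    """Minimum variance N * (1 - 2/pi), taking the lost symbol to be pi."""
    return N * (1 - 2 / math.pi)

# Fraction of the random-choice variance N that remains at the minimum
ratio = min_sigma2(1000) / 1000
```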

Reinforcement learning

● Each player has a register $D_i$

● $D_i > 0$: + is better

● $D_i < 0$: - is better

● $D_i(t+1) = D_i(t) - A(t)$

● Choice: $\mathrm{prob}(+ \mid D_i) = f(D_i)$, with $f'(x) > 0$ (RL)

Reinforcement learning II

● Central result: agents minimize $\langle A \rangle^2$ (predictability) for all f

● Stationary state: $\langle A \rangle = 0$

● Fluctuations = ?

● Ex: $f(x) = (1 + \tanh(Kx))/2$: exponential learning, K = learning rate

● $K < K_c$: $\sigma^2 \sim N$

● $K > K_c$: $\sigma^2 \sim N^2$
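The two regimes can be seen in a small simulation of the register dynamics. This is a sketch under assumptions rather than the authors' implementation: the function name `simulate_rl`, the zero initial registers, the burn-in, the seed, and the two values of K are illustrative, and no value of $K_c$ is assumed beyond "one K well below it, one well above it".

```python
import math
import random

def simulate_rl(N, K, steps, seed=0):
    """Estimate <A^2> for agents with registers D_i.

    Each agent plays +1 with probability f(D_i) = (1 + tanh(K * D_i)) / 2,
    then every register is updated as D_i <- D_i - A.
    """
    rng = random.Random(seed)
    D = [0.0] * N                 # assumption: all registers start at zero
    burn_in = steps // 2          # discard the transient (arbitrary choice)
    total, count = 0.0, 0
    for t in range(steps):
        a = [1 if rng.random() < (1 + math.tanh(K * d)) / 2 else -1 for d in D]
        A = sum(a)
        D = [d - A for d in D]    # same update for every agent
        if t >= burn_in:
            total += A * A
            count += 1
    return total / count

N = 101
sigma2_slow = simulate_rl(N, K=0.001, steps=2000)   # slow learning
sigma2_fast = simulate_rl(N, K=10.0, steps=200)     # fast learning
```

With a small K the choices stay close to coin flips and the variance is of order N. With a large K all agents react to the last outcome in the same way, the crowd swings from one side to the other, and the variance is of order $N^2$.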

Market Impact: each agent has an influence on the outcome

● Naive agents: payoff $-A = -A_{-i} - a_i$

● Non-naive agents: payoff $-A + c\,a_i$

● Smart agents: payoff $-A_{-i}$

cf WLU, AU

● Central result 2: non-naive agents minimize $\langle A^2 \rangle$ (fluctuations) for all

-> Nash equilibrium
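The three payoff definitions differ only in how much of the agent's own contribution $a_i$ is removed from $A$. A small sketch makes this explicit; the function name `register_payoffs` and the example values are illustrative assumptions, and c is simply treated as a free coefficient.

```python
def register_payoffs(a_i, A_others, c=0.5):
    """Payoff terms from the slide for one agent.

    a_i      : the agent's own action, +1 or -1
    A_others : A_{-i}, the sum of all other agents' actions
    c        : impact-correction coefficient (value here is arbitrary)
    """
    A = A_others + a_i
    return {
        "naive": -A,                 # equals -A_{-i} - a_i
        "non_naive": -A + c * a_i,   # own contribution partly discounted
        "smart": -A_others,          # own contribution fully removed
    }

example = register_payoffs(a_i=1, A_others=2, c=1.0)
```

Since $-A + a_i = -A_{-i}$, the non-naive payoff with $c = 1$ coincides with the smart one, and $c = 0$ recovers the naive one.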

Reinforcement learning III

Summary

Scaling of $\sigma^2$ by learning rate:

Rate     Markov   NN      RL naive   RL non-naive   NN non-naive
Small    1        N       N          1              1?
Medium   N                           1              1?
Large    N^2      N^2     N^2        1              1?

Minority Games with memory

If an agent believes that the outcome depends on the past results, the outcome will depend on the past results.

Sun spot effect

Self-fulfilling prophecies

Fallacies of causal inference

Consequence:

The other agents will change their behavior accordingly

Minority Games with memory: naïve agents

Fixed randomly drawn strategies = quenched disorder

Tools of statistical physics give the exact solution in principle

Agents minimize the predictability

Predictability = Hamiltonian

Optimization problem

Numeric:

Savit++ PRL99

Analytic:

Challet++ PRL99

Coolen+ J. Phys A 2002

Minority Games with memory: low efficiency

P/N is not the right scaling for large fluctuations

Minority Games with memory: origin of low efficiency

Stochastic dynamical equation for strategy score $U_i$:

slowly varying part + correlated noise

I: size independent; II = $K P^{-1/2}$

When I << II, large fluctuations

Transition at I / K = $G / P^{1/2}$

Critical signal-to-noise ratio = $G / P^{1/2}$

Check: determine G, predict critical points

[Figure: axis label $G / P^{1/2}$; panels: BEFORE, AFTER]

Minority Games with memory: sophisticated agents

Agents minimize fluctuations

Optimization problem again

Reverse problem

Many variations, different global utility functions

● Grand canonical game (play or not play)

● Time window of scores (exponential moving average)

● Any payoff

Hence, given a task (global utility function), one knows how to design agents (local utility).

example: optimal defects combinations (cf. Neil's

From El Farol to MG and back

El Farol

[Diagram: axis from 0 to N with level L; MG: L = N/2]

Differences, similarities?

Which results from MG are valid for El Farol?

From El Farol to MG and back

Theorem: all results from MG apply to El Farol

Everything scales like (L/N – < a>)/ = P ½

The El Farol problem with P states of the world is solved.

From El Farol to MG and back: new results

If (L/N – < a>)/ = P ½ 0,

P>Pc = 2 / [(L/N-< a>)2]: no more phase transition.

Summary

● AU/WLU suppresses large fluctuations -> Nash equilibrium

● Design: agents must know they have an impact.

● Knowledge of the exact impact is not crucial.

● Reverse problem also possible

● MG: simple, rich, fun, and useful

www.unifr.ch/econophysics/minority

102 commented references
