
ACT-R Workshop July 2012

Modeling transfer of learning in games of strategic interaction

Ion Juvina & Christian Lebiere

Department of Psychology

Carnegie Mellon University

Outline

Background
Experiment
Cognitive model
Work in progress
Discussion


Transfer of learning

Alfred Binet (1899), formal discipline: exercise of mental faculties -> generalization
Thorndike (1903), identical elements theory: transfer of learning occurs only when identical elements of behavior are carried over from one task to another
Singley & Anderson (1989): surface vs. deep similarities; common “cognitive units”


Transfer in strategic interaction

Bipartisan cooperation in Congress: golf -> bipartisanship?

Similarity? What is transferred?


Prisoner’s Dilemma (PD)


Chicken game (CG)


PD & CG payoff matrices

PD        A          B
A       -1, -1    10, -10
B      -10, 10      1, 1

CG        A          B
A     -10, -10     10, -1
B      -1, 10       1, 1
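For later reference, the two matrices are easy to write down in code. A minimal sketch, assuming moves are labeled 'A' (competitive) and 'B' (cooperative); the dictionary layout is our choice, not from the talk:

```python
# Payoffs keyed by (own move, opponent move); values are
# (own payoff, opponent payoff).
PD = {('A', 'A'): (-1, -1),   ('A', 'B'): (10, -10),
      ('B', 'A'): (-10, 10),  ('B', 'B'): (1, 1)}

CG = {('A', 'A'): (-10, -10), ('A', 'B'): (10, -1),
      ('B', 'A'): (-1, 10),   ('B', 'B'): (1, 1)}
```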


Similarities between PD & CG

Surface (near transfer):
2x2 games
2 symmetric and 2 asymmetric outcomes
The [1,1] outcome is identical

Deep (far transfer):
Mixed motive
Non-zero sum
Mutual cooperation is superior to competition in the long term, though unstable (risky)


Differences between PD & CG

Different equilibria:
Symmetric in PD: [-1,-1]
Asymmetric in CG: [-1,10] and [10,-1]

Different strategies to maximize joint payoff (Pareto-efficient outcome):
[1,1] in PD
Alternation of [-1,10] and [10,-1] in CG
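The arithmetic behind that last contrast, as a quick check:

```python
# Per-player average payoff per round under each long-run strategy.
pd_cooperate = 1              # mutual [1,1] every round
pd_alternate = (10 - 10) / 2  # alternating [10,-10] / [-10,10] -> 0.0
cg_cooperate = 1              # mutual [1,1] every round
cg_alternate = (10 - 1) / 2   # alternating [10,-1] / [-1,10] -> 4.5
# Cooperation wins in PD; alternation wins in CG.
```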


Questions / hypotheses

Similarities: identical elements? Common cognitive units?
Transfer of learning: is there any transfer? Only in one direction, from low to high entropy (Bednar, Chen, Xiao Liu, & Page, in press)? An identical element would predict transfer in both directions.
Mechanism of transfer: reciprocal trust mitigates the risk associated with the long-term solution (Hardin, 2002)


Participants and design

480 participants (CMU students), forming 240 pairs
2 within-subjects games: PD & CG
4 between-subjects information conditions: No-info (60 pairs), Min-info (60 pairs), Mid-info (60 pairs), Max-info (60 pairs)
2 between-subjects order conditions within each information condition: PD-CG (30 pairs), CG-PD (30 pairs)
200 unnumbered rounds for each game


Typical outcomes


Pareto-optimal equilibria


[1,1] increases with info


Alternation increases with info


PD – CG sequence


CG – PD sequence



PD before and after


CG before and after


Transfer from PD to CG

Increased [1,1] (surface transfer)
Increased alternation (deep transfer)


Transfer from CG to PD

Increased [1,1] (surface + deep transfer)


Divergent effects

[Diagram: in the PD -> CG sequence, surface transfer carries the [1,1] outcome over, while deep transfer promotes alternation of [10,-1] and [-1,10]; the two pull toward different CG outcomes.]


Convergent effects

[Diagram: in the CG -> PD sequence, surface transfer from [1,1] and deep transfer from alternation of [10,-1] / [-1,10] both point toward the [1,1] outcome in PD.]


Reciprocation by info


Payoff by info in PD and CG


Summary results

Mutual cooperation increases with awareness of interdependence (info)
Transfer of learning: better performance “after” than “before”
Combined effects of surface and deep similarities:
CG -> PD: surface similarity facilitates transfer
PD -> CG: surface similarity interferes with transfer
Transfer occurs in both directions
Mechanism of generalization: reciprocal trust?


Cognitive model

Awareness of interdependence: opponent modeling
Generality: utility learning (reinforcement learning)
Transfer of learning: surface transfer and deep transfer


Opponent modeling

Instance-based learning: dynamic representation of the opponent
Sequence learning: prediction of the opponent’s next move
An instance is a snapshot of the current situation: the previous moves plus the opponent’s current move
Contextualized expectations
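A minimal sketch of this instance-based scheme, assuming two moves 'A'/'B' and a recency-weighted count in place of ACT-R’s activation-based retrieval (class and method names are ours):

```python
class OpponentModel:
    """Instances pair a context (both players' previous moves)
    with the opponent's move observed in that context."""

    def __init__(self):
        self.instances = []   # chronological list of (context, opponent_move)
        self.context = None   # most recent (own_move, opponent_move) pair

    def predict(self):
        """Predict the opponent's next move from instances whose context
        matches the current one, weighting recent instances more heavily
        (a crude stand-in for recency/frequency-based activation)."""
        scores = {'A': 0.0, 'B': 0.0}
        for age, (ctx, move) in enumerate(reversed(self.instances)):
            if ctx == self.context:
                scores[move] += 1.0 / (1 + age)
        return max(scores, key=scores.get)

    def observe(self, own_move, opp_move):
        """Record the opponent's move in the current context, then update it."""
        self.instances.append((self.context, opp_move))
        self.context = (own_move, opp_move)
```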


Utility learning

Reinforcement learning
Strategy: what move to make given the expected move of the opponent and the context (previous moves)
Reward functions:
Own payoff - opponent’s payoff
Opponent’s payoff
Joint payoff - opponent’s previous payoff
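The three reward functions are simple arithmetic on the round’s payoffs; a sketch with hypothetical signatures:

```python
def reward_competitive(own, opp):
    return own - opp                 # maximize payoff relative to the opponent

def reward_invest(own, opp):
    return opp                       # value the opponent's gain (build trust)

def reward_cooperative(own, opp, opp_prev):
    return (own + opp) - opp_prev    # joint payoff minus opponent's previous payoff
```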


Surface transfer

Declarative sub-symbolic learning: retrieval of instances guided by recency and frequency
Strategy learning: a learned strategy continues to be used for a while until it is unlearned
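The recency-and-frequency guidance is ACT-R’s standard base-level activation (general ACT-R, not specific to this model): an instance retrieved n times, t_j time units ago, has base-level activation

```latex
B_i = \ln\!\Big(\sum_{j=1}^{n} t_j^{-d}\Big)
```

with decay d (conventionally 0.5). Instances practiced often and recently dominate retrieval, which is why a strategy learned in the first game keeps being retrieved, and hence used, early in the second game.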


Deep transfer

Trust learning / trust dynamics
Trust accumulator:
Increases when the opponent makes cooperative (risky) moves
Decreases when the opponent makes competitive moves
Trust-invest accumulator:
Increases with mutually destructive outcomes
Decreases with unreciprocated cooperation (risk taking)
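A minimal sketch of the two accumulators, assuming unit-sized updates (the talk specifies only the directions of change):

```python
class TrustAccumulators:
    def __init__(self):
        self.trust = 0.0    # how trustworthy the opponent has proven
        self.invest = 0.0   # willingness to invest in building trust

    def update(self, own_move, opp_move):
        # Trust tracks the opponent's behavior: cooperative (risky) moves
        # raise it, competitive moves lower it.
        self.trust += 1.0 if opp_move == 'B' else -1.0
        # Trust-investment rises after a mutually destructive outcome
        # and falls when own cooperation goes unreciprocated.
        if own_move == 'A' and opp_move == 'A':
            self.invest += 1.0
        elif own_move == 'B' and opp_move == 'A':
            self.invest -= 1.0
```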


Meta strategy

Determines which reward function to use:
Trust accumulator <= 0: reward = own payoff - opponent’s payoff
Trust-invest accumulator > 0: reward = opponent’s payoff
Trust accumulator > 0: reward = joint payoff - opponent’s previous payoff
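Combining the accumulators with the reward functions sketched earlier, the meta-strategy reduces to a three-way branch (the precedence among the conditions is our assumption):

```python
def select_reward(acc, own, opp, opp_prev):
    """Pick this round's reward signal based on the trust state."""
    if acc.trust > 0:                            # trust established
        return reward_cooperative(own, opp, opp_prev)
    if acc.invest > 0:                           # no trust yet, but worth building
        return reward_invest(own, opp)
    return reward_competitive(own, opp)          # trust <= 0: protect own payoff
```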


Model diagram

[Diagram: the environment feeds the opponent’s move and the reward into ACT-R. Declarative memory stores instances (current and previous moves); retrieving an instance yields a prediction of the opponent’s next move. Procedural memory holds rules that select the best response to the predicted move. The trust and trust-invest accumulators, implemented as an ACT-R extension, modulate the reward.]


PD-CG


CG-PD


PD-CG surface transfer


PD-CG deep transfer


CG – PD surface + deep transfer


Trust simulation


Summary model results

Awareness of interdependence: opponent modeling
Generality: utility learning
Transfer of learning:
Surface-level transfer: cognitive units
Deep-level transfer: trust


In progress

Expand the model to account for all information conditions
Develop a more ecologically valid paradigm (IPD^3)
Model “affective” processes in ACT-R


IPD^3


General discussion

Transfer of learning is possible
Deep similarities: interpersonal level
IPD^3:
To be used in behavioral experiments
A tool for learning strategic interaction skills


Acknowledgments

Coty Gonzalez, Jolie Martin, Hau-Yu Wong, Muniba Saleem

This research is supported by the Defense Threat Reduction Agency (DTRA), grant number HDTRA1-09-1-0053, to Cleotilde Gonzalez and Christian Lebiere.


Thank you for your attention! Questions?
