cs6501: Imperfect Information Games
Principles Of Knowledge Engineering & Reconstruction
Spring 2010, University of Virginia
David Evans
A Course on Poker?!?
There are few things that are so unpardonably neglected in our country as poker. The upper class knows very little about it. Now and then you find ambassadors who have sort of a general knowledge of the game, but the ignorance of the people is fearful. Why, I have known clergymen, good men, kind-hearted, liberal, sincere, and all that, who did not know the meaning of a “flush”. It is enough to make one ashamed of one’s species.
- Mark Twain (as quoted in A Bibliography of Mark Twain, Merle Johnson)
John von Neumann (1903-1957)
• Pure Math
• Quantum Physics
• Atomic Bombs
– Designer of Plutonium Bomb
– Fission/Fusion Hydrogen Bomb
• Computer Science
– First Draft Report on EDVAC
– von Neumann Architecture
– Merge Sort
– Random Number Generation
• Game Theory
– Theory of Games and Economic Behavior (with Morgenstern)
– Mutual Assured Destruction
A-K-Q Game (not von Neumann Poker)
Flickr:cc Malkav
A-K-Q Game Rules
• 3 card deck: Ace > King > Queen
• 2 Players, each player gets one card face-up
• Higher card wins
Without secrecy, stakes, betting, it's not poker!
A-K-Q Game Rules
• 3 card deck: Ace > King > Queen
• 2 Players, each player gets one card face-down
• Higher card wins
• Betting: (half-street game)
– Ante: 1 chip
– Player 1: bet 1, or check
– Player 2: call or fold
• Stakes: scheduling signup order by chip count
Loosely based on Bill Chen and Jerrod Ankenman, The Mathematics of Poker.
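The betting rules above can be sketched as a small payoff function. This is a minimal illustration, not code from the course: the names `RANK` and `payoff_p1` are hypothetical, payoffs are Player 1's net chips for the hand, and the sketch assumes that a check leads straight to a showdown for the antes (which matches the expected-value calculations later in the deck).

```python
# Card ranks for the three-card deck: Ace > King > Queen.
RANK = {"Q": 0, "K": 1, "A": 2}


def payoff_p1(card1: str, card2: str, p1_bets: bool, p2_calls: bool) -> int:
    """Return Player 1's net chips for one hand of the half-street A-K-Q game.

    Both players ante 1 chip. Player 1 either bets 1 chip or checks.
    Facing a bet, Player 2 calls (showdown for 2 chips each) or folds
    (Player 1 wins the ante). After a check the hand is assumed to go
    straight to a showdown, so `p2_calls` is ignored in that case.
    """
    if card1 == card2:
        raise ValueError("both players cannot hold the same card")
    p1_wins = RANK[card1] > RANK[card2]
    if not p1_bets:
        return 1 if p1_wins else -1
    if not p2_calls:
        return 1
    return 2 if p1_wins else -2
```

For example, `payoff_p1("Q", "K", p1_bets=True, p2_calls=True)` is a called bluff and costs Player 1 two chips, while the same bluff against a fold wins one.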
A-K-Q Analysis
Better to be player 1 or player 2?
Easy Decisions:
Hard Decisions:
Game Payoffs

                     Player 1: Ace        Player 1: King       Player 1: Queen
Player 2             Bet       Check      Bet       Check      Bet       Check
Ace     Call
        Fold
King    Call
        Fold
Queen   Call
        Fold
Game Payoffs (Player 1, Player 2)

                     Player 1: Ace        Player 1: King       Player 1: Queen
Player 2             Bet       Check      Bet       Check      Bet       Check
Ace     Call         n/a       n/a        (-2,+2)   (-1,+1)    (-2,+2)   (-1,+1)
        Fold         n/a       n/a        (+1,-1)   (+1,-1)    (+1,-1)   (+1,-1)
King    Call         (+2,-2)   (+1,-1)    n/a       n/a        (-2,+2)   (-1,+1)
        Fold         (+1,-1)   (+1,-1)    n/a       n/a        (+1,-1)   (+1,-1)
Queen   Call         (+2,-2)   (+1,-1)    (+2,-2)   (+1,-1)    n/a       n/a
        Fold         (+1,-1)   (+1,-1)    (+1,-1)   (+1,-1)    n/a       n/a
Zero-Sum Game
$$\sum_{p \in \text{Players}} \text{Gain}(p) = 0$$
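As a quick check of the zero-sum property, the sketch below sums the gains in each outcome. The function name `is_zero_sum` is hypothetical, and the sample pairs are (Player 1, Player 2) payoffs copied from the A-K-Q payoff table.

```python
from typing import Iterable, Tuple


def is_zero_sum(outcomes: Iterable[Tuple[int, ...]]) -> bool:
    """True if the players' gains sum to zero in every outcome."""
    return all(sum(gains) == 0 for gains in outcomes)


# A few (Player 1, Player 2) payoff pairs from the A-K-Q table.
akq_pairs = [(-2, +2), (-1, +1), (+1, -1), (+2, -2)]
print(is_zero_sum(akq_pairs))  # True
```

Because the game is zero-sum, it is enough to track Player 1's payoff; Player 2's is its negation.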
Payoffs for Player 1

                     Player 1: Ace        Player 1: King       Player 1: Queen
Player 2             Bet       Check      Bet       Check      Bet       Check
Ace     Call         n/a       n/a        -2        -1         -2        -1
        Fold         n/a       n/a        +1        +1         +1        +1
King    Call         +2        +1         n/a       n/a        -2        -1
        Fold         +1        +1         n/a       n/a        +1        +1
Queen   Call         +2        +1         +2        +1         n/a       n/a
        Fold         +1        +1         +1        +1         n/a       n/a
Strategic Domination
Strategy A dominates Strategy B if Strategy A always produces a better outcome than Strategy B regardless of the other player’s action.
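A minimal sketch of a domination check. The definition above asks for a strictly better outcome in every case; the helper below also offers a weaker test (never worse, better at least once). The weaker test is an added assumption, and it is the one the Ace example needs, because betting and checking tie whenever Player 2 folds. The function names are hypothetical; the numbers are Player 1's payoffs from the payoff table.

```python
from typing import Sequence


def strictly_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a beats b against every opposing action."""
    return all(x > y for x, y in zip(a, b))


def weakly_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a is never worse than b and is better at least once."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


# Player 1 holds the Ace. Opposing actions, in order:
# King calls, King folds, Queen calls, Queen folds.
ace_bet = [+2, +1, +2, +1]
ace_check = [+1, +1, +1, +1]

print(strictly_dominates(ace_bet, ace_check))  # False: ties when Player 2 folds
print(weakly_dominates(ace_bet, ace_check))    # True
```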
Eliminating Dominated Strategies

                Player 1: Ace   Player 1: King   Player 1: Queen
Player 2        Bet             Check            Bet       Check
Ace     Call    n/a             -1               -2        -1
King    Call    +2              n/a              -2        -1
        Fold    +1              n/a              +1        +1
Queen   Fold    +1              +1               n/a       n/a
Simplified Payoff Matrix

                Player 1: Ace   Player 1: Queen
Player 2        Bet             Bet       Check
Ace     Call    n/a             -2        -1
King    Call    +2              -2        -1
        Fold    +1              +1        n/a
The Tough Decisions
What if Player 1 never bluffs?
Expected Value
$$EV = \sum_{e \in \text{Events}} \Pr(e)\,\text{Value}(e)$$
Never Bluff Strategy

                     Player 1: A   Player 1: K   Player 1: Q
Player 2             Bet           Check         Check
A       Call         n/a           -1            -1
K       Fold/Call    +1            n/a           -1
Q       Fold         +1            +1            n/a

$$EV_1 = \tfrac{1}{3}(1) + \tfrac{1}{3}\left(-\tfrac{1}{2} + \tfrac{1}{2}\right) + \tfrac{1}{3}(-1) = 0$$
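The expected-value sum can be written directly in code. This is an illustrative helper, not part of the slides: `expected_value` is a hypothetical name, and exact fractions are used so the result matches the hand calculation for the never-bluff strategy.

```python
from fractions import Fraction
from typing import Iterable, Tuple


def expected_value(events: Iterable[Tuple[Fraction, Fraction]]) -> Fraction:
    """Sum Pr(e) * Value(e) over (probability, value) pairs."""
    return sum((pr * value for pr, value in events), Fraction(0))


third, half = Fraction(1, 3), Fraction(1, 2)

# Never-bluff strategy, one event per card Player 1 can hold.
never_bluff = [
    (third, Fraction(1)),                # Ace: bet, Player 2 folds (as on the slide)
    (third, half * (-1) + half * (+1)),  # King: check, lose to Ace, beat Queen
    (third, Fraction(-1)),               # Queen: check, lose the ante
]

print(expected_value(never_bluff))  # 0
```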
The Tough Decisions
What if Player 1 always bluffs?
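Always Bluff Strategy

                Player 1: A   Player 1: K   Player 1: Q
Player 2        Bet           Check         Bet
A       Call    n/a           -1            -2
K       Call    +2            n/a           -2
        Fold    +1            n/a           +1
Q       Fold    +1            +1            n/a

If Player 2 calls with the King:
$$EV_1^{\text{CallK}} = \tfrac{1}{3}\left(\tfrac{1}{2}(+2) + \tfrac{1}{2}(+1)\right) + \tfrac{1}{3}\left(-\tfrac{1}{2} + \tfrac{1}{2}\right) + \tfrac{1}{3}(-2) = -\tfrac{1}{6}$$

If Player 2 folds with the King:
$$EV_1^{\text{FoldK}} = \tfrac{1}{3}(1) + \tfrac{1}{3}\left(-\tfrac{1}{2} + \tfrac{1}{2}\right) + \tfrac{1}{3}\left(\tfrac{1}{2}(-2) + \tfrac{1}{2}(1)\right) = \tfrac{1}{6}$$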
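Working the two always-bluff sums term by term may help. Each term is one card Player 1 can hold (Ace, King, Queen), already averaged over the two cards Player 2 can hold; the superscript labels are only shorthand for Player 2's choice with the King.

$$EV_1^{\text{CallK}} = \tfrac{1}{3}\cdot\tfrac{3}{2} + \tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot(-2) = \tfrac{1}{2} - \tfrac{2}{3} = -\tfrac{1}{6}$$

$$EV_1^{\text{FoldK}} = \tfrac{1}{3}\cdot 1 + \tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot\left(-\tfrac{1}{2}\right) = \tfrac{1}{3} - \tfrac{1}{6} = \tfrac{1}{6}$$

The only terms that change between the two cases are the ones where Player 2 holds the King against a bet: calling pays off the Ace but catches the Queen's bluff, and folding does the reverse.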
Recap
If Player 1 never bluffs: $EV_1 = 0$
If Player 1 always bluffs: $EV_1 = -\tfrac{1}{6}$
Is this a break-even game for Player 1?
Course Overview
• Topics
– Game Theory
– Machine Learning
– Anything else relevant to building a poker bot
• Format: most classes will be student-led
– Present a topic and/or research paper
Class Leader Expectations
• At least two weeks* before your scheduled class:
– Let me know what you are planning on doing (talk to me after class or email)
• At least one week before your scheduled class:
– Post on the course blog a description of the class topic and links to any reading/preparation materials
• At the class: lead an interesting class, bring any needed materials
• Later that day: post class materials on the course blog
• Follow-up: respond to any comments on the course blog
* If you signed up for Feb 1, you’re already late!
Course Project
Build a poker bot capable of competing in the Sixth Annual Computer Poker Competition
http://www.computerpokercompetition.org/
Note: overlaps with USENIX Security, August 9-12 (also in San Francisco)
Work in small (2-4) person teams
A few preliminary projects earlier
Combine ideas/code/results from best teams
My (Lack of) Qualifications
• I do research in computer security
• I have very limited knowledge and experience in game theory, machine learning, etc.
• I am (probably) a fairly lousy poker player
This course will be a shared learning experience, and will only work well if everyone contributes to make it interesting and worthwhile.
Things to Do
• Submit course survey
• Print and sign course contract: bring to Tuesday’s class
• Reading for Tuesday: Chapters 1 and 2 of Darse Billings’ dissertation
Everything will be posted on the course site (by tomorrow!):
http://www.cs.virginia.edu/evans/poker
Recap
If Player 1 never bluffs: $EV_1 = 0$
If Player 1 always bluffs: $EV_1 = -\tfrac{1}{6}$
Looks like a break-even game for Player 1: is there a better strategy?
Mixed Strategy

Always Bluff: $EV_1 = -\tfrac{1}{6}$

                Player 1: A   Player 1: K   Player 1: Q
Player 2        Bet           Check         Bet
A       Call    n/a           -1            -2
K       Call    +2            n/a           -2
Q       Fold    +1            +1            n/a

Never Bluff: $EV_1 = 0$

                     Player 1: A   Player 1: K   Player 1: Q
Player 2             Bet           Check         Check
A       Call         n/a           -1            -1
K       Fold/Call    +1            n/a           -1
Q       Fold         +1            +1            n/a
Pure strategy: always do the same action for a given input state.
Mixed strategy: probabilistically select from a set of pure strategies.
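To make the distinction concrete, here is a minimal sketch in which a pure strategy is a fixed card-to-action table and a mixed strategy is a probability distribution over such tables. The names `ALWAYS_BLUFF`, `NEVER_BLUFF`, and `choose_action`, and the example weights, are illustrative assumptions rather than values from the slides.

```python
import random
from typing import Dict, List, Tuple

PureStrategy = Dict[str, str]  # card -> action

# Two pure strategies for Player 1: they differ only in what to do with a Queen.
ALWAYS_BLUFF: PureStrategy = {"A": "bet", "K": "check", "Q": "bet"}
NEVER_BLUFF: PureStrategy = {"A": "bet", "K": "check", "Q": "check"}

# A mixed strategy: (probability, pure strategy) pairs whose weights sum to 1.
MixedStrategy = List[Tuple[float, PureStrategy]]


def choose_action(mixed: MixedStrategy, card: str, rng: random.Random) -> str:
    """Sample one pure strategy by weight, then apply it to the card held."""
    weights = [w for w, _ in mixed]
    strategies = [s for _, s in mixed]
    pure = rng.choices(strategies, weights=weights, k=1)[0]
    return pure[card]


rng = random.Random(0)  # seeded only to make the demo repeatable
mixed = [(0.25, ALWAYS_BLUFF), (0.75, NEVER_BLUFF)]  # illustrative weights
print(choose_action(mixed, "A", rng))  # always "bet": both pure strategies agree
print(choose_action(mixed, "Q", rng))  # "bet" or "check", depending on the draw
```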
Strategies
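Player 1:
• Bluff with Queen ($S_{\text{Bluff}}$)
• Check with Queen ($S_{\text{Check}}$)
Player 2:
• Call with King ($T_{\text{Call}}$)
• Fold with King ($T_{\text{Fold}}$)

$$EV_1(\langle S_{\text{Bluff}}, T_{\text{Call}} \rangle) = -\tfrac{1}{6}$$
$$EV_1(\langle S_{\text{Check}}, T_{\text{Call}} \rangle) = \tfrac{1}{6}$$
$$EV_1(\langle S_{\text{Bluff}}, T_{\text{Fold}} \rangle) = \tfrac{1}{6}$$
$$EV_1(\langle S_{\text{Check}}, T_{\text{Fold}} \rangle) = 0$$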
Finding the best strategy for Player 1: assume Player 2 plays optimally.
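The four expected values can be reproduced by enumerating the six equally likely deals. This is a sketch under stated assumptions: Player 1 always bets an Ace and checks a King, Player 2 always calls with an Ace and folds a Queen, and a check goes straight to a showdown. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

RANK = {"Q": 0, "K": 1, "A": 2}


def hand_payoff(c1: str, c2: str, bluff_queen: bool, call_king: bool) -> int:
    """Player 1's net chips for one deal under the given pure strategies."""
    p1_wins = RANK[c1] > RANK[c2]
    p1_bets = c1 == "A" or (c1 == "Q" and bluff_queen)
    if not p1_bets:                      # check: showdown for the antes
        return 1 if p1_wins else -1
    p2_calls = c2 == "A" or (c2 == "K" and call_king)
    if not p2_calls:                     # fold: Player 1 takes the ante
        return 1
    return 2 if p1_wins else -2          # called bet: showdown for 2 chips


def ev_p1(bluff_queen: bool, call_king: bool) -> Fraction:
    """Average Player 1's payoff over the 6 ordered deals of two distinct cards."""
    deals = list(permutations("AKQ", 2))
    total = sum(hand_payoff(c1, c2, bluff_queen, call_king) for c1, c2 in deals)
    return Fraction(total, len(deals))


for bluff in (True, False):
    for call in (True, False):
        print(f"bluff={bluff}, call={call}: EV1 = {ev_p1(bluff, call)}")
```

Each pure strategy has a best response that beats it: calling punishes the bluff, and folding removes the profit from never bluffing.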
Nash Equilibrium
John Nash (born 1928)
Equilibrium Points in N-Person Games, 1950
Nash Equilibrium
• Player 1 is making the best decision she can, taking into account Player 2’s decisions.
• Player 2 is making the best decision he can, taking into account Player 1’s decisions.
• Neither player can improve its expected value by deviating from the strategy.
Hence, to find the best strategy for Player 1, we need to find a strategy that makes Player 2 indifferent between his options.
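Winning the AKQ Game

$$EV_1(\langle S_{\text{Bluff}}, T_{\text{Call}} \rangle) = -\tfrac{1}{6}$$
$$EV_1(\langle S_{\text{Check}}, T_{\text{Call}} \rangle) = \tfrac{1}{6}$$
$$EV_1(\langle S_{\text{Bluff}}, T_{\text{Fold}} \rangle) = \tfrac{1}{6}$$
$$EV_1(\langle S_{\text{Check}}, T_{\text{Fold}} \rangle) = 0$$

Player 1's payoffs, in units of 1/6 chip:

          Bluff   Check
Call      -1      +1
Fold      +1       0

Player 1 wants to make Player 2 indifferent between $T_{\text{Call}}$ and $T_{\text{Fold}}$.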
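The indifference condition can be solved in closed form for a 2x2 zero-sum game with no pure-strategy equilibrium. The sketch below is an illustration, not the slide's derivation: `solve_2x2` is a hypothetical helper, the matrix holds Player 1's payoffs in units of 1/6 chip (the Bluff/Check by Call/Fold values above), and the formula assumes the denominator is nonzero.

```python
from fractions import Fraction
from typing import Tuple


def solve_2x2(a: int, b: int, c: int, d: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Mixed equilibrium of a 2x2 zero-sum game given Player 1's payoffs.

    Rows are Player 2's actions (Call, Fold); columns are Player 1's
    (Bluff, Check):
                Bluff  Check
        Call      a      b
        Fold      c      d
    Returns (Pr[Bluff], Pr[Call], game value for Player 1). Assumes the game
    has no pure-strategy equilibrium, so the denominator is nonzero and both
    probabilities fall strictly between 0 and 1.
    """
    denom = a - b - c + d
    p_bluff = Fraction(d - b, denom)   # makes Player 2 indifferent: Call vs Fold
    q_call = Fraction(d - c, denom)    # makes Player 1 indifferent: Bluff vs Check
    value = a * p_bluff + b * (1 - p_bluff)
    return p_bluff, q_call, value


p_bluff, q_call, value_sixths = solve_2x2(-1, +1, +1, 0)
print(p_bluff, q_call, value_sixths / 6)  # 1/3 1/3 1/18
```

Under these assumptions Player 1 bluffs with the Queen one time in three, Player 2 calls with the King one time in three, and the game is worth 1/18 of a chip per hand to Player 1.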
Charge
• Submit course survey
• Print and sign course contract: bring to Tuesday’s class
• Reading for Tuesday: Chapters 1 and 2 of Darse Billings’ dissertation
Readings posted now. Everything else will be posted on the course site (by tomorrow!):
http://www.cs.virginia.edu/evans/poker
If you are signed up for February 1, by tomorrow: contact me about plans for class.