Reinforcement Learning for the Soccer Dribbling Task
Arthur Carvalho, Renato Oliveira
Slide 2
Introduction
RoboCup soccer simulation
- Scoring: A Data Mining Approach to Solve the Goal Scoring Problem
- Passing: A New Passing Strategy Based on Q-Learning Algorithm in RoboCup
- Dribbling: ?
Slide 3
Soccer Dribbling Task
Slide 4
Outline
- The soccer dribbling task as a RL problem
- RL solution
- Experiments
- Conclusion
Slide 5
The Soccer Dribbling Task as a RL Problem
Coach
- Sets the positions
  - The dribbler is placed in the center-left region together with the ball
  - The adversary is placed in a random position
- Manages the play
  - The adversary wins when it gains possession or when the ball goes out of the field
  - The dribbler wins when it crosses the field with the ball
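The coach's win/loss rules above can be sketched as a small referee function. This is only an illustration under stated assumptions: the pitch half-dimensions, the `FINISH_X` threshold for "crossing the field", and the `Snapshot` fields are hypothetical and do not come from the slides or from the RoboCup simulator API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple

# Assumed pitch half-dimensions in meters (not stated on the slides).
HALF_LENGTH = 52.5
HALF_WIDTH = 34.0
# Hypothetical x-coordinate the dribbler must reach to "cross the field".
FINISH_X = 50.0


class Outcome(Enum):
    DRIBBLER_WINS = "dribbler"
    ADVERSARY_WINS = "adversary"


@dataclass
class Snapshot:
    ball: Tuple[float, float]      # (x, y) of the ball
    dribbler: Tuple[float, float]  # (x, y) of the dribbler
    possession: str                # "dribbler", "adversary", or "none"


def judge(s: Snapshot) -> Optional[Outcome]:
    """Return the episode outcome, or None while play continues."""
    ball_x, ball_y = s.ball
    # The adversary wins by gaining possession of the ball ...
    if s.possession == "adversary":
        return Outcome.ADVERSARY_WINS
    # ... or when the ball leaves the field.
    if abs(ball_x) > HALF_LENGTH or abs(ball_y) > HALF_WIDTH:
        return Outcome.ADVERSARY_WINS
    # The dribbler wins by crossing the (assumed) finish line with the ball.
    if s.possession == "dribbler" and s.dribbler[0] >= FINISH_X:
        return Outcome.DRIBBLER_WINS
    return None
```

Checking possession before the finish line is a design choice of this sketch; the slides do not say how ties between the rules are resolved.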
Slide 6
The Soccer Dribbling Task as a RL Problem
- When an episode ends, the coach starts a new one
- The RoboCup soccer simulator operates in discrete time steps
- Episodic reinforcement-learning framework
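A minimal sketch of this episodic, discrete-time loop follows. `ToyDribbleEnv` is a hypothetical stand-in for the coach and simulator, not the RoboCup API; its random termination and its +1/-1/0 rewards are assumptions made only so the loop can run.

```python
import random
from typing import Callable, List, Tuple


class ToyDribbleEnv:
    """Hypothetical stand-in for coach + simulator; not the RoboCup API."""

    def __init__(self, seed: int = 0, max_steps: int = 50) -> None:
        self.rng = random.Random(seed)
        self.max_steps = max_steps
        self.t = 0

    def reset(self) -> int:
        """The coach starts a new episode; the toy state is the step count."""
        self.t = 0
        return self.t

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Advance one discrete time step; returns (state, reward, done)."""
        self.t += 1
        if self.rng.random() < 0.1:  # assumed chance that the episode ends
            reward = 1.0 if self.rng.random() < 0.5 else -1.0
            return self.t, reward, True
        # Time-out with zero reward once max_steps is reached (assumption).
        return self.t, 0.0, self.t >= self.max_steps


def run_episode(env: ToyDribbleEnv,
                policy: Callable[[int], int]) -> Tuple[float, int]:
    """Play one episode and return (total reward, number of steps)."""
    state = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        state, reward, done = env.step(policy(state))
        total += reward
        steps += 1
    return total, steps


def run_training(env: ToyDribbleEnv, policy: Callable[[int], int],
                 n_episodes: int) -> List[Tuple[float, int]]:
    """When one episode ends, immediately start the next one."""
    return [run_episode(env, policy) for _ in range(n_episodes)]
```

For example, `run_training(ToyDribbleEnv(seed=1), lambda s: 0, 20)` plays 20 episodes with a policy that always picks action 0.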
Slide 7
The Soccer Dribbling Task as a RL Problem
Actions
- HoldBall()
- Dribble(θ, k)
  - Dribble(30, 5), Dribble(330, 5), Dribble(0, 5), Dribble(0, 10)
- The dribbler can kick the ball forward (strongly and weakly), diagonally upward, and diagonally downward.
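The discrete action set can be written down as plain data. The sketch below assumes that the first Dribble argument is an angle in degrees and the second is a kick distance; the `Action` class, the `kick_offset` helper, and the standard-math sign convention for the angle are all hypothetical, so which of 30 and 330 corresponds to "upward" on the field is not settled here.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Action:
    name: str
    angle_deg: Optional[float] = None  # assumed: direction in degrees
    k: Optional[float] = None          # assumed: kick distance


# The five actions listed on the slide.
ACTIONS = (
    Action("HoldBall"),
    Action("Dribble", 30, 5),
    Action("Dribble", 330, 5),
    Action("Dribble", 0, 5),
    Action("Dribble", 0, 10),
)


def kick_offset(a: Action) -> Tuple[float, float]:
    """Displacement of the kick target under the assumptions above."""
    if a.name != "Dribble":
        return (0.0, 0.0)  # HoldBall does not move the ball
    rad = math.radians(a.angle_deg)
    return (a.k * math.cos(rad), a.k * math.sin(rad))
```

Under these assumptions, Dribble(0, 5) and Dribble(0, 10) differ only in how far forward the ball goes, while Dribble(30, 5) and Dribble(330, 5) are mirror images about the forward direction.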
Slide 8
The Soccer Dribbling Task as a RL Problem
State

Variable                    Meaning
posY(dribbler)              Vertical position of the dribbler
ang(dribbler)               Global angle of the dribbler
ang(dribbler; adversary)    Relative angle between the dribbler and the adversary
ang(ball; adversary)        Relative angle between the ball and the adversary
dist(ball; adversary)       Distance between the ball and the adversary
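One way to assemble these five state variables from raw positions is sketched below. The slides do not define how the relative angles are measured, so this sketch assumes each is the bearing from the first object to the second, in degrees within [0, 360); `State`, `bearing`, and `build_state` are hypothetical names.

```python
import math
from typing import NamedTuple, Tuple

Vec = Tuple[float, float]


class State(NamedTuple):
    pos_y_dribbler: float          # posY(dribbler)
    ang_dribbler: float            # ang(dribbler)
    ang_dribbler_adversary: float  # ang(dribbler; adversary)
    ang_ball_adversary: float      # ang(ball; adversary)
    dist_ball_adversary: float     # dist(ball; adversary)


def bearing(src: Vec, dst: Vec) -> float:
    """Direction from src to dst in degrees, in [0, 360) (assumed definition)."""
    return math.degrees(math.atan2(dst[1] - src[1], dst[0] - src[0])) % 360.0


def build_state(dribbler: Vec, dribbler_angle: float,
                ball: Vec, adversary: Vec) -> State:
    """Map raw positions and the dribbler's global angle to the five variables."""
    return State(
        pos_y_dribbler=dribbler[1],
        ang_dribbler=dribbler_angle % 360.0,
        ang_dribbler_adversary=bearing(dribbler, adversary),
        ang_ball_adversary=bearing(ball, adversary),
        dist_ball_adversary=math.hypot(adversary[0] - ball[0],
                                       adversary[1] - ball[1]),
    )
```

For instance, `build_state((0.0, 2.0), 45.0, (1.0, 2.0), (1.0, 5.0))` describes an adversary three units from the ball.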
Slide 10
RL Solution
Slide 11
CMAC
- Partitions the state space into several receptive fields (hyper-rectangles)
- Each receptive field is associated with a weight
- Multiple partitions of the state space (layers) are usually used
- The CMAC's response to a given input is equal to the sum of the weights of the excited receptive fields
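The CMAC description above can be sketched as a small tile coder: each layer is a grid of hyper-rectangles shifted by a fraction of the tile width, and the response is the sum of the weights of the one excited field per layer. The uniform layer offsets, the dictionary-based weight storage, and the error-correcting `update` rule are assumptions of this sketch; the slides describe only the response, not how the weights are trained.

```python
import math
from typing import Dict, List, Sequence, Tuple

Field = Tuple[int, ...]


class CMAC:
    def __init__(self, n_layers: int, tile_widths: Sequence[float]) -> None:
        self.n_layers = n_layers
        self.tile_widths = list(tile_widths)
        # One weight per receptive field; unseen fields count as 0.
        self.weights: Dict[Field, float] = {}

    def active_fields(self, x: Sequence[float]) -> List[Field]:
        """Return the single excited receptive field in each layer."""
        fields = []
        for layer in range(self.n_layers):
            # Each layer is shifted by layer / n_layers of a tile (assumed).
            coords = tuple(
                math.floor((xi + layer * w / self.n_layers) / w)
                for xi, w in zip(x, self.tile_widths)
            )
            fields.append((layer,) + coords)
        return fields

    def response(self, x: Sequence[float]) -> float:
        """Sum of the weights of the excited receptive fields."""
        return sum(self.weights.get(f, 0.0) for f in self.active_fields(x))

    def update(self, x: Sequence[float], target: float, lr: float = 0.1) -> None:
        """Assumed training rule: spread the error evenly over the layers."""
        step = lr * (target - self.response(x)) / self.n_layers
        for f in self.active_fields(x):
            self.weights[f] = self.weights.get(f, 0.0) + step
```

Because nearby inputs excite mostly the same fields, a weight update at one point generalizes to its neighbors, while distant inputs are unaffected.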
Slide 12
RL Solution
Slide 14
Experiments
Slide 15
Adversary
- Fixed policy
  - It computes a near-optimal interception point (UvA Trilearn 2003 team)
Two phases
- Training
- Testing
Slide 16
Experiments
Training phase: 5 independent runs, each lasting 50,000 episodes
53%
Slide 17
Experiments Qualitatively Rule #1
Slide 18
Experiments Qualitatively Rule #2
Slide 19
Experiments
Slide 21
Dribble
- Soccer dribbling task
- Reinforcement learning solution
- Benchmark
- Starting point for dribbling tasks in other sports games, e.g., hockey, basketball, and football
Slide 22
Thank you!
Source code available at: http://sites.google.com/site/soccerdribbling
Arthur Carvalho, Renato Oliveira