Reinforcement Learning for the Soccer Dribbling Task
Arthur Carvalho, Renato Oliveira
- Slide 1
- Reinforcement Learning for the Soccer Dribbling Task. Arthur
Carvalho, Renato Oliveira
- Slide 2
- Introduction. RoboCup soccer simulation. Scoring: "A Data Mining
Approach to Solve the Goal Scoring Problem". Passing: "A New Passing
Strategy Based on Q-Learning Algorithm in RoboCup". Dribbling: ?
- Slide 3
- Soccer Dribbling Task
- Slide 4
- Outline: The soccer dribbling task as an RL problem; RL solution;
Experiments; Conclusion
- Slide 5
- The Soccer Dribbling Task as an RL Problem. The coach has two jobs.
Setting positions: the dribbler is placed in the center-left region
together with the ball; the adversary is placed in a random position.
Managing the play: the adversary wins when he gains possession or
when the ball goes out of the field; the dribbler wins when he
crosses the field with the ball.
- Slide 6
- The Soccer Dribbling Task as an RL Problem. When an episode ends,
the coach starts a new one. The RoboCup soccer simulator operates in
discrete time steps. Episodic reinforcement-learning framework.
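The coach's win/loss rules and the episodic, discrete-time framing can be sketched roughly as follows. This is only an illustration, not the authors' code: the field dimensions, the function names `episode_outcome` and `run_episode`, the possession flags, and the order in which the rules are checked are all assumptions made for the example.

```python
from typing import Callable, Optional, Tuple

# Assumed size of the training region; the slides do not give dimensions.
FIELD_LENGTH = 20.0
FIELD_WIDTH = 20.0


def episode_outcome(ball_x: float, ball_y: float,
                    dribbler_has_ball: bool,
                    adversary_has_ball: bool) -> Optional[str]:
    """Return the winner of the episode, or None if play continues."""
    # Adversary wins as soon as he gains possession.
    if adversary_has_ball:
        return "adversary"
    # Dribbler wins when he crosses the field with the ball
    # (assumed here to mean reaching the right edge while in possession).
    if dribbler_has_ball and ball_x >= FIELD_LENGTH:
        return "dribbler"
    # Adversary also wins when the ball leaves the field.
    inside = 0.0 <= ball_x <= FIELD_LENGTH and 0.0 <= ball_y <= FIELD_WIDTH
    if not inside:
        return "adversary"
    return None


def run_episode(step: Callable[[int], Tuple[float, float, bool, bool]],
                max_steps: int = 1000) -> Tuple[Optional[str], int]:
    """Advance a hypothetical simulator one discrete step at a time."""
    for t in range(max_steps):
        result = episode_outcome(*step(t))
        if result is not None:
            return result, t + 1  # the coach would now start a new episode
    return None, max_steps
```

Here `step` stands in for one simulator cycle and returns the ball position plus the two possession flags.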
- Slide 7
- The Soccer Dribbling Task as an RL Problem. Actions: HoldBall() and
Dribble(θ, k), used as Dribble(30, 5), Dribble(330, 5), Dribble(0, 5),
and Dribble(0, 10). The dribbler can kick the ball forward (strongly
and weakly), diagonally upward, and diagonally downward.
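One way to represent this five-action set in code is sketched below. The class and field names are hypothetical. Two readings are assumed rather than stated on the slide: the first Dribble argument is a direction in degrees, and a larger second argument means a stronger kick.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class HoldBall:
    """The HoldBall() action from the slide."""


@dataclass(frozen=True)
class Dribble:
    """The Dribble action; field names are assumptions for illustration."""
    angle_deg: float  # first argument, assumed to be a direction in degrees
    k: float          # second argument, assumed to control kick strength


Action = Union[HoldBall, Dribble]

# The five actions listed on the slide.
ACTIONS: Tuple[Action, ...] = (
    HoldBall(),
    Dribble(30, 5),
    Dribble(330, 5),
    Dribble(0, 5),
    Dribble(0, 10),
)


def describe(action: Action) -> str:
    """Coarse label for an action, under the assumptions above."""
    if isinstance(action, HoldBall):
        return "hold"
    if action.angle_deg == 0:
        return "forward (strong)" if action.k > 5 else "forward (weak)"
    return "diagonal"
```

Frozen dataclasses make the actions hashable, so they can be used directly as dictionary keys, for example one value table per action.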
- Slide 8
- The Soccer Dribbling Task as an RL Problem. State:
  Variable                  Meaning
  posY(dribbler)            Vertical position of the dribbler
  ang(dribbler)             Global angle of the dribbler
  ang(dribbler; adversary)  Relative angle between the dribbler and the adversary
  ang(ball; adversary)      Relative angle between the ball and the adversary
  dist(ball; adversary)     Distance between the ball and the adversary
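A minimal sketch of how these five state variables might be computed from raw positions is given below. The slide does not define "relative angle" precisely. Here it is assumed to be the direction, in degrees, of the line from the first object to the second. The function names and the `(x, y)` tuple layout are also assumptions.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # assumed (x, y) layout


def angle_between(src: Point, dst: Point) -> float:
    """Direction in degrees of the line from src to dst (assumed definition)."""
    return math.degrees(math.atan2(dst[1] - src[1], dst[0] - src[0]))


def state_vector(dribbler: Point, dribbler_angle_deg: float,
                 ball: Point, adversary: Point) -> Tuple[float, ...]:
    """Build the five-variable state listed on the slide."""
    return (
        dribbler[1],                         # posY(dribbler)
        dribbler_angle_deg,                  # ang(dribbler)
        angle_between(dribbler, adversary),  # ang(dribbler; adversary)
        angle_between(ball, adversary),      # ang(ball; adversary)
        math.dist(ball, adversary),          # dist(ball; adversary)
    )
```

Note that `math.dist` requires Python 3.8 or later.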
- Slide 9
- Outline: The soccer dribbling task as an RL problem; RL solution;
Experiments; Conclusion
- Slide 10
- RL Solution
- Slide 11
- CMAC. Partitions the state space into several receptive fields
(hyper-rectangles), each associated with a weight. Multiple
partitions of the state space (layers) are usually used. The CMAC's
response to a given input is equal to the sum of the weights of the
excited receptive fields.
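The CMAC description above can be illustrated with a small tile-coding sketch. It is not the authors' implementation: the uniform tile width, the evenly spaced layer offsets, and the simple error-correcting `update` rule are assumptions chosen for brevity. Only the "sum of the weights of the excited receptive fields" response comes from the slide.

```python
import math
from collections import defaultdict
from typing import List, Sequence, Tuple

Tile = Tuple[int, Tuple[int, ...]]


class CMAC:
    """Minimal CMAC: several offset grids (layers) of hyper-rectangles."""

    def __init__(self, n_layers: int, tile_width: float) -> None:
        self.n_layers = n_layers
        self.tile_width = tile_width
        # One weight per receptive field, created lazily and starting at 0.
        self.weights = defaultdict(float)

    def active_tiles(self, x: Sequence[float]) -> List[Tile]:
        """Return the one excited receptive field in each layer."""
        tiles = []
        for layer in range(self.n_layers):
            # Assumed offset scheme: each layer is shifted by an equal fraction.
            offset = layer * self.tile_width / self.n_layers
            coords = tuple(math.floor((xi + offset) / self.tile_width)
                           for xi in x)
            tiles.append((layer, coords))
        return tiles

    def response(self, x: Sequence[float]) -> float:
        """Sum of the weights of the excited receptive fields."""
        return sum(self.weights[t] for t in self.active_tiles(x))

    def update(self, x: Sequence[float], target: float,
               step_size: float) -> None:
        """Assumed training rule: spread the error over the active tiles."""
        error = target - self.response(x)
        for t in self.active_tiles(x):
            self.weights[t] += step_size / self.n_layers * error
```

Because the layers are offset, an update at one input also changes the response at nearby inputs that share some tiles. This is the generalization that multiple partitions provide.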
- Slide 12
- RL Solution
- Slide 13
- Outline: The soccer dribbling task as an RL problem; RL solution;
Experiments; Conclusion
- Slide 14
- Experiments
- Slide 15
- Adversary. Fixed policy: it computes a near-optimal interception
point (UvA Trilearn 2003 team). Two phases: training and testing.
- Slide 16
- Experiments. Training phase: 5 independent runs, each one
lasting 50,000 episodes. 53%
- Slide 17
- Experiments, qualitatively: Rule #1
- Slide 18
- Experiments, qualitatively: Rule #2
- Slide 19
- Experiments
- Slide 20
- Outline: The soccer dribbling task as an RL problem; RL solution;
Experiments; Conclusion
- Slide 21
- Dribble. Soccer dribbling task; reinforcement learning solution;
benchmark. Starting point for dribbling tasks in other sports games,
e.g., hockey, basketball, and football.
- Slide 22
- Thank you! Source code available at:
http://sites.google.com/site/soccerdribbling
Arthur Carvalho (a3carval@cs.uwaterloo.ca), Renato Oliveira
(rmo@cin.ufpe.br)