Source: hunkim.github.io/ml/RL/rl01.pdf

Lecture 1: Introduction

Reinforcement Learning with TensorFlow & OpenAI Gym

Sung Kim <hunkim+ml@gmail.com>

http://angelpawstherapy.org/positive-reinforcement-dog-training.html

Nature of Learning

• We learn from past experiences.

- When an infant plays, waves its arms, or looks about, it has no explicit teacher.

- But it does have direct interaction with its environment.

• Years of positive compliments as well as negative criticism have all helped shape who we are today.

• Reinforcement learning: computational approach to learning from interaction.

Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction

Nishant Shukla, Machine Learning with TensorFlow
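As a rough illustration of "learning from interaction" with no explicit teacher, here is a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the lecture: the `TwoArmedBandit` toy environment, its hidden win probabilities, the epsilon-greedy rule, and all parameter values. The agent is never told which action is better; it only sees the reward that follows each action.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: each of two actions pays reward 1.0
    with a fixed probability that is hidden from the agent."""

    def __init__(self, win_probs=(0.2, 0.8), seed=0):
        self.win_probs = win_probs
        self.rng = random.Random(seed)

    def step(self, action):
        # The environment answers an action with a reward and nothing else.
        return 1.0 if self.rng.random() < self.win_probs[action] else 0.0


def learn_by_interaction(env, steps=2000, epsilon=0.1, seed=1):
    """Estimate the value of each action purely from observed rewards."""
    rng = random.Random(seed)
    estimates = [0.0, 0.0]  # running average reward per action
    counts = [0, 0]         # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore: try a random action
        else:
            action = max(range(2), key=lambda a: estimates[a])  # exploit
        reward = env.step(action)
        counts[action] += 1
        # Incremental mean: move the estimate toward the new reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = learn_by_interaction(TwoArmedBandit())
best_action = max(range(2), key=lambda a: estimates[a])
print("estimated values:", estimates)
print("times each action was tried:", counts)
print("action the agent now prefers:", best_action)
```

The occasional random action (`epsilon`) is what lets the agent discover the better action instead of settling on whichever one paid off first.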

Reinforcement Learning

https://www.cs.utexas.edu/~eladlieb/RLRG.html

Machine Learning, Tom Mitchell, 1997

Atari Breakout Game (2013, 2015)

Atari Games

Human-level control through deep reinforcement learning, Nature http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Nature: Human-level control through deep reinforcement learning

Figure courtesy of Mnih et al., "Human-level control through deep reinforcement learning", Nature, 26 Feb. 2015

https://deepmind.com/blog/deep-reinforcement-learning/

https://deepmind.com/applied/deepmind-for-google/

Reinforcement Learning Applications

• Robotics: torque at joints

• Business operations

- Inventory management: how much to purchase of inventory, spare

- Resource allocation: e.g. in a call center, who to service first

• Finance: Investment decisions, portfolio design

• E-commerce/media

- What content to present to users (using click-through / visit time as reward)

- What ads to present to users (avoiding ad fatigue)

Audience

• Want to understand basic reinforcement learning (RL)

• No/weak math/computer science background

- Q = r + Q

• Want to use RL as a black box with basic understanding

• Want to use TensorFlow and Python (optional labs)
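The shorthand `Q = r + Q` in the Audience list most likely abbreviates the Q-learning rule named in the Schedule: the value of taking an action equals the immediate reward plus the best value available in the next state. The sketch below shows one such update on a tiny table. The table contents, the discount factor `GAMMA`, and the helper name `q_update` are all assumptions for illustration; the slide itself gives only the shorthand.

```python
# Hypothetical two-state, two-action table, for illustration only.
GAMMA = 0.9  # assumed discount factor; the slide's shorthand omits it

q_table = [
    [0.0, 0.0],  # state 0: nothing learned yet
    [0.0, 1.0],  # state 1: action 1 is already believed to be worth 1.0
]


def q_update(q, state, action, reward, next_state, gamma=GAMMA):
    """Set Q(state, action) to reward + gamma * max over a' of Q(next_state, a')."""
    q[state][action] = reward + gamma * max(q[next_state])
    return q[state][action]


# Taking action 1 in state 0 yields no reward but leads to state 1.
new_value = q_update(q_table, state=0, action=1, reward=0.0, next_state=1)
print(new_value)
```

Even with zero immediate reward, the entry for state 0 picks up value from the state it leads to, which is the whole content of `Q = r + Q`.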

Schedule

1. Introduction

2. Playing Games, OpenAI Gym Introduction & Lab

3. Q-learning with Tables & Lab

4. Q-learning on Nondeterministic Rewards and Actions & Lab

5. Q-learning with Networks (DQN) & Lab

6. Policy Gradients & Lab

7. Further Topics

References

• Awesome Reinforcement Learning, http://aikorea.org/awesome-rl/

• Simple Reinforcement Learning with TensorFlow, https://medium.com/emergent-future/

• http://kvfrans.com/simple-algoritms-for-solving-cartpole/ (written by a high school student)

• Deep Reinforcement Learning: Pong from Pixels - Andrej Karpathy blog http://karpathy.github.io/2016/05/31/rl/

• Machine Learning, Tom Mitchell, 1997

• CS 294: Deep Reinforcement Learning, Spring 2017, http://rll.berkeley.edu/

• Fundamental of Reinforcement Learning, https://www.gitbook.com/book/dnddnjs/rl/details (Korean Book)

Online video lectures

• A Tutorial on Reinforcement Learning, https://simons.berkeley.edu/talks/tutorial-reinforcement-learning, 2017

• Berkeley CS 294: Deep Reinforcement Learning, Spring 2017 http://rll.berkeley.edu/deeprlcourse/, 2017

• MIT 6.S094: Deep Learning for Self-Driving Cars (Lecture 2) http://selfdrivingcars.mit.edu/, 2017

• Deep Reinforcement Learning (John Schulman, OpenAI) https://www.youtube.com/watch?v=PtAIh9KSnjo&t=2457s (summary) and https://www.youtube.com/watch?v=aUrX-rP_ss4&list=PLjKEIQlKCTZYN3CYBlj8r58SbNorobqcp (4 lectures)

• UCL, David Silver, Reinforcement Learning http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html, 2015

• Stanford Andrew Ng CS229 Lecture 16 https://www.youtube.com/watch?v=RtxI449ZjSc, 2008

Prerequisite: http://hunkim.github.io/ml/ or https://www.inflearn.com/course/기본적인-머신러닝-딥러닝-강좌/

Next: Playing OpenAI Gym games
