
TABULAR METHODS & VALUE APPROXIMATION REVIEW
Lecture 18

CSE4/510: Reinforcement Learning

October 24, 2019


MDP

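For reference, in standard textbook notation, the Markov decision process these slides review is the 5-tuple below (standard form, not the slide's own wording):

```latex
% Standard MDP definition (textbook notation, added for reference)
\[
  \mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
  P(s' \mid s, a) = \Pr\{S_{t+1} = s' \mid S_t = s,\ A_t = a\}, \qquad
  R(s, a) = \mathbb{E}\,[R_{t+1} \mid S_t = s,\ A_t = a],
\]
with discount factor $\gamma \in [0, 1]$.
```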


Dynamic Programming


1. Policy Evaluation

2. Policy Improvement


Dynamic Programming summary:

Policy: On-policy
Model: Model-based
Pros:
• Finds optimal policies in polynomial time for most cases
• Guaranteed to find the optimal policy
Cons:
• Requires knowledge of the transition probabilities, which is an unrealistic requirement for many problems
Applications: environments for which the state-transition probabilities are known
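A minimal policy-iteration sketch for a known tabular MDP, combining the two steps above; the dynamics layout P[s][a] = [(prob, next_state, reward, done), ...] and the hyperparameters are assumptions for illustration, not the lecture's code.

```python
# Minimal policy-iteration sketch for a known (model-based) tabular MDP.
# Assumed layout: P[s][a] = [(prob, next_state, reward, done), ...]  (hypothetical).
import numpy as np

def policy_iteration(P, n_states, n_actions, gamma=0.9, theta=1e-8):
    policy = np.zeros(n_states, dtype=int)
    V = np.zeros(n_states)

    def backup(s, a):
        # Expected one-step return under the known dynamics
        return sum(p * (r + gamma * V[s2] * (not done)) for p, s2, r, done in P[s][a])

    while True:
        # 1. Policy Evaluation: sweep until the value function stops changing
        while True:
            delta = 0.0
            for s in range(n_states):
                v_new = backup(s, policy[s])
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < theta:
                break
        # 2. Policy Improvement: act greedily with respect to V
        stable = True
        for s in range(n_states):
            best = int(np.argmax([backup(s, a) for a in range(n_actions)]))
            if best != policy[s]:
                policy[s], stable = best, False
        if stable:
            return policy, V
```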


Distribution vs Sample Model

1. Distribution model: lists all possible outcomes and their probabilities.

2. Sample model: produces a single outcome drawn according to its probability of occurring.
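A toy contrast between the two model types, using made-up coin-flip dynamics; the function names and outcomes are illustrative only.

```python
# Toy contrast between a distribution model and a sample model for the same
# coin-flip dynamics (hypothetical example, not from the slides).
import random

def distribution_model(state, action):
    # Distribution model: enumerate every (probability, next_state, reward) outcome.
    return [(0.5, "heads", +1.0), (0.5, "tails", -1.0)]

def sample_model(state, action):
    # Sample model: draw a single (next_state, reward) according to those probabilities.
    return random.choice([("heads", +1.0), ("tails", -1.0)])
```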


Optimal Functions

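For reference, the optimal value functions and the Bellman optimality equation in standard notation (textbook forms, not necessarily the exact options labeled on the slide):

```latex
% Optimal value functions and Bellman optimality equation (textbook forms)
\begin{align*}
  v_*(s)    &= \max_\pi v_\pi(s) = \max_a q_*(s, a) \\
  q_*(s, a) &= \mathbb{E}\!\left[R_{t+1} + \gamma\, v_*(S_{t+1}) \mid S_t = s,\ A_t = a\right] \\
  v_*(s)    &= \max_a \sum_{s', r} p(s', r \mid s, a)\,\bigl[r + \gamma\, v_*(s')\bigr]
\end{align*}
```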

Update Functions

Each method has its own update rule:

1. Dynamic Programming

2. Monte Carlo

3. Q-learning

4. Temporal Difference
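For reference, the standard textbook update rules for these four methods (not necessarily the slide's exact A-D options):

```latex
% Standard update rules (textbook forms)
\begin{align*}
\text{Dynamic Programming (expected backup):}\quad
  & V(s) \leftarrow \sum_a \pi(a\mid s)\sum_{s',r} p(s',r\mid s,a)\,\bigl[r + \gamma V(s')\bigr] \\
\text{Monte Carlo:}\quad
  & V(S_t) \leftarrow V(S_t) + \alpha\,\bigl[G_t - V(S_t)\bigr] \\
\text{Temporal Difference, TD(0):}\quad
  & V(S_t) \leftarrow V(S_t) + \alpha\,\bigl[R_{t+1} + \gamma V(S_{t+1}) - V(S_t)\bigr] \\
\text{Q-learning:}\quad
  & Q(S_t,A_t) \leftarrow Q(S_t,A_t) + \alpha\,\bigl[R_{t+1} + \gamma \max_{a'} Q(S_{t+1},a') - Q(S_t,A_t)\bigr]
\end{align*}
```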


Overview

MC / DP / TD ?


Monte Carlo


Monte Carlo summary:

Policy: On-policy
Model: Model-free
Pros:
• Learns optimal behavior directly from interaction with the environment
• Can be focused on a region of special interest and evaluate it accurately
Cons:
• Requires a terminal state
• Must wait until the end of an episode before the return is known; for problems with very long episodes this becomes too slow
Applications: cannot be used on continuing tasks; the task must be episodic
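A minimal every-visit Monte Carlo prediction sketch, matching the point above that updates can only happen once an episode ends; sample_episode(policy) is a hypothetical helper that plays one full episode and returns (state, reward) pairs.

```python
# Minimal every-visit Monte Carlo prediction sketch (illustration only).
# sample_episode(policy) is a hypothetical helper that plays one full episode
# and returns [(state, reward), ...]; MC can only update once the episode ends.
from collections import defaultdict

def mc_prediction(policy, sample_episode, n_episodes=10_000, gamma=0.99):
    V = defaultdict(float)      # state-value estimates
    visits = defaultdict(int)   # visit counts for incremental averaging
    for _ in range(n_episodes):
        episode = sample_episode(policy)
        G = 0.0
        # Walk the episode backwards so G accumulates the discounted return
        for state, reward in reversed(episode):
            G = reward + gamma * G
            visits[state] += 1
            V[state] += (G - V[state]) / visits[state]   # running mean of returns
    return V
```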


SARSA vs Q-learning


SARSA

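To make the SARSA vs Q-learning contrast concrete, a minimal tabular SARSA update, assuming Q is a NumPy array indexed by [state, action] and rng is a numpy.random.Generator; the key point is the on-policy target.

```python
# Minimal tabular SARSA sketch (illustration). The key difference from Q-learning:
# the target bootstraps from the action a2 that the epsilon-greedy behavior policy
# actually takes next (on-policy), not from max_a Q[s2, a].
import numpy as np

def epsilon_greedy(Q, s, eps, rng):
    n_actions = Q.shape[1]
    return rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))

def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.99, done=False):
    target = r + (0.0 if done else gamma * Q[s2, a2])   # on-policy target
    Q[s, a] += alpha * (target - Q[s, a])
```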


Q-learning


Q-Learning summary:

Policy: Off-policy
Model: Model-free
Pros:
• Easy to implement
Cons:
• Memory requirement grows with the number of states
• Does not perform well in stochastic environments
Applications: environments with a limited number of states and discrete action spaces
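A minimal tabular Q-learning loop, assuming a Gym-style env whose reset() returns an integer state and whose step(a) returns (next_state, reward, done); the hyperparameters are illustrative defaults.

```python
# Minimal tabular Q-learning sketch (illustration only, not the lecture's code).
# Assumes a hypothetical env with reset() -> state and step(a) -> (next_state, reward, done),
# with integer states and actions.
import numpy as np

def q_learning(env, n_states, n_actions, episodes=5000,
               alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy behavior policy
            a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r, done = env.step(a)
            # off-policy target: bootstrap from the greedy action via the max
            target = r + (0.0 if done else gamma * np.max(Q[s2]))
            Q[s, a] += alpha * (target - Q[s, a])
            s = s2
    return Q
```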


Function Approximation
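A minimal sketch of the step from tabular methods to function approximation: semi-gradient TD(0) with a linear value function v(s; w) = w · x(s). The env, policy, and feature_fn helpers are hypothetical placeholders.

```python
# Minimal value-function-approximation sketch (illustration only): semi-gradient
# TD(0) with a linear approximator v(s; w) = w @ x(s).
# Assumptions: env.reset() -> state, env.step(a) -> (next_state, reward, done),
# policy(state) -> action, feature_fn(state) -> np.ndarray of shape (n_features,).
import numpy as np

def semi_gradient_td0(env, policy, feature_fn, n_features,
                      episodes=1000, alpha=0.01, gamma=0.99):
    w = np.zeros(n_features)
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            a = policy(s)
            s2, r, done = env.step(a)
            x = feature_fn(s)
            v = w @ x
            v2 = 0.0 if done else w @ feature_fn(s2)
            # Semi-gradient TD(0): the gradient of v(s; w) = w @ x is just x
            w += alpha * (r + gamma * v2 - v) * x
            s = s2
    return w
```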


Deep Q-network


Deep Q-network (DQN) summary:

Policy: Off-policy
Model: Model-free
Pros:
• Can generalize to unseen states
• The input is just a state
Cons:
• May over-estimate values
• Cannot be applied to continuous action spaces
Applications: environments with a limited number of states and discrete action spaces
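A minimal DQN loss computation in PyTorch, as a sketch of the standard recipe the summary describes (a Q-network over states, a replay minibatch, and a periodically synced target network); QNet, the layer sizes, and the batch layout (s, a, r, s2, done) are assumptions for illustration.

```python
# Minimal DQN sketch in PyTorch (illustration only; sizes and names are placeholders).
import torch
import torch.nn as nn
import torch.nn.functional as F

class QNet(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, n_actions),   # one Q-value per discrete action
        )

    def forward(self, x):
        return self.net(x)

def dqn_loss(online, target, batch, gamma=0.99):
    s, a, r, s2, done = batch                               # replay-buffer minibatch tensors
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)      # Q(s, a) actually taken
    with torch.no_grad():
        # Plain DQN target: bootstrap from the target network's max Q-value;
        # this max is the source of the over-estimation noted above.
        y = r + gamma * (1.0 - done) * target(s2).max(dim=1).values
    # In training, `target` is a copy of `online` whose weights are synced every N steps.
    return F.mse_loss(q, y)
```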


Double Deep Q-network


Double Deep Q-network (DDQN) summary:

Policy: Off-policy
Model: Model-free
Pros:
• Value estimates are more accurate than DQN's
• The input is just a state
Cons:
• May take longer to train
Applications: environments with a limited number of states and discrete action spaces
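A sketch of the Double DQN target, reusing the online/target network pair from the DQN sketch above: the online network selects the next action and the target network evaluates it, which is the change that makes the value estimates more accurate.

```python
# Double DQN target sketch (illustration). Decoupling action *selection* (online net)
# from action *evaluation* (target net) reduces the over-estimation of plain DQN.
import torch

def double_dqn_target(online, target, r, s2, done, gamma=0.99):
    with torch.no_grad():
        a_star = online(s2).argmax(dim=1, keepdim=True)       # selection: online network
        q_eval = target(s2).gather(1, a_star).squeeze(1)      # evaluation: target network
        return r + gamma * (1.0 - done) * q_eval
```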


Dueling Deep Q-network (Dueling DQN)

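A minimal dueling head in PyTorch, assuming the same discrete-action setting as above: the network splits into a state-value stream V(s) and an advantage stream A(s, a), recombined with a mean-advantage baseline so the two streams are identifiable.

```python
# Dueling DQN head sketch (illustration): Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)              # V(s)
        self.advantage = nn.Linear(128, n_actions)  # A(s, a)

    def forward(self, x):
        h = self.trunk(x)
        v, adv = self.value(h), self.advantage(h)
        # Subtract the mean advantage so V and A are separately identifiable
        return v + adv - adv.mean(dim=1, keepdim=True)
```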


Prioritized Experience Replay (PER)

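A minimal sketch of the proportional PER variant, assuming priorities based on absolute TD error; the class name and the alpha, beta, and eps defaults are illustrative, not the lecture's implementation.

```python
# Proportional Prioritized Experience Replay sketch (illustration only).
# Transitions are sampled with probability proportional to |TD error|^alpha,
# and importance-sampling weights correct the resulting bias.
import numpy as np

class PrioritizedReplay:
    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.buffer, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        if len(self.buffer) >= self.capacity:          # drop the oldest transition
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size, beta=0.4, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        probs = np.asarray(self.priorities)
        probs = probs / probs.sum()
        idx = rng.choice(len(self.buffer), size=batch_size, p=probs)
        # Importance-sampling weights correct for the non-uniform sampling
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights = weights / weights.max()
        # After learning on this batch, priorities[idx] would be refreshed
        # with the new absolute TD errors.
        return [self.buffer[i] for i in idx], idx, weights
```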
