
© The AnyLogic Company | www.anylogic.com

Practical Applications of Deep Reinforcement Learning Using AnyLogic

The AnyLogic Conference 2019, Austin, TX

Arash Mahdavi, Program Lead, The AnyLogic Company

Ty Wang, Vice President of Business Development, Skymind


Learning and decision making from a simulation model

FINAL MODEL

LEARN

A simulation model is an extension of someone’s mental model




Simulation as the reinforcement learning environment

SIMULATED WORLD (Simulation Model)


Traffic Light Example

Eduardo Gonzalez, VP Engineering, Skymind

Samuel Audet, Deep Learning Engineer, Skymind

Tyler Wolfe-Adam, Technical Support Specialist, The AnyLogic Company


[Figure: arrival rates (per hour) versus time (seconds)]

Traffic Light Example

Cars enter the intersection from 4 directions and move towards the opposing side.

The objective of the training experiment is to learn a policy that optimally controls the traffic light based on the current status of the traffic.

[Figure: four-way intersection with approaches labeled N, S, W, and E]


Implementation Architecture



AnyLogic Model

Imported RL4J library

Custom Experiment


What is inside the Custom experiment?

Hyperparameters

Network configuration

Training




[Figure: network with four layers: Input (10), Hidden 1 (300), Hidden 2 (300), Output (2)]
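The layer sizes on this slide can be sketched as a small feed-forward network. This is a minimal illustration in plain Python, not the RL4J configuration used in the talk: the ReLU activation, the random initial weights, and the reading of the two outputs as one score per action are all assumptions.

```python
import random

# Layer sizes from the slide: input, hidden 1, hidden 2, output.
LAYER_SIZES = [10, 300, 300, 2]


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) with small random weights (illustrative only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(seed=0):
    """Build one (weights, biases) pair per connection between layers."""
    rng = random.Random(seed)
    return [init_layer(n_in, n_out, rng)
            for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]


def dense(layer, inputs):
    """Apply one fully connected layer without an activation."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Assumed hidden-layer activation; the slide does not name one."""
    return [max(0.0, v) for v in values]


def action_scores(network, observation):
    """Forward pass: 10 observation values in, 2 action scores out."""
    activations = list(observation)
    for layer in network[:-1]:
        activations = relu(dense(layer, activations))
    return dense(network[-1], activations)  # linear output layer


def parameter_count(network):
    """Total number of weights and biases in the network."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in network)


network = build_network()
scores = action_scores(network, [0.0] * 10)
greedy_action = max(range(len(scores)), key=scores.__getitem__)
```

With these sizes the network has 10·300 + 300 + 300·300 + 300 + 300·2 + 2 = 94,202 trainable parameters.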







Training





Observation: array with 10 elements

[Figure: intersection diagram with numbered labels 1 to 9]





Action == 0: do nothing

Action == 1: change the traffic light phase if not yellow
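The two actions can be sketched as a small state-transition function. The phase names and their cyclic order are hypothetical, and the sketch assumes the yellow phase ends on its own timer inside the simulation model, so the agent cannot cut it short.

```python
# Hypothetical phase names and order; the slides do not list them.
PHASES = ["NS_GREEN", "NS_YELLOW", "EW_GREEN", "EW_YELLOW"]

DO_NOTHING = 0    # Action == 0
CHANGE_PHASE = 1  # Action == 1


def is_yellow(phase):
    """True for the (assumed) yellow phases."""
    return phase.endswith("_YELLOW")


def apply_action(phase, action):
    """Return the phase that follows `phase` after the agent picks `action`."""
    if action not in (DO_NOTHING, CHANGE_PHASE):
        raise ValueError(f"unknown action: {action}")
    if action == CHANGE_PHASE and not is_yellow(phase):
        # Advance to the next phase in the cycle.
        return PHASES[(PHASES.index(phase) + 1) % len(PHASES)]
    # Do nothing, or a change request that is ignored during yellow.
    return phase
```

For example, `apply_action("NS_GREEN", 1)` moves the light to `"NS_YELLOW"`, while `apply_action("NS_YELLOW", 1)` leaves it unchanged.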


Comparison of results (Optimized vs. Policy)



Comparison of results (Base vs. Optimized vs. Policy)

Real systems: Dynamic + Stochastic (exogenous inputs / system internals)

Optimization: Optimal fixed input parameters

Policy: Optimal (or near-optimal) decisions over time
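The difference between the two approaches can be sketched in code. Both functions below are hypothetical illustrations: the fixed-time controller stands in for an optimized input parameter, and the hand-written queue rule stands in for a learned policy. Neither is taken from the model in the talk. Both return the action encoding used in the traffic light example (0: do nothing, 1: change phase).

```python
def fixed_time_controller(green_seconds):
    """Optimization-style control: one parameter is chosen once, before the
    run, and applied the same way regardless of traffic conditions."""
    def decide(time_in_phase, queue_waiting, queue_moving):
        return 1 if time_in_phase >= green_seconds else 0
    return decide


def queue_aware_policy(time_in_phase, queue_waiting, queue_moving):
    """Policy-style control: the decision depends on the current state.
    This hand-written rule is only a stand-in for a trained policy."""
    return 1 if queue_waiting > queue_moving else 0


fixed = fixed_time_controller(green_seconds=30)

# 10 s into the green, 50 cars waiting on red, none moving:
fixed_decision = fixed(10, 50, 0)                 # keeps waiting for the timer
policy_decision = queue_aware_policy(10, 50, 0)   # reacts to the queue
```

The fixed-time controller can only be tuned through `green_seconds`, whereas the policy reacts to whatever the stochastic inputs produce at each decision point.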


Reinforcement learning decision points

Hyperparameters

Observation Space

Action Space

Reward
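One way to keep these four decision points explicit is to group them in a single object. This is a sketch only: the class name `RLSetup`, the hyperparameter names and values, and the reward function are hypothetical placeholders. Only the sizes 10 and 2 come from the traffic light example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class RLSetup:
    """The four decision points a modeler and an RL expert must agree on."""
    hyperparameters: Dict[str, float]
    observation_size: int   # length of the observation array
    action_count: int       # number of discrete actions
    reward: Callable[[Sequence[float]], float]


def negative_total(observation):
    """Hypothetical reward: assumes the observation holds per-approach
    counts of waiting cars, so a smaller total is better."""
    return -float(sum(observation))


traffic_light_setup = RLSetup(
    hyperparameters={"learning_rate": 1e-3, "discount": 0.99},  # placeholder values
    observation_size=10,  # "Array with 10 elements"
    action_count=2,       # do nothing / change phase
    reward=negative_total,
)
```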


Trained policies can be deployed on all types of devices and equipment to complete tasks adaptively and autonomously.

How are learned policies used?

Edge devices could be used as controllers to deploy the learned policies.


Export model and text file

Test Export File Format

Export AnyLogic Model to Train


Add model into Skymind intelligence layer

.jar File Transfer

Create Experiment

Ready-to-Use Machine Learning Notebooks, Libraries, and Workflows


Train Model

Notebook Integration

Web or Command Line Interface

Compute and Storage Resource Management

Analytics


Deploy Model

Ready-to-Use Deployment Workflow

Multiple Model Language Support: Java, Python, Endpoints, RPA


Manage history and versions

Version History with Rollback


Machine Learning powered by Skymind

http://www.skymind.ai/anylogic


• The great news for simulation modelers is that their skills have a new and exciting application now!

• To implement reinforcement learning (or DRL), a team of DRL experts and simulation modelers can collaborate. In theory, neither group needs in-depth knowledge of the other group’s tasks.

• In developing simulation models that are going to be used as training environments, the stakes are higher because the human buffer is no longer there.

What should simulation modelers know about this new application?


At least in the near future, there is NO way to automate the process of abstracting reality into a simulation model, because it has two aspects that [current] machines are not good at:

• The process of abstracting reality is an art

• Simulation models are fundamentally based on uncovering causality and how something works

Can simulation modelers’ jobs be replaced with AI too?


AnyLogic-AI integration roadmap

April 2019: DL4J and RL4J libraries; SKIL & AnyLogic

June 2019: RL capabilities for the current AnyLogic Cloud Java API; DRL examples with instructions

Summer 2019: AnyLogic Cloud Python API (RL ready)

Fall 2019: Preset learning algorithm/architectures in SKIL

End of 2019: AL-AI book (first draft)

We are here now

• Integration with other AI platforms

• DRL in the Cloud (DRL experiment)

• AL Python API (AnyLogic 9)

• Providing DRL-compatible example models


thank you!
