GoGoGo: Improving Deep Neural Network Based Go Playing AI with Residual Networks
Xingyu Liu

Introduction
[Fig. 2 (a) Policy Network: input (19x19x5) followed by a stack of 3x3, 64-channel convolutions (CONV3, 64), ending in an output layer producing move probabilities P.]
[Fig. 2 (b) Value Network: input (19x19x5) followed by a stack of 3x3, 64-channel convolutions (CONV3, 64).]
Network Architecture

Experiment Result
• Go-playing AIs using traditional search:
GNU Go, Pachi, Fuego, Zen, etc.
Training Methodology and Data
• From Kifu to Input Feature Maps
Channels: 1) empty (space) positions; 2) black positions; 3) white positions; 4) current player; 5) ko positions
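As a sketch of how these five channels could be encoded (the function name, board encoding, and ko representation are assumptions for illustration, not taken from the poster):

```python
import numpy as np

BOARD = 19  # 19x19 Go board
EMPTY, BLACK, WHITE = 0, 1, 2

def board_to_features(board, to_play, ko_point=None):
    """Encode a board into the 5 channels listed above:
    1) empty positions, 2) black stones, 3) white stones,
    4) current player (constant plane), 5) ko position."""
    feats = np.zeros((BOARD, BOARD, 5), dtype=np.float32)
    feats[:, :, 0] = (board == EMPTY)
    feats[:, :, 1] = (board == BLACK)
    feats[:, :, 2] = (board == WHITE)
    feats[:, :, 3] = 1.0 if to_play == BLACK else 0.0
    if ko_point is not None:  # mark the single illegal ko recapture point
        feats[ko_point[0], ko_point[1], 4] = 1.0
    return feats

# Example: two stones on the board, black to play, a ko point at (2, 3)
board = np.zeros((BOARD, BOARD), dtype=np.int8)
board[3, 3] = BLACK
board[15, 15] = WHITE
x = board_to_features(board, to_play=BLACK, ko_point=(2, 3))
```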
References
[1] David Silver et al., “Mastering the game of Go with deep neural networks and tree search”, Nature, 529:484–489, 2016.
[2] Kaiming He et al., “Deep residual learning for image recognition”, CoRR, abs/1512.03385, 2015.
SL on Policy Network → RL on Policy Network → RL on Value Network
• Dynamic Board State Expansion
Supports ko-fight handling; saves disk space and keeps memory usage small.
• Use the Ing Chang-ki rules
Board state + ko is the full game state, so there is no need to remember the number of captured stones.
• Two Levels of Batches (kifus, moves)
Random shuffling with small memory usage and good locality.
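A minimal sketch of this two-level batching scheme, assuming kifus are stored as lists of moves (the function name and batch sizes are illustrative): shuffle whole games first, then shuffle the moves of only a small window of games, so memory stays bounded by the window while moves are still well mixed.

```python
import random

def two_level_batches(kifus, kifu_batch=16, move_batch=128, seed=0):
    """Level 1: shuffle kifus (whole games). Level 2: load kifu_batch
    games at a time, shuffle their moves, and yield move minibatches.
    Only kifu_batch games need to be in memory at once."""
    rng = random.Random(seed)
    order = list(range(len(kifus)))
    rng.shuffle(order)                       # level 1: shuffle games
    for i in range(0, len(order), kifu_batch):
        moves = [m for k in order[i:i + kifu_batch] for m in kifus[k]]
        rng.shuffle(moves)                   # level 2: shuffle moves
        for j in range(0, len(moves), move_batch):
            yield moves[j:j + move_batch]
```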
• Powered by deep learning:
Zen → Deep Zen Go, darkforest, AlphaGo
• Goal: from a vanilla CNN to ResNets
[Fig. 2 (c) Residual Module: Data → CONV3 → Batch Norm → ReLU → CONV3 → Batch Norm → eltwise add with skip connection → ReLU]
FC: 23104 → 1
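A minimal NumPy sketch of the residual module, under stated assumptions: batch norm is approximated by per-channel standardization, and the elementwise add comes before the final ReLU, following the standard ResNet formulation [2]. The naive convolution and helper names are for illustration only.

```python
import numpy as np

def conv3(x, w):
    """Naive 3x3 'same' convolution, channels-last: (H, W, Cin) -> (H, W, Cout)."""
    h, wid, _ = x.shape
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.empty((h, wid, w.shape[-1]), dtype=x.dtype)
    for i in range(h):
        for j in range(wid):
            # sum over the 3x3 window and input channels
            out[i, j] = np.tensordot(pad[i:i + 3, j:j + 3, :], w, axes=3)
    return out

def residual_module(x, w1, w2):
    """Data -> CONV3 -> norm -> ReLU -> CONV3 -> norm -> eltwise add -> ReLU."""
    def norm(t):  # stand-in for batch norm: per-channel standardization
        mu = t.mean(axis=(0, 1), keepdims=True)
        sd = t.std(axis=(0, 1), keepdims=True) + 1e-5
        return (t - mu) / sd
    y = np.maximum(norm(conv3(x, w1)), 0.0)
    y = norm(conv3(y, w2))
    return np.maximum(x + y, 0.0)  # skip connection, then final ReLU
```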
(b) Hyperparameters
Hyperparameter       Value
Base learning rate   2e-4
Decay policy         exponential
Decay rate           0.95
Decay step (kifu)    200
Loss function        softmax
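With the values from the table, the exponential decay schedule can be sketched as follows; whether the decay is applied continuously or in discrete staircase steps is an assumption here, not stated on the poster.

```python
def learning_rate(kifu_index, base_lr=2e-4, decay_rate=0.95, decay_step=200):
    """Exponential decay per the table: lr = base * rate ** (kifu / step)."""
    return base_lr * decay_rate ** (kifu_index / decay_step)
```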
Future Work

Fig. 1 Explicit ko-fight expansion
• Training Accuracy ~ 32%
• Testing Accuracy ~ 26%
Fig. 3 GoGoGo plays against itself, policy network only
[Plot: supervised learning training loss (y-axis roughly 0–7) vs. training iterations, one point per 100 batches.]
• Monte Carlo Tree Search
a_t = argmax_a ( Q(s_t, a) + u(s_t, a) ),  where u(s, a) ∝ P(s, a) / (1 + N(s, a))
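The selection rule above can be sketched as follows; the node layout (a dict of per-action edge statistics Q, P, N) and the exploration constant c_puct are assumptions for illustration.

```python
def select_action(node, c_puct=5.0):
    """Pick a_t = argmax_a (Q(s_t, a) + u(s_t, a)), with the exploration
    bonus u(s, a) proportional to P(s, a) / (1 + N(s, a))."""
    best_a, best_score = None, float('-inf')
    for a, edge in node.items():
        u = c_puct * edge['P'] / (1.0 + edge['N'])  # prior-weighted bonus
        score = edge['Q'] + u
        if score > best_score:
            best_a, best_score = a, score
    return best_a
```

An unvisited edge (N = 0) keeps its full prior bonus, so high-prior moves are tried before heavily visited ones dominate via Q.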
• Reinforcement Learning of Value Network
• Network Architecture Exploration
• Real Match Testing against Human Players