©2017 Technica Corporation. All Rights Reserved. Technica Corporation Confidential and Proprietary.
TECHNICA ENGINEERING SOLUTION technicacorp.com
Joe Schneible
5/10/17
USING GENETIC ALGORITHMS TO OPTIMIZE RECURRENT NEURAL NETWORKS
TIME SERIES
[Figure: time series plot; y-axis: Degrees C, x-axis: Day]
RECURRENT NEURAL NETWORKS
RECURRENT NEURAL NETWORK (RNN)
[Figure: RNN unfolded across time steps t − 1, t, t + 1; labels: Input, Output]
LONG SHORT TERM MEMORY (LSTM)
Each cell is composed of four elements:
1. Input Gate – Determines how a new value is added to the memory
2. Forget Gate – Determines how much of a value remains in memory
3. Output Gate – Determines how the value in memory affects the output
4. Neuron – with self-recurrent connection
[Figure: LSTM cell diagram; labels: Input Gate, Forget Gate, Output Gate, Neuron]
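For reference, one common formulation of the four elements above is sketched here. The symbols (W, U, b for weights and biases, x_t for the input, h_t for the cell output, c_t for the memory) are notation assumed for illustration and do not appear on the slides.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate value} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{memory (self-recurrent)} \\
h_t &= o_t \odot \tanh(c_t) && \text{output}
\end{aligned}
```

The self-recurrent connection is the term f_t ⊙ c_{t-1}: the memory feeds back into itself, scaled by the forget gate.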
LSTM VS BASIC RNN
Both can use standard methods such as back propagation through time
In basic RNNs, error drops off exponentially
LSTMs trap error in memory allowing error to be continually fed back until prior cells train well enough to address it
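[Figure: RNN unrolled over steps 0 to 4 with states s0 to s4, inputs I0 to I4, and outputs O0 to O4 (I = Input, O = Output); the backpropagation path is annotated with ∂I3/∂s3, ∂s3/∂s2, ∂s2/∂s1, ∂s1/∂s0]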
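The exponential drop-off can be illustrated with a toy example. This is a minimal sketch assuming a scalar RNN s_t = tanh(w * s_{t-1} + x_t); the function name and the scalar model are illustrative and are not taken from the slides.

```python
import math


def state_gradient(w, s0, inputs):
    """Return d s_T / d s_0 for the scalar RNN s_t = tanh(w * s_{t-1} + x_t)."""
    s, grad = s0, 1.0
    for x in inputs:
        s = math.tanh(w * s + x)
        # Chain rule: d s_t / d s_{t-1} = w * (1 - tanh^2(.)) = w * (1 - s_t^2)
        grad *= w * (1.0 - s * s)
    return grad


# With w = 0.5 and zero inputs every factor is 0.5, so the gradient halves per step.
for steps in (1, 5, 10, 20):
    print(steps, state_gradient(0.5, 0.0, [0.0] * steps))
```

Each extra time step multiplies the gradient by another factor, so whenever those factors are below 1 in magnitude the error signal reaching early steps shrinks geometrically.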
PARAMETER SPACE
Look Back
Number of cells in LSTM layer
BATCH TRAINING
Train in mini-batches on independent windows of the training data
Each window is trained in parallel
The parameters obtained from each window are averaged to produce a final parameter set
Takes advantage of parallelism of GPUs
[Figure: training data divided into Window 1 and Window 2]
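Splitting a series into independent windows and grouping them into mini-batches might look like the sketch below. The helper names and the use of plain Python lists are assumptions for illustration; the slides do not show how the windows were built.

```python
def make_windows(series, look_back):
    """Slice a series into (look_back inputs, next value) training pairs."""
    samples, targets = [], []
    for start in range(len(series) - look_back):
        samples.append(series[start:start + look_back])
        targets.append(series[start + look_back])
    return samples, targets


def mini_batches(samples, targets, batch_size):
    """Yield consecutive groups of windows; each group forms one mini-batch."""
    for i in range(0, len(samples), batch_size):
        yield samples[i:i + batch_size], targets[i:i + batch_size]


samples, targets = make_windows([1, 2, 3, 4, 5], look_back=2)
for batch_x, batch_y in mini_batches(samples, targets, batch_size=2):
    print(batch_x, batch_y)
```

Because each window carries its own look-back history, windows within a batch do not depend on one another and can be processed in parallel.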
RECURSIVE PREDICTION
RNNs predict the next time step
To predict further ahead, each resulting output must be used as part of the next input sample
Error accumulates as further predictions are made
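The feedback loop described above can be sketched as follows. `predict_next` is a hypothetical stand-in for a trained one-step model; any callable that maps a window to the next value will do.

```python
def predict_recursively(predict_next, history, look_back, steps):
    """Forecast `steps` values by feeding each prediction back into the input window."""
    window = list(history[-look_back:])
    forecasts = []
    for _ in range(steps):
        y = predict_next(window)
        forecasts.append(y)
        window = window[1:] + [y]  # drop the oldest value, append the prediction
    return forecasts


# Toy one-step "model" (an assumption for the demo): last value plus one.
print(predict_recursively(lambda w: w[-1] + 1, [1, 2, 3], look_back=2, steps=3))
```

After `look_back` steps the window contains only predicted values, which is why one-step errors compound.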
GENETIC ALGORITHMS
GENETIC ALGORITHMS (GA)
Heuristic approach to searching a parameter space for a (near) optimal solution
Modeled on evolution
Create a set of solutions called a generation
Test all elements of the generation to determine the best solutions
Create a new generation through cross-over and mutation of best solutions
Repeat
[Figure: successive generations produced by alternating Natural Selection with Cross-over and Mutation]
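The generation loop above can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: the population size, number of survivors, rates, and the convention that lower fitness is better are all assumed values.

```python
import random


def evolve(fitness, gene_length, population_size=20, generations=30,
           keep=5, crossover_rate=0.5, mutation_rate=0.05, rng=None):
    """Search binary genes for a low-fitness solution (lower is better)."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    population = [[rng.randint(0, 1) for _ in range(gene_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Natural selection: keep the best `keep` solutions.
        best = sorted(population, key=fitness)[:keep]
        children = []
        while len(best) + len(children) < population_size:
            p1, p2 = rng.sample(best, 2)
            # Cross-over: take each bit from p2 with probability crossover_rate.
            child = [b if rng.random() < crossover_rate else a
                     for a, b in zip(p1, p2)]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = best + children
    return min(population, key=fitness)


# Toy fitness (an assumption for the demo): prefer genes with exactly three 1-bits.
print(evolve(lambda g: abs(sum(g) - 3), gene_length=10))
```

Carrying the survivors over unchanged means the best fitness in the population never gets worse from one generation to the next.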
FITNESS FUNCTION
Determines which candidate solutions are best
Root Mean Square Error (RMSE)
Penalty for larger networks
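One way to combine the two terms is sketched below. The slides do not give the form of the penalty, so the linear penalty on look back plus cells and the `size_penalty` weight are assumptions.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def fitness(predicted, actual, look_back, cells, size_penalty=0.001):
    """Lower is better: prediction error plus a hypothetical linear size penalty."""
    return rmse(predicted, actual) + size_penalty * (look_back + cells)


print(fitness([0.0, 0.0], [3.0, 4.0], look_back=2, cells=8, size_penalty=0.01))
```

With a penalty like this, two networks with the same RMSE are ranked so that the smaller one wins.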
CROSS-OVER
Combines the genes from multiple parents
Randomly selects genes from the second parent to overwrite those of the first parent at a given rate
Parent 1: 1 1 1 0 1 0 0 1 0 1
Parent 2: 0 1 0 0 0 1 0 1 1 0
Child:    1 1 0 0 1 1 0 1 0 1
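The overwrite rule described on this slide corresponds to what is usually called uniform cross-over. A minimal sketch, with the function name and the default rate assumed for illustration:

```python
import random


def cross_over(parent1, parent2, rate=0.5, rng=random):
    """Copy parent1, overwriting each gene with parent2's gene with probability `rate`."""
    return [g2 if rng.random() < rate else g1
            for g1, g2 in zip(parent1, parent2)]


parent1 = [1, 1, 1, 0, 1, 0, 0, 1, 0, 1]
parent2 = [0, 1, 0, 0, 0, 1, 0, 1, 1, 0]
print(cross_over(parent1, parent2))
```

At every position the child carries the bit of one parent or the other, so positions where the parents agree are always preserved.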
MUTATION
Genes are randomly altered at a small rate
Enables movement away from local minima
Unmutated: 1 1 0 0 1 1 0 1 0 1
Mutated:   1 1 0 1 1 1 0 1 1 1
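For a binary gene, random alteration amounts to flipping bits. A minimal sketch, with the function name and the 5% default rate assumed for illustration:

```python
import random


def mutate(gene, rate=0.05, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in gene]


print(mutate([1, 1, 0, 0, 1, 1, 0, 1, 0, 1], rate=0.2))
```

A small rate keeps most of a good solution intact while still letting the search reach genes that cross-over alone cannot produce.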
IMPLEMENTATION
IMPLEMENTATION
Framework
LSTM Model
GA Gene Representation
GA Fitness Function
KERAS
Python package for neural networks
Simple and easy usage enables rapid prototyping
Used Theano backend
MODEL CREATION
Tested various model architectures
Used stateless LSTM as opposed to stateful LSTM to take advantage of GPU via mini-batches
Achieved better results due to efficiency and the ability to perform more training
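A stateless single-layer LSTM in Keras might be set up as in the sketch below. Only the use of Keras and a stateless LSTM comes from the slides; the single input feature, the Dense(1) output layer, the loss, and the optimizer are assumptions, and the tested architectures are not shown in the deck.

```python
def to_lstm_input(windows):
    """Reshape look-back windows to [samples][look_back][1 feature] nested lists."""
    return [[[value] for value in window] for window in windows]


def build_lstm_model(look_back, cells):
    """Build a stateless LSTM regressor (requires Keras to be installed)."""
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    model = Sequential()
    # stateful defaults to False, so samples in a mini-batch are independent.
    model.add(LSTM(cells, input_shape=(look_back, 1)))
    model.add(Dense(1))  # assumed output layer: one value, the next time step
    model.compile(loss="mean_squared_error", optimizer="adam")  # assumed settings
    return model


print(to_lstm_input([[1, 2], [2, 3]]))
```

Here `look_back` and `cells` are the two parameters the genetic algorithm searches over, so each candidate gene would map to one call of `build_lstm_model`.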
GENE REPRESENTATION
Initially used a decimal representation of parameters, with which mutations and cross-overs tended to settle on non-global minima
Changed to a binary representation to help solve this issue
Each integer parameter is represented as the sum of a 0-1 binary sequence
The length of each sequence is determined by a user-selected theoretical maximum
[Figure: example gene 1 1 1 0 1 0 0 1 0 1, divided into Look Back and Cells segments]
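Decoding such a gene is a matter of summing the bits in each segment. In the sketch below the split point (`look_back_bits`) is hypothetical; the extracted slide does not show where the Look Back segment ends and the Cells segment begins.

```python
def decode_gene(gene, look_back_bits):
    """Decode a binary gene into (look_back, cells) by summing each segment's bits."""
    look_back = sum(gene[:look_back_bits])
    cells = sum(gene[look_back_bits:])
    return look_back, cells


# The example gene from the slide, split 5/5 purely for illustration.
print(decode_gene([1, 1, 1, 0, 1, 0, 0, 1, 0, 1], look_back_bits=5))
```

With this encoding a single bit flip changes a parameter by exactly one, and a segment of n bits can represent values from 0 to n, which is why the sequence length follows from the chosen maximum. A decoded value of 0 would need separate handling before building a network.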
GENETIC ALGORITHM FITNESS FUNCTION
Low training loss did not translate into accurate recursive prediction
Using testing RMSE as the fitness measure produced better results
RESULTS
HAND TUNED PARAMETERS
RMSE: 0.192
GENETIC ALGORITHM PARAMETERS
RMSE: 0.099
FEMALE BIRTHS
RMSE: 0.16
PERFORMANCE
[Figure: bar chart of training time speed up; CPU: 1x, GPU: 3.2x]
FUTURE WORK
Multi-GPU Parallelism
Each solution in the population can be trained independently
Training time is dependent on parameters, so speed up will be sub-linear
Apply to General Neural Networks
CONCLUSION
The genetic algorithm chose parameters for our LSTM network
Produced better results than our hand tuning
Would be useful for individuals who lack experience selecting parameters
Requires further parallelization to be feasible for larger network parameter spaces
SPECIAL THANKS
Alex Lu (Junior Software Engineer)
CONTACT INFORMATION
http://technicacorp.com/innovation
Joe Schneible Research Scientist [email protected]
For more information:
Technica Corporation 22970 Indian Creek Dr. Suite 500 Dulles, VA 20166
703.662.2000 technicacorp.com