USING GENETIC ALGORITHMS TO OPTIMIZE RECURRENT NEURAL NETWORKS

Joe Schneible

5/10/17

TECHNICA ENGINEERING SOLUTION technicacorp.com

©2017 Technica Corporation. All Rights Reserved. Technica Corporation Confidential and Proprietary.



TIME SERIES

[Figure: time series plot. Vertical axis: Degrees C; horizontal axis: Day]


RECURRENT NEURAL NETWORKS


RECURRENT NEURAL NETWORK (RNN)

[Figure: a recurrent network unfolded in time into steps t − 1, t, t + 1, each with its own Input and Output]


LONG SHORT TERM MEMORY (LSTM)

Each cell comprises four elements:

1. Input Gate – Determines how a new value is added to the memory

2. Forget Gate – Determines how much of a value remains in memory

3. Output Gate – Determines how the value in memory affects the output

4. Neuron – Holds the memory value through a self-recurrent connection

[Figure: LSTM cell diagram with the Input Gate, Forget Gate, Output Gate, and Neuron labeled]
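The four elements of the cell can be written out as one time step of a single cell with a scalar input. This is a minimal sketch using the standard LSTM update equations, which the slide does not spell out; the function name, the parameter dictionary, and the weight values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, p):
    """Advance one LSTM cell by one time step.

    x is the scalar input, h_prev the previous output, c_prev the previous
    memory value. p holds illustrative weights: w_* act on the input,
    u_* on the previous output, b_* are biases.
    """
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # forget gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    c = f * c_prev + i * g    # self-recurrent memory: keep part, add part
    h = o * math.tanh(c)      # output gate decides how memory reaches the output
    return h, c

# Arbitrary weights, chosen only so the example runs.
params = {f"{kind}_{gate}": 0.5 for kind in "wub" for gate in "ifog"}
h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_cell_step(x, h, c, params)
    print(f"x={x:.1f}  memory={c:.4f}  output={h:.4f}")
```

A forget gate value near 1 keeps the memory almost unchanged from step to step, which is what lets a value persist over many time steps.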


LSTM VS BASIC RNN

Both can be trained with standard methods such as backpropagation through time

In basic RNNs, the back-propagated error shrinks exponentially with the number of time steps

LSTMs hold the error in the cell memory, so it keeps being fed back until earlier cells have trained well enough to address it

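[Figure: network unrolled over states s0 to s4, each with an input I0 to I4 and an output O0 to O4 (I = Input, O = Output). Backpropagation through time is annotated with the chain of partial derivatives ∂I3/∂s3, ∂s3/∂s2, ∂s2/∂s1, ∂s1/∂s0]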
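The exponential drop-off in a basic RNN can be seen by multiplying the per-step derivatives along the chain. The sketch below assumes a scalar RNN with state update s_t = tanh(w * s_{t-1} + x_t); this toy model, the weight, and the inputs are illustrative and are not taken from the slides.

```python
import math

def state_gradients(w, inputs, s0=0.0):
    """Return d s_t / d s_0 for t = 1..len(inputs) in a scalar tanh RNN."""
    grads = []
    s, grad = s0, 1.0
    for x in inputs:
        s = math.tanh(w * s + x)
        # Chain rule: d s_t / d s_{t-1} = w * (1 - s_t ** 2)
        grad *= w * (1.0 - s * s)
        grads.append(grad)
    return grads

grads = state_gradients(w=0.5, inputs=[0.3] * 20)
for t in (1, 5, 10, 20):
    print(f"d s_{t} / d s_0 = {grads[t - 1]:.2e}")
```

With |w| below 1 every factor has magnitude below 1, so the product shrinks at least as fast as |w| raised to the number of steps.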


PARAMETER SPACE

Look Back

Number of cells in LSTM layer
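The slides do not define "look back". The sketch below assumes the common meaning: the number of previous time steps handed to the network as input for predicting the next value. The function name is hypothetical.

```python
def make_windows(series, look_back):
    """Split a series into (input window, next value) training pairs."""
    samples = []
    for start in range(len(series) - look_back):
        window = series[start:start + look_back]
        target = series[start + look_back]
        samples.append((window, target))
    return samples

pairs = make_windows([10, 11, 12, 13, 14], look_back=3)
print(pairs)  # [([10, 11, 12], 13), ([11, 12, 13], 14)]
```

A larger look back gives the network more context per sample but leaves fewer samples, since a series of length N yields N minus look_back pairs.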


BATCH TRAINING

Train in mini-batches on independent windows of the training data

Each window is trained in parallel

Parameters obtained from each window are averaged to produce a final parameter set

Takes advantage of parallelism of GPUs

[Figure: training series divided into Window 1 and Window 2]


RECURSIVE PREDICTION

An RNN predicts only the next time step

To predict further ahead, each output must be fed back as part of the next input sample

Errors accumulate as predictions extend further into the future
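Feeding each output back into the next input can be sketched with any one-step predictor. Here `predict_next` stands in for the trained network, and the linear extrapolation used in the example is purely illustrative; both names are hypothetical.

```python
def predict_recursively(predict_next, history, look_back, n_steps):
    """Forecast n_steps ahead by feeding each prediction back as input."""
    window = list(history[-look_back:])
    forecasts = []
    for _ in range(n_steps):
        y = predict_next(window)
        forecasts.append(y)
        # Drop the oldest value and append the prediction as if it were data.
        window = window[1:] + [y]
    return forecasts

def linear_extrapolation(window):
    """Stand-in one-step model: last value plus the last difference."""
    return window[-1] + (window[-1] - window[-2])

forecasts = predict_recursively(linear_extrapolation, [1.0, 2.0, 3.0],
                                look_back=2, n_steps=3)
print(forecasts)  # [4.0, 5.0, 6.0]
```

After `look_back` steps the window consists entirely of predicted values, so any one-step error is carried into every later forecast.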


GENETIC ALGORITHMS


GENETIC ALGORITHMS (GA)

Heuristic approach to searching a parameter space for a (near) optimal solution

Modeled on evolution

1. Create a set of solutions called a generation

2. Test all elements of the generation to determine the best solutions

3. Create a new generation through cross-over and mutation of the best solutions

4. Repeat

[Figure: generation cycle labeled Mutation, Natural Selection, Cross-over and Mutation, Natural Selection, Cross-over and Mutation]
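The generate, test, breed, repeat cycle can be sketched on a toy problem. Everything specific here is an assumption rather than the presenters' implementation: the function name, the default rates and sizes, the choice to carry the best solutions over unchanged, and the toy fitness in which fewer 1 bits is better.

```python
import random

def evolve(fitness, gene_length, pop_size=20, n_generations=30,
           n_parents=5, crossover_rate=0.5, mutation_rate=0.05, seed=0):
    """Search for a low-fitness bit string; all default values are illustrative."""
    rng = random.Random(seed)
    # Create the first generation at random.
    population = [[rng.randint(0, 1) for _ in range(gene_length)]
                  for _ in range(pop_size)]
    for _ in range(n_generations):
        # Test every member and keep the best (lowest fitness) as parents.
        parents = sorted(population, key=fitness)[:n_parents]
        # Carry the parents over unchanged, then fill up with children.
        population = [list(p) for p in parents]
        while len(population) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Cross-over: take each gene from p2 with probability crossover_rate.
            child = [b if rng.random() < crossover_rate else a
                     for a, b in zip(p1, p2)]
            # Mutation: flip each gene with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            population.append(child)
    return min(population, key=fitness)

# Toy problem: the fewer 1 bits, the better.
best = evolve(fitness=sum, gene_length=10)
print(best, sum(best))
```

When the fitness of a solution is the error of a trained network, each call to `fitness` means training one network, so the population size and number of generations dominate the run time.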


FITNESS FUNCTION

Determines which candidate solutions are best

Root Mean Square Error (RMSE)

Penalty for larger networks
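The slide names RMSE and a size penalty but not how they are combined. The sketch below assumes an additive penalty proportional to the number of LSTM cells; the function names, the penalty form, and the `size_penalty` weight are all hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def fitness(actual, predicted, n_cells, size_penalty=0.001):
    """Lower is better: prediction error plus an assumed cost per LSTM cell."""
    return rmse(actual, predicted) + size_penalty * n_cells

actual = [0.0, 0.0]
predicted = [1.0, 1.0]
small = fitness(actual, predicted, n_cells=10, size_penalty=0.01)
large = fitness(actual, predicted, n_cells=50, size_penalty=0.01)
print(small, large)  # same error, but the larger network scores worse
```

With a penalty of this kind, two candidates with equal error are ranked by size, so the search favors the smaller network.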


CROSS-OVER

Combines the genes from multiple parents

Randomly selects genes from the second parent to overwrite those of the first parent at a given rate

Parent 1: 1 1 1 0 1 0 0 1 0 1

Parent 2: 0 1 0 0 0 1 0 1 1 0

Child:    1 1 0 0 1 1 0 1 0 1
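A minimal sketch of this kind of cross-over, assuming each gene is chosen independently. The function name and the rate of 0.3 are hypothetical. Reading the three bit strings on the slide as Parent 1, Parent 2, and Child in that order, the child matches Parent 1 with genes 3 and 6 overwritten by Parent 2, which the last lines reproduce.

```python
import random

def crossover(parent1, parent2, rate, rng=random):
    """Return a child and the mask showing which genes came from parent2."""
    mask = [rng.random() < rate for _ in parent1]
    child = [g2 if from_p2 else g1
             for g1, g2, from_p2 in zip(parent1, parent2, mask)]
    return child, mask

parent1 = [1, 1, 1, 0, 1, 0, 0, 1, 0, 1]
parent2 = [0, 1, 0, 0, 0, 1, 0, 1, 1, 0]
child, mask = crossover(parent1, parent2, rate=0.3, rng=random.Random(1))
print(child)

# The slide's child: genes 3 and 6 (indices 2 and 5) taken from Parent 2.
slide_child = [g2 if i in (2, 5) else g1
               for i, (g1, g2) in enumerate(zip(parent1, parent2))]
print(slide_child)
```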


MUTATION

Genes are randomly altered at a small rate

Enables movement away from local minima

Unmutated: 1 1 0 0 1 1 0 1 0 1

Mutated:   1 1 0 1 1 1 0 1 1 1
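Mutation of a binary gene can be sketched as an independent bit flip per position. The function name and the rate of 0.1 are hypothetical. In the slide's example, the two strings differ in the fourth and ninth bits.

```python
import random

def mutate(gene, rate, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in gene]

unmutated = [1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
mutated = mutate(unmutated, rate=0.1, rng=random.Random(0))
flipped = [i for i, (a, b) in enumerate(zip(unmutated, mutated)) if a != b]
print(mutated, "flipped indices:", flipped)
```

Unlike cross-over, mutation can produce bit values that neither parent had at a given position, which is what lets the search leave a local minimum.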


IMPLEMENTATION


IMPLEMENTATION

Framework

LSTM Model

GA Gene Representation

GA Fitness Function


KERAS

Python package for neural networks

Simple, easy-to-use interface enables rapid prototyping

Used with the Theano backend


MODEL CREATION

Tested various model architectures

Used stateless LSTM as opposed to stateful LSTM to take advantage of the GPU via mini-batches

Achieved better results due to the efficiency and the ability to perform more training
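A stateless LSTM model in Keras might look like the sketch below. The slides do not give the architecture, so the single LSTM layer feeding one dense output, the loss, and the optimizer are assumptions, as is the function name. The code targets a recent Keras release rather than the 2017 Theano setup.

```python
def build_model(look_back, n_cells):
    """Build a stateless single-layer LSTM for one-step-ahead prediction."""
    # Imported here so the sketch can be read without Keras installed.
    from keras.layers import Dense, Input, LSTM
    from keras.models import Sequential

    model = Sequential([
        Input(shape=(look_back, 1)),  # look_back time steps, one feature each
        LSTM(n_cells),                # stateful=False is the Keras default
        Dense(1),                     # next value of the series
    ])
    model.compile(loss="mean_squared_error", optimizer="adam")
    return model
```

Because the layer is stateless, the samples in a mini-batch are independent of one another, so `model.fit` can be called with inputs of shape (samples, look_back, 1) and a `batch_size` greater than 1, letting the GPU process many windows at once.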


GENE REPRESENTATION

Initially started with a decimal representation of the parameters, but mutations and crossovers tended to settle on local (non-global) minima

Changed to a binary representation to help solve this issue

Each integer parameter is represented as the sum of a sequence of 0/1 bits

The length of each sequence is determined by a user-selected theoretical maximum for that parameter

[Figure: example gene 1 1 1 0 1 0 0 1 0 1, divided into a Look Back segment and a Cells segment]
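Decoding a gene in which each integer parameter is the sum of its bits can be sketched as follows. The slide does not show where the gene is divided between Look Back and Cells, so the 5/5 split is an assumption, and the function name is hypothetical.

```python
def decode_gene(gene, segment_lengths):
    """Decode a 0/1 gene into one integer per segment by summing its bits."""
    values, start = [], 0
    for length in segment_lengths:
        values.append(sum(gene[start:start + length]))
        start += length
    return values

gene = [1, 1, 1, 0, 1, 0, 0, 1, 0, 1]
# Assumed split: first five bits for look back, last five for cells.
look_back, n_cells = decode_gene(gene, segment_lengths=[5, 5])
print(look_back, n_cells)  # 4 2
```

With this encoding a single bit flip changes a parameter by exactly 1, and a segment of length M can represent any value from 0 to M, which is why the segment length follows from the chosen maximum. A decoded value of 0 would need separate handling before it is used as a network size.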


GENETIC ALGORITHM FITNESS FUNCTION

Low training loss did not translate into accurate recursive prediction

Using the RMSE on the test data as the fitness produced better results


RESULTS


HAND TUNED PARAMETERS

RMSE: 0.192


GENETIC ALGORITHM PARAMETERS

RMSE: 0.099


FEMALE BIRTHS

RMSE: 0.16


PERFORMANCE

[Figure: bar chart of training time speed-up. CPU: 1x; GPU: 3.2x]


FUTURE WORK

Multi-GPU Parallelism

  Each solution in the population can be trained independently

  Training time depends on the parameters, so the speed-up will be sub-linear

Apply to General Neural Networks


CONCLUSION

Genetic algorithm chose parameters for our LSTM network

Produced better results than our hand tuning

Would be useful for individuals who lack experience selecting parameters

Requires further parallelization to be feasible for larger network parameter spaces

SPECIAL THANKS

Alex Lu (Junior Software Engineer)


CONTACT INFORMATION

Joe Schneible, Research Scientist

For more information: http://technicacorp.com/innovation

Technica Corporation, 22970 Indian Creek Dr., Suite 500, Dulles, VA 20166

703.662.2000

technicacorp.com