Making Neural Programming Architectures Generalize via Recursion

ICLR 2017 · Katy@Datalab · 2017-02-24


Page 1

Making Neural Programming Architectures Generalize via Recursion

ICLR 2017 Katy@Datalab

Page 2

Background

• AGI: Artificial General Intelligence

Page 3

Background

• Training neural networks to synthesize robust programs from a small number of examples is a challenging task.

• The space of possible programs is extremely large, and composing a program that performs robustly on the infinite space of possible inputs is difficult.

• It is impractical to obtain enough training examples to disambiguate amongst all possible programs.

Page 4

Motivation

• Curriculum training?

• Even with curriculum training, the network may still not learn the true program semantics; as with NPI, generalization becomes poor beyond a threshold level of input complexity.

Page 5

Related Work

• Scott Reed and Nando de Freitas. Neural programmer-interpreters. ICLR, 2016.

Page 6

NPI Model

Page 7
Page 8

• The neural network learns spurious dependencies on characteristics of the training examples that are irrelevant to the true program semantics, such as the length of the training inputs, and thus fails to generalize to more complex inputs.

Page 9

Main Idea

• Explicitly incorporating recursion into neural architectures.

Page 10

Why Recursion?

• Recursion divides the problem into smaller pieces and drastically reduces the domain of each neural network component, making it tractable to prove guarantees about the overall system’s behavior.

Page 11

Why Recursion?

• By nature, recursion reduces the complexity of a problem to simpler instances.
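As a minimal illustration (not from the slides), the reduction principle looks like this in plain Python: a call on n elements handles one element itself and delegates the strictly smaller remainder to a recursive call.

```python
def sum_list(xs):
    # Base case: the simplest possible instance.
    if not xs:
        return 0
    # Reduction step: handle one element, delegate the
    # strictly smaller remainder to a recursive call.
    return xs[0] + sum_list(xs[1:])

print(sum_list([1, 2, 3, 4]))  # 10
```

Because each call only ever sees the base case or a one-step reduction, correctness of the whole follows by induction; this is the property that makes proving guarantees about a recursive system tractable.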

Page 12

Model

• Use an NPI (Neural Programmer-Interpreter)-like model, except that a program can call itself.

• Let the model learn recursive programs.

• Achieve perfect generalization.

Page 13

Partial (Tail) and Full Recursion
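The distinction can be sketched in plain Python (function names here are illustrative, not the paper's actual NPI subprograms): replacing only the outer loop of bubble sort with a tail call gives the partial (tail) variant, while also replacing the inner pass with recursion gives the fully recursive variant.

```python
def bubble_pass(xs, i=0):
    # Fully recursive inner pass: compare/swap one adjacent pair,
    # then recurse on the next position (no loop at all).
    if i >= len(xs) - 1:
        return xs
    if xs[i] > xs[i + 1]:
        xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return bubble_pass(xs, i + 1)

def bubble_sort(xs, n=None):
    # Tail-recursive outer layer: do one pass, then recurse with a
    # strictly smaller bound. Replacing only this outer loop with
    # recursion gives the "partial (tail)" variant; making the inner
    # pass recursive too (as above) gives full recursion.
    n = len(xs) if n is None else n
    if n <= 1:
        return xs
    bubble_pass(xs)
    return bubble_sort(xs, n - 1)

print(bubble_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```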

Page 14

Experiment

Page 15

Bubble sort on NPI
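The experiment can be pictured with a toy trace recorder (a loose sketch, not the actual model: the real NPI predicts each subprogram call step-by-step with an LSTM core, and the names below only roughly follow the BUBBLESORT/BUBBLE/COMPSWAP decomposition). The key point is that the non-recursive trace is an unrolled loop whose length grows with the input, which is where length-dependent spurious dependencies creep in.

```python
def npi_style_trace(xs):
    # Toy sketch of a non-recursive NPI execution trace for bubble
    # sort: both loops are unrolled, so the trace length depends on
    # len(xs). A recursive formulation would instead emit a fixed,
    # length-independent pattern of calls per recursion level.
    trace = ["BUBBLESORT"]
    n = len(xs)
    for _ in range(max(n - 1, 0)):
        trace.append("BUBBLE")        # one full pass over the array
        for i in range(n - 1):
            trace.append("COMPSWAP")  # compare/swap one adjacent pair
            if xs[i] > xs[i + 1]:
                xs[i], xs[i + 1] = xs[i + 1], xs[i]
    return xs, trace

xs, trace = npi_style_trace([3, 2, 1])
print(xs)                        # [1, 2, 3]
print(trace.count("COMPSWAP"))   # 4
```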

Page 16
Page 17

Conclusion

• A simple idea.

• Provably achieves 100% generalization.

• The trained model has learned the correct program semantics.

• Recursion is very important for neural programming architectures.

Page 18

Future Work

• Reduce the amount of supervision: train with only partial or non-recursive traces.

• Integrate a notion of recursion into the models themselves by constructing novel neural programming architectures.

Page 19

Future Work

• Apply the approach to the MNIST dataset?