
Page 1

Towards building Intelligent Machines that we can communicate with

Tomas Mikolov, Facebook
Talk at Text, Speech and Dialogue (TSD), 2017

Page 2

Introduction

• Great progress in machine learning has produced wonderful applications:

• Robust speech recognizers

• Automatic machine translation

• Search, ranking, spam filters, …

• Many of these are today part of real-world applications

• What are the next goals for researchers?

Page 3

Introduction

• Despite all this progress, we are still very far from having ‘intelligent machines’

• We do not have datasets that could be used to build such machines

• We have not even agreed on the metrics of success – how should machine intelligence be defined?

Page 4

Talk Overview

• Where we are:

• Distributed representations

• Recurrent networks

• Where we are going:

• Learning of complex patterns

• Incremental learning, long term memory

• Learning to learn (learning without supervision)

• Virtual environments as datasets for building AI

Page 5

Distributed representations of words

• Vector representation of words computed using neural networks

• Linguistic regularities in the word vector space

• Word2vec, fastText

Page 6

Word vectors

• Each word is associated with a real valued vector in N-dimensional space (usually N = 50 – 1000)

• The word vectors have similar properties to word classes (similar words have similar vector representations)

• Often computed using various types of neural networks

Page 7

Word vectors

• These word vectors can be subsequently used as features in many NLP tasks (Collobert et al, 2011)

• As word vectors can be trained on huge text datasets, they provide generalization for systems trained with a limited amount of supervised data

Page 8

Word vectors

• Many neural architectures have been proposed for training the word vectors, usually using several hidden layers

• We need a way to compare word vectors trained using different architectures

Page 9

Linguistic regularities in vector space

• We can do a nearest-neighbor search around the result of the vector operation “king – man + woman” and obtain “queen” (see the sketch below)

Linguistic regularities in continuous space word representations (Mikolov et al, 2013)
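A minimal sketch of this search with toy vectors (the vocabulary, dimensions and random vectors here are purely illustrative; real models use vectors trained on large corpora):

import numpy as np

vocab = ["king", "man", "woman", "queen", "apple"]
vecs = np.random.default_rng(0).normal(size=(len(vocab), 50))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)   # unit length
idx = {w: i for i, w in enumerate(vocab)}

def analogy(a, b, c):
    # nearest neighbor to vec(b) - vec(a) + vec(c), by cosine similarity
    target = vecs[idx[b]] - vecs[idx[a]] + vecs[idx[c]]
    target /= np.linalg.norm(target)
    sims = vecs @ target
    for w in (a, b, c):                  # exclude the query words, as is standard
        sims[idx[w]] = -np.inf
    return vocab[int(np.argmax(sims))]

print(analogy("man", "king", "woman"))   # "queen" with real trained vectors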

Page 10

Word vectors – datasets for evaluation

Word-based dataset, almost 20K questions, focuses on both syntax and semantics:

• Athens:Greece Oslo: ___

• Angola:kwanza Iran: ___

• brother:sister grandson: ___

• possibly:impossibly ethical: ___

• walking:walked swimming: ___

Efficient estimation of word representations in vector space (Mikolov et al, 2013)

Page 11

Word vectors – Bengio architecture

• Neural-net-based word vectors were traditionally trained as part of a neural network language model (Bengio et al, 2003)

• Training on <1M words took days

Page 12

Word vectors – word2vec architectures

• The ‘continuous bag-of-words’ model (CBOW) adds inputs from words within a short window to predict the current word

• No hidden layer, no matrix multiplications

• The weights for different positions are shared

• Computationally much more efficient than the n-gram NNLM of Bengio et al (2003)

Page 13

Word vectors – word2vec architectures

• Predict the surrounding words using the current word

• This architecture is called the ‘skip-gram NNLM’

• Performance similar to the CBOW model

Page 14

Word vectors - training

• Stochastic gradient descent + backpropagation

• Efficient solutions to the very large softmax – its size equals the vocabulary size, which can easily be in the order of millions (too many outputs to evaluate):

• Hierarchical softmax: class-based approach

• Negative sampling: 1-against-all loss instead of softmax
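For illustration, a minimal numpy sketch of one skip-gram update with negative sampling (toy sizes and uniform negative sampling are simplifications; word2vec itself is an optimized C implementation that samples negatives from a smoothed unigram distribution):

import numpy as np

rng = np.random.default_rng(0)
V, D, K, lr = 1000, 100, 5, 0.025      # vocabulary size, dimension, negatives, learning rate
W_in = rng.normal(0.0, 0.1, (V, D))    # input (word) vectors
W_out = np.zeros((V, D))               # output (context) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgns_step(center, context):
    # one positive target plus K sampled negatives
    targets = np.concatenate(([context], rng.integers(0, V, K)))
    labels = np.zeros(K + 1)
    labels[0] = 1.0
    h = W_in[center]
    err = sigmoid(W_out[targets] @ h) - labels   # gradient of the logistic loss
    grad_h = err @ W_out[targets]                # accumulate before updating W_out
    W_out[targets] -= lr * np.outer(err, h)
    W_in[center] -= lr * grad_h

sgns_step(center=3, context=17)   # hypothetical word indices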

Page 15

Word vectors – comparison of performance (2013)

• Google 20K questions dataset (word based, both syntax and semantics)

• Almost all models are trained on different datasets

Page 16

Word vectors – more analogies

Page 17

Word vectors – visualization using PCA

Page 18

Beyond word2vec: GloVe?

• GloVe is a well-known reimplementation of word2vec from the Stanford NLP group

• The most important modification is to first count word co-occurrences, and to perform dimensionality reduction in a second step (sketched below)

• Claims superior performance to word2vec
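The count-based, two-step recipe can be sketched as follows (a toy corpus, with plain truncated SVD standing in for GloVe's weighted least-squares fit):

import numpy as np

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V, window = len(vocab), 2

# Step 1: accumulate co-occurrence counts within a symmetric window.
C = np.zeros((V, V))
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            C[idx[w], idx[corpus[j]]] += 1

# Step 2: reduce the (log-)count matrix to dense vectors via truncated SVD.
U, S, _ = np.linalg.svd(np.log1p(C))
word_vectors = U[:, :2] * S[:2]      # 2 dimensions are enough for a toy corpus
print(dict(zip(vocab, np.round(word_vectors, 2))))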

Page 19

Beyond word2vec: GloVe?

• GloVe: Global Vectors for Word Representation (Pennington, Socher, Manning, 2014):

Model   Dim.   Training size (tokens)   Accuracy
CBOW    1000   6B                       63%
SG      1000   6B                       65%
SVD-L   300    42B                      49%
GloVe   300    42B                      75%

Page 20

Beyond word2vec: GloVe?

• GloVe: Global Vectors for Word Representation (Pennington, Socher, Manning, 2014):

• … comparing quality of machine learning techniques when training on different datasets is not recommended…

Page 21

Beyond word2vec: GloVe?

• The word2vec package includes the script ‘demo-train-big-model-v1.sh’:

• Achieves 78% accuracy, higher than any of the GloVe models

• Anyone can reproduce the results; it uses only public data

• Published a long time before the GloVe project

• Results further analyzed in “Improving Distributional Similarity with Lessons Learned from Word Embeddings” (Levy, Goldberg, Dagan, 2015):

• When trained on the same dataset, GloVe is found to be slower to train, to need much more memory, and to produce vectors of lower quality than word2vec

Page 22

Beyond word2vec:

• We can further improve the accuracy of word vectors by adding subword information

• Can be achieved by using the character n-grams as additional inputs

• Helps especially for morphologically rich languages, and can form representations for out-of-vocabulary words
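A minimal sketch of the subword idea, roughly following the fastText paper: a word is represented by its character n-grams (with boundary markers) plus the word itself, and its vector is the sum of their embeddings. The table size and hashing here are simplified stand-ins:

import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    # character n-grams with < and > marking the word boundaries,
    # plus the full word itself as one extra unit
    w = "<" + word + ">"
    grams = [w[i:i + n] for n in range(n_min, n_max + 1)
                        for i in range(len(w) - n + 1)]
    return grams + [w]

# Each n-gram owns a row in an embedding table; the word vector is the sum.
# (fastText hashes n-grams into ~2M buckets; this toy table is much smaller,
#  and Python's built-in hash stands in for fastText's hashing function.)
table = np.random.default_rng(0).normal(0.0, 0.1, (1000, 100))

def word_vector(word):
    rows = [hash(g) % len(table) for g in char_ngrams(word)]
    return table[rows].sum(axis=0)

print(word_vector("unhappiness")[:5])   # works even for out-of-vocabulary words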

Page 23

Beyond word2vec:

• https://github.com/facebookresearch/fastText

• Enriching Word Vectors with Subword Information (Bojanowski, Grave, Joulin, Mikolov, 2017)

                    word2vec   fastText
Czech Semantic      26%        28%
Czech Syntactic     53%        78%
German Semantic     67%        62%
German Syntactic    45%        56%
English Semantic    79%        78%
English Syntactic   70%        75%

Page 24

Beyond word2vec:

• When word2vec / fastText and GloVe are trained on comparable corpora, it is clear which algorithms are superior:

• Models available at fasttext.cc

• Paper with details will be published later this year

                                      Semantic   Syntactic   Total
GloVe: Wikipedia + Gigaword, 300d     78%        67%         72%
word2vec: Wikipedia + News, 300d      91%        84%         87%
fastText: Wikipedia + News, 300d      88%        88%         88%

Page 25

Beyond word2vec:

• fastText also implements a supervised mode: like the CBOW architecture, but predicting a label instead of the middle word

• Comparable accuracy to deep learning models (bidirectional LSTMs, CNNs), but trains in seconds where deep learning models require days to weeks

• Bag of tricks for efficient text classification (Joulin, Grave, Bojanowski, Mikolov, 2017)

• Can use pre-trained features to build classifiers when only a limited number of training examples is available
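For example, the supervised mode can be driven from the command line roughly as follows (the file names are placeholders; training data has one example per line, with labels prefixed by __label__):

./fasttext supervised -input train.txt -output model
./fasttext test model.bin test.txt
./fasttext predict model.bin test.txt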

Page 26

Distributed word representations: summary

• Simple models seem to be sufficient: no need for every neural net to be deep

• Large text corpora are crucial for good performance

• Open-source packages: word2vec, GloVe, fastText

• word2vec & fastText are superior to GloVe when trained on the same data, and are faster and more memory efficient

Page 27

Recurrent Networks and Beyond

• Recent success of recurrent networks

• Explore limitations of recurrent networks

• Discuss what needs to be done to build machines that can understand language

Page 28

Simple RNN Architecture

• Input layer, hidden layer with recurrent connections, and the output layer

• In theory, the hidden layer can learn to represent unlimited memory

• Also called the Elman network (Finding structure in time, Elman, 1990)
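A sketch of the forward pass with toy dimensions (the sigmoid hidden layer follows the original RNNLM; the weight scales and sizes here are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
D_in, D_h, D_out = 50, 100, 50              # toy layer sizes
W_xh = rng.normal(0.0, 0.1, (D_h, D_in))    # input -> hidden
W_hh = rng.normal(0.0, 0.1, (D_h, D_h))     # recurrent hidden -> hidden
W_hy = rng.normal(0.0, 0.1, (D_out, D_h))   # hidden -> output

def elman_step(x, h_prev):
    # the hidden state mixes the current input with the previous state;
    # in a language model the output would be followed by a softmax
    h = 1.0 / (1.0 + np.exp(-(W_xh @ x + W_hh @ h_prev)))
    y = W_hy @ h
    return h, y

h = np.zeros(D_h)
for x in rng.normal(size=(10, D_in)):       # a sequence of 10 input vectors
    h, y = elman_step(x, h)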

Page 29

Brief History of Recurrent Nets – 1990s to 2010

• After the initial excitement, recurrent nets vanished from mainstream research

• Despite being theoretically powerful models, RNNs were mostly considered too unstable to train

Page 30

Brief History of Recurrent Nets – 2010 to today

• In 2010 – 2012, it was shown that RNNs can significantly improve the state of the art in:

• language modeling

• machine translation

• data compression

• speech recognition

• RNNLM toolkit was published (used at Microsoft Research, Google, IBM, Facebook, Yandex, …)

• The key novel trick in RNNLM was trivial: clipping gradients to prevent training instability
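The trick itself fits in a few lines (a sketch; the threshold values are illustrative hyperparameters, and norm-based rescaling is a common alternative to the element-wise clipping used in RNNLM):

import numpy as np

def clip_gradient(grad, threshold=15.0):
    # element-wise clipping: components outside [-threshold, threshold]
    # are truncated, which keeps exploding gradients from derailing SGD
    return np.clip(grad, -threshold, threshold)

def clip_by_norm(grad, max_norm=5.0):
    # common alternative: rescale the whole gradient vector if it is too long
    norm = np.linalg.norm(grad)
    return grad * (max_norm / norm) if norm > max_norm else grad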

Page 31

Brief History of RNNLMs – 2010 - today

• Breakthrough result in 2011: 11% WER reduction over a large system from IBM (NIST RT04)

• Ensemble of big RNNLM models trained on a lot of data

Page 32

Brief History of RNNLMs – 2010 - today

• RNNs became much more accessible through open-source toolkits:

• Theano

• Torch

• TensorFlow

• …

• Training on GPUs allowed further scaling up (billions of words, thousands of hidden neurons)

Page 33

Recurrent Nets Today

• Widely applied:

• ASR (both acoustic and language models)

• MT (language & translation & alignment models, joint models, end-to-end)

• Many NLP applications

• Video modeling, handwriting recognition, user intent prediction, …

• Downside: for many problems RNNs are too powerful, models are becoming unnecessarily complex

• Complex RNN architectures are popular (LSTM, GRU), though there are simpler tricks for adding longer short-term memory to RNNs (Learning longer memory in recurrent neural networks, 2014)

Page 34

Beyond Deep Learning

• Going beyond: what can RNNs and deep networks not model efficiently?

• Surprisingly simple patterns! For example, memorization of a variable-length sequence of symbols

• Thus, these models cannot deal efficiently with novel words

Page 35

Beyond Deep Learning: Algorithmic Patterns

• Many complex patterns have short, finite description length in natural language (or in any Turing-complete computational system)

• We call such patterns Algorithmic patterns

• Examples of algorithmic patterns: a^n b^n, sequence memorization, addition of numbers learned from examples (see the sketch below)

• These patterns often cannot be learned with standard deep learning techniques (i.e. just by using many hidden layers)
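For example, the a^n b^n pattern has a one-line generative description; a sketch of the kind of data such models are trained on:

import random

def anbn_example(max_n=10):
    # one string from the a^n b^n language: n 'a's followed by n 'b's
    n = random.randint(1, max_n)
    return "a" * n + "b" * n

print([anbn_example() for _ in range(5)])   # e.g. ['aabb', 'ab', 'aaabbb', ...]
# A model sees such strings symbol by symbol and must learn that exactly
# as many 'b's follow as there were 'a's, including for unseen values of n.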

Page 36

Stack RNN

• Learns algorithms from examples

• Adds structured memory to the RNN:

• Trainable [read/write]

• Unbounded

• Actions: PUSH / POP / NO-OP

• Examples of memory structures: stacks, lists, queues, tapes, grids, …
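A sketch of the core mechanism: the stack is updated with a soft mixture of the PUSH / POP / NO-OP actions predicted by the RNN, so the whole memory stays differentiable (dimensions and the action-prediction network are omitted; names are illustrative):

import numpy as np

def stack_update(stack, action_probs, push_value):
    # soft update: action_probs = (p_push, p_pop, p_noop), produced by the RNN;
    # stack is a 1-D array whose element 0 is the top
    pushed = np.roll(stack, 1)
    pushed[0] = push_value        # push: everything moves one slot down
    popped = np.roll(stack, -1)
    popped[-1] = 0.0              # pop: everything moves one slot up
    p_push, p_pop, p_noop = action_probs
    return p_push * pushed + p_pop * popped + p_noop * stack

stack = np.zeros(8)
stack = stack_update(stack, (0.9, 0.05, 0.05), push_value=1.0)
print(stack[:3])   # the top now mostly holds the pushed value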

Page 37

Algorithmic Patterns

• Examples of simple algorithmic patterns generated by short programs (grammars)

• The goal is to learn these patterns without supervision, just by observing the example sequences

Page 38

Algorithmic Patterns - Counting

• Performance on simple counting tasks

• RNN with sigmoidal activation function cannot count

• Stack-RNN and LSTM can count

Page 39

Algorithmic Patterns - Sequences

• Sequence memorization is not learnable by LSTM

• The expandable memory of stacks allows learning a solution that generalizes

Page 40

Stack RNNs: summary

The good:

• Turing-complete model of computation (with two or more stacks)

• Learns some algorithmic patterns

• Has long term memory

• Simple model that works for some problems that break RNNs and LSTMs

• Reproducible: https://github.com/facebook/Stack-RNN

The bad:

• The long term memory is used only to store partial computation (i.e. learned skills are not stored there yet)

• Does not seem to be a good model for incremental learning

• Stacks do not seem to be a very general choice for the topology of the memory

Page 41

Beyond Stack-RNNs

• Many different types of memories have been recently proposed

• More complex tasks have been learned: binary multiplication with very big numbers (thousands of digits)

• One can argue that learning to solve these problems has limited practical value, especially when the model and task are designed together

Page 42

Towards strong AI

• It may be good to first choose the important tasks we want to solve in the long term

• Then, limit the complexity gradually until we obtain solvable tasks

• And finally develop machine learning techniques that can solve the more complex tasks

• Our plan is described in ‘A Roadmap towards Machine Intelligence’ (2015)

Page 43

The Goal for “useful general AI”

• General AI research is currently very popular…

• But different people see AI as something very different (Image recognition? Machine translation? Data compression?)

• For this talk, we assume “useful AI” is an artificial machine (computer) capable of helping human users solve a wide range of tasks, in a similar way as other humans can

Page 44

Useful General AI

• We attempted to identify crucial components of useful AI:

• Ability to perform tasks for human users

• Ability to learn and improve

• Ability to communicate

Page 45

Additional components

• The previous design can be further extended by adding:

• Grounding (virtual worlds, 2D / 3D)

• Multi-modal perception (vision, audio, …)

• Communication between AIs

• …

• However, we believe that developing even the simplest general AI is very complex, and thus we start with just the necessary components

Page 46

The Roadmap to AI

• We want to develop a machine that can learn to perform novel tasks for us through natural communication - example:

Me: Can you check the weather every day before I go to work to see if it is going to rain, so that I don’t forget to bring an umbrella?

Machine: But how do I do that?

Me: go to a search engine and enter the query ‘weather new york’

… (some morning, a week later) …

Machine: it will rain today!

Page 47

The Roadmap to AI

• The existing machine learning techniques seem to be insufficient for this goal: deep (recurrent, convolutional) neural networks have excellent performance on supervised tasks, but much future development is needed in:

• Unsupervised / reward-based learning

• Compositional and incremental learning

• Long term memory

• Currently, there is no standard dataset focusing on teaching machines to communicate in natural language while addressing these research problems

Page 48

CommAI-env

• CommAI-env is an open-source virtual environment focused on learning to communicate

• Published together with a set of very simple (but still probably unsolvable by current methods) communication tasks

CommAI: Evaluating the first steps towards a useful general AI(Baroni et al, 2017)

https://github.com/facebookresearch/CommAI-env

Page 49

CommAI tasks

• Currently there are several existing datasets, of varying degrees of complexity

• Example of a basic task:

Teacher: repeat after me: AABBB

Learner: AAB

Teacher: wrong, the correct answer is: AABBB

Page 50

CommAI tasks

• Currently there are several existing datasets, of varying degrees of complexity

• Example of a basic task:

Teacher: repeat after me: BBAA

Learner: BBAA

Teacher: good! [reward +1]

Learning to repeat a word (or, more generally, a sequence of symbols) has already been shown to be very challenging for RNNs.

Page 51

CommAI tasks

• The tasks are closely related and share some common structure through natural language

• Example of a task that re-uses previous knowledge:

Teacher: repeat twice after me: ABA

Learner: ABAABA

Teacher: good! [reward +1]

It should be much faster to learn the ‘repeat twice’ task after the Learner already knows how to repeat a sequence once.
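A toy version of the teacher/learner loop for the repeat task (an illustrative sketch only, not the CommAI-env API; the real environment streams text and rewards incrementally and keeps score across tasks):

import random

def repeat_task(learner, episodes=3):
    # toy teacher: ask the learner to repeat a string, reward exact matches
    reward = 0
    for _ in range(episodes):
        target = "".join(random.choice("AB") for _ in range(random.randint(2, 5)))
        answer = learner("repeat after me: " + target)
        if answer == target:
            print("Teacher: good! [reward +1]")
            reward += 1
        else:
            print("Teacher: wrong, the correct answer is: " + target)
    return reward

# a 'cheating' learner that already knows how to parse the instruction:
repeat_task(lambda message: message.split(": ")[-1])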

Page 52

Future of general AI research

• We need standard datasets and metrics to compare various attempts to solve communication-based general AI

• CommAI-env is a prototype of a communication-based environment, and the General AI Challenge from GoodAI is an example of a standard dataset defined together with the metrics of success: https://www.general-ai-challenge.org/

• The objective function should reflect learning speed: we aim to build Learners that can learn to perform novel tasks from as few examples as possible

Page 53

Open research problems

Unsupervised / reward-based learning:

• How can the machine “learn to learn”: a knowledgeable Learner should modify its own behavior even when no explicit reward signal is present

• Learner should be able to memorize new facts and abilities without explicit instructions

Page 54

Open research problems

Compositional and incremental learning:

• How can the Learner build new skills by re-using existing skills, i.e. without learning a solution to every new problem from scratch?

Page 55

Open research problems

Long term memory:

• What should be the structure of the long term memory? How should it be updated?

• All these research problems seem closely related. One can probably not solve compositional and incremental learning without some way to form persistent long term memory.

Page 56

Conclusion

• We may want to be more goal-oriented when we talk about strong or general AI

• Communication seems to be necessary for useful strong AI

• To achieve progress, researchers need to have standard datasets to compare various approaches, and incentives to work on very difficult unsolved problems

• https://research.fb.com/commai-fellowships-and-visiting-researcher-programs/
