CS 330
Lifelong Learning
Logistics
Project milestone due Wednesday.
Two guest lectures next week!
Jeff Clune, Sergey Levine
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement from the meta-learning perspective
A brief review of problem statements.
Multi-Task Learning
Learn to solve a set of tasks.
learn tasks -> perform tasks
Meta-Learning
Given an i.i.d. task distribution, learn a new task efficiently.
learn to learn tasks -> quickly learn new task
In contrast, many real-world settings look like:
[Diagram: multi-task learning (learn tasks, perform tasks) and meta-learning (learn to learn tasks, quickly learn new task) interleaved along a time axis]
Some examples:
- a student learning concepts in school
- a deployed image classification system learning from a stream of images from users
- a robot acquiring an increasingly large set of skills in different environments
- a virtual assistant learning to help different users with different tasks at different points in time
- a doctor’s assistant aiding in medical decision-making
Our agents may not be given a large batch of data/tasks right off the bat!
Some Terminology
Sequential learning settings: online learning, lifelong learning, continual learning, incremental learning, streaming data
distinct from sequence data and sequential decision-making
What is the lifelong learning problem statement?
Exercise:
1. Pick an example setting.
2. Discuss problem statement with your neighbor:
(a) how would you set up an experiment to develop & test your algorithm?
(b) what are desirable/required properties of the algorithm?
(c) how do you evaluate such a system?
Example settings:
A. a student learning concepts in school
B. a deployed image classification system learning from a stream of images from users
C. a robot acquiring an increasingly large set of skills in different environments
D. a virtual assistant learning to help different users with different tasks at different points in time
E. a doctor’s assistant aiding in medical decision-making
What is the lifelong learning problem statement?
Problem variations:
- task/data order: i.i.d. vs. predictable vs. curriculum vs. adversarial
- discrete task boundaries vs. continuous shifts (vs. both)
- known task boundaries/shifts vs. unknown
Some considerations:
- model performance
- data efficiency
- computational resources
- memory
- others: privacy, interpretability, fairness, test time compute & memory
Substantial variety in problem statement!
What is the lifelong learning problem statement?
General [supervised] online learning problem:
for $t = 1, \dots, n$
    observe $x_t$    <- if observable task boundaries: observe $x_t, z_t$
    predict $\hat{y}_t$
    observe label $y_t$
i.i.d. setting: $x_t \sim p(x)$, $y_t \sim p(y \mid x)$
    ($p$ is not a function of $t$)
otherwise: $x_t \sim p_t(x)$, $y_t \sim p_t(y \mid x)$
streaming setting: cannot store $(x_t, y_t)$
- lack of memory
- lack of computational resources
- privacy considerations
- want to study neural memory mechanisms
true in some cases, but not in many cases!
- recall: replay buffers
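The online learning protocol above can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the scalar model, the squared loss, the learning rate, and the names `run_online`, `ScalarSGD`, and `make_stream` are all assumptions chosen to keep the example small.

```python
import random

def run_online(stream, predict, update):
    """Generic supervised online loop: observe x_t, predict, then observe y_t."""
    losses = []
    for x_t, y_t in stream:                 # y_t is only used after the prediction
        y_hat = predict(x_t)                # predict y_hat_t
        losses.append((y_hat - y_t) ** 2)   # squared loss once the label is revealed
        update(x_t, y_t)                    # learner updates; nothing is stored here
    return losses

class ScalarSGD:
    """Hypothetical 1-D model y = w * x, trained with one SGD step per example."""
    def __init__(self, lr=0.1):
        self.w, self.lr = 0.0, lr

    def predict(self, x):
        return self.w * x

    def update(self, x, y):
        grad = 2.0 * (self.w * x - y) * x   # d/dw of (w*x - y)^2
        self.w -= self.lr * grad

def make_stream(n, slope_at, seed=0):
    """x_t ~ Uniform(-1, 1), y_t = slope_at(t) * x_t; a constant slope is the i.i.d. case."""
    rng = random.Random(seed)
    for t in range(1, n + 1):
        x = rng.uniform(-1.0, 1.0)
        yield x, slope_at(t) * x

learner = ScalarSGD()
iid_losses = run_online(make_stream(500, lambda t: 2.0), learner.predict, learner.update)
```

Passing a `slope_at` that changes with `t` (for example `lambda t: 2.0 if t <= 250 else -1.0`) gives the non-i.i.d. case where $p_t$ depends on $t$.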
What do you want from your lifelong learning algorithm?
minimal regret (that grows slowly with $t$)
regret: cumulative loss of learner minus cumulative loss of best learner in hindsight
$$\mathrm{Regret}_T := \sum_{t=1}^{T} \mathcal{L}_t(\theta_t) - \min_{\theta} \sum_{t=1}^{T} \mathcal{L}_t(\theta)$$
(cannot be evaluated in practice, useful for analysis)
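To make the regret definition concrete, here is a toy calculation. It assumes a scalar parameter and the loss $\mathcal{L}_t(\theta) = (\theta - y_t)^2$, for which the best fixed parameter in hindsight is the mean of the labels; the function name `static_regret` and the label sequence are illustrative only.

```python
def static_regret(ys, thetas):
    """Regret_T for losses L_t(theta) = (theta - y_t)^2 with a scalar theta.

    thetas[t] is the parameter the learner used at round t.
    """
    learner_loss = sum((th - y) ** 2 for th, y in zip(thetas, ys))
    best = sum(ys) / len(ys)                       # minimizer of the summed squared loss
    hindsight_loss = sum((best - y) ** 2 for y in ys)
    return learner_loss - hindsight_loss

ys = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0]

# Learner A never updates: theta_t = 0 at every round.
frozen = [0.0] * len(ys)

# Learner B predicts the mean of the labels seen so far (0 before any data).
running, seen = [], []
for y in ys:
    running.append(sum(seen) / len(seen) if seen else 0.0)
    seen.append(y)

regret_frozen = static_regret(ys, frozen)
regret_running = static_regret(ys, running)
```

On this sequence the learner that updates accumulates less regret than the one that never does, which is the comparison the definition is meant to capture.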
Regret that grows linearly in $t$ is trivial. Why?
What do you want from your lifelong learning algorithm?
positive & negative transfer
positive forward transfer: previous tasks cause you to do better on future tasks, compared to learning future tasks from scratch
positive backward transfer: current tasks cause you to do better on previous tasks, compared to learning past tasks from scratch
positive -> negative : better -> worse
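One common way to put numbers on forward and backward transfer is through an accuracy matrix, as in the metrics of the Gradient Episodic Memory paper cited later in these slides. The sketch below follows those definitions as best recalled and makes two assumptions: `R[i][j]` is test accuracy on task `j` after training through task `i`, and `b[j]` is the accuracy of an untrained model on task `j`. That baseline differs from the "from scratch" comparison in the slide text, and the numbers are invented for illustration.

```python
def backward_transfer(R):
    """Average change in accuracy on earlier tasks after all T tasks are learned."""
    T = len(R)
    return sum(R[T - 1][i] - R[i][i] for i in range(T - 1)) / (T - 1)

def forward_transfer(R, b):
    """Average accuracy on a task just before training on it, relative to baseline b."""
    T = len(R)
    return sum(R[i - 1][i] - b[i] for i in range(1, T)) / (T - 1)

# Hypothetical 3-task accuracy matrix (rows: after training task i; columns: task j).
R = [
    [0.90, 0.20, 0.10],
    [0.70, 0.85, 0.30],
    [0.60, 0.75, 0.80],
]
b = [0.10, 0.10, 0.10]   # hypothetical accuracy of an untrained model

bwt = backward_transfer(R)   # negative here: earlier tasks got worse (forgetting)
fwt = forward_transfer(R, b) # positive here: earlier training helped later tasks
```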
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement from the meta-learning perspective
Approaches
Store all the data you’ve seen so far, and train on it. -> follow the leader algorithm
+ will achieve very strong performance
- computation intensive -> Continuous fine-tuning can help.
- can be memory intensive [depends on the application]
Take a gradient step on the datapoint you observe. -> stochastic gradient descent
+ computationally cheap
+ requires 0 memory
- subject to negative backward transfer: “forgetting”, sometimes referred to as catastrophic forgetting
- slow learning
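The trade-offs between the two basic approaches can be seen on a toy problem. This sketch assumes a noiseless 1-D regression stream and a closed-form least-squares refit for follow the leader; the class names and hyperparameters are illustrative, not from the lecture.

```python
import random

class FollowTheLeader:
    """Stores every example and refits on all of them (1-D least squares)."""
    def __init__(self):
        self.data, self.w = [], 0.0

    def update(self, x, y):
        self.data.append((x, y))                   # memory grows with t
        sxx = sum(a * a for a, _ in self.data)     # refit cost grows with t
        sxy = sum(a * b for a, b in self.data)
        self.w = sxy / sxx if sxx > 0 else 0.0

class OnlineSGD:
    """Keeps only the current weight; one gradient step per observed datapoint."""
    def __init__(self, lr=0.1):
        self.w, self.lr = 0.0, lr

    def update(self, x, y):
        self.w -= self.lr * 2.0 * (self.w * x - y) * x

rng = random.Random(0)
ftl, sgd = FollowTheLeader(), OnlineSGD()

# Phase 1: 20 examples from y = 2x.
for _ in range(20):
    x = rng.uniform(-1.0, 1.0)
    ftl.update(x, 2.0 * x)
    sgd.update(x, 2.0 * x)
sgd_w_before_shift = sgd.w   # still short of 2.0: slow learning

# Phase 2: the distribution shifts to y = -x and SGD keeps taking steps.
for _ in range(300):
    x = rng.uniform(-1.0, 1.0)
    sgd.update(x, -1.0 * x)
# sgd.w now fits the new slope, so its fit to the phase-1 data is gone (forgetting).
```

Follow the leader recovers the phase-1 slope almost immediately but keeps all 20 examples, while SGD keeps nothing, learns more slowly, and overwrites what it learned once the data shifts.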
Can we do better?
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement from the meta-learning perspective
Case Study: Can we use meta-learning to accelerate online learning?
Recall: model-based meta-RL
online adaptation = few-shot learning
tasks are temporal slices of experience
[Timeline: gradual terrain change, motor malfunction]
example online learning problem
[Timeline: gradual terrain change, motor malfunction, icy terrain]
k time steps not sufficient to learn entirely new terrain
Continue to run SGD?
+ will be fast with MAML initialization
- what if ice goes away? (subject to forgetting)
Nagabandi, Finn, Levine. Deep Online Learning via Meta-Learning. ICLR ’19
Online inference problem: infer latent “task” variable at each time step
Mixture of neural networks over task variable $T$, adapted continually.
Alternate between:
E-step: Estimate latent “task” variable at each time step given data
$$P(T_t = T_i \mid x_t, y_t) \propto p_{\theta(T_i)}(y_t \mid x_t, T_t = T_i)\, P(T_t = T_i)$$
(first factor: likelihood of the data under task $T_i$; second factor: prior)
M-step: Update mixture of network parameters
gradient step on each mixture element, weighted by task probability
Note: If the neural net is randomly initialized, this procedure would be too slow.
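The alternation between the E-step and the M-step can be sketched on a toy mixture. This is only an illustration under strong assumptions: each mixture element is a scalar model $y = w x$ rather than a neural network, the likelihood is Gaussian with a fixed `sigma`, the number of elements and the prior are fixed, and the function names are hypothetical. The method in the cited paper differs in these details.

```python
import math

def e_step(models, prior, x, y, sigma=0.5):
    """Posterior over tasks: P(T_i | x, y) proportional to p(y | x, T_i) * P(T_i)."""
    lik = [math.exp(-0.5 * ((y - w * x) / sigma) ** 2) for w in models]
    unnorm = [l * p for l, p in zip(lik, prior)]
    z = sum(unnorm)
    if z == 0.0:                      # every likelihood underflowed: fall back to the prior
        return list(prior)
    return [u / z for u in unnorm]

def m_step(models, post, x, y, lr=0.1):
    """One gradient step on each mixture element, weighted by its task probability."""
    return [w - lr * p * 2.0 * (w * x - y) * x for w, p in zip(models, post)]

models = [2.0, -1.0]      # two mixture elements (hypothetical slopes)
prior = [0.5, 0.5]
x_t, y_t = 1.0, 1.9       # an observation that the first element explains well

post = e_step(models, prior, x_t, y_t)
new_models = m_step(models, post, x_t, y_t)
```

Because nearly all posterior mass lands on the first element, only that element moves noticeably; the second is left almost untouched, which is how the mixture avoids overwriting what it knows about other tasks.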
Does it work?
Crawler with crippled legs. Methods compared:
- online learning w. MAML initialization
- SGD w. MAML initialization
- MAML (always reset to prior + 1 grad step)
- model-based, no adaptation
- model-based, grad steps
(the last two use no meta-learning)
Does it work?
Crawler with crippled legs: latent task distribution during online learning
Case Study: Can we modify vanilla SGD to avoid negative backward transfer? (from scratch)
Idea:
(1) store small amount of data per task in memory
(2) when making updates for new tasks, ensure that they don’t unlearn previous tasks
How do we accomplish (2)?
Lopez-Paz & Ranzato. Gradient Episodic Memory for Continual Learning. NeurIPS ’17
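learning predictor $y_t = f_\theta(x_t, z_t)$
memory: $\mathcal{M}_k$ for task $z_k$
For $t = 0, \dots, T$
    minimize $\mathcal{L}(f_\theta(\cdot, z_t), (x_t, y_t))$
    subject to $\mathcal{L}(f_\theta, \mathcal{M}_k) \le \mathcal{L}(f_\theta^{t-1}, \mathcal{M}_k)$ for all $z_k < z_t$
(i.e. s.t. loss on previous tasks doesn’t get worse)
Assume local linearity:
$$\langle g_t, g_k \rangle := \left\langle \frac{\partial \mathcal{L}(f_\theta, (x_t, y_t))}{\partial \theta}, \frac{\partial \mathcal{L}(f_\theta, \mathcal{M}_k)}{\partial \theta} \right\rangle \ge 0 \quad \text{for all } z_k < z_t$$
Can formulate & solve as a QP.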
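With a single previous task the gradient constraint has a closed-form solution, which makes the idea easy to see. The sketch below covers only that single-constraint case and is a simplification: with several previous tasks the constraints are handled jointly by the QP mentioned above. The function names and example vectors are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project_gradient(g, g_mem):
    """Return the closest vector to g (in L2) whose inner product with g_mem is >= 0.

    g     : gradient of the loss on the current example
    g_mem : gradient of the loss on the memory of one previous task
    """
    inner = dot(g, g_mem)
    if inner >= 0.0:
        return list(g)                      # constraint already satisfied: keep g
    scale = inner / dot(g_mem, g_mem)       # g_mem is nonzero whenever inner < 0
    return [gi - scale * mi for gi, mi in zip(g, g_mem)]

g = [1.0, -1.0]          # proposed update direction for the new task
g_mem = [0.0, 1.0]       # gradient on the stored examples of an old task

g_tilde = project_gradient(g, g_mem)        # conflicting component removed
unchanged = project_gradient([1.0, 1.0], g_mem)
```

The projected gradient still makes progress on the new task along the first coordinate, while to first order it no longer increases the loss on the stored examples.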
Experiments
Problems:
- MNIST permutations
- MNIST rotations
- CIFAR-100 (5 new classes/task)
Total memory size: 5012 examples
BWT: backward transfer, FWT: forward transfer
If we take a step back… do these experimental domains make sense?
Can we meta-learn how to avoid negative backward transfer?
Javed & White. Meta-Learning Representations for Continual Learning. NeurIPS ’19
Plan for Today
The lifelong learning problem statement
Basic approaches to lifelong learning
Can we do better than the basics?
Revisiting the problem statement from the meta-learning perspective
What might be wrong with the online learning formulation?
Online Learning (Hannan ’57, Zinkevich ’03)
Perform sequence of tasks while minimizing static regret.
[Timeline: perform, perform, ..., perform; evaluated on zero-shot performance]
More realistically:
[Timeline: learn, learn, ..., learn; slow learning at first, rapid learning later]
Online Meta-Learning (Finn*, Rajeswaran*, Kakade, Levine ICML ’18)
Efficiently learn a sequence of tasks from a non-stationary distribution.
[Timeline: learn, learn, ..., learn; evaluate performance after seeing a small amount of data]
In contrast, online learning evaluates zero-shot performance.
Primarily a difference in evaluation, rather than the data stream.
The Online Meta-Learning Setting
for task $t = 1, \dots, n$
    observe $\mathcal{D}^{tr}_t$
    use update procedure $\Phi(\theta_t, \mathcal{D}^{tr}_t)$ to produce parameters $\phi_t$
    observe $x_t$
    predict $\hat{y}_t = f_{\phi_t}(x_t)$
    observe label $y_t$
(the last three steps are the standard online learning setting)
$$\mathrm{Regret}_T := \sum_{t=1}^{T} \ell_t(\Phi_t(\theta_t)) - \min_{\theta \in \Theta} \sum_{t=1}^{T} \ell_t(\Phi_t(\theta))$$
(loss of algorithm minus loss of best algorithm in hindsight)
Goal: Learning algorithm with sub-linear regret
Can we apply meta-learning in lifelong learning settings?
Recall the follow the leader (FTL) algorithm:
Store all the data you’ve seen so far, and train on it.
Follow the meta-leader (FTML) algorithm:
Store all the data you’ve seen so far, and meta-train on it.
Run update procedure on the current task.
Deploy model on current task.
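The three FTML steps can be sketched on a toy task stream. Everything specific here is an assumption made for illustration: tasks are 1-D regressions $y = a x$ with slopes near 3, the update procedure is a single MAML-style gradient step, the meta-gradient is written out by hand for this scalar model, and the step sizes and function names are hypothetical.

```python
import random

ALPHA = 0.5   # inner-loop step size (assumed)
BETA = 0.1    # meta step size (assumed)

def grad(w, data):
    """Gradient of the mean squared error of y = w * x on data."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)

def update_procedure(theta, d_train):
    """Phi(theta, D_tr): one gradient step from the meta-learned initialization."""
    return theta - ALPHA * grad(theta, d_train)

def meta_grad(theta, d_train, d_val):
    """d/dtheta of the validation loss at Phi(theta, D_tr), exact for this scalar model."""
    phi = update_procedure(theta, d_train)
    dphi_dtheta = 1.0 - ALPHA * sum(2.0 * x * x for x, _ in d_train) / len(d_train)
    return grad(phi, d_val) * dphi_dtheta

def ftml(task_stream, meta_steps=20, seed=0):
    rng = random.Random(seed)
    theta, buffer, errors = 0.0, [], []
    for d_train, d_val in task_stream:
        # Run the update procedure on the current task and deploy the result.
        phi = update_procedure(theta, d_train)
        errors.append(sum((phi * x - y) ** 2 for x, y in d_val) / len(d_val))
        # Store all data seen so far, then meta-train on it.
        buffer.append((d_train, d_val))
        for _ in range(meta_steps):
            tr, va = rng.choice(buffer)
            theta -= BETA * meta_grad(theta, tr, va)
    return theta, errors

def make_task(rng, slope, k=5):
    """k training and k evaluation points from y = slope * x."""
    pts = [(x, slope * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(2 * k))]
    return pts[:k], pts[k:]

rng = random.Random(1)
stream = [make_task(rng, 3.0 + rng.uniform(-0.2, 0.2)) for _ in range(30)]
theta, errors = ftml(stream)
```

As the buffer fills, the meta-learned initialization moves toward the region where the task slopes lie, so one gradient step on a new task's few training points gets closer to that task's solution than it did at the start of the stream.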
What meta-learning algorithms are well-suited for FTML?
What if $p_t(\mathcal{T})$ is non-stationary?
Experiments
Experiment with sequences of tasks:
- Colored, rotated, scaled MNIST
- 3D object pose prediction
- CIFAR-100 classification
Example pose prediction tasks: plane, car, chair
Experiments
Comparisons:
- TOE (train on everything): train on all data so far
- FTL (follow the leader): train on all data so far, fine-tune on current task
- From Scratch: train from scratch on each task
[Plots for Rainbow MNIST and Pose Prediction: learning efficiency (# datapoints) vs. task index; learning proficiency (error) vs. task index]
Follow The Meta-Leader learns each new task faster & with greater proficiency, approaches few-shot learning regime
Takeaways
Many flavors of lifelong learning, all under the same name.
Defining the problem statement is often the hardest part.
Meta-learning can be viewed as a slice of the lifelong learning problem.
A very open area of research.
Reminders
Project milestone due Wednesday.
Two guest lectures next week!
Jeff Clune, Sergey Levine