lld-workshop.github.io · 2017-12-14lld-workshop.github.io

Page 2

Today, training data is the biggest bottleneck in ML

Page 3

The Rise of Representation Learning

Figure by Aphex34 (Own work) [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons

Page 4

Representation Learning is Data Hungry

•  Those feature maps and transformation functions have lots of data-dependent parameters
•  Good generalization requires at least tens of thousands of labeled training examples

Page 5

Now We Need a Lot of Training Data

Hard to move beyond a few benchmarks

Page 6

Now We Need a Lot of Training Data

“Google trains these neural networks using data handcrafted by a massive team of PhD linguists…”

Proprietary data, labeled at immense cost

Page 7

Other Barriers to Curating Data

Expensive collection procedures

Page 8

Other Barriers to Curating Data

Need for domain expertise

Page 9

Other Barriers to Curating Data

Data regulation

Page 10

How Will We Feed the Next Generation of Data-Hungry ML?

Page 11

Program Stats

•  65 submissions
•  56 reviewers
•  44 accepted papers

Page 12

Topics of Accepted Papers

•  Semi-Supervised Learning
•  Weak Supervision
•  Transfer Learning
•  Representation Learning
•  Applications
•  Multi-Task Learning
•  Data Augmentation
•  Active Learning
•  Self-Training
•  Knowledge Distillation

Page 13

Invited Speakers

•  Gaël Varoquaux, 8:40 AM
•  Tom Mitchell, 9:10 AM
•  Andrew McCallum, 11:00 AM
•  Sebastian Riedel, 11:30 AM
•  Nina Balcan, 3:30 PM
•  Sameer Singh, 4:15 PM
•  Ian Goodfellow, 4:45 PM
•  Alan Ritter, 5:45 PM

Page 14

Other Program Highlights

•  Panel on limited labeled data in medical imaging, 2:00 PM
•  Award ceremony, 6:15 PM

Page 15

Information for Poster Presenters

•  Two sets of one-minute spotlights: 9:55 AM, 2:30 PM
•  Followed immediately by poster sessions
•  Can still send a spotlight slide to [email protected]

Page 16

More Information:

http://lld-workshop.github.io