Metis Presentation May 2016

ML Little Data

Vincent Tang, Lead ML Engineer

SAMSUNG ACCELERATOR

EMBEDDED ML

BIG DATA, BIG COMPUTE

STANDARD DATA PIPELINE + LEARNING

DEVICES IN THE WILD

Move compute to the data: run ML at the edge

MOVE ML TO THE EDGE

COMPARISON

              Traditional                    Embedded
Resources     MOAR GPUs                      Each thread counts; small buffers
Power         60-130 watts / server          0.18 mW for 32 bytes/second
Updates       Commit + Push                  OTA (sometimes)
Languages     Python & R FTW!                C, C++, Java
Parameters    Stationarity                   Non-stationarity
Cycle         Batch                          Online, up to 1600 Hz
Type          Supervised                     Unsupervised
Variance      “Napoleon Dynamite” problem    Unreliable sensors
Metric        arg max (accuracy)             arg max (accuracy / big-O)
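
To make the Embedded column concrete, here is a minimal sketch of an online, unsupervised learner that copes with non-stationary input in constant memory. It is not taken from the talk: the class name `OnlineAnomalyDetector`, the exponentially weighted mean/variance estimator, and the values of `alpha`, `threshold`, and `warmup` are all illustrative assumptions. It is written in Python for readability; an on-device version would more likely be C or C++.

```python
import math


class OnlineAnomalyDetector:
    """Hypothetical example: exponentially weighted mean/variance, O(1) memory."""

    def __init__(self, alpha=0.05, threshold=4.0, warmup=20):
        self.alpha = alpha          # forgetting factor (assumed value)
        self.threshold = threshold  # flag samples this many std devs from the mean
        self.warmup = warmup        # samples to observe before flagging anything
        self.mean = 0.0
        self.var = 0.0
        self.count = 0

    def update(self, x):
        """Score x against the current estimate, then fold x into the estimate."""
        self.count += 1
        if self.count == 1:
            self.mean = x  # first sample initialises the mean
            return False
        delta = x - self.mean
        std = math.sqrt(self.var)
        is_anomaly = (
            self.count > self.warmup
            and std > 0.0
            and abs(delta) > self.threshold * std
        )
        # Incremental exponentially weighted updates: old samples are
        # gradually forgotten, so the estimate follows a drifting signal.
        self.mean += self.alpha * delta
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)
        return is_anomaly
```

Each call to `update` costs a handful of floating-point operations and keeps three numbers of state per sensor channel, which is the kind of budget the "each thread counts; small buffers" row implies. The forgetting factor trades responsiveness to drift against noise in the estimate.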

PIPELINE

Acquisition (20%) Feature Engineering (60%) Learning (10%) Deploy (10%)

DEEP NETS

Acquisition (20%) Feature Engineering (60%) Learning (10%) Deploy (10%)

Feature Engineering & Learning for the price of one!

PIPELINE

Acquisition (20%) Feature Engineering (60%) Learning (10%) Deploy (10%)

Tighter Feedback & Cleaner Code!

ADVICE FOR PRACTITIONERS

● More data > smarter algorithm
● Start with simple learners, then increase complexity as needed
● Cast a wide net, then prune
● Reject hypotheses early and often
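
The "start with simple learners" advice can be sketched as a loop that tries learners from cheapest to costliest and stops at the first one that meets an accuracy target on held-out data. This is an illustration under stated assumptions, not the speaker's code: the two toy learners, the function names, and the `target` parameter are all hypothetical, and the data is assumed to be one numeric feature with 0/1 labels.

```python
def accuracy(predict, data):
    """Fraction of (x, y) pairs that predict() labels correctly."""
    return sum(predict(x) == y for x, y in data) / len(data)


def majority_learner(train):
    """Simplest baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    guess = max(set(labels), key=labels.count)
    return lambda x: guess


def stump_learner(train):
    """One-feature threshold rule (decision stump) picked by training accuracy."""
    best = None
    for t in sorted({x for x, _ in train}):
        for hi in (0, 1):
            # Bind t and hi as defaults so each candidate keeps its own values.
            def predict(x, t=t, hi=hi):
                return hi if x >= t else 1 - hi
            acc = accuracy(predict, train)
            if best is None or acc > best[0]:
                best = (acc, predict)
    return best[1]


def pick_simplest(learners, train, holdout, target):
    """Try learners in order of increasing complexity; stop once target is met."""
    tried = []
    for name, fit in learners:
        acc = accuracy(fit(train), holdout)
        tried.append((name, acc))
        if acc >= target:
            return name, acc, tried
    # Nothing met the target: fall back to the best learner seen.
    name, acc = max(tried, key=lambda r: r[1])
    return name, acc, tried
```

Ordering the `learners` list by cost means the loop never pays for a complex model when a simple one is good enough, and the `tried` log gives an early, cheap way to reject hypotheses that do not help.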

SAMSUNG

CASE STUDY: UNCLIP

Thank you!