Distributed Machine Learning with Zero ETL · © 2018 GridGain Systems, Inc.

Distributed Machine Learning with Zero ETL

Yury Babak

Head of development, GridGain

Long ETL

- X%

- X%

Distributed Training

Node Crash

Apache Ignite

Apache Ignite: Replicated Caches

Server Node 1 Server Node 2

Server Node 3 Server Node 4

Client

Map Reduce
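The slide body did not survive extraction, so the following is only a minimal sketch of the general map-reduce pattern the title refers to, not of Apache Ignite's API. The names `partitions`, `map_partition`, `reduce_pair`, and `distributed_mean` are hypothetical, and plain Python lists stand in for data partitions that would live on separate server nodes. The point it illustrates: each partition is summarized where it lives, and only the small summaries travel to be combined.

```python
from functools import reduce

# Hypothetical stand-in for data spread over cluster nodes: one list per partition.
partitions = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0],
    [6.0],
]

def map_partition(rows):
    """Map step: runs next to the data and returns a small summary (sum, count)."""
    return (sum(rows), len(rows))

def reduce_pair(a, b):
    """Reduce step: merges two partial summaries into one."""
    return (a[0] + b[0], a[1] + b[1])

def distributed_mean(parts):
    """Computes the global mean from per-partition summaries only."""
    total, count = reduce(reduce_pair, map(map_partition, parts))
    return total / count

mean = distributed_mean(partitions)
```

Because `reduce_pair` is associative and commutative, the partial summaries can be merged in any order, which is what lets the map step run on all nodes in parallel.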

Iterative Optimization Algorithm

Partition Based Data Set

Restoration of partitions after a failure

Recovering calculations after failure

OLS sample

Loss function

Gradient of loss function

Node 1, Node 2, ..., Node M
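The formulas on this slide were lost in extraction. As a hedged illustration, the sketch below assumes the standard mean squared error loss for a one-feature linear model, L(w, b) = (1/n) Σ (w·x_i + b − y_i)², whose gradient is a sum over samples and therefore splits across nodes; the scaling may differ from the original slide. All names (`partial_gradient`, `train`, `partitions`) and the learning rate are assumptions made for this example, and Python lists stand in for the per-node partitions.

```python
def partial_gradient(rows, w, b):
    """Map step: unnormalized gradient sums of the squared error on one partition."""
    gw = gb = 0.0
    for x, y in rows:
        err = w * x + b - y      # residual of the current model on this sample
        gw += 2.0 * err * x      # contribution to dL/dw
        gb += 2.0 * err          # contribution to dL/db
    return gw, gb, len(rows)

def train(parts, lr=0.05, iterations=2000):
    """Gradient descent where every iteration is one map-reduce pass."""
    w = b = 0.0
    for _ in range(iterations):
        # Map: each node computes its partial sums locally.
        partials = [partial_gradient(p, w, b) for p in parts]
        # Reduce: add the partial sums and normalize by the total sample count.
        gw = sum(p[0] for p in partials)
        gb = sum(p[1] for p in partials)
        n = sum(p[2] for p in partials)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

# Hypothetical data drawn from y = 2x + 1, split across three "nodes".
partitions = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]

w, b = train(partitions)
```

Only three numbers per partition cross the network in each iteration, regardless of how many rows the partition holds, which is the property that makes this loss convenient for training without moving the data.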

Sample 2: LSQR

Limitations of Applicability

Iteration time

Number of Iterations

SGD

BS 1 000

BS 10

Training time

Want to learn more?

https://ignite.apache.org

https://apacheignite.readme.io/docs

https://github.com/apache/ignite

ybabak@gridgain.com
