Kamal Hakimzadeh – Reproducible Distributed Experiments

www.karamel.io 1

Reproducible Distributed Experiments

Kamal Hakimzadeh, PhD Student, mahh@kth.se

Jim Dowling, Associate Professor, jdowling@kth.se

Agenda

• Motivation

• Reproducibility

• Demo: Simple experiment 30-40 min

• Karamel: Reproducibility in different layers

• Karamel Engine

• Orchestration

• Challenges

Motivation

Analytical vs Empirical proof

Distributed systems (DS) support many scientific advancements

Scheduling, fault tolerance, scalability … extremely complex

Reproducible vs. Replicable

Three factors: 1. Laboratory, 2. Experimenter, 3. Apparatus

Reproducible: different laboratory, experimenter, and apparatus; same conclusion

Replicable: same laboratory, experimenter, and apparatus; same results

Computational Reproducibility: Infrastructure, software, experiment and data

Demo: Word Count

[Figure: cluster layout for the demo]

• Hadoop NN (NameNode) and Flink JM (JobManager)

• 3 × Hadoop DN (DataNode) with Flink TM (TaskManager)

• 3 × Text Generator

• Word Count
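
The slides do not show the demo's code. As a minimal single-machine sketch of the same idea (text generators feeding a word count), the Python below assumes a hypothetical vocabulary and hypothetical function names `generate_text` and `word_count`. It is not the Flink job used in the demo. Each generator is seeded, so the same input text and the same counts come out on every run.

```python
import random
from collections import Counter

# Hypothetical vocabulary; the demo's real text generator is not shown in the slides.
VOCABULARY = ["karamel", "flink", "hadoop", "chef", "cluster"]


def generate_text(num_words, seed):
    """Return a line of pseudo-random words that is identical for a given seed."""
    rng = random.Random(seed)
    return " ".join(rng.choice(VOCABULARY) for _ in range(num_words))


def word_count(lines):
    """Count occurrences of each whitespace-separated word across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


# Three generators, mirroring the three Text Generator boxes on the slide.
lines = [generate_text(100, seed) for seed in range(3)]
counts = word_count(lines)
```

Fixing the seeds is what makes this toy run repeatable. In the distributed setting the infrastructure and software versions would have to be pinned as well.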

Karamel: Reproducibility in different layers

Bare Metal, Google Compute Engine

A virtual machine is an abstract entity

Software is defined in Chef. It is publicly available on GitHub.

Karamel Engine

• DSL Service

• Cloud Clients

• Physical Mapping

• Orchestrator

Orchestration – queuing model
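
The slide gives no detail on the queuing model, so the following is only an illustrative sketch and not Karamel's actual orchestrator. It assumes a single FIFO queue of ready tasks, where a task is enqueued once all of its dependencies have finished. The function name `run_queued` and the task names are hypothetical.

```python
from collections import deque


def run_queued(tasks, dependencies):
    """Run tasks from a FIFO queue, releasing each task once its dependencies are done.

    tasks: dict mapping task name -> zero-argument callable.
    dependencies: dict mapping task name -> set of task names it must wait for.
    Returns the list of task names in the order they were executed.
    """
    remaining = {name: set(dependencies.get(name, ())) for name in tasks}
    dependents = {name: [] for name in tasks}
    for name, deps in remaining.items():
        for dep in deps:
            if dep not in tasks:
                raise ValueError(f"unknown dependency: {dep}")
            dependents[dep].append(name)

    # Tasks with no dependencies are ready immediately.
    ready = deque(name for name in tasks if not remaining[name])
    order = []
    while ready:
        name = ready.popleft()
        tasks[name]()
        order.append(name)
        for child in dependents[name]:
            remaining[child].discard(name)
            if not remaining[child]:
                ready.append(child)

    if len(order) != len(tasks):
        raise ValueError("cyclic dependencies: some tasks never became ready")
    return order


# Hypothetical installation steps for the word-count demo.
log = []
steps = ["install_hadoop", "install_flink", "start_hadoop", "start_flink", "word_count"]
tasks = {step: (lambda s=step: log.append(s)) for step in steps}
dependencies = {
    "start_hadoop": {"install_hadoop"},
    "start_flink": {"install_flink", "start_hadoop"},
    "word_count": {"start_flink"},
}
order = run_queued(tasks, dependencies)
```

A real orchestrator would run independent tasks in parallel across machines and handle failures. This sketch runs everything sequentially to keep the ordering logic visible.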

Result

Challenges and future work

• Scalability

• Fault Recovery Model

• Elasticity – Handle Churn

• Instrumentation

• Recommendation System

• Language Support

• Load generators

• Scheduling

• Container-based machines

• Result Management

• Debugging

Team members

Kamal Hakimzadeh, PhD Student at KTH, mahh@kth.se

Alberto Lorente Leal, Software Developer at Comeon, a.lorenteleal@gmail.com

Jim Dowling, Associate Professor at KTH, jdowling@kth.se

Hooman Peiro Sajjad, PhD Student at KTH, shps@kth.se

Abhimanyu Babbar, Backend Developer at Wrap, abhimanyu.babbar88@gmail.com
