Hybrid Machine Learning/Analytical Models for Performance Prediction. Diego Didona and Paolo Romano, INESC-ID / Instituto Superior Técnico. 6th ACM/SPEC International Conference on Performance Engineering (ICPE), Feb 1st, 2015.

Hybrid Machine Learning/Analytical Models for Performance Prediction (romanop/files/papers/tutorial-icpe15.pdf)


Page 1

Hybrid Machine Learning/Analytical Models for Performance Prediction

Diego Didona and Paolo Romano

INESC-ID / Instituto Superior Técnico

6th ACM/SPEC International Conference on Performance Engineering (ICPE)

Feb 1st, 2015

Page 2

Outline

• Base techniques for performance modeling
  – White box modeling
  – Black box modeling
  – Modeling and optimization on two case studies

• Hybrid modeling techniques
  – Divide et impera
  – Bootstrapping
  – Ensemble

• Closing remarks

Page 3

Modeling a system

INPUT FEATURES → SYSTEM → KEY PERFORMANCE INDICATORS

Page 4

Modeling a system

INPUT FEATURES → SYSTEM → KEY PERFORMANCE INDICATORS

Input features:
• Workload: intensity, small vs. large jobs
• Infrastructure: # servers, type of servers
• Application-specific: replication

Key performance indicators:
• Throughput: max jobs/sec
• Response time: exec. time of a job
• Consumed energy: Joules/job

Page 5

What is a performance model?

• Approximator of a KPI function

• Relates input to target output

• Can be implemented in different ways
  – White box
  – Black box

Page 6

Applications of Performance Modeling

• Capacity planning
  – Avoid overload in datacenters

• Anomaly detection
  – Model “normalcy” to detect anomalies

• Self-tuning
  – Maximize performance

• Resource provisioning
  – Elastic scaling in the Cloud

Page 7

Accuracy of a performance model

• Approximation accuracy metrics
  – MAPE (Mean Absolute Percentage Error)
  – RMSE (Root Mean Square Error)
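As an illustration of the two accuracy metrics above, a minimal Python sketch (the data values are invented for the example):

```python
import math

def mape(actual, predicted):
    # Mean Absolute Percentage Error: average of |error| / |actual|, in %
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Root Mean Square Error: penalizes large errors quadratically
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

actual = [100.0, 200.0, 400.0]      # e.g., measured throughput
predicted = [110.0, 180.0, 400.0]   # model's predictions
print(mape(actual, predicted))      # ≈ 6.67 (%)
print(rmse(actual, predicted))      # ≈ 12.91
```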

Page 8
Page 9

White box performance modeling

• Leverages knowledge about the target app’s internals

• Formalizes a mapping between
  – application and hosting platform, and
  – performance

• Formalization can be
  – Analytical (e.g., Queueing Theory) [45]
  – Simulation, e.g., [36]

Page 10

Queueing Theory

• A resource is modeled as a server + a queue
• Possible target KPIs
  – Resource utilization
  – Throughput
  – Response time

• Key factors impacting a queue’s performance
  – Arrival of jobs
  – Service demands
  – Service policy (e.g., FCFS)
  – Load generation model (e.g., open vs. closed)
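For a single open FCFS queue, the basic M/M/1 formulas can be evaluated directly; a small sketch (arrival and service rates are invented numbers):

```python
def mm1_metrics(lam, mu):
    # M/M/1 queue: Poisson arrivals at rate lam, exponential service at rate mu.
    rho = lam / mu                      # utilization; stable only if rho < 1
    assert rho < 1, "queue is unstable"
    avg_jobs = rho / (1 - rho)          # mean number of jobs in the system (L)
    resp_time = 1 / (mu - lam)          # mean response time (W); Little's law: L = lam * W
    return rho, avg_jobs, resp_time

rho, L, W = mm1_metrics(lam=80.0, mu=100.0)
print(rho, L, W)  # -> 0.8 4.0 0.05
```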

Page 11

From single queues to networks

Page 12

Queueing Theory pros and cons

• Accurate for a wide spectrum of input parameters
• Specifically crafted for the target app
• Analytical tractability often requires
  – Assumptions (e.g., independent job flows)
  – Approximations
  – Simplifications (e.g., Poisson arrivals)

Page 13

Simulation

• Encode system dynamics via a computer program
• Alternative w.r.t. analytical modeling
  – Simpler (code vs. equations)
  – May rely on fewer assumptions
  – Slower to produce output

• Similar trade-offs w.r.t. analytical modeling
  – Still uses simplifications to avoid overly complex code
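To contrast the simulation route with the analytical one, here is a deliberately tiny single-queue simulation based on Lindley's recurrence (an illustrative sketch, not any particular simulator from the references); its estimate converges to the M/M/1 closed form:

```python
import random

def simulate_mm1(lam, mu, n_jobs, seed=42):
    # FCFS single server, simulated per job via Lindley's recurrence:
    #   wait_i = max(0, wait_{i-1} + service_{i-1} - interarrival_i)
    rng = random.Random(seed)
    wait = 0.0
    prev_service = 0.0
    total_resp = 0.0
    for _ in range(n_jobs):
        interarrival = rng.expovariate(lam)
        wait = max(0.0, wait + prev_service - interarrival)
        prev_service = rng.expovariate(mu)
        total_resp += wait + prev_service   # response = queueing wait + service
    return total_resp / n_jobs

sim = simulate_mm1(lam=80.0, mu=100.0, n_jobs=200_000)
analytical = 1 / (100.0 - 80.0)             # M/M/1 mean response time
print(sim, analytical)                       # simulation hovers around 0.05
```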

Page 14

Black box performance modeling

• Definition
• Taxonomy (offline vs. online, supervised vs. unsupervised, regression vs. classification)
• Examples (DT, SVM, ANN, KNN, UCB, Gradient)
• Ensemble
• Optimization

Page 15

Building black box models

• Infer the performance model from observed behavior

• Machine Learning [8]
  – Observe Y corresponding to different X
  – Obtain a statistical performance model

Page 16

Machine Learning pros and cons

• No need for domain knowledge
• High accuracy in interpolation
  – i.e., for input values close to the observed ones

• Curse of dimensionality
  – # required samples grows exponentially with input size
  – Long training phase to build the model

• Poor accuracy in extrapolation
  – i.e., for input values far away from the observed ones

Page 17

Black box modeling taxonomy

• Target output feature y
  – Classification (discrete y) vs. Regression (y in R)

• Training phase timing
  – Online vs. Offline

• Predict or find hidden structures
  – Supervised vs. unsupervised learning

Page 18

OFF-LINE SUPERVISED LEARNING

• Supervised
  – Known inputs x have a corresponding known y = f(x)

• Offline
  – Model built on a training dataset
  – Dataset {<x,y> : y = f(x)}
  – Learn f' such that f'(x) ≈ f(x)
    • While being able to generalize outside the known dataset

Page 19

Decision Trees [55]

• Predictive model is a tree-like graph
• Intermediate nodes are predicates
• Classification: leaves are classes
• Regression: leaves are functions
  – Piecewise approximation of nonlinear functions
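A minimal illustration of the regression case: a one-split tree (a stump) whose leaves predict the mean of their side, giving a piecewise-constant approximation (the data is invented):

```python
def fit_stump(xs, ys):
    # Try each midpoint between sorted x values as the split predicate; keep
    # the one with the lowest sum of squared errors when each leaf predicts
    # the mean of its side.
    best = None
    pts = sorted(zip(xs, ys))
    for i in range(1, len(pts)):
        thr = (pts[i - 1][0] + pts[i][0]) / 2
        left = [y for x, y in pts if x <= thr]
        right = [y for x, y in pts if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, ml, mr)
    _, thr, ml, mr = best
    return lambda x: ml if x <= thr else mr

# A step-shaped KPI: e.g., throughput collapses past a load threshold
xs = [1, 2, 3, 4, 5, 6]
ys = [10.0, 10.0, 10.0, 2.0, 2.0, 2.0]
predict = fit_stump(xs, ys)
print(predict(2), predict(5))  # -> 10.0 2.0
```

Real decision-tree learners repeat this split search recursively per leaf; the stump is the single-predicate base case.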

Page 20

DT: an example

Input features:
• Income range
• Criminal records
• # years in present job
• Use of credit card

Page 21

Support Vector Machines [16]

• A tuple is a point in a multidimensional space
• Find a hyperplane s.t. different classes are as distant as possible

Credits to Erik Kim for SVM-related images, http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html

Page 22

Support Vector Machines

What if points are not linearly separable?

Page 23

SVM: the kernel trick, I

• Map points to a higher dimensional space
• In that space, points are linearly separable

• Here, the mapping is φ(x, y) = (x, y, x^2 + y^2)

Page 24

SVM: the kernel trick, II

• Nonlinear separation in the original domain

• Here, the mapping is φ(x, y) = (x, y, x^2 + y^2)
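The lifting on these two slides can be checked numerically: points that are not linearly separable in 2D become separable by a plane after the mapping (x, y) -> (x, y, x^2 + y^2). The sample points below are invented:

```python
# Inner cluster (one class) and outer ring (other class): no 2D line
# separates them, but in the lifted space the plane z = 2.0 does,
# because z = x^2 + y^2 is the squared distance from the origin.
inner = [(0.0, 0.5), (0.5, 0.0), (-0.5, 0.5), (0.3, -0.4)]
outer = [(2.0, 0.0), (0.0, -2.0), (1.5, 1.5), (-2.0, 1.0)]

def phi(p):
    x, y = p
    return (x, y, x * x + y * y)

assert all(phi(p)[2] < 2.0 for p in inner)
assert all(phi(p)[2] > 2.0 for p in outer)
print("separable by the plane z = 2.0 in the lifted space")
```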

Page 25

Artificial Neural Networks [79]

• Inner model is a graph
• Resembles neuron connections in the brain

Credits to Koné Mamadou Tadiou for image: http://futurehumanevolution.com/artificial-intelligence-future-human-evolution/artificial-neural-networks

Page 26

ANN internals

• Neuron structure
  – Weighted sum of inputs
  – Activation 0/1 function as output

Credits to http://www.theprojectspot.com/tutorial-post/introduction-to-artificial-neural-networks-part-1/7
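The neuron structure above, a weighted sum followed by a 0/1 step activation, in a few lines (the AND-gate weights are hand-picked for the example):

```python
def neuron(inputs, weights, bias):
    # Weighted sum of inputs followed by a 0/1 step activation
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

# Weights chosen by hand so this single neuron computes logical AND:
# the sum exceeds the bias only when both inputs fire.
and_gate = lambda a, b: neuron([a, b], weights=[1.0, 1.0], bias=-1.5)
print([and_gate(a, b) for a in (0, 1) for b in (0, 1)])  # -> [0, 0, 0, 1]
```

Training (the next slide) amounts to adjusting these weights iteratively instead of picking them by hand.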

Page 27

Building an ANN

• Determine its structure
  – # layers
  – # neurons per layer

• Activation function per neuron
• Iteratively learn weights depending on error

Page 28

K Nearest Neighbors [2]

• Predict based on the closest known values to the target
• Proximity given by a distance function

Pic from http://www.cs.bham.ac.uk/internal/courses/robotics/halloffame/2010/team12/knn.htm

Page 29

K Nearest Neighbors

• Classification:
  – Class of X is the most common in the neighborhood

• Regression:
  – Value for X is a function of the values in the neighborhood

Pic from http://www.cs.bham.ac.uk/internal/courses/robotics/halloffame/2010/team12/knn.htm
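Both KNN modes above fit in one small sketch, with Euclidean distance as the proximity function (the training data is invented):

```python
def knn(train, query, k, mode="regression"):
    # train: list of (x_vector, y) pairs. Proximity = Euclidean distance.
    dist = lambda a, b: sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
    neighbors = sorted(train, key=lambda t: dist(t[0], query))[:k]
    ys = [y for _, y in neighbors]
    if mode == "classification":
        return max(set(ys), key=ys.count)   # most common class in the neighborhood
    return sum(ys) / k                       # average of the neighbors' values

train = [((1.0,), 10.0), ((2.0,), 12.0), ((3.0,), 11.0), ((10.0,), 50.0)]
print(knn(train, (2.5,), k=3))               # -> 11.0 (far-away outlier ignored)

labels = [((0.0,), "low"), ((1.0,), "low"), ((9.0,), "high")]
print(knn(labels, (0.5,), k=3, mode="classification"))  # -> low
```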

Page 30

ONLINE LEARNING

• We consider Reinforcement Learning [70]
  – Training set not available (nor stored)
  – Given a set of <State, Action> pairs
  – Find the sequence of actions that maximizes payoff (reward)
  – Collect feedback from the system

• Tradeoff between
  – Exploration (try new actions)
  – Exploitation (use good known actions)

Page 31

Multi-armed bandit (MAB)

• Inspired by gambling at slot machines. Find
  – Which arm to play
  – How many times
  – In which order

Page 32

Upper Confidence Bound [3]

• Popular set of algorithms for MAB

• At any time, choose the arm that
  1. maximizes reward, while…
  2. minimizing regret:
     • utility loss due to sub-optimal choices

• Efficiency: regret is logarithmic in the # of trials
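A sketch of the classic UCB1 rule (one member of the UCB family cited above): play each arm once, then repeatedly pick the arm maximizing its empirical mean plus an exploration bonus. The arm means and horizon below are invented:

```python
import math, random

def ucb1(arm_means, horizon, seed=0):
    # UCB1: after one initialization pull per arm, choose the arm maximizing
    #   empirical_mean + sqrt(2 ln t / n_i)   (optimism under uncertainty)
    rng = random.Random(seed)
    n = [0] * len(arm_means)            # pulls per arm
    total = [0.0] * len(arm_means)      # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= len(arm_means):
            i = t - 1                   # initialization round: try each arm once
        else:
            i = max(range(len(arm_means)),
                    key=lambda j: total[j] / n[j] + math.sqrt(2 * math.log(t) / n[j]))
        reward = 1.0 if rng.random() < arm_means[i] else 0.0   # Bernoulli reward
        n[i] += 1
        total[i] += reward
    return n

pulls = ucb1([0.2, 0.5, 0.8], horizon=5000)
print(pulls)   # the best arm (mean 0.8) gets the lion's share of pulls
```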

Page 33

Hill Climbing

• Not really “learning”, but online optimization
• Explore the function in the direction that increases/decreases its value

• Possibly coupled with randomization to avoid local max/min

Credit to Ben Etzkorn for pic: http://www.benetzkorn.com/2011/09/non-linear-model-optimization-using-palisades-evolver/
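A minimal hill climber with the randomization mentioned above (random restarts) to escape local maxima; the objective and parameters are illustrative:

```python
import random

def hill_climb(f, x0, step=0.1, iters=1000, restarts=5, seed=1):
    # Greedy ascent: move to a neighbor only if it increases f.
    # Random restarts reduce the chance of ending in a local maximum.
    rng = random.Random(seed)
    best_x, best_y = x0, f(x0)
    for _ in range(restarts):
        x = rng.uniform(-10, 10)
        for _ in range(iters):
            for cand in (x + step, x - step):
                if f(cand) > f(x):
                    x = cand
                    break
            else:
                break                   # no neighbor improves: local maximum
        if f(x) > best_y:
            best_x, best_y = x, f(x)
    return best_x

# Concave objective with its maximum at x = 3
best = hill_climb(lambda x: -(x - 3) ** 2, x0=0.0)
print(best)   # lands within one step of x = 3
```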

Page 34

NO FREE LUNCH THEOREM FOR ML

• There is no “absolute best learner”

• The best learner and parameters depend on the data

• When working in extrapolation, there are no a priori distinctions between learning algorithms [80]

Page 35

ML optimization

• An ML algorithm has meta-parameters
  – # features of the input data
  – Minimum # of cases per leaf in a DT
  – Kernel and its parameters in an SVM
  – Neurons, layers, activation functions in an ANN

• How to choose them to maximize accuracy?
  – It depends on the problem at hand!

Page 36

Feature selection [40]

• Identify the input features that are most correlated with the target output
  – Speedup in building the model
  – Increased accuracy by reducing noise

Page 37

Feature selection

• Wrapper: use the target ML with different combinations of features
  – Forward selection, backward elimination, …

• Filter: independent of the target ML
  – E.g., discard 1 of 2 highly correlated variables

• Dimensionality reduction (PCA, SVD)
  – Find features that account for most of the variance
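A sketch of the filter approach: rank features by |correlation with the target| and discard one of any two highly correlated (redundant) features. The feature table is invented:

```python
def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def filter_features(features, target, redundancy_thr=0.95):
    # Visit features in decreasing relevance |corr(feature, target)|;
    # keep a feature only if it is not redundant with one already kept.
    ranked = sorted(features, key=lambda name: -abs(pearson(features[name], target)))
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) < redundancy_thr for k in kept):
            kept.append(name)
    return kept

features = {
    "load":      [1, 2, 3, 4, 5, 6],
    "load_copy": [2, 4, 6, 8, 10, 12],   # perfectly correlated with "load"
    "noise":     [5, 1, 4, 2, 6, 3],
}
target = [10, 20, 31, 39, 52, 60]
kept = filter_features(features, target)
print(kept)   # keeps "load", drops the redundant "load_copy"
```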

Page 38

Hyperparameter optimization

• Find hyper-parameters that maximize accuracy
• Based on cross-validation
  – Use part of the dataset as training and part as test

• Different approaches
  – Grid search
  – Random search [6]
  – Bayesian optimization [74]

Page 39

Grid Search

1. Uniformly discretize the features’ domains

2. Take the Cartesian product of the features
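The two steps above can be sketched with a toy error surface standing in for a real cross-validated score (the hyperparameter names c/gamma are SVM-flavored placeholders, not a real API):

```python
import itertools

def validation_error(params):
    # Hypothetical scoring function: in practice this would train the model
    # with `params` on a training split and measure error on a held-out split.
    c, gamma = params
    return (c - 10) ** 2 + (gamma - 0.1) ** 2   # toy error surface, min at (10, 0.1)

# 1. Uniformly discretize each hyperparameter's domain
c_values = [0.1, 1, 10, 100]
gamma_values = [0.001, 0.01, 0.1, 1]

# 2. Evaluate every point of the Cartesian product, keep the best
grid = itertools.product(c_values, gamma_values)
best = min(grid, key=validation_error)
print(best)  # -> (10, 0.1)
```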

Page 40

Random search

• Include randomness
  – Increases sampling granularity of important parameters
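The same idea with random sampling instead of a fixed grid: each trial draws fresh values, so an important parameter gets probed at many distinct values rather than a few grid lines (toy error surface again, standing in for a cross-validated score):

```python
import random

def validation_error(c, gamma):
    # Toy stand-in for a cross-validated error measurement; min at (10, 0.1)
    return (c - 10) ** 2 + (gamma - 0.1) ** 2

rng = random.Random(0)
# Sample hyperparameter combinations uniformly at random from their domains
trials = [(rng.uniform(0, 100), rng.uniform(0, 1)) for _ in range(200)]
best = min(trials, key=lambda p: validation_error(*p))
print(best)   # lands near (10, 0.1) with high probability
```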

Page 41

ENSEMBLING

• Solution to counter the NFL theorem
• Employ multiple learners together

• Bagging [9]
  – Train learners on different training sets

• Boosting [66]
  – Generate 1 strong learner from N weak ones

• Stacking [79]
  – Combine outputs of learners depending on the input

Page 42

Bagging

• Average the output of sub-models
• Generate N sets of size D'
  – Draw uniformly at random with repetition from D

• Generate N black box models
  – Voting for classification
  – Averaging for regression

• Cannot improve predictive power (in extrapolation)…
• Can reduce variance (i.e., better interpolation accuracy)
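The bagging recipe above in miniature; the base "model" is deliberately trivial (1-nearest-neighbor) since the point is the bootstrap-and-average loop, not the learner (data invented):

```python
import random

def bagged_predict(data, query, n_models=100, seed=0):
    # Each model trains on a bootstrap resample (drawn with replacement
    # from D, same size as D); predictions are averaged (regression case).
    rng = random.Random(seed)
    preds = []
    for _ in range(n_models):
        boot = [rng.choice(data) for _ in data]     # resample D' of size |D|
        # Trivial base model: predict the y of the nearest bootstrapped point
        x, y = min(boot, key=lambda p: abs(p[0] - query))
        preds.append(y)
    return sum(preds) / n_models                    # averaging for regression

data = [(1.0, 10.0), (2.0, 12.0), (3.0, 14.0), (4.0, 16.0)]
estimate = bagged_predict(data, query=2.4)
print(estimate)   # a smoothed value near 12, instead of a hard jump
```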

Page 43

Bagging example

• 100 bootstrapped learners
• Reduced variance and overfitting w.r.t. single models

Page 44

Boosting

• Build a strong learner from many weak ones
• Stage-wise training phase
  – Training at stage i depends on the output of stage i-1

• 0/1 AdaBoost
  – Base learners Bi: can classify correctly with p > ½
  – Iteratively try to better classify mis-classified samples
  – At stage i, draw the training set according to distribution Di
  – Di+1 s.t. mis-classified samples have higher relevance
  – Output: weighted average of the weak learners
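One AdaBoost stage, i.e., the Di -> Di+1 reweighting and the weak learner's vote weight alpha, can be written out directly (the toy round below has one misclassified sample out of four):

```python
import math

def adaboost_round(weights, correct):
    # One AdaBoost stage: given per-sample weights Di and which samples the
    # weak learner got right, compute its vote alpha and the distribution
    # Di+1 in which misclassified samples have higher relevance.
    err = sum(w for w, ok in zip(weights, correct) if not ok)
    alpha = 0.5 * math.log((1 - err) / err)      # requires 0 < err < 0.5
    new_w = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    z = sum(new_w)                               # normalize back to a distribution
    return alpha, [w / z for w in new_w]

weights = [0.25, 0.25, 0.25, 0.25]               # D1: uniform
correct = [True, True, True, False]              # weak learner errs on sample 4
alpha, new_weights = adaboost_round(weights, correct)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
# -> 0.549 [0.167, 0.167, 0.167, 0.5]  (the missed sample now carries half the mass)
```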

Page 45

AdaBoost, training

Credits to Kihwan Kim. http://www.cc.gatech.edu/~kihwan23/imageCV/Final2005/FinalProject_KH.htm

Page 46

AdaBoost, result

Pictures from http://www.ieev.org/2010/03/adaboost-haar-features-face-detection.html

Page 47

Stacking

• A meta-learner combines the outputs of ML models
• Partition D into D', D''
• Train learners 1…N on D'
• ML_{N+1} is trained on ML_1…ML_N’s predictions on D''

[Diagram: ML1, ML2, …, MLN feed their outputs to the meta-learner MLN+1]
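A compact sketch in the spirit of this slide: base learners fitted on D', and a (degenerate, selection-based) meta-learner fitted on their predictions over D''. Data and learners are invented; real stacking would typically learn a weighted combination rather than pick one model:

```python
def train_stack(data):
    # Partition D into D' (train the base learners) and D'' (train the meta-learner)
    d1, d2 = data[: len(data) // 2], data[len(data) // 2 :]

    # Two deliberately simple base "learners" fitted on D'
    mean_y = sum(y for _, y in d1) / len(d1)
    base1 = lambda x: mean_y                                 # constant predictor
    slope = sum(x * y for x, y in d1) / sum(x * x for x, _ in d1)
    base2 = lambda x: slope * x                              # through-origin linear fit

    # Meta-step: evaluate the base models on the held-out D'' and keep the better one
    err = lambda m: sum((m(x) - y) ** 2 for x, y in d2)
    return base1 if err(base1) < err(base2) else base2

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.0), (5.0, 9.9), (6.0, 12.1)]
model = train_stack(data)
print(model(10.0))   # the linear base model wins on the held-out half
```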

Page 48
Page 49

Background: Case studies

• Total Order Broadcast primitive
  – Analytical model
  – Black box online optimization

• Distributed NoSQL transactional data grid
  – Simulation model
  – Black box offline supervised learning

Page 50

Total Order Broadcast case study

• TOB allows a set of nodes to deliver broadcast messages in the same order

• Embodies the popular consensus problem
  – Fundamental abstraction for dependable computing

• We consider Sequencer-based TOB
  – Messages are broadcast normally
  – A Sequencer node decides the delivery order

Page 51

Sequencer-Based TOB

[Diagram: nodes N0 (the sequencer), N1, and N2 broadcast messages M1 and M2; the sequencer broadcasts the sequence numbers; all nodes then deliver M1, M2 in the same order.]

Page 52

Performance of STOB

• STOB minimizes message exchange, but…
• The sequencer may become the bottleneck
• Possible solution: batching
• The sequencer
  – Waits to receive N > 1 msgs
  – Sends a single, bigger sequencing msg for the N msgs instead of N smaller ones

Page 53

Batching in STOB

• At high load, batching
  – Allows for amortizing the message sequencing cost
  – Increases sequencer capacity and throughput

• At low load, batching
  – Introduces useless delays
  – The sequencer waits too much and wastes time

Page 54

The need for self-tuning STOB batching

• The optimal batching level depends on the message rate

[Figure: self-delivery latency vs. batch level, for message arrival rates of 1K, 5K, 10K, 15K, and 20K msgs/sec]

Page 55

Tuning the batching level

• White box approaches
  – Forecast the impact of batching given the workload

• Black box approaches
  – On-line optimization

Page 56

STOB white box modeling

• Focus on the performance of the sequencer
• It is representative of the whole system

[Figure 1. Self-delivery latency at the sequencer and at the non-sequencer nodes. x axis: avg. self-delivery latency over non-sequencer nodes (msec); y axis: avg. self-delivery latency at the sequencer node (msec); log-log scale.]

the sequencer process: the message arrival rate, and the self-delivery latency, i.e., the latency experienced by the sequencer to TO-deliver messages whose TO-broadcast was activated locally.

The choice of measuring exclusively the self-delivery latencies allows us to circumvent the issue of ensuring accurate clock synchronization among the communicating nodes, which would clearly have been a crucial requirement had we opted for monitoring the delivery latencies of messages generated by different nodes. Preliminary experiments conducted in our cluster have indeed highlighted that the accuracy achievable using conventional clock synchronization schemes, such as NTP, is inadequate for collecting measurements of the TO broadcast inter-node delivery latency in LAN environments, the latter being frequently around or below a millisecond.

Further, we experimentally verified that, at least in typical LAN settings, there is a very high correlation between the self-delivery latency at the sequencer and the self-delivery latency experienced by the other nodes in the group. This claim is supported by the data in Figure 1 and Figure 2. The data in the plots was obtained using a plain (not self-tuning) STOB protocol in a cluster of 10 machines equipped with two Intel Quad-Core XEONs at 2.0 GHz and 8 GB of RAM, running Linux 2.6.32-26-server and interconnected via a private Gigabit Ethernet. All the experimental data reported in the remainder of this paper have been collected using this cluster. In this experiment, we let the batching level b vary in the set {1,2,3,4,6,8,16,32,64,128} and, for each batching level, we injected 512-byte messages at an arrival rate ranging from 1 msg/sec up to saturating the GCS. In all of the experiments, independently of the batching level used, we verified that the bottleneck was always the sequencer CPU, and the available network bandwidth was always far from saturated. In Figure 1 we report on the x axis the self-delivery latency (in msec) experienced on average by the non-sequencer nodes, and on the y axis the self-delivery latency experienced by the sequencer node.

[Figure 2. Optimal batching value computed on the basis of the sequencer’s vs. non-sequencer nodes’ self-delivery latency. x axis: optimal b based on self-delivery latency across non-sequencer nodes; y axis: optimal b based on self-delivery latency at the sequencer node.]

To generate the plot in Figure 2 we manually computed the optimal batching value that, given the current load, minimized i) the self-delivery latency of the sequencer node, and ii) the average self-delivery latency over all the non-sequencer nodes. The correlations of the datasets in Figure 1 and Figure 2 are, respectively, 0.85 and 0.99. This experimental evidence confirms that, in a LAN, by relying exclusively on local measures of self-delivery latency taken at the sequencer node, it is possible to obtain an accurate picture of the performance perceived, on average, by any node of the system.

V. THE ANALYTICAL PERFORMANCE MODEL

As discussed in the Introduction section, our self-tuning scheme relies on an analytical model which is used to initialize a reinforcement learning system. In this section we first present the analytical model. Next we discuss how it is possible to determine its parameters’ values. Finally we validate its accuracy and effectiveness in driving the self-tuning process without additional feedback from the reinforcement learning subsystem.

A. The Mathematical Model

The design of the analytical performance model used in our system has been driven by two key requirements: i) high computational efficiency, thus allowing its solution to be computed in real time without significantly overloading the CPU of the sequencer node; ii) the possibility to identify its parameters via an automated and extremely fast procedure, thus making it easily employable in practical settings also by non-specialists.

These two requirements led us to opt for a relatively simple model that captures the impact of batching on self-delivery latency, focusing on two main aspects: the additional delay introduced by forcing the sequencer to wait for the completion of a batch of messages, and the alleviation of the load pressure on the CPU of the sequencer due to the generation of a reduced number of sequencing messages. We chose not to explicitly model the possible effects of contention on the network, as typical applications that use TOB services in a LAN [5], [26] generate messages of at most a few KBs. In these settings,


Page 57

STOB model input

• m = message generation rate
• b = batching level
• T_1 = time to process the 1st message in a batch
• T_Add = time to process each additional msg
  – Batching makes sense when T_1 > T_Add

Page 58

STOB analytical model [59]

• Sequencer = M/M/1 queue

• Batch generation rate

• Batch service rate

• Taking derivatives, the optimal b is computed
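The rate formulas on this slide are images lost in this scrape. The sketch below is an illustrative reconstruction from the inputs defined two slides earlier, not the exact model of [59]: batches arrive at rate m/b, a batch takes T_1 + (b-1)·T_Add to serve, and the predicted latency is an average batch-formation wait plus the M/M/1 residence time (the numerical parameters are invented):

```python
def self_delivery_latency(b, m, t1, t_add):
    # Assumed model (illustrative):
    #   batch arrival rate  lam = m / b
    #   batch service rate  mu  = 1 / (t1 + (b - 1) * t_add)
    #   latency = avg. batch-formation wait + M/M/1 residence time
    lam = m / b
    mu = 1.0 / (t1 + (b - 1) * t_add)
    if lam >= mu:
        return float("inf")                  # sequencer saturated
    formation = (b - 1) / (2.0 * m)          # avg. wait for the batch to fill
    return formation + 1.0 / (mu - lam)

def optimal_b(m, t1, t_add, b_max=128):
    # Minimize numerically instead of via derivatives, for simplicity
    return min(range(1, b_max + 1),
               key=lambda b: self_delivery_latency(b, m, t1, t_add))

print(optimal_b(m=15000.0, t1=1e-4, t_add=1e-5))   # high load: batching pays off
print(optimal_b(m=1000.0, t1=1e-4, t_add=1e-5))    # low load: b = 1 is best
```

This reproduces the qualitative behavior of the earlier slides: at low load any batching only adds delay, while near saturation a larger b restores stability.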

Page 59

STOB model’s accuracy

• Assumptions and simplifications
  – Exponential arrival rate and service rate (M/M/1)
  – In computing the overlap of arrivals and computation

Page 60

STOB black box optimization [24]

• Learn the optimal waiting time for a batch of size b
  – Computed at the sequencer

• Hill climbing for each value of b
  – In/decrease the wait time @b depending on feedback

• When delivering a batch of size b
  – Confirm the previous decision if the delivery time is lower
  – Revert the previous decision if the delivery time is higher

Page 61

Hill Climbing in STOB

• But limited expressiveness:
  – Self-tuning at the cost of no predictability

[Figure: message arrival rate (msgs/sec) and the selected batch level over time (sec)]

Page 62

Transactional NoSQL store case study

• Distributed transactional data store
  – Nodes maintain elements of a dataset
    • Full vs. partial replication (# copies per item)
  – Transactional, ACI(D), manipulation of data
    • Concurrency control scheme (enforces isolation)
    • Replication protocol (disseminates modifications)

Page 63

Replication protocols: which one?

[Diagram: taxonomy of transactional data consistency protocols]
• Single master (primary-backup)
• Multi master
  – 2PC-based
  – Total order-based
    • State machine replication
    • Certification
      – Non-voting
      – BFC
      – Voting

Page 64

DSTM Performance

[Figure: committed transactions/sec vs. number of nodes (2–10) for RG-Small, RG-Large, and TPC-C]

• Heterogeneous, nonlinear scalability trends!

Page 65

Factors limiting scalability

[Figure, left: commit probability vs. number of nodes (2–10) for RG-Small, RG-Large, and TPC-C: aborted transactions because of conflicts. Right: network RTT latency (microsec) vs. number of nodes: network latency in the commit phase.]

Page 66: Hybrid'Machine' Learning/Analytical'Models'for ...romanop/files/papers/tutorial-icpe15.pdf · STOB&white&box&modeling • Focus&on&performance&on&sequencer • It&is&representative&of&the&whole&system

White box modeling


Simulator [21]

• Assumptions and approximations
– CPU = G/M/K
– Fixed point-to-point network latency
• Accuracy / resolution-time trade-off

Black box modeling

• MorphR [20]
– Automatic switching among replication protocols
• Decision tree classifier (C5.0)
• Workload characterization
– Xact mix, # ops, throughput, abort rate
• Physical resource usage
– CPU, memory, commit latency
• Output: optimal replication protocol

MorphR in action

[Figure: throughput (committed tx/sec) over time (minutes) across workloads TW1, TW2, TW3, comparing PB, 2PC, TOB and MorphR]

Gray box modeling

• Combine WB and BB modeling
– Lower training time thanks to WBM
– Incremental learning thanks to BBM
• Techniques in this tutorial
– Divide et impera
– Bootstrapping
– Hybrid ensembling

Gray box modeling

• Techniques in this tutorial
– Divide et impera
– Bootstrapping
– Hybrid ensembling

Divide et impera

• Modular approach
– WBM of what is observable/easy to model
– BBM of what is un-observable or too complex
• Reconcile their outputs in a single function
• Higher accuracy in extrapolation thanks to WBM
• Apply BBM only to a sub-problem
– Fewer features, lower training time

NoSQL optimization in the Cloud

• Important to model network-bound ops, but…
• The Cloud hides details about the network!
– No topology info
– No service demand info
– Additional overhead of the virtualization layer
• BBM of network-bound ops' performance
– Train the ML on the target platform

TAS/PROMPT [28,30]

• Analytical modeling
– Concurrency control scheme
• E.g., encounter-time vs commit-time locking
– Replication protocol
• E.g., PB vs 2PC
– Replication scheme
• Partial vs full
– CPU
• Machine learning
– Network-bound ops (prepare, remote gets)
– Decision tree regressor

Analytical model in TAS/PROMPT

• Concurrency control scheme (lock-based)
– A lock is an M/G/1 server
– Conflict probability = utilization of the server
• Replication protocol
– 2PC: all nodes are similar → one model
– PB: primary vs backups → two models
• Replication scheme
– Probability of accessing remote data
– # nodes involved in commit
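To make the M/G/1 abstraction concrete, here is a toy sketch (not the actual TAS/PROMPT equations, and all parameter values are invented): under uniform access, the utilization of a per-item lock approximates the probability that an incoming access finds the item locked.

```python
def lock_conflict_prob(tx_rate, writes_per_tx, num_items, hold_time):
    """Approximate the per-access conflict probability.

    tx_rate       -- transactions per second in the system
    writes_per_tx -- average write accesses per transaction
    num_items     -- dataset size (uniform access assumed)
    hold_time     -- mean lock hold time in seconds
    """
    # Arrival rate of lock requests at a single item (the M/G/1 "server").
    lam = tx_rate * writes_per_tx / num_items
    # Utilization rho = lambda * E[service time]; by PASTA, this is also
    # the probability that an arriving request finds the lock busy.
    rho = lam * hold_time
    return min(rho, 1.0)

print(lock_conflict_prob(tx_rate=5000, writes_per_tx=5,
                         num_items=100000, hold_time=0.002))  # 0.0005
```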


Machine Learning in TAS/PROMPT

• Decision tree regressor
• Operation-specific models
– Latency during prepare
– Latency to retrieve remote data
• Input
– Operation rates (prepare, commit, remote get…)
– Size of messages
– # nodes involved in commit

ML accuracy for network-bound ops

[Figure: predicted vs. real Tprep (µsec) on EC2 and on a private cluster]

• Seamlessly portable across infrastructures
– Here, private cloud and Amazon EC2

AM and ML coupling

• At training time, all features are monitorable
• At query time they are NOT!

EXAMPLE
• Current config: 5 nodes, full replication
– Contact all 5 nodes at commit
• Query config: 10 nodes, partial replication
– How many contacted nodes at commit??

Model resolution

• The AM can provide (estimates of) the missing input
• Iterative coupling scheme:
– The ML takes some input parameters from the AM
– The AM takes the latencies forecast by the ML as input parameters
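The iterative coupling can be sketched as a fixed-point loop. The two models below are toy stand-ins (not TAS/PROMPT): the "AM" needs a network latency to predict throughput and a load feature, the "ML" maps that load back to a latency, and the loop iterates until the exchanged latency stabilizes.

```python
def am_predict(nodes, net_latency):
    # Toy white-box model: throughput drops as commit latency grows;
    # the prepare-message rate (an ML input feature) grows with throughput.
    throughput = nodes * 1000.0 / (1.0 + net_latency)
    prepare_rate = 0.2 * throughput
    return throughput, prepare_rate

def ml_predict(prepare_rate):
    # Toy black-box model: latency grows with network load.
    return 0.5 + 0.0005 * prepare_rate

def solve_coupled(nodes, tol=1e-6, max_iter=100):
    latency = 2.0                                  # initial guess
    for _ in range(max_iter):
        throughput, prepare_rate = am_predict(nodes, latency)
        new_latency = ml_predict(prepare_rate)     # ML refines the latency
        if abs(new_latency - latency) < tol:       # fixed point reached
            break
        latency = new_latency
    return throughput, latency

tput, lat = solve_coupled(10)
print(round(lat, 3))   # the exchanged latency has converged
```

With these toy functions the update is a contraction, so the loop converges in a handful of iterations; real AM/ML pairs need not be so well behaved, which is why a maximum iteration count is kept.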


Model's accuracy

[Figures: commit probability and throughput (tx/sec) vs. number of nodes on EC2, real vs. predicted, for TPCC-W and TPCC-R. TOP: PB, only master node. BOTTOM: 2PC. Full replication.]

COMPARISON WITH PURE BLACK, I

• YCSB (transactified) workloads while varying
– # operations/tx
– Transactional mix
– Scale
– Replication degree

[Figure: mean relative error vs. percentage of additional training set for Cubist, M5R, SMOReg, MLP, PROMPT and Divide et Impera]

COMPARISON WITH PURE BLACK, II

• ML trained with TPCC-R and queried for TPCC-W
• Pure ML blunders when faced with new workloads

[Figure: throughput (tx/sec) vs. number of nodes (real, TAS and pure ML)]

TAS/PROMPT integration

• TAS/PROMPT are the baseline AMs for the case studies
• We will use TAS/PROMPT as a pure white AM
– Trained with a fixed network model
– i.e., we do not retrain it as new data are collected (but it is possible)
– Representative of pure white box models

Gray box modeling

• Techniques in this tutorial
– Divide et impera
– Bootstrapping
– Hybrid ensembling

BOOTSTRAPPING [27]

• Obtain a zero-training-time ML via an initial AM
1. Build the initial (synthetic) training set of the ML from the AM
2. Retrain periodically with "real" samples

(1) Analytical model → sampling of the parameter space → bootstrapping training set → machine learning (model construction) → gray box model
(2) New data come in → current training set → machine learning → gray box model
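A minimal, self-contained sketch of the two phases (a toy AM and a hand-rolled 1-NN regressor stand in for the real models; all numbers are invented):

```python
def toy_am(nodes, threads):
    # Stand-in analytical model: a closed-form throughput estimate.
    return nodes * threads * 100.0 / (1.0 + 0.05 * nodes)

def knn1_predict(dataset, x):
    # 1-NN regressor standing in for the black-box learner.
    return min(dataset, key=lambda s: (s[0] - x[0])**2 + (s[1] - x[1])**2)[2]

# (1) Bootstrap: synthetic training set sampled from the AM over a grid
# of configurations (2-10 nodes, 1-8 threads per node).
dataset = [(n, t, toy_am(n, t)) for n in range(2, 11) for t in range(1, 9)]
print(knn1_predict(dataset, (12, 4)))   # extrapolation: AM-based estimate

# (2) Update (merge policy): fold in real measurements as they arrive.
dataset.append((12, 4, 2100.0))         # hypothetical real sample
print(knn1_predict(dataset, (12, 4)))   # now reflects the measurement: 2100.0
```

Before the update, the query at 12 nodes falls back on what the learner absorbed from the AM; after one real measurement, the prediction at that configuration follows the data.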


How many synthetic samples?

[Figure: MAPE and training time (sec, log scale) vs. training set size for the KVS and TOB case studies]

• Important tradeoff
– Higher # → lower fitting error over the AM output
– Lower # → higher density of real samples in the dataset

How to update

• Merge: simply add real samples to the synthetic set
• Replace only the nearest neighbor (RNN)
• Replace neighbors in a given region (RNR)
– Two variants
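The policies can be sketched as follows (a hypothetical helper written from the descriptions in these slides, not from a published implementation; in particular, RNR2's "overwrite outputs in the radius" behavior is our reading of the slides):

```python
def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def update(dataset, x, y, policy, radius=0.0):
    # dataset entries are (features, output, is_synthetic) tuples
    if policy == "rnn":       # evict the synthetic sample nearest to x
        synth = [s for s in dataset if s[2]]
        if synth:
            dataset.remove(min(synth, key=lambda s: dist(s[0], x)))
    elif policy == "rnr":     # evict all synthetic samples within the radius
        dataset[:] = [s for s in dataset
                      if not (s[2] and dist(s[0], x) <= radius)]
    elif policy == "rnr2":    # overwrite outputs of synthetic samples in the radius
        dataset[:] = [(s[0], y, s[2]) if s[2] and dist(s[0], x) <= radius else s
                      for s in dataset]
    # "merge" does none of the above: the real sample is simply added
    dataset.append((x, y, False))

ds = [((1.0,), 10.0, True), ((2.0,), 20.0, True), ((3.0,), 30.0, True)]
update(ds, (2.1,), 24.0, "rnn")   # evicts the synthetic sample at (2.0,)
print(len(ds))                     # 3: two synthetic + one real
```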


Real vs AM function

[Diagram: the real function vs. the AM function]

Real vs learnt

• Assuming enough points to perfectly learn the AM

[Diagram: synthetic samples and the learnt ML function]

Merge

• Add real samples to the synthetic set

[Diagram: a real sample added on top of the synthetic ones]

Merge

• Problem: same/nearby samples have different outputs

Replace Nearest Neighbor (RNN)

• Remove the nearest neighbor

Replace Nearest Neighbor (RNN)

• Preserves the distribution…

Replace Nearest Neighbor (RNN)

• … but may induce alternating outputs

Replace Nearest Region (RNR)

• Add real and remove synthetic samples within a radius

Replace Nearest Region (RNR)

• R = radius defining the neighborhood


Replace Nearest Region (RNR)

• Skews the samples' distribution

Replace Nearest Region 2 (RNR2)

• Replace all synthetic samples within a radius R

Replace Nearest Region 2 (RNR2)

• Maintains the distribution, piecewise approximation

Weighting

• Give more relevance to some samples
• Fit the model better around real samples
– "Trust" real samples more than synthetic ones
– Useful especially in Merge
• Too high a weight can cause over-fitting!
– Learner too specialized in only some regions
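As a toy illustration of the weighting idea (hand-rolled weighted least squares; the data here are invented, not taken from the tutorial's experiments): the synthetic samples come from a biased AM with slope 2, the real samples follow slope 3, and raising the weight w of the real samples pulls the fit toward them.

```python
def wls_line(points, weights):
    # Weighted least-squares fit of y = a*x + b.
    sw = sum(weights)
    mx = sum(w * x for (x, _), w in zip(points, weights)) / sw
    my = sum(w * y for (_, y), w in zip(points, weights)) / sw
    sxy = sum(w * (x - mx) * (y - my) for (x, y), w in zip(points, weights))
    sxx = sum(w * (x - mx) ** 2 for (x, _), w in zip(points, weights))
    a = sxy / sxx
    return a, my - a * mx

synthetic = [(float(x), 2.0 * x) for x in range(1, 6)]   # biased AM: slope 2
real = [(2.0, 6.0), (4.0, 12.0)]                         # measurements: slope 3
for w in (1, 10, 100):
    slope, _ = wls_line(synthetic + real, [1] * 5 + [w] * 2)
    print(w, round(slope, 2))   # slope drifts from ~2.17 toward 3 as w grows
```

With extreme weights the fit effectively ignores the synthetic set, which is exactly the over-fitting risk flagged above.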


Merge, TOB

• Weighting real samples more reduces error

[Figure: MAPE vs. weight with 1K (left) and 10K (right) synthetic samples, for Merge 20%, Merge 70%, CUB 20%, CUB 70% and the AM]

Replace, KVS

• Examples of over-fitting

[Figure: MAPE vs. weight with 20% (left) and 70% (right) additional training, for RNR2 0.01, RNR2 0.3, RNR 0.1, RNR 0.3, RNN, the AM and CUB]

MERGE VS REPLACE (TOB)

• Assuming optimal parameterization
• Merge and Replace seem *very* similar…

[Figure: MAPE vs. additional training % for Merge, RNR2, CUB and the AM]

Impact of base model (TOB)

• … BUT Replace is better if the base model is poor
– It evicts synthetic samples more aggressively

[Figure: MAPE vs. additional training % for Merge, RNR2, CUB and the AM, with a poor base model]

Visualizing the correction (STOB)

[Plots: BASE MODEL, PURE ML (70% TS), BOOTSTRAPPED ML (70% TS)]

Visualizing the correction (KVS)

[Plots: BASE MODEL, PURE ML (70% TS), BOOTSTRAPPED ML (70% TS)]

BOOTSTRAPPING in RL [59]

• Optimize the batching level in STOB
• Base AM already presented

Hybrid RL in STOB

• UCB: find the optimal batch size (b*) for a given msg. arrival rate (m)
– Discretize the m domain into M = {m_min … m_max}
– A UCB instance for each m_i
– For each instance, a lever for each b
• Initial rewards are determined via the AM
– The convergence speed of UCB is insufficient at high arrival rates:
• Enhance convergence speed using the initial knowledge of the AM
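A sketch of one AM-seeded UCB1 instance (all reward numbers are hypothetical, and the actual algorithm in [59] may differ in details): each lever's statistics start from the AM's predicted reward rather than from scratch, so early pulls already favor promising batch levels.

```python
import math, random

random.seed(1)
batch_levels = [1, 2, 4, 8, 16]
am_reward = [0.3, 0.5, 0.9, 0.7, 0.4]   # rewards predicted by the AM per lever
counts = [1] * len(batch_levels)        # seed: as if each lever was pulled once...
means = list(am_reward)                 # ...and returned the AM's prediction

def ucb_pick(t):
    # UCB1 index: empirical mean + exploration bonus.
    return max(range(len(means)),
               key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))

def observe(i):
    # Hidden environment (for the demo only): the AM slightly mis-ranks levers.
    return [0.35, 0.55, 0.80, 0.85, 0.50][i] + random.gauss(0, 0.01)

for t in range(2, 500):
    i = ucb_pick(t)
    r = observe(i)
    counts[i] += 1
    means[i] += (r - means[i]) / counts[i]   # incremental mean update

best = max(range(len(means)), key=lambda i: means[i])
print(batch_levels[best])
```

Even though the AM mis-ranks the two best levers, the seeded instance needs only a few real pulls to correct the ordering, rather than starting exploration from zero knowledge.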


Bootstrapped model

• Enhances response time via better batching
• Faster convergence than UCB (and no thrashing)

Gray box modeling

• Techniques in this tutorial
– Divide et impera
– Bootstrapping
– Hybrid ensembling

Hybrid Ensemble [26]

• Combine the outputs of AM and ML
– Hybrid boosting: correct the errors of the single models
– KNN: select the best model depending on the query
– Probing: train the ML only where the AM is not accurate

Hybrid Boosting

• Implements Logistic Additive Regression
• Chain composed of the AM + a cascade of MLs
• ML1 is trained over the residual error of the AM
• MLi, i>1, is trained over the residual error of MLi-1
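A minimal sketch of the chain (a trivial constant learner stands in for the DT/ANN/SVM stages so the example stays self-contained): stage 0 is the AM, and each ML stage fits the residual left by the chain so far.

```python
def fit_constant(residuals):
    # Stand-in learner: predicts the mean residual everywhere.
    m = sum(residuals) / len(residuals)
    return lambda x: m

def fit_chain(X, y, am, n_stages=3):
    preds = [am(x) for x in X]                # stage 0: the analytical model
    stages = []
    for _ in range(n_stages):
        res = [yi - pi for yi, pi in zip(y, preds)]
        h = fit_constant(res)                 # ML_i fits the chain's residual
        stages.append(h)
        preds = [p + h(x) for p, x in zip(preds, X)]
    return lambda x: am(x) + sum(h(x) for h in stages)

am = lambda x: 2.0 * x                        # biased AM; the truth is 2x + 5
X, y = [1.0, 2.0, 3.0], [7.0, 9.0, 11.0]
model = fit_chain(X, y, am)
print(model(10.0))   # 25.0: the chain has learned the AM's constant bias
```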


BOOSTING: sensitivity

• Chain of 3 BBMs (> 3 were useless here)
– DT, ANN, SVM

[Figure: RMSE normalized w.r.t. Γ_AM vs. percentage of additional training set, Cubist vs. HyBoost, on two case studies]

Online variant of HyBoost

• Self-correcting Transactional Auto Scaler (SC-TAS) [28]
• Identifies the optimal level of parallelism in a distributed NoSQL transactional store
– # nodes in the platform
– # threads active on each node

Parallelism tuning in DTM

• Why not use a simpler exploration-based approach, e.g., hill-climbing?
– Adapting the number of threads per node is simple and effective
– Changing the # of nodes is costly: state transfer!
• Model-based solution
– Input: workload, # nodes, # threads/node
– Output: throughput
– Obtain: the highest-throughput configuration

Final Workshop of the Euro-TM COST Action, Amsterdam, 2015 Jan. 19th

Implemented solution: SC-TAS

• Exploration + modeling + machine learning
1. Explore to gather feedback on the model's accuracy
2. LEARN corrective functions to "patch" the model
• Try to avoid global reconfiguration (# nodes)
– Rely on local # threads exploration (cheap)
• Increase accuracy

SC-TAS control loop

[Loop: Collect Data → Update SC-TAS → End exploration? → NO: explore locally (local scaling) / YES: invoke SC-TAS (global scaling)]

• Collect: workload, # threads, # nodes, TAS' error

SC-TAS control loop

[Loop: Collect Data → Update SC-TAS → End exploration? → NO: explore locally (local scaling) / YES: invoke SC-TAS (global scaling)]

• Update: re-train the HyBoost "patching" ML

SC-TAS control loop

[Loop: Collect Data → Update SC-TAS → End exploration? → NO: explore locally (local scaling) / YES: invoke SC-TAS (global scaling)]

• End exploration? YES if min < # thread explorations < max
• …and the accuracy of the patched model is considered "OK"

SC-TAS control loop

[Loop: Collect Data → Update SC-TAS → End exploration? → NO: explore locally (local scaling) / YES: invoke SC-TAS (global scaling)]

• The patch is not enough
• Change the # of threads and repeat

SC-TAS control loop

[Loop: Collect Data → Update SC-TAS → End exploration? → NO: explore locally (local scaling) / YES: invoke SC-TAS (global scaling)]

• The model is supposedly patched
• Invoke the patched model

Dynamics of SC-TAS

[Fig. 6: average of the relative absolute error across all scenarios vs. number of iterations, for µ = 3, 5, 7]

[Fig. 7: forecasts of SC-TAS after a full optimization cycle (µ = 7): throughput (tx/sec) vs. number of nodes, real vs. TAS vs. SC-TAS, for 2, 4 and 8 threads per node]

Excerpt from the SC-TAS paper shown on the slide: "…concerning the prediction errors of TAS, can significantly improve its accuracy both in explored and unexplored scenarios, lowering the average relative absolute error across all scenarios from 103% to 5%. These results highlight that the self-correcting capabilities of SCTAS can be beneficial not only to optimize the multiprogramming level of DTM applications, but also in applications (such as QoS/cost driven elastic scaling policies and what-if analysis of the scalability of DTM applications) requiring to speculate on the performance of the platform in different scale settings."

• µ = min # of thread explorations per node


SC-TAS: before and after

[Figures: throughput (tx/sec) vs. number of nodes, real vs. predicted, for 2, 4 and 8 threads per node, before and after the correction]

Hybrid Ensemble [26]

• Combine the outputs of AM and ML
– Hybrid boosting: correct the errors of the single models
– KNN: select the best model depending on the query
– Probing: train the ML only where the AM is not accurate

Hybrid KNN

• Split D into D', D''
• Train ML1 … MLN on D'
– The MLs can differ in nature, parameters, training set…
• For a query sample x
– Pick the K training samples in D'' closest to x
– Find the model with the lowest error on those K samples
– Use that model to predict x
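A sketch with hypothetical toy models: D'' acts as a validation set, and the model that is most accurate on the K validation samples nearest to the query answers it.

```python
def knn_select(models, d2, x, k=3):
    # K validation samples (features, output) closest to the query x.
    near = sorted(d2, key=lambda s: sum((a - b) ** 2
                                        for a, b in zip(s[0], x)))[:k]
    # Pick the model with the lowest absolute error on those samples.
    return min(models, key=lambda m: sum(abs(m(f) - o) for f, o in near))

am = lambda f: 10.0 * f[0]          # toy AM: accurate at small scale
ml = lambda f: 8.0 * f[0] + 20.0    # toy ML: accurate at large scale
# Validation set D'': the truth follows the AM below x=5 and the ML above.
d2 = ([((x,), 10.0 * x) for x in (1.0, 2.0, 3.0)] +
      [((x,), 8.0 * x + 20.0) for x in (8.0, 9.0, 10.0)])

best = knn_select([am, ml], d2, (2.0,))
print(best((2.0,)))   # 20.0: the AM wins near small configurations
```

The same call with a large-scale query, e.g. `knn_select([am, ml], d2, (9.0,))`, selects the ML instead.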


KNN, sensitivity (TOB)

• Low cut-off & low % training → collapse to the AM
• High cut-off & high % training → collapse to the ML

[Figure: RMSE normalized w.r.t. Γ_AM vs. c, for Cubist and KNN with 20%, 30%, 50% and 80% additional training]

Hybrid Ensemble [26]

• Combine the outputs of AM and ML
– Hybrid boosting: correct the errors of the single models
– KNN: select the best model depending on the query
– Probing: train the ML only where the AM is not accurate

PROBING

• Build an ML model as specialized as possible
– Use the AM where it is accurate
– Train the ML only where the AM fails
• Differences with KNN
– In KNN, the ML is trained on all samples:
• Here, only where the AM is found to be inaccurate
– In KNN, voting decides between ML and AM:
• Here, a binary classifier determines in which regions the AM is inaccurate

Probing at work

1. DML = empty set
2. Train a classifier: for each x in D
– If the error of the AM on x > cut-off, map x to ML and add x to DML
– Else map x to AM
3. Train the ML on DML

• QUERY for input z
– If classify(z) = AM, return AM(z); else return ML(z)
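The steps above can be sketched compactly (hypothetical helper names; a nearest-neighbor rule stands in for the binary classifier):

```python
def build_probing(D, am, fit_ml, cutoff=0.1):
    # Samples on which the AM's relative error exceeds the cut-off.
    d_ml = [(x, y) for x, y in D
            if abs(am(x) - y) / max(abs(y), 1e-9) > cutoff]
    ml = fit_ml(d_ml) if d_ml else am
    ml_region = {x for x, _ in d_ml}
    def classify(z):
        # Stand-in binary classifier: the nearest training sample decides.
        nearest = min(D, key=lambda s: abs(s[0] - z))[0]
        return "ML" if nearest in ml_region else "AM"
    return lambda z: ml(z) if classify(z) == "ML" else am(z)

am = lambda x: 2.0 * x                      # toy AM: accurate only for x <= 5
D = [(x, 2.0 * x if x <= 5 else 3.0 * x) for x in range(1, 11)]
fit_ml = lambda d: (lambda x: 3.0 * x)      # stand-in learner for the ML region
model = build_probing(D, am, fit_ml)
print(model(4.0), model(8.0))   # 8.0 24.0: AM below the split, ML above
```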


Probing Sensitivity (KVS)

• High cut-off → collapses to the AM
• Low cut-off → collapses to the ML

[Figure: RMSE normalized w.r.t. Γ_AM vs. c, for Cubist and Prob with 50% and 80% additional training]

NFL strikes again

• No one-size-fits-all hybrid model exists!
• Choosing the best hybrid model (with the right parameters) can be cast as a parameter optimization problem

[Figure: RMSE normalized w.r.t. Γ_AM vs. percentage of additional training set for Cubist, KNN, HyBoost and Prob, on two case studies]

Concluding remarks

• WBM and BBM are often conceived as antithetic
• They can be leveraged synergistically
– Increased predictive power thanks to WBM
– Incremental learning capabilities thanks to BBM
• Gray box approaches
– Divide et impera, Bootstrapping, Hybrid ensembling
– Design, implementation and use cases
– Can deliver better accuracy than pure black/white models

THANK YOU

[email protected]

www.gsd.inesc-id.pt/~didona


Hybrid Machine Learning/Analytical Models for Performance Prediction: Bibliography

Diego Didona and Paolo Romano
INESC-ID / Instituto Superior Tecnico, Universidade de Lisboa

White-box performance modeling: principles, applications and fundamental results.
[45] [49] [50] [71] [46] [4] [39] [44] [58] [51] [36]

Principles of machine learning.
[8] [5] [80] [65] [66] [48] [34] [9] [79] [10] [70] [41] [3] [77] [62] [16] [33] [54] [55] [47] [56] [53] [42] [76] [2]

ML ensembling, feature selection and hyper-parameter optimization.
[32] [12] [74] [6] [69] [40]

Application of ML to performance modeling.
[13] [60] [18] [20] [15] [59] [31] [75] [38] [37] [68] [1] [82] [81] [57]

Divide et impera.
[30] [43] [28] [22]

Bootstrapping.
[73] [59] [67] [72] [61] [27]

Hybrid ensembling.
[26] [25] [14]

Case studies: introduction and performance modeling/optimization.
[11] [35] [63] [24] [18] [21] [52] [7] [29] [17] [19] [23] [64] [20] [78] [83] [84] [85] [86] [87] [73]


References

[1] Mert Akdere, Ugur Cetintemel, Matteo Riondato, Eli Upfal, and Stanley B. Zdonik. Learning-based query performance modeling and prediction. In Proceedings of the 2012 IEEE 28th International Conference on Data Engineering, ICDE '12, pages 390–401, Washington, DC, USA, 2012. IEEE Computer Society.

[2] Naomi S. Altman. An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3):175–185, 1992.

[3] Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 2002.

[4] Forest Baskett, K. Mani Chandy, Richard R. Muntz, and Fernando G. Palacios. Open, closed, and mixed networks of queues with different classes of customers. J. ACM, 22(2):248–260, April 1975.

[5] Richard Bellman. Dynamic Programming. Princeton University Press, Princeton, NJ, 1957.

[6] James Bergstra and Yoshua Bengio. Random search for hyper-parameter optimization. J. Mach. Learn. Res., 13(1):281–305, February 2012.

[7] Philip A. Bernstein and Nathan Goodman. Concurrency control in distributed database systems. ACM Computing Surveys (CSUR), 13(2):185–221, 1981.

[8] Christopher M. Bishop. Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag New York, Inc., 2006.

[9] Leo Breiman. Bagging predictors. Mach. Learn., 24(2):123–140, August 1996.

[10] Leo Breiman. Stacked regressions. Machine Learning, 24(1):49–64, 1996.

[11] Christian Cachin, Rachid Guerraoui, and Luís Rodrigues. Introduction to Reliable and Secure Distributed Programming (2nd ed.). Springer, 2011.

[12] Rich Caruana, Alexandru Niculescu-Mizil, Geoff Crew, and Alex Ksikes. Ensemble selection from libraries of models. In Proc. of ICML, 2004.

[13] Jin Chen, G. Soundararajan, and C. Amza. Autonomic provisioning of back-end databases in dynamic content web servers. In Proceedings of the 2006 IEEE International Conference on Autonomic Computing, ICAC '06, pages 231–242, Washington, DC, USA, 2006. IEEE Computer Society.


[14] Jin Chen, G. Soundararajan, S. Ghanbari, and C. Amza. Model ensemble tools for self-management in data centers. In Data Engineering Workshops (ICDEW), 2013 IEEE 29th International Conference on, pages 36–43, April 2013.

[15] Tianshi Chen, Qi Guo, Ke Tang, Olivier Temam, Zhiwei Xu, Zhi-Hua Zhou, and Yunji Chen. Archranker: A ranking approach to design space exploration. SIGARCH Comput. Archit. News, 42(3):85–96, June 2014.

[16] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Machine Learning, 1995.

[17] Maria Couceiro, Diego Didona, Luís Rodrigues, and Paolo Romano. Self-tuning in distributed transactional memory. In Rachid Guerraoui and Paolo Romano, editors, Transactional Memory. Foundations, Algorithms, Tools, and Applications, volume 8913 of Lecture Notes in Computer Science, pages 418–448. Springer International Publishing, 2015.

[18] Maria Couceiro, Paolo Romano, and Luis Rodrigues. A machine learning approach to performance prediction of total order broadcast protocols. In Self-Adaptive and Self-Organizing Systems (SASO), 2010 4th IEEE International Conference on, pages 184–193. IEEE, 2010.

[19] Maria Couceiro, Paolo Romano, and Luis Rodrigues. Polycert: Polymorphic self-optimizing replication for in-memory transactional grids. In Proceedings of the 12th ACM/IFIP/USENIX International Conference on Middleware, Middleware '11, pages 309–328, Berlin, Heidelberg, 2011. Springer-Verlag.

[20] Maria Couceiro, Pedro Ruivo, Paolo Romano, and Luis Rodrigues. Chasing the optimum in replicated in-memory transactional platforms via protocol adaptation. In Proceedings of the 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), DSN '13, pages 1–12, Washington, DC, USA, 2013. IEEE Computer Society.

[21] Pierangelo Di Sanzo, Francesco Antonacci, Bruno Ciciani, Roberto Palmieri, Alessandro Pellegrini, Sebastiano Peluso, Francesco Quaglia, Diego Rughetti, and Roberto Vitali. A framework for high performance simulation of transactional data grid platforms. In Proceedings of the 6th International ICST Conference on Simulation Tools and Techniques, SimuTools '13, pages 63–72, ICST, Brussels, Belgium, 2013. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering).


[22] Pierangelo Di Sanzo, Francesco Quaglia, Bruno Ciciani, Alessandro Pellegrini, Diego Didona, Paolo Romano, Roberto Palmieri, and Sebastiano Peluso. A flexible framework for accurate simulation of cloud in-memory data stores. ArXiv e-prints, December 2014.

[23] Pierangelo Di Sanzo, Diego Rughetti, Bruno Ciciani, and Francesco Quaglia. Auto-tuning of cloud-based in-memory transactional data grids via machine learning. In Proceedings of the 2012 Second Symposium on Network Cloud Computing and Applications, NCCA '12, pages 9–16, Washington, DC, USA, 2012. IEEE Computer Society.

[24] Diego Didona, Daniele Carnevale, Sergio Galeani, and Paolo Romano. An extremum seeking algorithm for message batching in total order protocols. In SASO, pages 89–98. IEEE Computer Society, 2012.

[25] Diego Didona, Pascal Felber, Derin Harmanci, Paolo Romano, and Joerg Schenker. Identifying the optimal level of parallelism in transactional memory applications. Computing Journal, pages 1–21, December 2013.

[26] Diego Didona, Francesco Quaglia, Paolo Romano, and Ennio Torre. Enhancing performance prediction robustness by combining analytical modeling and machine learning. In Proceedings of the 2015 ACM/SPEC 6th International Conference on Performance Engineering (ICPE 2015), ICPE '15, 2015.

[27] Diego Didona and Paolo Romano. On Bootstrapping Machine Learning Performance Predictors via Analytical Models. ArXiv e-prints, October 2014.

[28] Diego Didona and Paolo Romano. Performance modelling of partially replicated in-memory transactional stores. In Proceedings of the 22nd IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2014), MASCOTS '14, 2014.

[29] Diego Didona and Paolo Romano. Self-tuning transactional data grids: The cloud-tm approach. In Proceedings of the Symposium on Network Cloud Computing and Applications (NCCA), pages 113–120. IEEE, 2014.

[30] Diego Didona, Paolo Romano, Sebastiano Peluso, and Francesco Quaglia. Transactional auto scaler: Elastic scaling of replicated in-memory transactional data grids. ACM Trans. Auton. Adapt. Syst., 9(2):11:1–11:32, July 2014.

[31] Nuno Diegues and Paolo Romano. Self-tuning Intel transactional synchronization extensions. In 11th International Conference on Autonomic Computing (ICAC 14), pages 209–219, Philadelphia, PA, June 2014. USENIX Association.

[32] Thomas G. Dietterich. Ensemble methods in machine learning. In Proc. of MCS Workshop, 2000.

[33] Harris Drucker, Chris J. C. Burges, Linda Kaufman, Alex Smola, and Vladimir Vapnik. Support vector regression machines. In Advances in Neural Information Processing Systems 9, volume 9, pages 155–161, 1997.

[34] Jerome H. Friedman. Stochastic gradient boosting. Comput. Stat. Data Anal., 38(4):367–378, February 2002.

[35] Roy Friedman and Robbert Van Renesse. Packing messages as a tool for boosting the performance of total ordering protocols. In Proceedings of the 6th IEEE International Symposium on High Performance Distributed Computing, HPDC '97, pages 233–, Washington, DC, USA, 1997. IEEE Computer Society.

[36] Richard M. Fujimoto. Parallel discrete event simulation. Commun. ACM, 33(10):30–53, October 1990.

[37] Archana Ganapathi, Harumi Kuno, Umeshwar Dayal, Janet L. Wiener, Armando Fox, Michael Jordan, and David Patterson. Predicting multiple metrics for queries: Better decisions enabled by machine learning. In Proceedings of the 2009 IEEE International Conference on Data Engineering, ICDE '09, pages 592–603, Washington, DC, USA, 2009. IEEE Computer Society.

[38] Saeed Ghanbari, Gokul Soundararajan, Jin Chen, and Cristiana Amza. Adaptive learning of metric correlations for temperature-aware database provisioning. In Proceedings of the Fourth International Conference on Autonomic Computing, ICAC '07, pages 26–, Washington, DC, USA, 2007. IEEE Computer Society.

[39] Donald Gross, John F. Shortle, James M. Thompson, and Carl M. Harris. Fundamentals of queueing theory. John Wiley & Sons, 2013.

[40] Isabelle Guyon and Andre Elisseeff. An introduction to variable and feature selection. The Journal of Machine Learning Research, 3:1157–1182, 2003.

[41] Martin T. Hagan, Howard B. Demuth, and Mark Beale. Neural Network Design. PWS Publishing Co., Boston, MA, USA, 1996.


[42] Mark Hall et al. The weka data mining software: An update. SIGKDD Explor. Newsl., 11(1):10–18, November 2009.

[43] Herodotos Herodotou, Fei Dong, and Shivnath Babu. No one (cluster) size fits all: automatic cluster sizing for data-intensive analytics. In Proc. of the ACM Symposium on Cloud Computing (SOCC), 2011.

[44] James R. Jackson. Networks of waiting lines. Operations Research, 5(4):518–521, 1957.

[45] Leonard Kleinrock. Queueing Systems, volume I: Theory. Wiley Interscience, 1975.

[46] John D. C. Little. A proof for the queuing formula: L = λW. Operations Research, 9(3):383–387, 1961.

[47] Wei-Yin Loh. Classification and regression trees. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(1):14–23, 2011.

[48] Philip M. Long and Rocco A. Servedio. Random classification noise defeats all convex potential boosters. Mach. Learn., 78(3):287–304, March 2010.

[49] Daniel A. Menasce and Virgilio Almeida. Capacity Planning for Web Services: Metrics, Models, and Methods. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1st edition, 2001.

[50] Daniel A. Menasce, Lawrence W. Dowdy, and Virgilio A. F. Almeida. Performance by Design: Computer Capacity Planning By Example. Prentice Hall PTR, Upper Saddle River, NJ, USA, 2004.

[51] T. Murata. Petri nets: Properties, analysis and applications. Proceedings of the IEEE, 77(4):541–580, April 1989.

[52] Matthias Nicola and Matthias Jarke. Performance modeling of distributed and replicated databases. IEEE Trans. on Knowl. and Data Eng., 2000.

[53] J. R. Quinlan. Learning with continuous classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence (AI), pages 343–348. World Scientific, 1992.

[54] J. R. Quinlan. Improved use of continuous attributes in c4.5. J. Artif. Int. Res., 4(1):77–90, March 1996.

[55] J. R. Quinlan. Learning decision tree classifiers. ACM Comput. Surv., 28(1):71–72, March 1996.


[56] J. Ross Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers Inc., 1993.

[57] Jia Rao, Xiangping Bu, Cheng-Zhong Xu, Leyi Wang, and George Yin. Vconf: A reinforcement learning approach to virtual machines auto-configuration. In Proceedings of the 6th International Conference on Autonomic Computing, ICAC '09, pages 137–146, New York, NY, USA, 2009. ACM.

[58] M. Reiser and S. S. Lavenberg. Mean-value analysis of closed multichain queuing networks. J. ACM, 27(2):313–322, April 1980.

[59] Paolo Romano and Matteo Leonetti. Self-tuning batching in total order broadcast protocols via analytical modelling and reinforcement learning. In International Conference on Computing, Networking and Communications, ICNC, 2011.

[60] Diego Rughetti, Pierangelo Di Sanzo, Bruno Ciciani, and Francesco Quaglia. Machine learning-based self-adjusting concurrency in software transactional memory systems. In Proc. of the International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, MASCOTS, 2012.

[61] Diego Rughetti, Pierangelo Di Sanzo, Bruno Ciciani, and Francesco Quaglia. Analytical/ml mixed approach for concurrency regulation in software transactional memory. In IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, CCGRID, 2014.

[62] G. A. Rummery and M. Niranjan. On-line Q-learning using connectionist systems. Technical Report TR 166, Cambridge University Engineering Department, Cambridge, England, 1994.

[63] Nuno Santos and Andre Schiper. Optimizing paxos with batching and pipelining. Theor. Comput. Sci., 496:170–183, July 2013.

[64] Pierangelo Di Sanzo, Francesco Molfese, Diego Rughetti, and Bruno Ciciani. Providing transaction class-based qos in in-memory data grids via machine learning. In Proceedings of the 2014 IEEE 3rd Symposium on Network Cloud Computing and Applications (NCCA 2014), NCCA '14, pages 46–53, Washington, DC, USA, 2014. IEEE Computer Society.

[65] Robert E. Schapire. The strength of weak learnability. Mach. Learn., 5(2):197–227, July 1990.


[66] Robert E. Schapire. A brief introduction to boosting. In Proceedings of the 16th International Joint Conference on Artificial Intelligence - Volume 2, IJCAI '99, pages 1401–1406, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc.

[67] Bianca Schroeder, Mor Harchol-Balter, Arun Iyengar, Erich Nahum, and Adam Wierman. How to determine a good multi-programming level for external scheduling. In Proc. of the International Conference on Data Engineering, ICDE, 2006.

[68] Karan Singh, Engin Ipek, Sally A. McKee, Bronis R. de Supinski, Martin Schulz, and Rich Caruana. Predicting parallel application performance via machine learning approaches: Research articles. Concurr. Comput.: Pract. Exper., 19(17):2219–2235, December 2007.

[69] Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. Practical bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems, pages 2951–2959, 2012.

[70] Richard S. Sutton and Andrew G. Barto. Introduction to Reinforcement Learning. MIT Press, Cambridge, MA, USA, 1st edition, 1998.

[71] Y. C. Tay. Analytical Performance Modeling for Computer Systems, Second Edition. Morgan & Claypool Publishers, 2013.

[72] Gerald Tesauro, Nicholas K. Jong, Rajarshi Das, and Mohamed N. Bennani. On the use of hybrid reinforcement learning for autonomic resource allocation. Cluster Computing, 2007.

[73] Eno Thereska and Gregory R. Ganger. Ironmodel: Robust performance models in the wild. SIGMETRICS Perform. Eval. Rev., 36, June 2008.

[74] Chris Thornton, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. Auto-weka: Combined selection and hyperparameter optimization of classification algorithms. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '13, pages 847–855, New York, NY, USA, 2013. ACM.

[75] D. Tsoumakos, I. Konstantinou, C. Boumpouka, S. Sioutas, and N. Koziris. Automated, elastic resource provisioning for nosql clusters using tiramola. In Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on, pages 34–41, May 2013.


[76] Laurens J. P. van der Maaten, Eric O. Postma, and H. Jaap van den Herik. Dimensionality reduction: A comparative review. Technical Report TiCC-TR 2009-005, Tilburg University, 2009.

[77] Christopher John Cornish Hellaby Watkins. Learning from delayed rewards. PhD thesis, University of Cambridge, 1989.

[78] Pawel T. Wojciechowski, Tadeusz Kobus, and Maciej Kokocinski. Model-driven comparison of state-machine-based and deferred-update replication schemes. In Proceedings of the 2012 IEEE 31st Symposium on Reliable Distributed Systems, SRDS '12, pages 101–110, Washington, DC, USA, 2012. IEEE Computer Society.

[79] David H. Wolpert. Original contribution: Stacked generalization. Neural Netw., 5(2):241–259, February 1992.

[80] David H. Wolpert. The lack of a priori distinctions between learning algorithms. Neural Comput., 8(7):1341–1390, October 1996.

[81] Pengcheng Xiong, Yun Chi, Shenghuo Zhu, Hyun Jin Moon, Calton Pu, and Hakan Hacigumus. Intelligent management of virtualized resources for database systems in cloud environment. In Proceedings of the 2011 IEEE 27th International Conference on Data Engineering, ICDE '11, pages 87–98, Washington, DC, USA, 2011. IEEE Computer Society.

[82] Pengcheng Xiong, Yun Chi, Shenghuo Zhu, Junichi Tatemura, Calton Pu, and Hakan Hacigumus. Activesla: A profit-oriented admission control framework for database-as-a-service providers. In Proceedings of the 2nd ACM Symposium on Cloud Computing, SOCC '11, pages 15:1–15:14, New York, NY, USA, 2011. ACM.

[83] Steve Zhang, Ira Cohen, Julie Symons, and Armando Fox. Ensembles of models for automated diagnosis of system performance problems. In Proceedings of the 2005 International Conference on Dependable Systems and Networks, DSN '05, pages 644–653, Washington, DC, USA, 2005. IEEE Computer Society.

[84] Ludmila Cherkasova, Kivanc Ozonat, Ningfang Mi, Julie Symons, and Evgenia Smirni. Automated anomaly detection and performance modeling of enterprise applications. ACM Trans. Comput. Syst., 27(3):6:1–6:32, November 2009.


[85] Yongmin Tan, Hiep Nguyen, Zhiming Shen, Xiaohui Gu, Chitra Venkatramani, and Deepak Rajan. Prepare: Predictive performance anomaly prevention for virtualized cloud systems. In Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, ICDCS '12, pages 285–294, Washington, DC, USA, 2012. IEEE Computer Society.

[86] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3):15, 2009.

[87] Ira Cohen, Moises Goldszmidt, Terence Kelly, Julie Symons, and Jeffrey S. Chase. Correlating instrumentation data to system states: A building block for automated diagnosis and control. In Proceedings of the 6th Conference on Symposium on Operating Systems Design & Implementation - Volume 6, OSDI '04, pages 16–16, Berkeley, CA, USA, 2004. USENIX Association.
