35
Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Our target: CFS What can we do ? Results and analysis Conclusion Machine Learning applied to Process Scheduling Benoit Zanotti [email protected] http://www.lse.epita.fr July 17, 2013

Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

  • Upload
    others

  • View
    14

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Machine Learning applied to ProcessScheduling

Benoit Zanotti

[email protected]://www.lse.epita.fr

July 17, 2013

Page 2: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Plan

1 Introduction and definitionsMachine LearningProcess Scheduling

Page 3: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Plan

1 Introduction and definitionsMachine LearningProcess Scheduling

Page 4: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Definition of Machine Learning

DefinitionMachine Learning is a field of Computer Science aboutthe construction and study of systems that can learn fromdata.

Usual organizations of ML algorithms :Supervised learning (classification, ...)Unsupervised learning (clustering, ...)Semi-supervised learning...

Page 5: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Notes about Machine Learning

We won’t talk really about the theory. But:Pretreatment is very important.Usually, big tradeoff between speed and efficiency

In Process Scheduling, those factors will be limiting.

Page 6: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Plan

1 Introduction and definitionsMachine LearningProcess Scheduling

Page 7: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

What is Process Scheduling ?

DefinitionProcess Scheduling is the method by which processes aregiven access to processor time. It is used to achieved multi-tasking.

There is many well-known scheduling algorithms. Forexample:

First In, First OutRound-Robin (fixed time unit, processes in a circle)

Page 8: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitionsMachine Learning

Process Scheduling

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Main concerns

A scheduler has mainly 3 metrics: throughput, latencyand fairness. We can simplify them (in practice) by:

Speed (how much time the scheduler itself uses,number of context-switching, ...)Fairness (giving equal CPU time to each process)Reactivity (are interactive processes given anyadvantages ?)

A scheduler is complicated. Let’s optimize one using ML !

Page 9: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFSInner workings

Advantages/Inconvenients

What can we do ?

Results andanalysis

Conclusion

Plan

2 Our target: CFSInner workingsAdvantages/Inconvenients

Page 10: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFSInner workings

Advantages/Inconvenients

What can we do ?

Results andanalysis

Conclusion

Plan

2 Our target: CFSInner workingsAdvantages/Inconvenients

Page 11: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFSInner workings

Advantages/Inconvenients

What can we do ?

Results andanalysis

Conclusion

Inner workings of CFS

Stands for Completely Fair SchedulerScheduler of Linux since 2.6.23Just an RB-tree with elements indexed by theruntime of the process.Straightforward algorithm: just take the minimum ofthe tree.

CFS in Linux kernel is actually more complicated(handling Real-Time tasks, nice values, ...)

Page 12: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFSInner workings

Advantages/Inconvenients

What can we do ?

Results andanalysis

Conclusion

Why CFS ?

Quite simple and works really wellMost familiar (I implemented one in mikro)Already efficient. I wanted to see what ML could do.

Page 13: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFSInner workings

Advantages/Inconvenients

What can we do ?

Results andanalysis

Conclusion

Plan

2 Our target: CFSInner workingsAdvantages/Inconvenients

Page 14: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFSInner workings

Advantages/Inconvenients

What can we do ?

Results andanalysis

Conclusion

Advantages/Inconvenients

3 Very simple to understand3 Works really well in general cases3 No real corner cases7 A little light on the handling of interactive processes.

Page 15: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Plan

3 What can we do ?ML considerationsApplying ML to the CFS

Page 16: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Plan

3 What can we do ?ML considerationsApplying ML to the CFS

Page 17: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

ML considerations

Restricted to supervised learning (classification andregression mainly)Scheduler must be as fast as possible. Its MLcomponents too.Avoiding complex code in the kernel is often a goodidea.

→ precomputed model/profile for each processes→ no complex methods, results will be mitigated

Page 18: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Plan

3 What can we do ?ML considerationsApplying ML to the CFS

Page 19: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Applying ML to the CFS

Ojective: reducing the number of context switchs:A process time quantum should ideally not finish(process going to sleep)An estimation of the next quantum would helpBased on the N lasts quantumsBe careful not to be too unfair

Note: Many other objectives were possible...

Page 20: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Actual implementation

Proof of ConceptOne using Taylor’s Theorem and one using aclassifierNeed to extract real runtime quantums and to createprofiles

Page 21: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Taylor’s theorem

The sequence of quantums can be seen as a functionof the time.Taylor’s theorem gives an approximation of afunction on a point given its derivativesDiscrete derivation is only substraction

→ an approximation of the next quantum is:

f (x + 1) = f (x) + f ′(x − 1) +f ′′(x − 1)

2

This method is simple and fast, but not very precise.

Page 22: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?ML considerations

Applying ML to the CFS

Results andanalysis

Conclusion

Classifier

Naive Bayes Classifier using the last 4 quantums:It is the best (found) compromise between speed andresultsParameters and output are range of time, not theactual valuesBased on Bayes’ theorem. Outputs the labels withmost probabilityOnly 4 multiplications are needed for each label(there is 10 of them).Using bit manipulation, we can avoid anyconditionals

→ it is fast, but clearly not the most accurate

Page 23: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Plan

4 Results and analysisperf and LinschedMethodology and resultsAnalysis

Page 24: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Plan

4 Results and analysisperf and LinschedMethodology and resultsAnalysis

Page 25: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

perf

perfPerformance analysis tools for LinuxBased on kernel-based performance countersCan be used to extract many scheduling stats

Page 26: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Linsched

LinschedLinux Scheduler Simulator (in userland...)

3 Easy to use (cycle of development, debugging, ...)and fast

3 Can replay records from perf7 Hard to quantify how much time is used by the

scheduler

Page 27: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Plan

4 Results and analysisperf and LinschedMethodology and resultsAnalysis

Page 28: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Methodology of the tests

Use perf to extract records and datasetsUse WEKA to compute profiles for each processTest using vanilla/modified linsched to see the gainTime the tests of vanilla/modified linsched toestimate how costly each method is

Page 29: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

ResultsTi

me

used

(bas

e=10

0)

Results of the simulation (without scheduler time)

100 98 95

Vanilla Extrapolation Classifier0

20

40

60

80

100

120

Page 30: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

ResultsTi

me

used

(bas

e=10

0)

Results of the simulation (with scheduler time)

100 102 98

Vanilla Extrapolation Classifier0

20

40

60

80

100

120

Page 31: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Plan

4 Results and analysisperf and LinschedMethodology and resultsAnalysis

Page 32: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysisperf and Linsched

Methodology and results

Analysis

Conclusion

Analysis

CFS is already quite goodML results are positive but very limitedMore complex pretreatment/ML techniques wouldyield better results... at which cost ?

Page 33: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Plan

5 Conclusion

Page 34: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Conclusion

It was only one idea on one objective.Using ML in scheduling is hard, because of thespeed/results tradeoff

Difficulties for a real kernel integration (passing themodels, limiting abuses, ...)Basic rule in scheduling: "Simpler is Better"Another idea: run a (kernel ?) process every X hoursto compute new profiles...K. Kumar Pusukuri, A. Negi, Applying machinelearning techniques to improve Linux process scheduling,Dec. 2005.

Page 35: Machine Learning applied to Process Scheduling · Machine Learning applied to Process Scheduling Benoit Zanotti Introduction and definitions Machine Learning Process Scheduling Our

Machine Learningapplied to Process

Scheduling

Benoit Zanotti

Introduction anddefinitions

Our target: CFS

What can we do ?

Results andanalysis

Conclusion

Questions ?

Questions ?