Chapter 5 Distributed Process Scheduling. 5.1 A System Performance Model


--Niharika Muriki

Outline

• Need for Scheduling

• Process Interaction Models

• System Performance Model

• Efficiency Loss

• Distribution of Workload

• Comparison of Performance for Workload Sharing

• Latest Relevant Applications

• Future Work

• References

Scheduling

• With many processes running in parallel, scheduling them plays a major role.

• Before execution, processes need to be scheduled and allocated the resources they require.

• Results of scheduling:
  • Overall system performance is enhanced
  • Process completion time is minimized
  • Processor utilization is enhanced
  • Helps achieve location and performance transparency in distributed systems

Issues of Process Scheduling

Process scheduling in distributed systems touches upon several practical considerations that are often omitted in traditional multiprocessor scheduling.

In distributed systems:

• Communication overhead is non-negligible.

• The effect of the underlying architecture cannot be ignored.

• The dynamic behavior of the system must be addressed.

Process Interaction Models

Based on how processes interact, there are three process interaction models:

• Precedence process model

• Communication process model

• Disjoint process model

Process Interaction Models[1](Contd.)

The differences between the three models are illustrated with a simple example: a program computation consisting of four processes mapped to a two-processor multiple computer system.

Precedence Process Model

• Processes are represented by a Directed Acyclic Graph (DAG).

• May incur communication overhead.

• This model is best applied to concurrent processes.

• Use: minimize the total completion time of the task.

Total completion time = computation time + communication time
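The completion-time formula can be sketched in code. The sketch below is only an illustration under stated assumptions: the four tasks, their execution times, the communication delays, and the function name `completion_time` are all hypothetical, and a communication delay is charged only when a task and its predecessor sit on different processors. Tasks on the same processor run one at a time in a topological order (Python 3.9+ `graphlib`).

```python
from graphlib import TopologicalSorter


def completion_time(exec_time, edges, assignment):
    """Finish time of the last task for a fixed task-to-processor assignment.

    exec_time:  {task: execution time}
    edges:      {(pred, succ): delay paid if pred and succ are on different processors}
    assignment: {task: processor id}
    """
    preds = {t: set() for t in exec_time}
    for (u, v) in edges:
        preds[v].add(u)

    finish = {}      # finish time of each task
    proc_free = {}   # time at which each processor becomes free
    for task in TopologicalSorter(preds).static_order():
        proc = assignment[task]
        ready = proc_free.get(proc, 0)
        for p in preds[task]:
            delay = edges[(p, task)] if assignment[p] != proc else 0
            ready = max(ready, finish[p] + delay)
        finish[task] = ready + exec_time[task]
        proc_free[proc] = finish[task]
    return max(finish.values())


# Hypothetical DAG: A precedes B and C, which both precede D.
exec_time = {"A": 2, "B": 3, "C": 4, "D": 1}
edges = {("A", "B"): 2, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 2}

two_procs = completion_time(exec_time, edges, {"A": 1, "B": 1, "C": 2, "D": 2})
one_proc = completion_time(exec_time, edges, {"A": 1, "B": 1, "C": 1, "D": 1})
print(two_procs, one_proc)
```

With these numbers the two-processor mapping gains little over a single processor, because the communication delays eat most of the parallelism.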

Communication Process Model

• Processes communicate asynchronously.

• Goal: optimize the total cost of communication and computation.

• The task is partitioned so as to minimize the interprocessor communication cost and the computation cost of processes on processors.
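A minimal sketch of this cost model, assuming the total cost of a partition is the sum of each process's computation cost on its assigned processor plus the communication cost of every pair of communicating processes placed on different processors. All process names, processor names, costs, and function names below are hypothetical, and the exhaustive search is only practical for very small examples.

```python
from itertools import product


def partition_cost(exec_cost, comm_cost, assignment):
    """Computation cost plus the cost of every edge that crosses processors."""
    compute = sum(exec_cost[p][assignment[p]] for p in assignment)
    communicate = sum(
        c for (a, b), c in comm_cost.items() if assignment[a] != assignment[b]
    )
    return compute + communicate


def best_partition(exec_cost, comm_cost, processors):
    """Try every assignment of processes to processors (brute force)."""
    procs = list(exec_cost)
    best = None
    for choice in product(processors, repeat=len(procs)):
        assignment = dict(zip(procs, choice))
        cost = partition_cost(exec_cost, comm_cost, assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical costs: exec_cost[process][processor], comm_cost[(p, q)].
exec_cost = {
    "p1": {"A": 5, "B": 10},
    "p2": {"A": 2, "B": 4},
    "p3": {"A": 4, "B": 3},
    "p4": {"A": 6, "B": 2},
}
comm_cost = {("p1", "p2"): 6, ("p2", "p3"): 1, ("p3", "p4"): 4, ("p1", "p4"): 2}

best_cost, best_assignment = best_partition(exec_cost, comm_cost, ["A", "B"])
print(best_cost, best_assignment)
```

The heavily communicating pairs end up on the same processor, which is the effect the partitioning goal describes.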

Disjoint Process Model

• Process interaction is implicit.

• Processor utilization is maximized and the turnaround time of the processes is minimized.

• Partitioning a task into multiple processes for execution can result in a speedup of the total task completion time.

System Performance Model

Speedup is a function of:

• Algorithm design

• Underlying system architecture

• Efficiency of the scheduling algorithm

System Performance Model[1]

• S can also be written as:

$$S = \frac{OSPT}{CPT} = \frac{OSPT}{OCPT_{ideal}} \times \frac{OCPT_{ideal}}{CPT} = S_i \times S_d$$

where

• OSPT (optimal sequential processing time): the best time that can be achieved on a single processor using the best sequential algorithm

• CPT (concurrent processing time): the actual time achieved on an n-processor system with the concurrent algorithm and the specific scheduling method being considered

• OCPT_ideal (optimal concurrent processing time on an ideal system): the best time that can be achieved with the concurrent algorithm being considered on an ideal n-processor system (no interprocessor communication overhead) scheduled by an optimal scheduling policy

• S_i: the ideal speedup obtained by using a multiple processor system over the best sequential time

• S_d: the degradation of the system due to the actual implementation compared to an ideal system

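S_i can be further derived as

$$S_i = \frac{RC}{RP} \times n, \qquad RP = \frac{\sum_{i=1}^{m} P_i}{OSPT}, \qquad RC = \frac{\sum_{i=1}^{m} P_i}{OCPT_{ideal} \times n}$$

where

• n = number of processors

• m = number of tasks in the algorithm

• P_i = computation time of task i

• RP = relative processing requirement (RP ≥ 1)

• RC = relative concurrency (RC ≤ 1); RC = 1 means the best use of the processors

S_d can be rewritten as

$$S_d = \frac{1}{1 + \rho}, \qquad \rho = \frac{CPT - OCPT_{ideal}}{OCPT_{ideal}}$$

where ρ is the efficiency loss: the ratio of the real system overhead due to all causes to the ideal optimal processing time. It has two parts: ρ = ρ_sched + ρ_syst.

Finally we can get

$$S = \frac{RC}{RP} \times \frac{1}{1 + \rho} \times n$$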
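A small numeric check of the speedup model may help. The sketch assumes S = OSPT / CPT, S_i = OSPT / OCPT_ideal, S_d = OCPT_ideal / CPT, RP = (total task time) / OSPT, RC = (total task time) / (OCPT_ideal × n), and ρ = (CPT − OCPT_ideal) / OCPT_ideal. The task times and processing times are made-up numbers, and `speedup_factors` is a hypothetical helper name.

```python
def speedup_factors(task_times, ospt, ocpt_ideal, cpt, n):
    """Return the speedup S and the factors it decomposes into."""
    total_work = sum(task_times)            # sum of P_i over the m tasks
    rp = total_work / ospt                  # relative processing requirement
    rc = total_work / (ocpt_ideal * n)      # relative concurrency
    s_i = ospt / ocpt_ideal                 # ideal speedup, equals (rc / rp) * n
    rho = (cpt - ocpt_ideal) / ocpt_ideal   # efficiency loss
    s_d = ocpt_ideal / cpt                  # degradation, equals 1 / (1 + rho)
    s = ospt / cpt                          # overall speedup, equals s_i * s_d
    return {"RP": rp, "RC": rc, "Si": s_i, "rho": rho, "Sd": s_d, "S": s}


# Hypothetical numbers: four tasks on n = 2 processors.
f = speedup_factors(task_times=[4, 3, 3, 2], ospt=10, ocpt_ideal=6, cpt=8, n=2)
for name, value in f.items():
    print(f"{name} = {value:.4f}")
```

Here the concurrent algorithm does 20% more work than the sequential one (RP = 1.2), the ideal schedule keeps both processors fully busy (RC = 1), and the real system loses a third of the ideal time to overhead (ρ ≈ 0.33).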

Efficiency loss

                            Real system    Ideal system
Multiple computer system    X'             X
Scheduling policy           Y'             Y

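• Efficiency loss can be expressed as:

Ideal system:

$$\rho = \frac{CPT(X, Y') - OCPT_{ideal}}{OCPT_{ideal}} = \frac{CPT_{ideal}(Y) - OCPT_{ideal}}{OCPT_{ideal}} + \frac{CPT(X, Y') - CPT_{ideal}(Y)}{OCPT_{ideal}} = \rho_{sched} + \rho'_{syst}$$

Non-ideal system:

$$\rho = \frac{CPT(X, Y') - OCPT_{ideal}}{OCPT_{ideal}} = \frac{OCPT(X) - OCPT_{ideal}}{OCPT_{ideal}} + \frac{CPT(X, Y') - OCPT(X)}{OCPT_{ideal}} = \rho_{syst} + \rho'_{sched}$$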

Efficiency loss

The following figure demonstrates the decomposition of efficiency loss due to scheduling and system communication.

The significance of the impact of communication on system performance must be carefully addressed in the design of distributed scheduling algorithms.
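The decomposition can be checked numerically. The sketch below assumes the total loss ρ = (CPT_real − OCPT_ideal) / OCPT_ideal is split by passing through one of two intermediate points: the time of the given policy on the ideal system (scheduling loss first, then system loss), or the optimal time on the real system (system loss first, then scheduling loss). All four timings are made-up numbers and the function and key names are hypothetical.

```python
def efficiency_loss(cpt_real, cpt_ideal_policy, ocpt_real, ocpt_ideal):
    """Total efficiency loss and its two possible decompositions."""
    rho = (cpt_real - ocpt_ideal) / ocpt_ideal

    # Path 1: ideal optimum -> given policy on ideal system -> real system.
    via_ideal = {
        "sched": (cpt_ideal_policy - ocpt_ideal) / ocpt_ideal,
        "syst_prime": (cpt_real - cpt_ideal_policy) / ocpt_ideal,
    }
    # Path 2: ideal optimum -> optimal schedule on real system -> given policy.
    via_real = {
        "syst": (ocpt_real - ocpt_ideal) / ocpt_ideal,
        "sched_prime": (cpt_real - ocpt_real) / ocpt_ideal,
    }
    return rho, via_ideal, via_real


# Hypothetical timings in arbitrary time units.
rho, via_ideal, via_real = efficiency_loss(
    cpt_real=9.0, cpt_ideal_policy=7.0, ocpt_real=7.5, ocpt_ideal=6.0
)
print(rho, via_ideal, via_real)
```

Both paths add up to the same total loss; they differ only in how much of it is attributed to scheduling and how much to the system.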

Workload Distribution

• Load sharing: static workload distribution
  • Dispatch processes to idle processors statically upon arrival
  • Corresponds to the processor pool model

• Load balancing: dynamic workload distribution
  • Migrate processes dynamically from heavily loaded processors to lightly loaded processors
  • Corresponds to the migration workstation model

• Model by queuing theory: X/Y/c
  • An arrival process X, a service time distribution Y, and c servers
  • λ: arrival rate; μ: service rate; γ: migration rate
  • γ depends on channel bandwidth, the migration protocol, and the context and state information of the process being transferred

Workload Distribution

Processor-Pool and Workstation Queuing Models

[Figure: processor-pool and workstation queuing models, with panels "Static Load Sharing" and "Dynamic Load Balancing". M stands for a Markovian distribution.]

γ = 0: M/M/1; γ = ∞: M/M/2

COMPARISON OF PERFORMANCE FOR WORKLOAD SHARING
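A rough sketch of such a comparison, using the standard steady-state results for M/M/1 and M/M/2 queues. It assumes two processors, each with service rate μ, where each workstation sees arrival rate λ (two separate M/M/1 queues, no migration), while the processor pool sees the combined arrival rate 2λ on one shared queue (M/M/2). The function names and the sample rates are hypothetical.

```python
def turnaround_mm1(lam, mu):
    """Mean turnaround time of one M/M/1 queue: 1 / (mu - lam)."""
    if lam >= mu:
        raise ValueError("queue is unstable: need lam < mu")
    return 1.0 / (mu - lam)


def turnaround_mm2(lam, mu):
    """Mean turnaround time of an M/M/2 queue fed at total rate 2 * lam.

    With utilization r = lam / mu per server, T = 1 / (mu * (1 - r**2)),
    which simplifies to mu / ((mu - lam) * (mu + lam)).
    """
    if lam >= mu:
        raise ValueError("queue is unstable: need lam < mu")
    return mu / ((mu - lam) * (mu + lam))


mu = 1.0
for lam in (0.2, 0.5, 0.8):
    t1 = turnaround_mm1(lam, mu)
    t2 = turnaround_mm2(lam, mu)
    print(f"lam={lam:.1f}  M/M/1: {t1:.3f}  M/M/2: {t2:.3f}")
```

The shared queue is never slower, and its advantage grows with the load, since a process never waits behind a busy processor while the other one is idle.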

Latest Relevant Application[2]

• In the situation where there are multiple users or a networked computer system, you probably share a printer with other users. When you request to print a file, your request is added to the print queue. When your request reaches the front of the print queue, your file is printed. This ensures that only one person at a time has access to the printer and that this access is given on a first-come, first-served basis.

Latest Relevant Examples[2]

• When you phone the toll-free number for your bank or any other customer service you may get a recording that says, "Thank you for calling XYZ Bank. Your call will be answered by the next available operator. Please wait." This is a queuing system.

• Vehicles at a toll bridge: the vehicle that reaches the toll booth first leaves it first, and the vehicle that arrives last leaves last. The booth therefore follows the first-in, first-out (FIFO) queue discipline.
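The first-come, first-served behavior in these examples can be sketched with a double-ended queue. The job names are hypothetical and the example only models the ordering, not a real print spooler.

```python
from collections import deque

print_queue = deque()

# Requests join the back of the queue in arrival order (hypothetical job names).
for job in ["report.pdf", "slides.pptx", "notes.txt"]:
    print_queue.append(job)

# The printer always serves the request at the front of the queue.
printed = []
while print_queue:
    printed.append(print_queue.popleft())

print(printed)
```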

Future Work

• Distributed flow scheduling in an unknown environment [3]

Flow scheduling is crucial in next-generation networks but hard to address, because link states change quickly and exploring the global structure is very costly.

• Pareto-Optimal Cloud Bursting[4]

Large-scale Bag-of-Tasks (BoT) applications are characterized by their massively parallel, yet independent operations. The use of resources in public clouds to dynamically expand the capacity of a private computer system might be an appealing alternative to cope with such massive parallelism. To fully realize the benefit of this 'cloud bursting', the performance to cost ratio (or cost efficiency) must be thoroughly studied and incorporated into scheduling and resource allocation strategies.

References

[1] Randy Chow, Theodore Johnson, Distributed Operating Systems & Algorithms, 1997.

[2] "5 real life instances where queue operations are being used," http://wiki.answers.com/Q/List_out_atleast_5_real_life_instances_where_queue_operations_are_being_used

[3] Yaoqing Yang, Kegin Liu, Pingyi Fan, "Distributed flow scheduling in an unknown environment," http://ieeexplore.ieee.org/xpl/articleDetails.jsp?tp=&arnumber=6576397&sortType%3Ddesc_p_Publication_Year%26queryText%3DDistributed+scheduling

[4] M. Reza HoseinyFarahabady, Young Choon Lee, Albert Y. Zomaya, "Pareto-Optimal Cloud Bursting," IEEE Transactions on Parallel and Distributed Systems, 27 Aug. 2013. IEEE Computer Society Digital Library, http://doi.ieeecomputersociety.org/10.1109/TPDS.2013.218




Thank You
