Reducing Queue Lock Pessimism in Multiprocessor Schedulability Analysis
Yang Chang, Robert Davis and Andy Wellings
Real-time Systems Research Group
University of York
Overview

- Background
  - Multiprocessor scheduling
  - Multiprocessor resource sharing
- Pessimism in existing work
- A new model of spinning
- Schedulability analysis
- Empirical evaluations
- Conclusion and future work
Background: processors

[Images: Pentium 4 and Core i7 processors]
Background: multiprocessor scheduling
- Partitioned scheduling
  - A task can only run on a specific processor
  - Existing exact analysis for single processors can be used
  - NP-hard to find an optimal task allocation
- Global scheduling
  - Allows tasks to migrate among processors
  - More appropriate for open systems
  - Overheads related to task migration
  - Tractable exact analysis is so far unknown for sporadic tasks
- Priority models: fixed task-priority / fixed job-priority / dynamic-priority

This work focuses on fixed task-priority global scheduling (global FP).
Background: multiprocessor resource sharing protocols
- Fully partitioned multiprocessor systems:
  - MPCP (Rajkumar et al., 1988)
  - MSRP (Gai et al., 2001)
- Global multiprocessor systems:
  - FMLP (Block et al., 2007)
  - The scheme of Devi et al., 2006
  - P-PCP (Easwaran and Anderson, 2009)

The protocols shown in red on the slide use queue locks (FIFO-queue-based non-preemptive spin locks) as their building blocks.
Background: queue locks

Upon resource request:
- The task becomes non-preemptible
- It checks whether the requested resource is already locked
- If not, it enters the critical section and accesses the resource non-preemptively
- Otherwise, it is inserted into a FIFO queue for the resource and busy-waits until the resource becomes available

Upon resource release:
- The task releases the associated lock and becomes preemptible again
- The task at the head of the FIFO queue gets the resource next
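The FIFO behaviour described above is what a classic ticket lock provides. A minimal single-process sketch in Python (illustrative only: real queue locks are built on atomic fetch-and-add hardware instructions, and the non-preemptible sections would be enforced by RTOS calls, which are omitted here):

```python
import itertools

class TicketLock:
    """FIFO (queue) spin lock: waiters acquire in ticket (arrival) order."""
    def __init__(self):
        self._tickets = itertools.count()  # next ticket to hand out
        self._now_serving = 0              # ticket currently allowed in

    def acquire(self):
        my_ticket = next(self._tickets)    # join the FIFO queue
        while self._now_serving != my_ticket:
            pass                           # busy-wait (spin)
        return my_ticket

    def release(self):
        self._now_serving += 1             # hand the lock to the next waiter
```

Taking a ticket and spinning on `_now_serving` is what guarantees the FIFO ordering stated on the slide: requests are served strictly in the order their tickets were drawn.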
Background: current analysis of queue-lock-based systems

Existing analyses, e.g. [GAI2001], [BLOCK2007] and [DEVI2006], inflate the worst-case execution time to include the time wasted spinning.
For each resource request, it is assumed that all other requests to the same resource always interfere with it.

Let m be the total number of processors and n_j the total number of tasks that access resource r_j. Each request to r_j is then assumed to spin for at most min(m, n_j) − 1 of the longest critical sections of the other tasks regarding r_j.

Some existing global FP analysis (one that does not consider resource sharing) is then applied to the inflated execution times.
Pessimism in existing analysis (1)

Suppose there are 4 tasks on a 4-processor system: 3 tasks access a shared resource only once each, while the remaining task accesses the same resource 100 times. With unit-length critical sections, existing analysis estimates that 309 time units will be wasted spinning.

This is overly pessimistic: in the real worst-case scenario for this configuration, at most 9 time units are wasted on spinning.
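The inflation arithmetic behind this example can be checked directly. A small sketch (the function name is mine, not from the paper):

```python
def inflated_spin_bound(m, requests_per_task, longest_cs):
    """Per the existing analyses: every request to resource r_j is assumed
    to spin for min(m, n_j) - 1 longest critical sections of r_j."""
    n_j = len(requests_per_task)           # tasks accessing r_j
    total_requests = sum(requests_per_task)
    return total_requests * (min(m, n_j) - 1) * longest_cs

# 4 processors, 3 tasks with 1 access each, 1 task with 100 accesses,
# unit-length critical sections:
print(inflated_spin_bound(4, [1, 1, 1, 100], 1))  # 103 * 3 * 1 = 309
```

The bound scales with the total number of requests, which is why the single frequent requester drags the estimate up to 309 even though only 3 of its 100 requests can ever actually be contended.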
Pessimism in existing analysis (2)
Under global FP scheduling, when analyzing a task at priority k, all tasks with lower priorities can be ignored.
When k is a low priority, many higher priority tasks may have inflated execution times.
Pessimism can accumulate at low priorities. This greatly reduces the number of tasksets that can be recognised as schedulable.

Our aim is to provide a less pessimistic model of spinning for lower priority tasks.
A less pessimistic modeling of spinning (1)
For each task k, our new model calculates the maximum amount of spinning caused by any task’s accesses to each resource during task k’s deadline Dk.
Our model safely ignores those resource accesses that can never run in parallel with others
Suppose we know the maximum number of requests to a resource that each task can issue during Dk.

For each 2 ≤ x ≤ m, we calculate the maximum number of resource request groups in which x requests for the same resource could happen in parallel.
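One plausible way to bound the number of x-parallel request groups (this is my formulation of the counting problem, not necessarily the paper's exact algorithm): each group needs x requests from x distinct tasks, and a task with c_i requests can contribute to at most min(c_i, k) of k groups, so k groups are feasible iff the sum of min(c_i, k) covers all k·x slots.

```python
def max_parallel_groups(counts, x):
    """Maximum number k of groups in which x requests to the same
    resource (one each from x distinct tasks) can proceed in parallel.
    counts[i] = max requests task i can issue during Dk.
    k groups are feasible iff sum(min(c, k)) >= k * x."""
    k = 0
    while sum(min(c, k + 1) for c in counts) >= (k + 1) * x:
        k += 1
    return k

# Earlier example: three tasks issue 1 request each, one issues 100.
print(max_parallel_groups([1, 1, 1, 100], 4))  # 1
print(max_parallel_groups([1, 1, 1, 100], 2))  # 3
```

Note how this captures the key insight of the new model: the 97 requests of the frequent task that can never overlap with anyone simply drop out of the count.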
A less pessimistic modeling of spinning (2)
Calculate the maximum computation time that could be wasted on busy waiting (spinning) by x parallel accesses to the same resource
The worst-case bound can be further tightened by considering the duration of resource accesses.
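For a single group of x parallel requests served FIFO, the wasted time is easy to bound: the i-th request in the queue spins through the i−1 critical sections ahead of it. Assuming equal critical-section length (a simplification for this sketch), a group wastes at most x(x−1)/2 critical sections in total:

```python
def group_spin_waste(x, cs_len):
    """Upper bound on total busy-waiting in one group of x parallel
    FIFO-queued requests: the i-th waiter spins for i-1 critical
    sections, each of length cs_len, so the total is x(x-1)/2 * cs_len."""
    return x * (x - 1) // 2 * cs_len

print(group_spin_waste(4, 1))  # 6: waiters spin for 1 + 2 + 3 sections
```

Summing this quantity over the parallel groups counted for each x gives the model's total spinning time for the window.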
Schedulability analysis (1)

Built on Bertogna and Lipari's sufficient analysis for global FP systems.

The total interference on the problem job within its problem window Dk is the total of any execution time that occurs while the problem job is not executing within that window.

Global FP is work-conserving, so if the total interference is less than m(Dk − Ck), task k is guaranteed to be schedulable.

[Figure: timeline of the problem window of length Dk, from ta to td, showing the problem job's execution time Ck]
Schedulability analysis (2)

With resource sharing, there are 3 differences:
1. Tasks with priorities lower than the problem job can also interfere
2. Spinning wastes processor time
3. A task may be non-preemptively blocked (Block et al. 2007), so the system is not work-conserving

We therefore re-define the total interference on the problem job within its problem window as the total of any idle time, task execution time or spinning time that occurs while the problem job is not executing within that window.

If the total interference is less than m(Dk − Ck), task k is guaranteed to be schedulable. Ck does not include any spinning time.
Schedulability analysis (3)

The upper bound on the total interference is made up of:
- The upper bound on the interference from higher priority tasks on the problem job, derived by Bertogna and Lipari
- The total spinning time within the problem window
- The upper bound on the interference on the problem job while it is non-preemptively blocked: m · Bk, where Bk is the maximum length of the non-preemptive blocking due to lower priority tasks

When the problem job is not one of the m highest priority ready tasks, spinning, higher priority task execution, and non-preemptible execution of lower priority tasks (excluding any spinning) may occur.

When the problem job is spinning, all processor time is part of the total interference. However, some of this processor time has to be spent on spinning; since all spinning time has already been accounted for, we can ignore a minimal amount of inevitable spinning and so avoid double counting.
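Putting the three interference components together, the sufficient test can be sketched as follows (the three bounds are taken as precomputed inputs here; computing them is the substance of the analysis above):

```python
def schedulable(m, D_k, C_k, I_hp, spin_total, B_k):
    """Sufficient schedulability test for task k under global FP with
    queue locks: the interference upper bound is higher-priority
    interference I_hp, plus total spinning in the window, plus m * B_k
    for non-preemptive blocking.  C_k excludes any spinning time."""
    interference = I_hp + spin_total + m * B_k
    return interference < m * (D_k - C_k)

# Illustrative numbers (not from the paper): 220 < 4 * 60 = 240
print(schedulable(m=4, D_k=100, C_k=40, I_hp=180, spin_total=20, B_k=5))  # True
```

Because the test is only sufficient, a `False` result does not prove the task unschedulable; it merely fails to prove the opposite.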
Schedulability analysis (4)

The new spinning time modeling approach has two interesting characteristics:
1. Irrespective of the priority of the problem job under analysis, all tasks that access a resource always contribute to the total spinning time
2. It ignores all the resource accesses that can never run in parallel with any other resource access when estimating total spinning time

This makes the schedulability analysis work better than existing approaches when analyzing low priority tasks, but worse when analyzing high priority tasks.

The solution is to combine the existing analysis and the new analysis: for each task, check schedulability using the existing WCET Inflation Analysis (WIA) and, only if the task is deemed unschedulable, check again with the new technique (lp-CDW). The combined approach is referred to as m-CDW.
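The combination strategy is a simple per-task disjunction of the two sufficient tests. In this sketch, `wia_test` and `lp_cdw_test` are hypothetical callables standing in for the two analyses, not APIs from the paper:

```python
def m_cdw_schedulable(tasks, wia_test, lp_cdw_test):
    """m-CDW: a task is accepted if WIA deems it schedulable, falling
    back to lp-CDW otherwise.  The taskset is schedulable only if
    every task passes at least one of the two sufficient tests."""
    return all(wia_test(t) or lp_cdw_test(t) for t in tasks)
```

Since both component tests are sufficient (never wrongly accept a task), their disjunction is also a safe sufficient test, which is why m-CDW dominates each analysis used alone.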
Evaluation

Experiments were conducted on randomly generated task sets:
- Priority assignment policy: DkC monotonic
- Processors: 4 or 8
- Number of tasks in each set: 25
- Total utilization: between 0 and m × 100%
- Periods: log-uniform in the range [2000, 25000]
- Constrained deadlines: uniform in the range [C, T]
- 20,000 tasksets for each configuration
- Only one resource in the whole system
- Max resource accesses in any job limited to 5
- Length of each resource access randomly generated

(More graphs in the technical report)
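The per-task parameter ranges above translate into a generator like this. A sketch only: the slide does not state how total utilization is split across tasks (e.g. UUniFast), so the per-task utilization `u_i` is simply taken as an input here.

```python
import math
import random

def generate_task(u_i):
    """One random task with given utilization u_i:
    period T log-uniform in [2000, 25000], execution time C = u_i * T,
    constrained deadline D uniform in [C, T]."""
    T = math.exp(random.uniform(math.log(2000), math.log(25000)))
    C = u_i * T
    D = random.uniform(C, T)  # constrained deadline: C <= D <= T
    return C, T, D
```

Drawing `log T` uniformly (rather than T itself) spreads periods evenly across orders of magnitude, which is the standard way to avoid biasing a taskset toward long periods.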
Evaluation (4 processors)

[Figure: four plots of success rate vs. utilisation (1.0 to 3.0) comparing BL, m-CDW, WIA and lp-CDW]
Evaluation (8 processors)

[Figure: four plots of success rate vs. utilisation (2.0 to 6.0) comparing BL, m-CDW, WIA and lp-CDW]
Evaluation (4 processors)

[Figure: four plots of success rate vs. task number (10 to 25) comparing BL, m-CDW, WIA and lp-CDW]

Varying taskset cardinality from 10 to 25 with other parameters held constant; U = 2.0 (50%)
Evaluation (8 processors)

[Figure: four plots of success rate vs. task number (10 to 25) comparing BL, m-CDW, WIA and lp-CDW]

Varying taskset cardinality from 10 to 25 with other parameters held constant; U = 4.0 (50%)
Conclusion and Future Work

Contributions:
- A more effective way of modeling the impact of queue locks on task schedulability, which works well when there are a large number of tasks, unlike previous techniques
- A new schedulability analysis that works with the new spinning time modeling approach

Future work includes:
- Taking nested resource accesses into consideration
- Applying this approach to other resource sharing protocols that use queue locks and suspension-based locks (e.g. FMLP)
Questions