
Page 1: Chapter 6, Process Synchronization, Overheads, Part 1

1

Chapter 6, Process Synchronization,Overheads, Part 1

Page 2: Chapter 6, Process Synchronization, Overheads, Part 1

2

• Fully covering Chapter 6 takes a lot of overheads.
• Not all of the sections in the book are even covered.
• Only the first sections are covered in these overheads, Part 1.
• These sections are listed on the next overhead.
• The rest of the sections are covered in the second overheads file, Part 2.

Page 3: Chapter 6, Process Synchronization, Overheads, Part 1

3

• 6.1 Background
• 6.2 The Critical Section Problem
• 6.3 Peterson’s Solution
• 6.4 Synchronization Hardware
• 6.5 Semaphores

Page 4: Chapter 6, Process Synchronization, Overheads, Part 1

4

6.1 Background

• Cooperating processes can affect each other
• This may result from message passing
• It may result from shared memory space
• The general case involves concurrent access to shared resources

Page 5: Chapter 6, Process Synchronization, Overheads, Part 1

5

• This section illustrates how uncontrolled access to a shared resource can result in inconsistent state

• In other words, it shows what the concurrency control or synchronization problem is

Page 6: Chapter 6, Process Synchronization, Overheads, Part 1

6

• The following overheads show:
– How producer and consumer threads may both change the value of a variable, count
– How the single increment or decrement of count is not atomic in machine code
– How the interleaving of machine instructions can give an incorrect result

Page 7: Chapter 6, Process Synchronization, Overheads, Part 1

7

High level Producer Code

• while(count == BUFFER_SIZE)
•     ; // no-op
• ++count;
• buffer[in] = item;
• in = (in + 1) % BUFFER_SIZE;

Page 8: Chapter 6, Process Synchronization, Overheads, Part 1

8

High Level Consumer Code

• while(count == 0)
•     ; // no-op
• --count;
• item = buffer[out];
• out = (out + 1) % BUFFER_SIZE;

Page 9: Chapter 6, Process Synchronization, Overheads, Part 1

9

Machine Code for Incrementing count

• register1 = count;
• register1 = register1 + 1;
• count = register1;

Page 10: Chapter 6, Process Synchronization, Overheads, Part 1

10

Machine Code for Decrementing count

• register2 = count;
• register2 = register2 - 1;
• count = register2;

Page 11: Chapter 6, Process Synchronization, Overheads, Part 1

11

• The following overhead shows an interleaving of machine instructions which leads to a lost increment

Page 12: Chapter 6, Process Synchronization, Overheads, Part 1

12

• Let the initial value of count be 5
• S0: Producer executes register1 = count (register1 = 5)
• S1: Producer executes register1 = register1 + 1 (register1 = 6)
• Context switch
• S2: Consumer executes register2 = count (register2 = 5)
• S3: Consumer executes register2 = register2 – 1 (register2 = 4)
• Context switch
• S4: Producer executes count = register1 (count = 6)
• Context switch
• S5: Consumer executes count = register2 (final value of count = 4)
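• As a concrete illustration (not from the book), the following minimal Java sketch shows the same lost-update race: two threads repeatedly increment and decrement a shared, unprotected counter, and the printed result is usually not the expected 0. The class and field names are hypothetical.

    // Hypothetical demo of the lost-update race on a shared counter.
    // count++ and count-- each compile to a separate load, modify, and
    // store, so concurrent executions can interleave and lose updates.
    public class RaceDemo {
        static int count = 0;   // shared, unprotected counter

        public static void main(String[] args) throws InterruptedException {
            Thread producer = new Thread(() -> {
                for (int i = 0; i < 1_000_000; i++) {
                    count++;    // not atomic
                }
            });
            Thread consumer = new Thread(() -> {
                for (int i = 0; i < 1_000_000; i++) {
                    count--;    // not atomic
                }
            });
            producer.start();
            consumer.start();
            producer.join();
            consumer.join();
            // Expected 0, but some increments or decrements are usually lost.
            System.out.println("final count = " + count);
        }
    }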

Page 13: Chapter 6, Process Synchronization, Overheads, Part 1

13

• The point is that you started with a value of 5.
• Then two processes ran concurrently.
• One attempted to increment the count.
• The other attempted to decrement the count.
• 5 + 1 – 1 should = 5
• However, due to synchronization problems the final value of count was 4, not 5.

Page 14: Chapter 6, Process Synchronization, Overheads, Part 1

14

• Term: Race Condition.
• Definition: This is the general O/S term for any situation where the order of execution of various actions affects the outcome.

• For our purposes it refers specifically to the case where the outcome is incorrect, i.e., where inconsistent state results

Page 15: Chapter 6, Process Synchronization, Overheads, Part 1

15

• The derivation of the term “race” condition:
• Execution is a “race”.
• In interleaved actions, whichever sequence finishes first determines the final outcome.
• Note in the concrete example that the one that finished first “lost”.

Page 16: Chapter 6, Process Synchronization, Overheads, Part 1

16

• Process synchronization refers to the tools that can be used with cooperating processes to make sure that during concurrent execution they access shared resources in such a way that a consistent state results

• In other words, it’s a way of enforcing a desired interleaving of actions, or preventing an undesired interleaving of actions.

Page 17: Chapter 6, Process Synchronization, Overheads, Part 1

17

• Yet another way to think about this is that process synchronization reduces concurrency somewhat, because certain sequences that might otherwise happen are not allowed, and at least a partial sequential ordering of actions may be required

Page 18: Chapter 6, Process Synchronization, Overheads, Part 1

18

6.2 The Critical Section Problem

• Term: Critical section.
• Definition: A segment of code where resources common to a set of threads are being manipulated.
• Note that the definition is given in terms of threads because it will be possible to concretely illustrate it using threaded Java code.

Page 19: Chapter 6, Process Synchronization, Overheads, Part 1

19

• Alternative definition of a critical section:
• A segment of code where access is regulated.
• Only one thread at a time is allowed to execute in the critical section.
• This makes it possible to avoid conflicting actions which result in inconsistent state.

Page 20: Chapter 6, Process Synchronization, Overheads, Part 1

20

• Critical section definition using processes:
• Let there be n processes, P0, …, Pn-1, that share access to common variables, data structures, or resources.

• Any segments of code where they access shared resources are critical sections

• No two processes can be executing in their critical section at the same time

Page 21: Chapter 6, Process Synchronization, Overheads, Part 1

21

• The critical section problem is to design a protocol that allows processes to cooperate

• In other words, it allows them to run concurrently, but it prevents breaking their critical sections into parts and interleaving the execution of those parts

Page 22: Chapter 6, Process Synchronization, Overheads, Part 1

22

• Once again recall that this will ultimately be illustrated using threads

• In that situation, no two threads may be executing in the same critical section of code at the same time

• For the purposes of thinking about the problem, the general structure and terminology of a thread with a critical section can be diagrammed in this way:

Page 23: Chapter 6, Process Synchronization, Overheads, Part 1

23

• while(true)
• {
•     entry section      // The synchronization entrance protocol is implemented here.
•     critical section   // This section is protected.
•     exit section       // The synchronization exit protocol is implemented here.
•     remainder section  // This section is not protected.
• }

Page 24: Chapter 6, Process Synchronization, Overheads, Part 1

24

• Note the terminology:
– Entry section
– Critical section
– Exit section
– Remainder section

• These terms for referring to the parts of a concurrent process will be used in the discussions which follow

Page 25: Chapter 6, Process Synchronization, Overheads, Part 1

25

• A correct solution to the critical section problem has to meet these three conditions:
– Mutual exclusion
– Progress
– Bounded waiting

• In other words, an implementation of a synchronization protocol has to have these three characteristics in order to be correct.

Page 26: Chapter 6, Process Synchronization, Overheads, Part 1

26

Mutual exclusion

• Definition of mutual exclusion:
• If process Pi is executing in its critical section, no other process can be executing in its critical section.
• Mutual exclusion is the heart of concurrency control.
• However, concurrency control is not correct if the protocol “locks up” and the program can’t produce results.
• That’s what the additional requirements, progress and bounded waiting, are about.

Page 27: Chapter 6, Process Synchronization, Overheads, Part 1

27

Progress

• Definition of progress:
• If no process is in its critical section and some processes wish to enter, only those not executing in their remainder sections can participate in the decision.

• It should not be surprising if you find this statement of progress somewhat mystifying.

• The idea requires a little explanation and may not really be clear until an example is shown.

Page 28: Chapter 6, Process Synchronization, Overheads, Part 1

28

Progress explained

• For the sake of discussion, let all processes be structured as infinite loops

• If mutual exclusion has been implemented, a process may be at the top of the loop, waiting for the entry section to allow it into the critical section

• The process may be involved in the entry or exit protocols

• Barring these possibilities, the process can either be in its critical section or in its remainder section

Page 29: Chapter 6, Process Synchronization, Overheads, Part 1

29

• The first three possibilities, waiting to enter, entering, or exiting, are borderline cases.

• Progress is most easily understood by focusing on the question of processes either in the critical section or in the remainder section

• The premise of the progress condition is that no process is in its critical section

Page 30: Chapter 6, Process Synchronization, Overheads, Part 1

30

• Some processes may be in their remainder sections

• Others may be waiting to enter the critical section

• But the important point is that the critical section is available

• The question is how to decide which process to allow in, assuming some processes do want to enter the critical section

Page 31: Chapter 6, Process Synchronization, Overheads, Part 1

31

• Progress states that a process that is happily running in its remainder section has no part in the decision of which process to allow into the critical section.

• A process in the remainder section can’t stop another process from entering the critical section.

• A process in the remainder section also cannot delay the decision.

Page 32: Chapter 6, Process Synchronization, Overheads, Part 1

32

• The decision on which enters can only take into account those processes that are currently waiting to get in.

• This sounds simple enough, but it’s not really clear what practical effect it has on the entry protocol.

• An example will be given later that violates the progress condition.

• Hopefully this will make it clearer what the progress condition really means.

Page 33: Chapter 6, Process Synchronization, Overheads, Part 1

33

Bounded waiting

• Definition of bounded waiting:
• There exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a given process has made a request to enter its critical section and before that request is granted.

Page 34: Chapter 6, Process Synchronization, Overheads, Part 1

34

Bounded waiting explained

• Whatever algorithm or protocol is implemented for allowing processes into the critical section, it cannot allow starvation

• Granting access to a critical section is reminiscent of scheduling.

• Eventually, everybody has to get a chance

Page 35: Chapter 6, Process Synchronization, Overheads, Part 1

35

• For the purposes of this discussion, it is assumed that each process is executing at non-zero speed, although they may differ in their speeds

• Bounded waiting is expressed in terms of “a number of times”.

• No concrete time limit can be given, but the result is that allowing a thread into its critical section can’t be postponed indefinitely

Page 36: Chapter 6, Process Synchronization, Overheads, Part 1

36

More about the critical section problem

• The critical section problem is unavoidable in operating systems

• The underlying idea already came up in chapter 5 in the discussion of preemption and interrupt handling

• There are pieces of operating system code that manipulate shared structures like waiting and ready queues.

• No more than one process at a time can be executing such code because inconsistent O/S state could result

Page 37: Chapter 6, Process Synchronization, Overheads, Part 1

37

• You might try to avoid the critical section problem by disallowing cooperation among user processes (although this diminishes the usefulness of multi-programming).
• Such a solution would be very limiting for application code.
• It doesn’t work for system code.
• System processes have to be able to cooperate in their access to shared structures.

Page 38: Chapter 6, Process Synchronization, Overheads, Part 1

38

• A more complete list of shared resources that multiple system processes contend for would include scheduling queues, I/O queues, lists of memory allocation, lists of processes, etc.

• The bottom line is that code that manipulates these resources has to be in a critical section

• Stated briefly: There has to be mutual exclusion between different processes that access the resources (with the additional assurances of progress and bounded waiting)

Page 39: Chapter 6, Process Synchronization, Overheads, Part 1

39

The critical section problem in the O/S, elaborated

• You might try to get rid of the critical section problem by making the kernel monolithic (although this goes against the grain of layering/modular design)

• The motivation behind this would be the idea that there would only be one O/S process or thread, not many processes or threads.

Page 40: Chapter 6, Process Synchronization, Overheads, Part 1

40

• Even so, if the architecture is based on interrupts, whether the O/S is modular or not, one activation of the O/S can be interrupted and set aside, while another activation is started as a result of the interrupt.

• The idea behind “activations” can be illustrated with interrupt handling code specifically.

Page 41: Chapter 6, Process Synchronization, Overheads, Part 1

41

• One activation of the O/S may be doing one thing.

• When the interrupt arrives a second activation occurs, which will run a different part of the O/S—namely an interrupt handler

Page 42: Chapter 6, Process Synchronization, Overheads, Part 1

42

• The point is that any activation of the O/S, including an interrupt handler, has the potential to access and modify a shared resource like the ready queue

• It doesn’t matter whether the O/S is modular or monolithic

• Each activation would have access to shared resources

Page 43: Chapter 6, Process Synchronization, Overheads, Part 1

43

Do you own your possessions, or do your possessions own you?

• The situation can be framed in this way:
• You think of the O/S as “owning” and “managing” the processes, whether system or user processes.

• It turns out that in a sense, the processes own the O/S.

Page 44: Chapter 6, Process Synchronization, Overheads, Part 1

44

• Even if the processes in question are user processes and don’t access O/S resources directly, user requests for service and the granting of those requests by the O/S cause changes to the O/S’s data structures

• The O/S may create the processes, but the processes can then be viewed as causing shared access to common O/S resources

Page 45: Chapter 6, Process Synchronization, Overheads, Part 1

45

• This is the micro-level view of the idea expressed at the beginning of the course, that the O/S is the quintessential service program

• Everything that it does can ultimately be traced to some application request

• Applications don’t own the system resources, but they are responsible for O/S behavior which requires critical section protection

Page 46: Chapter 6, Process Synchronization, Overheads, Part 1

46

A multiple-process O/S

• The previous discussion was meant to emphasize that even a monolithic O/S has concurrency issues

• As soon as multiple processes are allowed, whether an O/S is a microkernel or not, it is reasonable to implement some of the system functionality in different processes

Page 47: Chapter 6, Process Synchronization, Overheads, Part 1

47

• At that point, whether the O/S supports multi-programming or not, the O/S itself has concurrency issues with multiple processes of its own

• Once you take the step of allowing >1 concurrent process, whether user or system processes, concurrency, or the critical section problem arises

Page 48: Chapter 6, Process Synchronization, Overheads, Part 1

48

What all this means to O/S code

• O/S code can’t allow race conditions to arise.
• The O/S can’t be allowed to enter an inconsistent state.
• O/S code has to be written so that access to shared resources is done in critical sections.
• No two O/S processes can be in a critical section at the same time.

Page 49: Chapter 6, Process Synchronization, Overheads, Part 1

49

• The critical section problem has to be solved in order to implement a correct O/S.

• If the critical section problem can be solved in O/S code, then the solution tools can also be used when considering user processes or application code which exhibit the characteristics of concurrency and access to shared resources

Page 50: Chapter 6, Process Synchronization, Overheads, Part 1

50

Dealing with critical sections in the O/S

• Keep in mind:
– A critical section is a sequence of instructions that has to be run atomically, without interleaving the executions of more than one process.
– With pre-emptive scheduling, a new process can be scheduled before the currently running one finishes.

Page 51: Chapter 6, Process Synchronization, Overheads, Part 1

51

• The question of critical sections in the O/S intersects with the topic of scheduling

• The question of scheduling leads to two possible approaches to dealing with concurrency in the O/S:
– A non-preemptive kernel
– A preemptive kernel

Page 52: Chapter 6, Process Synchronization, Overheads, Part 1

52

• In a non-preemptive kernel, any kernel mode process will run until:
– It exits
– It blocks due to making an I/O request
– It voluntarily yields the CPU
• Under this scenario, the process will run through any of its critical sections without interruption by another process.

Page 53: Chapter 6, Process Synchronization, Overheads, Part 1

53

• A preemptive kernel is likely to be more desirable than a non-preemptive kernel.
• A preemptive kernel:
– Will be more responsive for interactive time-sharing
– Will support (soft) real-time processing

Page 54: Chapter 6, Process Synchronization, Overheads, Part 1

54

• Writing a pre-emptive kernel means dealing with concurrency in the kernel code.
– In general, this is not easy.
– In some cases (like SMP) it is virtually impossible.
– How do you control two threads running concurrently on two different CPUs?

Page 55: Chapter 6, Process Synchronization, Overheads, Part 1

55

• Windows XP and before were non-preemptive.
• More recent versions of Windows are preemptive.
• Traditional Unix was non-preemptive.
• Newer versions of Unix and Linux are preemptive.

Page 56: Chapter 6, Process Synchronization, Overheads, Part 1

56

6.3 Peterson’s Solution

• This section will be divided into these 4 categories:

• Background
• Scenario 1
• Scenario 2
• Peterson’s Solution

Page 57: Chapter 6, Process Synchronization, Overheads, Part 1

57

Background

• Peterson’s solution is an illustration of how to manage critical sections using high-level language pseudo-code

• My coverage of Peterson’s algorithm starts with two scenarios that are not given in the current edition of the textbook

Page 58: Chapter 6, Process Synchronization, Overheads, Part 1

58

• The two scenarios were given in previous editions of the book.

• I have kept these scenarios because I think they’re helpful.

• They give a concrete idea of what it means to violate progress and bounded waiting

• And Peterson’s solution can be viewed as a successful combination of the two incomplete techniques given in the two scenarios

Page 59: Chapter 6, Process Synchronization, Overheads, Part 1

59

• Before going into the scenarios, it will be helpful to recall what the critical section problem is, and how it can be solved, in general

• A successful solution to the critical section problem includes these three requirements:
– Mutual exclusion
– Progress
– Bounded waiting

Page 60: Chapter 6, Process Synchronization, Overheads, Part 1

60

• At this point, let it be assumed that the mutual exclusion requirement is clear

• Progress and bounded waiting can be made clearer by considering possible solutions to the critical section problem that may not satisfy these requirements

Page 61: Chapter 6, Process Synchronization, Overheads, Part 1

61

Scenario 1

• Suppose your synchronization protocol is based on the idea that processes “take turns”.
• Taking turns is the basis for concurrency.
• The question is, how do you decide whose turn it is?
• Consider two processes, P0 and P1.
• Let “whose turn” be represented by an integer variable turn which takes on the values 0 or 1.

Page 62: Chapter 6, Process Synchronization, Overheads, Part 1

62

• Note that regardless of whose turn it currently is, you can always change turns by setting turn = 1 - turn.
• If turn == 0, then 1 – turn = 1 gives the opposite.
• If turn == 1, then 1 – turn = 0 gives the opposite.
• On the other hand, in code protected by an if statement, if(turn == x), you can simply hard code the change to turn = y, the opposite of x.

Page 63: Chapter 6, Process Synchronization, Overheads, Part 1

63

• Let the sections of the problem be structured in this way:
• while(true)
• {
•     entry: If it’s my turn
•     critical section
•     exit: Change from my turn to the other
•     remainder
• }

Page 64: Chapter 6, Process Synchronization, Overheads, Part 1

64

• For the sake of discussion assume:
– P0 and P1 are user processes.
– The system correctly (atomically) grants access to the shared variable turn.
• The code blocks for the two processes could be spelled out in detail as given below.

Page 65: Chapter 6, Process Synchronization, Overheads, Part 1

65

P0

• while(true)
• {
•     entry: if(turn == 0)
•     {
•         critical section
•         exit: turn = 1;
•     }
•     remainder
• }

Page 66: Chapter 6, Process Synchronization, Overheads, Part 1

66

P1

• while(true)
• {
•     entry: if(turn == 1)
•     {
•         critical section
•         exit: turn = 0;
•     }
•     remainder
• }
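• The following Java sketch (my own illustration, not the book's) captures the structure of scenario 1, assuming a shared volatile turn variable stands in for the atomically accessed turn above. Each worker spins until turn equals its id, runs its critical section, and hands the turn to the other worker on exit, which is exactly the strict alternation that leads to the progress violation discussed on the next overheads.

    // A minimal sketch of scenario 1: strict alternation on a turn variable.
    public class StrictAlternation {
        static volatile int turn = 0;   // whose turn it is: 0 or 1

        static void worker(int id) {
            for (int round = 0; round < 5; round++) {
                while (turn != id) { /* entry: busy wait until it is my turn */ }
                System.out.println("P" + id + " in critical section");
                turn = 1 - id;          // exit: give the turn to the other process
                // remainder section
            }
        }

        public static void main(String[] args) {
            new Thread(() -> worker(0)).start();
            new Thread(() -> worker(1)).start();
        }
    }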

Page 67: Chapter 6, Process Synchronization, Overheads, Part 1

67

• Suppose that turn is initialized arbitrarily to either 1 or 0

• Assume that execution of the two processes goes from there

• This would-be solution violates progress

Page 68: Chapter 6, Process Synchronization, Overheads, Part 1

68

• Consider this scenario:
• turn == 0
• P0 runs its critical section.
• It then goes to its exit section and sets turn to 1.
• P1 is in its remainder section and doesn’t want to enter its critical section (yet).

Page 69: Chapter 6, Process Synchronization, Overheads, Part 1

69

• P0 completes its remainder section and wants to enter its critical section again

• It can’t, because it locked itself out when it changed turn to 1

• Due to the structure of the code, only P1 can enter the critical section now

• P0 can only get in after P1 has entered and exited the critical section

Page 70: Chapter 6, Process Synchronization, Overheads, Part 1

70

• This scenario meets the requirements for a violation of the progress condition for a correct implementation of synchronization

• The scheduling of P0, a process that is ready to enter its critical section, is dependent on a process, P1, which is running in its remainder section—It is not running in its critical section

Page 71: Chapter 6, Process Synchronization, Overheads, Part 1

71

• Here is a physical analogy:
• There is a single train track between two endpoints.
• Access to the track is managed with a token.
• If the token is at point A, a train can enter the track from that direction, taking the token with it.
• If the token is at point B, a train can enter the track from that direction, taking the token with it.

Page 72: Chapter 6, Process Synchronization, Overheads, Part 1

72

Page 73: Chapter 6, Process Synchronization, Overheads, Part 1

73

• This protocol works to prevent train wrecks.
• The need to hold the token forces strict alternation between trains on the track.
• However, no two trains in a row can originate from the same point, even if no train wants to go in the other direction.
• This decreases the potential use of the tracks.

Page 74: Chapter 6, Process Synchronization, Overheads, Part 1

74

Scenario 2

• Suppose that the same basic assumptions apply as for scenario 1.
• There are two user processes with entry, critical, exit, and remainder sections.
• You want to correctly synchronize them.
• You want to avoid the progress violation that occurred under scenario 1.

Page 75: Chapter 6, Process Synchronization, Overheads, Part 1

75

• The scenario 1 solution involved a turn variable that was rigidly alternated between P0 and process P1.

• Under scenario 2, there will be two variables, one for each process

• The variables are used for a process to assert its desire to enter its critical section

• Then each process can defer to the other if the other wants to go, implementing a “polite” rather than “rigid” determination of whose turn it is

Page 76: Chapter 6, Process Synchronization, Overheads, Part 1

76

• Specifically, for the sake of discussion, assume:
– P0 and P1 are user processes.
– There are two shared flag variables, flag0 and flag1, which are boolean.
– The system correctly (atomically) grants access to the two shared variables.

Page 77: Chapter 6, Process Synchronization, Overheads, Part 1

77

• Let the variables represent the fact that a process wants to take a turn in this way:
– flag0 == true means P0 wants to enter its critical section
– flag1 == true means P1 wants to enter its critical section

Page 78: Chapter 6, Process Synchronization, Overheads, Part 1

78

• Let the sections of the problem be structured in this way:
• while(true)
• {
•     entry: Set my flag to true
•            while the other flag is true, wait
•     critical section
•     exit: Change my flag to false
•     remainder
• }
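• A hedged Java sketch of scenario 2 is shown below (again my own illustration, with hypothetical names). Two volatile flags stand in for flag0 and flag1. If both threads set their flags before either tests the other's, both spin forever, which is the deadlock described on the following overheads.

    // A minimal sketch of scenario 2: flags only, no turn variable.
    public class FlagsOnly {
        static volatile boolean flag0 = false;
        static volatile boolean flag1 = false;

        static void p0() {
            flag0 = true;                       // entry: announce intent
            while (flag1) { /* wait while the other process wants in */ }
            // critical section
            flag0 = false;                      // exit
        }

        static void p1() {
            flag1 = true;
            while (flag0) { /* wait */ }
            // critical section
            flag1 = false;
        }

        public static void main(String[] args) {
            new Thread(FlagsOnly::p0).start();
            new Thread(FlagsOnly::p1).start();
        }
    }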

Page 79: Chapter 6, Process Synchronization, Overheads, Part 1

79

• The given solution overcomes the problem of the previous scenario

• The processes don’t have to rigidly take turns to enter their critical sections

• If one process is in its remainder section, its flag will be set to false

• That means the other process will be free to enter its critical section

• A process not in its critical section can’t prevent another process from entering its critical section

Page 80: Chapter 6, Process Synchronization, Overheads, Part 1

80

• However, this new solution is still not fully correct.

• Recall that outside of the critical section, scheduling is determined independent of the processes

• That is to say that under concurrent execution, outside of a protected critical section, any execution order may occur

Page 81: Chapter 6, Process Synchronization, Overheads, Part 1

81

• So, consider the possible sequence of execution shown in the diagram on the following overhead

• The horizontal arrow represents the fact that P0 is context switched out and P1 is context switched in due to some random scheduling decision

• Note that this is possible because at the time of the arrow, neither process is in its critical section

Page 82: Chapter 6, Process Synchronization, Overheads, Part 1

82

Page 83: Chapter 6, Process Synchronization, Overheads, Part 1

83

• Under this scheduling order, both processes have set their flags to true.
• The result will be deadlock.
• Both processes will wait eternally for the other to change its flag.
• This protocol is coded in such a way that the decision to schedule is postponed indefinitely.
• This seems to violate the 2nd part of the definition of progress.

Page 84: Chapter 6, Process Synchronization, Overheads, Part 1

84

• The definition of the bounded waiting requirement in a correct implementation of synchronization states that there exists a bound, or limit, on the number of times that other processes are allowed to enter their critical sections after a given process has made a request to enter its critical section and before that request is granted.

Page 85: Chapter 6, Process Synchronization, Overheads, Part 1

85

• Strictly speaking, scenario 2 doesn’t violate bounded waiting, because neither process can enter its critical section under deadlock.

• In effect, scenario 2 is another example of lack of progress.

• However, it is still interesting because it hints at the problems of unbounded waiting

Page 86: Chapter 6, Process Synchronization, Overheads, Part 1

86

• The picture on the following overhead illustrates what is wrong with scenario 2

• Each of the processes is too polite

Page 87: Chapter 6, Process Synchronization, Overheads, Part 1

87

Page 88: Chapter 6, Process Synchronization, Overheads, Part 1

88

Peterson’s Solution

• Keep in mind that in the previous two scenarios the assumption was made that the system was providing correct, atomic access to the turn and flag variables

• In essence, this assumption presumes a prior solution to the concurrency control problem

• In order to synchronize access to critical sections in code, we’re assuming that there is synchronized access to variables in the code

Page 89: Chapter 6, Process Synchronization, Overheads, Part 1

89

• For the sake of explaining the ideas involved, continue to assume that a system supports correct, atomic access to the needed variables

• Peterson’s solution is an illustration of a “correct” solution to synchronization in high level language software that would be dependent on atomicity or synchronization in the machine language implementation to work

Page 90: Chapter 6, Process Synchronization, Overheads, Part 1

90

• Peterson’s solution is a combination of the previous two scenarios.
• It uses both a turn variable and flag variables to arrive at a solution to the critical section problem that makes progress.
• Let the solution have these variables:
– int turn
– boolean flag[2]

Page 91: Chapter 6, Process Synchronization, Overheads, Part 1

91

• The meaning of the variables is as before.
• turn indicates which of the two processes is allowed to enter the critical section.
• Instead of having two separate flag variables, flag[] is given as a boolean array, dimensioned to size 2.
• flag[] records whether either of the two processes wants to enter its critical section.

Page 92: Chapter 6, Process Synchronization, Overheads, Part 1

92

• Let the sections of the problem be structured in the way shown on the following overhead

• The code is given from the point of view of Pi

• Pj would contain analogous but complementary code

Page 93: Chapter 6, Process Synchronization, Overheads, Part 1

93

• // Pi
• while(true)
• {
•     flag[i] = true;  // Assert that this process, i, wants to enter
•     turn = j;        // Politely defer the turn to the other process, j
•
•     /* Note that the loop below does not enclose the critical section.
•        It is a busy waiting loop for process i. As long as it’s j’s turn
•        and j wants in the critical section, then i has to wait. The critical
•        point is that if it’s j’s turn, but j doesn’t want in, i doesn’t have
•        to wait. */
•     while(flag[j] && turn == j);
•
•     critical section;
•
•     flag[i] = false;
•
•     remainder section;
• }
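• For reference, here is a hedged Java rendering of Peterson’s solution (my sketch, not the book’s code). The AtomicIntegerArray and the volatile turn variable are assumed to provide the atomic, ordered access to the shared variables that the pseudo-code presumes from the system.

    import java.util.concurrent.atomic.AtomicIntegerArray;

    // Peterson's solution for two threads with ids 0 and 1.
    public class Peterson {
        private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // 1 = wants in
        private volatile int turn = 0;

        public void lock(int i) {
            int j = 1 - i;
            flag.set(i, 1);       // assert that this process, i, wants to enter
            turn = j;             // politely defer the turn to the other process
            while (flag.get(j) == 1 && turn == j) { /* busy wait */ }
        }

        public void unlock(int i) {
            flag.set(i, 0);       // exit section: withdraw the request
        }
    }

• A thread with id i would call lock(i), run its critical section, and then call unlock(i).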

Page 94: Chapter 6, Process Synchronization, Overheads, Part 1

94

• Analyzing from the point of view of process Pi alone

• Let Pi reach the point where it wants to enter its critical section

• The following command signals its desire: flag[i] = true;
• It then politely does this in case Pj also wants to get in: turn = j;

Page 95: Chapter 6, Process Synchronization, Overheads, Part 1

95

• Pi only waits if turn == j and flag[j] is true

• If Pj doesn’t want in, then Pi is free to enter even though it set turn to j

• The argument is the converse if Pj wants in and Pi doesn’t want in

• In either of these two cases, there’s no conflict and the process gets to enter

Page 96: Chapter 6, Process Synchronization, Overheads, Part 1

96

• Consider the conflict case.
• Both Pi and Pj want in, so they both set their flags.
• They both try to defer to each other by setting turn to the other.
• The system enforces atomic, synchronized access to the turn variable.
• Whichever one sets turn last defers to the other and wins the politeness contest.

Page 97: Chapter 6, Process Synchronization, Overheads, Part 1

97

• That means the other one goes first.
• As soon as it leaves the critical section, it resets its flag.
• That releases the other process from the busy waiting loop.

Page 98: Chapter 6, Process Synchronization, Overheads, Part 1

98

• Recall that outside of the critical section, scheduling is determined independent of the processes

• That is to say that under concurrent execution, outside of a protected critical section, any execution order may occur

• A possible sequence of execution is shown in the diagram on the overhead following the next one

Page 99: Chapter 6, Process Synchronization, Overheads, Part 1

99

• In the diagram the horizontal arrow represents the fact that P0 is context switched out and P1 is context switched in

• In this diagram, P1 will win the politeness contest, meaning P0 will be allowed into the critical section when it is scheduled next

• It should be possible to figure out the order of execution of the blocks of code of the two processes following this context switch

Page 100: Chapter 6, Process Synchronization, Overheads, Part 1

100

Page 101: Chapter 6, Process Synchronization, Overheads, Part 1

101

• The book gives the explanation of Peterson’s solution in the form of a proof.
• The demonstration given here is not a proof.
• A non-proof understanding is sufficient.
• You should be able to answer questions about and sketch out Peterson’s solution.

Page 102: Chapter 6, Process Synchronization, Overheads, Part 1

102

• You should also be able to answer questions about and sketch out the partial solutions of scenarios 1 and 2, which Peterson’s solution is based on

• Recall that the most significant points of those examples, about which questions might be asked, are how they fail to meet the progress requirement.

Page 103: Chapter 6, Process Synchronization, Overheads, Part 1

103

6.4 Synchronization Hardware

• The fundamental idea underlying the critical section problem is locking

• The entry section can be described as the location where a lock on the critical section is obtained.

• The exit section can be described as the location where the lock on the critical section is released.

• The pseudo-code on the following overhead shows a critical section with this terminology.

Page 104: Chapter 6, Process Synchronization, Overheads, Part 1

104

• while(true)
• {
•     acquire lock
•     critical section
•     release lock
•     remainder section
• }

Page 105: Chapter 6, Process Synchronization, Overheads, Part 1

105

• Locking means exclusive access—in other words, the ability to lock supports mutual exclusion

• In the previous explanations of synchronization, it was assumed that at the very least, variables could in essence be locked, and that made it possible to lock critical sections

• The question is, where does the ability to lock variables come from?

Page 106: Chapter 6, Process Synchronization, Overheads, Part 1

106

• Ultimately, locking, whether locking of a variable, a block of code, or a physical resource, can only be accomplished through hardware support

• There is no such thing as a pure software solution to locking

Page 107: Chapter 6, Process Synchronization, Overheads, Part 1

107

• At the bottom-most hardware level, one solution to locking/synchronization/critical section protection would be the following:

• Disallow interrupts and disable preemption for the duration of a critical section

• This was mentioned earlier and will be covered again briefly

• Keep in mind, though, that eventually a more flexible construct, accessible to system code writers, needs to be made available

Page 108: Chapter 6, Process Synchronization, Overheads, Part 1

108

• Disallowing interrupts is a very deep topic.
• In reality, this means queuing interrupts.
• Interrupts may not be handled immediately, but they can’t be discarded.
• Interrupt queuing may be supported in hardware.
• Interrupt queuing may impinge on the very “lowest” level of O/S software.

Page 109: Chapter 6, Process Synchronization, Overheads, Part 1

109

• Disabling preemption:
• Disallowing interrupts alone wouldn’t be enough in order to protect critical sections.
• Preemption would also have to be disabled.
• The real question is how broadly it should be disabled.
• A gross solution is to implement a non-preemptive scheduling algorithm.

Page 110: Chapter 6, Process Synchronization, Overheads, Part 1

110

• Whether or not they supported synchronization in user processes, some older, simpler operating systems solved the synchronization problem in system code by disallowing preemption of system processes.

• The next step up is identifying just the critical sections of user and system processes and only disabling preemption during the critical sections themselves.

• This is the real goal of a general, flexible implementation of synchronization.

Page 111: Chapter 6, Process Synchronization, Overheads, Part 1

111

• The overall problem with drastic solutions like uninterruptibility and non-preemption is loss of concurrency.

• Loss of concurrency reduces the effectiveness of time-sharing systems

• Loss of concurrency would have an especially large impact on a real time system

Page 112: Chapter 6, Process Synchronization, Overheads, Part 1

112

• This is just one of the book’s side notes on multi-processing systems.

• Trying to make a given process uninterruptible in a multi-processor system is a mess.

• Each processor has to be sent a message saying that no code can be run which would conflict with a given kernel process

Page 113: Chapter 6, Process Synchronization, Overheads, Part 1

113

• Another potential architectural problem with the gross solutions is that uninterruptibility may affect the system clock

• The system clock may be driven by interrupts.
• If interrupts can be disallowed/delayed/queued in order to support critical sections, the system clock may be slowed.

Page 114: Chapter 6, Process Synchronization, Overheads, Part 1

114

A flexible, accessible locking construct

• Modern computer architectures have machine instructions which support locking

• These instructions have traditionally been known as test and set instructions

• The book refers to them as get and set instructions

• I suppose this terminology was developed to sound more modern and object-oriented

Page 115: Chapter 6, Process Synchronization, Overheads, Part 1

115

• The book discusses this as a hardware concept and also illustrates it with a Java-like class

• The class has a private instance variable and public methods

• I find it somewhat misleading since what we’re dealing with are not simple accessor and mutator methods

• I prefer the traditional, straightforward, hardware/machine language explanation with the test and set terminology, and that’s all that I’ll cover

Page 116: Chapter 6, Process Synchronization, Overheads, Part 1

116

Locking with test and set

• The fundamental problem of synchronization is mutual exclusion

• You need an instruction at the lowest level that either executes completely or doesn’t execute at all

• The execution of that one instruction can’t be interleaved with other actions

• Other threads/processes/code are excluded from running at the same time

• This kind of “uninterruptible” instruction is known as an atomic instruction

Page 117: Chapter 6, Process Synchronization, Overheads, Part 1

117

• In order to support locking, you need such an instruction which will both check the current value of a variable (for example), and depending on the value, set it to another value

• This has to be accomplished without the possibility that another process will have its execution interleaved with this one, affecting the outcome on that variable

Page 118: Chapter 6, Process Synchronization, Overheads, Part 1

118

• The name “test and set” indicates that this is an instruction with a composite action

• Even though the instruction is logically composite, it is implemented as an atomic instruction in the machine language

• Execution of a block of code containing it may be interrupted before the instruction is executed or after it is executed

• However, the instruction itself cannot be interrupted between the test and the set

Page 119: Chapter 6, Process Synchronization, Overheads, Part 1

119

• The test and set instruction can be diagrammed this way in high level pseudo-code

• if(current value of variable x is …)
•     set the value of x to …

• The variable x itself now serves as the lock

Page 120: Chapter 6, Process Synchronization, Overheads, Part 1

120

• In other words, whether or not a thread can enter a section depends on the value of x

• The entry into a critical section, the acquisition of the lock, takes the form of testing and setting x:

• if(x is not set)     // These two lines together
•     set x;           // are the atomic test and set
• critical section;
• unset x;
• remainder section;
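• In Java, the getAndSet() method of AtomicBoolean behaves like an atomic test and set, so a simple spin lock can be sketched as follows (the class name is mine; this is an illustration, not the book’s code).

    import java.util.concurrent.atomic.AtomicBoolean;

    // A spin lock built on an atomic test-and-set style operation.
    public class TestAndSetLock {
        private final AtomicBoolean locked = new AtomicBoolean(false); // the lock variable x

        public void acquire() {
            // getAndSet(true) atomically reads the old value and sets the lock;
            // keep spinning as long as the old value was already true (held).
            while (locked.getAndSet(true)) { /* busy wait */ }
        }

        public void release() {
            locked.set(false);    // unset x
        }
    }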

Page 121: Chapter 6, Process Synchronization, Overheads, Part 1

121

• This may be beating a dead horse, but consider what could happen if test and set were not atomic.
– P0 could test x and find it available

– P1 could test x and find it available

– P0 could set x

– P1 could redundantly set x

• Due to this interleaving of sub-instructions, both P0 and P1 would then proceed to enter the critical section, violating mutual exclusion

Page 122: Chapter 6, Process Synchronization, Overheads, Part 1

122

• The availability of this actual mutual exclusion at the machine language/hardware level is the basis for all correct/successful software solutions made available in API’s

• The book also mentions that an atomic swap instruction can be used to support mutual exclusion/locking.

• It is sufficient to understand the test and set concept and not worry about a swap instruction.

Page 123: Chapter 6, Process Synchronization, Overheads, Part 1

123

6.5 Semaphores

• An API may provide a semaphore construct as a synchronization tool for programmers

• A semaphore contains an integer variable and has two operations

• The first operation is acquire().
• Historically this has been known as P().
• P is short for “proberen”, which means “test” in Dutch.

Page 124: Chapter 6, Process Synchronization, Overheads, Part 1

124

• The second operation is known as release().
• Historically this has been known as V().
• V is short for “verhogen”, which means “increment” in Dutch.
• P() and V() had the advantage of brevity, but they were cryptic.

Page 125: Chapter 6, Process Synchronization, Overheads, Part 1

125

• In this section I will adopt the modern terminology of acquire() and release() and that’s what you should be familiar with

• It’s kind of sad, but the acquire() and release() methods, in the book’s presentation, have a variable named value.

• It’s clumsy to talk about the value of value, but I will stick with the book’s terminology.

Page 126: Chapter 6, Process Synchronization, Overheads, Part 1

126

• On the following overheads high level language pseudo-code definitions are given for acquire() and release()

• The definitions are shown as multiple lines of code

• This is a key point:
• In order to be functional, they would have to be implemented atomically in the API.

Page 127: Chapter 6, Process Synchronization, Overheads, Part 1

127

• In other words, what you’re seeing here, again, is a high-level software explanation of locking concepts, but these concepts would ultimately have to be supported by locking at the machine level

Page 128: Chapter 6, Process Synchronization, Overheads, Part 1

128

• In the definitions of acquire() and release() there is a variable named value.

• The acquire() and release() methods share access to value

• value is essentially the lock variable.
• In other words, at the system level, mutual exclusion and synchronization have to be provided for on the variable value.

Page 129: Chapter 6, Process Synchronization, Overheads, Part 1

129

• The pseudo-code for the acquire() method is given on the next overhead.

• In considering the code initially, assume that value has somehow been initialized to the value 1

Page 130: Chapter 6, Process Synchronization, Overheads, Part 1

130

The semaphore acquire() method

• acquire()
• {
•     while(value <= 0)
•         ; // no-op
•     value--;
• }

Page 131: Chapter 6, Process Synchronization, Overheads, Part 1

131

• Note that this implementation is based on the idea of a waiting loop.

• Acquisition can’t occur if value is less than or equal to zero.

• Successful acquisition involves decrementing the variable value, driving its value towards zero.

• The <= condition in the loop is how the method got its name, “test”, in Dutch

• The idea of waiting in a loop due to a variable value may be reminiscent of elements of Peterson’s solution

Page 132: Chapter 6, Process Synchronization, Overheads, Part 1

132

• The pseudo-code for the release() method is given on the next overhead.

• In considering the code, it is not necessary to assume that value has taken on any particular value

Page 133: Chapter 6, Process Synchronization, Overheads, Part 1

133

The semaphore release() method

• release()
• {
•     value++;
• }

Page 134: Chapter 6, Process Synchronization, Overheads, Part 1

134

• Note that releasing involves incrementing the variable value.

• This is how it got its name, “increment” in Dutch.

• Releasing, or incrementing value, is very straightforward.

• It is not conditional and it doesn’t involve any kind of waiting.

Page 135: Chapter 6, Process Synchronization, Overheads, Part 1

135

What semaphores are for

• Keep in mind what semaphores are for.
• They are a construct that might be made available in an API to support synchronization in user-level code.
• Internally, they themselves would have to be implemented in such a way that they relied on test and set instructions, for example, so that they really did synchronization.

Page 136: Chapter 6, Process Synchronization, Overheads, Part 1

136

• In a way, you can understand semaphores as wrappers for test and set instructions

• In code that used them, entry to a critical section would be preceded by a call to acquire()

• The end of the critical section would be followed by a call to release()

• In a sense, the variable value in the semaphore is analogous to the variable x that was used in discussing test and set

Page 137: Chapter 6, Process Synchronization, Overheads, Part 1

137

• Locking of the critical section is accomplished by the acquisition of the semaphore/the value contained in it

• The semaphore is similar in concept to the car title analogy or the token in the train example of scenario 1

• Locking the variable becomes a surrogate for locking some other thing, like a critical section of code

• If you own the variable, you own the thing it is a surrogate for

Page 138: Chapter 6, Process Synchronization, Overheads, Part 1

138

• We will not cover how you would implement semaphores internally

• Eventually, the book will present its own (simplified) Semaphore class

• It turns out that there is also a Semaphore class in the Java API

Page 139: Chapter 6, Process Synchronization, Overheads, Part 1

139

• It turns out that in the long run we will not want to make great use of semaphores

• In many cases the inline Java synchronization syntax can be used without needing semaphores

• When more complex synchronization becomes necessary, a Java construct called a monitor will be used instead of a semaphore

• In the meantime, simply to explain the idea of synchronization further, semaphores will be used in the examples

Page 140: Chapter 6, Process Synchronization, Overheads, Part 1

140

Using semaphores to protect critical sections

• Keep in mind that in multi-threaded code, the critical section would be shared because all of the code is shared

• A single semaphore would be created to protect that single critical section

• Each thread would make calls on that common semaphore in order to enter and exit the critical section

Page 141: Chapter 6, Process Synchronization, Overheads, Part 1

141

• The book has some relatively complete examples.
• They will be abstracted in these overheads.
• You might have a driver program (containing main()) where a semaphore is constructed and various threads are created.

• The code on the following overhead shows how a reference to the shared semaphore might be passed to each thread when the thread is constructed

Page 142: Chapter 6, Process Synchronization, Overheads, Part 1

142

• Semaphore S = new Semaphore();
• Thread thread1 = new Thread(S);
• Thread thread2 = new Thread(S);

Page 143: Chapter 6, Process Synchronization, Overheads, Part 1

143

• As long as the code that each thread runs is identical, I can see no reason why you might not declare a static semaphore in the class that implements Runnable (the thread class) and construct the common semaphore inline

• In any case, no matter how many instances of the thread class are created, with a shared reference to a semaphore, only one at a time will be able to get into the critical section

Page 144: Chapter 6, Process Synchronization, Overheads, Part 1

144

• Once there’s a common semaphore, then the run() method for the threads would take the form shown on the following overhead.

• The shared critical section would be protected by sandwiching it between calls to acquire and release the semaphore

Page 145: Chapter 6, Process Synchronization, Overheads, Part 1

145

• run()
• {
•     …
•     while(true)
•     {
•         semaphore.acquire();
•         // critical section
•         semaphore.release();
•         // remainder section
•     }
• }
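• For comparison, the same pattern can be written against the real Semaphore class in the Java API (java.util.concurrent.Semaphore). This is a hedged sketch, not the book’s example; the Worker class name is hypothetical.

    import java.util.concurrent.Semaphore;

    public class Worker implements Runnable {
        // One permit: at most one thread in the critical section at a time.
        private static final Semaphore mutex = new Semaphore(1);

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    mutex.acquire();            // entry section
                } catch (InterruptedException e) {
                    return;                     // stop if interrupted while waiting
                }
                try {
                    // critical section
                } finally {
                    mutex.release();            // exit section
                }
                // remainder section
            }
        }
    }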

Page 146: Chapter 6, Process Synchronization, Overheads, Part 1

146

Using semaphores to enforce execution order

• Along with the general problem of protecting a critical section, semaphores and locking can be used to enforce a particular order of execution of code when >1 process is running

• Suppose that the code for processes P1 and P2 is not exactly the same.

• P1 contains statement S1 and P2 contains statement S2, and it is necessary for S1 to be executed before S2

Page 147: Chapter 6, Process Synchronization, Overheads, Part 1

147

• The difference between P1 and P2 is critical to this example.

• The point is that we’ll now be talking about two processes running through two blocks of code that differ from each other

• The two processes do not share exactly the same code the way two simple threads would

Page 148: Chapter 6, Process Synchronization, Overheads, Part 1

148

• Recall that in the introduction to semaphores, the semaphore was shown as initialized to the value 1.

• This is the “unlocked” value.
• You could acquire the lock if value was greater than 0, and when you acquired it, you decremented value.
• If the semaphore is initialized to 0, then it is initialized as locked.

Page 149: Chapter 6, Process Synchronization, Overheads, Part 1

149

• Remember that even if the semaphore is locked, this doesn’t prevent all code from running

• Code that doesn’t try to acquire the lock can run freely regardless of the semaphore value

• If code for P1 and P2 differ, then there could be parts of P2 that are protected, while the corresponding parts of P1—which correspond, but aren’t the same as P2—are not protected

Page 150: Chapter 6, Process Synchronization, Overheads, Part 1

150

• These considerations are important in the following example

• S1 and S2 are the lines of code that differ in P1 and P2 and that have to be executed in 1-2 order

• By inserting the semaphore acquire() and release() calls as shown in the code on the following overhead, this execution order is enforced.

Page 151: Chapter 6, Process Synchronization, Overheads, Part 1

151

• /* At the point where the shared semaphore is constructed, initialize it to the locked value. S1 is not protected by an acquire() call, so it can run freely. After it has run and released the lock, then S2 will be able to acquire the lock and run. */

• Process P1 code
• {
•     S1
•     semaphore.release();
• }
• Process P2 code
• {
•     semaphore.acquire();
•     S2
• }
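• Using the Java API Semaphore class, the same ordering idea can be sketched as follows (my illustration; the println calls stand in for S1 and S2). The semaphore is constructed with 0 permits, i.e., locked, so P2 cannot proceed until P1 releases.

    import java.util.concurrent.Semaphore;

    public class OrderingDemo {
        static final Semaphore sync = new Semaphore(0);   // initialized locked

        public static void main(String[] args) {
            Thread p2 = new Thread(() -> {
                try {
                    sync.acquire();               // waits until P1 has released
                    System.out.println("S2");     // runs second
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            Thread p1 = new Thread(() -> {
                System.out.println("S1");         // runs first, unconditionally
                sync.release();                   // now P2 may proceed
            });
            p2.start();
            p1.start();
        }
    }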

Page 152: Chapter 6, Process Synchronization, Overheads, Part 1

152

• Two differing processes or threads are synchronized by means of a shared semaphore

• In this case, since they differ, they should probably be passed a reference to a shared semaphore, rather than expecting to be able to construct a static one inline

• The foregoing example is not especially cosmic, but it does introduce an idea that will lead to more complicated examples later:

• The semaphore can be asymmetrically acquired and released in the different processes

Page 153: Chapter 6, Process Synchronization, Overheads, Part 1

153

Binary vs. Counting Semaphores

• A binary semaphore is a simple lock which can take on two values:
– The value 1 = available (not locked)
– The value 0 = not available (locked)

• The semaphore concept can be extended to values greater than 1

• A counting semaphore is initialized to an integer value n, greater than 1

Page 154: Chapter 6, Process Synchronization, Overheads, Part 1

154

• The initial value of a counting semaphore tells how many different instances of a given, interchangeable kind of resource there are

• The semaphore is still decremented by 1 for every acquisition and incremented by 1 for every release

• If you think about it, the plan of action is pretty clear

Page 155: Chapter 6, Process Synchronization, Overheads, Part 1

155

• Every time a thread acquires one of the copies of the resource, the value variable is decremented, reducing the count of the number of copies available for acquisition

• Every time a thread releases one of the copies, the count goes up by one.

• In theory, such a semaphore could also be used to allow n threads in a critical section at a time, although that’s not a typical use in practice
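• As a small illustration (not from the book), the Java API Semaphore can serve directly as a counting semaphore guarding a pool of interchangeable resources; the pool size of 3 below is arbitrary.

    import java.util.concurrent.Semaphore;

    public class ResourcePool {
        // Three interchangeable resource instances; the initial permit
        // count matches the number of instances.
        private static final Semaphore pool = new Semaphore(3);

        static void useResource() throws InterruptedException {
            pool.acquire();        // claim one free instance, or wait
            try {
                // use the resource
            } finally {
                pool.release();    // return the instance to the pool
            }
        }
    }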

Page 156: Chapter 6, Process Synchronization, Overheads, Part 1

156

Implementing Non-Busy Waiting in a Semaphore

• This was the implementation/definition of a semaphore given above

• acquire()
• {
•     while(value <= 0)
•         ; // no-op
•     value--;
• }

Page 157: Chapter 6, Process Synchronization, Overheads, Part 1

157

• This was referred to earlier as a busy waiting loop and can also be called a spin lock

• The process is still alive and it should eventually acquire the shared resource or enter the critical section

• This is true because a correct implementation of synchronization requires progress and bounded waiting

Page 158: Chapter 6, Process Synchronization, Overheads, Part 1

158

• As a live process, it will still be scheduled, but during those time slices when it’s scheduled and the resource isn’t available it will simply burn up its share of CPU time/cycles doing nothing but spinning in the loop

• From the standpoint of the CPU as a resource, this is a waste

Page 159: Chapter 6, Process Synchronization, Overheads, Part 1

159

• The solution to this problem is to have a process voluntarily block itself or allow itself to be blocked when it can’t get a needed resource

• This is reminiscent of I/O blocking.
• The process should be put in a waiting list for the resource to become available.
• When it’s in a waiting list, it won’t be scheduled.

Page 160: Chapter 6, Process Synchronization, Overheads, Part 1

160

• A given system may support a block() and a wakeup() call in the API

• The pseudo-code on the following overhead shows how a semaphore definition might be structured using these calls

• This is hypothetical for the time being, but the idea will recur in a different form later

Page 161: Chapter 6, Process Synchronization, Overheads, Part 1

161

• acquire()
• {
•     value--;
•     if(value < 0)
•     {
•         block();
•         // If the semaphore shows that the resource
•         // is not available, block the process that
•         // called acquire() and put it on a waiting
•         // list. No implementation details below
•         // the level of the call to block() will be
•         // considered.
•     }
• }

Page 162: Chapter 6, Process Synchronization, Overheads, Part 1

162

• In the original semaphore definition with busy waiting, decrementing value occurred after the waiting loop, when acquisition occurred

• In this new definition the decrementing occurs right away, before the “if” that tests value

• First of all, this is OK since value is allowed to go negative in a counting semaphore

Page 163: Chapter 6, Process Synchronization, Overheads, Part 1

163

• When value is negative, it tells how many processes are waiting in the list for the resource

• Not only is it OK to decrement right away, it’s necessary.

• If the result after decrementing is positive, the process successfully acquires

• If not, the decrementation is immediately included as part of the count of the number of waiting processes

Page 164: Chapter 6, Process Synchronization, Overheads, Part 1

164

• More importantly, it’s necessary to do the decrementing right away.

• There is nothing else in the semaphore definition.
• Releasing a blocked process from the waiting list happens elsewhere in the code, so decrementing has to be done when and where it can.
• It can’t be saved until later.

Page 165: Chapter 6, Process Synchronization, Overheads, Part 1

165

• The corresponding, new definition of the release() method is given on the following overhead

• It is in release() that waiting processes are taken off the waiting list

• As with the acquire() method, the details of how that is done are not shown

• All that’s shown is the call to wakeup(), which is the converse of the call to block() in the acquire() method

Page 166: Chapter 6, Process Synchronization, Overheads, Part 1

166

• release()
• {
•     value++;
•     if(value <= 0)
•     {
•         wakeup(P);
•         // wakeup() means that a process
•         // has to be removed from the
•         // waiting list. Which process, P,
•         // it is depends on the internal
•         // implementation of the waiting
•         // list and the wakeup() method.
•     }
• }
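• A hedged Java sketch of a blocking (non-busy-waiting) semaphore is shown below, using Java's built-in monitor operations wait() and notify() in place of the hypothetical block() and wakeup() calls. Unlike the pseudo-code above, this version keeps value non-negative rather than letting it count the waiting processes; that is a common alternative design, not the book's.

    // A counting semaphore that blocks callers instead of spinning.
    public class BlockingSemaphore {
        private int value;

        public BlockingSemaphore(int initial) {
            value = initial;
        }

        public synchronized void acquire() throws InterruptedException {
            while (value <= 0) {
                wait();            // block the caller; no CPU time is burned spinning
            }
            value--;               // take one unit of the resource
        }

        public synchronized void release() {
            value++;
            notify();              // wake one waiting thread, if any
        }
    }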

Page 167: Chapter 6, Process Synchronization, Overheads, Part 1

167

Why waiting lists?

• In this latest iteration, locking by means of a semaphore no longer involves wasting CPU time on busy waiting.

• If such a semaphore can be implemented, and locking can be done in this way, then processes simply spend their idle time waiting in queues or waiting lists.

Page 168: Chapter 6, Process Synchronization, Overheads, Part 1

168

• How can a waiting list be implemented?
• Essentially, just like an I/O waiting list.
• If this is a discussion of processes, then PCBs are entered into the list.
• There is no specific requirement on the order in which processes are given the resource when it becomes available.

• A FIFO queuing discipline ensures fairness and bounded waiting

Page 169: Chapter 6, Process Synchronization, Overheads, Part 1

169

• Note once again that semaphores are an explanatory tool—but they once again beg the question of mutual exclusion

• In other words, it’s obvious that the multi-value semaphore definitions of acquire() and release() consist of multiple lines of high level language code

• They will only work correctly if the entire methods are atomic

Page 170: Chapter 6, Process Synchronization, Overheads, Part 1

170

• This can be accomplished if the underlying system supports a synchronization mechanism that would allow mutual exclusion to be enforced from the beginning to the end of the semaphore methods

• In other words, the semaphore implementations themselves have to be critical sections

Page 171: Chapter 6, Process Synchronization, Overheads, Part 1

171

• In a real implementation this might be enforced with the use of test and set type instructions

• At the system level, it again raises the question of inhibiting interrupts

• However, we are still not interested in the particular details of how that might be done

Page 172: Chapter 6, Process Synchronization, Overheads, Part 1

172

• What we are interested in is the concept that enclosing blocks of code in correctly implemented semaphore calls to acquire() and release() makes those blocks critical sections

• We are also interested in the implementation of semaphores to the extent that we understand that internally, a semaphore is conceptually based on incrementing and decrementing an integer variable

Page 173: Chapter 6, Process Synchronization, Overheads, Part 1

173

Deadlocks and Starvation

• Although how it’s all implemented in practice might not be clear, the previous discussion of locking and semaphores illustrated the idea that you could enforce:
– Mutual exclusion
– Order of execution

• This should provide a basis for writing user level code where the end result is in a consistent state

Page 174: Chapter 6, Process Synchronization, Overheads, Part 1

174

• This still leaves two possible problems: Starvation and deadlock

• Starvation can occur for the same reasons it can occur in a scheduling algorithm

• If the waiting list for a given resource was not FIFO and used some priority for waking up blocked processes, some processes may never acquire the resource

• The simple solution to this problem is to implement a queuing discipline that can’t lead to starvation

Page 175: Chapter 6, Process Synchronization, Overheads, Part 1

175

• Deadlock is a more difficult problem.
• An initial example of this came up with scenario 2, presented previously.
• Essentially, Alphonse and Gaston deadlocked.
• This was a simple case of a poor implementation where you had two processes and one resource.

Page 176: Chapter 6, Process Synchronization, Overheads, Part 1

176

• Deadlock typically arises when there are more than one process and more than one resource they are contending for

• If the acquire() and release() calls are interleaved in a certain way, the result can be that neither process can proceed

• An example of this follows

Page 177: Chapter 6, Process Synchronization, Overheads, Part 1

177

Deadlock Example

• Suppose that Q and S are resources and that P0 and P1 are structured in this way:

•     P0              P1
•     acquire(S)      acquire(Q)
•     acquire(Q)      acquire(S)
•     …               …
•     release(S)      release(Q)
•     release(Q)      release(S)

• The arbitrary, concurrent scheduling of the processes may proceed as follows:
• P0 acquires S
• Then P1 acquires Q
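• The deadlock can be reproduced in Java with two binary semaphores acquired in opposite orders (a hedged sketch of my own; the sleep calls only widen the window for the bad interleaving).

    import java.util.concurrent.Semaphore;

    public class DeadlockDemo {
        static final Semaphore s = new Semaphore(1);
        static final Semaphore q = new Semaphore(1);

        public static void main(String[] args) {
            Thread p0 = new Thread(() -> {
                try {
                    s.acquire();
                    Thread.sleep(100);
                    q.acquire();       // waits forever: Q is held by P1
                    q.release();
                    s.release();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            Thread p1 = new Thread(() -> {
                try {
                    q.acquire();
                    Thread.sleep(100);
                    s.acquire();       // waits forever: S is held by P0
                    s.release();
                    q.release();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            p0.start();
            p1.start();
        }
    }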

Page 178: Chapter 6, Process Synchronization, Overheads, Part 1

178

• At this point, neither process can go any further

• Each is waiting for a resource that the other one holds

• This is a classic case of deadlock.
• This is a sufficiently broad topic that it will not be pursued in depth here.
• A whole chapter is devoted to it later on.

Page 179: Chapter 6, Process Synchronization, Overheads, Part 1

179

The End