Distributed (Operating) Systems -Processes and Threads- Computer Engineering Department Distributed Systems Course Assoc. Prof. Dr. Ahmet Sayar Kocaeli University - Fall 2015




Page 1

Distributed (Operating) Systems -Processes and Threads-

Computer Engineering Department
Distributed Systems Course

Assoc. Prof. Dr. Ahmet Sayar
Kocaeli University - Fall 2015

Page 2

Processes and Threads
• Process management and their scheduling
• Process: the executing context of a program
• Distributed scheduling/migration
– Can help in achieving scalability, but can also help to dynamically configure clients and servers
• Multiprogramming versus multiprocessing
• Multiprogramming: the ability to run more than one process on a machine
– Multiple applications are loaded, but only one may be executing on the core at any instant
– The degree of multiprogramming is determined by long-term scheduling
• Multiprocessing: the machine has more than one CPU core or processor, which enables multiple processes to run in parallel at the same time

Page 3

Processes: Review
• A process is a program in execution
• Kernel data structure: process control block (PCB)
– Keeps track of what the process is doing, when it started, where it resides, and which files and sockets it has open
• Each process has an address space
– Contains code, global and local variables
– A process's code and data segments are paged in and out of memory
• Process state transitions
– When a process is created, its state is new
– When the process is loaded into memory, its state becomes ready
– When the process is selected by the CPU scheduler, its state becomes running
– When the process makes a blocking I/O call, it enters the waiting state
– When the process finishes, it enters the terminated state

Page 4

Process States

Page 5

Processes: Review

• Uniprocessor scheduling algorithms– Round-robin, shortest job first, FIFO, lottery scheduling*

• CPU bound processes• IO bound processes

• Performance metrics: Throughput, CPU utilization, turnaround time, response time, fairness

• What is the difference between turnaround time and response time?

Page 6

Context Switching

• Switching the CPU between two processes is computationally expensive
• CPU context
– Register values, program counter, stack pointer, etc.
• The CPU and other hardware devices are shared by the processes
• The OS keeps a process table
– Entries store CPU register values, memory maps, open files, privileges, etc.

• If OS supports more processes than it can simultaneously hold in main memory, it may have to swap processes between main memory and disk before the actual switch can take place.

Page 7

Context Switching

Page 8

Process Scheduling

• Priority queues: multiple queues, each with a different priority
– Use strict priority scheduling
– Example: page swapper, kernel tasks, real-time tasks, user tasks
• Multi-level feedback queue
– Multiple queues with priorities
– Processes dynamically move from one queue to another, depending on their priority/CPU characteristics
– Gives higher priority to I/O-bound or interactive tasks
– Gives lower priority to CPU-bound tasks
– Round-robin at each level
• Preemptive versus non-preemptive scheduling
– FCFS is non-preemptive scheduling
– Round-robin is preemptive scheduling

Page 9

Example of processes and threads

• Processes
– An Excel worksheet, an email client, and a browser all running together
• Threads
– In an Excel worksheet, changing the value of a single cell can trigger a long series of computations; while Excel performs those computations, the user can still keep entering other values.

Page 10

Processes and Threads

• Traditional process
– One thread of control through a large, potentially sparse address space
– The address space may be shared with other processes (shared memory)
– A collection of system resources (files, semaphores)
• Thread (lightweight process)
– A flow of control through an address space
– Each address space can have multiple concurrent control flows
– Each thread has access to the entire address space
– Potentially parallel execution, minimal state (low overhead)
– May need synchronization to control access to shared variables

Page 11

Threads
• A process has many threads sharing its execution environment
• Each thread has its own stack, PC, and registers
– Threads share the address space, open files, …

Page 12

Why use Threads?

• Large multiprocessor/multi-core systems need many computing entities (one per CPU or core)
• Switching between processes incurs high overhead
• With threads, an application can avoid per-process overheads
– Thread creation, deletion, and switching are cheaper than for processes
• Threads have full access to the address space (easy sharing)
• Threads can execute in parallel on multiprocessors

Page 13

Why Threads?
• 1. Single-threaded process (one address space, one flow of control)
– Blocking system calls, no parallelism
– If the process issues a blocking I/O call, the whole process blocks (it is put into the I/O queue and the CPU sits idle) until the I/O finishes
• 2. One address space, multiple threads
– If one thread calls blocking I/O, only that thread goes to the I/O queue; the other threads remain ready to run, so the CPU scheduler can pick one of them and the process continues to make progress (the CPU keeps running)
• 3. Event-based (finite-state machine) model
– Also called the asynchronous model
– Gets parallelism out of a single-threaded process
– Instead of making blocking I/O calls, the process makes non-blocking I/O calls
– It sends out I/O instructions but, instead of blocking, keeps doing other work; when the I/O finishes it receives an event and goes back to service the result of that I/O
– This effectively overlaps I/O and CPU work within a single process

Page 14

Thread Management

• Creation and deletion of threads
– Static versus dynamic
• Critical sections
– Synchronization primitives: blocking locks, spin-locks (busy-waiting)
– Condition variables
• Global thread variables
• Kernel versus user-level threads

Page 15

User-level vs kernel threads

• Key issues:
– Cost of thread management: more efficient in user space
– Ease of scheduling
– Flexibility: many parallel programming models and schedulers
– Process blocking: a potential problem

Page 16

User-level Threads

• Threads managed by a threads library
– The kernel is unaware of the presence of threads
• Advantages:
– No kernel modifications needed to support threads
– Efficient: creation/deletion/switches don't need system calls
– Flexibility in scheduling: the library can use different scheduling algorithms, which can be application dependent
• Disadvantages:
– Blocking system calls must be avoided, since they block the whole process [all threads block]
– Threads compete with one another
– Does not take advantage of multiprocessors [no real parallelism]

Page 17

Kernel-level threads

• The kernel is aware of the presence of threads
– Better scheduling decisions, but more expensive
– Better for multiprocessors; more overhead on uniprocessors
• Blocking system calls made by a thread are no problem
• Loss of efficiency (all thread operations are performed as system calls)

• Conclusion: try to mix user-level and kernel-level threads into a single concept

Page 18

Light-weight Processes (Hybrid Approach)

• Combining kernel-level lightweight processes and user-level threads

Page 19

Light-weight Processes (LWP)
• Several LWPs per heavy-weight process
• User-level threads package
– Creates/destroys threads and provides synchronization primitives
• Multithreaded applications create multiple threads and assign threads to LWPs (one-to-one, many-to-one, many-to-many)
• When a thread calls a blocking user-level operation (e.g. blocks on a mutex or a condition variable) and another runnable thread is found, a context switch is made to that thread, which is then bound to the same LWP
• When a thread makes a blocking system-level call, its LWP blocks and the thread remains bound to it. The kernel schedules another LWP that has a runnable thread bound to it; this also implies a context switch back to user mode. The selected LWP simply continues where it previously left off.

Page 20

Thread Packages

• POSIX threads (pthreads)
– Widely used threads package
– Conforms to the POSIX standard
– Sample calls: pthread_create, …
– Typically used in C/C++ applications
– Can be implemented as user-level or kernel-level threads, or via LWPs
• Java threads
– Native thread support built into the language
– Threads are scheduled by the JVM

Page 21

Quiz
1. At any specific time, how many threads can be running in the system?
– Is the question correct?
– Correct the question and answer it.
2. What does a thread need when it is created?
3. Let's assume there is a sequential process that cannot be divided into parallel tasks. Do you think it can utilize threads?
4. What happens if a process is executing an I/O instruction when a context switch occurs?
5. What happens if a process is executing a CPU instruction when a context switch occurs?

Page 22

Threads in Distributed Systems

• Threads allow clients and servers to be constructed so that communication and local processing can overlap, resulting in higher performance.

• They provide a convenient means of making blocking system calls without blocking the entire process in which the thread is running.

• The main idea is to exploit parallelism to attain high performance (especially when executing a program on a multiprocessor system).

• Multithreaded Clients• Multithreaded Servers

Page 23

Multi-threaded Clients Example

• The main issue is hiding network latency
• Browsers such as IE are multi-threaded
• Such browsers can display data before the entire document is downloaded: they perform multiple simultaneous tasks
– Fetch the main HTML page, then activate separate threads for the other parts
– Each thread sets up a separate connection with the server, using blocking calls
– Each part (e.g. a GIF image) is fetched separately and in parallel
– Advantage: connections can be set up to different sources
• Ad server, image server, web server, …

Page 24

Multi-threaded Server Example
• Main issues are improved performance and better structure
• Apache web server: pool of pre-spawned worker threads
– A dispatcher thread waits for requests
– For each request, it chooses an idle worker thread
– The worker thread uses blocking system calls to service the web request

Page 25

Three ways to construct a server

• Threads
– Parallelism, blocking system calls
• Single-threaded process
– No parallelism, blocking system calls
– How does it run? Consider, say, a file server…
• Finite-state machine
– A single thread imitating parallelism with non-blocking system calls