View
221
Download
1
Tags:
Embed Size (px)
Citation preview
Introduction & Architecture Refresher
CS 519: Operating System Theory
Computer Science, Rutgers University
Instructor: Thu D. Nguyen
TA: Xiaoyan Li
Spring 2002
Computer Science, Rutgers CS 519: Operating System Theory2
Logistics
Instructor: Thu D. NguyenOffice: CoRE 326
Office Hours: TDB
TA: Xiaoyan LiOffice: Hill 411
Office Hours: TDB
Resourceshttp://paul.rutgers.edu/cs519/S02
Computer Science, Rutgers CS 519: Operating System Theory3
Course Overview
GoalsDeeper understanding of OS and the OS/architecture interface/interaction
A little on distributed/parallel OS
PrerequisitesUndergraduate OS and architecture courses
Good programming skills in C and UNIX
Computer Science, Rutgers CS 519: Operating System Theory4
Course Overview
Structure:Each major area
Review basic material
Read and discuss papers to understand advanced issues
Several programming assignments
Significant project
Midterm and final exams
Computer Science, Rutgers CS 519: Operating System Theory5
Textbooks - Required
William Stallings. Operating Systems: Internals and Design Principles, Prentice-Hall, 1998. (Third Edition)
Papers
Most available from the web
Others accessible from the library IRIS Reserve Desk
Computer Science, Rutgers CS 519: Operating System Theory6
Topics
Processes, threads, and synchronization
Virtual memory
I/O and file system
Communication
Processor scheduling
Transactions
Distributed file systems
Parallel and real-time scheduling
Cluster memory management
OS structure and organization
Current research topics
Computer Science, Rutgers CS 519: Operating System Theory7
Project
GoalsLearn to design, implement, and evaluate a significant OS-level software system
Improve systems programming skills: virtual memory, threads, synchronization, sockets
StructureSmall teams (3 students per team)
Written project report
Possibly 15-minute oral presentation of final report
Computer Science, Rutgers CS 519: Operating System Theory8
Grading
Midterm 20%
Final 30%
Programming assignments 20%
Project 30%
Computer Science, Rutgers CS 519: Operating System Theory9
Today
What is an Operating System?Stallings 2.1-2.4
Architecture refresher …
Computer Science, Rutgers CS 519: Operating System Theory10
What is an operating system?
A software layer between the hardware and the application programs/users which provides a virtual machine interface: easy to use (hides complexity) and safe (prevents and handles errors)
Acts as resource manager that allows programs/users to share the hardware resources in a protected way: fair and efficient
hardware
operating system
application (user)
Computer Science, Rutgers CS 519: Operating System Theory11
How does an OS work?
Receives requests from the application: system calls
Satisfies the requests: may issue commands to hardware
Handles hardware interrupts: may upcall the application
OS complexity: synchronous calls + asynchronous events
hardware
OS
application (user) system calls upcalls
commands interrupts
hardware independent
hardware dependent
Computer Science, Rutgers CS 519: Operating System Theory12
Mechanism and Policy
Mechanisms: data structures and operations that implement an abstraction (e.g. the buffer cache)
Policies: the procedures that guide the selection of a certain course of action from among alternatives (e.g. the replacement policy for the buffer cache)
Traditional OS is rigid: mechanism together with policy
hardware
operating system: mechanism+policy
application (user)
Computer Science, Rutgers CS 519: Operating System Theory13
Mechanism-Policy Split
Single policy often not the best for all cases
Separate mechanisms from policies:OS provides the mechanism + some policy
applications contribute to the policy
Flexibility + efficiency require new OS structures and/or new OS interfaces
Computer Science, Rutgers CS 519: Operating System Theory15
Basic computer structure
CPU Memory
memory bus
I/O bus
disk Net interface
Computer Science, Rutgers CS 519: Operating System Theory16
von Neumann Machine
The first computers (late 40’s) were calculators
The advance was the idea of storing the instructions (coded as numbers) along with the data in the same memory
Crux of the split between:Central Processing Unit (CPU) and
Memory
Computer Science, Rutgers CS 519: Operating System Theory17
Conceptual Model
Addresses ofmemory cells
+-*/
+-*/
CPU 012
Memory contents
34
5
76
89
"big byte array"
Computer Science, Rutgers CS 519: Operating System Theory18
Operating System Perspective
A computer is a piece of hardware which runs the fetch-decode-execute loop
Next slides: walk through a very simple computer to illustrate
Machine organizationWhat are the pieces and how they fit together
The basic fetch-decode-execute loop
How higher-level constructs are translated into machine instructions
At its core, the OS builds what looks like a more complex machine on top of this basic hardware
Computer Science, Rutgers CS 519: Operating System Theory19
Fetch-Decode-Execute
Computer as a large, general purpose calculatorwant to program it for multiple functions
All von Neumann computers follow the same loop:Fetch the next instruction from memory
Decode the instruction to figure out what to do
Execute the instruction and store the result
Instructions are simple. Examples:Increment the value of a memory cell by 1
Add the contents of memory cells X and Y and store in Z
Multiply contents of memory cells A and B and store in B
Computer Science, Rutgers CS 519: Operating System Theory20
Instruction Encoding
How to represent instructions as numbers?
operators+: 1-: 2*: 3/: 4
operands destination8 bits 8 bits 8 bits 8 bits
Computer Science, Rutgers CS 519: Operating System Theory21
Example Encoding
Add cell 28 to cell 63 and place result in cell 100:
operator
+: 1-: 2*: 3/: 4
source operands destinationCell 100Cell 63 Cell 28
8 bits 8 bits 8 bits 8 bits
Instruction as a number in: Decimal: 1:28:63:100Binary: 00000001:00011100:00111111:01100100 Hexadecimal: 01:1C:3F:64
Computer Science, Rutgers CS 519: Operating System Theory22
Example Encoding (cont)
How many instructions can this encoding have?
8 bits, 2^8 combinations = 256 instructions
How much memory can this example instruction set support?
Assume each memory cell is a byte (8 bits) wide
Assume operands and destination come from the same memory
8 bits per source/dest = 2^8 combinations = 256 bytes
How many bytes did we use per instruction?
4 bytes per instruction
How could we get more memory without changing the encoding?
Why is this simple encoding not realistic?
Computer Science, Rutgers CS 519: Operating System Theory23
The Program Counter
Where is the “next instruction” held in the machine?
In a special memory cell in the CPU called the “program counter" (the PC)
Special purpose memory in the CPU and devices are called registers
Naïve fetch cycle: Increment the PC by the instruction length (4) after each execute
Assumes all instructions are the same length
Computer Science, Rutgers CS 519: Operating System Theory24
Conceptual Model
+-*/
+-*/
CPU
44Program Counter
Arithmetic Units
01
2
Memory
operator
operand 1
operand 2
destination34
5
76
89
Instruction 0@ memory address 0
Instruction 1@ memory address 4
Computer Science, Rutgers CS 519: Operating System Theory25
Memory Indirection
How do we access array elements efficiently if all we can do is name a cell?
Modify the operand to allow for fetching an operand "through" a memory location
E.g.: LOAD [5], 2 means fetch the contents of the cell whose address is in cell 5 and put it into cell 2
So if cell 5 had the number 100, we would place the contents of cell 100 into cell 2
This is called indirection
Fetch the contents of the cell “pointed to” by the cell in the opcode
Steal an operand bit to signify if an indirection is desired
Computer Science, Rutgers CS 519: Operating System Theory26
Conditionals and Looping
Primitive “computers” only followed linear instructions
Breakthrough in early computing was addition of conditionals and branching
Instructions that modify the Program Counter
Conditional instructionsIf the content of this cell is [positive, not zero, etc.] execute the instruction or not
Branch InstructionsIf the content of this cell is [zero, non zero, etc.], set the PC to this location
jump is an unconditional branch
Computer Science, Rutgers CS 519: Operating System Theory27
Example: While Loop
while (counter > 0) {
sum = sum + Y[counter];
counter–-;
};
Variables to memory cells: counter is cell 1sum is cell 2index is cell 3Y[0]= cell 4, Y[1]=cell 5…
100 LOOP: BNZ 1,END // branch to address of END
// if cell 1 is not 0.104 ADD 2,[3],2 // Add cell 2 and the
value // of the cell pointed
to by// cell 3 then place the // result in cell 2
108 DEC 3 // decrement cell 3 by 1112 DEC 1 // decrement cell 1 by 1116 JUMP LOOP // start executing from
the // address of LOOP120 END: <next code block>
Memory cell
address
Assembler label
Assembler "mnemonic" English
Computer Science, Rutgers CS 519: Operating System Theory28
Registers
Architecture rule: large memories are slow, small ones are fast
But everyone wants more memory!
Solution: Put small amount of memory in the CPU for faster operation
Most programs work on only small chunks of memory in a given time period. This is called locality.
So, if we cache the contents of a small number of memory cells in the CPU memory, we might be able to execute a number of instructions before having to access memory
Small memory in CPU named separately in the instructions from the “main memory”
Small memory in CPU = registers
Large memory = main memory
Computer Science, Rutgers CS 519: Operating System Theory29
Register Machine Model
+,-,*,/+,-,*,/
CPU 01
2
Memory
34
5
76
89
88Program Counter
Arithmetic Units
Logic Units <,>,!=<,>,!=
2424
100100
1818
register 0
register 1
register 2
Computer Science, Rutgers CS 519: Operating System Theory30
Registers (cont)
Most CPUs have 16-32 “general purpose” registers All look the “same”: combination of operators, operands and destinations possible
Operands and destination can be in:Registers only (Sparc, PowerPC, Mips, Alpha)
Registers & 1 memory operand (Intel x86 and clones)
Any combination of registers and memory (Vax)
Only memory operations possible in "register-only" machines are load from and store to memory
Operations 100-1000 times faster when operands are in registers compared to when they are in memory
Save instruction space too Only address 16-32 registers, not GB of memory
Computer Science, Rutgers CS 519: Operating System Theory31
Typical Instructions
Add the contents of register 2 and register 3 and place result in register 5
ADD r2,r3,r5
Add 100 to the PC if register 2 is not zeroRelative branch
BNZ r2,100
Load the contents of memory location whose address is in register 5 into register 6
LDI r5,r6
Computer Science, Rutgers CS 519: Operating System Theory32
Memory Hierarchy
cpu
cache
main memory
word transfer
block transfer
disks
page transfer
decrease cost per bit
decrease frequency of access
increase capacity
increase access time
increase size of transfer unit
Computer Science, Rutgers CS 519: Operating System Theory34
Memory Caches
Motivated by the mismatch between processor and memory speed
Closer to the processor than the main memory
Smaller and faster than the main memory
Act as “attraction memory”: contains the value of main memory locations which were recently accessed (temporal locality)
Transfer between caches and main memory is performed in units called cache blocks/lines
Caches contain also the value of memory locations which are close to locations which were recently accessed (spatial locality)
Computer Science, Rutgers CS 519: Operating System Theory35
Cache Architecture
CPU
L1
L2
Memory
cache line
associativity
Capacity miss
Conflict miss
Cold miss
Cache line ~32-128Associativity ~2-4
Computer Science, Rutgers CS 519: Operating System Theory36
Cache design issues
Cache size and cache block size
Mapping: physical/virtual caches, associativity
Replacement algorithm: direct or LRU
Write policy: write through/write back
cpu
cache
mainmemory
word transfer
block transfer
Computer Science, Rutgers CS 519: Operating System Theory37
Abstracting the Machine
Bare hardware provides a computation device
How to share this expensive piece of equipment between multiple users?
Sign up during certain hours?
Give program to an operator?
they run it and give you the results
Software to give the illusion of having it all to yourself while actually sharing it with others (time-sharing)!
This software is the Operating System
Need hardware support to “virtualize” machine
Computer Science, Rutgers CS 519: Operating System Theory38
Architecture Features for the OS
Next we'll look at the mechanisms the hardware designers add to allow OS designers to abstract the basic machine in software
Processor modes
Exceptions
Traps
Interrupts
These require modifications to the basic fetch-decode-execute cycle in hardware
Computer Science, Rutgers CS 519: Operating System Theory39
Processor Modes
OS code is stored in memory … von Neumann model, remember?
What if a user program modifies OS code or data?
Introduce modes of operation
Instructions can be executed in user mode or system mode
A special register holds which mode the CPU is in
Certain instructions can only be executed when in system mode
Likewise, certain memory location can only be written when in system mode
Only OS code is executed in system mode
Only OS can modify its memory
The mode register can only be modified in system mode
Computer Science, Rutgers CS 519: Operating System Theory40
Simple Protection Scheme
All addresses < 100 are reserved for operating system use
Mode register providedzero = CPU is executing the OS (in system mode)
one = CPU is executing in user mode
Hardware does this check: On every fetch, if the mode bit is 1 and the address is less than 100, then do not execute the instruction
When accessing operands, if the mode bit is 1 and the operand address is less than 100, do not execute the instruction
Mode register can only be set if mode is 0
Computer Science, Rutgers CS 519: Operating System Theory41
Simple Protection Model
Memory
0
99100101
102
104103
105106
+,-,*,/+,-,*,/
CPU
88Program Counter
Arithmetic Units
Logic Units <,>,!=<,>,!=
Registers 0-31
Mode register 00
OS
User
Computer Science, Rutgers CS 519: Operating System Theory42
Fetch-decode-execute Revised
Fetch:
if (( the PC < 100) && ( the mode register == 1)) then
Error! User tried to access the OS
else
fetch the instruction at the PC
Decode:
if (( destination register == mode) && ( the mode register == 1)) then
Error! User tried to set the mode register
< more decoding >
Execute:
if (( an operand < 100) && ( the mode register == 1) then
error! User tried to access the OS
else
execute the instruction
Computer Science, Rutgers CS 519: Operating System Theory43
Exceptions
What happens when a user program tries to access memory holding the operating system code or data?
Answer: exceptions
An exception occurs when the CPU encounters an instruction which cannot be executed
Modify fetch-decode-execute loop to jump to a known location in the OS when an exception happens
Different errors jump to different places in the OS (are "vectored" in OS speak)
Computer Science, Rutgers CS 519: Operating System Theory44
Fetch-decode-execute with Exceptions
Fetch:
if (( the PC < 100) && ( the mode bit == 1)) then
set the PC = 60
set the mode = 0
fetch the instruction at the PC
Decode:
if (( destination register == mode) && ( the mode register == 1)) then
set the PC = 64
set the mode = 0
goto fetch
< more decoding >
Execute:
< check the operands for a violation>
60 is the well known entry pointfor a memory violation
64 is the well known entry pointfor a mode register violation
Computer Science, Rutgers CS 519: Operating System Theory45
Access Violations
Notice both instruction fetch from memory and data access must be checked
Execute phase must check both operands
Execute phase must check again when performing an indirect load
This is a very primitive memory protection scheme. We'll cover more complex virtual memory mechanisms and policies later in the course
Computer Science, Rutgers CS 519: Operating System Theory46
Recovering from Exceptions
The OS can figure out what caused the exception from the entry point
But how can it figure out where in the user program the problem was?
Solution: add another register, the PC’
When an exception occurs, save the current PC to PC’ before loading the PC with a new value
OS can examine the PC' and perform some recovery action
Stop user program and print an error message: error at address PC'
Run a debugger
Computer Science, Rutgers CS 519: Operating System Theory47
Fetch-decode-execute with Exceptions & Recovery
Fetch:
if (( the PC < 100) && ( the mode bit == 1)) then
set the PC' = PC
set the PC = 60
set the mode = 0
Decode:
if (( destination register == mode) && ( the mode register == 1)) then
set the PC' = PC
set the PC = 64
set the mode = 0
goto fetch
< more decoding >
Execute:
…
Computer Science, Rutgers CS 519: Operating System Theory48
Traps
Now we know what happens when a user program illegally tries to access OS code or data
How does a user program legitimately access OS services?
Solution: Trap instruction
A trap is a special instruction that forces the PC to a known address and sets the mode into system mode
Unlike exceptions, traps carry some arguments to the OS
Foundation of the system call
Computer Science, Rutgers CS 519: Operating System Theory49
Fetch-decode-execute with traps
Fetch:
if (( the PC < 100) && ( the mode bit == 1)) then
< memory exception>
Decode:
if (the instruction is a trap) then
set the PC' = PC
set the PC = 68
set the mode = 0
goto fetch
if (( destination register == mode) && ( the mode bit == 1)) then
< mode exeception >
Execute:
…
Computer Science, Rutgers CS 519: Operating System Theory50
How does the OS know which service the user program wants to invoke on a trap?
User program passes the OS a number that encodes which OS service is desired
This example machine could include the trap ID in the instruction itself:
Most real CPUs have a convention for passing the trap code in a set of registers
E.g. the user program sets register 0 with the trap code, then executes the trap instruction
Traps
Trap opcode Trap service ID
Computer Science, Rutgers CS 519: Operating System Theory51
Returning from a Trap
How to "get back" to user mode and the user's code after a trap?
Set the mode register = 0 then set the PC?
But after the mode bit is set to user, exception!
Set the PC, then set the mode bit?
Jump to "user-land", then in kernel mode
Most machines have a "return from exception" instruction
A single hardware instruction:
Swaps the PC and the PC'
Sets the mode bit to user mode
Traps and exceptions use the same mechanism (RTE)
Computer Science, Rutgers CS 519: Operating System Theory52
Interrupts
How can we force a the CPU back into system mode if the user program is off computing something?
Solution: Interrupts
An interrupt is an external event that causes the CPU to jump to a known address
Link an interrupt to a periodic clock
Modify fetch-decode-execute loop to check an external line set periodically by the clock
Computer Science, Rutgers CS 519: Operating System Theory53
Simple Interrupt Model
Clock
+,-,*,/+,-,*,/
CPU
88
PC'
Arithmetic Units
Logic Units <,>,!=<,>,!=
Registers 0-31
Mode register 00
Program Counter
Memory
Interrupt line
Reset line
OSUser
Computer Science, Rutgers CS 519: Operating System Theory54
The Clock
The clock starts counting to 10 milliseconds
The clock sets the interrupt line "high" (e.g. sets it logic 1, maybe +5 volts)
When the CPU toggles the reset line, the clock sets the interrupt line low and starts count to 10 milliseconds again
Computer Science, Rutgers CS 519: Operating System Theory55
Fetch-decode-execute with Interrupts
Fetch:
if (the clock interrupt line == 1) then
set the PC' = PC
set the PC = 72
set the mode = 0
goto fetch
if (( the PC < 100) && ( the mode bit == 1)) then
< memory exception >
fetch next instruction
Decode:
if (the instruction is a trap) then
< trap exception >
if (( destination register == mode) && ( the mode bit == 1)) then
< mode exeception >
<more decoding>
Execute: …
Computer Science, Rutgers CS 519: Operating System Theory56
Entry Points
What are the "entry points" for our little example machine?
60: memory access violation
64: mode register violation
68: User-initiated trap
72: Clock interrupt
Each entry point is typically a jump to some code block in the OS
All real OS’es have a set of entry points for exceptions, traps and interrupts
Sometimes they are combined and software has to figure out what happened.
Computer Science, Rutgers CS 519: Operating System Theory57
Saving and Restoring Context
Recall the processor state:PC, PC', R0-R31, mode register
When an entry to the OS happens, we want to start executing the correct routine then return to the user program such that it can continue executing normally
Can't just start using the registers in the OS!
Solution: save/restore the user context Use the OS memory to save all the CPU state
Before returning to user, reload all the registers and then execute a return from exception instruction
Computer Science, Rutgers CS 519: Operating System Theory58
Input and Output
How can humans get at the data?
How to load programs?
What happens if I turn the machine off?
Can I send the data to another machine?
Solution: add devices to perform these tasks:Keyboards, mice, graphics
Disk drives
Network cards
Computer Science, Rutgers CS 519: Operating System Theory59
A Simple I/O device
Network card has 2 registers: a store into the “transmit” register sends the byte over the wire.
Transmit often is written as TX (E.g. TX register)
a load from the “receive” register reads the last byte which was read from the wire
Receive is often written as RX
How does the CPU access these registers?
Solution: map them into the memory space An instruction that access memory cell 98 really accesses the transmit register instead of memory
An instruction that accesses memory cell 99 really accesses the receive register
These registers are said to be memory-mapped
Computer Science, Rutgers CS 519: Operating System Theory60
Basic Network I/O
Clock
+,-,*,/+,-,*,/
CPU
88
PC'
Arithmetic Units
Logic Units <,>,!=<,>,!=
Registers 0-31
Mode register 00
Program Counter
Memory
Interrupt line
Reset line
0
Network card
9899
Transmit Reg.Transmit Reg.
Receive Reg.Receive Reg.
Computer Science, Rutgers CS 519: Operating System Theory61
Why Memory-Mapped Registers
"Stealing" memory space for device registers has 2 functions:
Allows protected access --- only the OS can access the device.
User programs must trap into the OS to access I/O devices because of the normal protection mechanisms in the processor
Why do we want to prevent direct access to devices by user programs?
OS can control devices and move data to/from devices using regular load and store instructions
No changes to the instruction set are required
This is called programmed I/O
Computer Science, Rutgers CS 519: Operating System Theory62
Status Registers
How does the OS know if a new byte has arrived?
How does the OS know when the last byte has been transmitted? (so it can send another one)
Solution: status registers
A status register holds the state of the last I/O operation
Our network card has 1 status register
To transmit, the OS writes a byte into the TX register and sets bit 0 of the status register to 1. When the card has successfully transmitted the byte, it sets bit 0 of the status register back to 0.
When the card receives a byte, it puts the byte in the RX register and sets bit 1 of the status register to 1. After the OS reads this data, it sets bit 1 of the status register back to 0.
Computer Science, Rutgers CS 519: Operating System Theory63
Polled I/O
To Transmit:While (status register bit 0 == 1); // wait for card to be ready
TX register = data;
Status reg = status reg | 0x1; // tell card to TX (set bit 0 to 1)
Naïve Receive:While (status register bit 1 != 1); // wait for data to arrive
Data = RX register;
Status reg = status reg & 0x01; // tell card got data (clear bit 1)
Cant' stall OS waiting to receive!
Solution: poll after the clock ticks If (status register bit 1 == 1)
Data = RX register
Status reg = status reg & 0x01;
Computer Science, Rutgers CS 519: Operating System Theory64
Interrupt driven I/O
Polling can waste many CPU cyclesOn transmit, CPU slows to the speed of the device
Can't block on receive, so tie polling to clock, but wasted work if no RX data
Solution: use interruptsWhen network has data to receive, signal an interrupt
When data is done transmitting, signal an interrupt.
Computer Science, Rutgers CS 519: Operating System Theory65
Polling vs. Interrupts
Why poll at all?
Interrupts have high overhead:Stop processor
Figure out what caused interrupt
Save user state
Process request
Key factor is frequency of I/O vs. interrupt overhead
Computer Science, Rutgers CS 519: Operating System Theory66
Direct Memory Access (DMA)
Problem with programmed I/O: CPU must load/store all the data into device registers.
The data is probably in memory anyway!
Solution: more hardware to allow the device to read and write memory just like the CPU
Base + bound or base + count registers in the device
Set base + count register
Set the start transmit register
I/O device reads memory from base
Interrupts when done
Computer Science, Rutgers CS 519: Operating System Theory67
PIO vs. DMA
Overhead less for PIO than DMAPIO is a check against the status register, then send or receive
DMA must set up the base, count, check status, take an interrupt
DMA is more efficient at moving dataPIO ties up the CPU for the entire length of the transfer
Size of the transfer becomes the key factor in when to use PIO vs. DMA
Computer Science, Rutgers CS 519: Operating System Theory68
Example of PIO vs. DMA
Given:A load costs 100 CPU "cycles“ (time units)
A store costs 50 cycles
To process an interrupt costs 2000 instructions, each an average each of 2 cycles
To send a packet via PIO costs 1 load + 1 store per byte
To send via DMA costs 4 loads + interrupt
Find the packet size where transmitting via DMA costs less CPU cycles than PIO
Computer Science, Rutgers CS 519: Operating System Theory69
Example PIO vs. DMA
Find the number of bytes were PIO==DMA (cutoff point) cycles per load: L
cycles per store: S
byte in the packet: C
Express simple equation for CPU cycles in terms of cost per byte:
# of cycles for PIO = L + S*B
# of cycles for DMA = setup + interrupt
# of cycles for DMA = 4L + 4000
Set PIO cycles equal to DMA cycles and solve for bytes:L+S*B = 4L+4000
100+50B = 4(100)+4000
B = 86 bytes (cutoff point)
When the packet size is >86 bytes, DMA costs less cycles than PIO.
Computer Science, Rutgers CS 519: Operating System Theory70
Typical I/O devices
Disk drives:Present the CPU a linear array of fixed-sized blocks that are persistent across power cycles
Network cards:Allow the CPU to send and receive discrete units of data (packets) across a wire, fiber or radio
Packet sizes 64-8000 bytes are typical
Graphics adapters:Present the CPU with a memory that is turned into pixels on a screen
Computer Science, Rutgers CS 519: Operating System Theory71
Recap: the I/O design space
Polling vs. interruptsHow does the device notify the processor an event happened?
Polling: Device is passive, CPU must read/write a register
Interrupt: device signals CPU via an interrupt
Programmed I/O vs. DMA How the device sends and receives data
Programmed I/O: CPU must use load/store into the device
DMA: Device reads and writes memory
Computer Science, Rutgers CS 519: Operating System Theory72
Practical: How to boot?
How does a machine start running the operating system in the first place?
The process of starting the OS is called booting
Sequence of hardware + software event form the boot protocol
Boot protocol in modern machines is a 3-stage processCPU starts executing from a fixed address
Firmware loads the boot loader
Boot loader loads the OS
Computer Science, Rutgers CS 519: Operating System Theory73
Boot Protocol
(1) CPU is hard-wired to start executing from a known address in memory
E.g., on x86 this address is 0xFFFF0 (hexadecimal)
This memory address is typically mapped to read-only memory (ROM)
(2) ROM contains the “boot” codeThis kind of read-only software is called firmware
On x86, the starting address corresponds to the BIOS (basic input-output system) boot entry point
This "firmware" code contains only enough enough code to read 1 block from the disk drive. This block is loaded and then executed. This program is the boot loader.
Computer Science, Rutgers CS 519: Operating System Theory74
Boot Protocol (cont)
(3) The boot loader can then load the rest of the operating system from disk. Note that this point the OS still is not running
The boot loader can know about multiple operating systems
The boot loader can know about multiple versions of the OS
Computer Science, Rutgers CS 519: Operating System Theory75
Why Have A Boot Protocol?
Why not just store the OS into ROM?
Separate the OS from the hardwareMultiple OSes or different versions of the OS
Want to boot from different devices E.g. security via a network boot
OS is pretty big (4-8MB). Rather not have it as firmware
Computer Science, Rutgers CS 519: Operating System Theory76
Multiprocessors
CPUMemory
memory bus
I/O bus
disk Net interface
cache
Simple scheme (SMP): more than one processor on the same bus
Memory is shared among processors -- cache coherence
Bus contention increases -- does not scale
Alternative (non-bus) system interconnect – complex and expensive
SMPs naturally support single-image operating systems
CPU
cache
Computer Science, Rutgers CS 519: Operating System Theory77
Cache-Coherent Shared-Memory: UMA
CPU
Memory
CPU CPU CPU
SnoopingCaches
Computer Science, Rutgers CS 519: Operating System Theory78
CC-NUMA Multiprocessors
CPUMemory
memory bus
I/O bus
disk Net interface
cache
CPUMemory
memory bus
I/O bus
diskNet interface
cache
network
• Non-uniform access to different memories• Hardware allows remote memory accesses and maintains cache coherence• Scalable interconnect more scalable than bus-based UMA systems• Also naturally supports single-image operating systems• Complex hardware coherence protocols
Computer Science, Rutgers CS 519: Operating System Theory79
Multicomputers
Network of computers: “share-nothing” -- cheap
Distributed resources: difficult to program
Message passing
Distributed file system
Challenge: build efficient global abstraction in software
CPUMemory
memory bus
I/O bus
disk Net interface
cache
CPUMemory
memory bus
I/O bus
diskNet interface
cache
network
Computer Science, Rutgers CS 519: Operating System Theory81
Basic computer structure
CPU Memory
memory bus
I/O bus
disk Net interface
Computer Science, Rutgers CS 519: Operating System Theory82
System Abstraction: Processes
A process is a system abstraction:illusion of being the only job in the system
hardware: computer
operating system: process
user: application create, kill processes,inter-process comm.
multiplex resources
Computer Science, Rutgers CS 519: Operating System Theory83
Processes: Mechanism and Policy
Mechanism:
Creation, destruction, suspension, context switch, signaling, IPC, etc.
Policy:
Minor policy questions:
Who can create/destroy/suspend processes?
How many active processes can each user have?
Major policy question that we will concentrate on:
How to share system resources between multiple processes?
Typically broken into a number of orthogonal policies for individual resources such as CPU, memory, and disk.
Computer Science, Rutgers CS 519: Operating System Theory84
Processor Abstraction: Threads
A thread is a processor abstraction: illusion of having 1 processor per execution context
hardware: processor
operating system: thread
application: execution contextcreate, kill, synch.
context switch
Process vs. Thread: Process is the unit of resource ownership,while Thread is the unit of instruction execution.
Computer Science, Rutgers CS 519: Operating System Theory85
Threads: Mechanism and Policy
Mechanism:Creation, destruction, suspension, context switch, signaling, synchronization, etc.
Policy:How to share the CPU between threads from different processes?
How to share the CPU between threads from the same process?
Computer Science, Rutgers CS 519: Operating System Theory86
Threads
Traditional approach: OS uses a single policy (or at most a fixed set of policies) to schedule all threads in the system. Assume two classes of jobs: interactive and batch.
New approaches: application-controlled scheduling, reservation-based scheduling
Computer Science, Rutgers CS 519: Operating System Theory87
Memory Abstraction: Virtual memory
hardware: physical memory
operating system: virtual memory
Virtual memory is a memory abstraction: illusion of large contiguous memory, often more memory than physically available
application: address spacevirtual addresses
physical addresses
Computer Science, Rutgers CS 519: Operating System Theory88
Virtual Memory: Mechanism
Mechanism:
Virtual-to-physical memory mapping, page-fault, etc.
physical memory:
v-to-p memory mappings
processes:
virtual address spacesp1 p2
Computer Science, Rutgers CS 519: Operating System Theory89
Virtual Memory: Policy
Policy:How to multiplex a virtual memory that is larger than the physical memory onto what is available?
How to share physical memory between multiple processes?
Computer Science, Rutgers CS 519: Operating System Theory90
Virtual Memory
Traditional approach: OS provides a sufficiently large virtual address space for each running application, does memory allocation and replacement, and may ensure protection
New approaches: external memory management, huge (64-bit) address space, global virtual address space
Computer Science, Rutgers CS 519: Operating System Theory91
Storage Abstraction: File System
hardware: disk
operating system: files, directories
A file system is a storage abstraction: illusion of structured storage space
application/user: copy file1 file2 naming, protection,operations on files
operations on disk blocks
Computer Science, Rutgers CS 519: Operating System Theory92
File System
Mechanism:File creation, deletion, read, write, file-block-to-disk-block mapping, file buffer cache, etc.
Policy:Sharing vs. protection?
Which block to allocate for new data?
File buffer cache management?
Computer Science, Rutgers CS 519: Operating System Theory93
File System
Traditional approach: OS does disk block allocation and caching (buffer cache) , disk operation scheduling, and management of the buffer cache
New approaches: application-controlled cache replacement, log-based allocation (makes writes fast)
Computer Science, Rutgers CS 519: Operating System Theory94
Traditional file system
application: read/write files
OS: translate file to disk blocks
...buffer cache ...maintains
controls disk accesses: read/write blocks
hardware:
Computer Science, Rutgers CS 519: Operating System Theory95
Application-controlled caching
application: read/write files replacement policy
OS: translate file to disk blocks
...buffer cache ...maintains
controls disk accesses: read/write blocks
hardware:
Computer Science, Rutgers CS 519: Operating System Theory96
Communication Abstraction: Messaging
hardware: network interface
operating system: TCP/IP protocols
Message passing is a communication abstraction: illusion of reliable (sometime ordered) transport
application: socketsnaming, messages
network packets
Computer Science, Rutgers CS 519: Operating System Theory97
Message Passing
Mechanism:Send, receive, buffering, retransmission, etc.
Policy:Congestion control and routing
Multiplexing multiple connections onto a single NIC
Computer Science, Rutgers CS 519: Operating System Theory98
Message Passing
Traditional approach: OS provides naming schemes, reliable transport of messages, packet routing to destination
New approaches: user-level protocols, zero-copy protocols, active messages, memory-mapped communication
Computer Science, Rutgers CS 519: Operating System Theory99
UMA Multiprocessors
CPU
Memory
memory bus
I/O bus
disk Net interface
cache
CPU
cache
Computer Science, Rutgers CS 519: Operating System Theory100
UMA Multiprocessors: OS Issues
ProcessesHow to divide processors among multiple processes? Time sharing vs. space sharing
ThreadsSynchronization mechanisms based on shared memory
How to schedule threads of a single process on its allocated processors?
Affinity scheduling?
Computer Science, Rutgers CS 519: Operating System Theory101
CC-NUMA Multiprocessors
CPUMemory
memory bus
I/O bus
disk Net interface
cache
CPUMemory
memory bus
I/O bus
diskNet interface
cache
network
Hardware allows remote memory accesses and maintains cache coherence through protocol
Computer Science, Rutgers CS 519: Operating System Theory102
CC-NUMA Multiprocessors: OS Issues
Memory locality!!Remote memory access up to an order of magnitude more expensive than local access
Thread migration vs. page migration
Page replication
Affinity scheduling
Computer Science, Rutgers CS 519: Operating System Theory103
Multicomputers
CPUMemory
memory bus
I/O bus
disk Net interface
cache
CPUMemory
memory bus
I/O bus
diskNet interface
cache
network
Share-nothing
Computer Science, Rutgers CS 519: Operating System Theory104
Multicomputers: OS Issues
SchedulingNode allocation? (CPU and memory allocated together)
Process migration?
Software distributed shared-memory (Soft DSM)
Distributed file systems
Low-latency reliable communication
Computer Science, Rutgers CS 519: Operating System Theory106
Traditional OS structure
Monolithic/layered systems
one/N layers all executed in “kernel-mode”
good performance but rigid
OS kernel
hardware
userprocess
filesystem
memorysystem
usersystem calls
Computer Science, Rutgers CS 519: Operating System Theory107
Micro-kernel OS
client-server model, IPC between clients and servers
the micro-kernel provides protected communication
OS functions implemented as user-level servers
flexible but efficiency is the problem
easy to extend for distributed systems
micro-kernel
hardware
clientprocess
fileserver
memoryserver
IPC
user mode
Computer Science, Rutgers CS 519: Operating System Theory108
Extensible OS kernel
User processes can load customized OS services into the kernel
Good performance and flexibility but protection and scalability become problems
extensible kernel
hardware
process
defaultmemoryservice
user modeprocess
mymemoryservice
Computer Science, Rutgers CS 519: Operating System Theory109
Virtual Machines
Old concept which is heavily revived today
the real hardware is “cloned” in several identical virtual machines
OS functionality built on top of the virtual machine
hardware
user
exokernel
allocate resourceOS on virtual machine
Computer Science, Rutgers CS 519: Operating System Theory111
The UNIX Time-sharing System
FeaturesTime-sharing system
Hierarchical file system
System command language (shell)
File-based device-independent I/O
Versions 1 & 2No multi-programming
Ran on PDP-7,9,11
Computer Science, Rutgers CS 519: Operating System Theory112
More History
Version 4Ran on PDP-11 (hardware costing < $40k)
Took less than 2 man-years to code
~50KB code size (kernel)
Written in C
Computer Science, Rutgers CS 519: Operating System Theory113
File System
Ordinary files (uninterpreted)
DirectoriesFile of files
Organized as a rooted tree
Pathnames (relative and absolute)
Contains links to parent, itself
Multiple links to files can exist
Link - hard OR symbolic
Computer Science, Rutgers CS 519: Operating System Theory114
File System (contd)
Special filesEach I/O device associated with a special file
To provide uniform naming and protection model
Uniform I/O
Computer Science, Rutgers CS 519: Operating System Theory115
Removable File Systems
Tree-structured file hierarchies
Mounted on existing space by using mount
No links between different file systems
Computer Science, Rutgers CS 519: Operating System Theory116
Protection
User iduid marked on files
Ten protection bitsnine - rwx permissions for user, group & other
setuid bit is used to change user id
Super-user has special uidexempt from constraints on access
Computer Science, Rutgers CS 519: Operating System Theory117
Uniform I/O Model
Basic system callsopen, close, creat, read, write, seek
Streams of bytes, no records
No locks visible to the user
Computer Science, Rutgers CS 519: Operating System Theory118
File System Implementation
I-node contains a short description of file
direct, single-indirect and double-indirect pointers to disk blocks
I-listtable of i-nodes, indexed by i-number
pathname scanning to determine i-number
Allows simple and efficient fsck
Buffered data
Write-behind
Computer Science, Rutgers CS 519: Operating System Theory119
Processes
ProcessMemory image, register values, status of open files etc.
Memory image consists of text, data, and stack segments
To create new processes
pid = fork()
process splits into two independently executing processes (parent and child)
Pipes used for communication between related processes
exec(file, arg1, ..., argn) used to start another application
Computer Science, Rutgers CS 519: Operating System Theory120
The Shell
Command-line interpreter
cmd arg1 arg2 ... argn
i/o redirection
<, >
filters & pipes
ls | more
job control
cmd &
simplified shell =>
Computer Science, Rutgers CS 519: Operating System Theory121
Plan 9 From Bell Labs
Move towards distributed computing with PCs brought two problems:
Focus on private machines difficult for networks of machines to serve as seamlessly as old monolithic timesharing systems.
Many machines in physically different locations administrative nightmare
Plan 9build Unix out of lot of little systems, not a system out of lots of little Unixes.
Computer Science, Rutgers CS 519: Operating System Theory122
Plan 9 Core
Client/server modelShared-memory multiprocessors provide computing cycles
File servers provide storage
PCs/workstation provides terminal
Three basic principles:File-like API to name and access all resources
Standard protocol for accessing these resources
Personal hierarchical namespace
Computer Science, Rutgers CS 519: Operating System Theory124
Everything Is A File
All systems export a file system APIOpen, close, read, write, etc.
API defined as set of messages on a communication channelextends easily to a remote environment
Kernel-based service - mount deviceAttach communication channels to remote servers, into the name space
Bind to give a new name for an existing object
Converts system calls on remote resource into 9P messages on the communication channel
Real file systemSingle view of system spread across many disks
Storage hierarchy: memory buffer, disks, WORM jukebox
Computer Science, Rutgers CS 519: Operating System Theory125
Resource Access Protocol
A single resource access protocol - 9PControls file systems
Resolve file names, traverse name hierarchy of filesystems
Computer Science, Rutgers CS 519: Operating System Theory126
Personal Name Space
Personal name space a powerful abstractionAlso used in other systems (e.g. Spring)
Plan 9 example: "my house" vs. the exact address/dev/cons always refers to the user's terminal
Local name space obeys globally understood conventions
Computer Science, Rutgers CS 519: Operating System Theory127
Parallel Programming
Processes (no threads)Can control what resources are shared when forking
New processes created using an rfork system call
argument to rfork is a bit vector which specifies which of the parent's resources should be copied/shared/created anew
resources - name space, the environment, file descriptor table, memory etc.
Basic IPC mechanism: rendezvous
Computer Science, Rutgers CS 519: Operating System Theory128
Protection & Access Control
Authentication mechanism based on DESbuilt into 9P
No super-useruser with special privileges for maintenance
Use of file system interface for most of the services takes care of ownership and permissions problems