View
217
Download
0
Category
Preview:
Citation preview
CS194-3/CS16xIntroduction to Systems
Lecture 13
Demand Paging
October 10, 2007
Prof. Anthony D. Joseph
http://www.cs.berkeley.edu/~adj/cs16x
Lec 13.310/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Review: What Factors Lead to Misses?• Compulsory Misses:
– Pages that have never been paged into memory before– How might we remove these misses?
» Prefetching: loading them into memory before needed» Need to predict future somehow! More later.
• Capacity Misses:– Not enough memory. Must somehow increase size.– Can we do this?
» One option: Increase amount of DRAM (not quick fix!)» Another option: If multiple processes in memory: adjust percentage of memory allocated to each one!
• Conflict Misses:– Technically, conflict misses don’t exist in virtual memory, since it is a “fully-associative” cache
• Policy Misses:– Caused when pages were in memory, but kicked out prematurely because of the replacement policy
– How to fix? Better replacement policy
Lec 13.410/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Review: Page Replacement Policies• Why do we care about Replacement Policy?
– Replacement is an issue with any cache– Particularly important with pages
» The cost of being wrong is high: must go to disk» Must keep important pages in memory, not toss them out
• FIFO (First In, First Out)– Throw out oldest page. Be fair – let every page live in memory for same amount of time.
– Bad, because throws out heavily used pages instead of infrequently used pages
• MIN (Minimum): – Replace page that won’t be used for the longest time
– Great, but can’t really know future…– Makes good comparison case, however
• RANDOM:– Pick random page for every replacement– Typical solution for TLB’s. Simple hardware– Pretty unpredictable – makes it hard to make real-time guarantees
Lec 13.510/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Review: Replacement Policies (Con’t)• LRU (Least Recently Used):
– Replace page that hasn’t been used for the longest time
– Programs have locality, so if something not used for a while, unlikely to be used in the near future.
– Seems like LRU should be a good approximation to MIN.
• In practice LRU is expensive to implement, so people approximate LRU
Lec 13.610/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Goals for Today
• Page Replacement Policies– Nth chance algorithm– Second-Chance-List Algorithm
• Page Allocation Policies• Working Set/Thrashing
Note: Some slides and/or pictures in the following areadapted from slides ©2005 Silberschatz, Galvin, and Gagne. Slides courtesy of Kubiatowicz, AJ Shankar, George Necula, Alex Aiken, Eric Brewer, Ras Bodik, Ion Stoica, Doug Tygar, and David Wagner.
Lec 13.710/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Demand Paging Example• Since Demand Paging like caching, can compute average access time! (“Effective Access Time”)– EAT = Hit Rate x Hit Time + Miss Rate x Miss Time
• Example:– Memory access time = 200 nanoseconds– Average page-fault service time = 8 milliseconds– Suppose p = Probability of miss, 1-p = Probably of hit
– Then, we can compute EAT as follows:EAT = (1 – p) x 200ns + p x 8 ms
= (1 – p) x 200ns + p x 8,000,000ns = 200ns + p x 7,999,800ns• If one access out of 1,000 causes a page fault, then EAT = 8.2 μs:– This is a slowdown by a factor of 40!
• What if want slowdown by less than 10%?– 200ns x 1.1 < EAT p < 2.5 x 10-6
– This is about 1 page fault in 400000!
Lec 13.810/10/07 Joseph CS194-3/16x ©UCB Fall 2007
• Suppose we have 3 page frames, 4 virtual pages, and following reference stream: – A B C A B D A D B C B
• Consider FIFO Page replacement:
– FIFO: 7 faults. – When referencing D, replacing A is bad choice, since need A again right away
Example: FIFO
C
B
A
D
C
B
A
BCBDADBACBA
3
2
1
Ref:
Page:
Lec 13.910/10/07 Joseph CS194-3/16x ©UCB Fall 2007
• Suppose we have the same reference stream: – A B C A B D A D B C B
• Consider MIN Page replacement:
– MIN: 5 faults – Where will D be brought in? Look for page not referenced farthest in future.
• What will LRU do?– Same decisions as MIN here, but won’t always be true!
Example: MIN
C
DC
B
A
BCBDADBACBA
3
2
1
Ref:
Page:
Lec 13.1010/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Administrivia
• Project 1 code due tomorrow (Thu) by 11:59pm
• Journal due Fri by 11:59pm
Lec 13.1110/10/07 Joseph CS194-3/16x ©UCB Fall 2007
• Consider the following: A B C D A B C D A B C D
• LRU Performs as follows (same as FIFO here):
– Every reference is a page fault!• MIN Does much better:
D
When will LRU perform badly?
C
B
A
D
C
B
A
D
C
B
A
CBADCBADCBA D
3
2
1
Ref:
Page:
B
C
DC
B
A
CBADCBADCBA D
3
2
1
Ref:
Page:
Lec 13.1210/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Graph of Page Faults Versus The Number of Frames
• One desirable property: When you add memory the miss rate goes down– Does this always happen?– Seems like it should, right?
• No: BeLady’s anomaly – Certain replacement algorithms (FIFO) don’t have this obvious property!
Lec 13.1310/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Adding Memory Doesn’t Always Help Fault Rate
• Does adding memory reduce number of page faults?– Yes for LRU and MIN– Not necessarily for FIFO! (Called Belady’s anomaly)
• After adding memory:– With FIFO, contents can be completely different– In contrast, with LRU or MIN, contents of memory with X pages are a subset of contents with X+1 Page
D
C
E
B
A
D
C
B
A
DCBAEBADCBA E
3
2
1
Ref:Page:
CD4
E
D
B
A
E
C
B
A
DCBAEBADCBA E
3
2
1
Ref:Page:
Lec 13.1410/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Nth Chance version of Clock Algorithm• Nth chance algorithm: Give page N chances
– OS keeps counter per page: # sweeps– On page fault, OS checks use bit:
» 1clear use and also clear counter (used in last sweep)
» 0increment counter; if count=N, replace page– Means that clock hand has to sweep by N times without page being used before page is replaced
• How do we pick N?– Why pick large N? Better approx to LRU
» If N ~ 1K, really good approximation– Why pick small N? More efficient
» Otherwise might have to look a long way to find free page
• What about dirty pages?– Takes extra overhead to replace a dirty page, so give dirty pages an extra chance before replacing?
– Common approach:» Clean pages, use N=1» Dirty pages, use N=2 (and write back to disk when N=1)
Lec 13.1510/10/07 Joseph CS194-3/16x ©UCB Fall 2007
DBMS “Clock” Replacement Policy
• An approximation of LRU• Arrange frames into a cycle, store one reference bit per frame– Can think of this as the 2nd chance bit
• When pin count reduces to 0, turn on ref. bit
• When replacement necessarydo for each page in cycle {
if (pincount == 0 && ref bit is on)turn off ref bit;
else if (pincount == 0 && ref bit is off)
choose this page for replacement;
} until a page is chosen;
A(1)
B(p)
C(1)
D(1)
Lec 13.1610/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Clock Algorithms: Details
• Which bits of a PTE entry are useful to us?– Use: Set when page is referenced; cleared by clock algorithm
– Modified: set when page is modified, cleared when page written to disk
– Valid: ok for program to reference this page– Read-only: ok for program to read page, but not modify» For example for catching modifications to code pages!
• Remember, however, that clock is just an approximation of LRU– Can we do a better approximation, given that we have to take page faults on some reads and writes to collect use information?
– Need to identify an old page, not oldest page!– Answer: second chance list
Lec 13.1710/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Second-Chance List Algorithm (VAX/VMS)
• Split memory in two: Active list (RW), SC list (Invalid)
• Access pages in Active list at full speed• Otherwise, Page Fault
– Always move overflow page from end of Active list to front of Second-chance list (SC) and mark invalid
– Desired Page On SC List: move to front of Active list, mark RW
– Not on SC list: page in to front of Active list, mark RW; page out LRU victim at end of SC list
DirectlyMapped Pages
Marked: RWList: FIFO
Second Chance List
Marked: InvalidList: LRU
LRU victim
Page-inFrom disk
NewActivePages
Access
NewSCVictims
Overflow
Lec 13.1810/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Second-Chance List Algorithm (con’t)
• How many pages for second chance list?– If 0 FIFO– If all LRU, but page fault on every page reference
• Pick intermediate value. Result is:– Pro: Few disk accesses (page only goes to disk if unused for a long time)
– Con: Increased overhead trapping to OS (software / hardware tradeoff)
• With page translation, we can adapt to any kind of access the program makes– We can even use page translation / protection to share memory between threads on widely separated machines
Lec 13.1910/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Free List
• Keep set of free pages ready for use in demand paging– Freelist filled in background by Clock algorithm or other technique (“Pageout demon”)
– Dirty pages start copying back to disk when enter list• Like VAX second-chance list
– If page needed before reused, just return to active set
• Advantage: Faster for page fault– Can always use page (or pages) immediately on fault
Set of all pagesin Memory
Single Clock Hand:Advances as needed to keep freelist full (“background”)D
D
Free PagesFor Processes
Lec 13.2010/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Demand Paging (more details)
• Does software-loaded TLB need use bit? Two Options:– Hardware sets use bit in TLB; when TLB entry is replaced, software copies use bit back to page table
– Software manages TLB entries as FIFO list; everything not in TLB is Second-Chance list, managed as strict LRU
• Core Map– Page tables map virtual page physical page – Do we need a reverse mapping (i.e. physical page virtual page)?» Yes. Clock algorithm runs through page frames. If sharing, then multiple virtual-pages per physical page
» Can’t push page out to disk without invalidating all PTEs
Lec 13.2210/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Allocation of Page Frames (Memory Pages)
• How do we allocate memory among different processes?– Does every process get the same fraction of memory? Different fractions?
– Should we completely swap some processes out of memory?
• Each process needs minimum number of pages– Want to make sure that all processes that are loaded into memory can make forward progress
– Example: IBM 370 – 6 pages to handle SS MOVE instruction:» instruction is 6 bytes, might span 2 pages» 2 pages to handle from» 2 pages to handle to
• Possible Replacement Scopes:– Global replacement – process selects replacement frame from set of all frames; one process can take a frame from another
– Local replacement – each process selects from only its own set of allocated frames
Lec 13.2310/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Fixed/Priority Allocation• Equal allocation (Fixed Scheme):
– Every process gets same amount of memory– Example: 100 frames, 5 processesprocess gets 20 frames
• Proportional allocation (Fixed Scheme)– Allocate according to the size of process– Computation proceeds as follows:
si = size of process pi and S = si m = total number of frames
ai = allocation for pi =
• Priority Allocation:– Proportional scheme using priorities rather than size» Same type of computation as previous scheme
– Possible behavior: If process pi generates a page fault, select for replacement a frame from a process with lower priority number
• Perhaps we should use an adaptive scheme instead???– What if some application just needs more memory?
mS
si
Lec 13.2410/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Page-Fault Frequency Allocation
• Can we reduce Capacity misses by dynamically changing the number of pages/application?
• Establish “acceptable” page-fault rate– If actual rate too low, process loses frame– If actual rate too high, process gains frame
• Question: What if we just don’t have enough memory?
Lec 13.2510/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Thrashing
• If a process does not have “enough” pages, the page-fault rate is very high. This leads to:– low CPU utilization– operating system spends most of its time swapping to disk
• Thrashing a process is busy swapping pages in and out
• Questions:– How do we detect Thrashing?– What is best response to Thrashing?
Lec 13.2610/10/07 Joseph CS194-3/16x ©UCB Fall 2007
• Program Memory Access Patterns have temporal and spatial locality– Group of Pages accessed along a given time slice called the “Working Set”
– Working Set defines minimum number of pages needed for process to behave well
• Not enough memory for Working SetThrashing– Better to swap out process?
Locality In A Memory-Reference Pattern
Lec 13.2710/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Working-Set Model
working-set window fixed number of page references – Example: 10,000 instructions
• WSi (working set of Process Pi) = total set of pages referenced in the most recent (varies in time)– if too small will not encompass entire locality– if too large will encompass several localities– if = will encompass entire program
• D = |WSi| total demand frames • if D > m Thrashing
– Policy: if D > m, then suspend one of the processes– This can improve overall system behavior by a lot!
Lec 13.2810/10/07 Joseph CS194-3/16x ©UCB Fall 2007
What about Compulsory Misses?
• Recall that compulsory misses are misses that occur the first time that a page is seen– Pages that are touched for the first time– Pages that are touched after process is swapped out/swapped back in
• Clustering:– On a page-fault, bring in multiple pages “around” the faulting page
– Since efficiency of disk reads increases with sequential reads, makes sense to read several sequential pages
• Working Set Tracking:– Use algorithm to try to track working set of application
– When swapping process back in, swap in working set
Lec 13.2910/10/07 Joseph CS194-3/16x ©UCB Fall 2007
Summary• Replacement policies
– FIFO: Place pages on queue, replace page at end– MIN: Replace page that will be used farthest in future– LRU: Replace page used farthest in past
• Nth-chance clock algorithm: Another approx LRU– Give pages multiple passes of clock hand before replacing
• Second-Chance List algorithm: Yet another approx LRU– Divide pages into two groups, one of which is truly LRU and managed on page faults.
• Working Set:– Set of pages touched by a process recently
• Thrashing: a process is busy swapping pages in and out– Process will thrash if working set doesn’t fit in memory– Need to swap out a process
Recommended