1
Lecture 52: Swapping & Flushing
When pages are updated
2
[Figure: main memory and disk (block device). The executable a.out on disk holds text and data.]
3
[Figure: the text and data of a.out are loaded (scattered) into main memory. Text pages are marked X, data pages RW.]
4
[Figure: same layout with a stack added in main memory. Text is X; data and stack are RW.]
5
[Figure: text (X) is always clean; data and stack (RW) pages are written.]
6
[Figure: text (X) is always clean; data and stack (RW) pages are written; written data pages need write-back to disk.]
7
[Figure: text (X) is always clean; data and stack (RW) pages are written. Written data pages are written back immediately, or flushed periodically.]
8
[Figure: text (X) is always clean; data and stack (RW) pages are written. Written data pages are written back immediately, or flushed periodically.]
Page cache (address_space struct): pages are clean, dirty, or locked.
9
[Figure: text (X) is always clean; data (RW) is written; the stack (RW) is written and has no disk counterpart.]
10
[Figure: text (X) is always clean. Written data pages (RW): write-back or flush to disk. Written stack pages (RW): swap (save) this page to the swap area on disk for later use.]
11
"Anonymous page" such as stack, heap
• does not map to a file on disk
• swap (save) this page for later use
[Figure: text (X) is always clean. Written data pages (RW): write-back or flush to disk. Written stack pages (RW): saved to the swap area on disk.]
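The three cases in the diagrams above can be summarized in a few lines of C. This is only a sketch of the simplified model in the slides, not kernel code: the enum names and the function `evict_action_for` are hypothetical, and the real kernel decides this per mapping rather than per "kind".

```c
#include <stdbool.h>

/* Hypothetical page kinds, named after the regions in the slides. */
enum page_kind { PAGE_TEXT, PAGE_FILE_DATA, PAGE_ANON };

/* What must happen before the page frame can be reused. */
enum evict_action { EVICT_DISCARD, EVICT_WRITE_BACK, EVICT_SWAP_OUT };

/* Decide what to do with a page that is about to lose its frame. */
enum evict_action evict_action_for(enum page_kind kind, bool dirty)
{
    switch (kind) {
    case PAGE_TEXT:
        /* Text is never written ("always clean"): reload from a.out later. */
        return EVICT_DISCARD;
    case PAGE_ANON:
        /* Stack/heap has no file on disk: save it in the swap area. */
        return EVICT_SWAP_OUT;
    case PAGE_FILE_DATA:
    default:
        /* File-backed data: write back only if it was modified. */
        return dirty ? EVICT_WRITE_BACK : EVICT_DISCARD;
    }
}
```

For example, `evict_action_for(PAGE_ANON, true)` yields `EVICT_SWAP_OUT`, while a clean file-backed page is simply discarded.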
12
13
Page Frame Reclamation
• The kernel refills the free-block list before all free memory is gone
• PFRA (Page Frame Reclaiming Algorithm)
– select a page frame, make this page free
• Target pages can belong to
– user-mode processes, or
– kernel caches (slab layer)
• If the page is dirty
– write it back or swap it out
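The reclaim loop described above can be sketched as a toy model. Everything here is an assumption made for illustration: `struct toy_frame`, the `last_used` timestamp, and the "oldest in-use frame first" victim choice are hypothetical stand-ins, and the real PFRA works on the kernel's LRU lists instead of scanning an array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page frame descriptor for this sketch. */
struct toy_frame {
    bool in_use;
    bool dirty;
    unsigned long last_used;   /* assumed timestamp used to pick a victim */
};

/* Select the least recently used in-use frame; returns -1 if none. */
int pfra_select(const struct toy_frame *frames, size_t n)
{
    int victim = -1;
    for (size_t i = 0; i < n; i++) {
        if (!frames[i].in_use)
            continue;
        if (victim < 0 || frames[i].last_used < frames[victim].last_used)
            victim = (int)i;
    }
    return victim;
}

/* Free frames until at least target_free frames are free.
 * Returns how many victims were dirty and had to be written or swapped. */
size_t pfra_reclaim(struct toy_frame *frames, size_t n, size_t target_free)
{
    size_t free_count = 0, written = 0;

    for (size_t i = 0; i < n; i++)
        if (!frames[i].in_use)
            free_count++;

    while (free_count < target_free) {
        int v = pfra_select(frames, n);
        if (v < 0)
            break;                      /* nothing left to reclaim */
        if (frames[v].dirty) {
            written++;                  /* "if dirty: write or swap" */
            frames[v].dirty = false;
        }
        frames[v].in_use = false;       /* make this page frame free */
        free_count++;
    }
    return written;
}
```

The point of the sketch is the ordering: reclaim starts before memory runs out (the `target_free` threshold), and a dirty victim costs an extra write before its frame becomes free.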
14
Writing out Dirty Pages
Chapter 15, Love’s book
15
Writing out Dirty Pages
Chapter 15, Love’s book
[Figure: text (X); data and stack (RW) pages are written. Written data pages are written back to disk immediately, or flushed periodically.]
16
pdflush Daemon
• Dirty page writeback occurs when
– there is no free memory (below a specified threshold)
– dirty data has become too old (older than a specific threshold)
[Figure: page cache with clean, dirty, and locked pages]
• Settings for the pdflush daemon
– dirty_background_ratio (free mem < d_b_r)
– dirty_expire_centisecs (mod_time > d_e_c)
– dirty_ratio (no. of dirty pages > d_r)
– dirty_writeback_centisecs (cycle time of pdflush)
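The conditions on this slide can be written as three small predicates. This is a sketch that follows the slide's shorthand literally (ratios as percentages of total pages, ages in centiseconds); the struct, the function names, and the example values are assumptions, and the kernel's actual accounting is more involved.

```c
#include <stdbool.h>

/* Tunables named on the slide; the values are supplied by the caller. */
struct wb_tunables {
    unsigned dirty_background_ratio;     /* percent of total pages */
    unsigned dirty_expire_centisecs;     /* age limit, in 1/100 s */
    unsigned dirty_ratio;                /* percent of total pages */
    unsigned dirty_writeback_centisecs;  /* pdflush wake-up period, in 1/100 s */
};

/* Slide: "free mem < d_b_r" starts background writeback. */
bool low_memory(unsigned long total_pages, unsigned long free_pages,
                const struct wb_tunables *t)
{
    return free_pages * 100 < total_pages * t->dirty_background_ratio;
}

/* Slide: "mod_time > d_e_c" means the dirty data is too old. */
bool page_too_old(unsigned long now_cs, unsigned long dirtied_cs,
                  const struct wb_tunables *t)
{
    return now_cs - dirtied_cs > t->dirty_expire_centisecs;
}

/* Slide: "no. of dirty pages > d_r". */
bool too_many_dirty(unsigned long total_pages, unsigned long dirty_pages,
                    const struct wb_tunables *t)
{
    return dirty_pages * 100 > total_pages * t->dirty_ratio;
}
```

With example values `{ 10, 3000, 40, 500 }`, a page dirtied 40 seconds ago (4000 centiseconds) counts as too old, while one dirtied 20 seconds ago does not.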
17
Swapping out Anonymous Pages
18
Swapping out Anonymous Pages
[Figure: text (X); data and stack (RW) pages are written. Written data pages: write-back or flush. Written stack pages: swap (save) this page to the swap area on disk for later use.]
19
Linux swapping
• Pages like stack/heap cannot be discarded (they are used later)
• They have to be copied to a backing store, called the swap area
• Strictly speaking, Linux does not swap, because
– 'swapping' means copying an entire process address space to disk
– 'paging' means copying out individual pages
• Linux actually implements paging (traditionally it has been called swapping)
• Linux swapping is done as part of page frame reclaiming
[Figure: individual pages in memory are copied to the swap area]
20
Swap Area in Disk (p 179 Gorman)
• Multiple swap areas
– the system administrator spreads the load among several disks
– faster swap areas (faster disks) may have higher priority
– swapping may start from the faster swap area
– multiple swap areas may be read/written concurrently
• Each active swap area is a file or partition (max 32 swap areas)
• Each swap area is divided up into page-sized slots on disk
[Figure: a swap area (file or partition) divided into slots, one page each]
21
Swap Area in Disk (p 179 Gorman)
struct swap_info_struct {
    unsigned int flags;
    spinlock_t sdev_lock;
    struct file *swap_file;
    struct block_device *bdev;
};
[Figure: the swap_info[] array has entries 0 to 31; each entry is a struct swap_info_struct describing one swap area (file or partition), which is divided into slots of one page each.]
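To make the priority rule concrete, here is a toy allocator that hands out slots from the highest-priority active swap area that still has room. It is a sketch under stated assumptions: `struct toy_swap_area`, `toy_swap_info`, `TOY_SLOTS`, and `toy_get_swap_slot` are hypothetical names, the areas are tiny in-memory arrays, and only the limit of 32 areas comes from the slide.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SWAPFILES 32   /* "max 32 swap areas" from the slide */
#define TOY_SLOTS 8            /* hypothetical, tiny number of slots per area */

/* Hypothetical stand-in for one swap area. */
struct toy_swap_area {
    int active;                          /* is this area in use? */
    int prio;                            /* higher value = preferred */
    unsigned char slot_used[TOY_SLOTS];  /* one flag per page-sized slot */
};

static struct toy_swap_area toy_swap_info[TOY_MAX_SWAPFILES];

/* Index of the first free slot in an area, or -1 if the area is full. */
static int first_free_slot(const struct toy_swap_area *a)
{
    for (int s = 0; s < TOY_SLOTS; s++)
        if (!a->slot_used[s])
            return s;
    return -1;
}

/* Allocate one slot from the highest-priority area with free space.
 * On success stores the area index and slot index and returns 0;
 * returns -1 when every active area is full. */
int toy_get_swap_slot(int *type, int *slot)
{
    int best = -1;

    assert(type != NULL && slot != NULL);
    for (int i = 0; i < TOY_MAX_SWAPFILES; i++) {
        if (!toy_swap_info[i].active || first_free_slot(&toy_swap_info[i]) < 0)
            continue;
        if (best < 0 || toy_swap_info[i].prio > toy_swap_info[best].prio)
            best = i;
    }
    if (best < 0)
        return -1;

    *type = best;
    *slot = first_free_slot(&toy_swap_info[best]);
    toy_swap_info[best].slot_used[*slot] = 1;
    return 0;
}
```

Once the preferred area fills up, allocation falls through to the next priority, which mirrors "swapping may start from the faster swap area".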
22
Swapping Subsystem
• PTE keeps track of the positions of data in swap area
[Figure: page Pk was swapped out. Its PTE (entry k) stores two indices: an index into swap_info[] (entries 0 to 31, each a swap_info_struct with flags, sdev_lock, *swap_file, *bdev) and an index of the slot inside that swap area (file or partition).]
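One way to picture a PTE that "keeps track of the position" is to pack the two indices into the bits of a non-present entry. The layout below is purely illustrative: the bit positions, `toy_pte_t`, and the `toy_*` helpers are assumptions for this sketch and do not reproduce any particular architecture's encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 0 = present, bits 1..5 = swap area index (0..31),
 * remaining high bits = slot index inside that area. */
#define TOY_SWP_TYPE_SHIFT   1
#define TOY_SWP_TYPE_BITS    5
#define TOY_SWP_TYPE_MASK    ((1u << TOY_SWP_TYPE_BITS) - 1)
#define TOY_SWP_OFFSET_SHIFT (TOY_SWP_TYPE_SHIFT + TOY_SWP_TYPE_BITS)

typedef uint32_t toy_pte_t;

/* Build a non-present PTE that remembers where the page went. */
toy_pte_t toy_swp_pte(unsigned type, uint32_t offset)
{
    assert(type <= TOY_SWP_TYPE_MASK);
    return ((toy_pte_t)offset << TOY_SWP_OFFSET_SHIFT) |
           ((toy_pte_t)type << TOY_SWP_TYPE_SHIFT);
}

/* Present bit: zero for a swapped-out page, so any access faults. */
int toy_pte_present(toy_pte_t pte)
{
    return (int)(pte & 1u);
}

/* Index into the swap area array. */
unsigned toy_swp_type(toy_pte_t pte)
{
    return (pte >> TOY_SWP_TYPE_SHIFT) & TOY_SWP_TYPE_MASK;
}

/* Slot index inside the swap area. */
uint32_t toy_swp_offset(toy_pte_t pte)
{
    return pte >> TOY_SWP_OFFSET_SHIFT;
}
```

Because the present bit stays clear, the next access to the page faults, and the fault handler can recover both indices from the entry to find the saved page.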
23
Swap in – Race Problem
• Swap in can cause a race condition
• Example [2-process case]
• 1st process accesses page X and page faults
– kernel tries to swap in
– allocates a new page frame
– starts I/O operation
• 2nd process accesses page X and page faults
– kernel tries to swap in
– allocates a new page frame
– starts I/O operation
24
Swap out – Race Problem
• Swap out can cause a race condition
• Example [N processes share a page]
[Figure: processes PA, PB, PC, PD run on CPU1 to CPU4; each has a PTE mapping the shared page X.]
25
Swap out – Race Problem
• Swap out can cause a race condition
• Example [N processes share a page]
– 1st process swaps out page X
– other processes access page X
[Figure, two panels: processes PA, PB, PC, PD run on CPU1 to CPU4, each with a PTE for the shared page X; the second panel shows an additional copy of page X.]
26
Linux Solution – Swap Cache
Swapping out pages
• If a page is shared, a special entry (swap entry) is allocated
• "Swap out" just decrements the reference count in the swap entry
• Only when the count reaches zero will the page be freed
• Pages like this are considered to be in the swap cache
• The swap cache is implemented by the page cache data structure
• The swap cache is purely conceptual because it is simply a specialization of the page cache
[Figure: PA and PB each have a PTE mapping page X, which is in the swap cache with count 2.]
27
Linux Solution – Swap Cache
Swapping out pages
[Figure, two panels: page X in the swap cache is shared by PA and PB (count 2); after one process swaps it out, that PTE mapping is removed (marked X) and the count drops to 1.]
28
Linux Solution – Swap Cache
Swapping out pages
[Figure, three panels: the swap cache count for page X goes from 2 to 1 to 0 as PA and PB each swap it out (removed mappings are marked X); page X ends up in the swap area.]
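The counting shown on the swap cache slides (count 2, then 1, then 0) can be sketched as follows. This models only the slides' simplified description; `struct toy_swap_cache_entry` and `toy_swap_out` are hypothetical names, and the kernel's real bookkeeping is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical swap cache entry for one shared page. */
struct toy_swap_cache_entry {
    int  count;           /* processes that still map the page */
    bool page_in_memory;  /* is the page frame still allocated? */
    bool saved_in_swap;   /* have the contents reached the swap area? */
};

/* One process swaps out its mapping of the shared page.
 * Returns true if this call dropped the last reference and freed the frame. */
bool toy_swap_out(struct toy_swap_cache_entry *e)
{
    assert(e->count > 0);

    e->count--;                 /* "swap out" just decrements the count */
    if (e->count > 0)
        return false;           /* other sharers still use the frame */

    e->saved_in_swap = true;    /* contents now live in the swap area */
    e->page_in_memory = false;  /* count reached zero: free the frame */
    return true;
}
```

With two sharers, the first call returns false and leaves the frame in place, so the other process can keep accessing page X; only the second call frees it.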
29
Page Fault (Love’s book – Chapter 14)
30
/* This routine handles page faults. */
asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long error_code)
{
    struct task_struct *tsk;
    struct mm_struct *mm;
    struct vm_area_struct *vma;
    unsigned long address;
    unsigned long page;

    /* get the address */
    __asm__("movl %%cr2,%0":"=r" (address));
    tsk = current;
    mm = tsk->mm;
    down_read(&mm->mmap_sem);
    vma = find_vma(mm, address);
    if (vma->vm_start <= address)
        goto good_area;
    …..
good_area:
    switch (error_code & 3) {
    default:    /* 3: write, present */
        /* fall through */
    case 2:     /* write, not present */
        if (!(vma->vm_flags & VM_WRITE))
            goto bad_area;
        write++;
        break;
    case 1:     /* read, present */
        goto bad_area;
    case 0:     /* read, not present */
        if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
            goto bad_area;
    }
[Figure: the mm field of task_struct (next to tty, files, fs) points to mm_struct. In mm_struct, mmap links the vm_area_structs (VMA - text, VMA - data, VMA - stack) and pgd points to the page directory and the PTEs. Each vm_area_struct holds start_address, end_address, permission, file, and operations: page fault(), add_vma, remove_vma.]
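The `switch (error_code & 3)` in do_page_fault() above checks two bits: bit 0 says whether the page was present and bit 1 says whether the access was a write. The standalone sketch below restates that permission check so it can be run outside the kernel. The `TOY_VM_*` flags and `toy_check_access` are hypothetical names for this example; the flag values are chosen only for illustration.

```c
#include <stdbool.h>

/* Illustrative permission flags for a memory area (hypothetical names). */
#define TOY_VM_READ  0x1UL
#define TOY_VM_WRITE 0x2UL
#define TOY_VM_EXEC  0x4UL

enum fault_result { FAULT_GOOD_AREA, FAULT_BAD_AREA };

/* Mirror of the four cases in the switch on (error_code & 3). */
enum fault_result toy_check_access(unsigned long error_code,
                                   unsigned long vm_flags)
{
    bool present = (error_code & 1) != 0;  /* bit 0: page was present */
    bool write   = (error_code & 2) != 0;  /* bit 1: access was a write */

    if (write)      /* cases 3 and 2: the area must be writable */
        return (vm_flags & TOY_VM_WRITE) ? FAULT_GOOD_AREA : FAULT_BAD_AREA;
    if (present)    /* case 1: a read of a present page still faulted */
        return FAULT_BAD_AREA;
    /* case 0: read of a non-present page needs read or exec permission */
    return (vm_flags & (TOY_VM_READ | TOY_VM_EXEC)) ? FAULT_GOOD_AREA
                                                    : FAULT_BAD_AREA;
}
```

Only the "good area" outcomes go on to bring the page in; a bad area corresponds to the `goto bad_area` paths in the kernel code.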
31
[Figure: overview of the kernel data structures. thread_info and the kernel stack (CPU SP) lead to task_struct (mm, tty, fs, files). mm_struct holds pgd (page directory and PTEs) and mmap (a list of VMAs, each with start, end, file, nopage()). Also shown: the filp cache, inode cache, and dentry cache managed by the slab space manager; an inode with its address_space and its clean, dirty, and locked pages; and the LRU list of pages.]
32
Thank You