An Introduction to fcache
Huang Pu Hu Ziming
Beijing University of Posts and Telecommunications
December 15, 2007
Outline: What is fcache, Principle and Flow, What we have done, Problems during porting, Benchmark
Who Are We
- Huang Pu: School of Telecommunication Engineering, BUPT. Expected graduation: April 2008.
- Hu Ziming: School of Information Engineering, BUPT. Expected graduation: April 2008.
What is fcache
- A block-remapping cache
- Sits between the file system and the block device
- Requires a separate cache partition
Principle of fcache
- Data accessed during boot is not laid out in linear order on the disk
- The hard disk therefore has to seek more often during boot
- These slow mechanical operations contribute to slow boot times
- If the data were laid out on the disk in the order it is accessed, fewer seeks would be needed and the system would boot faster
- fcache makes this possible
Flow of fcache
fcache has two working modes:
- Priming mode
- Cache mode
Flow of fcache -- priming
- In priming mode, fcache captures all read operations during boot
- It mirrors the data to the cache partition while still servicing the read requests
- The data is written to the cache partition in the same order in which it was accessed during boot
- The following slides show the details
Priming -- at the beginning
xchg(&q->make_request_fn, fcache_make_request)
With the queue's make_request_fn replaced, fcache now captures all read requests sent to the request queue.
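The hook can be pictured with a user-space sketch. It assumes a simplified queue that holds a single function pointer; `struct queue`, `install_hook`, and both handlers are hypothetical stand-ins, not kernel definitions. C11 `atomic_exchange` plays the role of the kernel's `xchg`: it installs the new handler and returns the previous one, so the replacement can still pass requests on.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the handler type stored in a request queue. */
typedef int (*make_request_fn)(int sector);

/* Hypothetical, simplified queue: only the hook pointer is modelled. */
struct queue {
    _Atomic(make_request_fn) make_request;
};

/* Original handler: pretend to send the request to the real device. */
static int disk_make_request(int sector)
{
    return sector;
}

static make_request_fn saved_fn; /* previous handler, set by install_hook() */
static int captured;             /* number of requests seen by the hook */

/* Replacement handler: count the request, then pass it on unchanged. */
static int fcache_like_make_request(int sector)
{
    captured++;
    return saved_fn(sector);
}

/* Swap in the new handler atomically and remember the previous one. */
static void install_hook(struct queue *q)
{
    saved_fn = atomic_exchange(&q->make_request, fcache_like_make_request);
}
```

After `install_hook(&q)`, every call made through `q.make_request` reaches the replacement first, which is the effect the slide describes.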
Priming -- read request
1. Clone the original bio and add the cloned data to cache_device
2. Submit the cloned bio to the real device
3. Wait until the bio has been completely cloned
4. Generate an extent and add it to the prio-tree
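The bookkeeping behind these steps can be sketched in user space. The sketch only models the extent records, not the bio handling, and a plain array stands in for the prio-tree. `struct primer`, `prime_read`, and `MAX_EXTENTS` are hypothetical names chosen for illustration; the three extent fields follow the extent slide later in this deck.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EXTENTS 16 /* hypothetical table size for this sketch */

/* One extent: where a run of sectors lives on each device. */
struct extent {
    unsigned long fs_sector;    /* offset on the real device */
    unsigned long fs_size;      /* length of the extent in sectors */
    unsigned long cache_sector; /* offset in the cache partition */
};

/* Hypothetical priming state: extents in the order they were read. */
struct primer {
    struct extent extents[MAX_EXTENTS];
    size_t nr_extents;
    unsigned long next_cache_sector; /* next free sector in the cache */
};

/*
 * Record one completed read. The data is appended to the cache, so
 * consecutive boot-time reads end up in consecutive cache sectors.
 * Returns 0 on success, -1 when the extent table is full.
 */
static int prime_read(struct primer *p, unsigned long fs_sector,
                      unsigned long nr_sectors)
{
    struct extent *e;

    if (p->nr_extents == MAX_EXTENTS)
        return -1;
    e = &p->extents[p->nr_extents++];
    e->fs_sector = fs_sector;
    e->fs_size = nr_sectors;
    e->cache_sector = p->next_cache_sector;
    p->next_cache_sector += nr_sectors;
    return 0;
}
```

Two reads that are far apart on the real device, for example sectors 9000 and 120, end up next to each other in the cache, which is what removes the seeks on the next boot.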
Priming -- read finished
- When all read requests have completed, the script remounts the target partition
- fcache then regenerates the prio-tree from the extents
- It writes the extents to the cache device
- Finally, it rewrites the header of the cache device
Flow of fcache -- cache
- In cache mode, fcache tries to fetch blocks from the cache partition
- It serves a read request from the cache partition whenever it can
Cache -- at the beginning
- Read the header from the cache device to get basic information such as the version and serial
- Read all extents and build the prio-tree; the value stored in the tree is the offset into the cache
- Use xchg again, this time to replace the handler for every normal read request
Cache -- read request
- Look up the requested range among the extents in the prio-tree
- If nothing matches, send the request to the real device
- If an extent matches, the requested data is held in the cache partition, so the read request is submitted to the cache device
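The lookup decision can be sketched as follows. A linear scan stands in for the prio-tree, and the sketch assumes a hit requires one extent to cover the whole requested range; both are simplifications for illustration, and `cache_lookup` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* One extent: a run of sectors on the real device and its copy in the cache. */
struct extent {
    unsigned long fs_sector;    /* offset on the real device */
    unsigned long fs_size;      /* length of the extent in sectors */
    unsigned long cache_sector; /* offset in the cache partition */
};

/*
 * Return the cache sector holding [sector, sector + len), or -1 when no
 * single extent covers the range and the read must go to the real device.
 */
static long cache_lookup(const struct extent *ext, size_t n,
                         unsigned long sector, unsigned long len)
{
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned long start = ext[i].fs_sector;
        unsigned long end = start + ext[i].fs_size;

        if (sector >= start && sector + len <= end)
            return (long)(ext[i].cache_sector + (sector - start));
    }
    return -1;
}
```

A read that starts 4 sectors into an extent is redirected to the extent's cache offset plus 4; a read that runs past the end of the extent is treated as a miss in this sketch.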
Data structure of header I
- A cache partition has exactly one header
- It occupies one block of the partition
- It is stored at the beginning of the partition
Data structure of header II
Fields of the header:
- magic
- version
- nr extent
- max extent
- serial
- ...
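As a rough illustration, the named fields could be declared and sanity-checked as below. The field widths, the magic value, the version number, and the `header_valid` check are all assumptions made for this sketch; the slide gives only the field names and elides the rest of the structure.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAGIC   0x66636163u /* hypothetical value, not from the patch */
#define SKETCH_VERSION 1u          /* hypothetical value, not from the patch */

/* Hypothetical header layout: only the fields named on the slide. */
struct cache_header {
    uint32_t magic;       /* identifies the partition as a cache */
    uint32_t version;     /* on-disk format version */
    uint32_t nr_extents;  /* extents currently stored */
    uint32_t max_extents; /* capacity of the extent area */
    uint32_t serial;      /* serial read back in cache mode */
};

/* Return 1 if the header looks usable, 0 otherwise. */
static int header_valid(const struct cache_header *h)
{
    return h->magic == SKETCH_MAGIC &&
           h->version == SKETCH_VERSION &&
           h->nr_extents <= h->max_extents;
}
```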
Data structure of extent I
- The extent blocks follow the header block
- The extents are stored across several blocks
- Each extent gives the offset of the cached data
- It also maps the cached data to the original data
Data structure of extent II
Fields of an extent:
- fs sector: real device offset
- fs size: extent length
- cache sector: cache device offset
- prio node
Data structure of combo I
- The data blocks follow the extent blocks
- They fill the remaining blocks of the partition
Data structure of combo II
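[Figure: layout of the cache partition, with blocks numbered 1, 2, 3, 4, 5, ... The header takes 1 block, the extents take some blocks, and the data blocks follow. Labels in the figure: "1 block", "some blocks", "some extents".]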
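The layout can be turned into block numbers with a little arithmetic. The sketch below assumes blocks are numbered from 1 as in the figure and that the extent area is sized for `max_extents` records rounded up to whole blocks; `struct layout` and `cache_layout` are hypothetical names, and the sizes in the usage note are made-up examples.

```c
#include <assert.h>
#include <stddef.h>

/* Block numbers of the three regions, counted from 1. */
struct layout {
    size_t header_block;       /* always the first block */
    size_t first_extent_block; /* extents start right after the header */
    size_t extent_blocks;      /* blocks needed for all extent records */
    size_t first_data_block;   /* cached data starts after the extents */
};

static struct layout cache_layout(size_t max_extents, size_t extent_size,
                                  size_t block_size)
{
    struct layout l;
    size_t bytes = max_extents * extent_size;

    l.header_block = 1;
    l.first_extent_block = 2;
    l.extent_blocks = (bytes + block_size - 1) / block_size; /* round up */
    l.first_data_block = l.first_extent_block + l.extent_blocks;
    return l;
}
```

For example, 1000 extent records of 24 bytes with 4096-byte blocks need 6 extent blocks, so the data would start at block 8.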
Normal Without fcache
[Figure: without fcache, the OS sends reads directly to the disk and receives the data from the disk blocks.]
fcache in Priming Mode
[Figure: fcache in priming mode. fcache sits between the OS and the disk. Reads from the OS go to the disk blocks, the data is returned to the OS, and fcache copies the data to the cache disk.]
fcache in Cache Mode
[Figure: fcache in cache mode. fcache sits between the OS and the disk. For reads from the OS, fcache re-fetches the data from the cache disk instead of the disk blocks.]
What we have done
- Made the patch work on the ext4 file system
- Ported it to the newest kernel, 2.6.24-rc5
Modified interface
INIT_WORK
- Its interface changed as of kernel 2.6.20
- It takes 2 arguments instead of 3
- The work function now uses the container_of macro to reach its data
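The pattern behind the two-argument form can be shown in user space. The macro below mirrors the idea of the kernel's `container_of`; `struct work`, `init_work`, `struct priming_state`, and `write_extents` are hypothetical stand-ins for illustration and are not the kernel's or the patch's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* User-space version of the idea behind the kernel's container_of. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical work item: the handler receives only the item itself. */
struct work {
    void (*func)(struct work *w);
};

/* Hypothetical driver state that embeds its work item. */
struct priming_state {
    int extents_written;
    struct work work;
};

/* Two-argument initialiser: no separate data pointer is passed. */
static void init_work(struct work *w, void (*func)(struct work *w))
{
    w->func = func;
}

/* The handler recovers the enclosing structure from the work pointer. */
static void write_extents(struct work *w)
{
    struct priming_state *s = container_of(w, struct priming_state, work);

    s->extents_written = 1;
}
```

Because the work item is embedded in the structure that owns it, the third (data) argument becomes unnecessary; the handler derives the owner from the pointer it is given.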
Modified Data Type
request_queue_t
- The type is no longer defined in the newer kernel
- Change it to struct request_queue
- Trivial, but should be handled carefully
Boot with ext4 filesystem
- Download kernel 2.6.24-rc5 and make sure the ext4dev options are enabled
- We add the -t ext4dev option to the mount command to indicate the partition type
- But the file system type cannot be changed with the remount option
- The root file system has to be mounted at the very beginning
- By the time the scripts in the init directory run, the root file system has already been mounted
- So -t ext4dev -o remount has no effect
- Instead, we use the kernel option rootfstype=ext4dev
Fix a Bug I
- max_extent is the number of blocks that can be cached
- The original calculation of max_extent is:

$$\mathrm{max\_extent} = \frac{(\mathrm{cache\_blocks} - 1) \times \mathrm{PAGE\_SIZE}}{\mathrm{PAGE\_SIZE} - \mathrm{sizeof(fcache\_extent)}}$$

- We think the sign in the denominator should be + instead of -
- Otherwise a division by zero occurs if the size of fcache_extent is one page
Fix a Bug II
So we think the calculation should be:

$$\mathrm{max\_extent} = \frac{(\mathrm{cache\_blocks} - 1) \times \mathrm{PAGE\_SIZE}}{\mathrm{PAGE\_SIZE} + \mathrm{sizeof(fcache\_extent)}}$$
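A quick numeric comparison of the two expressions may help. One possible reading of the fix, which is an interpretation and not something the slide states, is that each cached block costs one page of data plus one extent record, so the usable space is divided by their sum. The function names and the sample sizes below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Original expression: page size minus the extent size in the denominator. */
static size_t max_extent_original(size_t cache_blocks, size_t page_size,
                                  size_t extent_size)
{
    /* Divides by zero when extent_size == page_size. */
    return (cache_blocks - 1) * page_size / (page_size - extent_size);
}

/* Proposed expression: page size plus the extent size in the denominator. */
static size_t max_extent_fixed(size_t cache_blocks, size_t page_size,
                               size_t extent_size)
{
    return (cache_blocks - 1) * page_size / (page_size + extent_size);
}
```

With 1025 cache blocks, 4096-byte pages, and a 32-byte extent record, the original expression yields 1032, which is more than the 1024 blocks left after the header, while the proposed one yields 1016.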
Method for Benchmark
1. Set fcache to priming mode
2. Wait for the start-up procedure to finish; this takes more time than a normal boot
3. Set fcache to cache mode, then reboot
4. Now you can feel the speed-up
Information of Benchmark Data
- The times were measured during boot
- Each measurement has three parts:
  - Time from GRUB to the login screen
  - Time from login until login has finished
  - The total time
Benchmark Data for ext3
Mode      grub to login   login finish   total
Normal    48s             38s            86s
Priming   50s             52s            102s
Cache     46s             29s            75s
Benchmark Data for ext4 without extent
Mode      grub to login   login finish   total
Normal    39.4s           26.4s          65.8s
Priming   44.3s           31.0s          75.3s
Cache     33.0s           19.7s          52.7s
Benchmark Data for ext4 with extent(2.6.24-rc5)
Mode      grub to login   login finish   total
Normal    37s             34s            71s
Priming   43s             52s            97s
Cache     32s             27s            59s
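To compare the three benchmarks, the saving of cache mode over a normal boot can be expressed as a percentage of the normal total. The helper below is a small sketch with a hypothetical name; the inputs in the usage note are the totals from the benchmark tables.

```c
#include <assert.h>

/* Percentage by which cache mode shortens the total boot time. */
static double boot_time_reduction_percent(double normal_total,
                                          double cache_total)
{
    return (normal_total - cache_total) / normal_total * 100.0;
}
```

Applied to the totals, this gives roughly 12.8% for ext3 (86s to 75s), 19.9% for ext4 without extent (65.8s to 52.7s), and 16.9% for ext4 with extent (71s to 59s).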
Q and A
Thank you all