Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 1 / 32
Filesystem in Hardware for High-Speed Secondary Storage
An Implementation and Evaluation
Ashwin Mendon and Ron Sass
The University of North Carolina at Charlotte
February 7, 2009
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 2 / 32
Motivation
- Out-of-core applications
- Disk-to-core bandwidth for data-intensive compute cores
- I/O bandwidth bottleneck
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 3 / 32
Data-Intensive Computing
[Figure: number of nucleotides (log scale, 1e+08 to 1e+12) versus year, 1994 to 2004. Plotted series: #bases (genbank), mp(x), bio(x), io(x). Annotated growth rates: uP 59% CAGR; I/O performance 10% CAGR; database size 77% CAGR (10 year).]
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 4 / 32
Outline
- Background
- Design and Implementation
- Evaluation
  - functionality
  - cost (slice utilization)
  - performance (efficiency)
- Conclusions & Future Work
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 5 / 32
Filesystems
Filesystem: organizes a fixed-sized, block-addressable non-volatile storage into a variable number of variable-sized, byte-addressable files
[Figure: side-by-side layer stacks for the Traditional Organization and the Filesystem-in-Hardware organization. Each stack shows Application, Operating System, File System, Device Driver, Disk Drive Controller, and Hard Disk (Track, Sector), divided by a SOFTWARE / HARDWARE boundary.]
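The definition above, mapping byte-addressable files onto block-addressable storage, comes down to simple address arithmetic. The following is a minimal Python sketch of that mapping, not code from the presentation; the names `locate` and `blocks_needed` are hypothetical, and the 64-byte default only mirrors the BLOCKSIZE=64 value used in the slides' simulation.

```python
BLOCK_SIZE = 64  # bytes; illustrative default taken from the simulation's BLOCKSIZE=64


def locate(byte_offset, block_size=BLOCK_SIZE):
    """Map a byte offset within a file to (logical block index, offset inside that block)."""
    if byte_offset < 0:
        raise ValueError("byte offset must be non-negative")
    return divmod(byte_offset, block_size)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of fixed-size blocks required to hold file_size bytes (ceiling division)."""
    if file_size < 0:
        raise ValueError("file size must be non-negative")
    return -(-file_size // block_size)
```

A filesystem then only has to translate each logical block index into a physical block number, which is the job of the inode structures on the following slides.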
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 6 / 32
Filesystems: Traditional Requirements
- Efficiently use disk space
- Perform basic file operations: open, read, write and delete
- Efficient run-time performance
- Classify files for fast and easy retrieval
- Provide advanced features like renaming, access permission, encryption
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 7 / 32
UNIX File System
- File System Layout: Boot Block, Super Block, Inode List, Data Blocks
- UNIX I-node Structure
[Figure: Direct and Indirect Inode Blocks. The root inode holds direct pointers (direct 0 through direct 9) to data blocks, plus single indirect and double indirect pointers that lead to further inode blocks.]
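To make the direct / single indirect / double indirect scheme concrete, here is a minimal Python sketch that classifies a logical block index by the pointer level that reaches it. The ten direct pointers follow the figure (direct 0 to direct 9); the 16 pointers per indirect block and the function name `classify` are assumptions chosen for illustration, since the slide does not give block or pointer sizes.

```python
def classify(n, n_direct=10, ptrs_per_block=16):
    """Return which inode pointer level reaches logical block n.

    n_direct follows the figure (direct 0..9); ptrs_per_block is an
    illustrative assumption, not a value from the slides.
    """
    if n < 0:
        raise ValueError("block index must be non-negative")
    if n < n_direct:
        return ("direct", n)
    n -= n_direct
    if n < ptrs_per_block:
        return ("single", n)
    n -= ptrs_per_block
    if n < ptrs_per_block ** 2:
        # (slot in the double indirect block, slot in the second-level block)
        return ("double", n // ptrs_per_block, n % ptrs_per_block)
    raise ValueError("block index beyond the double indirect range")
```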
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 8 / 32
Design and Implementation
- Goals:
  - optimized for a few large files ... (I/O-bound problems first)
  - as lean as possible ... (expected large, hairy design)
  - efficiency
- Approach
  - C-based software reference design
  - VHDL-based design that can be simulated (determine efficiency)
  - synthesizable design to determine core size
  - test on an FPGA (we use a RAMDISK, no SATA core)
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 9 / 32
High-Level Block Diagram
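[Figure: high-level block diagram. Components shown: Compute core (BLAST); Hardware File System with Read FIFO, Write FIFO, and command and status signals; SATA Core containing FIFOs (data, command), Shadow Register Block, Wishbone Slave, Link Layer, PHY IF, and MGT; PPC405 core attached over DCR; DISK. Data paths are labeled 32 bits wide.]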
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 10 / 32
Software Reference Design: I-node Structure
[Figure: I-node structure of the software reference design. The root inode block holds direct pointers (direct 0 through direct 14) to data blocks plus one single indirect pointer to the next inode block. Each further inode block repeats the same layout.]
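The chained inode layout in the figure can be sketched as index arithmetic. This is a minimal Python illustration under stated assumptions: each inode block is treated as a block of 32-bit words whose last word links to the next inode block, which gives the 15 direct pointers shown for a 64-byte block. Extending the same rule to other block sizes is an assumption, and the names `direct_slots` and `locate_in_chain` are hypothetical.

```python
WORD_BYTES = 4  # 32-bit pointers, matching the 32-bit data paths in the diagrams


def direct_slots(block_size):
    """Direct pointers per inode block, assuming the last word is the link to the next inode block."""
    return block_size // WORD_BYTES - 1


def locate_in_chain(n, block_size=64):
    """Return (inode blocks to hop past, direct slot) for logical data block n."""
    if n < 0:
        raise ValueError("block index must be non-negative")
    return divmod(n, direct_slots(block_size))
```

Because every hop costs one extra block read, large sequential files pay roughly one inode read per `direct_slots(block_size)` data blocks under this model.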
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 11 / 32
Software Reference Design: SuperBlock Layout
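[Figure: SuperBlock layout across BLOCK 0 to BLOCK 3, with words indexed 0 to 15 in each block.]
BLOCK 0: words 0-3 hold FLH, FLI, FNL, FSS; words 4-15 hold file entries of the form FN, FN, INH, FS (FN1 FN1 INH1 FS1 at words 4-7, through FN3 FN3 INH3 FS3 at words 12-15)
BLOCK 1: file entries starting with FN4 FN4 INH4 FS4
BLOCK 2: file entries starting with FN8 FN8 INH8
BLOCK 3: file entries starting with FN12 FN12 INH12
Legend:
FLH = FreeList Head
FLI = FreeList Index
FNL = FileName Location
FSS = FileSystem Size
FN = File Name
INH = Inode Head (Root Inode Location)
FS = File Size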
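As an illustration of the layout above, the following Python sketch unpacks a 64-byte BLOCK 0 into its header words and file entries. It is a sketch under assumptions rather than the authors' reference code: the big-endian byte order, the treatment of the two FN words as an 8-byte name, and the function name `parse_block0` are all choices made here for illustration.

```python
import struct

# Assumed encoding: big-endian 32-bit words; the slides do not state a byte order.
HEADER = struct.Struct(">4I")   # FLH, FLI, FNL, FSS (words 0-3)
ENTRY = struct.Struct(">8sII")  # FN (two words), INH, FS


def parse_block0(block):
    """Split BLOCK 0 into its header fields and its list of file entries."""
    flh, fli, fnl, fss = HEADER.unpack_from(block, 0)
    header = {"FLH": flh, "FLI": fli, "FNL": fnl, "FSS": fss}
    files = []
    # File entries follow the header back to back, 16 bytes each.
    for off in range(HEADER.size, len(block), ENTRY.size):
        name, inh, size = ENTRY.unpack_from(block, off)
        files.append({"FN": name.rstrip(b"\0"), "INH": inh, "FS": size})
    return header, files
```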
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 12 / 32
SATA Behavioral Components
Behavioral models: mkafs.vhd and satastub.vhd
[Figure: mkafs.vhd, configured with DISKSIZE=100 and BLOCKSIZE=64, shows the SB_buffer and FL_buffer contents (32-bit words, including the FREELIST HEAD) moving to and from the FIFO. satastub.vhd shows a DISK model with cmd and 32-bit blknum inputs, read_buf and write_buf, and a Storage Array of 64 B blocks.]
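A behavioral disk stub of this kind is essentially an array of fixed-size blocks addressed by block number. The Python sketch below models that idea only; it is not a translation of satastub.vhd. The class name `RamDisk` and its methods are hypothetical, and interpreting DISKSIZE=100 as a count of blocks is an assumption.

```python
class RamDisk:
    """In-memory block device: disk_size blocks of block_size bytes each."""

    def __init__(self, disk_size=100, block_size=64):
        self.disk_size = disk_size
        self.block_size = block_size
        # Every block starts zero-filled, like a freshly initialized storage array.
        self.blocks = [bytes(block_size)] * disk_size

    def _check(self, blknum):
        if not 0 <= blknum < self.disk_size:
            raise IndexError("block number out of range")

    def read_block(self, blknum):
        """Return the contents of one block."""
        self._check(blknum)
        return self.blocks[blknum]

    def write_block(self, blknum, data):
        """Overwrite one whole block; partial-block writes are rejected."""
        self._check(blknum)
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self.blocks[blknum] = bytes(data)
```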
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 13 / 32
Simulation Environment
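[Figure: simulation environment. A TESTBENCH drives CLOCK, RESET, OPERATION, FILENAME, WRFIN, and DATA_IN into the HWFS through a WRITE FIFO and reads DATA_OUT back through a READ FIFO (each FIFO has PUSH, POP, FULL, and EMPTY signals). The HWFS exchanges CMD, STATUS, BLKNUM, DATA_IN, DATA_OUT, SB_OUT, FL_OUT, WRITE_DISK, COUNT_RESET, and SATA_MUX_SEL with a SATA MUX and the SATA STUB, which connects to the DISK via PORT_IN and PORT_OUT. Buses are annotated with widths of 2, 32, and 64 bits.]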
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 14 / 32
Synthesizable HWFS Core
[Figure: synthesizable HWFS core datapath. A READ/WRITE/REMOVE FSM controls a SUPERBLOCK / INODE BUFFER (SB_BUF) and a FREELIST BUFFER (FL_BUF), a filename comparator (CMP) that raises FOUND, and the SB, FL, FN, and INODE multiplexers. External signals include CLOCK, RESET, OPERATION, FILENAME, OFFSET, WRFIN, CMD, STATUS, BLKNUM, WRITE_DISK, COUNT_RESET, SATA_MUX_SEL, DATA_IN, SB_OUT, FL_OUT, and the FIFO PUSH, POP, FULL, and EMPTY lines.]
- Control FSM
- Datapath
  - two BRAMs
  - Comparator
  - three 32-bit 4:1 MUXes
  - two 32-bit 2:1 MUXes
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 15 / 32
Example Operations
- OPEN
- READ
- WRITE
- LSEEK
- DELETE
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 16 / 32
OPEN File
[Figure: OPEN state diagram. States: READ SB, MATCH FILENAME, FIND INODE, READ FILE, WRITE FILE, DELETE FILE. Transition labels: reset=1; status=0; status=1 and operation=00; found=0; found=1; operation=00, 01, 10, and 11.]
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 17 / 32
READ File
[Figure: READ FSM state diagram. States: IDLE, READ INODE, SET ADDRESS, SET BLOCKID, READ DATA. Transition conditions combine op=01 with status (0 or 1), eof=1, data_read (0 or 1), db_count (<15 or =15), and sb_datain (=0 or !=0); some transitions are unconditional ("always").]
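The loop that the READ FSM walks through can be modeled in a few lines of software. The Python sketch below is not the authors' VHDL: it assumes an inode block is a list of 16 words (15 direct pointers followed by one link to the next inode block) and that a zero word means "no more blocks", which is only suggested by the db_count and sb_datain=0 conditions in the diagram. The `disk` dictionary and the name `read_file` are hypothetical.

```python
def read_file(disk, root_inode, direct=15):
    """Collect a file's data blocks by walking its chain of inode blocks.

    disk maps block number -> block contents. An inode block is a list of
    direct + 1 words; a data block can hold any payload.
    """
    out = []
    inode = root_inode
    while inode != 0:
        words = disk[inode]               # READ INODE
        for db_count in range(direct):    # db_count < 15: keep reading data
            blk = words[db_count]         # SET BLOCKID
            if blk == 0:                  # empty pointer: end of file
                return out
            out.append(disk[blk])         # READ DATA
        inode = words[direct]             # db_count = 15: follow the link
    return out
```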
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 18 / 32
LSEEK (implemented as random read)
[Figure: READ FSM state diagram extended for seeking. States: IDLE, READ INODE, GET INODE, SET ADDRESS, SET BLOCKID, READ DATA. Transition conditions combine op=01 with status (0 or 1), eof=1, data_read (0 or 1), db_count (<15 or =15), sb_datain (=0 or !=0), and inode_skip (=0 or >0); some transitions are unconditional ("always").]
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 19 / 32
WRITE File
[Figure: WRITE FSM state diagram. State labels: IDLE, READ SB, READ FL, READ INODE, GET INODE, LINK NODE, SET ADDRESS, SET BLOCKID, WRITE DB, WRITE IB, WRITE FL, WRITE MD, WRITE SB. Transition conditions combine op=10 with status (0 or 1), found=0, file_not_found=1, filecount=NFILES, fl_buf_empty, fl_buf_full, fl_buf_addr (=1110 or !=1110), sb_buf_addr (=001111 or !=001111), wrfin, rfdin=0, eof, eof_written, and write_md_fin; some transitions are unconditional ("always").]
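The write path has to pull blocks off the free list and link a new inode block whenever the current one fills up. The Python sketch below illustrates that allocation pattern only; it is not derived from the VHDL. It assumes 15 direct pointers plus one link word per inode block, uses a plain Python list as a stack-like free list (the slides do not specify the allocation policy), and reserves block number 0 as the "empty" marker. The name `write_file` is hypothetical.

```python
def write_file(disk, free_list, payloads, direct=15):
    """Store payloads in newly allocated blocks and return the root inode block number."""

    def alloc():
        if not free_list:
            raise RuntimeError("disk full")
        return free_list.pop()

    root = inode = alloc()
    disk[inode] = [0] * (direct + 1)       # empty inode block: 15 pointers + link
    count = 0
    for payload in payloads:
        if count == direct:                # inode block full: link a new one
            nxt = alloc()
            disk[nxt] = [0] * (direct + 1)
            disk[inode][direct] = nxt
            inode, count = nxt, 0
        blk = alloc()
        disk[blk] = payload                # write the data block
        disk[inode][count] = blk           # record it in the inode block
        count += 1
    return root
```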
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 20 / 32
DELETE File
[Figure: DELETE FSM state diagram. State labels recoverable from the diagram: IDLE, READ SB, READ FL, READ INODE, GET INODE, RETURN INODE, SET ADDRESS, SET BLOCKID, WRITE FL, WRITE SB, WRITE MD. Transition conditions combine op=11 with status (0 or 1), eof (0 or 1), fl_index (=0 or !=0), sb_buf_empty (0 or 1), rtrn_fl_len (=0 or !=0), and write_md_fin (0 or 1); some transitions are unconditional ("always").]
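Deleting a file amounts to walking its inode chain and handing every block back to the free list. The Python sketch below shows that idea under the same illustrative model used for the other operations (15 direct pointers plus one link word per inode block, zero meaning "empty", a Python list as the free list); none of these details are confirmed by the slides, and the name `delete_file` is hypothetical.

```python
def delete_file(disk, free_list, root_inode, direct=15):
    """Return all of a file's data and inode blocks to the free list.

    Returns the number of blocks released.
    """
    freed = 0
    inode = root_inode
    while inode != 0:
        words = disk.pop(inode)            # read the inode block, then drop it
        for blk in words[:direct]:
            if blk == 0:                   # empty pointer: no more data blocks here
                break
            disk.pop(blk, None)
            free_list.append(blk)          # return the data block
            freed += 1
        free_list.append(inode)            # return the inode block itself
        freed += 1
        inode = words[direct]              # follow the link to the next inode block
    return freed
```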
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 21 / 32
Evaluation: Results and Analysis
- Functionality
- Area
- Performance: filesystem efficiency
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 22 / 32
Functionality
- Yes!
- dedicated an external 256 MB SDRAM to HWFS
- replaced the satastub.vhd behavioral model with a synthesizable RAMDISK interface
  - (measured timing validates behavioral model)
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 23 / 32
Area
Resource Utilization for HWFS core
- SuperBlock and Freelist buffers are mapped onto BRAMs
- Number of slices consumed is not affected by increasing block sizes
- Uses 3% of slices for a Xilinx Virtex-4 FX60 device

Block Size   Slices   LUTs   F/Fs   BRAMs
64 B         759      1471   343    2
128 B        724      1369   345    2
256 B        749      1446   349    2
512 B        783      1502   353    2
1024 B       762      1463   356    3
4096 B       779      1476   364    10
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 24 / 32
Subcomponent Resource Utilization
Component    Slices   LUTs   F/Fs   BRAMs
FSM          617      1169   358    0
sb buffer    0        0      0      2
fl buffer    0        0      0      1
comparator   17       33     0      0
sb mux       32       64     0      0
fl mux       32       64     0      0
sata mux     32       64     0      0
inode mux    18       32     0      0
fn mux       18       32     0      0
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 25 / 32
Performance
- design runs at a 100 MHz clock
- used ModelSim to measure sequential read/write execution times
- measured efficiency: how much latency is introduced by the filesystem compared to an ideal disk?

$$\mathrm{eff} = \frac{\text{data latency}}{\text{(data + overhead) latency}}$$
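The efficiency ratio above is straightforward to compute once an "ideal disk" is defined. The Python sketch below makes that definition explicit as an assumption: it takes the ideal data latency to be one 32-bit word per cycle at the stated 100 MHz clock, which the slides do not spell out, and it uses decimal file sizes (1 GB = 1e9 bytes). The function names are hypothetical; the 2.6 s and 2.98 s values in the usage note are the 1 GB ReadFile times reported in the deck's performance table for 512 B and 64 B blocks.

```python
CLOCK_HZ = 100e6       # design clock stated in the slides
BYTES_PER_CYCLE = 4    # assumption: an ideal disk delivers one 32-bit word per cycle


def ideal_latency(file_bytes):
    """Seconds needed to move file_bytes with no filesystem overhead (assumed ideal disk)."""
    return file_bytes / (BYTES_PER_CYCLE * CLOCK_HZ)


def efficiency(file_bytes, measured_seconds):
    """eff = data latency / (data + overhead) latency."""
    return ideal_latency(file_bytes) / measured_seconds
```

Under these assumptions, `efficiency(1e9, 2.6)` is about 0.96 and `efficiency(1e9, 2.98)` is about 0.84, so the ratio approaches 1 as per-block overhead shrinks relative to the data moved.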
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 26 / 32
Efficiency
File Read/Write Efficiency plotted against different file sizes (10 KB-5 GB)
[Figure: efficiency (y axis, 0.3 to 1) versus file size (x axis, log scale with ticks at 1KB, 10KB, 100KB). Series: Ideal; Read Latency for BlockSize 64B, 256B, and 512B; Write Latency for BlockSize 64B, 256B, and 512B.]
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 27 / 32
Conclusions and Future Work
- definitely feasible
- minimum set of operations has been implemented
- just 3% slice utilization for a Virtex-4 platform FPGA
- efficient run-time performance for large files
- changing block size only affects the size of buffers (burns BRAMs but not logic)
- FSM dominates the core
- adding functionality will increase the size of core
- at 3% utilization, plenty of room to add functionality
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 28 / 32
Future Work
- interface with SATA core to measure file I/O performance
- evaluate with High-Performance Computing I/O benchmarks
- create Linux block device driver
- compare HWFS with other software-based filesystems
- replicate on each node of the cluster, integrate with network
- add Parallel Hardware Filesystem layer
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 29 / 32
Thank-You!
(extra slides follow, if needed)
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 30 / 32
Architecture Comparison
[Figure: two FPGA architectures side by side. Components shown across the two designs: PPC, Bridge, MEM Controller, DDR SDRAM, HWFS, COMPUTE CORE, and, in each design, SATA IP, MGT, and DISK.]
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 31 / 32
Performance
ReadFile execution time by block size

File Size   64 B        256 B       512 B
1 KB        4.24 us     5.32 us     7.81 us
10 KB       32.56 us    32.45 us    32.39 us
100 KB      299.2 us    268.6 us    263 us
1 MB        3.03 ms     2.71 ms     2.67 ms
10 MB       30.12 ms    26.8 ms     26.6 ms
100 MB      300.4 ms    266.4 ms    264.3 ms
1 GB        2.98 s      2.63 s      2.6 s
5 GB        14.6 s      13 s        12.9 s

[Figure: ReadFile execution time (in cycles) plotted against different file sizes (1 KB-5 GB). X axis: File Size (Bytes), 0 to 6e+09. Y axis: Cycles, 0 to 1.6e+09. Series: BlockSize (64 B), BlockSize (256 B), BlockSize (512 B).]
Mendon: Filesystem in Hardware for High-Speed Secondary Storage Slide: 32 / 32
Performance
WriteFile execution time by block size

File Size   64 B        256 B       512 B
1 KB        5.47 us     9.24 us     15.52 us
10 KB       33.02 us    34.26 us    36.6 us
100 KB      293.4 us    270.3 us    269.3 us
1 MB        2.96 ms     2.7 ms      2.65 ms
10 MB       29.6 ms     26.5 ms     26.5 ms
100 MB      295 ms      263 ms      262.5 ms
1 GB        2.96 s      2.65 s      2.63 s
5 GB        14.6 s      13.2 s      13 s

[Figure: WriteFile execution time (in cycles) plotted against different file sizes (1 KB-5 GB). X axis: File Size (Bytes), 0 to 6e+09. Y axis: Cycles, 0 to 1.6e+09. Series: BlockSize (64 B), BlockSize (256 B), BlockSize (512 B).]