
The Conquest File System: Life after Disks

An-I A. Wang • Geoffrey H. Kuenning • Peter Reiher • Gerald J. Popek

Abstract

The rapidly declining cost of persistent RAM technologies prompts the question of when, not whether, such memory will become the preferred storage medium for many computers. Conquest is a file system that provides a transition from disk to persistent RAM as the primary storage medium. Conquest provides two specialized and simplified data paths, one to in-core storage and one to on-disk storage, and it realizes most of the benefits of persistent RAM at a fraction of the cost of a RAM-only solution. As of October 2001, Conquest can be used effectively for a hardware cost of under $200.

We compare Conquest’s performance to ext2, reiserfs, SGI XFS, and ramfs, using popular benchmarks. Our measurements show that Conquest incurs little overhead compared to ramfs. Compared to the disk-based file systems, Conquest achieves 24% to 1900% faster memory performance, and 43% to 96% faster performance when exercising both memory and disk.

Motivation

Problems:
Modern file systems are designed for disks.
Disks are becoming an increasingly severe system bottleneck.
Growing complexity is needed to mask poor disk performance.

Observation: The cost of persistent RAM (e.g., battery-backed DRAM) is rapidly declining.

Question: How do we design a file system that exploits the abundance of RAM?

Conquest Architecture

Conquest uses memory to store all metadata (file attributes), small files, executables, and shared libraries, leaving only the content of large files on disk.

Accesses to in-core data and metadata incur no data duplication or disk-related overhead, and executables run in place. Because most accesses to large files are sequential, we can relax many historical disk design constraints.

Conquest Benefits

Persistence: Conquest memory storage survives reboots.

Capacity: Conquest is not limited by the physical size of the memory.

Simplicity: Conquest consists of two simplified data paths, with at least 20% fewer semicolons than ext2, reiserfs, and SGI XFS.

Performance: Conquest is at least 24% faster than ext2, reiserfs, and SGI XFS when operating under the LRU disk cache.
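The storage split described under Conquest Architecture can be illustrated with a small sketch. This is not the Conquest implementation: the names `Medium`, `FileInfo`, `metadata_medium`, and `data_medium` are hypothetical, and the 1 MiB cutoff for a "large" file is an assumed value, since the poster does not state a threshold.

```python
from dataclasses import dataclass
from enum import Enum


class Medium(Enum):
    RAM = "persistent RAM"
    DISK = "disk"


# Assumed cutoff for a "large" file; the poster does not give a value.
LARGE_FILE_THRESHOLD = 1 << 20  # 1 MiB, for illustration only


@dataclass
class FileInfo:
    size: int                       # file size in bytes
    is_executable: bool = False
    is_shared_library: bool = False


def metadata_medium() -> Medium:
    """All metadata (file attributes) is kept in persistent RAM."""
    return Medium.RAM


def data_medium(f: FileInfo, threshold: int = LARGE_FILE_THRESHOLD) -> Medium:
    """Pick where a file's content lives under the split described above."""
    # Executables and shared libraries stay in memory regardless of size.
    if f.is_executable or f.is_shared_library:
        return Medium.RAM
    # Small files stay in memory; only large-file content goes to disk.
    if f.size < threshold:
        return Medium.RAM
    return Medium.DISK
```

For example, under these assumptions `data_medium(FileInfo(size=4096))` selects persistent RAM, while a 50 MiB data file that is neither an executable nor a shared library is placed on disk. Its metadata still resides in memory in both cases.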

[Figure: Cost of storage media in $/MB (log scale, 10^-2 to 10^2) by year, 1995 to 2005, for paper/film, 3.5” HDD, 2.5” HDD, 1” HDD, and persistent RAM. Annotations: booming of digital photography; 4 to 10 GB of persistent RAM on high-end machines.]

[Figure: Conquest File System. Storage requests cross the file system boundary into two paths. One path uses simplified persistence support over battery-backed RAM for small file and metadata storage. The other uses simplified IO buffer management, an IO buffer, simplified disk management, and a disk (ATA/SCSI/IDE) for the large-file-only file system. The diagram carries the labels 10% and 90%.]

[Figure: Conventional file systems. Storage requests cross the file system boundary and pass through IO buffer management, the IO buffer, persistence support, and disk management to the disk (ATA/SCSI/IDE).]

[Figure: Conventional Data Path. Components: storage requests, IO buffer management, IO buffer, persistence support, disk management, disk. Functions listed: buffer allocation management, buffer garbage collection, data caching, metadata caching, predictive readahead, write behind, cache replacement, metadata allocation, metadata placement, metadata translation, disk layout, fragmentation management.]

[Figure: Conquest Memory Data Path. Components: storage requests, simplified persistence support, battery-backed RAM, small file and metadata storage. Functions listed: simplified metadata allocation, memory manager encapsulation.]

[Figure: Conquest Disk Data Path. Components: storage requests, simplified IO buffer management, IO buffer, disk, large-file-only file system, with small file and metadata storage in battery-backed RAM. Functions listed: buffer allocation management, buffer garbage collection, data caching, simplified predictive readahead, simplified write behind, simplified cache replacement, simplified disk layout, simplified disk management.]

[Figure: PostMark Benchmark (10,000 files, 350 MB to 3.5 GB working set size). X axis: percentage of large files (0.0 to 10.0). Y axis: trans / sec (0 to 5000). Series: SGI XFS, reiserfs, ext2, Conquest. The 2 GB physical RAM mark divides the plot into working sets <= RAM and > RAM. Inset: percentage of large files from 6.0 to 10.0, trans / sec from 0 to 120.]

[Figure: PostMark Benchmark (250 MB of small files). X axis: files (5000 to 30000). Y axis: trans / sec (0 to 10000). Series: SGI XFS, reiserfs, ext2, ramfs, Conquest.]