8
ScoutFS: POSIX Archiving at Extreme Scale Zach Brown, Versity MSST 2019

ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

ScoutFS: POSIX Archiving at Extreme ScaleZach Brown, Versity

MSST 2019

Page 2: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

POSIX Archiving with ScoutFS

POSIX: NFS / SAMBA / RSYNC / CUSTOM

SAN Fabric

Archival Storage

ScoutFS

ScoutAM Archive Manager

Page 3: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

Archival Filesystem Differences

• Built for archive transfer rate, not for total data capacity • Almost all metadata at rest, file data stored on archive tiers • File count no longer constrained by data capacity • Support both user-facing file transfer and internal tier management load • Files have metadata which describes locations in archives • Archive tier management software needs to search through files • Strong desire for open source implementation

Page 4: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

Archival Filesystem Challenges

• Exhaustive file searches are in the critical path - Are there new or modified files that need to be archived? - Which files were on that archive media that just caught fire? - Which large archived files were least recently used and can be released? - (.. and users would love efficient searching of their files!)

• Must saturate streaming archive tier throughput - Efficient large file IO - Small files need high metadata rates to produce sufficient archive data

Page 5: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

ScoutFS Design Highlights

• Start with an in-kernel coherent key/value item store: - “Logical” locking protects item consistency and governs caching - Log-based “physical” storage allows concurrent item reads and writes - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments

• Build a robust POSIX filesystem out of these items: - Full POSIX semantics, data extents, atomic metadata transactions

• Maintain persistent file index items along with FS metadata items: - Sort files by metadata: size, mtime, xattrs, etc - Index items modified in the same transaction as primary metadata items - Concurrent write lock mode avoids global serialization

Page 6: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

Concurrent Metadata Reads and Writes

A B C D

1 2 3 4 5 6 7 8

Page 7: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

4 Nodes All Search While Creating

Page 8: ScoutFS: POSIX Archiving at Extreme Scale · - Item writes grouped into atomic log fragment writes - Fundamental unit of metadata IO is large log fragments • Build a robust POSIX

Thank [email protected]

@versitysoftware