NeST: Network Storage
Flexible Commodity Storage Appliances
John Bent, Miron Livny, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau
Terms
Appliance (Merriam-Webster) b: an instrument or device designed for a particular use; specifically a household or office device
Storage appliance: storage plus access methods
What storage users want
Reliability and availability
Manageability
  cost of management > cost of storage itself
  “no futz” computing
Scalability
Performance
What storage vendors have
NetApp, EMC, others make storage appliances (network-attached storage)
Manageable
  Just plug it in and it works
  Administrative web interface
Reliable and available
  Standard RAID techniques
High performance
  Specialized, thin OS focused on serving files
What storage vendors get, annual revenues
NetApp: $800 million in 2000
EMC: $9 billion in 2000
What’s the problem?
False coupling between HW and SW
“Playground syndrome”
Myth of specialization
H/W and S/W are bundled
Hardware decisions are imposed
Hard to ride commodity curve
Example:
NetApp F720
• $35,000.00, 252 GB
• $138 / GB
Maxtor DiamondMax
• $279.00, 80 GB
• $3.50 / GB
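The per-gigabyte figures are list price divided by capacity. A small sketch of that arithmetic, using only the prices and capacities quoted on the slide (the function name is illustrative), also gives the size of the gap: the exact quotients are about $138.89 and $3.49 per GB, which the slide shows as $138 and $3.50.

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the price of one gigabyte in US dollars."""
    return price_usd / capacity_gb


# Figures quoted on the slide.
netapp = cost_per_gb(35_000.00, 252)   # NetApp F720
maxtor = cost_per_gb(279.00, 80)       # Maxtor DiamondMax
ratio = netapp / maxtor                # how many times dearer the appliance is

print(f"NetApp F720:       ${netapp:.2f} / GB")
print(f"Maxtor DiamondMax: ${maxtor:.2f} / GB")
print(f"Appliance premium: about {ratio:.0f}x")
```

On these numbers the bundled appliance costs roughly forty times as much per gigabyte as the commodity disk.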
“Playground syndrome”
“We have storage appliances . . . if you use these protocols, if you use these security mechanisms, if you are comfortable with our data semantics”
Non-flexible software entity
Myth of specialization
Specialize for one protocol on one machine
Specialization decreases over time as
  Protocols are added
  Product line expands
Example: NetApp software
  Generation 1 fit on a single floppy
  Generation 2 took six
  Generation 3?
Alternatives?
Appliance (Merriam-Webster) a: a piece of equipment for adapting a tool or machine to a special purpose
Our game?
Flexible, commodity based, software-only storage appliances
Goal
Find a networked machine
“Drop” some software on it
Have a ready-to-use storage appliance with flexible mechanisms
New worlds, new problems
Diverse hardware, software platforms
  NetApp, EMC advantage: fewer platforms, control over OS
Our approach
  Automate configuration to each host system
  • Hardware example - use file system or self-manage
  • Software example - use either read/write or mmap
Cost of flexibility
  Key is design of the software
Outline
Introduction
Building flexible storage modules
  Big picture
  Protocol layer
  Concurrency architecture
  Storage layer
Motivations for flexible storage appliances
Conclusion and current status
NeST structure
Cleanly separated modules for communication, transfer and storage
Protocol layer
  Maps diverse protocols into common control flows
Concurrency architectures
  Different models to maximize system throughput
Storage layer
  Provides abstract interface to disks
NeST structure
[Figure: NeST structure. Protocol layer (GFTP, NeST, WiND, HTTP, NFS); central control; concurrency architecture (event driven, multi-process, multi-threaded) with a transfer request; storage layer (raw disk, local FS, RAID).]
Protocol layer
A collection of servers is less than the sum of their parts.
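[Figure: separate HTTPd and NFSd servers contrasted with a single NeST speaking both HTTP and NFS. Labels: HTTPd, NFSd, Operating system, Operating system, NeST, HTTP, NFS.]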
Consolidate protocols
Single point of control
Storage quotas and guarantees can be supported across multiple protocols.
Bandwidth can be controlled and quality of service can be guaranteed.
Single administrative interface
Set policies
Manage user accounts
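A minimal sketch of why a single point of control matters for quotas: when every protocol handler charges the same accounting object, a user's limit holds no matter which protocol the bytes arrive on. The class and method names here are hypothetical and are not taken from NeST.

```python
class QuotaExceeded(Exception):
    """Raised when a write would push a user past the configured limit."""


class QuotaManager:
    """One shared ledger consulted by every protocol handler (hypothetical API)."""

    def __init__(self, limits: dict):
        self._limits = dict(limits)                  # user -> bytes allowed
        self._used = {user: 0 for user in limits}    # user -> bytes stored

    def charge(self, user: str, nbytes: int, protocol: str) -> None:
        """Record a write of nbytes made over the given protocol."""
        if self._used[user] + nbytes > self._limits[user]:
            raise QuotaExceeded(f"{user} over quota via {protocol}")
        self._used[user] += nbytes

    def remaining(self, user: str) -> int:
        return self._limits[user] - self._used[user]


quota = QuotaManager({"alice": 100})
quota.charge("alice", 60, protocol="http")
quota.charge("alice", 30, protocol="nfs")   # a different protocol, same ledger
print(quota.remaining("alice"))
```

With separate HTTP and NFS daemons, each would keep its own count and neither would notice that the combined usage had reached the limit.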
Protocol layer implementation
Each protocol listens on a well-defined port
Central control accepts connections
Protocol layer reads from connection and returns generic request object
  Like v-nodes in a virtual file system (VFS) layer
Add new protocol by writing a couple of methods
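One way to picture the "couple of methods" is an abstract protocol class with a parser and a reply formatter. The sketch below is an assumption about the shape of such an interface, not NeST's actual code: `Request`, `Protocol`, `ToyFtp`, `ToyHttp` and `central_control` are all illustrative names, and the two toy protocols handle only enough syntax (the FTP verbs LIST, RETR and STOR, and an HTTP request line) to show the idea.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    LIST_DIR = auto()
    GET = auto()
    PUT = auto()


@dataclass
class Request:
    """Generic, protocol-independent request (hypothetical shape)."""
    op: Op
    path: str


class Protocol(ABC):
    """A new protocol is added by implementing these two methods."""

    @abstractmethod
    def parse(self, line: str) -> Request:
        """Turn one line read from the connection into a Request."""

    @abstractmethod
    def format_listing(self, names: list) -> str:
        """Render a directory listing in this protocol's reply format."""


class ToyFtp(Protocol):
    _OPS = {"LIST": Op.LIST_DIR, "RETR": Op.GET, "STOR": Op.PUT}

    def parse(self, line: str) -> Request:
        verb, _, arg = line.strip().partition(" ")
        return Request(self._OPS[verb.upper()], arg or "/")

    def format_listing(self, names: list) -> str:
        return "".join(name + "\r\n" for name in names)


class ToyHttp(Protocol):
    def parse(self, line: str) -> Request:
        method, path = line.split()[:2]
        if method == "PUT":
            return Request(Op.PUT, path)
        if path.endswith("/"):   # toy rule: a trailing slash means "list"
            return Request(Op.LIST_DIR, path.rstrip("/") or "/")
        return Request(Op.GET, path)   # every other method is treated as a read

    def format_listing(self, names: list) -> str:
        return "\n".join(names)


def central_control(protocol: Protocol, line: str, directories: dict) -> str:
    """Same control flow for every protocol; only parsing and formatting differ."""
    request = protocol.parse(line)
    if request.op is Op.LIST_DIR:
        return protocol.format_listing(directories.get(request.path, []))
    raise NotImplementedError(f"{request.op.name} is outside this sketch")


dirs = {"/data": ["a.txt", "b.txt"]}
print(repr(central_control(ToyFtp(), "LIST /data", dirs)))
print(repr(central_control(ToyHttp(), "GET /data/ HTTP/1.0", dirs)))
```

Both calls reach the same `LIST_DIR` branch in central control; the protocol objects differ only in how they read the request and write the reply.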
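Protocol layer example, directory list request
[Figure: FTP and NeSTspeak clients send directory list requests (shown as “31: LIST” and “5”) to the protocol layer. The protocol layer hands a generic directory list request to central control, which passes it to the storage layer. The storage layer returns a linked list, and the protocol layer formats it for each client (“ftp, ftp, ftp” and “nest, nest”).]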
Concurrency architecture
Three difficult goals
  Low latency
  High bandwidth
  Multiple simultaneous clients
No single portable solution
  Provide multiple models to cover a range of different platforms
    Multi-threaded
    Multi-process
    Event driven
Concurrency architecture
[Figure: concurrency architecture with event driven, multi-process and multi-threaded models]
Central control creates transfer object
  Socket descriptor from the protocol layer
  File descriptor from the storage layer
Transfer object passed to concurrency architecture
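A rough sketch of the transfer object idea, under stated assumptions: `Transfer`, `run_threaded` and `run_round_robin` are hypothetical names, the multi-process model is omitted, and the round-robin loop is only a single-threaded stand-in for a real event-driven engine (which would use non-blocking sockets and a readiness API such as `select`). The point is that the same object, holding one socket and one file descriptor, can be handed to interchangeable concurrency models.

```python
import os
import socket
import threading
from dataclasses import dataclass

CHUNK = 64 * 1024


@dataclass
class Transfer:
    sock: socket.socket   # supplied by the protocol layer
    file_fd: int          # supplied by the storage layer

    def step(self) -> bool:
        """Send one chunk; return False once the file is exhausted."""
        data = os.read(self.file_fd, CHUNK)
        if not data:
            return False
        self.sock.sendall(data)
        return True

    def run(self) -> None:
        while self.step():
            pass


def run_threaded(transfers) -> None:
    """Multi-threaded model: one thread per transfer."""
    threads = [threading.Thread(target=t.run) for t in transfers]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


def run_round_robin(transfers) -> None:
    """Single-threaded stand-in for an event loop: one chunk per transfer per pass."""
    pending = list(transfers)
    while pending:
        pending = [t for t in pending if t.step()]
```

Because both runners accept the same list of `Transfer` objects, the choice of model can be made per host without touching the protocol or storage layers.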
Concurrency on Linux
Storage layer
Three needed areas of flexibility
File system interfaces
  Example: read()/write() or mmap()
Abstract storage models
  RAID, JBOD, etc.
User account administration
  Creation and removal
  Quotas and guarantees for users and groups
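The read()/write() versus mmap() choice can be hidden behind one interface and settled per host. The sketch below is illustrative only: the function names and the timing-based selector are assumptions, and a single timed read is far too crude a benchmark for real use.

```python
import mmap
import os
import time


def read_with_syscalls(path: str, chunk: int = 64 * 1024) -> bytes:
    """Read a whole file with explicit read() calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        parts = []
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            parts.append(data)
        return b"".join(parts)
    finally:
        os.close(fd)


def read_with_mmap(path: str) -> bytes:
    """Read a whole file by mapping it into memory."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return b""   # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]


READERS = {"read": read_with_syscalls, "mmap": read_with_mmap}


def fastest_reader(sample_path: str) -> str:
    """Time each interface once on a sample file and name the quicker one."""
    timings = {}
    for name, reader in READERS.items():
        start = time.perf_counter()
        reader(sample_path)
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get)
```

A storage layer could call something like `fastest_reader` once at installation time and then serve every request through `READERS[choice]`.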
File system interfaces on Linux
Outline
Introduction
Building flexible storage modules
Motivations for flexible storage appliances
Conclusion and current status
Clients have different needs
Communication protocols
Replacement costs
Data semantics
Security and authentication
Communication protocols
The Esperanto problem
  Too many protocols to implement them all
  Too many clients use proprietary protocols
Storage must allow pluggable protocols.
Replacement costs
Infinite cost to replace first class data.
Variable cost to replace cached data depending on size and distance.
Variable cost to replace job output files depending on computation cost.
[Figure: replacement-cost spectrum from cheap cached files to first class data]
Cost aware storage can effectively increase its own capacity.
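One way cost-aware storage could reclaim space is to evict the files that are cheapest to replace and never touch data with infinite replacement cost. The sketch below is a simplified illustration, not the authors' policy: the names and the cost units are hypothetical, and it ranks by absolute cost rather than cost per byte.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    size: int                 # bytes
    replacement_cost: float   # math.inf marks first class data


def choose_victims(files, bytes_needed: int):
    """Pick the cheapest-to-replace files until enough space is reclaimed.

    Returns an empty list when the request cannot be met without
    evicting irreplaceable data.
    """
    victims, freed = [], 0
    for candidate in sorted(files, key=lambda f: f.replacement_cost):
        if freed >= bytes_needed:
            break
        if math.isinf(candidate.replacement_cost):
            break   # first class data is never evicted
        victims.append(candidate)
        freed += candidate.size
    return victims if freed >= bytes_needed else []


files = [
    StoredFile("thesis.tex", 100, math.inf),   # first class data
    StoredFile("cache.dat", 50, 1.0),          # cheap to fetch again
    StoredFile("job.out", 80, 5.0),            # costs computation to regenerate
]
print([f.name for f in choose_victims(files, 60)])
```

Space held by cheap cached files is effectively spare capacity, which is the sense in which the storage increases its own capacity.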
Data semantics
Must stored objects be protected from read and write dependencies?
Is transaction support necessary?
Acceptable replies to storage requests.
Data semantics, example
Problem
  PFS on top of FTP fakes open(); a later read() may then return “file not found”
Solution
  Mechanisms are needed to support flexible semantics independent of the transfer protocol.
  Divorce semantics from the protocol.
Security and authentication
Ownership
Privacy
Encryption
Authentication
Access rights
Who, when, how and how much?
[Figure: access spectrum from abstinent to promiscuous]
Who is allowed to use the storage?
  Promiscuity and monogamy are easy
  Polygamy is also easy
Do I know you?
Problem
  Migrant grid users may need temporary, preferential storage access
Solution
  Provide mechanisms to
    advertise available storage
    create self-destructing user accounts
Matchmake applications with storage opportunities.
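A self-destructing account can be modelled as an account with an expiry time that is reaped on the next lookup. This is only a sketch under that assumption; `AccountTable` and its methods are hypothetical names, and a real system would also have to decide what happens to the expired user's files.

```python
import time


class AccountTable:
    """Temporary accounts that disappear once their lifetime has passed."""

    def __init__(self):
        self._expiry = {}   # user -> absolute expiry time in seconds

    def create(self, user: str, lifetime_s: float, now: float = None) -> None:
        now = time.time() if now is None else now
        self._expiry[user] = now + lifetime_s

    def reap(self, now: float) -> list:
        """Delete and return every account whose lifetime has ended."""
        expired = [u for u, t in self._expiry.items() if t <= now]
        for user in expired:
            del self._expiry[user]
        return expired

    def is_valid(self, user: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        self.reap(now)
        return user in self._expiry


accounts = AccountTable()
accounts.create("migrant-job", lifetime_s=60, now=1000)
print(accounts.is_valid("migrant-job", now=1030))
print(accounts.is_valid("migrant-job", now=1060))
```

The `now` parameter exists only so the behaviour can be checked without waiting; callers would normally omit it.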
Outline
Introduction
Building flexible storage solutions
Motivations for flexible storage appliances
Conclusion
  Current status
  Future work
  Concluding remarks
Current status
Concurrency architectures are done
  Gets, puts, reads and writes perform well
Virtual protocol class interface is built
  NeSTspeak is fully implemented
  GridFTP coming soon!!
Simple first implementation of storage reservations and remote quota management is done
  Venkateshwaran Venkataramani
Future work
Discovery process of client storage requirements
Quality of service guarantees for bandwidth and storage
Support for transient and opportunistic users
Concluding remarks
Return storage to the commodity curve by creating software-only storage appliances
Allow greater storage flexibility for a wide range of application needs