Computer Science
Storage Systems and Sensor Storage
Research Overview
Storage Research Overview
• Hyperion: high-volume stream archival system
• Bandwidth-efficient data migration in enterprise storage systems
• Use of flash storage in data centers
Hyperion Stream Store
• Streaming data is common in environments such as network monitoring, system monitoring, sensors, and RFID
– Archive data for retrospective querying and forensics
• Hyperion: high-volume stream archival for distributed network monitoring
– Gigabit link: 250K packets per second
– Archive and index in real time, while supporting interactive querying
– Neither commodity RDBMSs nor general-purpose file systems are suitable
[Usenix 2007]
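The 250K packets per second figure can be put in context with some back-of-the-envelope arithmetic. The sketch below assumes a fully saturated gigabit link, ignores framing overhead, and assumes whole packets are archived; none of these assumptions come from the slides, and the function names are illustrative.

```python
# Rough sizing for a saturated gigabit link (assumptions, not measurements).
LINK_BITS_PER_SEC = 1_000_000_000   # gigabit link
PACKETS_PER_SEC = 250_000           # rate quoted on the slide


def mean_packet_bytes(link_bps, pps):
    """Average packet size implied by a link rate and a packet rate."""
    return link_bps / 8 / pps


def archive_bytes_per_hour(link_bps):
    """Bytes written per hour if every byte on the link is archived."""
    return link_bps / 8 * 3600


if __name__ == "__main__":
    print(mean_packet_bytes(LINK_BITS_PER_SEC, PACKETS_PER_SEC))  # 500.0
    print(archive_bytes_per_hour(LINK_BITS_PER_SEC) / 1e9)        # 450.0 (GB)
```

Under these assumptions the quoted rate corresponds to an average packet of about 500 bytes, and a single link produces on the order of 450 GB per hour.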
Hyperion Design
• Multiple monitor nodes, each monitoring multiple network links
• StreamFS: high-performance stream file system
• Local index: multi-level signature index based on Bloom filters
• Distributed index for querying multiple nodes
• Can scale to a million pkts/s with StreamFS and 200K pkts/s of indexing per core on a commodity multi-core PC
[Figure: a Hyperion node, showing monitor/capture, StreamFS, signature index, and distributed index]
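To illustrate the idea of a multi-level signature index built from Bloom filters, here is a minimal sketch. It is not Hyperion's actual index: the two-level layout, the filter size, the hash construction, and the names `Bloom` and `SignatureIndex` are all assumptions made for illustration. Each stream segment gets a leaf filter, and each group of leaves is summarized by a filter that is the bitwise OR of its members, so a query can skip whole groups.

```python
import hashlib


class Bloom:
    """Fixed-size Bloom filter stored as one Python integer bitmap."""

    def __init__(self, bits=1024, hashes=3):
        self.bits, self.hashes, self.word = bits, hashes, 0

    def _positions(self, key: bytes):
        # Derive `hashes` bit positions by salting SHA-256 with an index.
        for i in range(self.hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key: bytes):
        for p in self._positions(key):
            self.word |= 1 << p

    def might_contain(self, key: bytes):
        return all((self.word >> p) & 1 for p in self._positions(key))

    def union(self, other):
        merged = Bloom(self.bits, self.hashes)
        merged.word = self.word | other.word
        return merged


class SignatureIndex:
    """Two-level index: one leaf filter per segment, one summary per group."""

    def __init__(self, fanout=4):
        self.fanout = fanout
        self.leaves = []
        self.summaries = []

    def add_segment(self, keys):
        leaf = Bloom()
        for k in keys:
            leaf.add(k)
        self.leaves.append(leaf)
        group = (len(self.leaves) - 1) // self.fanout
        if group == len(self.summaries):
            self.summaries.append(Bloom())
        self.summaries[group] = self.summaries[group].union(leaf)
        return len(self.leaves) - 1

    def candidate_segments(self, key):
        """Segments that may hold `key`; false positives are possible."""
        hits = []
        for g, summary in enumerate(self.summaries):
            if not summary.might_contain(key):
                continue  # skip the whole group without touching its leaves
            start = g * self.fanout
            for i in range(start, min(start + self.fanout, len(self.leaves))):
                if self.leaves[i].might_contain(key):
                    hits.append(i)
        return hits
```

A lookup never misses a segment that really contains the key, but it may return extra candidates, which then have to be checked against the stored data.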
Online Data Migration
• Enterprise storage systems: multiple volumes mapped onto each array
– Load imbalances and hotspots can occur
• Goal: automatically resolve hotspots on volumes in large storage systems
• Focus: minimize migration cost (bytes migrated to resolve hotspot)
• Bandwidth-to-space ratio algorithm
– Displace and swap of volumes
– Implemented in Linux LVM
[ICAC 06]
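The intuition behind a bandwidth-to-space ratio can be sketched as a greedy selection: moving the volumes that carry the most load per gigabyte removes a hotspot while migrating few bytes. The code below is a simplified illustration under that assumption, not the algorithm from the cited paper. It does not model displace or swap moves or destination capacity, and `Volume` and `pick_volumes_to_move` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str
    size_gb: float     # bytes that must be copied if the volume moves
    load_mbps: float   # bandwidth the volume contributes to its array

    @property
    def bsr(self):
        """Bandwidth-to-space ratio: load removed per gigabyte migrated."""
        return self.load_mbps / self.size_gb


def pick_volumes_to_move(volumes, excess_mbps):
    """Greedily pick high-ratio volumes until the excess load is covered."""
    chosen, removed = [], 0.0
    for v in sorted(volumes, key=lambda v: v.bsr, reverse=True):
        if removed >= excess_mbps:
            break
        chosen.append(v)
        removed += v.load_mbps
    return chosen


if __name__ == "__main__":
    hot_array = [
        Volume("a", size_gb=100, load_mbps=50),
        Volume("b", size_gb=10, load_mbps=20),
        Volume("c", size_gb=500, load_mbps=60),
    ]
    plan = pick_volumes_to_move(hot_array, excess_mbps=25)
    print([v.name for v in plan], sum(v.size_gb for v in plan), "GB to copy")
```

In the usage example, the small but busy volume is chosen first, so the hotspot is relieved by copying 110 GB rather than the 500 GB that moving the largest volume would require.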
Semantic-aware Replication
• Replication for disaster recovery: synchronous replication for tight recovery point objectives
– Latency increases with geographic separation
– Use of intermediary does not improve consistency
– Too stringent for certain applications
• Semantic-aware replication: hybrid approach
– Use synchronous replication for “important” writes
– Use asynchronous replication for other writes
– Automatically infer which mode to use for each request
– Transparent to applications
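The hybrid approach can be sketched as a replicator that routes each write through a classifier. The slides do not say how importance is inferred, so the classifier here is a caller-supplied placeholder, and `HybridReplicator`, the in-memory `remote` dictionary, and the block-number rule in the usage example are all hypothetical.

```python
from collections import deque


class HybridReplicator:
    """Routes writes to synchronous or asynchronous replication."""

    def __init__(self, is_important):
        self.is_important = is_important  # hypothetical inference rule
        self.remote = {}                  # stand-in for the remote site
        self.pending = deque()            # async writes not yet shipped

    def write(self, block, data):
        if self.is_important(block, data):
            # Flush earlier async writes first so the remote copy sees
            # writes in issue order, then apply this one before returning.
            self.drain()
            self.remote[block] = data
            return "sync"
        self.pending.append((block, data))
        return "async"

    def drain(self):
        """Ship queued asynchronous writes to the remote site in order."""
        while self.pending:
            block, data = self.pending.popleft()
            self.remote[block] = data


if __name__ == "__main__":
    # Assumed rule for illustration: low block numbers hold a journal.
    rep = HybridReplicator(lambda block, data: block < 10)
    print(rep.write(50, b"bulk data"))   # async
    print(rep.write(1, b"journal"))      # sync
    print(sorted(rep.remote))            # [1, 50]
```

Draining the queue before a synchronous write is one possible design choice. It keeps the remote copy ordered at the cost of making important writes wait for any backlog.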
Flash Storage in Data Centers
• Flash-based storage is becoming popular
– Higher performance but also higher cost than disk drives
• How can flash storage be exploited in data centers?
• Use flash drives as an accelerator between disk storage and servers
– Focus on video storage, where performance is key
• Exploit flash disks as non-volatile storage in servers
– Fast hibernate/resume => efficient power management in data centers
Sensor Storage Overview
• Flash memory becoming extremely energy-efficient
• Exploit flash memory trends to design more efficient in-network sensor storage and querying systems
– Capsule: flash-based object storage system
– STONES: storage-centric sensor networks
[Figure: energy cost (uJ/byte) of communication (CC1000, CC2420) and storage (Atmel NOR, Telos STM NOR, Micron NAND 128MB) by generation of sensor platform]
Capsule Overview
• Object-based storage abstraction
• Energy- and memory-optimized library of objects
• Checkpointing and rollback for failure recovery
• Storage reclamation to deal with finite storage capacity
• Portable to NAND/NOR flash memories and different sensor platforms
[SenSys 06]
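As a rough illustration of an object abstraction with checkpointing and rollback on append-only flash, the sketch below implements a stack whose elements each store the log address of the element beneath them. This is an assumed design for illustration rather than Capsule's actual code: `FlashLog` is an in-memory stand-in for flash, storage reclamation is not modeled, and all names are hypothetical.

```python
class FlashLog:
    """Append-only record log; an in-memory stand-in for flash pages."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1   # address of the new record

    def read(self, addr):
        return self._records[addr]


class StackObject:
    """Stack stored as a backward-linked chain of log records."""

    def __init__(self, log):
        self.log = log
        self.top = None   # in-memory pointer to the newest element

    def push(self, value):
        # Each record remembers where the previous top lives on flash.
        self.top = self.log.append((value, self.top))

    def pop(self):
        if self.top is None:
            raise IndexError("pop from empty stack")
        value, self.top = self.log.read(self.top)
        return value

    def checkpoint(self):
        """Persist the in-memory pointer; returns the checkpoint address."""
        return self.log.append(("checkpoint", self.top))

    def rollback(self, checkpoint_addr):
        """Restore the pointer saved by an earlier checkpoint."""
        _, self.top = self.log.read(checkpoint_addr)


if __name__ == "__main__":
    stack = StackObject(FlashLog())
    stack.push(1)
    stack.push(2)
    cp = stack.checkpoint()
    stack.push(3)
    stack.rollback(cp)      # discard everything after the checkpoint
    print(stack.pop())      # 2
```

Because records are never overwritten, rolling back only needs to restore the small in-memory pointer, which suits media where in-place updates are expensive.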
StonesDB Overview
[Figure: StonesDB architecture, showing the query engine and partitioned access methods]
• StonesDB: flash memory-optimized archival data management architecture that supports sensor data storage, indexing, and aging of data.
[CIDR 07]
Extra Slides
Mapping App Data Needs to Storage
Map application data structures to Capsule objects that offer efficient flash implementations
[Figure: application data needs (debug logs, data archival and indexing, signal processing, data processing, packet queue, calibration tables) mapped to Capsule objects (queue, array, stream, stack, file, index) stored as pages on flash]
Local Data Management Stack
Distributed Data Management Stack
STONES
• Design an archival data management architecture that:
– Supports energy-efficient sensor data storage, indexing, and aging by optimizing for flash memories.
– Supports energy-efficient processing of SQL-type queries, as well as data mining and search queries.
– Is configurable to heterogeneous sensor platforms with different memory and processing constraints.
Technology Trends in Storage
[Figure: energy cost (uJ/byte) of communication (CC1000, CC2420) and storage (Atmel NOR, Telos STM NOR, Micron NAND 128MB) by generation of sensor platform]