
Hot and cold data storage


Page 1: Hot and cold data storage

HOT COLD

Unified Virtual File System For Hot & Cold Data Storage

Aditya Ambre Madhura S. Raghavan Rohit Arora

ENTERPRISE STORAGE ARCHITECTURE

GROUP 2

Page 2: Hot and cold data storage


CSC 568 Enterprise Storage Architecture (NC State University)

AGENDA

➔ Problem Statement

➔ Project Goals and Features

➔ Architecture and Workflow

➔ Verification Cases

➔ Summary

Page 3: Hot and cold data storage

PROBLEM STATEMENT

➔ Lifecycle of data.

◆ Access frequency.

◆ Storage capacity and hardware characteristics.

➔ User intervention - Running jobs/scripts.

➔ Acknowledging data temperature.

➔ Tight coupling needed between storage components.

[Figure labels: Frequently Accessed Data; Least Frequently Accessed Data]

Page 4: Hot and cold data storage

WHAT IS A HOT FILE?

A data file that

➔ Is very frequently accessed.

➔ Mostly contains business-critical information.

➔ Needs to be accessed quickly.

Page 5: Hot and cold data storage

WHAT IS A COLD FILE?

A data file that

➔ Is infrequently accessed.

➔ Contains less important information.

➔ Need not be quickly accessed.

Page 6: Hot and cold data storage

GOAL: WHAT IS OUR PROJECT?

➔ From decoupled storage components to a tightly coupled two-tiered storage system.

➔ Manage hot & cold data between primary and secondary storage.

➔ Manage primary storage space utilization.

➔ File transfers do not interrupt FS operations.

➔ File transfer and storage location are transparent to the user.

➔ Optimal storage of cold data.

Page 7: Hot and cold data storage


WHAT OUR PROJECT IS?

Page 8: Hot and cold data storage


FEATURES

➔ Infinite Storage illusion

➔ Automatic cold data identification and transfer

➔ Consistent CRUD operations for both hot and cold files

➔ Block level storage

➔ On the fly deduplication

➔ Uninterrupted file access

➔ File level Consistency

➔ Optimal storage space utilization

Page 9: Hot and cold data storage

OUR ARCHITECTURE

[Diagram components: APPLICATION (Write, Read); FUSE OPERATIONS (Read, Write, Delete, Rename, etc.); File Tracking Layer (Hot File Tracking, Cold File Tracking); Data Block Processing Layer (Write block to cold, Get block from cold, De-duplication); COLD STORAGE; Hot File; Cold File; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 10: Hot and cold data storage

HOT-TO-COLD WORKFLOW

[Diagram components: APPLICATION (Write); FUSE {WRITE} OPERATIONS; File Tracking Layer; Data Block Processing Layer; COLD STORAGE]

Page 11: Hot and cold data storage

HOT-TO-COLD WORKFLOW

[Diagram components: APPLICATION (Write); FUSE {WRITE} OPERATIONS with Check: Storage > 70%; File Tracking Layer; Data Block Processing Layer; COLD STORAGE]

Page 12: Hot and cold data storage

HOT-TO-COLD WORKFLOW

[Diagram components: APPLICATION (Write); FUSE {WRITE} OPERATIONS with Check: Storage > 70%; File Tracking Layer (Cold File Tracking); Data Block Processing Layer; COLD STORAGE]

Page 13: Hot and cold data storage

HOT-TO-COLD WORKFLOW

File Tracking Layer: Cold File Tracking

1. List all the files.
2. Sort files by access time, oldest to newest.
3. Select files to be transferred, until storage utilization is <= 50%.
4. Sort the selected files by size, large to small.
5. Send the largest and least accessed files to the Data Block Processing Layer.
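The selection steps above can be sketched as follows. This is a minimal illustration, not the project's code: the `FileInfo` record, the `select_cold_candidates` function, and the capacity figure in the usage example are hypothetical, and the sketch assumes the 50% target is measured against total hot-storage capacity.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical per-file record kept by the tracking layer."""
    name: str
    atime: float  # last access time (any monotonically comparable unit)
    size: int     # file size (same unit as used/capacity below)


def select_cold_candidates(files, used, capacity, target_ratio=0.50):
    """Pick the least recently accessed files until projected usage
    falls to target_ratio of capacity, then order them largest first."""
    target = target_ratio * capacity
    projected = used
    selected = []
    # Step 2: walk files from oldest access time to newest.
    for f in sorted(files, key=lambda f: f.atime):
        # Step 3: stop once enough space would be freed.
        if projected <= target:
            break
        selected.append(f)
        projected -= f.size
    # Step 4: transfer the largest selected files first.
    selected.sort(key=lambda f: f.size, reverse=True)
    return selected


if __name__ == "__main__":
    # Access times as hours of the day, sizes in KB; capacity is assumed.
    files = [
        FileInfo("File 1", 13.5, 100),
        FileInfo("File 2", 16.5, 500),
        FileInfo("File 3", 15.5, 250),
        FileInfo("File 4", 14.5, 350),
    ]
    for f in select_cold_candidates(files, used=1200, capacity=1400):
        print(f.name, f.size)
```

With an assumed capacity of 1400 KB, the four example files (1200 KB in total) exceed the 70% trigger, and freeing down to 50% selects Files 1, 4 and 3 by age before ordering them by size.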

Page 14: Hot and cold data storage

HOT-TO-COLD WORKFLOW

File Tracking Layer

File 1    1:30 PM    100 KB
File 2    4:30 PM    500 KB
File 3    3:30 PM    250 KB
File 4    2:30 PM    350 KB

Files in the final list, ordered by access time (oldest first):

File 1    1:30 PM    100 KB
File 4    2:30 PM    350 KB
File 3    3:30 PM    250 KB

Page 15: Hot and cold data storage

HOT-TO-COLD WORKFLOW

[Diagram components: APPLICATION (Write); FUSE {WRITE} OPERATIONS with Check: Storage > 70%; File Tracking Layer (Cold File Tracking); Data Block Processing Layer (Write block to cold); COLD STORAGE; Cold File]

Page 16: Hot and cold data storage

HOT-TO-COLD WORKFLOW

Data Block Processing Layer: Write Block to Cold

1. Request hashtable
2. Get hashtable

[Diagram: the Data Block Processing Layer requests (1) and gets (2) the hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9... from COLD STORAGE]

Page 17: Hot and cold data storage

HOT-TO-COLD WORKFLOW

Data Block Processing Layer: Write Block to Cold

1. Request hashtable
2. Get hashtable
3. Calculate block level hash
4. Check for de-duplication

[Diagram: Block 1, Block 2 and Block 3 are each checked (4. Duplicate?) against the hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9...; COLD STORAGE]

Page 18: Hot and cold data storage

HOT-TO-COLD WORKFLOW

Data Block Processing Layer: Write Block to Cold

1. Request hashtable
2. Get hashtable
3. Calculate block level hash
4. Check for de-duplication
5. Transfer if not duplicate
6. Free block's memory

[Diagram: Block 1, Block 2 and Block 3; 5. Transfer Block to COLD STORAGE; 5. Update Hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9...]

Page 19: Hot and cold data storage

HOT-TO-COLD WORKFLOW

Data Block Processing Layer: Write Block to Cold

1. Request hashtable
2. Get hashtable
3. Calculate block level hash
4. Check for de-duplication
5. Transfer if not duplicate
6. Free block's memory
7. Send updated hashtable to cold storage

[Diagram: 6. block memory freed; 7. Send Updated Hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9... to COLD STORAGE]
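The seven write-side steps can be sketched as follows. This is a minimal illustration under stated assumptions: `ColdStore` is a hypothetical in-memory stand-in for the cold tier, the 4096-byte block size is an arbitrary choice, and SHA-256 is assumed because the slides do not name the hash function. The returned list of block hashes is also an assumption about how a file's block order might be recorded.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; not stated in the slides


class ColdStore:
    """Hypothetical in-memory stand-in for the cold storage tier."""

    def __init__(self):
        self.hashtable = set()  # hashes of blocks already stored
        self.blocks = {}        # hash -> block bytes

    def get_hashtable(self):
        return set(self.hashtable)

    def put_block(self, digest, block):
        self.blocks[digest] = block

    def put_hashtable(self, table):
        self.hashtable = set(table)


def transfer_file(data, cold, block_size=BLOCK_SIZE):
    """Send a file's blocks to cold storage, skipping duplicates.
    Returns the ordered list of block hashes for the file."""
    table = cold.get_hashtable()                      # steps 1-2
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()    # step 3
        if digest not in table:                       # step 4
            cold.put_block(digest, block)             # step 5
            table.add(digest)
        recipe.append(digest)
        # Step 6 (freeing the hot block) is left to the caller here.
    cold.put_hashtable(table)                         # step 7
    return recipe


if __name__ == "__main__":
    cold = ColdStore()
    data = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
    recipe = transfer_file(data, cold)
    print(len(recipe), "blocks in file,", len(cold.blocks), "blocks stored")
```

In the usage example the first and third blocks are identical, so only two blocks are stored for a three-block file.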

Page 20: Hot and cold data storage

HOT-TO-COLD WORKFLOW

[Diagram components: APPLICATION (Write); FUSE {WRITE} OPERATIONS with Check: Storage <= 50%; File Tracking Layer (Cold File Tracking); Data Block Processing Layer (Write block to cold, De-duplication); COLD STORAGE; Cold File; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 21: Hot and cold data storage

COLD-TO-HOT WORKFLOW

[Diagram components: APPLICATION (Read Request); FUSE {READ} OPERATIONS; File Tracking Layer; Data Block Processing Layer; COLD STORAGE; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 22: Hot and cold data storage

COLD-TO-HOT WORKFLOW

[Diagram components: APPLICATION (Read Request); FUSE {READ} OPERATIONS; File Tracking Layer with Check: Is File on Hot Storage?; Data Block Processing Layer; COLD STORAGE; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 23: Hot and cold data storage

COLD-TO-HOT WORKFLOW

[Diagram components: APPLICATION (Read Request); FUSE {READ} OPERATIONS; File Tracking Layer with Check: Is File on Hot Storage? (branch: No); Data Block Processing Layer (Get block from cold); COLD STORAGE; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 24: Hot and cold data storage

COLD-TO-HOT WORKFLOW

Data Block Processing Layer: Get Block from Cold

1. Request copy of hashtable
2. Get hashtable

[Diagram: the Data Block Processing Layer requests (1) and gets (2) the hashtable from COLD STORAGE]

Page 25: Hot and cold data storage

COLD-TO-HOT WORKFLOW

Data Block Processing Layer: Get Block from Cold

1. Request copy of hashtable
2. Get hashtable
3. Read block presence on cold

[Diagram: 3. Is block present? The local hashtable copy 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9... is checked; COLD STORAGE]

Page 26: Hot and cold data storage

COLD-TO-HOT WORKFLOW

Data Block Processing Layer: Get Block from Cold

1. Request copy of hashtable
2. Get hashtable
3. Read block presence on cold
4. Request/Get block from cold

[Diagram: 4. Request Block and 4. Get Block between the Data Block Processing Layer and COLD STORAGE; Block 1, Block 2, Block 3; hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9... shown on both sides]

Page 27: Hot and cold data storage

COLD-TO-HOT WORKFLOW

Data Block Processing Layer: Get Block from Cold

1. Request copy of hashtable
2. Get hashtable
3. Read block presence on cold
4. Request/Get block from cold
5. Write transferred block content to memory block
6. Construct complete file

[Diagram: 6. Block 1, Block 2 and Block 3 are assembled into the file; hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9...; COLD STORAGE]

Page 28: Hot and cold data storage

COLD-TO-HOT WORKFLOW

Data Block Processing Layer: Get Block from Cold

1. Request copy of hashtable
2. Get hashtable
3. Read block presence on cold
4. Request/Get block from cold
5. Write transferred block content to memory block
6. Construct complete file
7. Delete copy of hashtable

[Diagram: Block 1, Block 2, Block 3; 7. Delete Hashtable 2f0f3ff2… 7439635… e7faa85… 3f35ec5f… e4ae0b9...; COLD STORAGE]
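The read-side steps can be sketched in the same spirit. This is a minimal illustration, not the project's code: `fetch_file` and its arguments are hypothetical, cold storage is modeled as a plain dictionary from block hash to block bytes, and the ordered list of block hashes per file (`recipe`) is an assumption, since the slides do not say how a file's block order is recorded. SHA-256 is used only to build the example data.

```python
import hashlib


def fetch_file(recipe, cold_hashtable, cold_blocks):
    """Rebuild a file from cold storage.

    recipe         -- ordered block hashes of the file (assumed metadata)
    cold_hashtable -- hashes known to cold storage
    cold_blocks    -- mapping of hash -> block bytes (stand-in for cold tier)
    """
    table = set(cold_hashtable)             # steps 1-2: local copy
    parts = []
    for digest in recipe:
        if digest not in table:             # step 3: is the block present?
            raise KeyError(f"block {digest} missing on cold storage")
        parts.append(cold_blocks[digest])   # steps 4-5: get and keep block
    data = b"".join(parts)                  # step 6: construct complete file
    del table                               # step 7: drop the local copy
    return data


if __name__ == "__main__":
    blocks = [b"A" * 8, b"B" * 8, b"A" * 8]
    cold_blocks = {hashlib.sha256(b).hexdigest(): b for b in blocks}
    recipe = [hashlib.sha256(b).hexdigest() for b in blocks]
    restored = fetch_file(recipe, set(cold_blocks), cold_blocks)
    print(restored == b"".join(blocks))
```

Because deduplicated blocks are stored once, the same stored block can appear at several positions in a recipe, as the repeated block does in the usage example.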

Page 29: Hot and cold data storage

COLD-TO-HOT WORKFLOW

[Diagram components: APPLICATION (Read Request); FUSE {READ} OPERATIONS (Read); File Tracking Layer (branch: No); Data Block Processing Layer (Get block from cold, Block Read Request); COLD STORAGE; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 30: Hot and cold data storage

MINIMAL THRESHOLD WORKFLOW

[Diagram components: APPLICATION (Some Operation); FUSE {READ} OPERATIONS with Check: Storage <= 30% (branch: Yes); File Tracking Layer (Hot File Tracking, Get Cold File); Data Block Processing Layer (Get block from cold, Block Read Request); COLD STORAGE; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 31: Hot and cold data storage

READ OPERATION WORKFLOW

[Diagram components: APPLICATION (Some Operation); FUSE {READ} OPERATIONS with Check: Storage > 30% & < 70% (branch: Yes); File Tracking Layer (Hot File Tracking, Get Cold File); Data Block Processing Layer (Get block from cold, Block Read Request); COLD STORAGE; hashtable entries 2f0f3ff2c7439635e7faa85… 3f35ec5fe4ae0b963779c8… 4a8f9ec938243beac4b2d…]

Page 32: Hot and cold data storage


QUICK DEMO

Page 33: Hot and cold data storage

SCENARIOS / VERIFICATION CASES

I. GENERAL

➔ File system 70% full -> Transfer to cold storage.

➔ File system utilization drops below 30% -> Transfer from cold storage.

➔ File transfers -> Do not interrupt general FS operations.

➔ Redundant/duplicate blocks -> Not transferred.
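The threshold checks can be sketched as a small decision function. This is a minimal illustration: `plan_action` and the constant names are hypothetical, and the boundary handling (strictly above 70% triggers eviction, 30% or below triggers prefetch) follows the "Storage > 70%" and "Storage <= 30%" checks shown in the workflow diagrams, which is an assumption where the wording on this slide differs slightly.

```python
HIGH_WATERMARK = 0.70   # above this, move cold files out of hot storage
EVICTION_TARGET = 0.50  # eviction continues until usage is at or below this
LOW_WATERMARK = 0.30    # at or below this, bring files back from cold storage


def plan_action(used, capacity):
    """Return the background action implied by hot-storage utilization."""
    usage = used / capacity
    if usage > HIGH_WATERMARK:
        return "move_to_cold"
    if usage <= LOW_WATERMARK:
        return "fetch_from_cold"
    return "none"


if __name__ == "__main__":
    for used in (80, 50, 30):
        print(used, plan_action(used, 100))
```

On a real mount point, the `used` and `capacity` values could be taken from the standard library's `shutil.disk_usage(path)`, which returns total, used and free bytes.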

Page 34: Hot and cold data storage

SCENARIOS / VERIFICATION CASES

II. SPECIFIC

➔ Files transferred -> Based on access time and size.

➔ File removed from hot storage -> After its last block is transferred.

➔ File in transition accessed -> Transfer aborted, access granted.

➔ File space reclamation and file access -> Synchronized.

➔ Only one background process runs at any given time.

➔ Delayed delete (rm) -> Transparent to user.
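Two of these cases, removing the hot copy only after the last block is transferred and aborting when a file in transition is accessed, can be sketched with an abort flag. This is a minimal illustration under stated assumptions: `transfer_with_abort`, `send_block` and `remove_hot_copy` are hypothetical names, and a `threading.Event` stands in for whatever signalling the FUSE handlers actually use.

```python
import threading


def transfer_with_abort(blocks, send_block, remove_hot_copy, abort):
    """Send blocks one by one; stop if an access sets the abort flag.

    The hot copy is removed only after every block has been sent and
    no abort was requested. Returns True if the transfer completed.
    """
    for block in blocks:
        if abort.is_set():       # file was accessed while in transition
            return False
        send_block(block)
    if abort.is_set():           # access arrived during the last block
        return False
    remove_hot_copy()
    return True


if __name__ == "__main__":
    abort = threading.Event()
    sent = []

    def send(block):
        sent.append(block)
        if len(sent) == 2:
            abort.set()          # simulate a read arriving mid-transfer

    done = transfer_with_abort([b"1", b"2", b"3"], send,
                               lambda: print("hot copy removed"), abort)
    print(done, len(sent))
```

In the usage example the simulated access arrives after the second block, so the third block is never sent and the hot copy stays in place.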

Page 35: Hot and cold data storage


ASSUMPTIONS

➔ Network is always available.

➔ Hot-Cold classification at file level

➔ Cold Storage is infinite.

➔ Files are not very small or very large.

➔ Delay is accepted for rarely accessed files.

➔ File access granularity – in seconds.

Page 36: Hot and cold data storage

SUMMARY

➔ Acknowledged data temperatures - hot and cold.

➔ Project features

◆ Automatic file identification.

◆ File transfer.

◆ Deduplication.

➔ Architecture and workflows in action.

➔ Design and implementation of the File Tracking Layer.

➔ Design and implementation of the Data Block Processing Layer.

➔ Design decisions for specific verification scenarios.

Page 37: Hot and cold data storage


FUTURE SCOPE

➔ Variable block size and Block size specifications.

➔ Garbage collection on secondary/cold storage.

➔ Cold file identification parameters and profiles.

➔ Distributed cold storage.


Page 39: Hot and cold data storage

QUESTIONS?