23
He Xiao and Zhenhua Li, Tsinghua University; Ennan Zhai, Yale University; Tianyin Xu, UIUC; Yang Li and Yunhao Liu, Tsinghua University; Quanlu Zhang, Microsoft Research Asia; Yao Liu, SUNY Binghamton Towards Web-based Delta Synchronization for Cloud Storage Services [email protected] Fast’18 Feb 14, 2018

Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

He Xiao and Zhenhua Li, Tsinghua University; Ennan Zhai, Yale University; Tianyin Xu, UIUC; Yang Li and Yunhao Liu, Tsinghua University;

Quanlu Zhang, Microsoft Research Asia; Yao Liu, SUNY Binghamton

Towards Web-based Delta Synchronization for Cloud Storage Services

[email protected] Fast’18 Feb 14, 2018

Page 2: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Network Traffic is Overwhelming in Cloud Storage

File Synchronization(Sync)

2

Cloud Traffic has 30% CAGR (Compound Average Growth Rate)

Cloud SeverClient

Network Traffic

Page 3: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Delta Sync Improves Network Efficiency

Delta Sync is crucial for reducing cloud storage network traffic.

10 MB1 B

Delta Sync

Delta Data

3

New File Old File

Delta sync support in nine state-of-the-art cloud storage services 10 MB

Full SyncNew File Old File

Full File

Page 4: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

No Web-based Delta Sync

Why web-based delta sync is not supported by today’s commercial cloud storage services ?

4

Web is the most pervasive and OS-independent cloud storage access method

Web-based delta sync is essential for cloud storage web clients and web apps

Page 5: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

WebRsync: First Workable Web Delta Sync

• Implement rsync on web framework with pure web tech: JavaScript + HTML5 + WebSocket

• Points out the Challenges of supporting delta sync on web.

JavaScript Implementation of Rsync

Web Server

LocalFile System

JavaScriptHTML5 FileAPI

WebSocket

Storage BackendAliyun OSS / OpenStack Swift

High-Speed Internal Network

Web Browser

C Implementation of Rsync

5

Page 6: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

WebRsync benchmarking: poor client performance

6

Sync time of WebRsync vs Linux rsync

~40%

60-92%

Rsync

WebRsync14–25 times slower

Page 7: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

7

StagMeter Tool

Timing tasks: Printing timestamps every 100ms:

x x x x x x x x x x x x x x x x x x x x x

x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x x

Stagnation Interval

Stagnation: single-thread is occupied by some backend tasks

User’s operation cannot get response timely.

Page 8: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

1. Send meta dataWait server

2. Checksum Searchand Comparison

3. Send tokens and literal bytes

High CPU Utilization when computing

Timestamp Printing is suspendedWeb is under stagnation state

Measuring Stagnation with StagMeter

8

Sync Process (Second)

Page 9: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

9

Client Cloud

Why poor client : slow searching and comparing

Bottleneck

Page 10: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

WebR2sync: Reverse Computation Process

10

Segmentation& fingerprinting

Searching& Comparing

Changed Bits

Construct New Files

Checksum list

Matched Tokens

Meta data

Client ServerNew file Old file

Page 11: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

WebR2sync: Reverse Computation Process

11

Segmentation& fingerprinting

Searching& Comparing

Changed Bits

Construct New Files

Matched Tokens

Meta data

Client ServerNew file Old file

Checksum list

Matched Tokens

Page 12: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Performance of WebR2sync

Edit Size (Byte)

Sync

Tim

e (S

econ

d)

12Issue: Server takes severely heavy overhead.

Sever side is 2-3 time slower

Page 13: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Server-side Overhead Profiling

Checksum searching and block comparison occupy80% of the computing time

MD5 Computing Checksum Search

13

Ø Use faster hash functions to replace MD5Ø Reduce checksum searching overhead

Page 14: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Replacing MD5 with SipHash in Chunk Comparison

SipHash remain low Collision Probabilityat much faster speed

14

A comparison of pseudorandom hash functions

Page 15: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Reduce Checksum Searching by Exploiting Locality of File Edits.

15

MD5-4

Hash Table

Adler32-1 Adler32-2 Adler32-3 Adler32-4

MD5-1 MD5-2 MD5-3

Block1 Block2 Block3 Block4

Checksum

search

Compare

SearchingOver 95% modified files have less than 10 edits.

Page 16: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Reduce Checksum Searching by Exploiting Locality of File Edits.

16

MD5-4

Hash TableAdler32-1 Adler32-2 Adler32-3 Adler32-4

MD5-1 MD5-2 MD5-3

Block1 Block2 Block3 Block4

Checksumsearch

Compare

Searching Over 95% modified files have less than 10 edits.

Page 17: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

A Series of attempts of other techs:Native Extension, Parallelism

17

• Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported platforms limited (e.g. Mobile web)

• WebRsync-Parallel: using HTML5 web workers to avoid stagnations. -> avoid stagnation but not on sync time

• The drawback of WebRsync cannot be fundamentally addressed through above optimizations

Page 18: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Evaluation Setup

18

Basic experiment setup visualized in a map of China

Page 19: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Sync Time

19

WebR2sync+ is 2-3 times faster than WebR2sync and 15-20 times faster than WebRsync

Page 20: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Throughput

20

This throughput is as 4 times as that of WebR2sync/rsyncand as 9 times as that of NoWebRsync.

Regular Workload Intensive Workload

Page 21: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Conclusion

• Implement a workable web-based delta sync named WebRsyncusing JavaScript and Html5, then quantifying the stagnation on browser by StagMeter.

• WebR2sync: Reverse the rsync process by moving computation-intensive operations from client with JavaScript to server side with efficient native C code.

• WebR2sync+: By exploiting the edit locality and trading off hash algorithms, we make the computation overhead affordable at the server side.

21

Page 22: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Future Work

• A seamless way to integrate the server-side design of WebR2sync+ with the back-end of commercial cloud storage vendors (like Dropbox and iCloud Drive).

• Explore the benefits of using more fine-grained and complex delta sync protocols, such as CDC and its variants.

• We envision to expand the usage of WebR2sync+ for a broader range of web service scenarios.

22

Page 23: Towards Web-based Delta Synchronization for Cloud Storage ... · 2/18/2014  · • Native Extension: leverage the native client for web browsers. -> as quick as native rsync , supported

Q&AThanks!

23