37
Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBand S. Narravula, P. Balaji, K. Vaidyanathan, H.-W. Jin and D. K. Panda The Ohio State University

S. Narravula, P. Balaji, K. Vaidyanathan, H.-W. Jin and D. K. Panda The Ohio State University

  • Upload
    blade

  • View
    31

  • Download
    0

Embed Size (px)

DESCRIPTION

Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBand. S. Narravula, P. Balaji, K. Vaidyanathan, H.-W. Jin and D. K. Panda The Ohio State University. Presentation Outline. Introduction/Motivation Design and Implementation - PowerPoint PPT Presentation

Citation preview

Page 1: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBandS. Narravula, P. Balaji, K. Vaidyanathan,

H.-W. Jin and D. K. Panda

The Ohio State University

Page 2: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Page 3: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Introduction Fast Internet Growth

Number of Users Amount of data Types of services

Several uses E-Commerce, Online Banking, Online Auctions, etc

Web Server Scalability Multi-Tier Data-Centers Caching – An Important Technique

Page 4: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Presentation Outline Introduction/Motivation

Multi-Tier Data-Centers Active Caches

Design and Implementation Experimental Results Conclusions

Page 5: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

A Typical Multi-Tier Data-Center

Database Servers

Clients

Application Servers

Web Servers

Proxy Nodes

Tier 0

Tier 1

Tier 2

Apache

PHP

WAN

SAN

Page 6: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

InfiniBand High Performance

Low latency High Bandwidth

Open Industry Standard Provides rich features

RDMA, Remote Atomic operations, etc Targeted for Data-Centers Transport Layers

VAPI IPoIB

Page 7: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Presentation Outline Introduction/Motivation

Multi-Tier Data-Centers Active Caches

Design and Implementation Experimental Results Conclusions

Page 8: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Caching

Can avoid re-fetching of content

Beneficial if requests repeat

Important for scalability Static content caching

Well studied in the past Widely used

Front-End Tiers

Back-End Tiers

Number ofRequestsDecrease

Page 9: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Active Caching Dynamic Data

Stock Quotes, Scores, Personalized Content, etc Complexity of content

Simple caching methods not suited Issues

Consistency Coherency

Proxy NodeCache

Back-EndData

User Request

Update

Page 10: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Cache Coherency

Refers to the average staleness of the document served from cache

Strong or immediate (Strong Coherency) Required for certain kinds of data Cache Disabling Client Polling

Page 11: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Basic Client Polling *Front-End Back-End

Request

Cache Hit

Cache Miss

Response

Version Read

* SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-TierData-Centers over InfiniBand. Narravula, et. Al.

Page 12: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Multiple Object Dependencies Cache documents contain multiple objects A Many-to-Many mapping

Single Cache document can contain Multiple Objects Single Object can be a part of multiple Documents

Complexity!!

Cache Documents Objects

Page 13: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Client PollingFront-End Back-End

Request

Cache Hit

Cache Miss

Response

Version ReadSingle Check

Possible

Single Lookup counter essential for correct and efficient design

Page 14: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Objective

To design an architecture that very efficiently supports strong cache coherency with multiple dynamic dependencies on InfiniBand

Page 15: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Page 16: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Basic System ArchitectureServer Node

Mod

Server Node

Mod

Server Node

Mod

Server Node

Mod

Cooperation

Cache Lookup Counter maintained on the Application Servers

ProxyServers

ApplicationServers

Page 17: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Basic Design

Home Node based Client Polling Cache Documents assigned home nodes

Proxy Server Modules Client polling functionality

Application Server Modules Support “Version Reads” for client polling Handle updates

Page 18: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Many-to-Many Mappings

Mapping of updates to dynamic objects Mapping of dynamic objects with Lookup

counters Efficiency

Factor of dependency

UpdatesObjectsLookupcounters

Page 19: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Mapping of updates

Non-Trivial solution Three possibilities

Database schema, constraints and dependencies are known

Per query dependencies are known No dependency information known

Page 20: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Mapping Schemes

Dependency Lists Home node based Complete dependency lists

Invalidate All Single Lookup Counter for a given class of

queries Low application server overheads

Page 21: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Handling Updates

DatabaseServer

Ack (Atomic)

ApplicationServer

ApplicationServer

ApplicationServer

Update Notification

VAPI SendLocal

Search andCoherentInvalidate

HTTPRequest

HTTPResponse

DB Query (TCP)

DB Response

Page 22: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Presentation Outline Introduction/Motivation Design and Implementation Experimental Results Conclusions

Page 23: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Experimental Test-bed

Cluster 1: Eight Dual 3.0 GHz Xeon processor nodes with 64-bit 133MHz PCI-X interface, 512KB L2-Cache and 533 MHz Front Side Bus

Cluster 2: Eight Dual 2.4 GHz Xeon processor nodes with 64-bit 133MHz PCI-X interface, 512KB L2-Cache and 400 MHz Front Side Bus

Mellanox InfiniHost MT23108 Dual Port 4x HCAs MT43132 eight 4x port Switch Mellanox Golden CD 0.5.0

Page 24: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Experimental Outline

Basic Data-Center Performance Cache Misses in Active Caching Impact of Cache Size Impact of Varying Dependencies Impact of Load in Backend Servers Traces Used

Traces 1-5 with increasing update rate Trace 6: Zipf like trace

Page 25: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Basic Data-Center Performance

Data-Center Throughput

0

5000

10000

15000

20000

Trace 2 Trace 3 Trace 4 Trace 5

Traces with Increasing Update Rate

TPS

No Cache Invalidate All Dependency Lists

• Maintaining Dependency Lists perform significantly well for all traces

Data-Center Response Time

0

1

2

3

4

5

6

Trace 2 Trace 3 Trace 4 Trace 5Traces with Increasing Update Rate

Resp

onse

Tim

e (m

s)No Cache Invalidate All Dependency Lists

Page 26: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Cache Misses in Active Caching

• Cache misses for Invalidate All increases drastically with increasingupdate rates

Cache Misses

0

20

40

60

80

100

120

Trace 2 Trace 3 Trace 4 Trace 5

Traces with Increasing Update Rate

Cach

e M

iss

%

No Cache Invalidate All Dependency Lists

Page 27: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Impact of Cache Size

• Maintaining Dependency Lists perform significantly well for all traces• Possible to cache a select few and still extract performance

Throughput Vs Cache Size

0

5000

10000

15000

20000

25000

100% 50% 10% 5% 1% 0.10% 0%

Relative Cache Size

TPS

Page 28: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Impact of Varying Dependencies

• Throughput drops significantly with increase in the average number ofdependencies per cache file

Effect of Varying Dependencies

02000400060008000

10000120001400016000

x 2x 4x 8x 16x 32x 64x

Factor of Dependencies

TPS

Page 29: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Impact of Load in Backend Servers

• Our design can sustain high performance even under high loadedconditions with a factor of improvement close to 22

Effect of Load

02000400060008000

10000120001400016000

0 1 2 4 8 16 32 64

# Compute Threads

TPS

No Cache Dependency List

Page 30: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Conclusions An architecture for supporting Strong Cache

Coherence with multiple dynamic dependencies

Efficiently handle multiple dynamic dependencies Supporting RDMA-based Client polling

Resilient to load on back-end servers

Page 31: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Web Pointers

http://nowlab.cis.ohio-state.edu/

E-mail: {narravul, balaji, vaidyana, jinhy, panda}@cse.ohio-state.edu

NBC home page

Page 32: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Back-up Slides

Page 33: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Cache Consistency

Non-decreasing views of system state Updates seen by all or none

User Requests

Proxy Nodes

Back-End Nodes

Update

Page 34: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Performance

• Receiver side CPU utilization is very low• Leveraging the benefits of One sided communication

Throughput (RDMA Read)

0

1000

2000

3000

4000

5000

6000

7000

1 4 16 64 256 1K 4K 16K 64K 256K

Message Size (bytes)

Thro

ughp

ut (M

bps)

0

5

10

15

20

25

Send CPU Recv CPU

Throughput (Poll) Throughput (Event)

Page 35: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

RDMA based Client Polling *

• The VAPI module can sustain performance even with heavy load on the back-end servers

DataCenter: Throughput

0

500

1000

1500

2000

2500

0 10 20 30 40 50 60 70 80 90 100 200

Number of Compute Threads

Tran

sact

ions

per

sec

ond

(TP

S)

No Cache IPoIB VAPI

* SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-TierData-Centers over InfiniBand. Narravula, et. Al.

Page 36: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Mechanism

Cache Hit: Back-end Version Check If version current, use cache Invalidate data for failed version check Use of RDMA-Read

Cache Miss Get data to cache Initialize local versions

Page 37: S. Narravula, P. Balaji, K. Vaidyanathan,      H.-W. Jin and D. K. Panda The Ohio State University

Other Implementation Details Requests to read and update are mutually

excluded at the back-end module to avoid simultaneous readers and writers accessing the same data.

Minimal changes to existing application software