Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data- Centers...

Preview:

Citation preview

Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBandS. Narravula, P. Balaji, K. Vaidyanathan,

H.-W. Jin and D. K. Panda

The Ohio State University

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Introduction

Fast Internet Growth Number of Users Amount of data Types of services

Several uses E-Commerce, Online Banking, Online Auctions, etc

Web Server Scalability Multi-Tier Data-Centers Caching – An Important Technique

Presentation Outline

Introduction/Motivation Multi-Tier Data-Centers Active Caches

Design and Implementation Experimental Results Conclusions

A Typical Multi-Tier Data-Center

Database Servers

Clients

Application Servers

Web Servers

Proxy Nodes

Tier 0

Tier 1

Tier 2

Apache

PHP

WAN

SAN

InfiniBand High Performance

Low latency High Bandwidth

Open Industry Standard Provides rich features

RDMA, Remote Atomic operations, etc Targeted for Data-Centers Transport Layers

VAPI IPoIB

Presentation Outline

Introduction/Motivation Multi-Tier Data-Centers Active Caches

Design and Implementation Experimental Results Conclusions

Caching

Can avoid re-fetching of content

Beneficial if requests repeat

Important for scalability Static content caching

Well studied in the past Widely used

Front-End Tiers

Back-End Tiers

Number ofRequestsDecrease

Active Caching Dynamic Data

Stock Quotes, Scores, Personalized Content, etc Complexity of content

Simple caching methods not suited Issues

Consistency Coherency

Proxy NodeCache

Back-EndData

User Request

Update

Cache Coherency

Refers to the average staleness of the document served from cache

Strong or immediate (Strong Coherency) Required for certain kinds of data Cache Disabling Client Polling

Basic Client Polling *

Front-End Back-End

Request

Cache Hit

Cache Miss

Response

Version Read

* SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-TierData-Centers over InfiniBand. Narravula, et. Al.

Multiple Object Dependencies Cache documents contain multiple objects A Many-to-Many mapping

Single Cache document can contain Multiple Objects Single Object can be a part of multiple Documents

Complexity!!

Cache Documents Objects

Client Polling

Front-End Back-End

Request

Cache Hit

Cache Miss

Response

Version ReadSingle Check

Possible

Single Lookup counter essential for correct and efficient design

Objective

To design an architecture that very

efficiently supports strong cache

coherency with multiple dynamic

dependencies on InfiniBand

Presentation Outline

Introduction/Motivation Design and Implementation Experimental Results Conclusions

Basic System Architecture

Server Node

Mod

Server Node

Mod

Server Node

Mod

Server Node

Mod

Cooperation

Cache Lookup Counter maintained on the Application Servers

ProxyServers

ApplicationServers

Basic Design

Home Node based Client Polling Cache Documents assigned home nodes

Proxy Server Modules Client polling functionality

Application Server Modules Support “Version Reads” for client polling Handle updates

Many-to-Many Mappings

Mapping of updates to dynamic objects Mapping of dynamic objects with Lookup

counters Efficiency

Factor of dependency

UpdatesObjectsLookupcounters

Mapping of updates

Non-Trivial solution Three possibilities

Database schema, constraints and dependencies are known

Per query dependencies are known No dependency information known

Mapping Schemes

Dependency Lists Home node based Complete dependency lists

Invalidate All Single Lookup Counter for a given class of

queries Low application server overheads

Handling Updates

DatabaseServer

Ack (Atomic)

ApplicationServer

ApplicationServer

ApplicationServer

Update Notification

VAPI SendLocal

Search andCoherentInvalidate

HTTPRequest

HTTPResponse

DB Query (TCP)

DB Response

Presentation Outline Introduction/Motivation Design and Implementation Experimental Results Conclusions

Experimental Test-bed

Cluster 1: Eight Dual 3.0 GHz Xeon processor nodes with 64-bit

133MHz PCI-X interface, 512KB L2-Cache and 533 MHz Front

Side Bus Cluster 2: Eight Dual 2.4 GHz Xeon processor nodes with 64-bit

133MHz PCI-X interface, 512KB L2-Cache and 400 MHz Front

Side Bus Mellanox InfiniHost MT23108 Dual Port 4x HCAs MT43132 eight 4x port Switch Mellanox Golden CD 0.5.0

Experimental Outline

Basic Data-Center Performance Cache Misses in Active Caching Impact of Cache Size Impact of Varying Dependencies Impact of Load in Backend Servers Traces Used

Traces 1-5 with increasing update rate Trace 6: Zipf like trace

Basic Data-Center Performance

Data-Center Throughput

0

5000

10000

15000

20000

Trace 2 Trace 3 Trace 4 Trace 5

Traces with Increasing Update Rate

TPS

No Cache Invalidate All Dependency Lists

• Maintaining Dependency Lists perform significantly well for all traces

Data-Center Response Time

0

1

2

3

4

5

6

Trace 2 Trace 3 Trace 4 Trace 5

Traces with Increasing Update Rate

Resp

onse

Tim

e (m

s)No Cache Invalidate All Dependency Lists

Cache Misses in Active Caching

• Cache misses for Invalidate All increases drastically with increasingupdate rates

Cache Misses

0

20

40

60

80

100

120

Trace 2 Trace 3 Trace 4 Trace 5

Traces with Increasing Update Rate

Cac

he

Mis

s %

No Cache Invalidate All Dependency Lists

Impact of Cache Size

• Maintaining Dependency Lists perform significantly well for all traces• Possible to cache a select few and still extract performance

Throughput Vs Cache Size

0

5000

10000

15000

20000

25000

100% 50% 10% 5% 1% 0.10% 0%

Relative Cache Size

TP

S

Impact of Varying Dependencies

• Throughput drops significantly with increase in the average number ofdependencies per cache file

Effect of Varying Dependencies

02000400060008000

10000120001400016000

x 2x 4x 8x 16x 32x 64x

Factor of Dependencies

TP

S

Impact of Load in Backend Servers

• Our design can sustain high performance even under high loadedconditions with a factor of improvement close to 22

Effect of Load

02000400060008000

10000120001400016000

0 1 2 4 8 16 32 64

# Compute Threads

TP

S

No Cache Dependency List

Conclusions

An architecture for supporting Strong Cache Coherence with multiple dynamic dependencies

Efficiently handle multiple dynamic dependencies Supporting RDMA-based Client polling

Resilient to load on back-end servers

Web Pointers

http://nowlab.cis.ohio-state.edu/

E-mail: {narravul, balaji, vaidyana, jinhy, panda}@cse.ohio-state.edu

NBC home page

Back-up Slides

Cache Consistency

Non-decreasing views of system state Updates seen by all or none

User Requests

Proxy Nodes

Back-End Nodes

Update

Performance

• Receiver side CPU utilization is very low• Leveraging the benefits of One sided communication

Throughput (RDMA Read)

0

1000

2000

3000

4000

5000

6000

7000

1 4 16 64 256 1K 4K 16K 64K 256K

Message Size (bytes)

Th

rou

gh

pu

t (M

bp

s)

0

5

10

15

20

25

Send CPU Recv CPU

Throughput (Poll) Throughput (Event)

RDMA based Client Polling *

• The VAPI module can sustain performance even with heavy load on the back-end servers

DataCenter: Throughput

0

500

1000

1500

2000

2500

0 10 20 30 40 50 60 70 80 90 100 200

Number of Compute Threads

Tra

nsac

tions

per

sec

ond

(TP

S)

No Cache IPoIB VAPI

* SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-TierData-Centers over InfiniBand. Narravula, et. Al.

Mechanism

Cache Hit: Back-end Version Check If version current, use cache Invalidate data for failed version check Use of RDMA-Read

Cache Miss Get data to cache Initialize local versions

Other Implementation Details Requests to read and update are mutually

excluded at the back-end module to avoid simultaneous readers and writers accessing the same data.

Minimal changes to existing application software

Recommended