
Page 1: Web Caching


Web Caching

By

Amisha Thakkar

Alpa Shah

Page 2: Web Caching


Overview

• What is a Web Cache?

• Caching Terminology

• Why use a cache?

• Disadvantages of Web Cache

• Other Features

• Caching Rules

Page 3: Web Caching


Overview

• Caching Architectures

• Comparison of Architectures

• Cache Deployment Scheme

• Client Side Cache Cooperation

• Active Caching

Page 4: Web Caching


What is a Web Cache?

• A cache is a place where temporary copies of objects are stored

• Cached information is generally closer to the requester than the permanent information is

• Objects: HTML pages, images, files
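A minimal sketch of this idea in Python, assuming the object is fetched over plain HTTP with urllib; the dict stands in for the cache's storage and the URL is the cache key.

```python
import urllib.request

cache = {}  # url -> temporary copy of the object (HTML page, image, file ...)

def get(url):
    """Return the object for url, serving the local copy when one exists."""
    if url in cache:
        return cache[url]                          # hit: no trip to the origin server
    body = urllib.request.urlopen(url).read()      # miss: fetch from the origin server
    cache[url] = body                              # keep a temporary copy for later requests
    return body
```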

Page 5: Web Caching


What is a Web Cache?

Page 6: Web Caching


Caching Terminology

• Client - An application program that establishes connections for sending requests

• Server - An application program that accepts connections to service requests by sending back responses

• Origin Server - The server on which a given resource resides or is to be created

Page 7: Web Caching


Caching Terminology

• Proxy - An intermediary program that acts as both a server and a client, making requests on behalf of other clients

• A proxy is not necessarily a cache

* A proxy does not always cache the replies passing through it

* It may be used on a firewall to monitor accesses
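A bare-bones sketch of that dual role, assuming a plain-HTTP forward proxy listening on port 8080 (the port is arbitrary): it accepts a request like a server, fetches the object from the origin like a client, and passes the reply through without caching it.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class ForwardProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # When a browser talks to a proxy, the request line carries the absolute URL.
        with urllib.request.urlopen(self.path) as upstream:    # client role: ask the origin server
            body = upstream.read()
        self.send_response(200)                                 # server role: answer the browser
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                                  # nothing is cached here

if __name__ == "__main__":
    HTTPServer(("", 8080), ForwardProxy).serve_forever()
```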

Page 8: Web Caching


Why use a cache?

• To reduce latency

• To reduce network traffic

• To reduce the load on origin servers

• To isolate end users from network failures

Page 9: Web Caching


Disadvantages of Web Cache

• With cached data there is always a chance of receiving stale information

• Content providers lose access counts when cache hits are served

• Manual configuration is often required

• Operation of a cache requires additional resources

• In some situations the cache can be a single point of failure

Page 10: Web Caching


Other Features

• Depending on the perspective, the following may be good or bad

* The cache requests on behalf of clients; the servers never see the clients' IP addresses

* The cache provides an easy opportunity to monitor and analyze browsing activities

* The cache can be used to block certain requests

Page 11: Web Caching


Types of Web Caches

• Proxy caches

* Serve a large number of users

* Large corporations and ISPs often set them up on firewalls

* They are a type of shared cache

• Browser caches

* Use a section of the computer's hard disk to store objects that you have seen

Page 12: Web Caching


Caching Rules

• Rules by which caches work -

* Some of them are set in protocols

* Some are set by the cache administrator

• Most common rules:

* If the object is authenticated or secure, it won't be cached

* The object's headers indicate whether the object is cacheable or not
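A sketch of how a cache might apply these two rules, assuming the request and response headers are already available as dicts; the header names are standard HTTP, but the rule set here is just the simplified one on this slide.

```python
def is_cacheable(request_headers, response_headers, scheme):
    """Simplified cacheability test following the rules above."""
    # Authenticated or secure objects are not cached.
    if "Authorization" in request_headers or scheme == "https":
        return False
    # The object's own headers can declare it uncacheable.
    cache_control = response_headers.get("Cache-Control", "").lower()
    if "no-store" in cache_control or "private" in cache_control:
        return False
    return True
```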

Page 13: Web Caching


Caching Rules

* An object is considered fresh when -

It has an expiry time or other age-controlling directive set and is still within the fresh period

If the browser cache has already seen the object and has been set to check once a session

Page 14: Web Caching


Caching Rules

If a proxy cache has seen the object recently and it was modified relatively long ago

Fresh documents are served directly from the cache without checking with the origin server
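A sketch of such a freshness check, assuming the cached entry is a dict whose timestamps (stored_at, expires, last_modified, all Unix time) and max_age (seconds) were parsed out of the response headers when the object was stored; the 10% heuristic factor is illustrative, not something mandated by these slides.

```python
import time

def is_fresh(entry, now=None):
    """Return True if the cached entry can be served without contacting the origin."""
    now = now if now is not None else time.time()
    if entry.get("max_age") is not None:                 # age-controlling directive
        return now - entry["stored_at"] < entry["max_age"]
    if entry.get("expires") is not None:                 # explicit expiry time
        return now < entry["expires"]
    if entry.get("last_modified") is not None:
        # Proxy-cache heuristic: an object modified relatively long ago is
        # assumed to stay fresh for a fraction of its current age.
        return now - entry["stored_at"] < 0.1 * (now - entry["last_modified"])
    return False                                         # no freshness information: treat as stale
```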

Page 15: Web Caching


Caching Rules

* For a stale object, the origin server will be asked to validate the object, or tell the cache whether the copy is still good

* The most common validator is the time that the object was last changed
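A sketch of that validation step using the Last-Modified time as the validator; entry is the same hypothetical cached-entry dict as above, with the Last-Modified value kept as the original HTTP date string. A 304 reply means the copy is still good; a 200 reply carries a new copy.

```python
import urllib.request
from urllib.error import HTTPError

def revalidate(url, entry):
    """Ask the origin server whether a stale cached copy is still good."""
    request = urllib.request.Request(url)
    # The most common validator: the time the object was last changed.
    request.add_header("If-Modified-Since", entry["last_modified_http"])
    try:
        with urllib.request.urlopen(request) as response:
            entry["body"] = response.read()     # 200: the object changed, keep the new copy
    except HTTPError as err:
        if err.code != 304:                     # 304 Not Modified: the cached copy is still good
            raise
    return entry["body"]
```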

Page 16: Web Caching


Caching Architectures: Hierarchical/Simple Cache

• Browser-cache interaction is the same as browser-host interaction, i.e. a TCP connection is made and the item is requested

• If the item is not found, the request is sent to the parent cache

• A hierarchy is built up - each level indirectly serves a wider community of users
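A minimal sketch of that lookup path, assuming each level of the hierarchy is an object holding its own store and a reference to its parent; only the top level contacts the origin server, and every level keeps a copy of what passes through it.

```python
import urllib.request

class Cache:
    """One level of a simple cache hierarchy (institutional, regional, national ...)."""
    def __init__(self, parent=None):
        self.objects = {}      # url -> body
        self.parent = parent   # next level up, or None at the top of the hierarchy

    def get(self, url):
        if url in self.objects:                          # hit at this level
            return self.objects[url]
        if self.parent is not None:                      # miss: forward the request to the parent cache
            body = self.parent.get(url)
        else:                                            # top of the hierarchy: contact the origin server
            body = urllib.request.urlopen(url).read()
        self.objects[url] = body                         # each level keeps its own copy
        return body

# national = Cache(); regional = Cache(parent=national); institutional = Cache(parent=regional)
```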

Page 17: Web Caching


Caching Architectures: Hierarchical/Simple Cache

[Diagram: institutional networks at the bottom connect to regional networks, which connect to national networks at the top of the hierarchy]

Page 18: Web Caching


Caching Architectures: Distributed/Co-operating Cache

• Decentralized (Cache Mesh)

• Multiple servers cooperate in such a way that they share their individual caches to create a large distributed one

• Simply put, caching proxies communicate with each other to serve different users

• On a cache miss, a proxy checks with other proxy caches before contacting the origin server

Page 19: Web Caching


Caching Architectures: Distributed/Co-operating Cache

• Caches communicate amongst themselves using a protocol like ICP (Internet Cache Protocol)

• Caches can be selected on the basis of

* Distance from the end user

* Specialization in particular URLs (location hints)

Page 20: Web Caching


Caching Architectures: Distributed/Co-operating Cache

• Why distributed - limitations of the hierarchy

* Width of the cache hierarchy: caches at the same level are inaccessible to each other

* The LRU policy implies sufficient disk space

* Cost of replicating disk storage

* The amount of disk space required depends on the number of users served and the breadth of their reading

Page 21: Web Caching


Caching Architectures: Distributed/Co-operating Cache

* The more users served, the more disk space is needed higher in the hierarchy

* Exponential growth of the number of documents on the WWW

Page 22: Web Caching


Caching Architectures: Distributed/Co-operating Cache

• Caching close to the user is more effective - the higher the level, the lower the efficiency

• Can be created for load balancing

• Most effective when serving a community of interests

Page 23: Web Caching


Caching Architectures: Distributed/Co-operating Cache

• First, a UDP packet is sent for the cache inquiry

• The cache selection decision is determined by RTT

• Potential problem - network congestion because of UDP

• In favor -

* A UDP exchange takes 2 IP packets; TCP takes at least 8 packets

Page 24: Web Caching


Caching Architectures: Distributed/Co-operating Cache

* A UDP reply from a cache can indicate

a. Presence

b. Speed

c. Availability of requested documents
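An ICP-like inquiry sketched with a plain text payload rather than the real ICP packet format: the cache asks each cooperating cache over UDP whether it holds the URL and picks the first (lowest-RTT) cache that reports a hit. The sibling addresses and the 50 ms timeout are illustrative assumptions.

```python
import socket

def query_siblings(url, siblings, timeout=0.05):
    """siblings: list of (host, port) pairs of cooperating caches.
    Returns the address of the first cache reporting a hit, or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    for addr in siblings:
        sock.sendto(b"QUERY " + url.encode(), addr)   # one small UDP packet per cooperating cache
    try:
        for _ in siblings:
            reply, addr = sock.recvfrom(1024)         # replies arrive lowest-RTT first
            if reply.startswith(b"HIT"):              # the reply indicates presence of the document
                return addr
    except socket.timeout:
        pass
    return None                                       # no hit reported: contact the origin server instead
```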

Page 25: Web Caching


Caching Architectures: Hybrid Cache

Note: ICP

Page 26: Web Caching


Comparison of Architectures

• Hierarchical: caches placed at multiple levels

• Distributed: caches only at the bottom level; no intermediate caches

Page 27: Web Caching


Comparison of Architectures

• Performance parameters:

Connection time (Tc) is defined as the time from when the document is requested until the first data byte is received

Transmission time (Tt) is defined as the time taken to transmit the document

Total latency = Tc + Tt

Bandwidth usage

Page 28: Web Caching


Comparison of Architectures

• Fig 3 - Connection time for documents of different popularity

Page 29: Web Caching


Comparison of Architectures

• For unpopular documents, connection times are high

• As the number of requests increases, the average connection time decreases

• For extremely popular documents, distributed caching has smaller connection times

Page 30: Web Caching


Comparison of Architectures

• Fig 4 Network traffic generated

Page 31: Web Caching


Comparison of Architectures

• On lower levels, distributed caching practically doubles the network bandwidth usage

• Around the root node in the national network, the network traffic is reduced to half

• Distributed caching uses all possible network shortcuts between institutional caches, generating more traffic in the less congested low network levels

Page 32: Web Caching


Comparison of Architectures

• Fig 5a - Non-congested national network

Page 33: Web Caching


Comparison of Architectures

• The only bottleneck on the path from the client to the origin server is the international path. Hence transmission times are similar for both

Page 34: Web Caching


Comparison of Architectures

• Fig 5b - Congested national networks

Page 35: Web Caching


Comparison of Architectures

• Both have higher transmission times compared to the previous case

• Distributed caching gives shorter transmission times than hierarchical because many requests travel through lower network levels

Page 36: Web Caching


Comparison of Architectures

• Fig 6 Average total latency

Page 37: Web Caching


Comparison of Architectures

• For large documents, transmission time is more relevant than connection time

• Hierarchical caching gives lower latencies for documents smaller than 200 KB due to lower connection times

• Distributed caching gives lower latencies for larger documents due to lower transmission times

Page 38: Web Caching


Comparison of Architectures

• The size threshold depends on the degree of congestion in the national network

• The higher the congestion, the lower the size threshold

• Distributed caching has lower latencies than hierarchical caching

Page 39: Web Caching


Comparison of Architectures with Hybrid Scheme

• Fig 7 - Connection time

Page 40: Web Caching


Comparison of Architectures with Hybrid Scheme

• Fig 8

Page 41: Web Caching


Comparison of Architectures with Hybrid Scheme

• In the hybrid scheme, if the number of cooperating caches (kc) is very small, the connection time is high

• As the number of cooperating caches increases, the connection time decreases to a minimum

• If the number increases beyond the threshold, the connection time increases very fast

Page 42: Web Caching


Comparison of Architectures with Hybrid Scheme

• Fig 9 Transmission time

Page 43: Web Caching


Comparison of Architectures with Hybrid Scheme

• For a non-congested network, the number of cooperating caches (kt) at every level hardly influences Tt

• If the number of cooperating caches is very small, Tt is high, and vice versa

• If the number increases above the threshold, Tt increases

• The optimum number of caches depends on the number of caches reachable while avoiding congested links

Page 44: Web Caching


Comparison of Architectures with Hybrid Scheme

• Fig 10

Page 45: Web Caching


Comparison of Architectures with Hybrid Scheme

• Fig 11 - Total latency

Page 46: Web Caching


Comparison of Architectures with Hybrid Scheme

• The number of cooperating caches (kopt) at every level that minimizes the total latency depends on the document size

• For small documents, the optimum number is closer to kc

• For large documents, the optimum number is closer to kt

Page 47: Web Caching


Comparison of Architectures with Hybrid Scheme

• Fig 12

Page 48: Web Caching


Comparison of Architectures with Hybrid Scheme

• For any document, the optimum kopt that minimizes the total latency is such that kc ≤ kopt ≤ kt

Page 49: Web Caching


Cache Deployment Schemes

• Proxy caching

Page 50: Web Caching


Cache Deployment Schemes

• Advantages

Clients point all web requests directly to the cache: no effect on non-web traffic

Cost of upgrading h/w & s/w is limited

Administration of caches is limited to basic configuration
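From the client side, plain proxy caching means every browser or HTTP library is explicitly pointed at the cache. A minimal sketch of that configuration in Python, where webcache.example.com:3128 is a hypothetical cache address (3128 is a common proxy port, but the value is deployment-specific).

```python
import urllib.request

# The client is explicitly configured to send its web requests to the cache;
# only this HTTP handler is affected, so non-web traffic is untouched.
proxy = urllib.request.ProxyHandler({"http": "http://webcache.example.com:3128"})
opener = urllib.request.build_opener(proxy)

page = opener.open("http://example.com/").read()   # goes via the cache, not straight to the origin
```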

Page 51: Web Caching


Cache Deployment Schemes

• Disadvantages

Every browser must be configured to point to the cache

Each client can hit only one cache

Single point of failure

Unnecessary duplication of data

Bottleneck in cases where content is otherwise available in LAN

Page 52: Web Caching


Cache Deployment Schemes

• Transparent Proxy caching

Page 53: Web Caching


Cache Deployment Schemes

• Advantages

No browser configuration

Cost of upgrading h/w & s/w is limited

No administration of intermediate systems required

Page 54: Web Caching


Cache Deployment Schemes

• Disadvantages

Each client can hit only one cache

If the cache goes down, both Internet and intranet access are lost

Negative impact on non-web traffic

The cache has to route non-web traffic

Routing, packet examination & network address translation steal CPU cycles from the main cache-serving function

Page 55: Web Caching


Cache Deployment Schemes

• Transparent proxy caching with web cache redirection.

Page 56: Web Caching


Cache Deployment Schemes

• Advantages

The switch/router examines the packets

Minimal impact on non-web traffic

Frees up CPU cycles for the web cache

Allows client load to be dynamically spread over multiple caches

Eliminates the single point of failure, especially if redundant redirectors are used
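A sketch of the redirection idea with hypothetical cache names: the redirector (here just a function playing the role of the switch/router) hashes each requested URL onto one of several caches and routes around caches that are marked down, so no single cache is a single point of failure.

```python
import hashlib

CACHES = ["cache1.example.com", "cache2.example.com", "cache3.example.com"]  # hypothetical cache hosts
down = set()   # caches currently known to be unavailable

def pick_cache(url):
    """Spread client load over the caches and skip failed ones."""
    candidates = [c for c in CACHES if c not in down] or CACHES
    index = int(hashlib.md5(url.encode()).hexdigest(), 16) % len(candidates)
    return candidates[index]   # the redirector forwards the request to this cache
```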

Page 57: Web Caching


Cache Deployment Schemes

• Disadvantages

Additional intermediate systems must be deployed

Increases expense

Page 58: Web Caching


Client Side Cache Cooperation

Page 59: Web Caching


Active Caching

• Current problem: dynamic documents cannot be cached

• Caching dynamic content on the web using active caching

• A cache applet is server-supplied code that is attached to a URL, or a collection of URLs

• The applet is written in a platform-independent language

Page 60: Web Caching


Active Caching

• On a user request, the applet is invoked by the cache

• The applet decides what is to be sent to the user

• Other functions of the applet -

* Logging user accesses

* Checking access permissions

* Rotating advertising banners

Page 61: Web Caching


Active Caching

• The proxy has the freedom not to invoke the applet but to send the request to the server instead

• The proxy promises not to send back a cached copy without invoking the applet

• If the applet is too large, the request is sent to the server

• The proxy is not obligated to cache any applet; in that case it agrees not to service requests for that document
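A sketch of the proxy-side contract described on these slides, with hypothetical names (applets, objects, and an applet.process method): a cached copy is never sent without first invoking the applet attached to the URL, and if the proxy chooses not to run (or not to cache) the applet, it forwards the request to the origin server instead.

```python
import urllib.request

applets = {}   # url -> cache applet (server-supplied code attached to that URL)
objects = {}   # url -> cached copy of the object

def handle_request(url, request):
    applet = applets.get(url)
    cached = objects.get(url)
    if applet is None or cached is None:
        # The proxy may decline to invoke or cache the applet; it then must not
        # serve a cached copy and simply forwards the request to the origin server.
        return urllib.request.urlopen(url).read()
    # A cached copy is only sent after the applet has run: the applet may log the
    # access, check permissions, rotate advertising banners, or build the reply itself.
    return applet.process(request, cached)
```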

Page 62: Web Caching


Active Caching

• The proxy can devote resources to the applets associated with the URLs hottest among its users

• Since the proxy that receives the request is typically the proxy closest to the user, the scheme automatically migrates server processing to nodes that are close to users

• Thus increasing the scalability of web-based services