
Content Caching and Scheduling


CHAPTER-1

INTRODUCTION


1. INTRODUCTION

1.1 OVERVIEW OF THE PROJECT

The past few years have seen the rise of smart handheld wireless devices as a means of content consumption. Content might include streaming applications, in which chunks of a file must be received under hard delay constraints, as well as file downloads such as software updates that have no such constraints. The core of the Internet is well provisioned, so the capacity constraints for content delivery lie at the media vault (where content originates) and at the wireless access links of end-users. Hence, a natural location to place caches for a content distribution network (CDN) is the wireless gateway, which could be a cellular base station through which users obtain network access. Furthermore, it is natural to exploit the inherent broadcast nature of the wireless medium to satisfy multiple users simultaneously.

There are multiple cellular base stations (BSs), each of which has a cache in which to store content. The caches can be refreshed periodically by accessing a media vault. We divide users into clusters, with the idea that all users in a cluster are geographically close, have statistically similar channel conditions, and can access the same base stations. Note that multiple clusters could be present in the same cell, based on the dissimilarity of their channel conditions to different base stations. The requests made by each cluster are aggregated at a logical entity that we call a front end (FE) associated with that cluster. The front end could run on any of the devices in the cluster or at a base station, and its purpose is to keep track of the requests associated with the users of that cluster. The following constraints affect system operation: 1) the wireless network between the caches and the users has finite capacity; 2) each cache can host only a finite amount of content; and 3) refreshing content in the caches from the media vault incurs a cost.

Users can make two kinds of requests: 1) elastic requests that have no delay constraints, and 2) inelastic requests that have a hard delay constraint. Elastic requests are stored in request queues at each front end, with each type of request occupying a particular queue. Here, the objective is to stabilize the queues so as to obtain finite delays. For inelastic requests, we adopt the model proposed in [2], wherein users request chunks of content that have a strict deadline, and a request is dropped if its deadline cannot be met. The idea is to meet a certain target delivery ratio, such as "90% of all requests must be met to ensure smooth playout."

Each time an inelastic request is dropped, a deficit queue is updated by an amount proportional to the delivery ratio. We would like the average value of the deficit to be zero. In this paper, we are interested in solving the joint content placement and scheduling problem for both elastic and inelastic traffic in wireless networks. In doing so, we also determine the value of predicting the demand for different types of content and the impact such prediction has on the design of caching algorithms.
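The queue dynamics described above can be sketched as follows. This is an illustrative model only: the class name `FrontEnd` and the exact update rule are our own simplifications, not the paper's equations.

```python
from collections import defaultdict


class FrontEnd:
    """Illustrative front end (FE) tracking per-content elastic request
    queues and per-content inelastic deficit queues.  The deficit update
    below is one common form of such dynamics, used here as a sketch."""

    def __init__(self, target_ratio=0.9):
        self.target_ratio = target_ratio   # e.g. "90% of requests must be met"
        self.elastic = defaultdict(int)    # content id -> backlogged elastic requests
        self.deficit = defaultdict(float)  # content id -> inelastic service deficit

    def elastic_arrival(self, content, n=1):
        # Elastic requests have no deadline; they simply wait in a queue.
        self.elastic[content] += n

    def elastic_service(self, content, n=1):
        self.elastic[content] = max(self.elastic[content] - n, 0)

    def inelastic_frame(self, content, arrivals, served):
        # Inelastic requests not served by their deadline are dropped.
        # The deficit grows in proportion to the target delivery ratio
        # and shrinks with actual service; stability requires its
        # average value to be zero.
        d = self.deficit[content] + self.target_ratio * arrivals - served
        self.deficit[content] = max(d, 0.0)
        return self.deficit[content]
```

If the scheduler serves at least the target fraction of inelastic arrivals on average, the deficit hovers near zero; persistent under-service makes it grow without bound.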


CHAPTER-2

LITERATURE SURVEY


2. LITERATURE SURVEY

2.2 PROPOSED SYSTEM

In this paper, we develop algorithms for content distribution with elastic and inelastic requests. We use request queues to implicitly determine the popularity of elastic content; similarly, the deficit queues determine the necessary service for inelastic requests. Content may be refreshed periodically at the caches. We study two different cost models, each appropriate for a different content distribution scenario. The first is the case of file distribution (elastic) along with streaming of stored content (inelastic), where we model cost in terms of the frequency with which caches are refreshed. The second is the case of streaming of content that is generated in real time, where content expires after a certain time and the cost of placing each packet in the cache is considered.

We first characterize the capacity region of the system and develop feasibility constraints that any stabilizing algorithm must satisfy. Here, by stability we mean that the elastic request queues have finite means, while the inelastic deficit values are zero on average.
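In symbols, writing $Q_f(t)$ for the elastic request queue of content $f$ and $D_f(t)$ for its deficit, the stability notion above can be stated as follows (our paraphrase of the text, not the paper's exact display):

```latex
\limsup_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\big[Q_f(t)\big] < \infty
\qquad\text{and}\qquad
\lim_{T\to\infty} \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\big[D_f(t)\big] = 0 .
```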

We develop a version of the max-weight scheduling algorithm for joint content placement and scheduling. We show that it satisfies the feasibility constraints and, using a Lyapunov argument, that it stabilizes the system for any load within the capacity region. As a by-product, we show that the value of knowing the arrival rates is limited in the case of elastic requests, and is of no use at all in the inelastic case.
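A max-weight decision of the kind described can be sketched as below. This is schematic only: the feasible set (choose which contents to cache, then serve the heaviest cached ones up to the link rate) and the weight definition (elastic backlog plus deficit) are simplified stand-ins for the paper's formulation.

```python
import itertools


def max_weight_schedule(queues, deficits, contents, cache_size, rate):
    """Schematic max-weight joint placement and scheduling rule.

    Enumerates every way to place `cache_size` contents in the cache and
    picks the placement whose served contents carry the largest total
    backlog weight (elastic queue length plus inelastic deficit), where
    at most `rate` contents can be served per slot.
    """
    best_weight, best_placement = -1.0, ()
    for placement in itertools.combinations(contents, cache_size):
        # Serve the highest-weight cached contents, up to the link rate.
        weights = sorted(
            (queues.get(c, 0) + deficits.get(c, 0.0) for c in placement),
            reverse=True,
        )
        w = sum(weights[:rate])
        if w > best_weight:
            best_weight, best_placement = w, placement
    return best_placement, best_weight
```

Note the rule needs only the current queue and deficit values, not the arrival rates, which is consistent with the observation above that knowing the rates adds little.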

We next study another version of our content distribution problem with only inelastic traffic, in which each content has an expiration time. We assume that there is a cost for replacing each expired content chunk with a fresh one. For this model, we first find the feasibility region and then, following a technique similar to that of [14], develop a joint content placement and scheduling algorithm that minimizes the average expected cost while stabilizing the deficit queues.
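The cost-aware variant can be sketched with a drift-plus-penalty style rule, a standard technique for trading queue stability against average cost. The trade-off parameter `V` and the per-content cost model below are our own simplifications, not the paper's algorithm.

```python
def cost_aware_placement(deficits, costs, contents, cache_size, V=10.0):
    """Schematic drift-plus-penalty rule for inelastic caching with expiry.

    Each content is scored by its deficit (service urgency) minus V times
    the cost of refreshing it after expiry; a larger V weights cost
    reduction more heavily relative to deficit stability.  Only contents
    with positive score are cached, up to capacity.
    """
    scored = sorted(
        contents,
        key=lambda c: deficits.get(c, 0.0) - V * costs.get(c, 0.0),
        reverse=True,
    )
    return [c for c in scored[:cache_size]
            if deficits.get(c, 0.0) - V * costs.get(c, 0.0) > 0]
```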

We illustrate our main insights using simulations on a simple wireless topology and show that our algorithm is indeed capable of stabilizing the system. We also propose two simple, easily implementable algorithms and compare their performance to the throughput-optimal scheme.


CHAPTER-3

SYSTEM DESIGN AND IMPLEMENTATION


3. SYSTEM DESIGN AND IMPLEMENTATION

3.1 SYSTEM REQUIREMENTS

3.1.1 Hardware Requirements

The hardware for the system is selected considering factors such as CPU processing speed, memory access speed, peripheral channel speed, printer speed, the seek time and rotational delay of the hard disk, and communication speed. The hardware specifications are as follows:

Processor : Intel(R) Core(TM) 2 Duo CPU E4700 @ 2.60 GHz

Monitor : Display Panel (1024 x 768) preferred

Hard Disk Drive : 80 GB or higher

Keyboard : Standard 103/104-key keyboard

RAM : 1 GB

Mouse : Logitech Serial Mouse

3.1.2 Software Requirements

To implement the proposed system the following configuration is

needed.

Operating system - Windows XP

Front end - ASP.NET 2008

Back end - SQL Server 2005

Coding - C#.NET 2008

Web server - IIS

Framework - .NET Framework 3.5


3.2 SYSTEM ARCHITECTURE

Fig 3.1 System Architecture

3.3 DATAFLOW DIAGRAM

A Data Flow Diagram (DFD) is a graphical technique that depicts information flow and the transformations that are applied as data move from input to output. DFDs are commonly used to obtain a pictorial representation of the various processes occurring within the system, including the minimum content of the data stores; each data store should contain all the data elements that flow in and out of it. Process modeling involves figuring out how data will flow within the system and what steps to perform on the data.


Circles (or ellipses) represent processes, and arrows indicate the data flow. External entities, such as users or customers with whom the system interacts, lie outside the system; they either supply or consume data. Entities supplying data are known as sources, and those that consume data are called sinks. Data are stored in a data store by a process in the system. Each component in a DFD is labeled with a descriptive name, and process names are further identified with numbers. The context-level DFD is drawn first; the processes are then decomposed into several elementary levels and represented in order of importance. A DFD describes what data flow (the logical view) rather than how they are processed, so it does not depend on hardware, software, data structure, or file organization. The DFD methodology is quite effective, especially when the required design is clear and the analyst needs a notation language for communication. A DFD is easy to understand after a brief orientation.

Fig 3.2 DFD Levels 0 - Login


Fig 3.3 DFD Level 1 – Cloud Administrator

Fig 3.4 DFD Level 2 – Data Consumer


Fig 3.5 DFD Level 3 – File Manager


3.4 USE CASE DIAGRAM

A use case diagram in the Unified Modeling Language (UML) is a type of behavioral diagram defined by and created from a use-case analysis. Its purpose is to present a graphical overview of the functionality provided by a system in terms of actors, their goals (represented as use cases), and any dependencies between those use cases.

Fig 3.6 Use Case Diagram – Cloud Administrator


Fig 3.7 Use Case Diagram – File Manager

Fig 3.8 Use Case Diagram – Data Consumer


3.5 ER DIAGRAM

The ER diagram illustrates the entity relationships among the various tables in the system.

Fig 3.9 Content caching and Scheduling


3.6 PROJECT DESCRIPTION

MODULES

Cloud Administrator

File Manager

Data Consumer

Data Consumer Registration

Cloud Administrator

Update Security

Register File Manager

Manage File manager

View files in server

View File Access details

Logout

File Manager

Upload files to server

Assign Key Policies

Authenticate Data consumer

View Request

Process Request

Logout


Data Consumer

View Files

Send Download Request

Download files

Logout

Data Consumer Registration

3.7 DATABASE DESIGN

Database design is recognized as a standard part of MIS and is available for virtually every computer system. In a database environment, common data are available that several authorized users can share. The general theme behind a database is to integrate all the information. A database is an organized mechanism that can store information and through which a user can retrieve stored information in an effective and efficient manner. The data are the purpose of any database and must be protected. Database design is a two-level process. In the first step, user requirements are gathered and a database is designed that will meet these requirements as closely as possible.

This step is called Information Level Design, and it is carried out independently of any individual DBMS. In the second step, the information-level design is transformed into a design for the specific DBMS that will be used to implement the system; this step is called Physical Level Design and is concerned with the characteristics of that specific DBMS. Database design runs parallel with system design. The organization of the data in the database aims to achieve the following two major objectives:

Data Integrity

Data independence

The data are structured so that there is no repetition, which saves storage space.

A database is an integrated collection of data and provides centralized access to the data. Centralized data-managing software is usually called an RDBMS. The main difference between an RDBMS and other types of DBMS is the separation of data as seen by the program from data as stored on the direct-access storage device; this is the difference between logical and physical data. While designing a database, several objectives must be considered:

Ease of learning and use

Data independence

More information at low cost

Accuracy and integrity

While creating the tables, the above objectives were considered in order to avoid redundancy of data.

Field Name Data Type

logname varchar(50)

passwd varchar(50)

Table 3.1 tblAdmin


Field Name Data Type

rid varchar(50)

name varchar(50)

doj varchar(50)

address varchar(50)

age varchar(50)

nationality varchar(50)

phone varchar(50)

mobile varchar(50)

email varchar(50)

edu varchar(50)

expe varchar(50)

logname varchar(50)

passwd varchar(50)

status varchar(50)

Table 3.2 tblFmanager


Field Name Data Type

fle_id numeric(18, 0)

fle_name varchar(50)

fle_icon varchar(150)

fle_desc varchar(350)

fle_type varchar(50)

fle_file varchar(300)

fle_size varchar(50)

fle_status varchar(50)

fle_date varchar(50)

Table 3.3 tblFiles


Field Name Data Type

CST_ID varchar(50)

CSR_NAME varchar(50)

CST_ADDR varchar(50)

CST_EMAIL varchar(50)

CST_CITY varchar(50)

CST_STATE varchar(50)

CST_COUNTRY varchar(50)

CST_REGDATE varchar(50)

CST_LOG varchar(50)

CST_PASS varchar(50)

CST_STATUS int

Table 3.4 tblClient


Field Name Data Type

req_rid varchar(50)

req_id varchar(50)

req_filename varchar(150)

req_fid varchar(50)

req_fdesc varchar(150)

req_fsize int

req_file varchar(50)

req_status int

req_date varchar(50)

Table 3.5 tblRequest

Field Name Data Type

RECID varchar(50)

RKEY varchar(50)

Table 3.6 AssignKey
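The schemas above can be reproduced for quick experimentation. The snippet below uses Python's built-in sqlite3 as a stand-in for the SQL Server 2005 back end, so TEXT approximates the varchar columns listed; only Tables 3.1 and 3.6 are shown.

```python
import sqlite3

# In-memory stand-in for the SQL Server 2005 back end; column names
# follow Tables 3.1 (tblAdmin) and 3.6 (AssignKey).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblAdmin (
        logname TEXT,   -- varchar(50) in the original schema
        passwd  TEXT    -- varchar(50)
    );
    CREATE TABLE AssignKey (
        RECID TEXT,     -- varchar(50)
        RKEY  TEXT      -- varchar(50)
    );
""")
conn.execute("INSERT INTO tblAdmin VALUES (?, ?)", ("admin", "secret"))
row = conn.execute("SELECT logname FROM tblAdmin").fetchone()
```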


CHAPTER-4

RESULTS AND DISCUSSION


4. RESULTS AND DISCUSSION

4.1 SCREENSHOTS

Fig 4.1 Home Page


Fig 4.2 Admin Login


Fig 4.3 Register File Manager


Fig 4.4 Manage File manager


Fig 4.5 View Files


Fig 4.6 File Upload


Fig 4.7 File Manager Login


Fig 4.8 Register Data Consumer


Fig 4.9 Manage Data Consumer


Fig 4.10 Send Download Request


Fig 4.11 View File Request


Fig 4.12 Process Request


Fig 4.13 Download Files


Fig 4.14 View File Access


CHAPTER-5

CONCLUSION AND FUTURE ENHANCEMENT


5. CONCLUSION

5.1 CONCLUSION

In this project, algorithms for content placement and scheduling in wireless broadcast networks are studied. While there has been significant work on content caching algorithms, there is much less on the interaction of caching and networks. Converting the caching and load-balancing problem into one of queueing and scheduling is hence interesting. We considered a system in which both inelastic and elastic requests coexist. Our objective was to stabilize the system, in the sense of finite queue lengths for elastic traffic and zero average deficit value for inelastic traffic. We incorporated the cost of loading the caches into our problem by considering two different models. In the first model, the cost corresponds to refreshing the caches with unit periodicity. In the second model, relating to inelastic caching with expiry, we directly assumed a unit cost for replacing each content after expiration. A max-weight-type policy was suggested for this model, which stabilizes the deficit queues and achieves an average cost that is arbitrarily close to the minimum cost.


REFERENCES

[1] I. Hou, V. Borkar, and P. Kumar, “A theory of QoS for wireless,” in

Proc. IEEE INFOCOM, Rio de Janeiro, Brazil, Apr. 2009, pp. 486–494.

[2] R. Motwani and P. Raghavan, Randomized Algorithms. New York, NY, USA: Cambridge Univ. Press, 1995.

[3] P. Cao and S. Irani, "Cost-aware WWW proxy caching algorithms," in Proc. USENIX Symp. Internet Technol. Syst., Berkeley, CA, USA, Dec. 1997, p. 18.

[4] K. Psounis and B. Prabhakar, “Efficient randomized Web-cache

replacement schemes using samples from past eviction times,” IEEE/ACM

Trans. Netw., vol. 10, no. 4, pp. 441–455, Aug. 2002.

[5] N. Laoutaris, O. Telelis, V. Zissimopoulos, and I. Stavrakakis, "Distributed selfish replication," IEEE Trans. Parallel Distrib. Syst., vol. 17, no. 12, pp. 1401–1413, Dec. 2006.

[6] S. Borst, V. Gupta, and A. Walid, “Distributed caching algorithms for

content distribution networks,” in Proc. IEEE INFOCOM, San Diego, CA,

USA, Mar. 2010, pp. 1–9.

[7] L. Tassiulas and A. Ephremides, “Stability properties of constrained

queueing systems and scheduling policies for maximum throughput in

multihop radio networks,” IEEE Trans. Autom. Control, vol. 37, no. 12, pp.

1936–1948, Dec. 1992.


[8] X. Lin and N. Shroff, “Joint rate control and scheduling in multihop

wireless networks,” in Proc. 43rd IEEE CDC, Paradise Islands, Bahamas,

Dec. 2004, vol. 2, pp. 1484–1489.

[9] A. Stolyar, “Maximizing queueing network utility subject to stability:

Greedy primal-dual algorithm,” Queueing Syst. Theory Appl., vol. 50, no. 4,

pp. 401–457, 2005.

[10] A. Eryilmaz and R. Srikant, “Joint congestion control, routing, and

MAC for stability and fairness in wireless networks,” IEEE J. Sel.
