
Peer-Assisted On-Demand Streaming: Characterizing Demands and Optimizing Supplies

Fangming Liu, Member, IEEE, Bo Li, Fellow, IEEE, Baochun Li, Senior Member, IEEE, and Hai Jin, Senior Member, IEEE

Abstract—Nowadays, there has been significant deployment of peer-assisted on-demand streaming services over the Internet. Two of the most unique and salient features of a peer-assisted on-demand streaming system are the differentiation in the demand (or request) and the prefetching capability with caching. In this paper, we develop a theoretical framework based on queuing models, in order to 1) justify the superiority of service prioritization based on a taxonomy of requests, and 2) understand the fundamental principles behind optimal prefetching and caching designs in peer-assisted on-demand streaming systems. The focus is to show how limited uploading bandwidth resources and peer caching capacities can be utilized most efficiently to achieve better system performance. To achieve these objectives, we first use priority queuing analysis to prove how service quality and user experience can be statistically guaranteed, by prioritizing requests in the order of significance, including urgent playback (e.g., random seeks or initial startup), normal playback, and prefetching. We then proceed to construct a fine-grained stochastic supply-demand model to investigate peer caching and prefetching as a global optimization problem. This not only provides insights into the fundamental characterization of demand, but also offers guidelines toward optimal prefetching and caching strategies in peer-assisted on-demand streaming systems.

Index Terms—On-demand video streaming, peer-to-peer, queuing model, performance evaluation


1 INTRODUCTION

IN recent years, peer-assisted on-demand streaming systems have not only been the target of a substantial amount of research, but have also become core industry products in startup and established corporations alike, such as PPLive [1] and Joost [2]. Such systems offer great potential to bring a rich repository of video content to users’ fingertips. Typical peer-assisted on-demand streaming systems provide users the convenience and flexibility of watching whatever video clips whenever they wish: they are able to play back the video, pause the video, perform random seeks to an arbitrary point of playback, or switch to and start up new videos. To offload dedicated streaming servers that maintain a baseline on content availability, peers contribute their uploading bandwidth resources and limited local cache capacities to cooperatively serve one another.

Despite a large variety of different design choices in the literature (e.g., [1]), they have primarily been driven by engineering intuitions, rather than a rigorous set of design principles based on a solid theoretical foundation. In particular, we believe that two critical aspects of the design space need to be revisited from a theoretical perspective: requests on the “demand” side, and caching on the “supply” side.

First, in response to a large number of playback and random seek requests from peers, it would be desirable to have guaranteed continuous playback and short latencies after a random seek or an initial startup. How can a certain level of guarantees be provided with respect to service quality and user experience? Second, as peer caching has been used in both memory and nonvolatile storage to improve the “supply” of video content [1], [3], it is possible for peers to actively prefetch and cache certain content in local peer caches. The hope is that such prefetched content may become useful to other peers, and as such mitigate the load on dedicated servers. While a plethora of heuristics are available (e.g., [4]), the potential benefits of such prefetching do not come without bandwidth costs. There have been no foundational principles, grounded in solid theoretical analysis, on when and how prefetching should be performed. If we jointly consider requests that make demands for content and prefetching that augments supplies of content with additional a priori consumption of upload bandwidth, we may be able to develop a theoretical framework that cultivates the root of a high-quality on-demand streaming system design.

In this paper, we seek to address these fundamental questions with the following contributions. Rather than heuristics based on intuition, we construct a new analytical framework based on queuing models that 1) advocates and justifies the superiority of service prioritization based on a taxonomy of streaming requests, and 2) helps understand the fundamental principle behind optimal prefetching and caching designs. Uploading bandwidth and peer caching are finite resources in any design of on-demand streaming systems, and it is our focus to provide guidelines on how a finite pool of uploading bandwidth resources and peer caching capabilities are to be utilized most efficiently toward better performance.

IEEE TRANSACTIONS ON COMPUTERS, VOL. 62, NO. 2, FEBRUARY 2013 351

. F. Liu and H. Jin are with the Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, No. 5 Eastern Building, No. 1037 Luoyu Road, Hongshan District, Wuhan 430074, China. E-mail: {fmliu, hjin}@mail.hust.edu.cn.
. B. Li is with the Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon 999077, Hong Kong. E-mail: [email protected].
. B. Li is with the Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, 10 King’s College Road, Toronto, ON M5S 3G4, Canada. E-mail: [email protected].

Manuscript received 24 Mar. 2011; revised 6 Sept. 2011; accepted 16 Oct. 2011; published online 10 Nov. 2011. Recommended for acceptance by S. Nikoletseas. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TC-2011-03-0193. Digital Object Identifier no. 10.1109/TC.2011.222.

0018-9340/13/$31.00 © 2013 IEEE Published by the IEEE Computer Society

In particular, we first use priority queuing analysis to prove how service quality and user experience can be statistically guaranteed by prioritizing requests in the order of urgent playback (e.g., random seeks or initial startup), normal playback, and prefetching. We then proceed to a fine-grained stochastic supply-demand model to investigate peer caching and prefetching as a global optimization problem. This not only provides insights into the fundamental characterization of the demand, but also offers guidelines toward optimal prefetching and caching strategies in peer-assisted on-demand streaming systems.

The remainder of this paper is organized as follows: in Section 2, we discuss our contributions in the context of related work. In Section 3, we present a conceptual queuing model for peer-assisted on-demand streaming systems, and justify the superiority of service prioritization based on a taxonomy of requests. Section 4 presents our investigation of caching and prefetching based on a fine-grained stochastic supply-demand model. Finally, we conclude the paper in Section 5.

2 RELATED WORK

Though a number of commercial peer-assisted on-demand media streaming systems have been deployed (e.g., [1]), there is a lack of theoretical foundations and analyses toward the understanding of fundamental design principles and challenges in peer-assisted on-demand streaming systems. From a theoretical perspective, to date, only a few analytical studies on peer-assisted on-demand streaming exist in the literature. Suh et al. [3] have analytically investigated the design of a push-to-peer video-on-demand system in cable networks, and bounds on catalog sizes in such a system have been derived in a subsequent work [5]. More recently, Parvez et al. [6] have developed a theoretical model to analyze the performance of BitTorrent-like protocols for on-demand stored media streaming. However, these models have not been able to characterize the differentiation in demand, and thus are not in a position to provide insights on how to guarantee the service quality for urgent requests, such as random seeks. In contrast, our model characterizes demand of different levels of urgency through a priority queuing analysis, which both qualitatively and quantitatively justifies how service quality and user experience can be statistically guaranteed through service prioritization.

With respect to studies on caching and prefetching in peer-assisted on-demand streaming systems, there exist a number of proposals (e.g., [7]), mostly driven by engineering intuitions or heuristics. Guo et al. have proposed a hybrid media proxy system assisted by P2P networks, called PROP [8], in which DHT-based overlay management and cache replacement heuristics are designed to achieve system scalability and reliability. Instead of dictating a specific system infrastructure and implementation, we focus on the underlying resource allocation problem—with respect to finite peer bandwidth and cache capacities in any design of on-demand streaming systems—from a theoretical perspective.

Rather than designing specific prefetching or caching strategies (e.g., [9]), we analytically characterize and understand the fundamental problem behind caching and prefetching designs based on sound stochastic queuing analysis. Rather than optimizing prefetching from an individual peer’s viewpoint (e.g., [10]), we investigate optimal caching and prefetching as a global optimization problem to balance the system-wide demand and supply. Similarly, a cache redundancy model is developed in [8] to achieve the optimal distribution of media segment replicas across the system, under the assumption of a Zipf-like distribution of segment popularity and the storage constraint of peer caches. In comparison, our model not only has no a priori assumption on segment popularity distributions, but also cohesively incorporates both storage and upload bandwidth constraints at peers.

While data replication and caching have been extensively studied in P2P file sharing and distributed storage systems to satisfy search queries and download demand (e.g., [11]), how to adapt existing replication schemes to peer-assisted on-demand streaming systems is still an open problem. Our model and performance analysis, which intrinsically consider the timing and bandwidth requirements in peer-assisted on-demand streaming, can reveal the fundamental principle of a family of prefetching and caching strategies. We show that simple heuristics used in recent real-world systems essentially lie in this family, and can potentially achieve comparable performance to optimal strategies.

In addition, there have been extensive prior studies on conventional web caching for reducing user access latencies to text-based web content. Different from web traffic caching under the widely accepted Zipf-like access pattern (e.g., [12]), the reference locality in many media workloads is shown to be less skewed than that of web objects [13]. Accordingly, we have evaluated peer-assisted caching designs under both the classical Zipf and the recently observed stretched exponential (SE) distribution [14] to obtain complementary understanding. This demonstrates that the performance of caching in peer-assisted on-demand streaming systems is distinctively affected by the two representative access patterns, which is consistent with the implications from [14]. Also, server-based content delivery networks (CDNs) have been deployed to replicate media content across the Internet to move the content close to the viewers. For instance, it is shown that the popularity of live media programs hosted by the Akamai CDN exhibits a two-mode Zipf distribution [15]. Such server-based solutions can improve media streaming performance, yet they suffer from prohibitive cost [8]. In contrast, we focus on cost-effective peer-assisted on-demand streaming, with mitigated server bandwidth costs demonstrated by our analysis.

3 DEMAND CHARACTERIZATION AND SERVICE PRIORITIZATION

3.1 Characterizing Requests for Segments

We begin by stating our model of a peer-assisted on-demand streaming system providing a rich repository of video files, each encoded with a constant bit rate. While different videos can have distinct lengths, each of them is divided into a number of fixed-length segments, and the total number of segments is $V$. We assume that there exists a pool of servers, with an aggregate upload capacity $U_s$, deployed by the service provider, and there are $N$ peers participating in the system in steady state. Let $u_i$ denote the upload capacity of peer $i$, $\forall i \in N$. Then, the total uploading bandwidth resource in the system is $C = U_s + \sum_{i \in N} u_i$. Without loss of generality, we assume that time $t$ is slotted and each time unit corresponds to the amount of time for a peer to play back one segment. Given a fixed segment size as mentioned earlier, the total system upload capacity $C$ can be normalized as $c$, in terms of the number of segments that can be served per unit time, henceforth referred to as $c$ service units.

As peers play back video segments over time, they will also request segments from the system, which are represented by the total arrival rate of segment requests $\lambda$. Different from live video streaming, there is a unique and critical differentiation with respect to requests in on-demand streaming, which can be represented in a taxonomy of three categories:

. Urgent playback requests that are used to meet immediate buffering and playback needs after user interactions, such as initial startup of a video stream, and random seek activities.

. Normal playback requests that are used to maintain a buffer of segments in the near future, so that a smooth playback quality can be sustained within the vicinity of the current playback point.

. Prefetching requests that are used to prefetch certain desirable segments beyond the vicinity of the current playback point (and may even be in a different video) for local caching. Such prefetching requests are made with the hope that the prefetched segments can be used in the future to serve other peers, so that server bandwidth costs will be mitigated (as we shall elaborate in Section 4).

Each segment request needs to be satisfied by one of the available service units (i.e., uploading bandwidth resources from either peers or servers), in order to satisfy the demand from peers.

3.2 Prioritizing Services: Priority Queuing Analysis

To guarantee service quality and user experience, we advocate that when requests for segments—on the “demand” side—are served, they should be prioritized in the order of their urgencies, with three priority types $J = \{3, 2, 1\}$ for urgent playback requests (with arrival rate $\lambda_3$), normal playback requests (with arrival rate $\lambda_2$), and prefetching requests (with arrival rate $\lambda_1$), respectively, where $\sum_{j=1}^{3} \lambda_j = \lambda$. In other words, requests belonging to a type $j = 3, 2, 1$ are associated with high priority, medium priority, and low priority, respectively, while requests within the same priority type will be served with the First-Come-First-Served (FCFS) policy.

As illustrated in Fig. 1, the entire system is abstracted as a service station consisting of $c$ identical parallel service units with the mean service rate $\mu = 1$ each (i.e., one segment per unit time). Note that though peers can have heterogeneous upload capacities, our model can mitigate such a complexity by transforming the overall uploading bandwidth resource of the system into a number of homogeneous service units, under the assumption of heavy workloads in such on-demand streaming services that tend to fully exploit the limited pool of bandwidth resources. Essentially, we abstract peer-assisted on-demand streaming with service prioritization as a multiserver multipriority classes model, represented by Kendall’s notation [16] as $M/M/c-PR$, where segment requests arrive with a Poisson process and service times follow an exponential distribution, $c$ is the number of parallel service units, and $PR$ specifies the queuing discipline as Priority, with the system utilization factor $\rho = \lambda/(c\mu) \in [0, 1)$ for steady-state analysis. Our considerations include the following aspects:

. While still allowing peers to have heterogeneous capacities in original practical systems, our model can analyze the system performance through classical queuing models with homogeneous service units to yield tractable analysis. Otherwise, an asymmetric queuing system with heterogeneous service units makes it hard to obtain tractable results, or its analysis is restricted to special cases such as $M/M/2$ systems [16].

. Since a peer (or server) can correspond to multiple service units depending on its upload capacity, our model can also cover the typical scenario where a peer (or server) is simultaneously serving multiple segments and its uploading bandwidth is fairly shared between these concurrent transmissions. For practical peer-assisted systems wherein peers usually cooperate with multiple partners, such a Processor Sharing (PS) policy is an adequate model of fair sharing between concurrent TCP connections [3].

. Note that each service unit essentially represents the average upload capacity of serving one segment per unit time. Due to practical factors such as network congestion and bandwidth fluctuation, there could be variations in serving segments. This implies that the service time of each service unit should be a random variable, rather than a deterministic constant that would be too idealized. To this end, we assume that the service time of each service unit follows an exponential distribution with mean service rate $\mu = 1$. This is considered to be a reasonable assumption and typically used in existing analytical studies of peer-assisted systems [17]. On the other hand, even if one prefers to assume a deterministic service time, our model can be easily amended to an $M/D/c$ system.

. We mainly focus on the nonpreemptive mode for serving requests, though our model can also be amended to accommodate a preemptive mode.

Fig. 1. A conceptual queuing system consisting of multiple parallel service units in response to the segment requests sent from peers, with the same aggregate upload capacity as the original peer-assisted on-demand streaming system.

To analyze and justify the superiority of service prioritization, we next use representative performance metrics of our queuing model to characterize the service quality and user experience in on-demand streaming systems, including the playback fluency and the latency after user interaction (e.g., initial startup and random seeks).

3.2.1 Playback Fluency

Formally, given a certain number of segments, $d$, that are buffered ahead of the real-time playback point, the buffer is drained at the playback rate (i.e., one segment per unit time), while fed by subsequent segments as a peer continuously views a video. Hence, the playback fluency can be reflected by the probability $\Pr((L-d)W \le d)$, where $W$ is the waiting time for a segment request to obtain service in steady state, and $L$ is the time length for which the peer has been viewing the video. For example, $L$ can be up to the video length. Letting a factor $\beta = \frac{d}{L}$ represent the relative playback buffer length, we have

$$\Pr((L-d)W \le d) = \Pr\left(W \le \frac{\beta}{1-\beta}\right), \quad (1)$$

where $\Pr(W \le \frac{\beta}{1-\beta})$ is essentially the waiting time distribution function for a segment request, $\Pr(W \le \tau)$, with a desired delay bound $\tau = \frac{\beta}{1-\beta}$. Intuitively, a higher level of $\Pr(W \le \tau)$ would imply a timely and sustained playback buffer, and thus a fluent viewing experience. Specifically, an extreme exercise with $\beta = 0$ and $\Pr(W = 0)$ can represent the most stringent timing requirement, i.e., if the first segment in the playback buffer closest to the playback point is missing.

Unfortunately, in our model of peer-assisted on-demand streaming with service prioritization, the exact waiting time distributions in the corresponding multiserver multipriority classes model $M/M/c-PR$ are far less tractable, and most existing analytical studies are restricted to only two priority classes (e.g., [18]). Instead, we resort to the following simple approximation of the waiting time distribution for multipriority classes systems, with both theoretical support [19] and simulation-based observations [20]. Such a reasonable approximation is useful to estimate the trend of $\Pr(W_j \le \tau)$, $\forall j \in J$, and facilitate our later comparison with traditional nonprioritization systems, especially under heavy workloads (i.e., $\rho \to 1$):

$$\Pr(W_j \le \tau) \approx 1 - \rho e^{-\rho \tau / \overline{W}_j}, \quad \tau > 0, \quad (2)$$

where $\overline{W}_j$ is the mean waiting time for $j$th priority segment requests in steady state, as given below:

$$\overline{W}_j = \frac{\frac{(c\rho)^c}{c!(1-\rho)} p_0}{c\mu(1-\sigma_j)(1-\sigma_{j+1})}, \quad (3)$$

where $\sigma_j = \sum_{i=j}^{|J|} \rho_i$, $\rho_j = \lambda_j/(c\mu)$, $\rho = \sum_{j \in J} \rho_j$, and

$$p_0 = \left[\sum_{k=0}^{c-1} \frac{(c\rho)^k}{k!} + \frac{(c\rho)^c}{c!} \frac{1}{1-\rho}\right]^{-1}$$

is the state probability of 0 jobs in the system. Note that though we focus on three priority types (i.e., $J = \{1, 2, 3\}$) according to on-demand streaming features, (3) is also general for $M/M/c-PR$ models with any number of priority types [16].

3.2.2 Latency after User Interaction

Another performance concern in peer-assisted on-demand streaming is the latency to retrieve segments after initial startups or random seeks. When a peer joins a video or seeks to a new playback position, at least a minimum buffer of segments, up to the playback buffer size $d$, needs to be requested and downloaded before the video starts playback. Such a latency can be modeled as a function $D(d, R)$, depending on the playback buffer size $d$ and the retrieval time $R$ for a segment. Specifically, we use the following monotonically increasing function as the expected latency:

$$D(d, R) = \theta d R, \quad \theta \in \left[\frac{1}{d}, 1\right], \quad (4)$$

where $R = \overline{W} + 1/\mu$ is the mean retrieval time for a segment in steady state, and $\theta$ is a factor depending on how many segment requests within the buffer are served in parallel by the system. Alternatively, $\theta$ can also be interpreted as an acceptable buffer level for startups. For example, a buffer level of 75 percent is empirically regarded as satisfactory in UUSee [21]. Note that $D(d, R)_{\min} = R$ essentially places a lower bound on the achievable startup delay [6] or seek latency, since at least the segment closest to the playback (or seek) position needs to be downloaded. Intuitively, a smaller value of $R$ would imply a shorter latency to retrieve segments.

For our model of peer-assisted on-demand streaming with service prioritization, from (3), the lower bound of expected latency $D_j(d, R_j)_{\min} = R_j$ for different priority types $j \in J$ under the $M/M/c-PR$ model is given as

$$R_j = \frac{\frac{(c\rho)^c}{c!(1-\rho)} p_0}{c\mu(1-\sigma_j)(1-\sigma_{j+1})} + \frac{1}{\mu}, \quad (5)$$

where $\sigma_j = \sum_{i=j}^{|J|} \rho_i$, and $p_0$ is given earlier.

According to (4) and (5), we have $D_j(d, R_j) \ge R_j \ge 1/\mu$, $\forall j \in J$. This is also true in real-world systems, since it at least takes a certain amount of time to download the target segments after a random seek or an initial startup, even if such requests can be served immediately.

3.3 Statistical Guarantees with Service Prioritization

We are now ready to justify and demonstrate how service quality and user experience can be statistically guaranteed with our service prioritization principle (henceforth referred to as prioritization systems). Specifically, to establish a baseline performance reference point, we also include a comparison with traditional on-demand streaming systems without service prioritization (henceforth referred to as nonprioritization systems). In nonprioritization systems, all segment requests from downstream peers to the serving peers and servers will be treated identically. Essentially, this corresponds to $M/M/c-FCFS$ queuing models [16], and the corresponding performance measures $\Pr(W_{FCFS} \le \tau)$ and $R_{FCFS}$ can be represented by the waiting time distribution and mean sojourn time of a job in steady state, respectively:

$$\Pr(W_{FCFS} \le \tau) = \begin{cases} 1 - \dfrac{(c\rho)^c}{c!(1-\rho)} p_0, & \tau = 0, \\[2mm] 1 - \dfrac{(c\rho)^c}{c!(1-\rho)} p_0 e^{-c\mu(1-\rho)\tau}, & \tau > 0, \end{cases}$$

$$R_{FCFS} = \overline{W}_{FCFS} + \frac{1}{\mu}, \quad (6)$$

where $\overline{W}_{FCFS} = \left[\frac{1}{c\mu(1-\rho)}\right]\left[\frac{(c\rho)^c}{c!(1-\rho)} p_0\right]$ is the mean waiting time of a job in the corresponding $M/M/c-FCFS$ system in steady state, and $p_0 = \left[\sum_{k=0}^{c-1} \frac{(c\rho)^k}{k!} + \frac{(c\rho)^c}{c!} \frac{1}{1-\rho}\right]^{-1}$ is the state probability of 0 jobs in the corresponding $M/M/c-FCFS$ system.

First, we have the following relationship with respect to the latency performance.

Theorem 1. Given a limited system upload capacity, prioritization systems with three priority types $J = \{3, 2, 1\}$ for urgent playback, normal playback, and prefetching can 1) guarantee to improve the service quality and user experience for urgent requests (e.g., random seeks and initial startup); 2) conditionally improve the normal playback fluency; and 3) at a cost of deferring prefetching requests with a conditionally bounded proportion to the performance of nonprioritization systems, as characterized by the following relationship:

$$R_3 < R_{FCFS} < R_1, \quad \text{for } \rho \in [0, 1); \quad (7)$$

$$R_2 \le R_{FCFS}, \quad \text{for } \frac{1-\rho}{(1-\rho_2-\rho_3)(1-\rho_3)} \le 1; \quad (8)$$

$$R_2 > R_{FCFS}, \quad \text{for } \frac{1-\rho}{(1-\rho_2-\rho_3)(1-\rho_3)} > 1; \quad (9)$$

$\frac{\overline{W}_1}{\overline{W}_{FCFS}}$ remains constant as $\rho$ increases, when $\lambda_1$ increases while $\lambda_2$ and $\lambda_3$ remain constant.

Proof. First, based on the conservation law, we have $W_3 < W_2 < W_1$, $\forall \rho \in [0, 1)$. By simply adding the mean service time $1/\mu$, we have $R_3 < R_2 < R_1$, $\forall \rho \in [0, 1)$.

Furthermore, to compare $R_{FCFS}$ and $R_j$, $\forall j \in J$, we need to compare the corresponding $W_{FCFS}$ and $W_j$, $\forall j \in J$, as follows:

$$
\frac{W_j}{W_{FCFS}} = \frac{1-\rho}{(1-\sigma_j)(1-\sigma_{j+1})}
$$

(by (3) and (6)), where $\sigma_j = \sum_{i=j}^{|J|} \rho_i$ (with $\sigma_{|J|+1} = 0$).

Hence, for high priority $j = 3$, we have

$$
\frac{W_3}{W_{FCFS}} = \frac{1-\rho}{1-\sigma_3} = \frac{1-\rho}{1-\rho_3} < 1 \quad (\text{typically } \rho_2 + \rho_1 > 0). \qquad (10)
$$

Then, for low priority $j = 1$, we have

$$
\frac{W_1}{W_{FCFS}} = \frac{1-\rho}{(1-\sigma_1)(1-\sigma_2)} = \frac{1}{1-\rho_2-\rho_3} > 1. \qquad (11)
$$

Combining (10) and (11) gives $R_3 < R_{FCFS} < R_1$, $\forall \rho \in [0, 1)$.

In addition, (11) also indicates that $\frac{W_1}{W_{FCFS}}$ remains constant as $\rho$ increases, when $\rho_2$ and $\rho_3$ remain constant while $\rho_1$ increases.

Finally, for medium priority $j = 2$, we have

$$
\frac{W_2}{W_{FCFS}} = \frac{1-\rho}{(1-\sigma_2)(1-\sigma_3)} = \frac{1-\rho}{(1-\rho_2-\rho_3)(1-\rho_3)}. \qquad (12)
$$

From this, we can obtain the conditions (8) and (9) for $R_2 \le R_{FCFS}$ and $R_2 > R_{FCFS}$, respectively. $\square$
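The ratios in the proof are easy to sanity-check numerically. The sketch below (our own helper, with class 3 as the highest priority) verifies (10), (11), and the invariance of $W_1/W_{FCFS}$ in $\rho_1$:

```python
def wait_ratio(j, rhos):
    """W_j / W_FCFS from the proof of Theorem 1.

    rhos maps priority class -> utilization (class 3 is highest priority);
    sigma_j = sum of rho_i over i >= j, with sigma_{|J|+1} = 0.
    """
    rho = sum(rhos.values())
    sigma = lambda k: sum(r for i, r in rhos.items() if i >= k)
    return (1.0 - rho) / ((1.0 - sigma(j)) * (1.0 - sigma(j + 1)))

rhos = {1: 0.3, 2: 0.4, 3: 0.2}          # total utilization rho = 0.9
assert wait_ratio(3, rhos) < 1.0          # eq. (10): urgent requests wait less
assert wait_ratio(1, rhos) > 1.0          # eq. (11): prefetching is deferred
# W1/W_FCFS = 1/(1 - rho2 - rho3) stays constant when only rho1 grows:
assert abs(wait_ratio(1, rhos) - wait_ratio(1, {1: 0.05, 2: 0.4, 3: 0.2})) < 1e-12
```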

Remark. Our model with Theorem 1 not only characterizes the unique and critical differentiation in three distinct request classes in peer-assisted on-demand streaming systems, but also qualitatively instructs how limited uploading bandwidth resources can be effectively utilized through service prioritization, so as to provide statistical guarantees on service quality and user experience. This aims to provide a theoretical underpinning of peer-assisted on-demand streaming systems, which is complementary to existing heuristic-based studies in this area.

3.4 Service Prioritization: Performance Evaluation

To more clearly demonstrate the superiority of service prioritization, we perform a series of performance analyses. Fig. 2 plots the mean retrieval times for high-priority, medium-priority, and low-priority requests in prioritization systems compared to the mean retrieval time in nonprioritization systems. As the system utilization factor $\rho$ increases, the performance of traditional nonprioritization systems degrades as a whole, which implies unsatisfactory service quality and user experience, especially for urgent requests (e.g., random seeks and initial startup) and normal playback. This is due to the heavy competition among all segment requests for a limited pool of uploading bandwidth resources in the system, and the passive resource allocation mechanism that is unaware of service quality and user experience.

In contrast, under the same system capacity and utilization factor, prioritization systems can effectively allocate a limited pool of uploading bandwidth resources, and successfully maintain a nearly constant level of

LIU ET AL.: PEER-ASSISTED ON-DEMAND STREAMING: CHARACTERIZING DEMANDS AND OPTIMIZING SUPPLIES 355

Fig. 2. Mean retrieval times for high-priority, medium-priority, and low-priority requests in prioritization on-demand streaming systems compared to the mean retrieval times in traditional nonprioritization systems, along with the increase of the system utilization factor.

performance for high-priority and medium-priority requests, which implies timely responses for random seeks and initial startup, as well as fluent playback, albeit at the cost of deferring prefetching requests that are not expected to directly affect user viewing experiences.

Next, we quantitatively examine the performance with respect to playback fluency. Fig. 3 plots the waiting time distribution for high-priority, medium-priority, and low-priority segment requests in prioritization systems, compared to traditional nonprioritization systems, under a heavy system utilization factor $\rho = 0.975$. Specifically, we examine a representative range of the normalized delay bound $\tau = d/L \in [0.01, 0.1]$, where $d$ denotes the playback buffer length and $L$ is set as an expected video length of 1 hour.

We obtain the following insights: 1) As the desired delay bound $\tau$ for a segment request increases (i.e., as the playback buffer length $d$ grows relative to $L$), the waiting time probabilities $\Pr(W_{FCFS} \le \tau)$ and $\Pr(W_j \le \tau)$, $\forall j \in J$, all increase. This indicates that a longer playback buffer can help improve the playback fluency, which is consistent with empirical experience. 2) However, $\Pr(W_{FCFS} \le \tau)$ in nonprioritization systems is relatively low, which implies a higher probability of an interrupted viewing experience. In contrast, the playback fluency in prioritization systems (i.e., $\Pr(W_2 \le \tau)$) grows rapidly, and can even approach 1 with more adequate buffer lengths. This clearly demonstrates that prioritization systems can offer better service quality and user experience than traditional nonprioritization systems. 3) The performance for high-priority requests in prioritization systems is even better, with $\Pr(W_3 \le \tau)$ quickly approaching 1, which implies timely responses for urgent requests such as initial startup and random seeks. On the other hand, the waiting time for low-priority prefetching requests could become correspondingly longer due to the conservation law. Since prefetching requests beyond the playback buffer mainly aim to cache and serve future demand, a relatively longer delay for such requests is acceptable in practice, given that the service qualities for normal playback and urgent demands are statistically guaranteed.

Further, under the same setting of $\tau = 0.06$, Fig. 4 compares the waiting time distributions of segment requests. We observe that as we increase $\rho$ by allowing more prefetching requests, the playback fluency in nonprioritization systems could degrade significantly. In contrast, the playback fluency in prioritization systems can be maintained at a high level.

4 OPTIMIZING PREFETCHING AND CACHING UNDER A STOCHASTIC SUPPLY-DEMAND MODEL

To jointly consider requests that make demands for content and prefetching that augments supplies of content, we further proceed to analyze the prefetching and caching designs in peer-assisted on-demand streaming systems as a global optimization problem, under the following stochastic supply-demand model:

. Demand. Assume the system-wide normal playback requests for segments arrive as a Poisson process of rate $\lambda_2$, and the access probabilities of segments are denoted as $(a_1, a_2, \ldots, a_V)$, which satisfies $\sum_{i=1}^{V} a_i = 1$. Without loss of generality, we assume $a_i \ge a_j$, $\forall i < j$. Hence, the expected amount of requests for segment $i$ per unit time is $\lambda_2 a_i$.

. Supply. Assume that each peer in the system maintains a local cache of limited size, up to $B$ segments. The system-wide caching state of segments at time $t$ is denoted as $X(t) = (x_1(t), x_2(t), \ldots, x_V(t))$, where $x_i(t) \le N$ represents the number of peers holding segment $i$ in their local caches at time $t$. Without loss of generality, we assume peer local caches will be completely filled as time progresses, and hence $\sum_{i=1}^{V} x_i(t) = NB$, $\forall t$. To fully utilize their local caches to serve one another, peers can proactively prefetch certain segments to their local caches, and replace certain segments due to the limited cache size. We are interested in the optimal prefetching and cache replacement strategies to produce the optimal system-wide cache state, so that the system-wide normal playback demand can be satisfied by peers holding those segments, rather than by servers.

As peers cache and prefetch segments, each segment is associated with a certain service capacity, denoted as $(o_1(t), o_2(t), \ldots, o_V(t))$, in terms of the number of segments that can be served by peers per unit time. Generally, we assume $\sum_{i=1}^{V} o_i(t) = \varepsilon C_p$, where $C_p = \sum_{j=1}^{N} u_j$ is the total uploading capacity of peers, and $\varepsilon \in (0, 1]$ represents the efficiency of utilizing peer uploading bandwidth. Intuitively, the larger the number of peers holding segment $i$, the larger the capacity that exists to serve this segment. To facilitate our analysis, we use a proportional function for the expected service capacity of a segment, $o_i(t) = \left(\frac{x_i(t)}{NB}\right) \varepsilon C_p$, which is considered to be reasonable and typically assumed in existing modeling studies of peer-assisted systems [22].

356 IEEE TRANSACTIONS ON COMPUTERS, VOL. 62, NO. 2, FEBRUARY 2013

Fig. 3. Waiting time distribution for high-priority, medium-priority, and low-priority requests in prioritization on-demand streaming systems compared to traditional nonprioritization systems, under a heavy system utilization factor $\rho = 0.975$.

Fig. 4. Waiting time probability (with a desired $\tau = 0.06$) for high-priority, medium-priority, and low-priority requests in prioritization on-demand streaming systems compared to traditional nonprioritization systems, along with the increase of the system utilization factor.

Instead of directly optimizing the prefetching and cache replacement of segments (which would lead to an overwhelming number of variables), we first derive the desired optimal cache profile $X^*(t) = (x_1^*(t), x_2^*(t), \ldots, x_V^*(t))$ in response to a given system-wide normal playback demand, represented by $\lambda_2$ and $(a_1, a_2, \ldots, a_V)$. Then, by examining the discrepancy between $X^*(t)$ and the current peer cache state $X(t)$, we can determine which segments are relatively undersupplied (to be prefetched) or oversupplied (to be replaced), so as to provide guidelines and insights toward optimal prefetching and cache replacement strategies.

4.1 Stochastic Queuing Analysis

We mainly consider two stochastic evaluation modes [3] based on queuing theory: the waiting mode, wherein system performance is measured by the waiting time probability, and the blocking mode, wherein system performance is measured by the request blocking probability. In the waiting mode, our supply-demand model of each segment corresponds to an $M/M/1$ system, by assuming Poisson arrivals of normal playback requests for segment $i$ with mean rate $\lambda_2 a_i$, and exponentially distributed service times with mean rate $o_i(t)$. Then, the waiting time distribution for segment $i$ in our supply-demand model is given as

$$
\Pr(W_i \le \tau) =
\begin{cases}
1 - \dfrac{\lambda_2 a_i}{o_i(t)}, & \tau = 0, \\[4pt]
1 - \dfrac{\lambda_2 a_i}{o_i(t)}\, e^{(\lambda_2 a_i - o_i(t))\tau}, & \tau > 0.
\end{cases}
\qquad (13)
$$

Using (13), we can derive the optimal cache profile $X^*(t)$ by maximizing the expected amount of normal playback requests that can be served within a certain desired delay bound $\tau$, i.e., $\sum_{i=1}^{V} \lambda_2 a_i \Pr(W_i \le \tau)$. However, such a waiting mode intrinsically assumes $\rho_i = \frac{\lambda_2 a_i}{o_i(t)} < 1$, $i = 1, 2, \ldots, V$, for the system to be stable. Such an assumption may not always hold across all segments in the system, as the demand for certain highly popular segments could possibly exceed their corresponding service capacities in the system. Hence, we consider the blocking mode, which is free of such a restriction.

In the blocking mode, our supply-demand model of segment $i$ corresponds to a Processor Sharing system $M/M/1/K_i$-PS with a finite queuing capacity $K_i$ representing the allowed number of simultaneous downloads for segment $i$, which depends on the service capacity $o_i(t)$. For simplicity, we can let $K_i = \lceil o_i(t) \rceil$ so that each concurrent transmission of a segment is roughly served at the playback rate (i.e., one segment per unit time). If a request for segment $i$ arrives when $K_i$ is saturated, it is dropped and redirected to the servers. Insensitivity results [23] reveal that the blocking probability $P_{b_i}$ for an $M/M/1/K_i$-PS system is identical to that of the corresponding $M/M/1/K_i$ system, as given below:

$$
P_{b_i} = \frac{\rho_i^{\lceil o_i(t) \rceil} (1 - \rho_i)}{1 - \rho_i^{\lceil o_i(t) \rceil + 1}}, \qquad (14)
$$
$$
\rho_i = \frac{\lambda_2 a_i}{o_i(t)} = \frac{\lambda_2 a_i NB}{\varepsilon C_p x_i(t)}, \quad i = 1, 2, \ldots, V. \qquad (15)
$$
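Equations (14)-(15) can be evaluated directly per segment. The sketch below (the helper name and parameter values are ours) also handles the degenerate case of an uncached segment, for which every request is blocked:

```python
import math

def blocking_prob(lam2, a_i, x_i, N, B, eps, Cp):
    """Blocking probability P_bi of segment i under eqs. (14)-(15).

    o_i = (x_i/(N*B)) * eps * Cp is the segment's service capacity,
    and K_i = ceil(o_i) is the finite queuing capacity.
    """
    o_i = (x_i / (N * B)) * eps * Cp
    if o_i == 0.0:
        return 1.0  # segment cached by no peer: every request is blocked
    rho_i = lam2 * a_i / o_i
    K = math.ceil(o_i)
    if abs(rho_i - 1.0) < 1e-12:
        return 1.0 / (K + 1)  # limit of (14) as rho_i -> 1
    return rho_i**K * (1.0 - rho_i) / (1.0 - rho_i**(K + 1))
```

Note that the same expression covers both the surplus ($\rho_i < 1$) and deficit ($\rho_i > 1$) regimes, which is precisely why the blocking mode needs no stability assumption.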

As such, since a blocking mode queuing model intrinsically does not need to assume $\rho_i < 1$ for the system to be stable [16], it can be used to evaluate the performance of peer-assisted on-demand streaming systems under any supply-demand conditions across all segments, i.e., either deficit ($\rho_i > 1$) or surplus ($\rho_i < 1$).

To minimize server cost, we essentially need to maximize the system-wide effective throughput over all segments (i.e., the total number of unblocked segment requests served by peers):

$$
\text{Maximize} \quad \sum_{i=1}^{V} \lambda_2 a_i (1 - P_{b_i}), \qquad (16)
$$
$$
\text{Subject to:} \quad \sum_{i=1}^{V} x_i(t) = NB, \quad \forall t, \qquad (17)
$$
$$
0 \le x_i(t) \le N, \quad i = 1, 2, \ldots, V, \qquad (18)
$$
$$
x_i(t) \in \mathbb{N}, \quad i = 1, 2, \ldots, V, \quad \forall t, \qquad (19)
$$

where $\sum_{i=1}^{V} a_i = 1$. With a specific amount of demand $\lambda_2$, the problem above can be transformed to minimizing the system-wide blocking probability. Differing from the cache hit ratio derived in a relevant cache redundancy model [8] with the constraint of the total cache size, our model and performance metric cohesively incorporate the constraints on both the upload bandwidth and storage of peers. The rationale is that even if a segment is cached by peer(s), there is no guarantee that the currently available bandwidth from the peer(s) is sufficient to satisfy the video playback rate.

$$
\text{Minimize} \quad \sum_{i=1}^{V} a_i P_{b_i} = \sum_{i=1}^{V} \frac{a_i \rho_i^{\lceil o_i(t) \rceil} (1 - \rho_i)}{1 - \rho_i^{\lceil o_i(t) \rceil + 1}} \qquad (20)
$$
$$
\text{Subject to:} \quad \text{constraints } (17), (18), (19).
$$

This is a nonlinear integer programming problem, which is NP-hard in general, and even harder to solve than integer linear programming problems [24]. More specifically, by plugging (15) into (20), the structure of the objective function can be viewed as a sum-of-ratios problem in fractional programming, which remains a hard problem in the global optimization literature [25]. Though it is mathematically challenging to reach global optimality, our construction of (20) analytically characterizes critical factors that are general to any caching and prefetching designs in peer-assisted on-demand streaming systems.

Due to the intractability of solving the general optimization problem, we seek to derive near-optimal solutions that reveal helpful insights. To simplify the problem, we utilize a reasonable upper bound on the system-wide blocking probability.

Lemma 1. Under any given cache profile $X(t)$, the system-wide blocking probability has the following corresponding upper bound:

$$
\sum_{i=1}^{V} a_i P_{b_i} \le \sum_{i=1}^{V} \frac{a_i \rho_i}{1 + \rho_i}. \qquad (21)
$$

Proof. By limiting the queuing capacity of each segment to $K_i = 1$, $i = 1, 2, \ldots, V$, our supply-demand model of each segment $M/M/1/K_i$-PS becomes the corresponding $M/M/1/1$-PS system. This essentially imposes more stringent timing requirements for segment requests: such a limited queuing capacity results in lower chances for segment requests to wait, which alternatively means that segment requests become less tolerant to delay.

Specifically, the blocking probability of segment $i$ under $M/M/1/1$-PS becomes $\rho_i/(1+\rho_i)$. Then, we have the following relationship for both $\rho_i < 1$ and $\rho_i \ge 1$:

$$
\frac{a_i \rho_i}{1 + \rho_i}
= \frac{a_i \rho_i^{\lceil o_i(t) \rceil} (1 - \rho_i)}{\rho_i^{\lceil o_i(t) \rceil - 1} - \rho_i^{\lceil o_i(t) \rceil + 1}}
\ge \frac{a_i \rho_i^{\lceil o_i(t) \rceil} (1 - \rho_i)}{1 - \rho_i^{\lceil o_i(t) \rceil + 1}}
= a_i P_{b_i}.
$$

Note that given any cache profile $X(t)$, it is possible that certain segments may not be cached by any peers in the system, i.e., $\exists i$, $x_i(t) = 0$ and $o_i(t) = 0$. In such cases, we have both $\rho_i/(1+\rho_i) = 1$ and $P_{b_i} = 1$, and hence the above still holds.

Then, since the above relationship holds for any segment $i = 1, 2, \ldots, V$, the summation over all segments gives (21). Specifically, when $x_i(t) = 0$, $\forall i$, both sides of (21) degenerate to 1, which also captures traditional server-based on-demand streaming systems without peer assistance, where all segment requests need to be satisfied by the servers. $\square$
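The per-segment inequality in the proof can be spot-checked numerically across both regimes. In this quick sketch (our own helper), the $M/M/1/1$ blocking probability $\rho/(1+\rho)$ dominates the $M/M/1/K$ value for every $K \ge 1$:

```python
def pb_mm1k(rho, K):
    """M/M/1/K blocking probability, i.e., eq. (14) with K = ceil(o_i)."""
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (K + 1)  # limit as rho -> 1
    return rho**K * (1.0 - rho) / (1.0 - rho**(K + 1))

# rho/(1+rho) upper-bounds P_bi in both the surplus (rho < 1) and
# deficit (rho > 1) regimes, with equality at K = 1:
for rho in (0.2, 0.8, 1.0, 1.5, 3.0):
    for K in (1, 2, 5, 20):
        assert rho / (1.0 + rho) >= pb_mm1k(rho, K) - 1e-12
```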

Since such an upper bound preserves the relevant factors in problem (20) while being more reflective of stringent timing requirements, it is reasonable to simplify problem (20) by minimizing this upper bound.

Theorem 2. In response to a given system-wide normal playback demand represented by $\lambda_2$ and $(a_1, a_2, \ldots, a_V)$, to minimize the upper bound of the system-wide blocking probability given by Lemma 1, the optimal cache profile $X^*(t)$ should match the demand as $x_i^*(t) = NB a_i$, $i = 1, 2, \ldots, V$.

Proof. To minimize the upper bound of the system-wide blocking probability, we need to solve the following optimization problem:

$$
\text{Minimize} \quad \sum_{i=1}^{V} \frac{a_i \rho_i}{1 + \rho_i} \quad \text{Subject to:} \quad \text{constraints } (17), (18), (19).
$$

By relaxing the integer constraints (19) and the inequality constraints (18) (this does not undermine the optimality of the solution, as discussed later), we can use a Lagrangian multiplier and the Karush-Kuhn-Tucker conditions [26] to solve it:

$$
\Phi = \sum_{i=1}^{V} \frac{a_i \rho_i}{1 + \rho_i} + \omega \left( \sum_{i=1}^{V} x_i(t) - NB \right),
$$
$$
\frac{\partial \Phi}{\partial x_i(t)} = \omega - \frac{\lambda_2 a_i^2 NB\, \varepsilon C_p}{(\varepsilon C_p x_i(t) + \lambda_2 a_i NB)^2} = 0,
$$

which gives $x_i^*(t) = \left(\frac{a_i}{\varepsilon C_p}\right) \sqrt{\frac{\lambda_2 NB\, \varepsilon C_p}{\omega}} - \frac{\lambda_2 a_i NB}{\varepsilon C_p}$. Then, through the equality constraint (17), we can obtain the optimal cache profile $x_i^*(t) = NB a_i$, $i = 1, 2, \ldots, V$. $\square$
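Theorem 2 can be corroborated on a small instance by comparing the Lemma 1 objective at $x_i^* = NB a_i$ against random feasible profiles of the relaxed problem (constraint (18) dropped, as in the proof). The sketch below uses arbitrary parameter values of our own choosing:

```python
import random

def upper_bound(x, a, lam2, N, B, eps, Cp):
    """Lemma 1 objective sum_i a_i * rho_i/(1 + rho_i) for cache profile x."""
    total = 0.0
    for a_i, x_i in zip(a, x):
        if x_i == 0:
            total += a_i  # rho_i -> infinity: the term tends to a_i
        else:
            rho_i = lam2 * a_i * N * B / (eps * Cp * x_i)
            total += a_i * rho_i / (1.0 + rho_i)
    return total

N, B, eps, Cp, lam2 = 10, 3, 0.8, 50.0, 30.0
a = [0.5, 0.3, 0.2]
x_star = [N * B * a_i for a_i in a]  # demand-matching profile [15, 9, 6]
best = upper_bound(x_star, a, lam2, N, B, eps, Cp)
random.seed(0)
for _ in range(200):
    cuts = sorted(random.sample(range(1, N * B), 2))
    x = [cuts[0], cuts[1] - cuts[0], N * B - cuts[1]]  # random split of NB
    assert upper_bound(x, a, lam2, N, B, eps, Cp) >= best - 1e-12
```

Since each term of the objective is convex in $x_i$, the KKT point $x_i^* = NB a_i$ is a global minimum over the relaxed feasible set, which is what the random search confirms.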

Remark. Though the optimal cache profile above does not guarantee the inequality constraint $x_i(t) \le N$, $i = 1, 2, \ldots, V$, it is still a "good" approximation. According to the heavy tail effect of content popularity observed in real-world on-demand streaming systems [27], a considerable portion of $x_i^*(t)$ in the optimal cache profile could satisfy $x_i^*(t) = NB a_i \le N$. For those highly popular segments resulting in $x_i^*(t) = NB a_i > N$, we can simply adjust their corresponding $x_i^*(t) = N$, and shift the surplus portion to replicate other less popular segments. Such a near-optimal cache profile does not undermine the optimality of the solution. It conveys the fundamental principle for caching and prefetching designs in peer-assisted on-demand streaming systems: optimal prefetching and caching strategies should help the peer cache state evolve toward the optimal cache profile, which tends to balance the system-wide demand and supply.

Our aforementioned insights lead to a general framework of the optimal caching principle, as described in Algorithm 1. Rather than a specific heuristic, this abstracts a family of prefetching and caching strategies toward the optimal cache profile. For instance, a popularity-based caching heuristic in [9] estimates the segment popularity and peer cache state through a distributed averaging scheme [28], and the discrepancy between demand and supply is captured by the difference between the normalized segment popularity and the average number of replicas. Another weight-based replication heuristic in [1] relies on the tracking server to provide the availability-to-demand ratio (ATD) for each video, which captures the discrepancy between supply and demand in a different form. These heuristics, albeit with different schemes to capture the system-wide supply-demand relationship and its discrepancy, essentially follow the key principle of our framework to better match demand and supply. While neither of them bases its design upon sound theoretical foundations, our general framework based on stochastic queuing analysis and global optimization can be used to characterize and understand the essence of such prefetching and caching designs.

Algorithm 1. A General Framework of the Optimal Caching (with Prefetching) Principle

1: Obtain information on system-wide demand and supply (in either a centralized or distributed manner, as illustrated later):
   i) Estimate the segment popularity $a_i$, $i = 1, 2, \ldots, V$;
   ii) Estimate the peer cache state $X(t)$.
2: Examine the discrepancy between the peer cache state $X(t)$ and the optimal cache profile $X^*(t)$ in response to $a_i$, $i = 1, 2, \ldots, V$.
3: Balance the system-wide demand and supply according to the discrepancy:
   i) Preferentially prefetch the undersupplied segments (e.g., popular yet less cached in the system) to meet the demand;
   ii) Preferentially replace the oversupplied segments (e.g., unpopular yet excessively cached in the system).
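Steps 2 and 3 of the framework can be sketched in a few lines. The helper below is our own illustrative code, not taken from any deployed system; it ranks segments for prefetching or replacement by their gap to the optimal profile $x_i^* = NB a_i$:

```python
def rebalance_plan(a, x, N, B):
    """Rank segments for prefetching/replacement by their gap to x* = NB*a_i.

    a: estimated segment popularities, x: current replica counts per segment.
    Returns (prefetch, replace): segment indices, most urgent first.
    """
    x_star = [N * B * a_i for a_i in a]
    gap = [xs - xi for xs, xi in zip(x_star, x)]  # > 0: undersupplied
    prefetch = sorted((i for i, g in enumerate(gap) if g > 0),
                      key=lambda i: -gap[i])  # most undersupplied first
    replace = sorted((i for i, g in enumerate(gap) if g < 0),
                     key=lambda i: gap[i])    # most oversupplied first
    return prefetch, replace

# a popular but undercached segment is prefetched; unpopular,
# excessively cached ones are candidates for replacement:
prefetch, replace = rebalance_plan([0.5, 0.3, 0.2], [10, 10, 10], N=10, B=3)
```

A real strategy would additionally decide how many replicas to move per round and which peers perform each operation; those details are exactly where the heuristics in [1] and [9] differ while following the same principle.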

4.2 Performance Evaluation

We now quantitatively examine the optimal caching and prefetching principle represented by Theorem 2 and Algorithm 1 (henceforth referred to as the optimal), its corresponding upper bound in Lemma 1, as well as one representative weight-based replication heuristic used in a real-world system [1]. The original heuristic in [1] is primarily limited to cache replacement at the video level, without allowing prefetching. To perform a fair comparison, we modify it to operate at the segment level and extend it to allow active prefetching, also using hints based on the ATD information. Specifically, the lower the ATD of a segment, the higher the probability that it will be prefetched. In addition, we also include a randomized segment replication as a baseline performance benchmark, where peers randomly choose segments to replicate with equal probability.

First, in Fig. 5, we examine the performance of the aforementioned candidates in terms of the system-wide blocking probability versus the increase of the relative peer cache size $B/V$, under a heavy relative system-wide workload $\lambda_2/(\varepsilon C_p) = 0.9$ and a Zipf popularity distribution of segments, as typically observed in real-world Internet media workloads. We observe that: 1) The performance of all candidates improves as the relative peer cache size increases, which demonstrates the potential benefits of peer caching and prefetching to meet user demand by peers alone and thus offload the servers. 2) The optimal consistently outperforms the others. Meanwhile, by following the key principle of balancing the system-wide demand and supply, the heuristic shows performance comparable to the optimal. 3) The upper bound of the system-wide blocking probability under the optimal cache profile is insensitive to the relative peer cache size as long as the relative system-wide workload is fixed, which represents a conservative estimate of system performance.

Second, in Fig. 6, we further investigate the performance of the aforementioned candidates by varying the relative system-wide workload $\lambda_2/(\varepsilon C_p)$ from a moderate level of 0.8 to an intense level of 1.25, under a setting of $B/V = 10\%$.

We observe that: 1) The system-wide blocking probability notably increases as relatively more demand is imposed on the entire system. However, the optimal and the heuristic are still able to mitigate a substantial portion of the server load. 2) While the gap between the optimal and the heuristic is insignificant under moderate workloads, it becomes relatively more profound under heavy workloads. Meanwhile, the upper bound of the system-wide blocking probability under the optimal cache profile becomes relatively tighter under heavy workloads.

Third, since a recent measurement study has argued that access patterns in many media workloads could follow the stretched exponential (SE) distribution [29], which deviates from Zipf, we further exercise the aforementioned candidates under such access patterns in Fig. 7. Under the same settings as in Fig. 6, we observe that the performance of the optimal and the heuristic notably decreases compared to that in Fig. 6, and their performance margin over the randomized one becomes less substantial. This is essentially because the SE popularity distribution usually reflects a much longer measurement duration (e.g., weeks or even months), which results in a less skewed popularity distribution compared to Zipf, and thus potentially renders the caching benefit relatively restricted [29]. Nevertheless, we believe that the performance gain would be more profound over a shorter period of time and under larger cache sizes.

Finally, it is also essential to examine the system-wide performance with respect to the segment retrieval latency in our model. To this end, we define the system-wide segment retrieval latency as $F = \sum_{i=1}^{V} a_i (F_i^{nb} + F_i^{b})$, where $F_i^{nb} = (1 - P_{b_i}) w_i$ is the expected latency for unblocked requests for segment $i$, and $w_i$ is the mean sojourn time [16] of a job in the corresponding $M/M/1/K_i$ system based on Little's Law. In addition, $F_i^{b} = \Theta P_{b_i}$ is the expected latency for blocked requests for segment $i$, where $\Theta$ represents an expected tolerance threshold beyond which such blocked requests shall be satisfied by the servers as the last resort. Then, the system-wide relative retrieval latency can be obtained by normalizing $F$ by $\Theta$, since relative values yield more relevant insights. Figs. 8, 9, and 10 plot the system-wide relative segment retrieval latency under the same settings as in Figs. 5, 6, and 7, respectively.
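The mean sojourn time $w_i$ follows from the $M/M/1/K_i$ queue-length distribution via Little's Law. A sketch (our own helper, validated against the $M/M/1$ limit):

```python
def mm1k_sojourn(lam, mu, K):
    """Mean sojourn time of an *accepted* job in an M/M/1/K queue.

    Computes the queue-length distribution, the mean number of jobs L,
    and applies Little's Law with the effective arrival rate lam*(1 - Pb).
    """
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        probs = [1.0 / (K + 1)] * (K + 1)  # uniform at rho = 1
    else:
        norm = (1.0 - rho) / (1.0 - rho**(K + 1))
        probs = [norm * rho**n for n in range(K + 1)]
    pb = probs[K]                                 # blocking probability
    mean_jobs = sum(n * p for n, p in enumerate(probs))
    return mean_jobs / (lam * (1.0 - pb))         # L = lambda_eff * w

# with a large K and rho < 1 this approaches the M/M/1 value 1/(mu - lam):
assert abs(mm1k_sojourn(0.5, 1.0, 50) - 2.0) < 1e-6
```

Given $w_i$ and $P_{b_i}$, the metric $F = \sum_i a_i \big((1 - P_{b_i}) w_i + \Theta P_{b_i}\big)$ then follows by direct summation.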

Fig. 5. System-wide blocking probability versus the increase of relative peer cache size, under a heavy relative system-wide workload of $\lambda_2/(\varepsilon C_p) = 0.9$ and a Zipf popularity distribution of segments.

Fig. 6. System-wide blocking probability versus the increase of relative system-wide workload $\lambda_2/(\varepsilon C_p)$, under a relative peer cache size of $B/V = 10\%$ and a Zipf popularity distribution of segments.

Fig. 7. System-wide blocking probability versus the increase of relative system-wide workload $\lambda_2/(\varepsilon C_p)$, under a relative peer cache size of $B/V = 10\%$ and a stretched exponential distribution of segment popularity.

As the relative peer cache size increases in Fig. 8, the system-wide segment retrieval latencies of all the candidates notably decrease, which implies improved streaming performance under more adequate caching capacities. On the other hand, as shown in Fig. 9, the performance could degrade under excessive system-wide workloads, and the performance gap between the optimal and the heuristic expands. A similar trend is also observed under the SE popularity distribution in Fig. 10, yet the performance margin enjoyed by the optimal and the heuristic over the randomized alternative becomes less substantial, which is consistent with our previous discussions.

5 CONCLUDING REMARKS

Despite a large variety of heuristics proposed in existing peer-assisted on-demand streaming designs, there is a lack of sound theoretical principles and analytical insights to guide the design of two critical aspects: serving requests and prefetching. In this paper, we construct a new theoretical framework for peer-assisted on-demand streaming systems based on queuing models. In particular, we characterize demands with different levels of urgency through a priority queuing analysis, which both qualitatively and quantitatively justifies how service quality and user experience can be statistically guaranteed through service prioritization. Based on a fine-grained stochastic supply-demand model that we developed, we further investigate peer caching and prefetching as a global optimization problem. Our study not only provides insights in understanding the fundamental characterization of the demand, but also offers guidelines toward optimal prefetching and caching strategies in peer-assisted on-demand streaming systems.

ACKNOWLEDGMENTS

The research was supported in part by a grant from the National Natural Science Foundation of China (NSFC) under grant No. 61103176, by a grant from the NSFC under grant No. 61133006, and by a grant from the Independent Innovation Research Fund of Huazhong University of Science and Technology under grant No. 2011QN051. The authors would like to thank the editor and anonymous reviewers of the IEEE Transactions on Computers for their helpful comments and suggestions in improving the quality of the paper.

REFERENCES

[1] Y. Huang, T. Fu, D. Chiu, J. Lui, and C. Huang, "Challenges, Design and Analysis of a Large-Scale P2P-VoD System," Proc. ACM SIGCOMM, Aug. 2008.

[2] Y. Hall, P. Piemonte, and M. Weyant, "Joost: A Measurement Study," technical report, School of Computer Science, Carnegie Mellon Univ., May 2007.

[3] K. Suh, C. Diot, J. Kurose, L. Massoulie, C. Neumann, D. Towsley, and M. Varvello, "Push-to-Peer Video-on-Demand System: Design and Evaluation," IEEE J. Selected Areas in Comm., vol. 25, no. 9, pp. 1706-1716, Dec. 2007.

[4] C. Huang, J. Li, and K. Ross, "Can Internet Video-on-Demand Be Profitable?" Proc. ACM SIGCOMM, Aug. 2007.

[5] Y. Boufkhad, F. Mathieu, F. de Montgolfier, D. Perino, and L. Viennot, "Achievable Catalog Size in Peer-to-Peer Video-on-Demand Systems," Proc. Int'l Workshop Peer-to-Peer Systems, Feb. 2008.

[6] N. Parvez, C. Williamson, A. Mahanti, and N. Carlsson, "Analysis of BitTorrent-Like Protocols for On-Demand Stored Media Streaming," Proc. ACM SIGMETRICS Int'l Conf. Measurement and Modeling of Computer Systems, June 2008.

[7] Y. Choe, D. Schuff, J. Dyaberi, and V. Pai, "Improving VoD Server Efficiency with BitTorrent," Proc. Int'l Conf. Multimedia, Sept. 2007.

[8] L. Guo, S. Chen, and X. Zhang, "Design and Evaluation of a Scalable and Reliable P2P Assisted Proxy for On-Demand Streaming Media Delivery," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 5, pp. 669-682, May 2006.

[9] W. Yiu, X. Jin, and S. Chan, "VMesh: Distributed Segment Storage for Peer-to-Peer Interactive Video Streaming," IEEE J. Selected Areas in Comm., vol. 25, no. 9, pp. 1717-1731, Dec. 2007.

[10] Y. He, G. Shen, Y. Xiong, and L. Guan, "Optimal Prefetching Scheme in P2P VoD Applications with Guided Seeks," IEEE Trans. Multimedia, vol. 11, no. 1, pp. 138-151, Jan. 2009.

[11] S. Tewari and L. Kleinrock, "Proportional Replication in Peer-to-Peer Networks," Proc. IEEE INFOCOM, Apr. 2006.

[12] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, "Web Caching and Zipf-Like Distributions: Evidence and Implications," Proc. IEEE INFOCOM, Mar. 1999.

[13] M. Chesire, A. Wolman, G. Voelker, and H. Levy, "Measurement and Analysis of a Streaming Media Workload," Proc. Third USENIX Symp. Internet Technologies and Systems, Mar. 2001.


Fig. 9. System-wide relative segment retrieval latency versus the increase of relative system-wide workload $\lambda_2/(\varepsilon C_p)$, under a relative peer cache size of $B/V = 10\%$ and a Zipf popularity distribution of segments.

Fig. 10. System-wide relative segment retrieval latency versus the increase of relative system-wide workload $\lambda_2/(\varepsilon C_p)$, under a relative peer cache size of $B/V = 10\%$ and a stretched exponential distribution of segment popularity.

Fig. 8. System-wide relative segment retrieval latency versus the increase of relative peer cache size, under a heavy relative system-wide workload of $\lambda_2/(\varepsilon C_p) = 0.9$ and a Zipf popularity distribution of segments.

[14] L. Guo, E. Tan, S. Chen, Z. Xiao, and X. Zhang, "The Stretched Exponential Distribution of Internet Media Access Patterns," Proc. ACM Symp. Principles of Distributed Computing (PODC '08), Aug. 2008.

[15] K. Sripanidkulchai, B. Maggs, and H. Zhang, "An Analysis of Live Streaming Workloads on the Internet," Proc. ACM SIGCOMM Conf. Internet Measurement (IMC), Oct. 2004.

[16] G. Bolch, S. Greiner, H. de Meer, and K. Trivedi, Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications. Wiley-Interscience, 2006.

[17] X. Yang and G. de Veciana, "Service Capacity of Peer to Peer Networks," Proc. IEEE INFOCOM, Mar. 2004.

[18] W. Feng, M. Kowada, and K. Adachi, "Analysis of a Multi-Server Queue with Two Priority Classes and (M, N)-Threshold Service Schedule I: Non-Preemptive Priority," Int'l Trans. Operational Research, vol. 7, no. 6, pp. 653-671, 2000.

[19] Y. Jiang, C. Tham, and C. Ko, "An Approximation for Waiting Time Tail Probabilities in Multi-Class Systems," IEEE Comm. Letters, vol. 5, no. 4, pp. 175-177, Apr. 2001.

[20] B. Walke, "Improved Bounds and an Approximation for a Dynamic Priority Queue," Proc. Third Int'l Symp. Measuring, Modelling and Evaluating Computer Systems, Oct. 1977.

[21] C. Wu, B. Li, and S. Zhao, "Exploring Large-Scale Peer-to-Peer Live Streaming Topologies," ACM Trans. Multimedia Computing, Comm. and Applications, vol. 4, no. 3, 2008.

[22] Z. Ge, D. Figueiredo, S. Jaiswal, J. Kurose, and D. Towsley, "Modeling Peer-Peer File Sharing Systems," Proc. IEEE INFOCOM, Apr. 2003.

[23] D. Burman, "Insensitivity in Queueing Systems," Advances in Applied Probability, vol. 13, pp. 846-859, 1981.

[24] R. Hemmecke, M. Koppe, J. Lee, and R. Weismantel, "Nonlinear Integer Programming," 50 Years of Integer Programming 1958-2008: The Early Years and State-of-the-Art Surveys, Springer-Verlag, 2009.

[25] S. Schaible and J. Shi, "Fractional Programming: The Sum-of-Ratios Case," Optimization Methods and Software, vol. 18, no. 2, pp. 219-229, 2003.

[26] D. Bertsekas, Nonlinear Programming. Athena Scientific, 1999.

[27] B. Cheng, L. Stein, H. Jin, and Z. Zhang, "Towards Cinematic Internet Video-on-Demand," Proc. Third ACM SIGOPS/EuroSys European Conf. Computer Systems, Apr. 2008.

[28] M. Mehyar, D. Spanos, J. Pongsajapan, S. Low, and R. Murray, "Asynchronous Distributed Averaging on Communication Networks," IEEE/ACM Trans. Networking, vol. 15, no. 3, pp. 512-520, June 2007.

[29] L. Guo, E. Tan, S. Chen, Z. Xiao, and X. Zhang, "The Stretched Exponential Distribution of Internet Media Access Patterns," Proc. ACM Symp. Principles of Distributed Computing (PODC), Aug. 2008.

Fangming Liu (S'08-M'11) received the BEngr degree in 2005 from the Department of Computer Science and Technology, Tsinghua University, Beijing, China, and the PhD degree in computer science and engineering from the Hong Kong University of Science and Technology in 2011. He is currently an associate professor in the School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China. In 2007, he worked as a research assistant at the Department of Computer Science and Engineering, Chinese University of Hong Kong. From August 2009 to February 2010, he was a visiting student at the Department of Electrical and Computer Engineering, University of Toronto, Canada. Since 2010, he has also been collaborating with the ChinaCache Content Delivery Network Research Institute at Tsinghua University. His research interests are in the areas of peer-to-peer networks, rich-media distribution, cloud computing, and large-scale datacenter networking. He is a member of the IEEE and the IEEE Communications Society, and a member of the ACM.

Bo Li (S’89-M’92-SM’99-F’11) received the BEngr degree in computer science from Tsinghua University, Beijing, and the PhD degree in electrical and computer engineering from the University of Massachusetts at Amherst. He is a professor in the Department of Computer Science and Engineering, Hong Kong University of Science and Technology. He was with IBM Networking System, Research Triangle Park, between 1993 and 1996. He was an adjunct researcher at Microsoft Research Asia (MSRA) from 1999 to 2006, where he spent his sabbatical leave from 2003 to 2004. He has made original contributions on Internet proxy placement, capacity provisioning in wireless networks, routing in WDM optical networks, and Internet video streaming. He is best known for a series of works on a system called Coolstreaming (Google entries over 1,000,000 in 2008 and Google Scholar citations over 800), which attracted millions of downloads and was credited as the first large-scale peer-to-peer live video streaming system in the world. His recent work on the peer-assisted online hosting system, FS2You (2007-2009) (Google entries 800,000 in 2009), has also attracted millions of downloads worldwide. He has been an editor or guest editor for 17 IEEE/ACM journals and magazines. He was the co-TPC chair for IEEE INFOCOM ’04. He is a fellow of the IEEE.

Baochun Li (S’98-M’00-SM’05) received the BEngr degree in 1995 from the Department of Computer Science and Technology, Tsinghua University, Beijing, China, and the MS and PhD degrees in 1997 and 2000 from the Department of Computer Science, University of Illinois at Urbana-Champaign. Since 2000, he has been with the Department of Electrical and Computer Engineering at the University of Toronto, where he is currently a professor. He has been holding the Bell University Laboratories Endowed Chair in computer engineering since August 2005. In 2000, he was the recipient of the IEEE Communications Society Leonard G. Abraham Award in the Field of Communications Systems. In 2009, he was the recipient of the Multimedia Communications Best Paper Award from the IEEE Communications Society. His research interests include large-scale multimedia systems, peer-to-peer networks, applications of network coding, and wireless networks. He is a senior member of the IEEE, and a member of the ACM.

Hai Jin received the PhD degree in computer engineering from Huazhong University of Science and Technology (HUST), China, in 1994. He is a Cheung Kong Scholars chair professor of computer science and engineering at the Huazhong University of Science and Technology. He is now the dean of the School of Computer Science and Technology at HUST. In 1996, he was awarded a German Academic Exchange Service fellowship to visit the Technical University of Chemnitz in Germany. He worked at The University of Hong Kong between 1998 and 2000, and as a visiting scholar at the University of Southern California between 1999 and 2000. He was awarded the Excellent Youth Award from the National Science Foundation of China in 2001. He is the chief scientist of ChinaGrid, the largest grid computing project in China, and the chief scientist of the National 973 Basic Research Program Project of Virtualization Technology of Computing System. He is a member of the Grid Forum Steering Group (GFSG). He has coauthored 15 books and published more than 400 research papers. His research interests include computer architecture, virtualization technology, cluster computing and grid computing, peer-to-peer computing, network storage, and network security. He is the steering committee chair of the International Conference on Grid and Pervasive Computing (GPC), the Asia-Pacific Services Computing Conference (APSCC), the International Conference on Frontier of Computer Science and Technology (FCST), and the Annual ChinaGrid Conference. He is a member of the steering committee of the IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid), the IFIP International Conference on Network and Parallel Computing (NPC), the International Conference on Grid and Cooperative Computing (GCC), the International Conference on Autonomic and Trusted Computing (ATC), and the International Conference on Ubiquitous Intelligence and Computing (UIC). He is a senior member of the IEEE and a member of the ACM.