
Continuous Media Sharing in Multimedia Database Systems

Mohan Kamath†, Krithi Ramamritham† and Don Towsley‡
Department of Computer Science
University of Massachusetts
Amherst MA 01003
{kamath, krithi, towsley}@cs.umass.edu

Abstract

The timeliness and synchronization requirements of multimedia data demand efficient buffer management and disk access schemes for multimedia database systems (MMDBS). The data rates involved are also very high, and despite the development of efficient storage and retrieval strategies, disk I/O is likely to be a bottleneck, thereby limiting the number of concurrent sessions supported by a system. This calls for better use of data that has already been brought into the buffer, by exploiting sharing whenever possible using advance knowledge of the multimedia stream to be accessed.

This paper introduces the notion of continuous media caching, a simple and novel technique where buffers that have been played back by a user are preserved in a controlled fashion for use by subsequent users requesting the same data. This is shown to have considerable impact on the performance of buffer management schemes. When continuous media sharing is used in conjunction with batching of user requests, even better performance improvement can be achieved. The results of our performance study clearly indicate that schemes that utilize continuous media caching outperform existing schemes in environments where sharing is possible. We also discuss the impact of fast-forward and rewind on continuous media sharing schemes.

Keywords: Buffer Management, Batching, Multimedia Databases, Data Sharing, Optimization, Performance

† Supported by the National Science Foundation under grant IRI-9109210.
‡ Supported by ARPA under contract F19628-92-C-0089.


Contents

1 Introduction
2 Data Characteristics
3 Continuous Media Sharing
4 Buffer Management Schemes
   4.1 Traditional Scheme - Use And Toss (UAT)
   4.2 Proposed Scheme - Sharing Data (SHR)
       4.2.1 Criteria to attempt sharing
5 Batching Schemes
   5.1 Batching without Sharing (BAT-UAT)
   5.2 Batching with Sharing (BAT-SHR)
6 Performance Tests
   6.1 Experimental Setting & Assumptions
   6.2 Parameters
   6.3 Metrics
7 Results
   7.1 Comparison of UAT and SHR based on Disk Bandwidth
       7.1.1 Comparison based on Inter Arrival Time
       7.1.2 Comparison based on Buffer Size
       7.1.3 Comparison based on Topic Parameters
   7.2 Comparison of BAT-UAT and BAT-SHR based on Disk Bandwidth
       7.2.1 Comparison based on Inter Arrival Time
       7.2.2 Comparison based on Buffer Size
       7.2.3 Comparison based on Topic Parameters
       7.2.4 Comparison based on Batching Parameters
8 Effect of Fast Forward/Rewind on Continuous Media Caching
9 Extensions to this Work
10 Conclusion


1 Introduction

Multimedia systems have attracted the attention of many researchers and system developers, because of the multimedia nature of the data used in modern applications and because of their capability to improve the communication bandwidth between computers and users. These multimedia information systems cater to a variety of scientific and commercial fields ranging from medicine to insurance. Lately the construction of the national information super-highway has also received enormous attention, and many commercial ventures are focusing on providing on-demand services. Multimedia information systems will form the backbone of these services. Multimedia data can be classified as static or dynamic (also known as continuous) media. While static media, like text and images, do not vary with time (from the perspective of the output device), dynamic media, like audio and video, vary with time. Whenever continuous media is involved, timeliness and synchronization requirements are imposed in order to ensure a smooth and meaningful flow of information to the user. Timeliness requires that consecutive pieces of data from a stream be displayed to the user within a certain duration of time for continuity. Synchronization requires that data from different media be coordinated so that the user is presented with a consistent view. Hence, unlike real-time database systems where deadlines are associated with individual transactions, multimedia data has continuous deadlines. Thus far, most of the research work in the area of multimedia data management has concentrated on data modeling and storage management. While some of the issues relevant to MMDBS are enumerated in [4, 5, 15, 20], object-oriented approaches for implementing MMDBS are suggested in [6, 15, 21, 30]. Some architectural aspects of multimedia information systems like data placement, buffering and scheduling are described in [6], while issues related to a distributed multimedia information system, performance in particular, are discussed in [4]. Though efficient storage and retrieval strategies are being developed [2, 12, 16, 24, 27, 29, 31], the data rates involved are very high (around 2 MB/sec of compressed data per request), so disk I/O could still be a bottleneck; i.e., the number of concurrent sessions supported could be limited by the bandwidth of the storage system. Hence, it is important to make better use of data brought from the disk into the database buffer.

In this paper, we explore the benefits of continuous media sharing in multimedia database systems (MMDBS). Specifically, we consider the News-On-Demand Service (NODS), which typifies applications where extensive data sharing is possible, and focus on buffer management schemes that promote sharing of continuous media data. Since multimedia data are typically read only, by sharing the data between users, many trips to the disks can be avoided. The motivation for our work comes from the fact that sharing of continuous media data can increase the number of users that can be supported by a system. Furthermore, there are many interesting issues that arise in such an environment.

The most important fact is that once the user has requested a playback, advance knowledge of the multimedia stream to be accessed can be used to anticipate demand and promote sharing (planned sharing for future accesses). To achieve this, we introduce the notion of continuous media caching. It is a simple and novel technique where buffers that have been played back by a user are preserved in a controlled fashion for use by subsequent (or lagging) users requesting the same data. This technique can be used when there is sufficient buffer space to retain data for the required duration. With this technique, the system can avoid fetching the data from the disk again for the lagging user. Hence this technique makes it possible to support a larger number of sessions than the number permitted by the disk bandwidth. Also, the throughput of the system, i.e., the percentage of user requests permitted, can be increased by exercising the less expensive option of additional memory rather than choosing a storage system with a higher bandwidth. Batching is another technique to improve the performance of a system, by grouping requests that demand the same topic [7]. By using continuous media caching in conjunction with batching, even better performance can be achieved. Throughout our discussion, we disregard issues that might arise if users request fast-forward or rewind; we postpone our discussion of their impact to the end of the paper.

Buffer management schemes have been widely studied in traditional database systems [11, 26] and in real-time database systems [18]. The objective of these buffer management schemes is to promote sharing and reduce disk accesses, thereby increasing the throughput. However, these schemes typically consider only textual data. Some aspects of buffer management for multimedia data are addressed in [2, 6]. While [6] does not specifically consider concurrent sessions in multimedia systems, [2] does address some performance issues when concurrent users are trying to access continuous media data. However, it disregards sharing of continuous media data. The Use-And-Toss (UAT) policy has been suggested in [6] for multimedia data, under which, once a piece of data is used, it is tossed away so that the next piece of required data is brought into the same buffer. Generally the UAT policy is good only if data sharing does not exist or cannot be exploited. Essentially, there has not been any research to evaluate the potential benefits of continuous media sharing when users are accessing multimedia data concurrently. Hence we propose a new buffer management scheme, SHR, that enhances UAT with continuous media caching. This scheme should perform very well since it reuses past data and also does planned sharing of future data. To study the benefits of using continuous media caching in conjunction with batching, we also study two other schemes, BAT-UAT and BAT-SHR, which are enhancements of UAT and SHR respectively with batching.

The rest of the paper is organized as follows. In section 2 we discuss the characteristics of multimedia data and of NODS in particular. Issues related to continuous media sharing are discussed in section 3. Using simple examples, we illustrate how continuous media sharing can be useful in increasing the number of users supported by a system. Section 4 discusses the UAT scheme and its enhancement (the SHR scheme) through the use of continuous media caching. Batching schemes for grouping requests that demand the same data are discussed in section 5, along with strategies to improve performance through the use of continuous media caching. Details of the performance tests are discussed in section 6. This section describes the experimental setting, the various assumptions made, and the parameters and metrics used for the tests. Section 7 interprets the results of the performance tests. These results show the superior performance of schemes that utilize continuous media caching. In section 8 we discuss the impact of fast-forward and rewind in continuous media sharing environments. Possible extensions to this work are described in section 9. Section 10 concludes with a summary of this work.


2 Data Characteristics

Multimedia objects are captured using appropriate devices, digitized and edited, until they are in a form that can be presented to the user. These independently created objects are then linked using spatial and temporal information, to form a composite multimedia object that may correspond to a presentation or some specific application.

In order to ensure a smooth and meaningful flow of information to the user, there are timeliness and synchronization requirements which vary from application to application. Timeliness requires that consecutive pieces of data from a stream be displayed to the user within a certain duration of time for continuity; hence multimedia data has continuous deadlines. Synchronization requires that data from different media be coordinated so that the user is presented with a consistent view. Essentially it refers to the temporal sequencing of events among multimedia data. It deals with the interaction of independent objects that coexist on different media such that the user is presented with a consistent view, and it is very much dependent on the type of application. Detailed analysis of synchronization requirements for multimedia data can be found in [23, 28].

The rest of this section describes the data characteristics of NODS, which typifies applications where extensive sharing of data is possible. In NODS, since the number of topics in high demand is likely to be limited at any particular time, and given the higher level of interest in certain topics like sports or business, the probability of one or more users requesting the same item for playback is very high. To request a service the user has to choose a topic and a language from, for example, the following:

1. Topic - Politics, International, Business, Sports, Entertainment, Health, Technology etc.
2. Language - English, Spanish, Italian, French, German, Hindi etc.

Each topic consists of a list of news items. News items are composed of video and audio segments and possibly some images. All news items to be displayed when a topic is chosen are compiled in advance and known a priori.

Sharing of data can occur at various levels, as we discuss below. Typically there are many news items that are shared between topics. For example, consider a news item on NAFTA (the North American Free Trade Agreement), which can be included in a variety of topics like Politics, International and Business because of its relevance to all of them. Sharing of continuous media can also occur at the media level. Since the same news items will be provided in different languages, to reduce storage, additional storage is necessary only for the audio segments, as the video segments can be shared. In addition, sharing can occur when multiple users are requesting service on the same topic. This clearly shows that sharing at different levels can lead to compound sharing. In this paper, we mainly focus on the last type of sharing, since it is likely to have the most impact on performance. Intelligent buffer management schemes can and must utilize this to improve the performance of a system.

Figure 1 shows the data stream of a request. Because video is often compressed, we assume that the continuous media data is played out at a variable bit rate. The data rate depends on the amount of action that is packed in the video. We assume that a data stream is divided into segments, where each segment is characterized by a fixed playout data rate. For example, a video stream of a news reader reading news will require a lower data rate than the video stream from a moving camera (e.g. sports clippings). In our work, segments are restricted to be of equal length (of 1 second duration) for ease of implementation.

[Figure 1: Multimedia Data Stream of a request. The plot shows data rate versus time for the audio and video data of a request, with segment boundaries and news item boundaries marked.]

However, since the data rate of segments can vary, the amount of buffer required by a segment varies. Since the length of the segment also has an impact on the performance of the schemes we are going to propose, if a segment is too long, it can be split into smaller segments that have the same data rate.
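As a concrete illustration of this data model, the following sketch shows one possible in-memory representation of segments and topics. The class and field names are our own and are not from the paper.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Segment:
        """One segment of continuous media with a fixed playout rate."""
        data_rate: float      # playout rate in MB/s (e.g. ~2 MB/s for MPEG-2 video)
        length: float = 1.0   # duration in seconds; equal-length segments for simplicity

        @property
        def size_mb(self) -> float:
            # buffer space the segment occupies once fetched
            return self.data_rate * self.length

    @dataclass
    class Topic:
        """A news topic: a pre-compiled sequence of segments (news items)."""
        topic_id: int
        segments: List[Segment] = field(default_factory=list)

        def total_length(self) -> float:
            return sum(s.length for s in self.segments)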

3 Continuous Media Sharing

In this section we introduce the concept of continuous media caching, which promotes planned sharing. It is a simple and novel technique where buffers that have been played back by a user are preserved in a controlled fashion for use by subsequent users requesting the same data.

Since the buffer requirements for images and audio are small compared to video (as we will see later in this section), we ignore them and consider only the requirements of video data. Hence we mainly focus on sharing that occurs when multiple users request service on the same topic (though they may have chosen a different language). When there is sharing, many possibilities exist depending upon the arrival times of the customers, and we discuss them below. When the same service is being requested by two (or more) users concurrently, one of the following could be true:

1. they arrive at the same time (they start at the same time)
2. they arrive with a gap of a few time units between them
3. they arrive with a gap of a large number of time units between them

Depending upon the order in which users share an application, they may be classified as leading (the first one) or lagging (the subsequent ones that can share the buffers). Using the same buffers for both users can definitely pay off in case 1 (since the same data is used at the same time by both requests), whereas in case 2 it may pay off. In case 2, after the data has been played out of the buffer for the leading user, rather than discarding the data, it can be cached by preserving it in the buffer so that the same data can be reused by the lagging user (rather than fetching it again from the disk). We refer to this technique as continuous media caching. In case 3, it is easy to see that continuous media caching will not pay off, since preserving buffers may prove to be too costly in terms of buffer requirements. In this case it is better to obtain the data for the lagging user directly from the disk and to reuse the data buffers from the leading user immediately after the data has been played back. In our buffer management scheme, we allow sharing to occur as long as the difference between the arrival times of two consecutive users for the same topic is less than a particular value (details are described in section 4.2.1).


Barring information about future arrivals, this is the earliest time instant at which a sharing situation can be recognized.

Figure 2 compares the disk and buffer requirements as a function of time for the non-shared case and the shared case (continuous media caching). Two requests for the same topic (containing 5 segments) that arrive with a time difference of 2 segments are shown in the figure. In the non-shared case, disk and buffer bandwidths are allocated independently for each request. However, in the shared case, since it is known that segments three through five will be available in the buffer (loaded into the buffer for the first request) when the second request needs them, the buffers are reused for the second request, i.e., the buffers are preserved until the corresponding playback time for the second user. These segments need not be fetched again from the disk for the second request. This results in an increase in the buffer requirement, but a decrease in the disk bandwidth requirement. Thus it can be seen that continuous media caching exploits sharing to decrease the number of disk I/Os and, hence, improve performance. This philosophy concurs with the increasing trend towards main memory systems, since disk transfer rates are typically limited by mechanical movements.

[Figure 2: Buffer Sharing between two requests. Four panels compare disk bandwidth (bytes/sec vs. time) and buffer requirement (bytes vs. time) for the non-shared and shared cases, showing the disk bandwidth and buffers allocated for the present request, the new buffers and disk bandwidth for the next request, and the buffers preserved for the next request.]

[Figure 3: Segments available in buffer at each time slot (zooming into the bottom right of figure 2). Over time slots T1-T7: segments 1-5 are loaded for the first user; segments 1 and 2 are loaded from disk for the second user; segments 3, 4 and 5 are preserved in buffer for the second user.]

Figure 3 shows the details of the segments that are actually present in the buffer at each time instant in the shared case (bottom right case of figure 2). It illustrates how the continuous media caching technique preserves only the necessary segments for just the required amount of time. It differentiates the segments as follows: segments that are loaded for the first user, segments fetched from the disk for the second user, and segments that have been used by the first user but preserved for the second user.
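The example of figure 3 can be restated in a few lines of code. The sketch below is our illustration, not code from the paper: for a lagging user arriving a given number of segments behind the leader, it lists which segments must still be fetched from disk and which can be served from preserved buffers, assuming equal one-second segments.

    def split_segments(num_segments: int, lag: int):
        """For a lagging request arriving `lag` segments after the leader,
        return (from_disk, preserved)."""
        # Segments the leader played back before the lagging user arrived
        # must be re-fetched (unless they still sit in the free pool).
        from_disk = list(range(1, min(lag, num_segments) + 1))
        # Segments the leader plays after the lagging user's arrival are
        # preserved in buffer until the lagging user's playback time.
        preserved = list(range(lag + 1, num_segments + 1))
        return from_disk, preserved

    # Figure 3's example: a 5-segment topic, second user 2 segments behind.
    print(split_segments(5, 2))   # -> ([1, 2], [3, 4, 5])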


We present an example of a simple system to illustrate how sharing can be exploited to support more users. Consider a system which has a main buffer of 1280 MB (megabytes) and a storage system with a high storage capacity and a data transfer rate of 40 MB/s. MPEG-2 is an accepted standard for continuous media data storage/transmission, and the requirements of different media based on this standard are as follows (in compressed form): image, about 100 KB per frame; audio, about 16 KB/s (high quality); video, about 2 MB/s. Ignoring the data requirements of audio and images, we analyze the requirements of a session using the characteristics of its video data on a segment basis (each segment of 1 second duration and with the same data rate of 2 MB/s). The available buffer can hold 1280 MB / (2 MB/sec) = 640 seconds of video, which translates to 640 segments. This implies that the buffer is capable of handling 640 concurrent sessions (since each session needs at least a segment in memory). Neglecting the seek time (which is typically on the order of tens of milliseconds) and based on the data bandwidth of the storage device, the system can support (40 MB/s) / (2 MB/s) = 20 concurrent sessions (topics). Thus we see that the number of concurrent sessions supported is limited to 20 by the bandwidth of the storage system, even though the buffer can handle 640 sessions. However, by exploiting sharing, the buffer that can support 640 seconds of video can be effectively utilized to satisfy more users, provided they request one of the 20 topics that can be fetched using the available disk bandwidth. If the 20 topics are shared by u_1, u_2, ..., u_20 concurrent users respectively, then the number of users supported will be \sum_{i=1}^{20} u_i. This is, however, the best case, and its probability is very low, since all the users sharing the sessions arrive at the same time in this scenario. On the other hand, the worst case occurs when the users requesting the same topic arrive with larger inter-arrival times. Continuous media caching may not be possible in this case (due to insufficient buffer bandwidth), resulting in only 20 users being supported by the system. However, if they arrive within short intervals, instead of fetching the data from the disk for the lagging user, data read from the disk by the leading user can be preserved for reuse by the lagging user. Thus, depending on the inter-arrival time and the other possibilities of sharing (the three cases discussed earlier), the number of users supported will be between 20 and \sum_{i=1}^{20} u_i. This simple example gives an idea of how sharing can be exploited in a MMDBS in the case of NODS to provide better buffer management. If seek times were also considered in the above example, continuous media caching would prove to be even more beneficial.
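The arithmetic of this example is easy to check in code (a sketch using the figures given above; the per-topic user counts at the end are hypothetical values of ours):

    buffer_mb = 1280        # main buffer size (MB)
    disk_rate = 40          # effective disk transfer rate (MB/s)
    video_rate = 2          # MPEG-2 compressed video rate (MB/s per session)

    buffer_seconds = buffer_mb / video_rate       # 640 s of video buffered
    buffer_sessions = int(buffer_seconds)         # 640 one-segment sessions
    disk_sessions = disk_rate // video_rate       # 20 concurrent topics

    print(buffer_sessions, disk_sessions)         # 640 20
    # Without sharing, the disk limits the system to 20 sessions.
    # With sharing, if topic i is shared by u_i users, up to sum(u_i)
    # users are served while only 20 streams are read from disk.
    users = [4, 3, 2] + [1] * 17                  # hypothetical counts per topic
    print(sum(users))                             # 26 users on 20 disk streams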


4 Buffer Management Schemes

This section describes the two buffer management schemes (UAT and SHR) we are going to compare. First we briefly review traditional buffer management techniques and then discuss the details of UAT and SHR.

The objective of the buffer manager is to reduce disk accesses for a given buffer size. Traditional buffer management schemes for textual data use fixed-size pages to avoid fragmentation. When the requested database page is not in the buffer, it is fetched from the disk into the buffer. The page is then kept in the buffer using the fix operation, so that it is not paged out and future references to the page do not cause page faults in a virtual memory operating system environment. Once it is determined that the page is no longer necessary or is a candidate for replacement, an unfix is done on that page. Also, in traditional database systems, buffer allocation and replacement are considered orthogonal issues. A page can be allocated anywhere in the global buffer in a global scheme, whereas in a local scheme, specific portions of the buffer are allocated to a given session and a page for a session is allocated in the area allocated to that particular session. Hence in the global scheme there is only one page replacement algorithm, whereas in the local scheme several page replacement algorithms may exist. In either case the goal of the page replacement algorithm is to minimize the buffer fault rate. Before allocating a page, the buffer manager checks if the corresponding page is in the buffer and, if not, retrieves it from the disk. While static allocation schemes allocate a fixed amount of buffer to each session when it starts, dynamic allocation schemes allow the buffer requirements to grow and shrink during the life of the session. In virtual memory operating systems where there is paging, code pages (which are read only) are typically shared. For example, if many users are using a particular program like an editor, then the code pages of the editor are shared between the different users. Though this scheme cannot be used as such for continuous media data, it provides some motivation for our continuous media caching scheme. Some of the common page replacement strategies used for textual data include FIFO and LRU [11]. These traditional schemes have no provision for admission control and resource reservation and hence cannot guarantee the success of a request in multimedia environments. This can result in wasted resources for requests that eventually fail. We now discuss two schemes that have admission control and resource reservation integrated with them: UAT, a scheme proposed previously for multimedia systems [6], and SHR, our new scheme that exploits the potential for sharing.

First we discuss the memory model, admission control and bandwidth reservation scheme. In the case of a MMDBS, advance knowledge of the data stream to be accessed is available, and this can be utilized for better buffer management. Since we are assuming a large main buffer, we assume that this information about the topic and its segments can be stored in the main buffer (i.e., the number of segments that make up a topic and the data rate of each segment). Buffer space that is not in use exists in a free-pool (this is a global pool). Buffers needed for the segments of a request are acquired (analogous to fix) from the free-pool and returned (analogous to unfix) to the free-pool when they have been used. When buffers are requested from the free-pool, the oldest buffer in the free-pool is granted to the requester. We mentioned earlier that since the video rate dominates the audio rates, only the requirements of video are considered for sharing. The availability of the necessary bandwidth for audio must also be checked by the admission control policy, to meet the synchronization requirements of audio and video streams. The admission control and reservation scheme depends upon the sharing policy. Given (n-1) sessions already in progress, the nth request is accepted iff the following inequalities hold:

    \sum_{i=1}^{n} l \cdot d_{i,t} \le M    \forall t = 1, \ldots, s        (1)

    \sum_{i=1}^{n} d_{i,t} \le R    \forall t = 1, \ldots, s        (2)

where d_{i,t} represents the continuous media rate (audio and video) for request i at time slot t, l represents the length of each segment (of 1 second duration), and s represents the total number of segments in the request. The condition \forall t = 1, \ldots, s indicates that the inequality should hold for all the segments of the request, thereby verifying that there are sufficient resources for the entire duration of the request. The first inequality gives the limiting case for the buffer bandwidth (buffer size = M) and the second for the disk bandwidth (data rate supported = R). If the inequalities hold for a request, the required buffer and disk bandwidth is reserved and the request is guaranteed to succeed.
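A minimal sketch of this admission test follows. It assumes the reservations of the (n-1) active sessions are kept as per-slot totals aligned with the new request's slots (alignment details simplified); the function and variable names are ours.

    def admit(request_rates, buffer_profile, disk_profile, M, R, l=1.0):
        """Admission test per inequalities (1) and (2).
        request_rates[t] holds d_{n,t}, the new request's rate in slot t;
        buffer_profile[t] and disk_profile[t] hold the sums already
        accumulated for the n-1 admitted sessions."""
        s = len(request_rates)
        # check both inequalities for every slot of the request's duration
        for t in range(s):
            if buffer_profile[t] + l * request_rates[t] > M:
                return False   # buffer bandwidth, inequality (1), violated
            if disk_profile[t] + request_rates[t] > R:
                return False   # disk bandwidth, inequality (2), violated
        # all slots fit: reserve, so the request is guaranteed to succeed
        for t in range(s):
            buffer_profile[t] += l * request_rates[t]
            disk_profile[t] += request_rates[t]
        return True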


4.1 Traditional Scheme - Use And Toss (UAT)

The UAT policy that is traditionally used for MMDBS is described below with some enhancements. UAT replaces data on a segment basis, and it also uses admission control and bandwidth reservation to anticipate future demand. A FIFO policy is used to decide which of the segments from the free-pool are to be allocated to a new request, i.e., when a new request is to be allocated a segment from the free-pool, the oldest segment in the free-pool is allocated first. A timestamp mechanism is used to maintain information about the oldest segment in the free-pool. The UAT policy is ideal for continuous media data where there is no sharing. UAT reuses segments that may already exist in the buffer, i.e., if any of the required segments happen to exist in the free-pool, then they are grabbed from the free-pool, thereby avoiding fetching those segments again from the disk. UAT does not consider sharing explicitly, i.e., when the segments of a topic have been played back, they are not preserved in the buffer for a future user who might request the same topic. Hence requests from different users are considered independent of each other for admission control and reservation purposes, even if they are for the same topic. The admission control policy checks each request separately to see if there is sufficient buffer and disk bandwidth available to fetch and store data for the request on a segment-by-segment basis (using inequalities 1 and 2). If the required bandwidth is available, then the required disk and buffer bandwidths are reserved; else the request is rejected.
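The free-pool mechanism described above can be sketched as a FIFO queue of timestamped segment buffers. This is our illustration of the described behavior, not an implementation from the paper.

    from collections import deque

    class FreePool:
        """FIFO free pool: returned segment buffers are reused oldest-first,
        so a segment may still be found in the pool and grabbed before its
        space is granted to another request (as UAT exploits)."""
        def __init__(self):
            self.pool = deque()   # (timestamp, segment_id) in FIFO order

        def release(self, timestamp, segment_id):
            # use-and-toss: a played-back segment is returned (unfixed)
            self.pool.append((timestamp, segment_id))

        def acquire(self):
            # grant the oldest buffer in the pool to the requester
            return self.pool.popleft() if self.pool else None

        def grab(self, segment_id):
            # reuse a specific segment if it still sits in the pool
            for entry in self.pool:
                if entry[1] == segment_id:
                    self.pool.remove(entry)
                    return entry
            return None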


4.2 Proposed Scheme - Sharing Data (SHR)

Here we propose a new buffer management scheme which uses the continuous media caching technique to specifically exploit the sharing characteristics of MMDBS applications. This scheme (SHR) is a variation of the UAT scheme we discussed earlier, with enhancements for continuous media caching. The buffer and disk bandwidth for these schemes are determined by utilizing the technique illustrated in figure 2. We do not discuss details of disk scheduling here. A rate-monotonic type of scheduling is best, i.e., since the disk bandwidth will be assured for the various sessions, some disk bandwidth will be scheduled for each session, corresponding to the ratio of the data rate of the segment it is fetching to the total demanded data rate.

Since SHR utilizes continuous media caching, segments used by a leading user are explicitly preserved for use by the lagging user. This can be explained better using figure 3. The segments that are explicitly preserved begin with the segment that is being played back by the leading user when the lagging user enters the system. All segments that follow this are also preserved in the buffer until the lagging user needs them for playback. (Referring to figure 3, the segments that have been preserved are 3, 4 and 5.) If it happens that some of the segments are in the free-pool (say segments 1 and 2 are in the free-pool after they have been used and returned by user 1), they can be reacquired from the free-pool if the buffer space corresponding to those segments has not already been granted to other sessions. If all the segments (loaded for the first user prior to this user's arrival) are available, then no data is to be fetched from the disk for the second user (only buffer bandwidth is checked/reserved); else data is to be fetched from the disk for those segments that do not exist in the free-pool (both buffer and disk bandwidth are checked/reserved). Thus SHR uses past and future information about segment access and hence should perform better than UAT, as we verify later in the performance tests. Essentially, the performance difference between UAT and SHR can be attributed solely to continuous media caching.

    do while there are requests in the queue
        select next request ;
        /* total profile contains reservation for this request */
        For each segment in the topic, check if the segment is already
            in the free-pool and grab it if it is available ;
        For each segment grabbed from the free-pool, modify the total
            profile to reflect preserving of these segments and check for
            buffer bandwidth violation ;
        For segments that exist in the buffer and are shareable, modify
            the total profile for buffer and check for bandwidth violation ;
        For segments that have to be fetched from the disk, modify the
            total profile for disk and buffer accordingly and check for
            buffer/disk bandwidth violation ;
        if no violation at any stage
            admitted = true ;
        else
            restore old total profile ;
            admitted = false ;
            reject request ;
        endif
    end do

Figure 4: Admission Control and Reserving future bandwidth

    do forever
        do for all active sessions
            when a new segment is encountered
                if leading user
                    if buffers are shared with another session
                        preserve the used buffers for lagging users ;
                    else
                        return the used buffers to the free pool ;
                    endif
                    acquire new buffers from the free pool ;
                    fetch segment from disk ;
                else if lagging user
                    if there are more lagging users
                        read from preserved buffers ;
                        preserve the used buffers for lagging users ;
                    else
                        read from preserved buffers ;
                        return the used buffers to the free pool ;
                    endif
                else
                    acquire new buffers from the free pool ;
                    fetch data from disk into the buffer ;
                endif
        end do
    end do

Figure 5: Buffer Allocation & Replacement - SHR

Figure 4 describes the admission control and bandwidth reservation scheme for SHR. The total profile refers to the total requirements of buffer/disk bandwidth. The figure describes the sequence in which the availability of segments in the buffer is checked, along with admission control and buffer/disk bandwidth reservation. Since SHR considers sharing explicitly, the integrated admission control and reservation scheme has to be modified accordingly. Specifically, in SHR, the admission control policy checks if there is sufficient disk bandwidth available for segments to be loaded from the disk (for segments 1 and 2 if they are not available in the free-pool) and sufficient buffer bandwidth for loading these segments into the buffer and preserving the other segments (i.e., loading segments 1 and 2 and preserving segments 3, 4 and 5). If the required bandwidth is available, then the appropriate disk and buffer bandwidths are reserved; else sharing is not possible. When this is the case, the request is considered independently (similar to the UAT scheme) and an attempt is made to reserve the necessary buffer and disk bandwidth for the request. If this also fails, then the request is rejected.


Buffer allocation and replacement are done in an integrated manner, as shown in figure 5. Buffer allocation is done when an inter-segment synchronization point is encountered. Buffers used by a segment that has been played back are freed first, and then the buffers needed by the next segment are allocated. Once the buffers have been allocated for a segment, they are not freed until the end of the segment. For every user, at the beginning of a data transfer from disk, new buffers have to be allocated. This is achieved by acquiring the necessary number of buffers from the free pool. Once data from these buffers has been played back by a user, if there are lagging users, the buffers are preserved instead of being returned to the free pool. For a lagging user, buffers need not be acquired from the free pool, since a disk transfer is not necessary. Instead, data is read from the preserved buffers, after which the buffers are returned to the free pool. Information about allocated buffers can be maintained using a hash table, since this information is necessary for sharing buffers. We do not explicitly consider prefetching of data [26, 27], but if it is considered, then our schemes have to be modified such that prefetching is applied to all or none of the sessions.

4.2.1 Criteria to attempt sharing

Another important tradeoff related to continuous media caching is how to decide whether it is advantageous to share the data with another user. There are several factors that determine this. For example, if the difference between the arrival times of the leading and lagging users is small, then clearly this technique pays off, because in this case buffers needed by lagging users are not retained for a long time. However, if there is a large difference in the arrival times, then the amount of time for which the respective segments have to be carried over (cached) becomes large. Essentially this means that more buffer space will be occupied to enable sharing between users. This may cause many other requests to fail for lack of buffer space. Hence we performed some empirical tests to determine the point beyond which there is no payoff (i.e., the maximum difference in arrival time beyond which continuous media caching should not be used). Our tests indicate that continuous media caching pays off as long as it is used when the inter-arrival time (iat) is less than the maximum tolerable inter-arrival time for sharing (max-iat). Besides the arrival time, several other factors determine the value of max-iat, as shown below:

    max\_iat = C \cdot \frac{Mean\ Arrival\ Time \times Memory\ Size}{Topic\ Length \times Data\ Rate}        (3)

    iat \le max\_iat        (4)

If more memory is available, then segments can be cached for a longer duration of time and, hence, max-iat is directly proportional to the memory size. On the other hand, if the topic length is large or the data rate is high, it is difficult to cache segments for a long duration, since the buffer space needed will be too high. Hence max-iat is inversely proportional to the topic length and the data rate. The constant C can be determined via empirical tests. Developing better heuristics to determine the criteria for sharing can be a separate area of research, and hence we do not go into the details. Thus, for the SHR scheme, we have used inequality (4) as the basis for deciding when requests are to be shared. If iat is larger than max-iat, then the request is satisfied using the UAT policy. It should be noted that in all these cases, a request is considered for scheduling as soon as it is received (zero response time).
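Equation (3) translates directly into code, as sketched below; the constant C is empirical, and the default value used here is only a placeholder.

    def max_iat(mean_arrival_time, memory_size, topic_length, data_rate, C=1.0):
        """Maximum tolerable inter-arrival time for sharing, per equation (3).
        C is empirically determined; 1.0 is only a placeholder."""
        return C * (mean_arrival_time * memory_size) / (topic_length * data_rate)

    def should_share(iat, mean_arrival_time, memory_size, topic_length, data_rate):
        # inequality (4): attempt continuous media caching only if iat <= max-iat;
        # otherwise fall back to the UAT policy for this request
        return iat <= max_iat(mean_arrival_time, memory_size, topic_length, data_rate)

    # e.g. 10 s mean arrival, 1280 MB buffer, 600 s topics, 1.5 MB/s data rate
    print(max_iat(10, 1280, 600, 1.5))   # about 14.2 seconds with C = 1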


5 Batching Schemes

In order to achieve better performance from an existing system, i.e., to support more concurrent users, batching can be used [7]. Batching tries to group requests that arrive within a short duration of time for the same topic. Our objective is to explore how batching can enhance the performance of continuous media caching schemes.

There are a few tradeoffs in the design of batching schemes. The response time of a request is the time interval between the instant it enters the system and the instant at which it is served (i.e., the first segment of the request is played back). Since customers will be inclined to choose a service with a faster response time, it is important to study the impact of response time on batching and sharing schemes. Thus the success of a scheme can be measured in terms of the number of requests that succeed (a request succeeds if the required resources are allocated for the entire request and the actual response time does not exceed the maximum tolerable response time). The factors that determine the actual response time of a request include the scheduling policy and the scheduling period. The scheduling period is the time interval between two consecutive invocations of the scheduler. At each invocation of the scheduler (scheduling point), requests for the same topic are batched and scheduled according to the scheduling policy. If the tolerable response time is small, then it is necessary to have smaller scheduling periods, or else the percentage of successful requests will be low. However, for better "batching" (to maximize sharing), the scheduling period has to be large. Hence, as a compromise, some intermediate value has to be chosen such that batching can be achieved with small response times. Next, we briefly discuss some of the scheduling policies. The scheduling policy determines the order in which the requests are to be considered at each scheduling point (if admission control is successful, then the required resources are allocated). Some of these batching policies have been studied for a video-on-demand server in [7]. In the First-Come-First-Serve (FCFS) policy, requests for the same topic are batched, and the topic which has the earliest request is considered first, and so on. In the Maximum-Group-Scheme (MGS) policy, requests for the same topic are batched, and the topic which has the maximum number of requests is considered first. From the point of view of continuous media caching, other variations are possible where the topic that was scheduled most recently is scheduled again to promote sharing (i.e., schedule the topic with the smallest time difference between now and when the topic was last scheduled). While the other policies favor topics that are requested more frequently, the FCFS policy is fair and is recommended in [7]. For our study, we consider the FCFS batching scheme.

Having studied some of the issues related to batching, we now describe how continuous media caching can be used in conjunction with batching to improve sharing. This results in considerable savings in the resources required to satisfy a given workload, as we will see later in the performance tests.


For comparison purposes, we describe two batching schemes that use the FCFS policy for scheduling but use different buffer management schemes.

5.1 Batching without Sharing (BAT-UAT)

BAT-UAT is the FCFS batching scheme that uses the UAT buffer management scheme. Since UAT does not use continuous media caching, in BAT-UAT data sharing is achieved purely through the effect of batching. Data sharing possibilities cannot be fully exploited purely through batching, because of the dependency between scheduling period and response time which we discussed earlier.

5.2 Batching with Sharing (BAT-SHR)

BAT-SHR is the FCFS batching scheme that uses the SHR buffer management scheme. Here the effect of sharing is achieved in two different ways: through batching and through continuous media caching. Since BAT-SHR uses the same scheduling period as BAT-UAT, the response time will remain the same or improve (since more resources will be available due to the cumulative effects). However, the extra data sharing that could not be achieved by batching is achieved via continuous media caching, i.e., requests for the same topic may have been scheduled at different neighboring scheduling points, but sharing is achieved through continuous media caching when the neighboring points are within max-iat. Thus the difference in performance between BAT-UAT and BAT-SHR can also be attributed solely to continuous media caching.
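A sketch of one FCFS scheduling point follows, under our own naming; try_admit stands in for the admission control and reservation test of section 4.

    from collections import defaultdict

    def schedule_fcfs(pending, now, max_response_time, try_admit):
        """One scheduling point of the FCFS batching policy.
        pending: list of (arrival_time, topic) requests still waiting;
        try_admit(topic, count): admission/reservation test (section 4).
        Returns the requests that remain pending."""
        groups = defaultdict(list)
        for arrival, topic in pending:
            groups[topic].append((arrival, topic))
        still_pending = []
        # consider topics in order of their earliest waiting request (FCFS)
        for topic in sorted(groups, key=lambda t: min(a for a, _ in groups[t])):
            batch = groups[topic]
            if try_admit(topic, len(batch)):
                continue   # resources reserved: the whole batch starts together
            for arrival, t in batch:
                # retry at the next scheduling point while the response time
                # requirement can still be met; otherwise the request is rejected
                if now - arrival < max_response_time:
                    still_pending.append((arrival, t))
        return still_pending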


6 Performance Tests

We have performed a number of tests to compare the different schemes and evaluate the benefits of continuous media caching. In this section we discuss our experimental setting and a few of our assumptions, and then describe the various parameters and metrics used for the tests.

6.1 Experimental Setting & Assumptions

We assume a client-server type of architecture, shown in figure 6. A DMA transfer is assumed between the disk and the buffer at the server, and hence the CPU is not likely to be a bottleneck; so the CPU has not been modeled in the performance tests. It is assumed that the network between the client and the server provides certain quality-of-service (QOS) guarantees about the network delay and bandwidth. A detailed discussion of reliability and networking issues for continuous media can be found in [3]. We assume a two-level storage system: primary storage (main memory buffer) and secondary storage (disk). Knowing the maximum seek time, rotational latency and data read rate, the effective transfer rate R can be determined for a storage system. This is used in performing bandwidth allocations for the disk. The secondary storage device could be a magnetic/optical disk or even a RAID-based system for increased bandwidth. We do not go into the details of how multimedia data is organized on and retrieved from the secondary storage in an efficient manner, since there has been extensive research in this area [2, 16, 24, 27, 29, 31] and it is orthogonal to the main focus of this work. Since the data rate for video dominates, we assume that requests desiring the same topic with different language options can share the segments. We also assume that segments are of equal length (this simplifies our implementation for the performance tests but does not affect the results).

[Figure 6: Configuration of the System. A database server with attached disks serves multiple clients.]

Admission control and bandwidth reservation are performed on a segment basis for each request, and if the bandwidth is insufficient at any point, the request is rejected. The buffer and disk bandwidth checks are performed for the entire request at admission time, i.e., even for segments that will be played at a later point in time. Before a segment is placed in the buffer, the following actions are taken. First the buffer is searched to see if the requested segment exists in the free-pool. If it does not exist, then the required buffer space is acquired from the free-pool. If this cannot be done because the available buffer space in the free-pool is not sufficient, then the request is rejected, and it counts as a reject due to insufficient buffer bandwidth. When segments have to be fetched from the disk, if sufficient buffer space is available, then the disk bandwidth is checked for sufficiency. If the disk bandwidth is sufficient, then the request proceeds. If all the segments can be brought in, then the request is successful; else the request is rejected, and it counts as a reject due to insufficient disk bandwidth.

For the batching schemes, at each scheduling point the requests are grouped based on the demanded topic. Since we use a FCFS scheme, the grouped requests are considered one by one, based on the earliest arrival time of a request in each group. A request that is considered undergoes the admission control and bandwidth reservation described above. If the required resources are available and allocated, then all the requests for that topic are successful. If a request cannot be satisfied, then the remaining requests are considered at that scheduling point. If a particular request cannot be satisfied at a scheduling point, it will be tried again at the next point, and so on, as long as the response time requirements are met. Hence a request that could not be satisfied at a particular scheduling point may be successful at the next point. If the actual response time exceeds the maximum tolerable, then the request is no longer considered for scheduling and is rejected.


6.2 Parameters

The parameters to be considered fall under four different categories: customer, system, topic and batching. They are listed in the table below with their default values. Topic details (length, data rate, etc.) are generated using a uniform distribution. Customer arrivals are modeled using a Poisson distribution. The topic chosen by a customer is modeled using Zipf's law [32]. According to this, if the topics 1, 2, ..., n are ordered such that p_1 >= p_2 >= ... >= p_n, then p_i = c/i, where c is a normalizing constant and p_i represents the probability that a customer chooses topic i. This distribution has been shown to closely approximate real user behavior in videotex systems [14], which are similar to on-demand services.

    Category   Parameter               Setting (Default Values)
    Customer   Inter-Arrival Time      Poisson distribution with mean of 10 sec
               Topic Chosen            Zipf's distribution (1-10)
    System     Buffer Size             1.28 GB
               Disk Rate               <determined based on 95% success rate>
    Topic      Total Number            10
               Length                  Uniform distribution (500-700) sec
               Data Rate of Segment    Uniform distribution (1-2) MB/sec
    Batching   Maximum Response Time   40 sec
               Scheduling Period       10 sec

6.3 Metrics

A popular metric for these tests would be the percentage of successful requests. But this alone is not sufficient for system designers, since they are more interested in metrics that are closely tied to QOS guarantees. An example would be the maximum arrival rate that can be sustained if 95% of the requests are to succeed given certain resources. Another metric would be the resource requirements for a 95% success rate given a certain workload. Evaluating metrics that are based on QOS guarantees is a time-consuming process, since it is iterative in nature. For example, to evaluate the disk bandwidth that ensures a 95% success rate when the inter-arrival time is 10 sec, starting from a small value of disk bandwidth (for which the success rate may be lower than 95%), the disk bandwidth has to be incremented until the success rate reaches 95%.

For all our tests, the metric we use is the disk bandwidth needed to guarantee that 95% of the requests will be successful. For the buffer management schemes, since we do not batch the requests, 95% of the requests are to be satisfied with a zero response time. A non-zero value like 10 seconds could also be used as the tolerable response time, but since we do not batch requests, this might not be beneficial. Response time is more relevant for the batching schemes, since the scheduling is done at periodic intervals. A tolerable response time is specified for these tests, and 95% of the requests should have response times smaller than the tolerable response time.
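As an illustration of the workload model, the sketch below samples topic choices according to Zipf's law (p_i = c/i) by inverting the cumulative distribution; the function name and test loop are ours, not from the paper.

    import random

    def zipf_topic(num_topics, rng=random):
        """Sample a topic index in 1..num_topics with p_i = c/i (Zipf's law)."""
        c = 1.0 / sum(1.0 / i for i in range(1, num_topics + 1))   # normalizer
        r, acc = rng.random(), 0.0
        for i in range(1, num_topics + 1):
            acc += c / i
            if r <= acc:
                return i
        return num_topics

    # default setting: 10 topics; topic 1 is requested most often
    counts = [0] * 10
    for _ in range(10000):
        counts[zipf_topic(10) - 1] += 1
    print(counts)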


7 Results

Each simulation consisted of 25 iterations, and each iteration simulated 2000 customers. The confidence interval tests of the results indicate that 95% of the results (percentage of successful requests) are within 3% of the mean in almost all cases. The results are discussed based on the disk bandwidth required to guarantee that 95% of the requests will succeed. All parameters are set to their default values specified earlier unless explicitly indicated in the graphs. Topic details (the database accessed by users) are generated for every iteration. We first compare the results of the buffer management schemes (UAT and SHR) and then compare the batching schemes (BAT-UAT and BAT-SHR). The area below the curves is filled in light gray for schemes that do not use continuous media caching and in dark gray (which blots out the light gray) for schemes that use continuous media caching. Hence in all the graphs, the light gray region shows the benefit from continuous media caching.

7.1 Comparison of UAT and SHR based on Disk Bandwidth

This section compares the performance of UAT and SHR based on the various parameters we mentioned earlier. We analyze the behavior of the two schemes and discuss how continuous media caching helps to improve the performance of the system.

7.1.1 Comparison based on Inter Arrival Time

Figure 7(a) shows the effect of inter-arrival time on the disk bandwidth requirement. For smaller values of inter-arrival time, higher disk bandwidths are needed to satisfy the requests. The disk bandwidth requirement drops gradually as the inter-arrival time increases. SHR requires about 50-70% of the bandwidth required by UAT, depending upon the inter-arrival time. This clearly shows the ability of continuous media caching schemes to improve the number of concurrent requests that can be handled by a system.

7.1.2 Comparison based on Buffer Size

Figure 7(b) studies the effect of buffer size on the performance of the schemes (a small portion of the UAT curve is masked by the SHR curve at the left). In general, if the buffer space available is low, all buffer management schemes experience poor performance. Furthermore, to exploit sharing, SHR needs a sizeable amount of buffer. It can be observed that at lower buffer sizes of 400-600 MB, SHR is not able to exploit sharing as much and hence requires disk bandwidth comparable to that of UAT. However, as the available buffer size increases (600 MB-1.28 GB), SHR uses the extra buffer to exploit sharing and hence begins to outperform UAT. The disk bandwidth requirement drops to almost half of what is required by UAT. It can be noted that UAT does not utilize the available buffer resources to the maximum, whereas SHR does. Hence UAT requires a constant disk bandwidth of 101 MB/s, while the disk bandwidth for SHR reduces steadily with an increase in buffer size. This graph is also important since it gives the tradeoffs in the system resources (buffer and disk) required by the different schemes for 95% success given a certain workload. Essentially, the UAT scheme is preferable when the available buffer size is small, and if more buffer is available, SHR can be used.


[Figure 7: Effect of Inter Arrival Time and Buffer Size. (a) Disk rate (MB/s) for 95% success versus inter-arrival time (sec); (b) disk rate (MB/s) for 95% success versus buffer size (MB); curves for UAT and SHR.]

7.1.3 Comparison based on Topic Parameters

The next comparison is based on the number of topics available. This test is important since the probability of locating a required segment in memory, with or without planning, depends upon the number of topics, i.e., the smaller the number of topics, the higher the probability of finding a required segment. It can be observed in figure 8(a) that the increase in bandwidth is steep initially and then becomes more gradual. This behavior can be attributed to the Zipf distribution we have chosen to model the customer's topic selection. If a uniform distribution were chosen, the increase in disk bandwidth might be steeper throughout. The main observation, however, is that SHR continues to dominate the performance (with a lower disk bandwidth requirement) even with an increasing number of topics. The benefit from continuous media caching is most evident when there are fewer topics.

Figure 8(b) explores the effect of the length of the topic on the performance of the different schemes. This is important because increasing the length of the topic results in a request taking a longer time to complete; hence the number of concurrent customers in the system increases, which in turn demands increased buffer and disk bandwidth. Thus the disk bandwidth requirements of the schemes increase with an increase in the length of the topic. However, if SHR is used, some of the concurrent sessions are shared, and hence the disk bandwidth requirement for SHR is less than that of UAT.

Another important parameter to be considered is the average data rate of the segments. Based on the MPEG-2 standards, the average data rate for compressed high-quality video is around 2 MB/s. Hence we have performed tests for average data rates ranging from 2-5 MB/s. As expected, we note from figure 8(c) that with increasing data rates, the disk bandwidth requirement increases. However, we observe that SHR requires less disk bandwidth compared to UAT. As the data rate increases, the gap between UAT and SHR tends to narrow, since more buffer space is needed for caching larger amounts of data, and hence data has to be brought from the disk for some of the requests.


[Figure 8: Effect of Topic Parameters. (a) Disk rate (MB/s) for 95% success versus number of topics; (b) versus length of topic (sec); (c) versus data rate (MB/s); curves for UAT and SHR.]

7.2 Comparison of BAT-UAT and BAT-SHR based on Disk Bandwidth

The batching schemes behave similarly to the buffer management schemes, the main difference being that requests for the same topic are grouped, and hence the overall disk bandwidth requirement is lower in all cases. We also discuss results of some additional tests which are based on the batching parameters, i.e., the tolerable response time and the scheduling period.

7.2.1 Comparison based on Inter Arrival Time

The effect of inter-arrival time on the two batching schemes is shown in figure 9(a). Similar to the behavior we observed in figure 7(a), we see that BAT-SHR requires less disk bandwidth compared to BAT-UAT. Another important point to notice is that the disk bandwidth requirement has almost halved; this indicates that the inter-arrival time has more impact on data sharing through batching than any other parameter.

7.2.2 Comparison based on Buffer Size

It can be observed from figure 9(b) that the disk bandwidth requirement of BAT-SHR is less than that of BAT-UAT for all the buffer sizes we have tested. This differs from our observation in the case of a non-batched system, where UAT performed better than SHR for smaller buffer sizes. Here we have a slight departure from the behavior of figure 7(b), in the sense that the disk bandwidth required for BAT-SHR is always less than that for BAT-UAT, and this can be attributed to batching. We also observe that the overall disk bandwidth requirement has dropped, but not as much as we noticed in the case of the inter-arrival time.


[Figure 9: Effect of Inter Arrival Time and Buffer Size. Panels: (a) Inter-Arrival Time (sec), (b) Buffer Size (MB). Y-axis: Disk Rate (MB/s) for 95% success. Curves: BAT-UAT and BAT-SHR.]

[Figure 10: Effect of Topic Parameters. Panels: (a) Number of Topics, (b) Length of Topic (sec), (c) Data Rate (MB/s). Y-axis: Disk Rate (MB/s) for 95% success. Curves: BAT-UAT and BAT-SHR.]
7.2.3 Comparison based on Topic Parameters

The effects of the topic parameters are shown in figures 10(a), (b) and (c). Here too we observe behavior similar to that of figure 8, and again the disk bandwidth requirement has dropped due to the effect of batching. We also observe the effect of continuous media caching in conjunction with batching in figure 10(a): the disk bandwidth requirement for BAT-SHR does not increase even when the number of topics is increased from 10 to 50, which differs from what we observed for SHR in figure 8(a). Essentially, batching contributes additional data sharing and hence the reduction in the disk bandwidth requirement.


7.2.4 Comparison based on Batching Parameters

The effect of the maximum tolerable response time on the two batching schemes is shown in figure 11(a). Here we observe that the drop in the disk bandwidth requirement is steeper for BAT-SHR than for BAT-UAT. This clearly indicates that continuous media caching in conjunction with batching effectively exploits an increase in the tolerable response time.

Figure 11(b) shows more precisely how continuous media caching increases the effect of sharing. At short scheduling periods BAT-UAT cannot realize the full effect of sharing through batching alone, whereas BAT-SHR achieves it by using continuous media caching in conjunction with batching. Hence the disk bandwidth required for BAT-SHR is almost the same irrespective of the scheduling period, while BAT-UAT requires relatively more disk bandwidth at smaller scheduling periods. As discussed earlier, smaller scheduling periods may be necessary to satisfy the response time requirements; for better performance (less disk bandwidth) at such periods, BAT-SHR scores over BAT-UAT.

[Figure 11: Effect of Batching Parameters. Panels: (a) Tolerable Response Time (sec), (b) Scheduling Period (sec). Y-axis: Disk Rate (MB/s) for 95% success. Curves: BAT-UAT and BAT-SHR.]
� � � � �Figure 11: E�ect of Batching ParametersThus using these tests we have been able to demonstrate that in multimedia information systemswhere sharing can be exploited, schemes can and must be designed such that they make use of thissharing potential based on information about past and future access to perform more e�cient bu�ermanagement.8 E�ect of Fast Forward/Rewind on Continuous Media CachingIn this section we brie y discuss the e�ects of fast forward and rewind while continuous media cachingis used. Support for fast forward and rewind (which are commonly known as VCR functions) is atopic that is being researched [8]. Concrete solutions to this problem have not yet been proposed.19


Our discussion attempts to describe enhancements to support fast forward and rewind in the context of the schemes we have proposed.

Before we proceed, we make a few reasonable assumptions. We assume that fast-forward/rewind will be requested at salient points in a topic; here, a salient point means the beginning of a segment. The user requests could thus be as follows: (1) rewind this topic for a duration (playback duration) of 5 seconds, (2) rewind this topic to the beginning of the current news item, or (3) fast forward this topic to the next news item. This type of fast-forward/rewind (in multiples of small units) can be found in CD players and many multimedia packages. Since we are assuming this type of support, the user may not wish to preview the video while fast-forward or rewind is being performed; hence our second assumption is that the video frames need not be shown at 2X or 3X speeds (by skipping segments) during fast forward or rewind. Since the issues related to fast forward and rewind are similar, we discuss only the support for fast forward in detail; similar ideas can be used for rewind.

Consider the case when a user requests a fast forward. Different courses of action have to be taken depending upon whether this user is a leading user or a lagging user. If this user is not involved in the sharing process (i.e., this user is neither a leading user nor a lagging user), the disk and buffer resources allocated have to be released first. Then, depending upon whether the modified request (i.e., after the fast forward) can be shared with another request, the disk and buffer resources can be allocated appropriately. If the request cannot be allocated due to insufficient resources, it will have to be retried at periodic intervals. This implies that the user pays a price of some delay if a fast forward is requested. One possible optimization in such an environment is to allocate extra resources for a request at admission time (depending upon the maximum data rate of the segments of a topic) so that deallocation and reallocation of resources become unnecessary. If this is a leading user, then the sequence of actions described above for deallocation and reallocation of resources has to be performed for both the leading and the lagging users.

The situation gets more interesting if this is a lagging user. We define a cross-over event as one where a lagging user fast forwards and moves ahead of the leading user, or a leading user rewinds and falls behind the lagging user. If the lagging user does not cross over, the time differential between the lagging and the leading user decreases, and hence fewer buffer resources are required for the lagging user (the duration for which segments must be carried over decreases). With some simple deallocations and reallocations for the lagging user, the excess buffers can be freed for use by others. Since the leading user is unaffected when there is no cross-over, the resource requirements need not be rechecked for the leading user. This is one of the situations where a fast forward request can be advantageous to the continuous media caching schemes. If there are more lagging users following this user, then some resource allocation and deallocation have to be done for those lagging users. If the lagging user does cross over, then resources have to be released and reallocated for both the lagging and the leading user. Some optimization is still possible, in the sense that the lagging user could become the leading user (and vice versa) if the new time differential is small, i.e., less than the max-iat.
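The resource juggling described above can be summarized in a short sketch. It is our own illustration under the stated assumptions; the paper gives no implementation, and Session, release, try_allocate and MAX_IAT (the max-iat threshold) are hypothetical names.

    # Minimal sketch of the bookkeeping for a fast-forward request under
    # continuous media caching (illustrative only, with assumed names).

    MAX_IAT = 30.0  # assumed maximum time differential (sec) for sharing

    class Session:
        def __init__(self, user_id, position, role=None, partner=None):
            self.user_id = user_id
            self.position = position  # current playback point (sec) in topic
            self.role = role          # "leading", "lagging" or None
            self.partner = partner    # the session whose buffers are shared

    def handle_fast_forward(session, jump, release, try_allocate):
        # `release` frees a session's disk/buffer resources; `try_allocate`
        # attempts to (re)admit it and returns True on success.
        session.position += jump
        if session.role is None:
            # Not sharing: release everything, then readmit the modified
            # request (it may now be shareable with another request).
            release(session)
            return try_allocate(session)
        leader = session if session.role == "leading" else session.partner
        lagger = session if session.role == "lagging" else session.partner
        if lagger.position <= leader.position:
            # No cross-over: the time differential changed, so the buffers
            # carried over for the lagging session must be adjusted; a jump
            # by the leading user also forces its own reallocation.
            if session is leader:
                release(leader)
                if not try_allocate(leader):
                    return False
            release(lagger)
            return try_allocate(lagger)
        if lagger.position - leader.position <= MAX_IAT:
            # Cross-over with a small new differential: swap the roles.
            leader.role, lagger.role = "lagging", "leading"
        else:
            # Cross-over too large to keep sharing: dissolve the pairing.
            leader.role = lagger.role = None
            leader.partner = lagger.partner = None
        release(leader)
        release(lagger)
        return try_allocate(leader) and try_allocate(lagger)

If try_allocate fails, the request would be retried at periodic intervals; this retry delay is the price the user pays for a fast forward.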


In summary, fast forward and rewind pose a complex problem, and their implications for continuous media sharing are many. We have discussed some solutions for supporting these VCR functions when continuous media caching is used, as well as some specific cases where these functions can even be beneficial in continuous media caching environments. Given the extensive performance benefits of continuous media caching, it will definitely pay off to use it, even though supporting fast forward and rewind involves some complexity.

9 Extensions to this Work

We mentioned earlier, in section 2, that in the case of NODS sharing is also possible at the news-item level, e.g., when news items are shared by several topics. Though we have not performed explicit tests to investigate this, it would be interesting to determine how much performance can be gained. Another issue to be considered in the context of continuous media caching is how the schemes must change if alternative paths are available at intermediate points of a request (e.g., in sports news items, if extra coverage of the user's favorite teams is needed).

The concept of continuous media caching, and the buffer management schemes that utilize it, are applicable in many other scenarios. A modification of this technique applies to storage management in a multi-level hierarchy, i.e., managing data on a secondary device while it is fetched from a tertiary device; this is one of the areas in multimedia that needs immediate attention. It may not be possible to store all the data required for the different topics offered on a secondary device (disk system), so data might have to be fetched periodically from a tertiary device (digital jukebox library) onto the secondary device. Instead of fetching the data for an entire request from the tertiary device, by sharing data (i.e., preserving data on the disk in a controlled fashion) only a few chunks of data might have to be fetched from the tertiary device to the secondary device. As a result, the transfer time between the tertiary and the secondary device can be reduced considerably. This concept is likely to be useful for video-on-demand services (VODS) when some of the titles shown are in heavy demand. However, VODS typically requires a different type of support for fast-forward and rewind, and hence the methods we discussed earlier for fast-forward and rewind would need further enhancements.

10 Conclusion

The data rates of multimedia data are very high, and the storage system is likely to limit the number of concurrent sessions supported by the system. To make efficient use of data that has already been fetched into the buffer, taking the applications of NODS as our setting, we explored the potential benefits of continuous media sharing in reducing the number of disk I/Os and hence supporting more users. Traditional schemes like UAT do not explicitly consider sharing of data among concurrent users. We effectively utilize advance knowledge of the multimedia stream to be accessed to promote sharing. Using the simple technique of continuous media caching (planned sharing), which preserves buffers in a controlled fashion for use by subsequent users requesting the same data, we proposed enhancements to the UAT scheme to form the SHR scheme. We then used continuous media caching in conjunction with a traditional batching scheme, BAT-UAT, creating BAT-SHR, a new scheme that compounds the effect of sharing.

To compare the performance of our proposed schemes with traditionally used schemes, we conducted a variety of performance tests.


The parameters we used for the tests include the buffer size (system parameter), the mean arrival time (customer parameter), the number of topics, the length of a topic and the data rate of a segment (topic parameters), and the scheduling period and maximum response time (batching parameters). The metric we used is the disk bandwidth required if 95% of the requests are to succeed. We observed that SHR performs very well in all cases, since it uses shared data to the best possible extent: it uses data that may already be available in the buffer when a request arrives, and it applies continuous media caching for future accesses. Specifically, we observed that the disk bandwidth for SHR is low in all cases, verifying that SHR reduces the number of disk I/Os; hence, even at low disk bandwidth, SHR can give good performance. Also, as the data rate increases, SHR degrades more gracefully than the other schemes. Similar observations were made about BAT-SHR, substantiating the fact that continuous media caching can provide significant performance gains over traditional batching schemes. In particular, tests studying the effect of the scheduling period indicated that the disk bandwidth requirement of BAT-SHR stays essentially constant, since the sharing that cannot be achieved through batching is obtained via continuous media caching. Thus, continuous media caching is a simple and elegant technique for enhancing the performance of a system by promoting data sharing. In summary, to increase the number of users supported, higher performance can be achieved at lower cost by adding some extra memory and exploiting data sharing opportunities, instead of enhancing the storage system to provide additional bandwidth (since disk bandwidth is ultimately limited by mechanical movement, the trend towards main-memory systems is increasing, and our schemes concur with this idea).

While our schemes try to utilize the buffer to the maximum extent possible, the traditional schemes (UAT and BAT-UAT) do not aim at maximizing buffer or disk utilization. It would be interesting to compare our schemes (SHR and BAT-SHR) with other schemes that try to maximize disk and buffer utilization. Unless some planned sharing is done, using schemes like continuous media caching, we conjecture that schemes that merely maximize buffer and disk utilization will not necessarily yield a higher percentage of successful requests, which is what matters most. Our schemes can be combined with schemes that maximize disk and buffer utilization to yield even higher performance (by prefetching segments from disk when extra buffer space is available).

While there are no concrete schemes to support fast-forward and rewind, by making a few reasonable assumptions we have discussed how our schemes can be enhanced to support these functions. Thus we have taken an exhaustive look at the important issues that arise when considering continuous media sharing. We also discussed how this concept can be used in another important area of research in multimedia systems: efficient data transfer from tertiary to secondary devices in a multi-level storage hierarchy.

Through this work, we hope to have motivated the significance of buffer management and continuous media sharing in the context of complex multimedia information systems that deal with huge amounts of data. Designers of these systems should exploit the characteristics of the multimedia application whenever possible to get the best performance out of their systems.


References

[1] J. F. Allen, "Maintaining Knowledge about Temporal Intervals", Communications of the ACM, Vol. 26, No. 11, November 1983, pp 832-843.

[2] D. P. Anderson, Y. Osawa and R. Govindan, "A File System for Continuous Media", ACM Transactions on Computer Systems, Vol. 10, No. 4, November 1992, pp 311-337.

[3] D. P. Anderson, "Metascheduling for Continuous Media", ACM Transactions on Computer Systems, Vol. 11, No. 3, August 1993, pp 226-252.

[4] P. B. Berra et al., "Issues in Networking and Data Management of Distributed Multimedia Systems", Proceedings of the Symposium on High Performance Distributed Computing, September 1992.

[5] S. Christodoulakis, "Multimedia Data Base Management: Applications and Problems - A Position Paper", Proc. of ACM SIGMOD, 1985, pp 304-305.

[6] S. Christodoulakis et al., "An Object-Oriented Architecture for Multimedia Information Systems", The Quarterly Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, Vol. 14, No. 3, September 1991, pp 4-15.

[7] A. Dan, D. Sitaram and P. Shahabuddin, "Scheduling Policies for an On-Demand Video Server with Batching", IBM Research Report RC 19381, 1994.

[8] J. K. Dey, J. D. Salehi, J. F. Kurose and D. Towsley, "Providing VCR Capabilities in Large-Scale Video Servers", Research Report, Dept. of Computer Science, Univ. of Massachusetts, March 1994 (submitted for publication).

[9] N. Davies and J. Nicol, "Technological Perspectives on Multimedia Computing", Computer Communications, June 1991.

[10] J. K. Dey, C. H. Shih and M. Kumar, "Storage Subsystem Design in a Large Multimedia Server for High-Speed Network Environments", IS&T/SPIE Symposium on Electronic Imaging Science and Technology, San Jose, February 1994.

[11] W. Effelsberg and T. Haerder, "Principles of Database Buffer Management", ACM Transactions on Database Systems, Vol. 9, No. 4, December 1984, pp 560-595.

[12] L. Felician, "Simulative and Analytical Studies on Performances in Large Multimedia Databases", Information Systems, Vol. 15, No. 4, 1990, pp 417-427.

[13] E. Fox, "Advances in Interactive Digital Multimedia Systems", IEEE Computer, October 1991.

[14] J. Gecsei, The Architecture of Videotex Systems, Prentice-Hall, Englewood Cliffs, NJ, 1983.

[15] S. Gibbs, C. Breiteneder and D. Tsichritzis, "Audio/Video Databases: An Object-Oriented Approach", Proceedings of the 9th International Conference on Data Engineering, Vienna, April 1993, pp 381-390.

[16] J. Gemmel and S. Christodoulakis, "Principles of Delay-Sensitive Multimedia Data Storage and Retrieval", ACM Transactions on Information Systems, Vol. 10, No. 1, January 1992.

[17] S. Ghandeharizadeh et al., "Object Placement in Parallel Hypermedia Systems", Proc. of the 17th VLDB Conference, Barcelona, September 1991, pp 243-254.

[18] J. Huang and J. A. Stankovic, "Buffer Management in Real-Time Databases", Technical Report 90-65, Department of Computer Science, University of Massachusetts, July 1990.


[19] M. Kamath, K. Ramamritham and D. Towsley, "Buffer Management for Continuous Media Sharing in Multimedia Database Systems", Technical Report 94-11, Department of Computer Science, University of Massachusetts, February 1994.

[20] K. Kawagoe, "Continuous Media Data Management", SIGMOD Record, Vol. 20, No. 3, September 1991, pp 74-75.

[21] W. Klas, E. J. Neuhold and M. Schrefl, "Using an Object-Oriented Approach to Model Multimedia Data", Computer Communications, Vol. 13, No. 4, 1990, pp 204-216.

[22] D. Le Gall, "MPEG: A Video Standard for Multimedia Applications", Communications of the ACM, Vol. 34, No. 4, April 1991, pp 46-58.

[23] T. D. C. Little, A. Ghafoor et al., "Multimedia Synchronization", The Quarterly Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, Vol. 14, No. 3, September 1991, pp 26-35.

[24] P. Lougher and D. Shepherd, "The Design of a Storage Server for Continuous Media", The Computer Journal, Vol. 36, No. 1, 1993.

[25] W. Putz and E. Neuhold, "is-News: a Multimedia Information System", The Quarterly Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, Vol. 14, No. 3, September 1991, pp 16-25.

[26] A. J. Smith, "Sequentiality and Prefetching in Database Systems", ACM Transactions on Database Systems, Vol. 3, No. 3, September 1978, pp 223-247.

[27] R. Staehli and J. Walpole, "Constrained-Latency Storage Access", Computer, March 1993, pp 44-53.

[28] R. Steinmetz, "Synchronization Properties in Multimedia Systems", IEEE Journal on Selected Areas in Communications, Vol. 8, No. 3, April 1990, pp 401-412.

[29] P. Venkat Rangan and H. M. Vin, "Efficient Storage Techniques for Digital Continuous Multimedia", IEEE Transactions on Knowledge and Data Engineering, August 1993.

[30] D. Woelk and W. Kim, "Multimedia Information Management in an Object-Oriented System", Proc. of the 13th VLDB Conference, Brighton, 1987, pp 319-329.

[31] C. Yu et al., "Efficient Placement of Audio Data on Optical Disks for Real-Time Applications", Communications of the ACM, Vol. 32, No. 7, July 1989, pp 862-871.

[32] G. K. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley, Reading, MA, 1949.