
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 22, NO. 5, JUNE 2004 889

End-to-End Quality-of-Service Coordination for Mobile Multimedia Applications

Teodora Guenkova-Luy, Andreas J. Kassler, Member, IEEE, and Davide Mandato

Abstract—Transmitting real-time multimedia streams over heterogeneous mobile networks is a challenging task. Variation in network and system conditions can dramatically affect application performance. When providing end-to-end quality-of-service (QoS), multiple system facets should be coordinated: orchestration of local and peer resources, reservation of network resources, adaptation of multimedia streams, etc. This paper presents an end-to-end negotiation protocol (E2ENP) for negotiating and coordinating QoS on an end-to-end basis both at application and network layer. Based on a flexible extensible markup language (XML) model and extending SDPng concepts, the protocol enables the negotiation of system capabilities and allows provider-services to effectively influence the negotiation process. The aim of the E2ENP design is to optimize the efficiency of multimedia call setup and reduce the time for QoS renegotiations, whenever vertical handovers or spontaneous network reconfigurations occur. The basic protocol is presented, together with implementation and measurement results, stemming from several studies on current and future third-generation/fourth-generation scenarios.

Index Terms—End-to-end negotiation protocol (E2ENP), extensible markup language (XML), mobile networks, quality-of-service (QoS), real-time multimedia networking, session description protocol new generation (SDPng), session initiation protocol (SIP), third-generation/fourth-generation (3G/4G).

I. INTRODUCTION

Highly demanding audio/video applications that operate in mobile environments experience frequent quality-of-service (QoS) violations due to packet loss and delay variations. The reasons are mainly signal fluctuations on the radio link, handovers between different wired or wireless network technologies, and network congestion. Ubiquitous utilization of distributed multimedia services in next-generation [third-generation/fourth-generation (3G/4G)] mobile systems involves application of heterogeneous end-devices (e.g., mobile phones, portable PCs, etc.), which vary significantly in their capabilities to support multimedia streaming (i.e., different codecs, memory size, processing units, operating systems, etc.). This paper focuses primarily on the 3G/4G mobile systems and ad hoc networks connected to infrastructure wireless networks. Mobility within provider-owned networks can also affect end-to-end system management, as the subscriber management can restrict network-resource utilization in accordance with preagreed user-provider contracts. Hence, enforcement of end-to-end QoS throughout all layers of a distributed multimedia system is a complex process incorporating multiple tasks.

Manuscript received June 1, 2003; revised December 20, 2003. This work was supported in part in the framework of the IST Project IST-2000-28584 MIND, which is supported in part by the European Union, and in part by the DFG within the AKOM framework.

T. Guenkova-Luy is with the Department of Distributed Systems, University of Ulm, Ulm 89069, Germany (e-mail: [email protected]).

A. J. Kassler is with SCE, Nanyang Technological University, Singapore 639798 (e-mail: [email protected]).

D. Mandato is with WSL-SCLE, Sony International (Europe) GmbH, Stuttgart 70327, Germany (e-mail: [email protected]).

Digital Object Identifier 10.1109/JSAC.2004.826926

Current protocols and mechanisms for supporting QoS mostly cover only single facets of the global QoS management. The real-time protocol suite (RTP/RTCP) [1] is used for multimedia data transfer and QoS feedback, the resource reservation protocol (RSVP) [2] allows network resources to be reserved, the common open policy service (COPS) [3] manages policy exchange, etc. However, QoS is a system aspect that crosses all components and layers of a distributed multimedia system. QoS management should, therefore, incorporate mapping between different classes and types of QoS parameters, orchestrating, respectively, various system facets. An adequate QoS coordination mechanism should in addition consider how the user experiences system performance. QoS management of multimedia sessions should also be handled effectively in situations where vertical handover or resource-availability changes result in QoS violations. In such situations, applications typically have to adapt and reconfigure themselves. Signaling must be very efficient to minimize service disruption in such scenarios.

This paper presents an end-to-end negotiation protocol (E2ENP) for capabilities and QoS. E2ENP uses the session initiation protocol (SIP) [4] to transfer control data. An extensible description model based on extensible markup language (XML) [5] is used to specify system characteristics and QoS parameters based on enhancements for session description protocol new generation (SDPng) [6]. We discuss problems of QoS management and present patterns for applying end-to-end QoS and system-resource coordination. Application- and implementation-specific features of E2ENP are presented. A thorough evaluation of the protocol based on a real implementation is used to analyze the protocol performance. We conclude the paper with a discussion of measurements and our future research.

II. REFERENCE MODEL FOR END-TO-END QoS COORDINATION

In an end-to-end QoS management architecture [7], different roles (e.g., end systems, network providers, etc.) can be identified. Consequently, a set of conceptual models can be derived [8]. A system model identifies relevant QoS-parameter categories. A description model provides a formal definition of system parameters according to the categories associated with the system model. A negotiation model is used for configuration of distributed multimedia systems on an end-to-end basis and considers QoS management roles as identified in [7]. This section presents the system model. The description model and the negotiation model are handled separately in Section III.

0733-8716/04$20.00 © 2004 IEEE

Fig. 1. E2ENP reference model.

The system model distinguishes between applied abstractions of distributed mobile multimedia services. These abstractions incorporate the complete process for achieving QoS on an end-to-end basis. The model helps to identify parameter classes, which are later applied to specify an accurate description and management model for QoS. Our system model (Fig. 1) includes an end-device,¹ which runs an application that communicates with its peer. The application uses the operating system (OS) or an intermediate middleware, which includes scheduling mechanisms for accessing terminal resources (like CPU or network buffer) and mechanisms for accessing the network. The system software running on the network nodes applies buffering and scheduling mechanisms to manage timely access to network resources.

This model naturally leads to the identification of several abstraction levels of QoS and their corresponding parameters, which play a key role in the global QoS management process (see [9]):

End-to-end perceivable QoS parameters: This set of parameters corresponds to the user’s perception of the performance of the distributed application. The translation of the perceivable QoS characteristics into more technical terms is typically implemented inside the application.

Application QoS parameters are used to describe end-to-end application performance in accordance with software and hardware resources of end systems/services. In E2ENP, these parameters are negotiated between the peers for coordinating QoS end-to-end in the form of QoS contracts at application level. Typical parameters for video are frame size, frame rate, and visual quality.

Transport/network QoS parameters are used to describe end-to-end requirements with respect to network resources. The system must derive these values based on actual capabilities/codecs and their specific QoS configurations, media characteristics, and available network-access technology. In E2ENP, these parameters are associated with QoS contracts at transport level.

¹ End-device, end system device, end system, and terminal are synonyms used interchangeably throughout this work.

End system specific resource parameters are memory, CPU, battery power, etc. Such parameters are not negotiated, but may be applied to generate specific end system policies for the end system resource-reservation management. Such policies can be used within the application to generate correct E2ENP payloads, when a multimedia session is opened or when an adaptation condition for the session occurs [10]. The application communicates to its peer only the results of an adaptation condition using E2ENP.

In addition to the QoS relevant parameters, end systems must agree on a common set of input/output configurations (like addresses and ports) and other capabilities (like codecs, RTP packetization rules, etc.) in order to establish a valid end-to-end multimedia session.

Rules and conditions for mapping between end-to-end perceivable QoS and application QoS parameters can be based on psychovisual experiments [11]. Mapping between application QoS parameters and transport/network QoS parameters depends on codecs, codec-configurations, and the media characteristics [11]–[14]. For audio, this mapping is straightforward (a codec together with its parameterization results in network traffic requirements). However, for variable bit-rate video streams, this mapping also depends on the target visual quality and the amount of motion.
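The audio case of this mapping can be illustrated with a short calculation. The following is a minimal sketch, assuming a constant-bit-rate codec carried over RTP/UDP/IPv4 without header compression and ignoring link-layer overhead; the class and function names are illustrative and are not part of E2ENP.

```python
from dataclasses import dataclass

# Assumed per-packet header overhead: IPv4 (20) + UDP (8) + RTP (12) bytes.
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12


@dataclass(frozen=True)
class AudioConfig:
    """Hypothetical application-level parameterization of an audio codec."""
    codec_bitrate_bps: int    # constant codec output rate
    packet_interval_ms: int   # amount of audio carried in one RTP packet


def transport_bandwidth_bps(cfg: AudioConfig) -> float:
    """Derive the transport-level bit rate from the application-level config."""
    packets_per_second = 1000 / cfg.packet_interval_ms
    payload_bytes = cfg.codec_bitrate_bps / 8 / packets_per_second
    return (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * packets_per_second


# A 64 kbit/s codec (e.g., G.711) with 20 ms packetization:
# 160 payload bytes + 40 header bytes, 50 packets per second.
print(transport_bandwidth_bps(AudioConfig(64_000, 20)))  # 80000.0
```

Halving the packet interval doubles the header share, which is why the same codec can yield different transport QoS contracts depending on its packetization.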

III. END-TO-END NEGOTIATION PROTOCOL FOR QoS

A key concept of E2ENP is the possibility to actively negotiate system configurations (i.e., application- and transport-level QoS and system capabilities) for multimedia-session establishment and to dynamically and flexibly leverage this information for QoS-adaptation purposes. Furthermore, these configurations can be repeatedly applied to sessions having similar context (e.g., session parameters can be saved in the form of an advanced address book and then reused). This section presents an overview of E2ENP negotiations; a complete specification of this protocol can be found in [8]. The peer starting a negotiation process is called the “offerer” and the responding peer the “answerer,” analogously to the “offer/answer model” [15].

A. Description Model

There are several standardization efforts concerning developments of description models.

• MPEG-21 within the moving picture experts group (MPEG) deals with XML schemata for describing user environments and terminal characteristics [16].

• International Telecommunication Union (ITU) provides descriptions for terminal characteristics in form of codecs [17] and network parameter specification as network classes definition [18], [19]. Additionally, ITU specifies a namespace for defining picture size for video information [20].

• Multiparty Multimedia Session Control Working Group (MMUSIC WG) [21] develops session description protocol (SDP [22], SDPng [6]) with its enhancements for transport QoS [23]. These protocols define description schemes for specifying terminal characteristics in the form of codecs. In addition, [24] defines profiles to parameterize audio and video codecs. Currently, there is a desired transition from SDP to SDPng [25] to introduce possibilities for negotiation, QoS, security, and other system features.

E2ENP defines its own description model using a hierarchical QoS specification that enables the management of a multimedia system at different abstraction levels (e.g., media stream, groups of streams, media session, media applications).

1) Hierarchical QoS Specification and QoS Adaptation: Mobile multimedia applications manage one or (simultaneously) multiple media types (e.g., audio, video, data). In the latter case, the corresponding media streams can be logically grouped based on various criteria. Applications that serve multiple users or allow handling distinct conversations (e.g., private or business conversations) may aggregate the corresponding streams in one or more sessions. This bundling of streams allows assigning priorities that the services may then conveniently use to select the most appropriate QoS adaptation strategy (e.g., private conversations shall be downgraded to sustain business conversations, in case of limitations in the resource availability). Furthermore, within a session, all of the streams originating from/ending in a terminal might be aggregated in stream associations (e.g., in a game application the background music streams shall be dropped to sustain the quality of the live chat streams of the players). This stream grouping allows assigning priorities among different associations and defining appropriate adaptation strategies within the context of a single session. Hence, the overall QoS specification can be conveniently modeled in a hierarchical manner so as to capture time synchronization, QoS correlation, and resource constraints aspects among streams. A hierarchical model allows designing alternative QoS specifications, which the application needs for efficiently and automatically adapting its resource usage with respect to the current resource availability and user’s expectations.

The model consists of a tree of QoS specifications. These QoS specifications are defined at different levels of abstraction. The root of the tree is associated with a QoS specification that imposes general constraints on the amount of resources used by all of the sessions along with their corresponding streams. A QoS specification associated with a branch node of the tree is named “QoS context”: it represents a high-level application QoS contract. A QoS context contains predefined adaptation rules that steer the adaptive application to choose the most appropriate QoS specification upon given resource usage conditions for a group of media types. A QoS context also allows defining time-synchronization constraints on subordinate QoS specifications (for instance, time synchronization constraints are required in a movie play out among the audio streams and the mouth movements of the actors depicted in the video stream). Furthermore, a QoS context allows defining QoS correlation constraints among media streams that are bundled for application specific reasons. For instance, electronic game applications and/or media-rich interactive applications might feature bundles of audio and video streams, which are associated with objects to be presented to the user. As an example, a rotating cube displayed on a monitor with its faces textured with images from different video streams, might be conveniently associated with a specification imposing constraints (e.g., lower limits) on the amount of the resources shared by the play out of the various streams and distributed to the streams associated with the currently visible faces of the cube. Finally, each leaf of the tree is associated with a QoS specification named “QoS contract”: it represents the application QoS contract for the given stream (which defines a unique QoS configuration of the stream). Each application QoS contract has a one-to-one correspondence with a specific transport QoS contract.

The various QoS specification abstraction levels are organized within the tree as follows (in descending order of levels from the root, see Fig. 2): Session QoS context (defining priorities and quotas), stream association QoS context, stream QoS contract. At each level, siblings represent alternative QoS specifications the application can choose from when provisioning/adapting QoS.

Any given subtree originating from a specific branch node is associated with an adaptation-rule predicate. Resolving this predicate selects a child node and, hence, instructs the system to enforce the QoS contract and/or QoS context associated with that child. For instance (see Fig. 2), “if video parameters V11 are no longer enforceable, switch to video parameters V12” or “if stream association 1 (video and audio) is not supportable (e.g., due to handover and, thus, lower data rate availability), switch to stream association 2 (only audio).” In the latter case, the adaptation process translates into a change of QoS context or, more simply, into a QoS context switch (which recalls the context switch concept as defined in the operating system domain, hence the name QoS context).
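The tree and its adaptation-rule predicates can be sketched in a few lines. This is an illustrative model only: the class names, the bit-rate figures, and the single "available bandwidth" predicate are assumptions made for the example, whereas real E2ENP adaptation rules are defined by the application. Siblings are listed in order of preference, and the first alternative that fits is enforced.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class QoSContract:
    """Leaf node: one QoS configuration of a stream (values are hypothetical)."""
    contract_id: str
    kbps: int


@dataclass(frozen=True)
class Stream:
    name: str
    alternatives: List[QoSContract]  # sibling contracts, most preferred first


@dataclass(frozen=True)
class StreamAssociation:
    """Branch node: a QoS context bundling several streams."""
    assoc_id: str
    streams: List[Stream]


def select(associations: List[StreamAssociation],
           available_kbps: int) -> Optional[Tuple[str, Dict[str, str]]]:
    """Resolve the predicate: pick the first association and contract
    combination whose total bit rate fits the available bandwidth."""
    for assoc in associations:
        for combo in product(*(s.alternatives for s in assoc.streams)):
            if sum(c.kbps for c in combo) <= available_kbps:
                chosen = {s.name: c.contract_id
                          for s, c in zip(assoc.streams, combo)}
                return assoc.assoc_id, chosen
    return None


video = Stream("video", [QoSContract("V11", 384), QoSContract("V12", 128)])
audio = Stream("audio", [QoSContract("A1", 64)])
session = [StreamAssociation("SA1", [video, audio]),  # video and audio
           StreamAssociation("SA2", [audio])]         # only audio

print(select(session, 500))  # ('SA1', {'video': 'V11', 'audio': 'A1'})
print(select(session, 200))  # ('SA1', {'video': 'V12', 'audio': 'A1'})
print(select(session, 100))  # ('SA2', {'audio': 'A1'})
```

The third call corresponds to a QoS context switch: no combination within the first stream association fits, so the sibling association is selected.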

Fig. 2. Example of hierarchical QoS specification.

Fig. 3. E2ENP description model and referencing.

It is completely up to the adaptive application and its business logic to determine when to adapt (e.g., if packet loss ratio is between 0.1% and 0.5% for 2 s), how to adapt (e.g., switch codec or remove video stream) and to what extent to adapt (reduce frame rate from 25 to 10 f/s). E2ENP provides signaling mechanisms so that peers can agree on such adaptation conditions and procedures and prepare resources and capabilities beforehand, based on this tree model. An example that combines adaptive applications and E2ENP to reconfigure the streaming process based on QoS feedback is presented in [10].
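A "when to adapt" condition such as the packet-loss example can be evaluated by the application over its QoS feedback. The sketch below is one possible reading, assuming timestamped loss-ratio samples (for instance derived from RTCP reports); the function name, thresholds, and sample format are illustrative and are not defined by E2ENP.

```python
from typing import Iterable, Tuple


def condition_sustained(samples: Iterable[Tuple[float, float]],
                        low: float = 0.001,
                        high: float = 0.005,
                        hold_s: float = 2.0) -> bool:
    """Return True if the loss ratio stayed within [low, high] for at least
    hold_s seconds. samples are (timestamp_s, loss_ratio) pairs in ascending
    time order."""
    window_start = None
    for timestamp, loss in samples:
        if low <= loss <= high:
            if window_start is None:
                window_start = timestamp
            if timestamp - window_start >= hold_s:
                return True
        else:
            window_start = None  # condition interrupted, start over
    return False


steady = [(0.0, 0.002), (1.0, 0.003), (2.0, 0.004)]
interrupted = [(0.0, 0.002), (1.0, 0.0), (2.0, 0.002), (3.0, 0.002)]
print(condition_sustained(steady))       # True
print(condition_sustained(interrupted))  # False
```

Requiring the condition to hold for a period of time avoids triggering a renegotiation on a single outlier sample.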

The current E2ENP description model supports QoS contexts up to session level. Nevertheless, the design of E2ENP allows definitions of more complex QoS contexts (e.g., prioritization of multimedia sessions within an application or even prioritization of applications on an end system). Further details on the E2ENP hierarchical QoS specification can be found in Section IV-C.

2) Referencing Mechanism for Control Data: In current systems, a complete set of session-configuration information is exchanged every time a session is negotiated or adapted (e.g., offer/answer model with SDP [15]). This scenario is not efficient as the minimization of signaling traffic is crucial. Thus, E2ENP defines an optional referencing mechanism to minimize data exchange. For this purpose, E2ENP distinguishes between service configuration, session configuration, and adaptation indication (Fig. 3) and associates configuration-information with an identifier.

Fig. 4. E2ENP validation procedure.

Each of the contracts/contexts associated with the nodes of the QoS hierarchy is labeled with a specific identifier (e.g., qID, aID, sID; see Fig. 3), which is used by peers to address corresponding contracts during the QoS-negotiation/renegotiation process. By exchanging only identifiers, the QoS-renegotiation traffic is minimized. Using the hierarchical QoS specification, the adaptation process can be addressed at various levels of granularity. Therefore, the E2ENP reference model defines specific description and management for session-control data. E2ENP descriptions provided via this model are applied as configuration and control commands for managing multimedia sessions. Peers can exchange this information using any session-management protocol (e.g., SIP [4], see Section III-B2). Examples on the E2ENP referencing mechanism are also provided in Sections III-B and IV.
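The effect of the referencing mechanism can be illustrated as follows. This is a minimal sketch that uses JSON dictionaries as a stand-in for the XML descriptions of E2ENP; the message layout, the field names, and the `ContractStore` class are assumptions made for the example.

```python
import json
from typing import Dict


class ContractStore:
    """Per-peer table of previously negotiated contracts, keyed by identifier."""

    def __init__(self) -> None:
        self._contracts: Dict[str, dict] = {}

    def register(self, contract_id: str, contract: dict) -> None:
        self._contracts[contract_id] = contract

    def resolve(self, contract_id: str) -> dict:
        return self._contracts[contract_id]


def full_message(contract_id: str, contract: dict) -> str:
    """Negotiation: the complete description travels with its identifier."""
    return json.dumps({"id": contract_id, "contract": contract})


def reference_message(contract_id: str) -> str:
    """Renegotiation: only the identifier of the contract to enforce is sent."""
    return json.dumps({"enforce": contract_id})


def handle_reference(store: ContractStore, message: str) -> dict:
    """Receiver side: look up the contract the identifier refers to."""
    return store.resolve(json.loads(message)["enforce"])


# Hypothetical video contract exchanged once during negotiation.
v12 = {"codec": "H.263", "frame_size": "QCIF", "frame_rate_fps": 10}
peer_store = ContractStore()
peer_store.register("V12", v12)

short = reference_message("V12")
print(handle_reference(peer_store, short) == v12)     # True
print(len(short) < len(full_message("V12", v12)))     # True
```

The saving grows with the size of the description, since the reference message stays constant in length regardless of how detailed the referenced contract is.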

3) General QoS Contract Generation Procedure: In order to produce meaningful information that can be negotiated via E2ENP, a validation procedure is required. The application interacts with its resource management system to derive conditions for session establishment and adaptation. The application uses the E2ENP description model to formally specify proper system configurations and performance constraints. E2ENP uses a validation procedure (Fig. 4) in order to produce enforceable or valid QoS contracts.

Fig. 5. End-to-end negotiation protocol phases.

Basic QoS contracts are generated at the end-devices and express the user’s QoS preferences. By transforming the user’s preferences, the application derives application QoS parameters. These parameters are mapped to the end system capabilities to produce several QoS contract sketches, which indicate theoretically possible QoS support at the local end system. Provider policies can either be applied at the local system or inside the network (during negotiation time), depending on the subscriber-management model [8]. QoS contract sketches that contradict the provider policies (e.g., user-provider contract) are marked as “spare” and cannot be applied within the corresponding network segment. Thus, the valid QoS contracts contain the same QoS information as the QoS contract sketches and the provider restrictions are explicitly visible in the contracts (see also Section III-B2). The basic idea behind this validation process is to convey all technically possible configurations between the negotiating end systems, irrespective of provider constraints, since each end system might change its access network within the duration of a given session (e.g., when performing a handover from a wireless local area network (WLAN) to the universal mobile telecommunications system (UMTS), some of the “spare” contracts might become applicable, whereas some currently enforced ones might no longer be applicable, and are marked as “spare”). The “collapsing algorithm” [6] and “feature matching algorithm” [26] are similar to the E2ENP validation algorithm approaches. However, they do not consider the modular application of constraints at the local system (“technology validation”) and at the network (“policy validation”).
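The two validation steps can be sketched as separate functions. The example below is a simplification under stated assumptions: a contract is reduced to a single bit-rate figure, the end system capability and the provider policy are each modeled as one bandwidth limit, and all names and numbers are hypothetical. The point it shows is that policy validation marks contracts as "spare" instead of removing them.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Contract:
    contract_id: str
    kbps: int
    spare: bool = False  # True: technically possible, but barred by policy


def technology_validation(preferences: List[Contract],
                          max_local_kbps: int) -> List[Contract]:
    """Keep only what the local end system can support (contract sketches)."""
    return [c for c in preferences if c.kbps <= max_local_kbps]


def policy_validation(sketches: List[Contract],
                      provider_cap_kbps: int) -> List[Contract]:
    """Mark sketches that contradict the provider policy as spare."""
    return [replace(c, spare=c.kbps > provider_cap_kbps) for c in sketches]


preferences = [Contract("V11", 384), Contract("V12", 128), Contract("V13", 2000)]
sketches = technology_validation(preferences, max_local_kbps=1000)

# The same sketches validated against two access networks (caps are made up).
on_wlan = policy_validation(sketches, provider_cap_kbps=500)
on_umts = policy_validation(sketches, provider_cap_kbps=200)

print([(c.contract_id, c.spare) for c in on_wlan])  # [('V11', False), ('V12', False)]
print([(c.contract_id, c.spare) for c in on_umts])  # [('V11', True), ('V12', False)]
```

Because spare contracts stay in the list, a handover only requires rerunning the policy step on the existing sketches rather than regenerating and renegotiating them.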

B. Negotiation Model

A candidate protocol for developing a negotiation model for QoS coordination is the session initiation protocol (SIP) [4]. Within the SIP framework several roles are identified, such as end system, proxy, registrar, and location service, which can be enhanced to define provider management [8]. As SIP defines an “envelope” for transporting data, an additional protocol (e.g., SDP, SDPng, etc.) is necessary to define configuration descriptions for session establishment or session adaptation. The “offer/answer model with SDP” [15] (and its enhancements [27]–[29]) specifies a negotiation procedure for session establishment. However, SDP and SDPng neither regard end-to-end QoS as a global system feature, nor consider minimization of signaling traffic. Hence, E2ENP applies the “offer/answer model” only as a principal idea for its negotiation model. In addition, E2ENP optionally considers interactions of the protocol with the resource management subsystem within a QoS-aware multimedia system. Local resource availability is a major limitation when defining supportable QoS configurations for a multimedia service.

1) E2ENP Modes and Phases: E2ENP uses different negotiation modes (push mode, pull mode, and push–pull mode), depending on who is the initiator, who is the responder, and who provides particular initial information. In the push mode, the initiator proposes an initial set of QoS contracts to the responder, whereas in the pull mode, the responder provides the first set of QoS contracts to the initiator. In the push–pull mode, both peers provide QoS and capabilities and try to reach an agreement.

E2ENP consists of three major phases (Fig. 5). The solid lines denote E2ENP signaling; dotted and dashed lines express peer interactions, which are being coordinated via E2ENP but are not part of E2ENP, e.g., “reservation” with RSVP and “media streaming” with RTP. Depending on services and scenarios, different phases can apply different negotiation modes.

Prenegotiation: This phase deals with negotiation of configuration information valid for more than one multimedia session, e.g., service configuration. This information can be leased between end systems. During this phase, terminals exchange information on supported codecs, desirable QoS contracts, etc. Having the prenegotiation information in advance, the end systems can speed up the negotiation process for session establishment.

Negotiation: The information exchanged during this phase deals with configuration and establishment of a specific multimedia session. If the terminals cannot (due to limited memory) or are not allowed to (due to system policies) perform prenegotiations, necessary system configurations can be additionally exchanged within the negotiation phase. Thus, two types of negotiation are defined: a short negotiation, which uses references to data exchanged in a previous prenegotiation phase, and a full negotiation, which contains both service information and specific session configurations. Parameters exchanged during the negotiation phase define the QoS contexts; which media streams should be created for the given session; how these streams are associated with codecs and basic QoS configurations thereof; and what kind of QoS and time-synchronization constraints are applied. QoS contracts and contexts are applied for multimedia session establishment and for any eventual QoS adaptation process that might take place during a multimedia session. Negotiated information might also be leased between end systems.

Renegotiation: This phase deals with enforcement of specific QoS contracts and contexts and the indication of adaptation conditions. If a different QoS contract/context should be enforced due to a resource availability change, the renegotiation phase would be invoked by the application providing the index of the new QoS contract to be enforced. The renegotiation phase can also be used to specify new configurations (e.g., if a dynamic codec download extends the capabilities). A short renegotiation (using only references to previously exchanged data) and a full renegotiation (enhancing the system/session configurations, whenever system upgrades/downgrades occur) are defined. The short renegotiation is, thus, an efficient process, since it uses only references to previously negotiated data, and signaling impact is correspondingly minimized.
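The choice between a short and a full negotiation can be sketched as a simple decision on the offerer side. The code below is illustrative only: the dictionary-based message layout and the function name are assumptions, and lease expiry of prenegotiated data is not modeled.

```python
from typing import Dict, Set


def build_negotiation_message(session_refs: Set[str],
                              service_config: Dict[str, dict],
                              peer_known_ids: Set[str]) -> dict:
    """Use a short negotiation if the peer already holds every referenced
    description from a prenegotiation; otherwise fall back to a full one."""
    if session_refs <= peer_known_ids:
        return {"type": "short", "refs": sorted(session_refs)}
    return {
        "type": "full",
        "refs": sorted(session_refs),
        # Full negotiation carries the service information as well.
        "definitions": {ref: service_config[ref] for ref in sorted(session_refs)},
    }


# Hypothetical service configuration held by the offerer.
service_config = {
    "V11": {"codec": "H.263", "frame_rate_fps": 25},
    "A1": {"codec": "G.711"},
}

print(build_negotiation_message({"V11", "A1"}, service_config, {"V11", "A1"})["type"])  # short
print(build_negotiation_message({"V11", "A1"}, service_config, {"A1"})["type"])         # full
```

The same decision applies to the renegotiation phase, where the short form carries only the identifier of the contract or context to be enforced.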

2) Resource-Management Aspects: In order to support the E2ENP referencing model, E2ENP imposes requirements on the behavior of the network entities interacting with E2ENP. The QoS signaling via E2ENP is out of band and path decoupled (after [30]). Application-level proxies that can interpret this kind of signaling are in a position to manipulate the QoS configurations exchanged by terminals [23]. However, the original, unchanged configurations might be useful for end systems when planning and performing handovers. Additionally, provider rules applied within one subnetwork might not be available and/or legal in another subnetwork when performing a handover. Consequently, network components that are in a position to enforce provider rules should add them to the QoS contracts in such a way that this information is explicitly visible and recognizable as a provider restriction by the negotiating peers. Terminals can thus distinguish between end system constraints and provider rules. The resulting required separation of QoS contracts and provider rules should guarantee successful end-to-end application signaling even when components for enforcing provider policies are involved.

Fig. 6. Example of E2ENP resource coordination at session establishment based on SIP as E2ENP carrier.

E2ENP QoS contracts are used for resource reservation during session establishment and adaptation. E2ENP adopts the “economy principle” [14] to describe the order of reservation processing, i.e., end system local resources should be reserved ahead of network resources, which are the most expensive ones. Resource management within the network may be heterogeneous (e.g., IntServ [2], DiffServ [31], etc.); end systems may not be allowed to perform network resource reservation signaling; or the signaling may be terminated in the network at a gateway. However, it is the task of the end systems to synchronize session setup with optional network resource reservation.
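The reservation order implied by the economy principle can be sketched as follows. This is an illustration under stated assumptions, not part of E2ENP: the function names are hypothetical, and releasing already reserved resources when a later step fails is an assumption added here for completeness.

```python
def make_step(name, succeeds, log):
    """Build a (name, reserve, release) triple that records its actions in `log`."""
    def reserve():
        log.append(f"reserve {name}")
        return succeeds

    def release():
        log.append(f"release {name}")

    return name, reserve, release


def reserve_in_economy_order(steps):
    """Reserve resources in the given order, cheapest (local) first.

    Returns (True, None) on success, or (False, failed_name) after releasing
    everything reserved so far (the rollback is an assumption of this sketch).
    """
    done = []
    for name, reserve, release in steps:
        if reserve():
            done.append(release)
        else:
            for undo in reversed(done):
                undo()
            return False, name
    return True, None


log = []
ok, failed = reserve_in_economy_order([
    make_step("local", True, log),     # end system resources come first
    make_step("network", False, log),  # the most expensive resource comes last
])
```

Reserving the cheap local resources first means that the expensive network reservation is attempted only when the end system is already known to be able to handle the stream.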

Fig. 6 shows an example of the E2ENP resource coordination at session establishment, using SIP as the carrier protocol. The negotiation phase is performed before or at the actual start of a session. Based on the results of previously applied phases, the end systems agree which capabilities and QoS contracts to enforce for the given streams within the session (the XML code of these configurations is shown in Sections IV-B and C). At the completion of this phase, the peers have agreed upon the QoS contracts they are going to enforce for the given streams and the right set of capabilities to apply. Fig. 6 illustrates the push mode, whereby the offerer sends an initial proposal (i.e., an E2ENP configuration offer) that contains its desired capabilities and QoS levels together with the potential stream adaptation paths in a SIP INVITE to the answerer. As the offerer knows its desired QoS levels, it optionally reserves local resources beforehand [i.e., local resource management (LRM)]. The answerer checks the offerer’s proposal and also reserves appropriate resources locally. The answerer replies (183 Session Progress) with a subset of capabilities and QoS parameters (i.e., an E2ENP configuration answer). This information matches both the answerer’s and the offerer’s views about the session establishment and future QoS adaptations. The answerer may also reply with a counterproposal, which may describe additional capabilities and QoS that the answerer would like to apply for the session. This enables the offerer side to upgrade its capabilities dynamically, e.g., by downloading appropriate codecs. On receiving the answerer’s counterproposal, the offerer checks locally whether resources need to be reallocated to match the answerer’s reply.

The offerer sends a PRACK to the answerer to inform it about the start of the network reservation, and the answerer acknowledges this action (i.e., E2ENP command-start-reservation). Network resource reservation proceeds at the transport/network layer using, e.g., RSVP. The next sequence depends on the model used for network resource reservation (sender initiated/receiver initiated). If RSVP is not available end-to-end or is terminated at the network edge, the RSVP PATH message will not be received by the answerer. Therefore, by using UPDATE, the offerer and the answerer inform each other about the state of the network reservation and about the final capability/QoS level to be enforced for the session to be established. Once network resources are reserved, either end-to-end or only partially, the answerer sends a 180 Ringing to the offerer. The call setup is completed by sending a 200 OK for the initial INVITE from the answerer to the offerer. The offerer acknowledges this by sending an ACK.

There may be situations where the local resource reservation fails, e.g., at the answerer after receiving the initial proposal. As the initial proposal may contain alternative QoS levels and alternative capabilities that the offerer would also tolerate (within the adaptation path), the answerer may choose an alternative capability/QoS level. This information must be propagated in the 183 Session Progress so that the offerer is able to relax resource requirements and reconfigure its media stream engine. Network resource reservation may also fail due to insufficient resources. In this case, the offerer would try an alternative QoS level with lower network resource requirements until resources are reserved successfully. The final agreed-upon QoS level must be notified to the answerer within the UPDATE message so that surplus resources can be released. E2ENP itself does not provide messages for resource reservation. E2ENP only coordinates the network resource reservation, which should be accomplished using protocols like RSVP, triggered by the hooks provided through the E2ENP agent (i.e., the reception and sending of command-start-reservation).
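The fallback behavior described for a failed network reservation can be sketched in a few lines of Python. The function name, the contract names, and the bandwidth figures are hypothetical; the sketch only illustrates trying the alternatives of an adaptation path in order until one can be reserved.

```python
def pick_qos_level(adaptation_path, can_reserve):
    """Return the first level on the path that can be reserved, or None."""
    for level in adaptation_path:
        if can_reserve(level):
            return level
    return None


# Hypothetical adaptation path: (contract name, required rate in kbit/s),
# ordered from the nominal level to the least demanding alternative.
path = [("video-contract-1", 512), ("video-contract-2", 256), ("video-contract-3", 128)]

# Assumed network state: only 200 kbit/s can currently be reserved.
available_kbit_s = 200
agreed = pick_qos_level(path, lambda level: level[1] <= available_kbit_s)
```

The level selected this way is what the offerer would report to the answerer in the UPDATE message, so that the answerer can release any resources it had reserved for a more demanding level.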

The renegotiation message sequence is similar to that of negotiation, but has less signaling overhead due to the E2ENP referencing mechanism. Renegotiation can be started by any of the peers detecting a QoS violation; thus, the initial offerer may become an answerer during renegotiation and vice versa.

There may be situations where a service domain entity (e.g., an enhanced SIP proxy) is involved in the negotiation process by intercepting the signaling between the offerer and the answerer [8]. A provider entity may analyze the application QoS parameters and add provider constraints in the form of transport QoS parameters to the QoS offers and answers. Additionally, a provider entity may define restrictions on single transport QoS parameters exchanged between the offerer and the answerer, or may add provider-authorization tokens for resource reservation to the respective offers and answers. The peers can extract and use these tokens to authenticate themselves to network-reservation entities. The peers recognize the provider-specific information because it is added in a separate part of the E2ENP payloads (see beginning of Section III-B2). Thus, the peers distinguish between the information sent end-to-end by the offerer/answerer and the additional constraints/tokens added by the service entities. When intercepting a PRACK message with E2ENP command-start-reservation, the provider entities may start a network resource reservation, providing assistance to those end systems that cannot reserve network resources on their own. The provider processing of the E2ENP payloads (i.e., adding constraints to application or transport parameters or including reservation tokens) depends on the trust models between the end peers and the provider [8].

In addition to the management of end system and provider configuration information, the E2ENP-specific resource coordination includes requirements for data consistency when managing resources, in order to avoid deadlock-like reservation scenarios on the terminals (see [7] and [8]).

IV. E2ENP DESIGN AND EXAMPLES

E2ENP is designed to be network and system independent. It allows system capabilities, QoS contracts, and QoS contexts to be described in a declarative manner. E2ENP does not define its own network transport medium but uses an already available carrier protocol, e.g., SIP [4]. Nevertheless, E2ENP defines its own addressing scheme to uniquely identify E2ENP sessions and communicating peers at application level. The addressing syntax can be mapped to any IP-based session management protocol [8]. E2ENP summarizes several specifications concerning multimedia system configurations. E2ENP directly adopts these definitions without unifying the metric units (i.e., time units are in seconds or milliseconds, information units in bits or bytes, etc.) and the naming schemes applied there (see also Section III-A). A concise namespace for defining identifiers for single contracts, stream associations, and higher level associations has to be standardized to produce unique identifiers, since the applications using E2ENP need to precisely understand these identifiers. The E2ENP specification differs to some extent from SDPng, because E2ENP adapts system capability descriptions to the needs of QoS application and delivery. Despite these differences, E2ENP is still applicable as an extension of SDPng, due to the extensibility of the XML-Schema definitions [32].

The following examples show XML code snippets of the major elements used during the E2ENP phases. The correctness of the examples has been verified using a validating XML parser [33], [34]. We have used a scenario in which a session contains an audio and a video stream. The application can select between two audio codecs (GSM, G.722) and one video codec (MJPEG). The application can adapt frame size, frame rate, and video quality (by tuning the quantization parameter of the codec), resulting in different bandwidth requirements (see Table I). In our example, there are six possible combinations from which the application can select. E2ENP provides the mechanisms to signal the codecs, the application QoS contracts, the network QoS contracts, and the QoS contexts. It is up to the application to decide when to switch from step m to step n (see [13] for an example).

Some of the major E2ENP parent elements do not carry the “e2enp” prefix, since they stem from different E2ENP XML namespace definitions [6] and [23]. Different E2ENP elements are shown in uncompressed form for better readability. Future versions of E2ENP will consider partial or complete E2ENP compression, e.g., using hashes.

TABLE I
ADAPTATION “QUALITY STEPS” OF THE APPLICATION

Fig. 7. PDU header.

A. PDU Header

The “e2enp:purpose” section (Fig. 7) is the header of the E2ENP protocol data unit (PDU). It appears at the beginning of every PDU exchanged at prenegotiation, negotiation, or renegotiation. The “e2enp:session” section defines the owner/originator of the E2ENP PDU (following the owner/originator specification of [6] and [22]). If available, “e2enp:expires” states the relative lease time of this PDU in seconds. The absolute time point defining the start of the lease is provided using the time/date tokens of the carrier protocol applied, e.g., SIP [4]. The example in Fig. 7 shows an “e2enp:purpose” section for a prenegotiation, which refers to another PDU identified through the “e2enp:use” section, i.e., the current session refers to a previous session. The structure of the E2ENP PDU identification and the identification of the referred PDU is the same, i.e., the “e2enp:session.” The “e2enp:description” identifies the PDU type, the negotiation phase, and the negotiation mode. Additionally, the “e2enp:description” is used to provide reservation-coordination commands for the start and end of the network reservation [8].
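How a relative lease time combines with an absolute start time can be sketched as follows. The function names are hypothetical, and the start time and lease value are made-up inputs; in practice the start time would come from the carrier protocol and the lease time from “e2enp:expires.”

```python
from datetime import datetime, timedelta, timezone


def lease_expiry(lease_start: datetime, expires_seconds: int) -> datetime:
    """Absolute end of the lease: carrier-protocol start time plus relative lease."""
    return lease_start + timedelta(seconds=expires_seconds)


def lease_is_valid(lease_start: datetime, expires_seconds: int, now: datetime) -> bool:
    """True while leased (pre)negotiation data may still be referenced."""
    return now < lease_expiry(lease_start, expires_seconds)


# Made-up example values: lease starts at 12:00 UTC and lasts one hour.
start = datetime(2004, 6, 1, 12, 0, 0, tzinfo=timezone.utc)
expiry = lease_expiry(start, 3600)
```

A peer receiving a short negotiation or renegotiation could use such a check to decide whether the referenced data is still usable or whether a full exchange is needed.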

B. Capabilities and QoS Contracts

Fig. 8 shows a definition of system capabilities in the form of codecs according to the RTP-profile definition [24]. The two major sections “e2enp:audio-codec” and “e2enp:video-codec” specify the supportable codecs with their names and applicability scope. The “rtp:pt” element specifies the RTP transport for the codecs, referencing the descriptions already defined with “e2enp:audio-codec” and “e2enp:video-codec.” Dynamic RTP payload types are characterized through the definition of a parameterization subelement of “rtp:pt.”

Fig. 8. Capabilities.

The aim of the separate mapping of the codec definitions to the payload types is to optimize the PDU processing. The application can first check whether a given codec is supported; if the codec is not supported, the respective serialization format (in this case, the RTP payload types) can simply be ignored. Additionally, for some codecs it is possible to associate dynamic payload types, which is in fact complementary information to the codec and is, therefore, presented separately from the codec itself. The XML parameter definition for dynamic payload types is a feature supported in E2ENP, but not in SDPng.
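The processing shortcut described above can be sketched as follows. The dictionary layout, the field names (`encoding`, `codec_ref`, `pt`), and the codec names are assumptions made for illustration and do not reproduce the XML of Fig. 8; the payload-type numbers are the static RTP assignments for GSM, G.722, and JPEG as far as they are known to the editor.

```python
def usable_payload_types(codecs, payload_types, supported):
    """Keep payload-type entries only for codecs that the local system supports."""
    known = {c["name"] for c in codecs if c["encoding"] in supported}
    # Entries that reference an unsupported codec are skipped without further parsing.
    return [pt for pt in payload_types if pt["codec_ref"] in known]


# Hypothetical, simplified view of a capability description.
codecs = [
    {"name": "audio-gsm", "encoding": "GSM"},
    {"name": "audio-g722", "encoding": "G.722"},
    {"name": "video-mjpeg", "encoding": "MJPEG"},
]
payload_types = [
    {"codec_ref": "audio-gsm", "pt": 3},
    {"codec_ref": "audio-g722", "pt": 9},
    {"codec_ref": "video-mjpeg", "pt": 26},
]

# Assumed local capability: this terminal has no G.722 codec.
usable = usable_payload_types(codecs, payload_types, supported={"GSM", "MJPEG"})
```

Because the payload types only reference the codec descriptions, an unsupported codec removes its transport entries from consideration with a single set lookup.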

Fig. 9 shows the structure of the QoS contracts at application level (i.e., audio and video QoS contracts) and at transport level. Within the audio and the video QoS contracts, the “e2enp:contract” element describes the user’s QoS preferences as application QoS parameters. Inside the audio contracts, the parameters “sampling-rate-set” (sampling rate in Hertz) and “channel-set” (e.g., mono, stereo, etc.) are according to [24]. Within the video contracts, the meaning of the parameters is the same as for the codec parameterization (Fig. 8), i.e., “e2enp:video-codec,” where “frame-rate-range” defines the number of frames per second [frame/s], “frame-size-set” defines the displayable frame sizes (see [20]), and “color-quality-range” and “overall-quality-range” are percentage values, where 1% “color-quality” or “overall-quality” is equal to 100 units of the corresponding descriptions. The mapping between the QoS definitions and the codecs to form a QoS contract is done using “rtp:map.” The QoS contract is identified through the attribute “name” within the “e2enp:contract” element. The definition of the transport QoS is adopted from [23]. The mapping between the QoS contracts and transport definitions is achieved at session establishment time, since only then is it known how many streams will be opened for the respective media session and which network resources are necessary for these streams. The provider restrictions on the QoS contracts are expressed with an additional element associated with the “e2enp:contract” [8]. The capabilities (Fig. 8) and the QoS contracts (Fig. 9) are usually exchanged between end systems during a prenegotiation (see Section III-B1).

Fig. 9. QoS contracts.

C. Media Session Configuration—QoS Contexts

Fig. 10 shows the RTP-specific stream configuration. This description is partially adopted from SDPng [6]. The “component” element is used to define a single media stream, e.g., Fig. 10 shows the definition of one audio and one video stream. The “alt” element specifies an alternative configuration of the respective media stream. Fig. 11 shows the logical configuration of the stream association QoS contexts, expressing different adaptation possibilities. A QoS context (the “e2enp:context” element) is defined (as described in Section III-A1) as the association of adaptation paths (the “e2enp:adapath” element), which describe the nominal and alternative QoS specifications to enforce during adaptation, and the applicable time-synchronization and/or QoS correlation constraints (the “e2enp:constraints” element). As defined in Section III-A2, the audio and the video configurations are identified and referenced in this example by their names. For example, the audio information described in the RTP-specific stream configuration (Fig. 10) is referenced via its name “audio-stream-1,” and a video QoS contract defined in “e2enp:qos-contracts for=“video”” (Fig. 9) is identified by its name “video-contract-1.” For the purpose of referencing already defined components, attributes whose names begin with “ref_” (e.g., “ref_component,” “ref_contract,” etc.) are used.

Fig. 10. RTP-specific stream configuration.

The “e2enp:adapath” element (Fig. 11) defines QoS adaptation rules for single streams, stream groups, and multimedia sessions as a whole. The “e2enp:alt” element specifies possible alternative QoS contracts used within the adaptation process. The first “e2enp:alt” element in a group identifies the nominal contract that the end systems should apply first to reserve resources at multimedia-session establishment. Stream grouping and synchronization rules (see Section III-A1) for an association of media streams are defined via the “e2enp:context” element. For example, Fig. 11 shows two possible QoS contexts that can be applied for a multimedia session: “association1-1,” with an audio and a video stream, and “association1-2,” with only an audio stream.

Fig. 11. QoS contexts and QoS adaptation.
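The rule that the first alternative in an adaptation path is the nominal contract can be illustrated with a small Python model. The nested-dictionary layout and the names “video-stream-1,” “audio-contract-2,” and “video-contract-2” are assumptions made for illustration; only the context names, “audio-stream-1,” and “video-contract-1” appear in the text.

```python
# Hypothetical in-memory model: context name -> stream name -> ordered alternatives.
contexts = {
    "association1-1": {
        "audio-stream-1": ["audio-contract-1", "audio-contract-2"],
        "video-stream-1": ["video-contract-1", "video-contract-2"],
    },
    "association1-2": {
        "audio-stream-1": ["audio-contract-1", "audio-contract-2"],
    },
}


def nominal_contracts(context):
    """Pick the first alternative of every adaptation path as the nominal contract."""
    return {stream: alternatives[0] for stream, alternatives in context.items()}


nominal = nominal_contracts(contexts["association1-1"])
```

Keeping the alternatives as an ordered list preserves the adaptation order, so later alternatives can be tried in sequence when the nominal contract cannot be enforced.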

The examples in Figs. 10 and 11 correspond to the information exchanged between the end systems during a short negotiation (Section III-B1).

D. QoS Adaptation

E2ENP refers to already negotiated data in order to perform QoS adaptation at renegotiation time (Fig. 12). The “e2enp:enforce” element is used to point at the new QoS contract to be applied, and the “e2enp:block” element defines which currently used QoS contract should be blocked. In this case, the stream representation of “audio1” changes between the currently active “audio-contract-1” and the alternative configuration “audio-contract-2.” The “e2enp:enforce” and “e2enp:block” elements can also be applied for enforcing/blocking QoS contexts, e.g., if an association of an audio and a video stream is no longer applicable, this association can be blocked and an association with a single audio stream can be enforced [8]. The “e2enp:target” element defines the abstraction level of the QoS information (see also Section III-A1).

Fig. 12. Enforcing and blocking of QoS contracts.

In this case, the referencing mechanism (Section III-A2) is applied via the “path” attribute and identifies a complete branch of a hierarchical QoS definition (Section III-A1) produced during the prenegotiation (see Figs. 8 and 9) and the negotiation phases (see Figs. 10 and 11). The example in Fig. 12 corresponds to information exchanged between end systems during a short renegotiation phase (Section III-B1). The structure of a short E2ENP renegotiation is always fixed, irrespective of how many alternative audio/video streams and associations of streams are defined. In a short renegotiation, the peers use only block/enforce elements to point at the respective change. A complete renegotiation PDU is formed from contents similar to those shown in Figs. 7 and 12. Compared with a prenegotiation PDU (a combination of the contents of Figs. 7–9) or a negotiation PDU (a combination of the contents of Figs. 7, 10, and 11), the short renegotiation is therefore very compact and leads to efficient signaling. In Section V-D, we analyze the performance gain when using the short renegotiation phase.
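The fixed shape of a short renegotiation can be sketched with Python's standard `xml.etree.ElementTree` module. The element names `block` and `enforce` and the `path` attribute follow the text; the namespace URI, the wrapper element `renegotiation`, and the path syntax are hypothetical and are not taken from Fig. 12.

```python
import xml.etree.ElementTree as ET

NS = "urn:example:e2enp"            # hypothetical namespace URI
ET.register_namespace("e2enp", NS)  # serialize elements with the "e2enp" prefix


def build_short_renegotiation(block_path: str, enforce_path: str) -> str:
    """Build a payload with exactly one block and one enforce reference."""
    root = ET.Element(f"{{{NS}}}renegotiation")  # hypothetical wrapper element
    ET.SubElement(root, f"{{{NS}}}block", {"path": block_path})
    ET.SubElement(root, f"{{{NS}}}enforce", {"path": enforce_path})
    return ET.tostring(root, encoding="unicode")


# Hypothetical path syntax: stream name followed by contract name.
pdu = build_short_renegotiation("audio1/audio-contract-1", "audio1/audio-contract-2")
```

However many contracts and contexts were negotiated earlier, this payload always contains the same two reference elements, which is why its size stays constant.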

V. E2ENP CONCEPT VALIDATION

This section presents the E2ENP UA architecture and the applied test environment, developed for validating the E2ENP concept, as well as the results of this validation.

A. E2ENP User Agent Architecture

Fig. 13 shows the architecture of the E2ENP user agent (E2ENP UA). The complete implementation of the E2ENP UA is based on Java 1.4.1 [35] and consists of the following entities.

• E2ENP finite-state machine (FSM): Each instance of such an FSM corresponds to an E2ENP control session, which can be created by the application. The E2ENP UA uses the FSMs to regulate the access of multiple E2ENP control sessions to the carrier protocol (e.g., SIP) stack. The E2ENP UA controls the consistency of the E2ENP information, i.e., the E2ENP UA keeps track of mutually dependent E2ENP sessions by saving and verifying the information of the E2ENP PDU headers.

Fig. 13. E2ENP user agent architecture.

• Description language translator: uses a SAX XML parser [33], [34] for processing the XML code of the E2ENP PDUs, where:

i) parser: translates E2ENP payloads into a system-internal representation;

ii) generator: translates system-internal objects into E2ENP XML descriptions.

• Carrier protocol stack: for sending/receiving control messages. The implementations presented in this paper are a SIP surrogate using RMI [36] (implementation of the University of Ulm) and a SIP stack [Sony International (Europe) GmbH solution] based on Java, developed according to [4].

B. Test Goals

The major goals are: to validate the E2ENP logic; to optimize the processing through tuning the structure of the E2ENP PDUs; to measure the E2ENP round-trip times within different networks for the TCP-based RMI and the UDP-based SIP; to investigate the reaction time of the E2ENP UA, in order to check whether the session establishment/management times are acceptable from the user’s point of view; and to perform comparative measurements with a similar control protocol (e.g., SDPng). As an evaluation criterion, it was considered that media-session establishment as a whole should last no longer than 2–5 s (prenegotiation, negotiation) and that adaptation within the duration of a multimedia session should last no longer than 1 s (renegotiation) to fulfill the user’s expectation of an almost immediate reaction. This evaluation criterion reflects the fact that a user would rather accept longer session establishment times than an adaptation that lasts very long during a running multimedia session. Additionally, long-lasting adaptation processing and signaling might cause disturbances in the perceived media, which is also not acceptable for the user [37].

TABLE II
PARAMETERS OF THE USED PCS

TABLE III
EMULATED NETWORK TYPES WITH NISTNET

C. Test Environment

The equipment used for the E2ENP tests is shown in Table II. Table III displays the network types considered for the measurements. The network measurements were performed using the NISTNET [38] network emulator, and the parameters and values given in Table III correspond to the parameter inputs in NISTNET. In Table III, “delay” corresponds to the one-way access-network delay between a terminal and the first access router.

D. Test Results

For the performance measurements of the E2ENP UA components, the end-to-end application round-trip times (Section V-D1), and for the comparison of E2ENP with SDPng (Section V-D2), the following scenario is applied:

The multimedia session to be established and later adapted consists of one unidirectional audio stream and one unidirectional video stream. No more than three alternative codecs and no more than three QoS contracts per stream are used. The streams need to be synchronized, forming two different QoS contexts.

More complex scenarios would result in larger payloads. The influence of the payload size on E2ENP processing and end-to-end signaling is discussed in Section V-D1.

1) E2ENP Components and Processing: Measurements for evaluating the parser and the generator were performed on a single PC (Table II, first entry); the performance of these components depends on the size of the negotiated information. The basic unit used when processing the XML payloads is an “XML line,” as the SAX parser [33], [34] processes the XML code sequentially, generating a single event per XML element.
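The event-per-element behavior of a SAX parser can be illustrated with Python's standard `xml.sax` module. This is a sketch only: the paper's implementation is in Java, and the sample document with its element names is made up.

```python
import xml.sax


class ElementCounter(xml.sax.ContentHandler):
    """Counts start-element events, one per XML element encountered."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def count_elements(xml_text: str) -> int:
    """Parse the text sequentially and return the number of element events."""
    handler = ElementCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.count


# Made-up sample payload with one root element and three children.
sample = '<pdu><purpose/><contract name="a"/><contract name="b"/></pdu>'
events = count_elements(sample)
```

Because the parser walks the document once and fires one event per element, processing cost in such a design grows with the number of elements in the payload.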

The initial PDU size for an E2ENP payload is 100 lines (i.e., 4.26 kB of XML code). For investigating the parser/generator performance, we varied the number of input lines. As shown in Table IV, the XML processing time of the parser/generator increases linearly with the PDU size. It should be noted that the E2ENP prenegotiation/negotiation payload size varies with the number of streams, QoS contracts, and contexts. However, the size of a short renegotiation is always fixed (Fig. 12) due to the hierarchical QoS specification. Therefore, if a QoS renegotiation is at stream level, one stream-level QoS contract is blocked and another is enforced; if a QoS change is at stream-association level, a QoS context is blocked and another is enforced; etc. Irrespective of the level at which a QoS change occurs, a short renegotiation expresses this within only two XML elements and within two reference lines (see the “path” definition in Fig. 12).

TABLE IV
MEASURED PARSER AND GENERATOR SCALING BEHAVIOR FOR PROCESSING A SINGLE E2ENP PDU

Fig. 14. Measured E2ENP roundtrip times depending on the network characteristics.

As the size of a PDU influences the end-to-end signaling delay (Fig. 14), we fixed the XML file sizes in the performance tests (see the scenario at the beginning of Section V-D). The applied uncompressed XML files are 4.26 kB for the prenegotiation, 3.23 kB for the short negotiation, and 1.28 kB for a short renegotiation, which references two stream-level QoS contracts, one for blocking and one for enforcing. A single PDU (prenegotiation, negotiation, or renegotiation) is exchanged once in the offerer-to-answerer direction and once in the answerer-to-offerer direction. As we were interested in the performance of E2ENP and the user agents, the application did not change the contents of the PDU to include QoS counteroffers when receiving it from the user agent. Thus, the following examples show the pure E2ENP application-to-application signaling overhead.
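Why the PDU size matters on slow links can be seen from a first-order estimate of the one-way transfer time, namely the link delay plus the serialization time of the payload. The sketch below is a rough model under stated assumptions: it ignores protocol headers, retransmissions, and processing time, assumes 1 kB = 1000 bytes, and uses a made-up link of 64 kbit/s with 100 ms delay rather than any value from Table III.

```python
def one_way_time(size_bytes: float, rate_bit_per_s: float, delay_s: float) -> float:
    """First-order estimate: link delay plus time to serialize the payload."""
    return delay_s + (size_bytes * 8) / rate_bit_per_s


# Payload sizes from the text, assuming 1 kB = 1000 bytes.
PDU_SIZES = {
    "prenegotiation": 4260,
    "short negotiation": 3230,
    "short renegotiation": 1280,
}

# Made-up link parameters for illustration only.
RATE = 64_000   # bit/s
DELAY = 0.1     # s

estimates = {phase: one_way_time(size, RATE, DELAY) for phase, size in PDU_SIZES.items()}
```

On such a link the serialization term dominates, so the smaller short renegotiation payload translates directly into a shorter transfer time; on fast links the delay term and the number of messages matter more.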

We used a test bed including two end devices in the offerer and answerer roles and one PC as a router (Table II), which emulates the network performance with NISTNET according to Table III. We varied the available data rates and introduced delay to measure the E2ENP end-to-end signaling delay based on SIP and RMI carrying the E2ENP XML files. Our measurements (Fig. 14) show that, due to the UDP-based transport using explicit synchronization within the SIP code, the SIP-based E2ENP performs better in low-bandwidth networks. If a higher data rate is available, the complexity of the explicit SIP synchronization increases the impact of the peer performance; thus, the end-to-end application signaling is slower compared with the RMI-based implementation. Within high-bandwidth networks, the RMI-based SIP surrogate shows better results, since the transport synchronization is done by TCP. Different E2ENP phases result in different numbers of messages. When using SIP with E2ENP in the push mode, a prenegotiation consists of two messages; the negotiation/renegotiation consists of 11 messages for both QoS-information exchange and network resource-reservation coordination [8]. The time necessary for a prenegotiation, negotiation, or renegotiation depends on the applied SIP transaction. E2ENP uses the OPTIONS method to perform prenegotiation (i.e., a non-INVITE server transaction [4]), compared with the INVITE transaction used for negotiation/renegotiation. The OPTIONS method requires longer synchronization in order to reliably exchange the control data within only two messages, compared with the INVITE three-way handshake [4]. Additionally, the prenegotiation shows the largest PDU size. An analysis of the scenario (beginning of Section V-D) shows that the renegotiation has up to 50% less signaling data compared with the negotiation procedures for establishing a multimedia session (prenegotiation, negotiation).

TABLE V
MEASURED E2ENP UA FSM PROCESSING TIMES PER E2ENP PHASE

The size of a PDU does not influence the FSM performance, since, following the design rules (Section III-B1), the E2ENP UA can cope with either short or full negotiation/renegotiation. The FSM performance depends only on the corresponding call synchronization (Section V-A).

The following measurements were taken on a single PC (Table II, first entry) for both SIP and the RMI-based SIP surrogate. Table V shows the response times of the E2ENP FSMs for the offerer and the answerer. The E2ENP UA uses two different adaptation layers to connect the E2ENP FSMs, one to SIP and another to RMI. In our implementation, the E2ENP FSM initially waits for binding to the parser/generator every time an initial call to the description language translator is executed. This results in a longer prenegotiation time for both SIP and RMI due to the time needed to initialize the XML SAX parser (Table V, offerer side). In contrast, a binding procedure to the carrier protocol stack is performed every time a remote invocation takes place when using the RMI-based implementation. This results in longer response times for the RMI-based SIP surrogate (Table V, offerer side). At the answerer side, the RMI and SIP implementations are comparable, which verifies the stability of the E2ENP UA FSM, since the major amount of negotiation processing takes place at the answerer side. Future versions of the E2ENP UA will decouple the XML parser and carrier protocol stack binding from the negotiation process itself. This would result in a faster prenegotiation phase.

2) E2ENP and SDPng Comparative Measurements: We were interested in comparing our solution (especially using the short renegotiation) with the state of the art [6], [15]. Since E2ENP and SDPng differ in their description models, there is one specific scenario in which the two technologies can be compared. SDPng does not differentiate in its description model between negotiation and renegotiation, has no prenegotiation phase, and does not support short phases. Additionally, SDPng does not have hierarchical QoS specification mechanisms for defining hierarchical session descriptions. It is also not possible to define PDU headers. Furthermore, the basic SDPng description model [6] does not contain elements for QoS codec parameterization and for transport QoS. The proposal for SDPng transport QoS extensions [23] has so far not been adopted within the SDPng description scheme. Due to these differences, a standard SDPng negotiation and an SDPng renegotiation have been implemented and tested in accordance with the “offer/answer model” [15]. The performance of this implementation has then been compared (using the same SIP stack) with a full E2ENP negotiation (prenegotiation data included in the negotiation phase but no prenegotiation phase carried out) and with a short E2ENP renegotiation. Our SIP/SDPng prototype can use both the RMI-based SIP surrogate and the SIP stack described in Section V-A. Thus, a consistent framework for comparing the two technologies is provided. The session information is always exchanged (E2ENP and SIP/SDPng) as an uncompressed XML file. We observed the following sizes of exchanged session configurations in one direction (i.e., an offer or an answer): 4.96 kB for the SDPng negotiation and renegotiation, 6.65 kB for the full E2ENP negotiation, and 1.28 kB for the short E2ENP renegotiation. The information within the SDPng descriptions corresponds to the following:

• audio and video alternative codec descriptions for media streams according to [24];

• association of the defined streams with RTP transport parameters, common to those defined in [1].

The E2ENP full negotiation contains information about the following:

• PDU header (see Fig. 7);

• audio and video alternative codec descriptions for media streams according to [24] (see Fig. 8);

• parameterization of video codec descriptions (see Figs. 8 and 9) in the form of QoS parameters at application level;

• definition of transport QoS parameters (see Fig. 9);

• association of streams with RTP transport parameters, common to those defined in [1] (see Fig. 10);

• definition of hierarchical QoS specification (see Fig. 11).

The short renegotiation PDU contains information similar to Figs. 7 and 12.

TABLE VI
E2ENP AND SDPNG COMPARATIVE MEASUREMENTS

The results of the SDPng and E2ENP application-to-application roundtrip times are shown in Table VI and denote the total time needed to establish and (re)negotiate QoS for a multimedia conference. These measurements were taken on a single PC (Table II, first entry) and thus approximately correspond to LAN behavior. The session setup times for E2ENP and SDPng show similar behavior, with E2ENP slightly slower (6%) due to the larger amount of information exchanged, which stems from the overhead of the hierarchical QoS model. In contrast, the average response of E2ENP for the renegotiation phase is 240 ms (or 57%) faster than SDPng for our simple test scenario. In more complex scenarios, the E2ENP short renegotiation is even more efficient, as the size of the E2ENP renegotiation payload is tightly bounded due to the hierarchical QoS specification, whereas the size of SDPng renegotiations varies with the complexity of the described application/session scenario. It should also be noted that E2ENP points directly at the changed configuration data (i.e., enforce contract X, block contract Y) whenever adaptations are performed. In contrast, SDPng renegotiations must be compared with the initially exchanged data to recognize the change.
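The last point can be made concrete with a small Python sketch contrasting the two styles. The dictionaries and the function name are hypothetical; they do not reproduce SDPng or E2ENP syntax and only illustrate the extra comparison step needed when a complete description is resent.

```python
def changed_keys(initial: dict, updated: dict) -> list:
    """Find what changed by comparing a full description with the initial one."""
    keys = set(initial) | set(updated)
    return sorted(k for k in keys if initial.get(k) != updated.get(k))


# Full-description style: the receiver must diff the complete session data.
initial_session = {"audio": "G.722", "video": "MJPEG", "frame-rate": 25}
updated_session = {"audio": "GSM", "video": "MJPEG", "frame-rate": 25}
detected = changed_keys(initial_session, updated_session)

# Reference style: the message itself names the change, so no diff is needed.
short_message = {"block": "audio-contract-1", "enforce": "audio-contract-2"}
```

In the full-description style the comparison cost grows with the size of the session description, whereas the reference style identifies the change directly, whatever the size of the session.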

VI. RESULTS AND DISCUSSION

The results of the E2ENP performance measurements are in accordance with the defined goals (Section V). The comparison and the analysis of the SIP and RMI-based SIP-surrogate show that further improvements of the initialization procedures and the performance of the carrier protocol stack are necessary to optimize the overall E2ENP performance. Additionally, a workaround for the SAX parser initialization procedure is required to avoid delays at initial negotiation processing of the E2ENP UA. Furthermore, E2ENP UA initialization delays at the very first run of the JVM on a PC were observed. Repeated measurements showed that this is a problem of the JVM itself, which should be further investigated. From the measurements, we conclude that E2ENP can be used to control multimedia services in low-bandwidth networks. The overhead of the hierarchical QoS descriptions is low compared with the flexibility gain and the improved renegotiation performance. This is important because users tolerate a slightly longer conference setup as long as disruption is minimal whenever the streaming process must be adapted and reconfigured. Such adaptation will definitely be required in fourth-generation (4G) networks, where terminals will have multiple air interfaces and handover from 802.11-based WLANs to cellular networks might occur. If the bandwidth availability of such heterogeneous networks differs significantly, applications must adapt or suffer severe packet loss.
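One conceivable form of the parser workaround mentioned above is to parse a dummy document at startup, so that one-time initialization costs are paid before the first real negotiation arrives. The following sketch shows only the pattern. It uses Python's standard `xml.sax` module rather than the Java parser of the measured prototype, and the handler `ContractNames`, the helper names, and the payload element names are hypothetical. Whether such a warm-up removes the delay observed with the JVM is an assumption that would have to be verified by measurement.

```python
import xml.sax


class ContractNames(xml.sax.ContentHandler):
    """Hypothetical handler collecting the names of <contract> elements."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        if name == "contract":
            self.names.append(attrs.get("name"))


def warm_up_parser() -> None:
    """Parse a trivial document once so parser setup happens at startup."""
    xml.sax.parseString(b"<warmup/>", xml.sax.ContentHandler())


def parse_contract_names(payload: bytes) -> list:
    """Parse a payload and return the contract names it references."""
    handler = ContractNames()
    xml.sax.parseString(payload, handler)
    return handler.names


# Called once during user-agent startup, before any negotiation.
warm_up_parser()

# Later, the first real payload no longer triggers the parser setup.
names = parse_contract_names(
    b'<e2enp><contract name="high"/><contract name="low"/></e2enp>'
)
```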

E2ENP’s effectiveness, in comparison with other similar protocols, should further be investigated. The authors will integrate the E2ENP UA (Section V-A) in a QoS broker architecture similar to BRENTA [8] and test it in a complete adaptive multimedia-conferencing scenario together with real-time streaming, provider policy management, and network reservations. The E2ENP solution shows scalability especially in the case of QoS adaptation (renegotiation). The performance can be further improved by applying compression techniques to E2ENP payloads.
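As an illustration of the compression idea, the sketch below applies general-purpose DEFLATE compression from Python's standard `zlib` module to an XML payload. The payload is a made-up stand-in, its element and attribute names are hypothetical and not taken from the E2ENP schema, and no particular compression scheme is proposed in this paper. Repetitive XML markup typically compresses well, but the actual gain for real E2ENP payloads would have to be measured.

```python
import zlib


def compress_payload(xml_text: str) -> bytes:
    """Compress an XML payload with DEFLATE at the highest compression level."""
    return zlib.compress(xml_text.encode("utf-8"), 9)


def decompress_payload(data: bytes) -> str:
    """Restore the original XML text from a compressed payload."""
    return zlib.decompress(data).decode("utf-8")


# Hypothetical payload: 20 alternative contracts with repetitive markup.
payload = "<e2enp>" + "".join(
    f'<contract name="c{i}" bandwidth="{64 * (i + 1)}" frameRate="25"/>'
    for i in range(20)
) + "</e2enp>"

packed = compress_payload(payload)
original_size = len(payload.encode("utf-8"))
packed_size = len(packed)
restored = decompress_payload(packed)
```

Compression trades signaling bytes for processing time at both endpoints, so it is most attractive on low-bandwidth links.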

VII. CONCLUSION AND FUTURE WORK

This paper introduced an end-to-end protocol for negotiating QoS parameters at the application and transport layers, together with session configurations. It uses a hierarchical QoS model and applies referencing mechanisms to reduce signaling load whenever QoS adaptations are required. The special features of E2ENP are the negotiation of QoS configurations used for QoS adaptation, including alternative QoS contracts, QoS contexts, and QoS-adaptation rules for enabling mobile multimedia-capable devices to utilize heterogeneous access networks. The measurement results demonstrate the scalability of our approach and show that our mechanisms significantly reduce the time for QoS adaptation processes. Our comparative measurements of E2ENP and SDPng processing confirm the advantages of E2ENP.

Considering the goals of the SDP/SDPng transition, the authors believe that E2ENP can influence the direction of SDPng concerning capabilities and QoS negotiation. Major points to discuss are optimization, consistency, and unification of XML schema(s) and XML namespaces for describing capabilities and QoS. Furthermore, strict namespaces for named and referenced values (i.e., capabilities, QoS contracts, QoS contexts) have to be defined to provide proper naming and unique referencing of media streams, sessions, and applications.

Our future work concentrates on integration of E2ENP into a complete QoS framework. To this end, we plan to provide proxies capable of interpreting E2ENP payloads, which then work together with authentication, authorization, and accounting mechanisms (A4C platform). This enables service-based charging mechanisms and enhanced media-based admission control. In addition, the proxies will work together with QoS brokers in the network during resource reservation phases to deploy flexible interactions between service domain and transport domain. Finally, we will test E2ENP in ad hoc network scenarios, where QoS adaptations will occur more frequently due to topology changes and wireless resource fluctuations.

REFERENCES

[1] H. Schulzrinne et al., “RTP: A transport protocol for real-time applications,” IETF, RFC 1889, Jan. 1996.

[2] A. Mankin et al., “Resource reservation protocol (RSVP),” IETF, RFC 2208, Sept. 1997.

[3] D. Durham et al., “The COPS (common open policy service) protocol,” IETF, RFC 2748, Jan. 2000.

[4] J. Rosenberg et al., “SIP: Session initiation protocol,” IETF, RFC 3261, June 2002.

[5] W3C. Extensible Markup Language (XML) 1.0 (2nd ed.) [Online]. Available: http://www.w3.org/TR/1998/REC-xml-19980210

[6] D. Kutscher et al., Session description and capability negotiation, IETF Internet-Drafts (draft-ietf-mmusic-sdpng-06 and draft-ietf-mmusic-sdpng-07), Mar./Oct. 2003.

[7] T. Guenkova-Luy, A. Kassler, J. Eisl, and D. Mandato, Efficient end-to-end QoS signaling - Concepts and features, IETF Internet-Draft (draft-guenkova-mmusic-e2enp-sdpng-00), Mar. 2002.

[8] MIND Project, Top-level architecture for providing seamless QoS, security, accounting and mobility to applications and services, Deliverable D1.2, Nov. 2002.

[9] X. Gu et al., “An XML-based quality of service enabling language for the Web,” National Science Foundation, Project Rep., 2001.

[10] P. Ruiz, J. Sanchez, E. Garcia, A. Gomez-Skarmeta, J. Botia, A. Kassler, and T. Guenkova-Luy, Adaptive multimedia multi-party communication in ad hoc environments, Software Technology Track, HICSS-37, Jan. 2004.

[11] M. Alfano and N. Radouniklis, A cooperative multimedia environment with QoS control: Architectural and implementation issues, TR-96-040, Int. Comput. Sci. Inst., Berkeley, CA, Sept. 1996.

[12] A. J. Kassler, “Video adaptation within a quality of service architecture,” Ph.D. dissertation, Dept. Distributed Syst., Univ. Ulm, Ulm, Germany, 2001.

[13] MIND Project, “MIND trials final report,” Deliverable D6.4, Nov. 2002.

[14] K. Nahrstedt and J. M. Smith, “The QoS broker,” IEEE Multimedia Mag., vol. 1, no. 2, pp. 53–67, Spring 1995.

[15] J. Rosenberg and H. Schulzrinne, An offer/answer model with SDP, IETF, RFC 3264, June 2002.

[16] A. Vetro et al., “Study of ISO/IEC 21 000-7 CD - Part 7: Digital item adaptation,” ISO/IEC JTC 1/SC 29/WG 11/N5612, Mar. 2003.

[17] International Telecommunication Union, Control protocol for multimedia communication, ITU Recommendation H.245.

[18] International Telecommunication Union, Internet protocol data communication service - IP packet transfer and availability performance parameters, ITU Recommendation Y.1540.

[19] International Telecommunication Union, Network performance objectives for IP-based services, ITU Recommendation Y.1541.

[20] International Telecommunication Union, Video codec for audiovisual services at p × 64 kbit/s, ITU H.261.

[21] J. Ott. Multiparty multimedia session control WG. [Online]. Available: http://www.dmn.tzi.org/ietf/mmusic/

[22] M. Handley and V. Jacobson, “SDP: Session description protocol,” IETF, RFC 2327, Apr. 1998.

[23] L. Bos et al., SDPng extensions for quality of service negotiation, IETF Internet-Draft (draft-bos-mmusic-sdpng-qos-00), Mar. 2002.

[24] H. Schulzrinne and S. Casner, “RTP profile for audio and video conferences with minimal control,” IETF, RFC 3551, July 2003.

[25] J. Ott and C. Perkins, SDPng transition, IETF Internet-Draft (draft-ietf-mmusic-sdpng-trans-04), May 2003.

[26] G. Klyne, “A syntax for describing media feature sets,” IETF, RFC 2533, Mar. 1999.

[27] G. Camarillo et al., Integration of resource management and session initiation protocol (SIP), IETF, RFC 3312, Oct. 2002.

[28] J. Rosenberg and H. Schulzrinne, Reliability of provisional responses in the session initiation protocol, IETF, RFC 3262, June 2002.

[29] J. Rosenberg, The session initiation protocol (SIP) UPDATE method, IETF, RFC 3311, Oct. 2002.

[30] H. Lu and I. Faynberg, “An architectural framework for support of quality of service in packet networks,” IEEE Commun. Mag., vol. 41, pp. 98–105, June 2003.

[31] S. Blake et al., An architecture for differentiated services, IETF, RFC 2475, Dec. 1998.

[32] Ch. Valentine, XML Schemas, SYBEX, 2002.

[33] D. Megginson. SAX: Simple API for XML. [Online]. Available: http://www.saxproject.org/

[34] Oracle Technology Network. Oracle XML developers kit. [Online]. Available: http://otn.oracle.com/tech/xml/xdk/content.html

[35] Sun Microsystems, Inc. Java 2 platform, standard edition (J2SE). [Online]. Available: http://java.sun.com/j2se/

[36] Sun Microsystems, Inc. Java 2 remote method invocation (RMI). [Online]. Available: http://java.sun.com/j2se/1.4/docs/guide/rmi/index.html

[37] A. Watson and M. A. Sasse, “Measuring perceived quality of speech and video in multimedia conferencing applications,” in Proc. ACM Multimedia, 1998, pp. 55–60.

[38] National Institute of Standards and Technology, NIST NET Home Page. [Online]. Available: http://snad.ncsl.nist.gov/itg/nistnet/


Teodora Guenkova-Luy received the M.Sc. Engineer degree in computer technologies from Technical University Sofia, Sofia, Bulgaria, in 1996 and the M.Sc. degree in informatics and computer science from the University of Ulm, Ulm, Germany, in 2001. Currently, she is working toward the Ph.D. degree in the Department of Distributed Systems, University of Ulm, where she is an Assistant Researcher working on her thesis concerning application level quality-of-service coordination and management.

In the past, she worked in the area of microwave and millimeter-wave hardware simulation and development for wireless communication at the Technical University Sofia and the University of Stuttgart, Stuttgart, Germany. She has participated in several international research projects (BRAIN, MIND, DAIDALOS). She has published a standard draft in IETF and actively takes part in the work of the IETF MMUSIC WG. Her research interests are in system modeling, middleware architectures, end-to-end performance of distributed systems, and quality-of-service as a special issue of the end-to-end system performance.

Andreas J. Kassler (M’97) received the M.Sc. degree in mathematics and computer science from Augsburg University, Augsburg, Germany, in 1995 and the Ph.D. degree in computer science from University of Ulm, Ulm, Germany, in 2002.

Currently, he is an Assistant Professor at Nanyang Technological University, Singapore. He teaches graduate-level courses on data networks and multimedia and has been invited to deliver research talks and tutorials at many universities and industrial organizations. He is a Project Manager for international research projects and is actively participating in several IETF working groups. His research interests are in multimedia communication and quality-of-service aspects, spanning network layer protocols, wireless transmission, and multimedia middleware architectures.

Dr. Kassler is a Member of ComSoc and the Technical Committee of Telecommunications of the IASTED.

Davide Mandato received the Dr.Eng. degree in electronic engineering from the University of Padua, Padua, Italy, in 1990.

From 1990 to 1995, he worked as a Firmware and Software Engineer at Necsy, Italy, in projects dealing with artificial telephonic traffic generation, telephonic traffic monitoring, and IN Voice Activated Dialing services. From 1996 to 1998, he worked as a Software Engineer at Trillium Digital Systems, Inc., Los Angeles, CA, working on SS7 ISUP and B-ISUP protocol layers, and on ISUP/B-ISUP interworking functionality. In 1997, he moved to the research field focusing on distributed systems and JAIN SS7 TCAP API. From 1998 to 1999, he worked as System Manager at the Eurofighter Simulation System, Germany, focusing on distributed databases. In 1999, he joined the Wireless System Laboratories, Sony International (Europe) GmbH, Stuttgart, Germany, working as Principal Engineer in IST projects (BRAIN, MIND), and in the fields of context-awareness, mobile ad hoc networks, and mobile multimedia middleware.