Darwin: Customizable Resource Management for Value-Added Network Services
Prashant Chandra, Allan Fisher, Corey Kosak, T. S. Eugene Ng, Peter Steenkiste, Eduardo Takahashi, Hui Zhang
Electrical and Computer Engineering, School of Computer Science, Carnegie Mellon University
Email: {prashant,alf,kosak,eugeneng,prs,takahasi,hzhang}@cs.cmu.edu
Abstract
The Internet is rapidly changing from a set of wires and switches that carry packets into a sophisticated infrastructure that delivers a set of complex value-added services to end users. Services can range from bit transport all the way up to distributed value-added services like video teleconferencing, data mining, and distributed interactive simulations. Before such services can be supported in a general and dynamic manner, we have to develop appropriate resource management mechanisms. These resource management mechanisms must make it possible to identify and allocate resources that meet service or application requirements, support both isolation and controlled dynamic sharing of resources across organizations sharing physical resources, and be customizable so services and applications can tailor resource usage to optimize their performance.
The Darwin project is developing a set of customizable resource management mechanisms that support value-added services. In this paper we present these mechanisms, describe their implementation in a prototype system, and describe the results of a series of proof-of-concept experiments.
1. Introduction
There is a flurry of activity in the networking community developing advanced services networks. Although the focus of these efforts varies widely from per-flow service definitions like IntServ to service frameworks like Xbind, they share the overall goal of evolving the Internet service model from what is essentially a bitway pipe to a sophisticated infrastructure capable of supporting novel advanced services.
In this paper, we consider a network environment that comprises not only communication services, but storage and computation resources as well. By packaging storage/computation resources together with communication services, service providers will be able to support sophisticated value-added services such as intelligent caching, video/audio transcoding and mixing, virtual reality games, and data mining. In such a service-oriented network, value-added services can be composed in a hierarchical fashion: applications invoke high-level service providers, which may in turn invoke services from lower-level service providers; the most basic service is provided by bitway and computation providers. Since services can be composed hierarchically, both applications and service providers will be able to combine their own resources with resources or services delivered by other providers to create a high-quality service for their clients.
This work was sponsored by the Defense Advanced Research Projects Agency under contract N66001-96-C-8528.
The design of such a service-oriented network poses challenges in several areas, such as resource discovery, resource management, service composition, billing, and security. In this paper, we focus on the resource management architecture and algorithms for such a network. In particular, we observe that service-oriented networks have several important differences from traditional networks that make existing network resource management inadequate. First, while traditional communication-oriented network services are provided by switches and links, value-added services will have to manage a broader set of resources that includes computation, storage, and services from other providers. Moreover, interdependencies between different types of resources can be exploited by value-added service providers to achieve higher efficiency. For example, by using compression techniques, one can make tradeoffs between network bandwidth and CPU cycles. Similarly, storage can sometimes be traded for bandwidth. Furthermore, value-added services are likely to have service-specific notions of Quality of Service (QoS) that would be difficult to capture in any fixed framework provided by the network. Therefore, the network must allow service providers to make resource tradeoffs based on their own notion of quality. Finally, it must not only be possible to make the above tradeoffs at startup, but also to adapt resource management to the changing network conditions on an ongoing basis. The challenge is to accommodate service-specific qualities of service and resource management policies for a large number of service providers.
To support these new requirements, we argue that resource management mechanisms should be flexible so that resource management policies are customizable by applications and service providers. We identify three dimensions along which customization is needed: space (what network resources are needed), time (how the resources are applied over time), and organization (how the resources are shared with other organizations). Additionally, the mechanisms along all three dimensions need to be integrated, so that they can leverage off one another.
The Darwin project is developing a comprehensive set of customizable resource management tools that support value-added services. In this paper we first identify key resource management requirements (Section 2). We then outline and motivate the Darwin architecture (Section 3). In Sections 4 through 7 we describe each of the four components of the Darwin architecture in more detail, and we illustrate their operation using several proof-of-concept experiments. We demonstrate the integrated operation of Darwin in Section 8, discuss related work in Section 9, and summarize in Section 10.
2. Customizable Resource Management
2.1. Resource Management Requirements
The attributes of value-added services described in the Introduction suggest several requirements for resource management. We organize these requirements into the following classes: resource management in space, across organizational boundaries, and in time.
• space: Services need to be able to allocate a rich set of resources (links, switch capacity, storage capacity, compute resources, etc.) to meet their needs. Resources should be allocated in a coordinated fashion, since there are often dependencies between resources, and there should be support for global optimization of resource use.
• organizations: Service providers will want to reserve some resources so they can meet certain minimal QoS requirements; at the same time, they will want to dynamically share resources for efficiency. Mechanisms are needed to isolate or dynamically share resources in a controlled fashion across organizations.
• time: Conditions in the network and user requirements change over time. Services must be able to quickly change how resources are used and allocated, so they can operate under a broad range of conditions.
Because of the broad diversity in services and because service providers will want to differentiate themselves from their competitors, there is a strong need for customization that cuts across these three dimensions: providers will want to customize what resources they allocate, how they share resources with other organizations, and how they adapt their resource use over time.
2.2. Darwin Resource Management Mechanisms
While it may seem attractive to have a single resource management mechanism that meets the above requirements, there are several reasons why an integrated set of mechanisms is preferable. First, resource management activities cover a wide dynamic range in both time and space: some tasks are invoked rarely (e.g. only at application setup time), whereas others are invoked very often (e.g. on every packet passing through a router); different activities also have different costs. Similarly, some tasks involve only local resources (e.g. those in a single switch), whereas others might involve resources across many networks (e.g. optimizing the position of a videoconference server in a multicountry conference call). Finally, some tasks require detailed network knowledge (e.g. selecting the right QoS parameters for a guaranteed session), while others require domain-specific knowledge (e.g. determining the computational cost of an MPEG to JPEG conversion). Given the diversity of these tasks, software engineering principles argue against building a single monolithic, complex resource management mechanism. Instead, the Darwin architecture comprises a small family of related mechanisms:
• High-level resource allocation: this mechanism, sometimes called a resource or service broker, performs global allocation of the resources based on a high-level application request, typically using domain knowledge for optimization. Its tasks include performing tradeoffs between services (e.g. trading computation for communication) according to the application-selected value metric (e.g. minimize cost, maximize service quality). It must also perform coordinated allocations for interdependent resources (e.g. since the amount of processor power required by a software transcoder is proportional to the bandwidth of video data flowing through it, these two allocations must be correlated). We also want to provide the ability in some cases to interconnect incompatible services or endpoints, for example automatically inserting a transcoder service between two otherwise incompatible videoconference participants. The broker in the Darwin system is called Xena [8].
• Low-level resource allocation: this mechanism is a signaling protocol that provides an interface between Xena's abstract view of the network and low-level network resources. It has to allocate real network resources (bandwidth, buffers, cycles, memory) while hiding the details of network heterogeneity from Xena. The Darwin signaling protocol is called Beagle [9].
• Runtime resource management: this mechanism injects application- or service-specific behavior into the network. Rather than performing runtime adaptation at flow endpoints (where the information provided by network feedback is potentially stale and inaccurate), it allows rapid local runtime adaptation at the switching points in the network's interior in response to changes in network behavior. Darwin runtime customization is based on control delegates, Java code segments that execute on routers.
• Hierarchical scheduling: this mechanism provides isolation and controlled sharing of individual resources among service providers. For each physical resource, sharing and thus contention exist at multiple levels: at the physical resource level among multiple service providers, at the service provider level among lower level service providers or organizations, and at the application level among individual flows. The hierarchical scheduler allows various entities (resource owners, service providers, applications) to independently specify different resource sharing policies and ensures that all these policy requirements are satisfied simultaneously. Darwin uses the Hierarchical Fair Service Curve (H-FSC) scheduler.
Figure 1. Darwin components (diagram labels: Appl, Library, Comm, Beagle, Delegates, Hierarchical Local RM, Classifier, Scheduler, Xena, Service Providers; interaction steps numbered 1 to 4)
These mechanisms were chosen because they cover the space, organization, and time requirements outlined in Section 2, and because they support resource management on a variety of time scales and scopes.
The different resource management mechanisms can be tied together using the abstraction of a virtual network. A virtual network, sometimes also called a virtual mesh [23] or supranet [12], is the set of resources that are allocated and managed in an integrated fashion to meet specific needs. Virtual networks can be used at the application and service provider level. An application mesh would be created in response to an application service request, while a service mesh captures the resources controlled by the provider to meet service requests. A virtual network not only captures resources, but it also can include state and code that represents or implements resource management policies that are appropriate for those resources. For example, the Darwin delegates that implement customized runtime resource management are associated with the virtual network. The creation of the virtual network provides an opportunity to do global resource optimization and to establish state in the virtual network that will speed up runtime adaptation.
3. Darwin
3.1. Darwin System Architecture
Figure 1 shows how the components in the Darwin system work together to manage network resources. Applications (1) running on end-points can submit requests for service (2) to a resource broker (Xena). The resource broker identifies the resources needed to satisfy the request, and passes this information (3) to a signaling protocol, Beagle (4), which allocates the resources. For each resource, Beagle interacts with a local resource manager to acquire and set up the resource. The local resource manager modifies local state, such as that of the packet classifier and scheduler shown in the figure, so that the new application will receive the appropriate level of service. The signaling protocol can also set up delegates. Throughout this process, appropriate resource authorizations must be made. Resource brokers have to know what resource pools they are allowed to use, and the signaling protocol and local resource managers must be able to validate the resource allocation request and set up appropriate billing or charging.
Figure 2. Darwin node software architecture (diagram labels: Classifier, Scheduler, Local Resource Manager, Beagle, Delegates, Route lookup, Routing, Control API, Event Notification, Xena, Other Beagle Entities, Other Routing Entities, Applications, Other Delegates)
Figure 2 shows the components on each switch node and their most important interactions. The bottom part of the picture corresponds to the data plane. The focus in this component is on simplicity and high throughput. The top half corresponds to the control plane. Activities in the control plane happen on a coarser time scale; although there is only a limited set of resources available to support control activities, there is more room in the control plane for customization and intelligent decision making. We expect that in general, the local resource manager will execute on a CPU close to the data path. Routing, signaling and delegates, on the other hand, are not as tightly coupled to the data path, and could run on a separate processor.
The Darwin architecture is similar in many ways to traditional resource management structures. For example, the resource management mechanisms for the Internet that have been defined in the IETF in the last few years rely on QoS routing [28, 16] (resource brokers), RSVP [32] (signaling similar to Beagle), and local resource managers that set up packet classifiers and schedulers. The more recent proposals for differentiated service [33] require similar entities. The specific responsibilities of the entities differ, of course, in these proposals. In Darwin, we emphasize the need for customization of resource management, hierarchical resource management (link sharing), and support for not only communication, but also computation and storage resources.
Figure 3. Darwin IP testbed topology
3.2. Darwin Testbed
The Darwin system has been implemented, and throughout this paper we show the results of a variety of experiments conducted on the Darwin network testbed. The topology of the testbed is shown in Figure 3. The three routers are Pentium II 266 MHz PCs running FreeBSD 2.2.5. The end systems m1 through m8 are Digital Alpha 21064A 300 MHz workstations running Digital Unix 4.0. The end systems p1 and p2 are Pentium Pro 200 MHz PCs running NetBSD 1.2D and FreeBSD 2.2.5 respectively, and s1 is a Sun Ultrasparc workstation running Solaris 2.5. All links except that between s1 and whiteface are full-duplex point-to-point Ethernet links configurable as either 100 Mbps or 10 Mbps. The link to s1 runs only at 10 Mbps.
In the remainder of the paper we describe Xena, Beagle, delegates, and hierarchical scheduling in more detail. Figure 4 shows the example that we will use throughout the paper. The top of the figure shows the input to Xena. The center part shows the information that Xena passes to Beagle, and the bottom section shows the information used to set up the local resource manager.
4. Xena: Resource Allocation in Space
The process of allocating resources, either by an application or provider, has three components. The first component is resource discovery: locating available resources that can potentially be used to meet application requirements. This requires a resource discovery protocol. The second component is solving an optimization problem: identifying the resources that are needed to meet the application requirements, while maximizing quality and/or minimizing cost. Finally, the resources have to be allocated by contacting the providers that own them. In our architecture, the first two functions are performed by a service broker called Xena, while the third function is performed by the signaling protocol Beagle (Section 7).
4.1. Xena Design
The application provides its resource request to Xena in the form of an application input graph, an annotated graph structure that specifies desired services (as nodes in the graph) and the communication flows that connect them (as edges).
Figure 4. Handling an application service request in Darwin
The annotations can vary in their level of abstraction from concrete specifications (place this node at network address X) to more abstract directives (this node requires a service of class S). These annotations are directly related to the degree of control the application wishes to exert over the allocation: a mesh with fewer (or more abstract) constraints presents more opportunities for optimization than a highly specified one.
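As a rough illustration of the application input graph, the following Python sketch models nodes and flows whose annotations may be left open. The class and field names (ServiceNode, Flow, InputGraph, degrees_of_freedom) are hypothetical and are not taken from Xena; the sketch only assumes that an unspecified address or QoS value can be represented as a missing annotation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ServiceNode:
    name: str
    service_type: str                 # e.g. "VideoSource"
    address: Optional[str] = None     # None lets the broker choose the placement


@dataclass
class Flow:
    src: str
    dst: str
    content: Optional[str] = None     # semantic content, e.g. "MPEG"
    qos_mbps: Optional[float] = None  # None leaves the exact QoS unspecified


@dataclass
class InputGraph:
    nodes: dict = field(default_factory=dict)
    flows: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes[node.name] = node

    def add_flow(self, flow):
        if flow.src not in self.nodes or flow.dst not in self.nodes:
            raise KeyError("flow endpoints must be declared nodes")
        self.flows.append(flow)

    def degrees_of_freedom(self):
        """Count the annotations that are left open for the broker to decide."""
        free_nodes = sum(1 for n in self.nodes.values() if n.address is None)
        free_flows = sum(1 for f in self.flows if f.qos_mbps is None)
        return free_nodes + free_flows


# Hypothetical request: two pinned videoconference nodes, one unplaced simulation node.
g = InputGraph()
g.add_node(ServiceNode("src", "VideoSource", address="m2"))
g.add_node(ServiceNode("dst", "VideoDisplay", address="m1"))
g.add_node(ServiceNode("sim", "Simulation"))
g.add_flow(Flow("src", "dst", content="MPEG"))
print(g.degrees_of_freedom())
```

The fewer annotations a request pins down, the larger this count, which mirrors the observation that less constrained meshes leave more room for optimization.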
In the most constrained specification, the application specifies the network addresses where the services should be placed, the services themselves, and the QoS parameters for the flows that connect them. In this style of specification, Xena’s optimization opportunities are limited to coarse routing: selecting communication service providers for the flows in the mesh. In a less constrained specification, the application can leave the network address of a service unspecified. This provides an additional degree of freedom: Xena now has the ability to place nodes and route flows. In addition, the application can leave the exact QoS parameters unspecified, but instead indicate the flow’s semantic content. An example of a flow specification might be Motion JPEG, with specific frame rate and quality parameters. In addition to providing sufficient information to maintain meaningful application semantics, this approach gives Xena the opportunity to optimize cost or quality by inserting semantics-preserving transformations into the mesh. For example, when a Motion JPEG flow needs to cross a congested network segment, Xena can insert a matched pair of transcoders at the two ends of the network segment. The first transcoder converts the flow to a more bandwidth-efficient coding format (such as MPEG or H.261), and the second converts it back to JPEG on the far side. Another optimization is lowest-cost type unification: a group of nodes receiving the same multicast flow (say, a video stream) needs to agree with the sender on the encoding that is used. If there is no single encoding acceptable to all parties, Xena can insert “type converter” nodes appropriately.
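The type unification step can be sketched as a small search, shown here in Python. This is an illustrative simplification, not Xena's algorithm: the function name unify_types, the per-receiver converter model, and the cost table are all assumptions. For each encoding the sender can emit, it adds a converter for every receiver that does not accept that encoding, then keeps the cheapest feasible plan.

```python
def unify_types(sender_formats, receiver_formats, converter_cost):
    """Pick the multicast encoding with the lowest total converter cost.

    receiver_formats maps each receiver to the set of formats it accepts.
    converter_cost maps (from_format, to_format) to a hypothetical cost.
    Returns (encoding, {receiver: converted_format or None}, total_cost),
    or None if no encoding can serve every receiver.
    """
    best = None
    for encoding in sorted(sender_formats):
        plan, total, feasible = {}, 0.0, True
        for receiver, accepted in sorted(receiver_formats.items()):
            if encoding in accepted:
                plan[receiver] = None          # no converter needed
                continue
            options = [(converter_cost[(encoding, fmt)], fmt)
                       for fmt in sorted(accepted)
                       if (encoding, fmt) in converter_cost]
            if not options:
                feasible = False               # this receiver cannot be served
                break
            cost, fmt = min(options)
            plan[receiver] = fmt
            total += cost
        if feasible and (best is None or total < best[2]):
            best = (encoding, plan, total)
    return best


# Hypothetical case: the sender emits only MPEG; two receivers accept only JPEG.
result = unify_types(
    {"MPEG"},
    {"m1": {"JPEG"}, "m5": {"MPEG", "JPEG"}, "m6": {"JPEG"}},
    {("MPEG", "JPEG"): 1.0},
)
print(result)
```

A real broker could share one converter among several receivers or place it to reduce bandwidth; the sketch charges one converter per mismatched receiver only to keep the search easy to follow.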
A feasible solution to the resource selection problem is one that satisfies all the constraints based on service-specific knowledge or application specification, e.g. entities at flow endpoints must agree on the type of data that will be exchanged. Given a set of feasible solutions, Xena evaluates each according to the optimization criteria. In Xena, these optimization criteria are encoded by an application-specified objective function that maps candidate solutions to a numeric value: this function is composed of a sum of terms, where each term represents the “quality” of a particular layout choice. This allows applications to define service quality in an application-specific way. For tractability’s sake the objective functions currently must be selected from a small set of built-in functions that represent useful objectives, e.g. minimize price, maximize throughput, minimize network delay. By using application-specific criteria to guide resource selection, Xena in effect allows applications to customize the definition of service quality.
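The selection step described above (filter by feasibility, then score with an objective that is a sum of terms) can be sketched in a few lines of Python. The candidate layouts, their cost figures, and the function names are hypothetical; they illustrate the structure of the computation, not Xena's built-in objectives.

```python
def best_solution(candidates, is_feasible, terms):
    """Return the feasible candidate that minimizes the sum of objective terms."""
    feasible = [s for s in candidates if is_feasible(s)]
    if not feasible:
        return None
    return min(feasible, key=lambda s: sum(term(s) for term in terms))


def types_agree(solution):
    # Feasibility constraint: flow endpoints must agree on the data type.
    return solution["types_agree"]


# Hypothetical candidate layouts with made-up costs.
candidates = [
    {"name": "all-MPEG", "cpu_cost": 8, "net_cost": 2, "types_agree": True},
    {"name": "all-JPEG via converter", "cpu_cost": 3, "net_cost": 5, "types_agree": True},
    {"name": "mixed, no converter", "cpu_cost": 1, "net_cost": 2, "types_agree": False},
]

# "Minimize price" expressed as a sum of two terms.
terms = [lambda s: s["cpu_cost"], lambda s: s["net_cost"]]

best = best_solution(candidates, types_agree, terms)
print(best["name"])
```

Swapping the list of terms changes the notion of quality without touching the search, which is the sense in which the objective is application-specified.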
4.2. Implementation
Our current implementation of Xena includes the interfaces to the other system entities (applications, service providers, and Beagle), plus the solving engine and optimizations described above. The application interface allows the specification of requests that include nodes, flows, types, and transcoders. The current Xena implementation does not have a general resource discovery protocol. Instead, it offers a mechanism through which services can register their availability and capabilities. This information allows Xena to build a coarse database of available communication and computation resources, and an estimate of their current utilization. Additionally, Xena maintains a database that maps service and flow types (e.g. transcoding or transmitting MPEG of a certain quality) to their effective resource requirements (e.g. CPU cycles or Mbps). Finally, there is a database that contains the various semantics-preserving transformations (e.g. JPEG-to-MPEG) and how to instantiate them.
Currently, Xena expresses its optimization problem in terms of an integer linear program, and turns it over to a solver package [5] that generates a sequence of successively better solutions at successively greater computation cost. Since the optimization problem is generally NP-hard, this approach is only appropriate for small to medium size problems. Work is in progress on defining heuristics and other simplifying assumptions that will make the problem tractable. This approach will necessarily trade quality for performance; i.e. our goal is to find high quality (but not, in general, optimal) solutions at a reasonable cost.
4.3. Example
Consider an application in which four scientists communicate via a videoconferencing tool that uses software MPEG/JPEG coders and collaborate on a distributed simulation that runs over an eight-node distributed computing testbed (depicted in Figure 4(b)).
Figure 4(a) shows the abstract resource mesh supplied to Xena. For the sake of clarity, some detail has been omitted; for example, we have depicted communication flowing in one direction only. For the videoconferencing tool, since the scientists are physically located at their machines, the application provides to Xena specific network addresses (m1, m2, m5, m6) for the nodes participating in the videoconference. The nodes are also annotated by the requested service types (Video Source, Video Display). Note that the video source at m2 is capable of emitting only MPEG. The nodes are connected by a multipoint-to-multipoint flow (only partially depicted). This flow specification describes only the connectivity between the nodes; the flow’s exact QoS parameters are left unspecified. For the distributed simulation, the application does not specify what nodes should participate.
The scientists request the minimize-cost optimization strategy, and at the time of the request, computation is costly relative to communication because of existing loads. This means that, even though it is simplest to use MPEG for the video, the less computation-intensive all-JPEG solution is more desirable (Figure 4). Xena can compensate for m2’s inability to emit JPEG by employing the MPEG-JPEG converter registered at m4; despite the detour through m4 and its computation cost, this solution comes in at an overall lower cost than the all-MPEG approach. Moreover, due to severe loads on m7, m8, and the router whiteface, which are expressed as a high cost, the least costly solution collocates the distributed simulation nodes with the videoconference nodes, at m1, m2, m5, m6. Once node placement and route selection have occurred, Xena invokes Beagle to perform the resource allocation. The inputs to Beagle are shown in Figure 4(b); they are explained in more detail when we discuss signaling in Section 7.
5. Customizable Runtime Resource Management
We discuss how delegates can perform customized runtime resource management, describe the delegate runtime environment implemented in Darwin, and illustrate delegate operation using an example.
5.1. Delegates
We use the term “delegate” for code that is sent by applications or service providers to network nodes in order to implement customized management of their data flows. Delegates execute on designated routers and can monitor the network status and affect resource management on those routers through a control API, as shown in Figure 2. Delegates can be viewed as an application of active networks [25]: network users can add functionality to the network. It is, however, a very focused application. Delegate operations are restricted to traffic management, so delegates want to learn about changes in resource availability and must be able to change resource use.
Figure 5. Video quality under four scenarios (bar chart of frame rate in fps for Clip 1 and Clip 2; bar labels by scenario: Shared 0.4 and 0.5, Reservation 8.4 and 7.5, Sel. Drop 19 and 15, A. Sel. Drop 21 and 18)
A critical design decision for delegates is the definition of the control interface, i.e. the API that delegates use to interact with the environment. If the API is too restrictive, it will limit the usefulness of delegates, while too much freedom can make the system less efficient. The definition of the API is driven by the need to support resource management and it includes functions in a number of categories:
• Delegates can monitor the network status, e.g. congestion conditions.
• Delegates can change how resources are distributed across flows: splitting and merging of flows, changing their resource allocation and sharing rules, controlling selective packet dropping.
• Delegates can send and receive messages, for example to coordinate activities with peers on other routers or to interact with the application on endpoints.
• Finally, delegates can affect routing, for example to reroute a flow inside the virtual network for load balancing reasons, or to direct a flow to a compute server that will perform data manipulations to, for example, add compression or encryption to a flow.
The use of delegates raises significant safety and security concerns. Delegates are in general untrusted, so the router has to ensure that they cannot corrupt state in the router or cause other problems. This can be achieved through a variety of run time mechanisms (e.g. building a “sandbox” that restricts what the delegate can access) and compile time mechanisms (e.g. proof carrying code). A related issue is that of security. At setup time, the router has to make sure that the delegate is being provided by a legitimate user, and at runtime, the local resource manager has to make sure that the delegate only acts on flows that it is allowed to manage. The authentication and authorization of delegates will be performed jointly by the signaling protocol and the local resource manager; authentication and authorization have not yet been implemented.

Methods      Description
add          Add node in scheduler hierarchy
del          Delete node from scheduler hierarchy
set          Change param. on scheduler queue
dsc_on       Activate selective discard in classifier
dsc_off      Deactivate selective discard in classifier
probe        Read scheduler queue state
reqMonitor   Request async. cong. notification
retrieve     Retrieve scheduler hierarchy

Table 1. Methods available at the Delegate/Local Resource Manager API
5.2. Implementation
Our current framework for delegates is based on Java and uses the Kaffe Java virtual machine [29], capable of just-in-time (JIT) compilation and available for many platforms. This environment gives us acceptable performance, portability, and safety features inherited from the language. Delegates are executed as Java threads inside the virtual machine “sandbox.” Currently, delegates can run with different static priority levels, although a more controlled environment with real-time execution guarantees is desirable.
The API that gives users access to resource management functions and event notification is implemented as a set of native methods that call the local resource manager, which runs in the kernel. Table 1 presents the methods of the Java class delegate.Clschd, which implements the API to the packet classifier and scheduler. Communication was built on top of standard java.net classes. Rerouting functions have not been implemented yet. While this environment is sufficient for experimentation, it is not complete. It needs support for authentication and mechanisms to monitor and limit the amount of resources used by delegates. We also expect the API to evolve over time.
5.3. Experiment
We present an example of how delegates can be used to do customized runtime resource management.
Suppose a network carries two types of flows, data and MPEG video, similar to the example in Figure 4. If packets are dropped randomly during congestion, the quality of the video degrades very quickly. The reason is that MPEG streams have three types of frames of different importance: I frames (intracoded) are self-contained, P frames (predictive) use a previous I or P frame for motion compensation and thus depend on this previous frame, and B frames (bidirectional-predictive) use (and thus depend on) previous and subsequent I or P frames. Because of these inter-frame dependencies, losing I frames is extremely damaging, while B frames are the least critical.
We direct three flows over the Aspen-Timberline link of the testbed: two MPEG video streams and an unconstrained UDP stream. Both video sources send at a rate of 30 frames/second, and our performance metric is the rate of correctly received frames. Figure 5 compares the performance of four scenarios. In the first scenario, the video and data packets are treated the same, and the random packet losses result in a very low frame rate. In the second case, the video streams share a bandwidth reservation equal to the sum of the average video bandwidths. This improves performance, but the video streams are bursty and the random packet loss during peak transfers means that less than a third of the frames are received correctly. In the third scenario, we also place a delegate on Aspen. The delegate monitors the length of the queue used by the video streams, and if the queue grows beyond a threshold, it instructs the packet classifier to identify and drop B frames; B frames are marked with an application-specific identifier. Packet dropping stops when the queue size drops below a second threshold. Figure 5 shows that this is quite effective: the frame rate roughly doubles. The reason is that by restricting the presence of B frames in the queue, the I and P frames are protected from corruption.
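The two-threshold behavior of the scenario-three delegate can be sketched as follows. Darwin delegates are Java code; this Python sketch only illustrates the hysteresis logic. The method names probe, dsc_on, and dsc_off are borrowed from Table 1 (written with underscores here), but their signatures, the FakeClschd stand-in, and the threshold values are assumptions made for illustration.

```python
class FakeClschd:
    """In-memory stand-in for the classifier/scheduler API (signatures assumed)."""

    def __init__(self):
        self.queue_len = 0
        self.discarding = False

    def probe(self):
        return self.queue_len        # read scheduler queue state

    def dsc_on(self):
        self.discarding = True       # activate selective discard of B frames

    def dsc_off(self):
        self.discarding = False      # deactivate selective discard


class SelectiveDropDelegate:
    """Start dropping above `high`, stop only once the queue falls below `low`."""

    def __init__(self, api, high, low):
        if low >= high:
            raise ValueError("low threshold must be below high threshold")
        self.api, self.high, self.low = api, high, low
        self.dropping = False

    def poll(self):
        qlen = self.api.probe()
        if not self.dropping and qlen > self.high:
            self.api.dsc_on()
            self.dropping = True
        elif self.dropping and qlen < self.low:
            self.api.dsc_off()
            self.dropping = False
        return self.dropping


# Hypothetical queue-length samples (in packets) and thresholds.
api = FakeClschd()
delegate = SelectiveDropDelegate(api, high=50, low=20)
trace = []
for qlen in [10, 60, 40, 25, 15, 30]:
    api.queue_len = qlen
    trace.append(delegate.poll())
print(trace)
```

Using two thresholds rather than one keeps the delegate from toggling the discard state on every small fluctuation of the queue length.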
While delegates provide an elegant way of selectively dropping B frames, the same effect could be achieved by associating different priorities with different frame types, i.e. layered video coding. In scenario four we use a delegate to implement a more sophisticated customized drop policy. In scenario three, typically too many B frames are dropped, because all flows are simultaneously affected. A better approach is to only drop the B frames of a subset of the video streams, assuming that is sufficient to relieve congestion. The advantage of having a delegate control selective packet dropping is that it can implement customized policies for controlling what video streams are degraded. Scenario four in Figure 5 shows the results for a simple “time sharing” policy, where every few seconds the delegate switches the stream that has B frames dropped. This improves performance by another 10-20%. Policies that differentiate between flows could similarly be implemented.
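A minimal Python sketch of the “time sharing” idea follows: at fixed intervals, a different stream becomes the one whose B frames are dropped. The function name, the stream labels, and the 3-second period are hypothetical; the text only states that the switch happens every few seconds.

```python
import itertools


def time_sharing_schedule(streams, period_s, duration_s):
    """Return (start_time, stream) pairs naming the stream whose B frames
    are dropped during each period, rotating through the streams in order."""
    if not streams:
        raise ValueError("at least one stream is required")
    schedule = []
    rotation = itertools.cycle(streams)
    t = 0
    while t < duration_s:
        schedule.append((t, next(rotation)))
        t += period_s
    return schedule


# Hypothetical parameters: two clips, switch every 3 seconds, 10 seconds total.
schedule = time_sharing_schedule(["clip1", "clip2"], period_s=3, duration_s=10)
print(schedule)
```

A policy that differentiates between flows could replace the plain rotation with a weighted one, for example by listing a lower-priority stream more often.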
6. Hierarchical Scheduling
For an individual physical resource, sharing and thus contention exist at multiple levels: at the physical resource level among multiple service providers, at the service provider level among lower level service providers or organizations, and at the application level among individual flows. These relationships can be represented by a resource tree: each node represents one entity (resource owner, service provider, application), the slice of the virtual resource allocated to it, the traffic aggregate supported by it, and the policy of managing the virtual resource; each arc represents the virtual resource owner/user relationship. Figure 4(c) illustrates a resource sub-tree for an application.
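The resource tree can be sketched as a small recursive data structure in Python. The class name, the policy labels, and the example entities are hypothetical; the sketch only captures the idea that nodes are entities and arcs are owner/user relationships.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceNode:
    entity: str    # resource owner, service provider, application, or flow
    policy: str    # label for how this entity manages its slice
    children: list = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def arcs(self):
        """Yield one (owner, user) pair per arc, in depth-first order."""
        for child in self.children:
            yield (self.entity, child.entity)
            yield from child.arcs()


# Hypothetical hierarchy for one link.
link = ResourceNode("link owner", "static split")
provider = link.add(ResourceNode("service provider", "fair sharing"))
app = provider.add(ResourceNode("application", "per-flow guarantees"))
app.add(ResourceNode("video flow", "guaranteed"))
app.add(ResourceNode("simulation flow", "best effort"))
print(list(link.arcs()))
```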
The ability to customize resource management policies at all sharing levels for a resource is one of the key requirements and distinctive features for service-oriented networks. The challenge is to design scheduling algorithms that can simultaneously satisfy diverse policies set by different entities in the resource management tree. This section describes the Hierarchical Fair Service Curve (H-FSC) scheduling algorithm (first described in [24]), which meets the above constraint.
6.1. H-FSC Algorithm and Customization Features
In an H-FSC scheduler, associated with each link is a class hierarchy that specifies the resource management policy. Each interior class represents some aggregate of traffic flows that are managed by an entity such as the link owner, a service provider, and so on. The resource management policy for each entity is then mapped to one that can be implemented by the Fair Service Curve (FSC) algorithm. The goal of the H-FSC algorithm is to simultaneously satisfy all the FSC policies of all entities in the hierarchy.
In FSC, a stream $i$ is said to be guaranteed a service curve $S_i(\cdot)$ if, for any time $t_2$, there exists a time $t_1 < t_2$, which is the beginning of one of stream $i$'s backlogged periods (not necessarily including $t_2$), such that the following holds:
$$S_i(t_2 - t_1) \le w_i(t_1, t_2) \tag{1}$$
where $w_i(t_1, t_2)$ is the amount of service received by session $i$ during the time interval $(t_1, t_2]$.
Consider an entity that manages $n$ streams, where each stream can be an application flow or a flow aggregate. The amount of resource the entity manages is the service guaranteed by its parent entity, denoted by $S(t)$. Without going into the details of the FSC algorithm, we state that the FSC algorithm can (1) guarantee the service curves for all streams if the stability condition $\sum_{i=1}^{n} S_i(t) \le S(t)$ holds; (2) fairly allocate the excess service if some flows cannot use their guaranteed service.
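The stability condition, namely that the service curves guaranteed to the child streams must sum to no more than the curve guaranteed by the parent at every time, can be checked mechanically once a curve shape is fixed. The sketch below assumes two-piece linear curves (an initial slope `m1` for `d` seconds, then a long-term slope `m2`); this shape, the class name, and the function name are assumptions for illustration, since the text does not prescribe how curves are represented.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ServiceCurve:
    """Hypothetical two-piece linear service curve (units: Mbit vs. seconds)."""
    m1: float  # slope during the first d seconds
    d: float   # length of the first segment
    m2: float  # long-term slope

    def __call__(self, t: float) -> float:
        if t <= self.d:
            return self.m1 * t
        return self.m1 * self.d + self.m2 * (t - self.d)


def is_stable(parent: ServiceCurve, children: Iterable[ServiceCurve]) -> bool:
    """Check that the children's curves sum to at most the parent's curve."""
    kids = list(children)
    eps = 1e-9
    # Beyond the last breakpoint every curve is linear, so compare slopes.
    if sum(c.m2 for c in kids) > parent.m2 + eps:
        return False
    # All curves are piecewise linear and start at 0, so it is enough to
    # compare values at each breakpoint.
    breakpoints = {parent.d} | {c.d for c in kids}
    return all(sum(c(t) for c in kids) <= parent(t) + eps for t in breakpoints)
```

Because the difference between the parent curve and the sum of the child curves is itself piecewise linear, checking the breakpoints and the final slopes covers every time value.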
We decided to use the H-FSC scheduler in Darwin because of the following two important properties of H-FSC. First, as long as the stability condition holds, H-FSC allows FSC policies to be combined arbitrarily in a hierarchy while satisfying all of their requirements simultaneously. Second, the FSC algorithm, which is used at each level in an H-FSC scheduler, provides a general framework for implementing many policies. For example, by ensuring a minimum asymptotic slope of $S_i(t)$ independent of the number of competing streams, a guaranteed bandwidth service is provided to stream $i$. By assigning $S_i(t)$ to be $S(t) \cdot \phi_i / \sum_{j=1}^{n} \phi_j$ for all streams, a weighted fair service in which stream $i$ has a weight of $\phi_i$ is implemented. Unlike various fair queueing algorithms, FSC decouples delay and bandwidth allocation.
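As a hedged numeric illustration of the weighted fair service assignment, suppose the parent guarantees a linear curve $S(t) = 100t$ Mbit (a 100 Mbps share) to three streams with weights 1, 1, and 2. The rate and the weights are made-up values, and $\phi_i$ is simply the notation used here for the weight of stream $i$:

```latex
\begin{align*}
S_1(t) &= S(t)\cdot\frac{\phi_1}{\phi_1+\phi_2+\phi_3} = 100t\cdot\frac{1}{4} = 25t,\\
S_2(t) &= 100t\cdot\frac{1}{4} = 25t,\\
S_3(t) &= 100t\cdot\frac{2}{4} = 50t,\\
S_1(t)+S_2(t)+S_3(t) &= 100t = S(t).
\end{align*}
```

The child curves sum exactly to the parent's curve, so the stability condition holds with equality for any choice of weights.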
Due to the flexibility of FSC schedulers in implementing general policies and the flexibility of H-FSC in arbitrarily integrating various FSC policies, various entities (resource owners, service providers, applications) sharing the resource at different levels can independently customize or specify the policies for managing their share of the virtual resources.
[Figure 6: bar chart of communication time (s) for the 0.5 KB and 1 KB FFTs in four scenarios (FFT Only, Shared, Res: 20+20, Res: 40). Bar labels (0.5 KB / 1 KB): 28 / 51, 311 / 365, 236 / 232, 206 / 166.]
Figure 6. FFT communication times under various scenarios.
6.2. Implementation
The implementation of the H-FSC scheduler in Darwin requires two extra components: a packet classifier and an API for signaling protocols.
Before they can be scheduled, incoming data packets must be classified and mapped to packet queues. To support traffic flows of various granularities, we have devised a highly flexible classification scheme. Each flow is described by a 9-parameter flow descriptor: source IP address, CIDR-style source address mask, destination IP address, CIDR-style destination address mask, protocol number, source port number, destination port number, application-specific ID (carried as an option in the IP header), and CIDR-style application-specific ID mask. A zero denotes a “don’t care” parameter. Using this scheme, flows such as end-to-end TCP connections, aggregates of traffic between networks, and WWW, FTP, and TELNET services can be specified. With the application-specific ID, we can even subdivide an end-to-end traffic flow into application-specific sub-flows, such as the different frame types in an MPEG flow. The control API exported by the scheduler is discussed in Section 5. The processing overhead associated with classification and queueing in our implementation is suitably low for use in our Darwin testbed. Classification overhead is 3 µs per packet with caching on a 200 MHz Pentium Pro system. Average queueing overhead is around 9 µs when there are 1000 flows in the system. This low overhead allows us to easily support link speeds of 100 Mbps in our prototype network testbed.
6.3. Experiments
We present the results of a set of experiments that demonstrate the importance of being able to reserve resources and to control dynamic bandwidth sharing.
In the first set of experiments, we run two distributed computations, Fast Fourier Transforms (FFTs), simultaneously to show the effect of reserving resources. Hosts m1, m4, and m7 participate in the first mesh (FFT-1) on 0.5 KB of data for 96 iterations. Hosts m2, m5, and m8 participate in the second FFT mesh (FFT-2) on 1 KB of data for 32 iterations (please refer to Figure 3 for the testbed topology). All network connections in this set of experiments are configured at 100 Mbps. The all-to-all communication in the FFT often dominates the execution time. FFT uses TCP for communication and the communication pattern is highly bursty: the burst data rate is approximately 80 Mbps. Figure 6 summarizes the results. Scenario 1 shows the base case: in an idle network and without reservations, the communication times are 27.66 s and 51.05 s for FFT-1 and FFT-2, respectively.
[Figure 7: two resource trees rooted at Link. (a) FFT-1: 20 Mbps, FFT-2: 20 Mbps, UDP: 60 Mbps. (b) FFT-1 and FFT-2 under a shared 40 Mbps node, UDP: 60 Mbps.]
Figure 7. Resource trees in FFT experiments. (a) Per-flow reservation. (b) Aggregate reservation.
In Scenario 2, we introduce two 90 Mbps UDP traffic streams from m3 to p1 and from m6 to p1, causing congestion on the inter-router links. Since the FFTs use TCP for communication, their performance degrades significantly because the competing traffic does not use congestion control. To improve the FFTs' performance, protection boundaries must be drawn between the background traffic and the FFTs. In Scenario 3, we use reservations to guarantee that each FFT mesh gets at least 20 Mbps of bandwidth on each inter-router link; Figure 7(a) shows the corresponding resource tree. The remaining bandwidth is given to the background traffic. Under this per-flow reservation scheme, the communication times of the FFTs improve by 24% and 36%. In Scenario 4, we apply a shared reservation of 40 Mbps for both FFTs; Figure 7(b) shows the resource tree. Since the FFT traffic is very bursty, the advantage of dynamically sharing the reserved bandwidth is significant: the communication time is reduced by another 13% and 29%. We conclude that it is important to be able to reserve resources, not only for flows, but also for flow aggregates.
In our second set of experiments, we demonstrate how hierarchical scheduling can be used to control dynamic bandwidth sharing. Consider a distributed interactive simulation application that combines an FFT with an interactive component such as a video stream or shared whiteboard. In this experiment, the FFT (1 KB of data and 32 iterations) uses m2, m5, and m8, while the interactive adaptive component is modeled by a TCP connection from p2 to s1. In addition, background traffic is modeled by a full-blast UDP traffic stream from m6 to p1. With “vanilla” resource distribution, we reserve 40 Mbps for the FFT mesh and 1 Mbps for the TCP stream, and the remaining bandwidth is given to the background UDP stream (Figure 8(a)). With this reservation scheme, the TCP stream achieves a throughput of 1.41 Mbps, while the FFT communication time is 70.76 s (Table 2).
                      Non-hierarchical   Hierarchical
                      scheduling         scheduling
FFT comm. time (s)    70.76              73.18
TCP BW (Mbps)         1.41               5.30
Table 2. Scheduling performance comparison.
With hierarchical resource management we group flows as shown in Figure 8(b). While the overall reservation for the application remains the same, the hierarchy specifies that TCP (FFT) gets the first chance at using any bandwidth left unused by FFT (TCP). The result is that the throughput of the TCP stream is now 5.30 Mbps, a 276% improvement, while the communication time of the FFT increases by an insignificant 3% to 73.18 s (Table 2). This example demonstrates that hierarchical scheduling makes it possible for applications to cooperatively share the reserved resources more effectively within the application boundary, and that by specifying specific resource trees, applications and service providers can customize how resources are shared.
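The two percentages quoted for the hierarchical case follow directly from the measured values (1.41 and 5.30 Mbps for TCP, 70.76 and 73.18 s for the FFT); the short calculation below only restates that arithmetic:

```latex
\begin{align*}
\text{TCP throughput gain} &= \frac{5.30 - 1.41}{1.41} \approx 2.76 = 276\%,\\
\text{FFT time increase} &= \frac{73.18 - 70.76}{70.76} \approx 0.034 \approx 3\%.
\end{align*}
```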
7. Beagle: Signaling
The Beagle signaling protocol provides support for the customizable resource allocation model of Darwin. While traditional signaling protocols such as RSVP [32] and PNNI [3] operate on individual flows, Beagle operates on virtual networks or meshes. We elaborate on some key Beagle features in this section; more details can be found in [9].
7.1. Beagle Design
On the input side, Beagle interfaces with Xena to obtain the virtual network specification generated by Xena. The virtual network is described as a list of flows and delegates, plus resource sharing specifications that describe how flows within a mesh share resources among themselves. The example virtual network in Figure 4(b) shows two video flows and two distributed interactive simulation flows. Each flow is specified by a flow descriptor as described in Section 6.2 and information such as a tspec and a flowspec object, as in the IETF IntServ working group model. A delegate is characterized by its resource requirements (on CPU, memory, and storage), its runtime environment, and a list of flows that the delegate needs to manipulate. The delegate runtime environment is characterized by a code type (e.g., Java, Active-X) and a runtime type (e.g., JDK 1.0.2, WinSock 2.1). The virtual network typically also includes a number of designated routers which identify the mesh core. In the example, Aspen and Timberline are the designated routers.
[Figure 8: two resource trees rooted at Link. (a) FFT: 40 Mbps, TCP: 1 Mbps, UDP: 59 Mbps. (b) Appl: 41 Mbps, with children FFT: 40 Mbps and TCP: 1 Mbps; UDP: 59 Mbps.]
Figure 8. Resource trees in distributed interactive simulation experiments. (a) Per-flow reservation. (b) Hierarchical reservation.
After receiving a request, Beagle issues a sequence of flow setup messages to the different nodes in the mesh, each message specifying what total resources are needed on a link, plus a grouping tree that specifies how these resources should be applied. An efficient setup of a mesh includes the establishment of the core, and individual flow setups initiated by the senders or receivers that rendezvous with the core in designated routers. Our initial implementation uses simple per-flow setups and a hard state approach based on three-way handshakes between adjacent routers. Future work includes optimizing mesh setup and evaluating the use of soft state for some or all of the mesh state.
On each node, Beagle passes a resource tree (Figure 4(c)) to the Local Resource Manager to allocate resources for flows. This interface is similar to the one described in Section 5 (Table 1). Beagle also establishes delegates on switch nodes (for resource management delegates), or on compute and storage nodes (for data processing delegates). For each delegate request, Beagle locates the appropriate runtime environment, initializes the local resource manager handles and flow reservation state, and instantiates the delegate. The handles allow the delegate to give resource management instructions to the local resource manager for the flows associated with it. In the future, Beagle will also provide support for communication channels between control delegates belonging to the same service and provide mechanisms to recover from control delegate failures.
7.2. Resource Distribution
In Section 6 we described how dynamic resource sharing can be controlled and customized by specifying an appropriate resource tree for each resource. This could be achieved by having applications or Xena specify the resource trees to Beagle, so that it can install them on each node. There are two problems with this approach. First, how one specifies a resource tree is network specific and it is unrealistic to expect applications and brokers to deal with this heterogeneity. Second, applications and brokers do not specify each physical link; instead they use virtual links that may represent entire subnets. Beagle uses the hierarchical grouping tree abstraction to deal with both problems: it is an abstract representation of the sharing hierarchy that can be mapped onto each link by Beagle. Figure 4(b) gives an example.
The hierarchical grouping tree encodes the hierarchical sharing structure of all flows sharing a virtual link. Once it knows the actual flows that share a particular physical link in the network, Beagle prunes the hierarchical grouping tree, eliminating flows that do not exist at that link. To deal with network heterogeneity, interior nodes in the hierarchical grouping tree have generic QoS service types associated with them instead of network-specific sharing specifications. The leaf nodes of the grouping tree represent flows whose QoS requirements are expressed by individual flowspecs. Service-specific rules describe how child node flowspecs are aggregated into parent node flowspecs in deriving a physical link resource tree from the grouping tree. This involves pruning the grouping tree to eliminate flows that do not exist at a particular link and converting flowspecs at each node into appropriate low-level scheduler-specific parameters, such as a weight for hierarchical weighted fair share schedulers [4] or a service curve for the Hierarchical Fair Service Curve scheduler [24].
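The derivation of a per-link resource tree from the grouping tree can be sketched as a recursive prune-and-aggregate pass. The sketch below is illustrative rather than Beagle's implementation: each flowspec is reduced to a single bandwidth figure in Mbps, and the `"sum"` and `"max"` rules are hypothetical stand-ins for the service-specific aggregation rules mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical aggregation rules, keyed by a generic service type name.
AGGREGATORS = {"sum": sum, "max": max}


@dataclass
class GroupNode:
    name: str
    service: str = "sum"      # generic QoS service type (interior nodes)
    bandwidth: float = 0.0    # leaf flowspec, simplified to one rate in Mbps
    children: List["GroupNode"] = field(default_factory=list)


def derive_link_tree(node: GroupNode, flows_on_link: Set[str]) -> Optional[GroupNode]:
    """Prune flows absent from the link and aggregate the surviving flowspecs."""
    if not node.children:  # leaf: keep it only if the flow crosses this link
        if node.name in flows_on_link:
            return GroupNode(node.name, node.service, node.bandwidth)
        return None
    kept = []
    for child in node.children:
        pruned = derive_link_tree(child, flows_on_link)
        if pruned is not None:
            kept.append(pruned)
    if not kept:           # no flow under this node uses the link
        return None
    total = AGGREGATORS[node.service](c.bandwidth for c in kept)
    return GroupNode(node.name, node.service, total, kept)
```

A final step, not shown, would convert each node's aggregated value into the parameter the local scheduler expects, such as a weight or a service curve.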
7.3. Temporal Sharing
There are often resource sharing opportunities on time scales larger than what can be expressed in tspecs and flowspecs. For example, a conferencing application may ensure that at most two video streams are active at any time, or an application may like to associate an aggregate bandwidth requirement for a group of best-effort flows. Applications and resource brokers can specify this application-specific information by handing Beagle temporal sharing objects that list sets of flow combinations and their associated aggregate flowspecs. Beagle can then use this information to reduce resource requirements for a group of flows sharing a link.
The temporal sharing object is similar in spirit to the resource sharing approach used in the Tenet-2 scheme [17]. However, control of flow routing using designated routers allows us to take better advantage of sharing opportunities. The temporal sharing object also generalizes RSVP's notion of resource reservation styles. However, RSVP limits aggregation to flows within a multicast session and the aggregate flowspecs must be the result of either a sum or a least upper bound (LUB) operation on the individual flowspecs. In Beagle, the temporal sharing object can be used to group arbitrary flows within an application mesh and any aggregate flowspec can be used. As an example, in Figure 4(b), the distributed interactive simulation application associates an aggregate Controlled Load service flowspec with the two simulation flows.
The hierarchical grouping tree and the temporal sharing objects both define ways in which an application can tailor resource allocation within the mesh. However, they are separate concepts and are orthogonal to each other. If both types of sharing are specified, the resource tree is derived by applying the temporal sharing specification at every level of the tree. If the temporal sharing object lists flow groups that do not fall under the same parent node in the resource tree, the temporal sharing behavior is ignored. The example in Figure 4(b) shows the use of both sharing objects. The resulting link resource sub-trees at links 1 and 2, assuming the use of hierarchical weighted fair share schedulers [4], are shown in Figure 4(c).
Figure 9. Xena output showing service layout
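A rough sketch of how a temporal sharing object might reduce the reservation needed on one link is given below. It is an assumption-laden illustration: flowspecs are reduced to a single bandwidth in Mbps, a sharing group is a set of flow names with one aggregate bandwidth, the aggregate is capped at the sum of the members actually present on the link, and a group whose members have different parents is skipped, following the rule stated in the text. The function and parameter names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

SharingGroup = Tuple[FrozenSet[str], float]  # (member flows, aggregate Mbps)


def link_requirement(flow_bw: Dict[str, float],
                     parent_of: Dict[str, str],
                     sharing_groups: List[SharingGroup]) -> float:
    """Total bandwidth to reserve on a link, honoring temporal sharing."""
    remaining = dict(flow_bw)          # flows on this link not yet accounted for
    total = 0.0
    for members, aggregate_bw in sharing_groups:
        present = [f for f in members if f in remaining]
        if len(present) < 2:
            continue                   # nothing to share on this link
        if len({parent_of[f] for f in members}) != 1:
            continue                   # members span parents: sharing is ignored
        individual = sum(remaining.pop(f) for f in present)
        total += min(aggregate_bw, individual)
    return total + sum(remaining.values())
```

For a conference in which at most two of four 1.5 Mbps video streams are active at once, a single group with an aggregate of 3.0 Mbps halves the reservation for those streams.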
7.4. Experimental Evaluation
To evaluate the performance of the Beagle prototype implementation, we measured end-to-end setup latencies for flows and delegates, and per-router flow setup processing times on the Darwin testbed. The experiment involved setting up the virtual network shown in Figure 4(b). All measurements reported are averages from 100 runs. The average end-to-end latency through two routers was 7.5 ms for flow setup and 3.8 ms for delegate setup. The flow setup processing time on each router was 2.4 ms, about 68% of which was spent in interacting with the local resource manager. This involves admission control and setting up flow state in the packet classifier and scheduler. The current Beagle prototype supports about 425 flow setups per second. This is comparable to connection setup times reported for various ATM switches. However, we expect improvement in these results by optimizing the implementation.
8. Darwin System Demonstration
This section demonstrates the various pieces of the Darwin system in action. The example used in demonstrating the system is the same as the one shown in Figure 4(b). The service layout produced by Xena is shown in Figure 9, which is a screenshot of a running Xena system. As shown in Figure 9, Xena instantiates a transcoder delegate on m4 to convert the MPEG video flow to JPEG format. Xena also selects m1 and m2 as simulation endpoints. Beagle takes this service layout and allocates resources for the video and simulation flows on Timberline and Aspen. Figure 10 is a screenshot of a user-interface program for the H-FSC scheduler showing the classifier and scheduler state installed on link 1 in the direction from Aspen to Timberline. Figure 10 also shows the bandwidth plots for the video (light grey) and simulation (dark grey) flows at that link.
Figure 10. Scheduler graphical user interface showing the resource tree and bandwidth plots
9. Related Work
There has recently been a lot of work as part of the Xbind [19, 31] and TINA [13, 26] efforts to define a service-oriented architecture for telecommunication networks. There are several differences between them and Darwin. First, services envisioned by Xbind and TINA are mostly telecommunications-oriented. The value-added services described in this paper integrate computation, storage, and communication resources. Second, the value-added services in their context are usually restricted to the control plane, e.g., signaling. Darwin supports customized services in the data plane (controlled sharing of resources, processing, and storage) and the control plane (signaling and resource brokering). Finally, while the focus of both TINA and Xbind is on developing an open object-oriented programming model for rapid creation and deployment of services, the focus of Darwin is on developing specific resource management mechanisms that can be customized to meet service-specific needs. While Xbind and TINA have so far primarily been used as a development framework for traditional ATM and telecommunication network management mechanisms, they could potentially also be used as a basis for the development of customizable resource management mechanisms.
Over the past decade much work has gone into defining QoS models and designing associated resource management mechanisms for both ATM and IP networks [10, 14, 27]. This has resulted in specific QoS service models both for ATM [1] and IP [7, 30, 22]. This has also resulted in the development of QoS routing protocols [3, 28, 21, 20] and signaling protocols [2, 3, 32]. A closely related issue being investigated in the IP community is link sharing [15], the problem of how organizations can share network resources in a preset way, while allowing the flexibility of distributing unused bandwidth to other users. Darwin differs from these efforts in several aspects. First, while most of this work focuses on communication services, Darwin addresses both bitway and value-added service providers. Second, most QoS models only support QoS on a per-flow basis [7, 1]. Exceptions are the concept of VP and VC in ATM, and the IP differential service model [6, 18, 11, 33], but these efforts are very restricted either in the type of hierarchy they support or in the number of traffic aggregates for which QoS can be provided. In contrast, Darwin uses virtual networks to define service-specific QoS and supports controlled resource sharing among dynamically defined traffic aggregates of different granularities. Finally, while these efforts provide resource management mechanisms on the space, time and organizational dimensions, the mechanisms operate largely in an isolated and uncoordinated fashion. On the other hand, Darwin takes an integrated view towards resource management along these three dimensions.
The idea of “active networks” has recently attracted a lot of attention. In an active network, packets carry code that can change the behavior of the network [25]. The Darwin project touches on this concept in two ways. First, service delegates are an example of active packets, although a very restricted one: delegates are typically downloaded to a specific node at service invocation time, and remain in action for the duration. Second, Darwin's facilities for managing both computation and communication resources via virtual networks can help to solve key resource allocation problems faced by active networks.
10. Summary
We have designed a resource management system called Darwin for service-oriented networks that takes an integrated view towards resource management along space, time, and organization dimensions. The Darwin system consists of four inter-related resource management mechanisms: resource brokers called Xena, a signaling protocol called Beagle, runtime resource management using delegates, and hierarchical scheduling algorithms based on service curves. The key property of all these mechanisms is that they can be customized according to service-specific needs. While these mechanisms are most effective when they work together in Darwin, each mechanism can also be used in a plug-and-play fashion in traditional QoS architectures, e.g., Beagle for RSVP, Xena for resource brokers, and hierarchical scheduling for traffic control. We have a proof-of-concept implementation of the Darwin system and preliminary experimental results to validate the architecture.
The Darwin prototype described in this paper implements the vision and demonstrates some of the possibilities, but much work remains to be done. Future versions will be far more scalable, both in terms of routing in large topologies and in terms of aggregate processing of large numbers of flows. Security features are currently rudimentary, and explicit authentication, authorization and encryption methods remain to be incorporated. Hierarchical resource management has been implemented only for network components, and must be extended to computation and storage resources. Finally, a number of topics, while important to a complete network, are beyond the scope of the current project: tools for service creation, mechanisms for automated discovery of resources, and detailed accounting of resource commitment and use.
References
[1] ATM Forum Traffic Management Specification Version 4.0, October 1995. ATM Forum/95-0013R8.
[2] ATM User-Network Interface Specification. Version 4.0, 1996. ATM Forum document.
[3] Private Network-Network Interface Specification Version 1.0, March 1996. ATM Forum document - af-pnni-0055.000.
[4] J. Bennett and H. Zhang. Hierarchical packet fair queueing algorithms. In Proceedings of the SIGCOMM '96 Symposium on Communications Architectures and Protocols, pages 143–156, Stanford, August 1996. ACM.
[5] M. Berkelaar. lp_solve: a Mixed Integer Linear Program solver. ftp://ftp.es.ele.tue.nl/pub/lpsolve/, September 1997.
[6] S. Blake. Some issues and applications of packet marking for differentiated services, July 1997. Internet Draft.
[7] R. Braden, D. Clark, and S. Shenker. Integrated services in the Internet architecture: an overview, June 1994. Internet RFC 1633.
[8] P. Chandra, A. Fisher, C. Kosak, and P. Steenkiste. Network support for application-oriented QoS. In Proceedings of Sixth International Workshop on Quality of Service, pages 187–195, Napa, California, USA, May 1998. IEEE.
[9] P. Chandra, A. Fisher, and P. Steenkiste. Beagle: A resource allocation protocol for an application-aware Internet. Technical Report CMU-CS-98-150, Carnegie Mellon University, August 1998.
[10] D. Clark, S. Shenker, and L. Zhang. Supporting real-time applications in an integrated services packet network: Architecture and mechanism. In Proceedings of ACM SIGCOMM '92, pages 14–26, Baltimore, Maryland, Aug. 1992.
[11] D. Clark and J. Wroclawski. An approach to service allocation in the Internet, July 1997. Internet draft, draft-clark-diff-svc-alloc-00.txt, work in progress.
[12] L. Delgrossi and D. Ferrari. A virtual network service for integrated-services internetworks. In Proceedings of the 7th International Workshop on Network and Operating System Support for Digital Audio and Video, pages 307–311, St. Louis, May 1997.
[13] F. Dupuy, C. Nilsson, and Y. Inoue. The TINA consortium: Toward networking telecommunications information services. IEEE Communications Magazine, 33(11):78–83, November 1995.
[14] D. Ferrari and D. Verma. A scheme for real-time channel establishment in wide-area networks. IEEE Journal on Selected Areas in Communications, 8(3):368–379, Apr. 1990.
[15] S. Floyd and V. Jacobson. Link-sharing and resource management models for packet networks. IEEE/ACM Transactions on Networking, 3(4), Aug. 1995.
[16] R. Guerin, D. Williams, T. Przygienda, S. Kamat, and A. Orda. QoS Routing Mechanisms and OSPF Extensions. IETF Internet Draft <draft-guerin-qos-routing-ospf-03.txt>, March 1998. Work in progress.
[17] A. Gupta, W. Howe, M. Moran, and Q. Nguyen. Resource sharing in multi-party realtime communication. In Proceedings of INFOCOM '95, Boston, MA, Apr. 1995.
[18] K. Kilkki. Simple integrated media access (SIMA), June 1997. Internet Draft.
[19] A. Lazar, K.-S. Lim, and F. Marconcini. Realizing a foundation for programmability of ATM networks with the binding architecture. IEEE Journal on Selected Areas in Communication, 14(7):1214–1227, September 1996.
[20] Q. Ma and P. Steenkiste. On path selection for traffic with bandwidth guarantees. In Fifth IEEE International Conference on Network Protocols, pages 191–202, Atlanta, October 1997. IEEE.
[21] Q. Ma and P. Steenkiste. Quality of service routing for traffic with performance guarantees. In IFIP International Workshop on Quality of Service, pages 115–126, New York, May 1997. IFIP.
[22] S. Shenker, C. Partridge, and R. Guerin. Specification of guaranteed quality of service, September 1997. IETF RFC 2212.
[23] P. Steenkiste, A. Fisher, and H. Zhang. Darwin: Resource management in application-aware networks. Technical Report CMU-CS-97-195, Carnegie Mellon University, December 1997.
[24] I. Stoica, H. Zhang, and T. S. E. Ng. A Hierarchical Fair Service Curve Algorithm for Link-Sharing, Real-Time and Priority Service. In Proceedings of the SIGCOMM '97 Symposium on Communications Architectures and Protocols, pages 249–262, Cannes, September 1997. ACM.
[25] D. Tennenhouse and D. Wetherall. Towards an active network architecture. Computer Communication Review, 26(2):5–18, April 1996.
[26] TINA consortium. http://www.tinac.com.
[27] J. S. Turner. New directions in communications (or which way to the information age?). IEEE Communications Magazine, 24(10):8–15, October 1986.
[28] Z. Wang and J. Crowcroft. Quality-of-Service Routing for Supporting Multimedia Applications. IEEE JSAC, 14(7):1228–1234, September 1996.
[29] T. Wilkinson. KAFFE - A virtual machine to run Java code. http://www.kaffe.org/.
[30] J. Wroclawski. Specification of the controlled-load network element service, September 1997. IETF RFC 2211.
[31] Project X-Bind. http://comet.ctr.columbia.edu/xbind.
[32] L. Zhang, S. Deering, D. Estrin, S. Shenker, and D. Zappala. RSVP: A New Resource Reservation Protocol. IEEE Communications Magazine, 31(9):8–18, Sept. 1993.
[33] L. Zhang, V. Jacobson, and K. Nichols. A Two-bit Differentiated Services Architecture for the Internet, December 1997. Internet draft, draft-nichols-diff-svc-arch-00.txt, work in progress.