alternative solutions to sim cloning


    SIMULATION
    http://sim.sagepub.com

    Alternative Solutions for Distributed Simulation Cloning
    Dan Chen, Stephen John Turner, Boon Ping Gan, Wentong Cai, Junhu Wei and Nirupam Julka
    SIMULATION 2003; 79; 299
    DOI: 10.1177/0037549703037147

    The online version of this article can be found at:
    http://sim.sagepub.com/cgi/content/abstract/79/5-6/299

    Published by:

    http://www.sagepublications.com

    On behalf of:

    Society for Modeling and Simulation International (SCS)


    © 2003 Simulation Councils Inc. All rights reserved. Not for commercial use or unauthorized distribution. Downloaded from http://sim.sagepub.com at PENNSYLVANIA STATE UNIV on April 15, 2008.


    Alternative Solutions for Distributed Simulation Cloning

    Dan Chen
    Singapore Institute of Manufacturing Technology
    Singapore [email protected]

    Stephen John Turner
    School of Computer Engineering
    Nanyang Technological University
    Singapore 639798

    Boon Ping Gan
    Singapore Institute of Manufacturing Technology
    Singapore 638075

    Wentong Cai
    Junhu Wei
    School of Computer Engineering
    Nanyang Technological University
    Singapore 639798

    Nirupam Julka
    Singapore Institute of Manufacturing Technology
    Singapore 638075

    Simulation cloning is designed to satisfy the requirement of examining alternative scenarios concurrently. This article discusses the issues involved in cloning distributed simulations based on the High Level Architecture (HLA) and proposes tentative solutions. Alternative solutions are compared from both the qualitative and quantitative point of view. In terms of federation organization, candidate solutions can be classified into the single-federation and the multiple-federation categories. To guarantee the correctness and optimize the performance of the whole cloning-enabled distributed simulation, the single-federation solution requires an additional mechanism to isolate the interactions among alternative executions. Data distribution management (DDM) is one of the candidate approaches. To measure the trade-off between complexity and efficiency, the authors introduce a series of experiments to benchmark various solutions at the runtime infrastructure (RTI) level. The benchmark results indicate that the single-federation solution provides encouraging performance when using DDM.

    Keywords: Simulation cloning, HLA, RTI, single federation, multiple federation, data distribution management

    1. Introduction

    Distributed simulation is an important technology that enables a simulation program to be executed on a distributed computer system. A large-scale simulation may be constructed by linking together existing simulation models at possibly different locations to form a simulation federation. The main applications of distributed simulation include military applications, entertainment, social interactions and business collaborations, education and training, and so forth. Distributed simulation technology also meets the pressing need of supply chain simulation, as a supply chain often involves multiple companies across enterprise boundaries [1, 2].

    SIMULATION, Vol. 79, Issue 5-6, May-June 2003, 299-315. © 2003 The Society for Modeling and Simulation International. DOI: 10.1177/003754903037147

    The High Level Architecture (HLA) defines the rules and specifications to support reusability and interoperability among the simulation components (federates). One federate is able to interact with different federates using the runtime infrastructure (RTI) [3]. However, in a normal analytic distributed simulation, the output reports one single set of results per simulation. To optimize the simulated system or perform what-if analysis, one should repeat the simulation multiple times to examine alternative scenarios, decision policies, and strategies using different decision rules and parameters. Subsequently, the best solutions may be selected based on all the possible results. Repeating the simulation in this way is a time-consuming and onerous task.

    When a federate reaches a decision point, it is faced with multiple different choices. The cloning approach offers users the flexibility to examine the different choices concurrently, rather than executing them one by one in a deterministic or random way. At the decision point, the federate can replicate itself into multiple copies (clones) to explore each possibility. Each clone explores one particular path together with its partner clones spawned from the other federates in the original scenario. From an individual clone's point of view, it merely interacts with its partners to form an independent scenario. This provides the user with the opportunity to evaluate multiple alternative results concurrently using the same simulation run. A further consideration is that simulation components running at different locations are liable to failure. It may be possible to use the replication mechanisms for cloning to achieve fault tolerance. If one scenario encounters failure, the remaining concurrent scenarios are still able to continue execution.

    One of the key benefits of HLA-based simulation is reusability [4]. It means the component simulation models can be reused in different simulation scenarios and applications. A middleware approach is introduced to hide the implementation of the cloning mechanism and still provide users with the standard RTI interface. The middleware is located between the user federate and the real RTI and encapsulates modules for cloning and simulation control. Thus, the complexity incurred by cloning is hidden from the users.

    When generating a new clone, alternative solutions may be adopted. The clone may persist in the original federation or form a new federation together with its partners. To keep the correctness of the execution in various paths, one should apply some mechanism to isolate the interactions among clones in different execution paths. In other words, logically, one clone will only interact with its partners, which are the corresponding clones of other federates. In a single federation, this may be achieved either by tagging the interactions or by using data distribution management (DDM) [4]. To investigate the overhead incurred by cloning, this article presents a set of benchmark experiments to examine the execution time of given distributed simulations using the different solutions. The experiments explore the impact on the execution time of both message transmission and synchronization.

    A shorter version of this article was presented at the 36th Annual Simulation Symposium [5]. This extended article also discusses two DDM solutions introduced for scenario management, as well as clone and scenario identification [6]. A recursive region division solution and a point region solution are discussed and compared. The former solution initializes each original federate with a region occupying the full dimension of the routing space. During the cloning procedure, the new clones inherit split subregions from their parent. The latter solution specifies a point region for each original federate, and a new clone is given an additional point region on birth. Shared clones may have changing region combinations when cloning occurs in the scenarios in which they operate. Finally, this article also presents how the mechanism may be implemented using a middleware method to address reusability issues. In this method, scenario manager and region manager modules are developed to support both cloning execution and extensions to the RTI. Our proposed middleware approach offers transparency in adding scenario management to the users' federates.
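The recursive region division idea can be illustrated with a small sketch. This is not the authors' implementation: the even split, the one-dimensional extent `FULL_DIMENSION`, and the helper name `split_region` are all assumptions made for illustration only.

```python
def split_region(region, num_parts):
    """Split a half-open interval [lower, upper) into equal subregions.

    Assumption: an even split; the article only says that new clones
    inherit split subregions from their parent.
    """
    lower, upper = region
    width = (upper - lower) / num_parts
    return [(lower + i * width, lower + (i + 1) * width) for i in range(num_parts)]


# Hypothetical full dimension of a one-dimensional routing space.
FULL_DIMENSION = (0.0, 1.0)

# The original federate owns the full dimension; a first cloning step
# divides it between two clones.
first_generation = split_region(FULL_DIMENSION, 2)

# One of those clones is cloned again and its subregion is divided in turn.
second_generation = split_region(first_generation[1], 2)
```

Because each subregion lies inside its parent's region, regions of clones in different branches never overlap under this scheme, which is the property needed to keep scenarios apart.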

    The rest of this article is organized as follows: distributed simulation cloning technology and related work are addressed in section 2. Section 3 covers the alternative solutions in detail. Section 4 introduces the design of the experiments and analyzes the benchmark results. Section 5 studies the issues related to managing concurrent scenarios and gives the two region management solutions. In section 6, we conclude with a summary and proposals on future works.

    2. Distributed Simulation Cloning

    2.1 Related Work

    Hybinette and Fujimoto [7] proposed simulation cloning in the context of parallel simulation. The motivation for this technique was to develop a parallel model that supports an efficient, simple, and effective way to evaluate and compare alternate scenarios. The method was targeted for parallel discrete event simulators that provide the simulation application developer with a logical process (LP) execution model.

    Schulze, Straßburger, and Klein [8] introduced a cloning approach to extend the flexibility of system composition to runtime. Their approach included the parallel management of different time axes to provide forecast functionality. Internal cloning and external cloning techniques were suggested to clone the federates at runtime.

    As our design targets potential industry users who may have their own existing complex simulation models, we have the additional aim to provide reusability and transparency while enabling simulation cloning. Providing easy utilization and deployment is another major concern. Distributed simulation cloning technology should be a much more powerful and flexible decision support tool than traditional linear simulation.


    2.2 Issues in Simulation Cloning

    2.2.1 Decision Points

    During the execution of a simulation, a federate may face different choices to perform alternative actions. The federate is cloned at such decision points according to some rules. A decision point represents the location in the execution path where the states of the system start to diverge in a cloning-enabled simulation. Cloning differs from simple replication in the sense that clones of the original federate execute in different paths rather than simply repeat the same executions, even though the computation of clones is identical at the decision point. From the decision point onwards, a simulation spawns multiple executions to exploit alternative scenarios concurrently.

    For example, in a simple manufacturing simulation illustrated in Figure 1, a batch machine is processing a lot while some lots are waiting in a queue. In this figure, part (A) gives the manufacturing process, and part (B) indicates the advancing of the simulation. When the current job finishes, the machine status changes from "busy" to "idle." If it is found that the queue length has become greater than the alert level, one needs to use some special dispatching rule to process the waiting lots. The time and cost to be consumed may vary for different rules. Instead of simulating all the three possibilities from the start, it is possible to clone two more executions to examine the rules of interest concurrently once the condition QUEUE_LENGTH_GREATER_ALERT is met. The point where the queue length reaches the alert level and the machine becomes idle is one decision point in this simulation.
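The manufacturing example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MachineState` class, the `ALERT_LEVEL` value, and the helper functions are hypothetical names, and the sketch creates one copy per rule rather than keeping the original execution and adding two clones.

```python
import copy
import random
from dataclasses import dataclass, field

ALERT_LEVEL = 3  # hypothetical alert threshold for the queue length


@dataclass
class MachineState:
    """State of the batch machine federate at some simulation time."""
    queue: list = field(default_factory=list)  # waiting lots, oldest first
    busy: bool = False
    rule: str = "FIFO"  # dispatching rule explored by this execution


def at_decision_point(state: MachineState) -> bool:
    # The machine is idle and the queue length exceeds the alert level.
    return (not state.busy) and len(state.queue) > ALERT_LEVEL


def clone_at_decision_point(state: MachineState, rules):
    """Return one independent copy of the state per candidate rule."""
    clones = []
    for rule in rules:
        clone = copy.deepcopy(state)  # identical state at the decision point
        clone.rule = rule             # executions diverge from here onwards
        clones.append(clone)
    return clones


def next_lot(state: MachineState, rng: random.Random):
    """Remove and return the next lot according to the clone's rule."""
    if state.rule == "FIFO":
        return state.queue.pop(0)
    if state.rule == "FILO":
        return state.queue.pop()
    return state.queue.pop(rng.randrange(len(state.queue)))  # Random rule


state = MachineState(queue=["lot1", "lot2", "lot3", "lot4"], busy=False)
if at_decision_point(state):
    clones = clone_at_decision_point(state, ["FIFO", "FILO", "Random"])
else:
    clones = [state]
picked = {c.rule: next_lot(c, random.Random(0)) for c in clones}
```

Each clone starts from the same queue contents and only differs in the rule it applies afterwards, which mirrors the statement that the computation of clones is identical at the decision point.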

    From the user's point of view, decision points can be specified either at the modeling stage or at runtime. At the modeling stage, the user can specify the conditions of the decision points, as well as the policies and rules for the decision points. During runtime, the user can insert decision points dynamically and can even specify alternative execution paths at a decision point interactively.

    2.2.2 Active and Passive Cloning of Federates

    When a federate reaches a decision point, it makes clones on its own initiative. This federate is said to perform active cloning. Each clone of the federate executes a separate scenario. In distributed simulations, there are multiple federates interoperating with each other. When one federate splits into different executions, the partners that interact with this federate may have to spawn clones to perform proper interaction, even though the partners have not yet met a decision point. Those partners are said to perform passive cloning. Generally speaking, the clones generated in an active cloning have separate initial states while those created in a passive cloning have identical initial states. Only the active cloning leads to the creation of new scenarios, and this induces passive cloning in some other federates.

    Figure 1. A simple manufacturing simulation. FIFO = first in, first out; FILO = first in, last out.
    [Figure not reproduced. Recoverable labels: panels (A) and (B); Lot Queue; Batch Machine; BUSY; IDLE; Queue Length > alert level?; Decision Point; Multiple dispatch rules (FIFO, FILO, Random).]

    2.2.3 Entire versus Incremental Cloning

    A simple approach to keep the correctness of the simulation is to clone the whole simulation whenever any federate reaches a decision point (i.e., entire cloning). Each cloned simulation consists of a separate set of federates that report results independently. One can see that the scalability of distributed simulation is a challenge in this case. Another approach replicates the simulation incrementally (i.e., only those federates whose states will alter at a decision point need to be cloned; other federates will remain intact). Such an incremental cloning approach shares computation between federates in alternative scenarios and provides a more efficient and scalable method to clone the distributed simulation.
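The scalability difference between the two approaches can be made concrete by counting federate instances after one decision point. The sketch below is a simplified model, not taken from the article: it assumes a single decision point, and the function names and the example figures (10 federates, 2 affected, 3 scenarios) are hypothetical.

```python
def entire_cloning_count(num_federates: int, num_scenarios: int) -> int:
    """Entire cloning: every scenario gets its own full set of federates."""
    return num_federates * num_scenarios


def incremental_cloning_count(num_federates: int, num_affected: int,
                              num_scenarios: int) -> int:
    """Incremental cloning: only affected federates are replicated.

    `num_affected` counts the federates whose state alters at the decision
    point (actively or passively cloned); the others remain intact and are
    shared by all scenarios.
    """
    shared = num_federates - num_affected
    return shared + num_affected * num_scenarios


# Hypothetical federation: 10 federates, 2 of them affected, 3 scenarios.
entire = entire_cloning_count(10, 3)
incremental = incremental_cloning_count(10, 2, 3)
```

In this example entire cloning runs 30 federate instances while incremental cloning runs 14, since 8 federates are shared across the three scenarios.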

    2.2.4 Shared Clones

    In some cases, a federate does not need to perform passive cloning when its partner clones actively. Although new scenarios have been created due to the active cloning, one clone is capable of executing in multiple scenarios. In this article, we call such clones shared clones. A shared clone may subsequently perform cloning passively during the execution of the simulation on demand of its partners.

    3. Alternative Solutions for Cloning in HLA-Based Distributed Simulation

    When a federate is cloned, we can create multiple federations to meet the demand of executing alternative scenarios or generate new federates to operate in the original federation without intervening in the execution of any other scenario. This article uses the term multiple-federation solution (MF) for the former design and single-federation solution (SF) for the latter.

    3.1 Single-Federation Solution versus Multiple-Federation Solution

    Figure 2 depicts the cloning of a simulation using both solutions. Federates inside the dashed rectangle represent the clones originating from a common ancestor. Following the cloning action that is triggered at the decision point, both original federates (A and B) duplicate themselves to form two different scenarios. By applying the SF solution, the new federates fedA[1] and B[1] participate in the original federation RTI[0], and there is no need for an additional federation to support the new scenario (labeled 1). By applying the MF solution, fedA[1] and B[1] form another federation RTI[1] to facilitate another scenario (labeled 1).

    The two kinds of solutions mentioned above involve different research issues and problems, especially at the RTI level. Table 1 gives a comparison showing the advantages and disadvantages of both solutions from different viewpoints. Making a trade-off among these issues is challenging work. As the RTI does not provide destination-specific delivery in its object management services, the single-federation solution requires an additional mechanism to isolate interactions among clones in different scenarios.

    3.2 DDM versus Non-DDM in a Single-Federation Solution

    Separate scenarios can be isolated by using a straightforward approach that filters events at the receiver side. Each event is attached with the exclusive identity of the scenario of the sender. The receivers discard those events from other scenarios and merely reflect those belonging to the same scenario. A minimal effort is required to enable this filtering in addition to the standard RTI services.
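Receiver-side filtering can be sketched without any RTI at all. The following is a toy model, not RTI code: `Event`, `FilteringFederate`, and `broadcast` are hypothetical names, and the broadcast function merely stands in for a federation in which every subscriber receives every update.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    scenario_id: int  # exclusive identity of the sender's scenario (the tag)
    payload: bytes


class FilteringFederate:
    """Federate that reflects only events tagged with its own scenario."""

    def __init__(self, scenario_id: int):
        self.scenario_id = scenario_id
        self.reflected = []
        self.discarded = 0

    def receive(self, event: Event) -> None:
        if event.scenario_id == self.scenario_id:
            self.reflected.append(event)
        else:
            # The event was still delivered, so the cost of transmitting
            # and inspecting it has already been paid.
            self.discarded += 1


def broadcast(event: Event, federates) -> None:
    # Stand-in for a federation without DDM: all subscribers get the event.
    for fed in federates:
        fed.receive(event)


federates = [FilteringFederate(scenario_id=i) for i in range(3)]
broadcast(Event(scenario_id=1, payload=b"update"), federates)
```

The filter is simple to implement, but every federate outside scenario 1 still had to receive and inspect the event before discarding it.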

    It is possible to use DDM services to partition scenarios in the overall cloning-enabled distributed simulation. In general, the purpose of DDM services is to reduce the transmission and receipt of irrelevant data by the federates. Routing spaces are a collection of dimensions that represent coordinate axes of the federation problem space with a bounded range [9]. A region defines a multidimensional subspace in the routing space by defining the lower bound and upper bound on each dimension of the routing space. In DDM, data producers and consumers specify their data properties and data requirements by providing update regions and subscription regions. Data connection will be established between a pair of federates only when an update region and a subscription region overlap. With this property, DDM seems to be a natural candidate for developing the mechanism to restrict the interaction among clones to within the same scenario.
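The overlap rule can be expressed as a short predicate. This is a sketch under assumptions rather than the RTI's own matching code: regions are modeled as dictionaries from a dimension name to a half-open extent [lower, upper), both regions are assumed to cover the same dimensions, and the dimension name "scenario" is hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical region model: dimension name -> half-open extent [lower, upper).
Region = Dict[str, Tuple[float, float]]


def regions_overlap(update: Region, subscription: Region) -> bool:
    """Return True when the two regions intersect on every dimension."""
    for dim, (u_lo, u_hi) in update.items():
        s_lo, s_hi = subscription[dim]
        if u_hi <= s_lo or s_hi <= u_lo:
            return False  # disjoint on this dimension, so no data connection
    return True


# Two scenario-specific regions on a single hypothetical dimension.
scenario_1 = {"scenario": (50.0, 51.0)}
scenario_2 = {"scenario": (150.0, 151.0)}

same_scenario = regions_overlap(scenario_1, scenario_1)
cross_scenario = regions_overlap(scenario_1, scenario_2)
```

With one nonoverlapping region per scenario, updates from one scenario never match subscriptions in another, so the isolation happens before delivery rather than at the receiver.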

    By assigning a scenario-specific region to one set of clones, the interactions will automatically be confined to this scenario. However, this incurs some extra overhead for managing the regions and increases the complexity of implementation. It is not necessary to introduce the DDM mechanism to the multiple-federation solution, as the communication traffic has already been confined within each federation. Thus, there are three candidate solutions for cloning HLA-based distributed simulations: the MF, the DDM single-federation solution (DSF), and the non-DDM single-federation solution (NDSF). These solutions can be evaluated by considering another important criterion: their efficiency. The trade-off between efficiency and complexity is a main concern of the distributed system designer. Section 4 introduces the benchmark experiments to measure the overall performance of the three solutions in terms of execution time.

    4. Benchmark Experiments and Results

    The primary objective of the experiments is to provide some criteria on complexity and efficiency to help us decide which cloning method to adopt, namely, MF, DSF, or NDSF, as discussed.

    We will study the computation complexity involved us-ing three factors: (1) the lower bound time stamp (LBTS)computation for time advance [4], (2) the interaction be-tween federates, and (3) the load of the federate processeswithin a PC. The experiments explore how these factorsaffect the overall performance. The experiments will re-port results in execution time to perform a given task usingeach solution.

    4.1 Experiment Design

    The experiments use three PCs in total (PCs 1, 2, and 3 in Figures 3 and 4), in which PC 2 executes the RTIEXEC and FEDEX processes [4]. The federates that run at one independent PC are enclosed in a dashed rectangle. In our case, fedA[i] and fedB[i] (i ≥ 1) occupy PC 1 and PC 3, respectively. The PCs are interlinked via an EtherFast 100 five-port workgroup switch, which constructs an isolated subnet to avoid fluctuation incurred by additional network traffic. The configuration of the PCs is as follows:

    Intel 1700-MHz Pentium IV

    256 Mbytes of RAM

    Windows 2000 Professional

    DMSO RTI NG 1.3 V4

    The experiments emulate the simulation cloning process by increasing the number of identical federates. In Figures 3 and 4, fedA[1] and B[1] form a pair of initial federate partners, which represent the federates to be cloned. FedA[i] and B[i] (i > 1) stand for the ith clones of the two original federates, respectively. The federates are tailored based on the DMSO standard benchmarking programs (see http://sdc.dmso.mil).

    Figure 2. An example of a single-federation solution and a multiple-federation solution
    [Figure not reproduced. Recoverable labels: Initial Scenario; Reach Decision Point; Fed A active cloning; Fed B passive cloning; Single-federation Solution; Multiple-federation Solution.]

    As indicated in Figure 3, each pair of fedA[i] and B[i] joins in an exclusive federation, which is denoted as RTI[i]. We employ this set of scenarios to perform the benchmarking experiments for the MF solution. In Figure 4, all federates form one single federation; accordingly, we use this set of scenarios to measure the performance of NDSF and DSF solutions.

    Each federate is either time constrained/time regulating or neither. In one run, each federate updates an attribute instance and receives an acknowledgment from its partner (from fedA[i] to fedB[i], and vice versa) 10,000 times with a payload of 150 bytes. A federate merely reflects the events with an identical ID to itself. In other words, fedA[i] will discard any events not generated by fedB[i], and vice versa. In time stamp order (TSO) mode, federates advance federate time from 0 to 10,000, with time step = 1 and lookahead = 1. To investigate the efficiency for time synchronization among federates, we also examine the execution time for the standard time advancement benchmarking federates using both the MF and SF solutions (see http://sdc.dmso.mil).
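To see why time synchronization couples unrelated scenarios in a single federation, a simplified LBTS calculation helps. The sketch below is an assumption-laden simplification, not the RTI's algorithm: it takes the bound for a federate to be the minimum of logical time plus lookahead over the other time-regulating federates and ignores messages in transit. The snapshot values are hypothetical; only lookahead = 1 comes from the experiment setup.

```python
def lbts(federate, logical_time, lookahead):
    """Simplified lower bound on the time stamp of future TSO events.

    Considers only the other time-regulating federates in the same
    federation and ignores in-transit messages.
    """
    return min(
        logical_time[name] + lookahead[name]
        for name in logical_time
        if name != federate
    )


# Hypothetical snapshot of one federation holding two scenario pairs.
single_time = {"fedA[1]": 5, "fedB[1]": 5, "fedA[2]": 3, "fedB[2]": 4}
single_lookahead = {name: 1 for name in single_time}
bound_single = lbts("fedA[1]", single_time, single_lookahead)

# With one federation per pair, fedA[1] is bounded only by its partner.
pair_time = {"fedA[1]": 5, "fedB[1]": 5}
pair_lookahead = {name: 1 for name in pair_time}
bound_pair = lbts("fedA[1]", pair_time, pair_lookahead)
```

In the shared federation the slowest federate of another scenario holds fedA[1] back, and the minimum is taken over more federates as pairs are added; in the separate federation only the partner matters.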

    Figure 3. Test bed for the multiple-federation solution (MF)
    [Figure not reproduced. Recoverable labels: FED A[i] and FED B[i] pairs, one RTI per pair, three PCs.]

    Figure 4. Test bed for the data distribution management single-federation solution (DSF) and the non-data distribution management single-federation solution (NDSF)
    [Figure not reproduced. Recoverable labels: FED A[i] and FED B[i] federates, a single RTI, three PCs.]

    Table 1. Comparison between single-federation and multiple-federation solutions

    Issues | Single-Federation | Multiple-Federation
    Interaction | Additional mechanism is needed to deal with unnecessary event crossing among concurrent scenarios. | Interaction between clones in various scenarios is isolated by default.
    Synchronization | Unnecessary synchronization among clones in different scenarios is inevitable. | Synchronization among clones in different scenarios is eliminated.
    Complexity of sharing | Clone sharing is available in a single federation. | Clone sharing is difficult among federations.
    Robustness | If a runtime infrastructure (RTI) instance crashes, the simulation will fail. | Multiple RTI instances for one user federation are maintained, and thus one RTI instance crash will not result in failure of the whole simulation.
    Management of clones | Management of clones is easier inside the same federation. | Management of clones crossing multiple federations is indirect.

    Each federate can also be set as DDM enabled and non-DDM. An exclusive ID is shared between fedA[i] and fedB[i]. In the DDM-enabled mode, each pair of federates has an associated region, which is pair specific and nonoverlapping to any other region. The current DMSO RTI-NG [4] supports a number of DDM data-filtering strategies, one of which is the StaticGridPartitioned strategy. This strategy partitions individual spaces into a grid in which each grid partition is assigned a separate reliable and best-effort channel [10]. To optimize the utilization of communication channels, we split the full dimension, [MIN_EXTENT, MAX_EXTENT), evenly into the same number of segments as NumPartitionsPerDimension (NPPD). The middle point of each segment will be defined as one region. We set the region associated to the kth pair of federates as follows:

    \[
    \left[\; \frac{(2k-1)\,(\mathrm{MAX\_EXTENT}-\mathrm{MIN\_EXTENT})}{2\,\mathrm{NPPD}} + \mathrm{MIN\_EXTENT},\;\;
    \frac{(2k-1)\,(\mathrm{MAX\_EXTENT}-\mathrm{MIN\_EXTENT})}{2\,\mathrm{NPPD}} + \mathrm{MIN\_EXTENT} + 1 \;\right)
    \]

    In these experiments, all federates subscribe and publish the same object class and associate the designated region to their subscription and updates.
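The region assignment for the kth pair can be checked numerically. The sketch below assumes the region is the unit-width interval starting at the midpoint of the kth of NPPD equal segments, as the text describes. The function name `pair_region` and the example extent values (0 and 1000, with 10 partitions) are hypothetical and are not the RTI's actual constants.

```python
def pair_region(k: int, nppd: int, min_extent: float, max_extent: float):
    """Region [lower, lower + 1) for the kth federate pair (k = 1..nppd).

    `lower` is the midpoint of the kth of `nppd` equal segments of
    [min_extent, max_extent).
    """
    if not 1 <= k <= nppd:
        raise ValueError("k must lie between 1 and nppd")
    lower = (2 * k - 1) * (max_extent - min_extent) / (2 * nppd) + min_extent
    return lower, lower + 1


# Hypothetical extents and partition count, chosen for easy arithmetic.
MIN_EXTENT, MAX_EXTENT, NPPD = 0, 1000, 10

regions = [pair_region(k, NPPD, MIN_EXTENT, MAX_EXTENT) for k in range(1, NPPD + 1)]
```

With these values the first pair gets (50.0, 51.0) and the tenth pair gets (950.0, 951.0), so each region falls inside its own grid partition and no two regions overlap.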

    4.2 Benchmark Results and Analysis

    Some abbreviated notations are used to denote the properties of the federates, as listed in Table 2. To investigate the factors that affect the performance of alternative solutions, we designed several series of experiments, as indexed in Table 3. The notations are the same as in the previous discussion. Combining the time features and solutions together, we perform eight series of experiments in total. In each series of experiments, the number of federates increases by 1 pair each time, from 5 pairs to 14 pairs.

    4.2.1 Comparing Alternative Cloning Solutions Using Timestamp Order (TSO) Federates

    Figure 5 reports the execution time of TSO federates using the three different solutions. The execution time of the TSO-NDSF scenarios has an obvious increase when the number of federate pairs reaches 7. When no DDM services are used, the execution time increases sharply with more and more TSO federates joining the same federation. The topmost value (2800 seconds, 14 pairs) is more than five times the start value (500 seconds, 5 pairs). On the contrary, the execution times of both the TSO-DSF and TSO-MF scenarios stay at a relatively stable level, about 500 seconds, despite the increase in the number of participating federates.

    Figure 5. Execution time comparison among TSO federate cloning mechanisms. For definitions of notations, see Table 2.
    [Chart not reproduced. Title: Execution Time Comparison Between Different Cloning Solutions using TSO Federates. X axis: Federate Pairs (5 to 14). Y axis: Execution Time (Seconds), 0 to 3000. Series: TSO-NDSF, TSO-DSF, TSO-MF.]

    In the single-federation mode, all federates interact with each other through the same RTI, and the computation load of LBTS increases with the number of federates. In the TSO-NDSF scenarios, of more importance is that each federate not only receives useful events from its partner federate but also has to filter out some useless messages from all other federates as they belong to other scenarios. The overall communication traffic through the RTI is proportional to \(C_n^2 = \frac{n(n-1)}{2}\), where n is the number of federates. Only \(\frac{1}{n-1}\) of the total incoming events make sense to one particular federate pair. Other unnecessary communication and reflection increase the overhead of communication in the RTI dramatically. The DDM services strictly confine the interaction to the pair of federates with a common region. This optimization results in the significantly improved performance, as indicated in the curve TSO-DSF.
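The growth described above can be tabulated for the federation sizes used in the experiments. The sketch below evaluates only the two expressions from the text, with n taken as the number of federates (two per pair); the function names are hypothetical, and the figures are a counting exercise rather than measured traffic.

```python
from fractions import Fraction
from math import comb


def communication_links(n: int) -> int:
    """Number of federate pairs, C(n, 2) = n(n - 1) / 2."""
    return comb(n, 2)


def useful_fraction(n: int) -> Fraction:
    """Share of incoming events that come from a federate's own partner."""
    return Fraction(1, n - 1)


# 5 to 14 federate pairs, i.e. n = 10 to n = 28 federates.
table = {
    pairs: (communication_links(2 * pairs), useful_fraction(2 * pairs))
    for pairs in range(5, 15)
}
```

At 5 pairs (n = 10) there are 45 federate-to-federate links and one ninth of the incoming events are useful; at 14 pairs (n = 28) there are 378 links and only 1/27 of the incoming events are useful.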

    In multiple-federation mode, a federate interacts with its partner through an exclusive RTI. Also the LBTS computations will take place between a pair of federates independently. These positive factors lead to the much better performance compared with the TSO-NDSF scenarios.


    Table 2. Notations of the federate attributes

    Notation | Meaning
    TSO | Federates are time regulating/time constrained and use time stamp order update and reflection
    RO | Federates use receive order update and reflection only
    SYN | Standard DMSO time advancement benchmarking application
    MF | Experiment in multiple-federation (MF) mode
    DSF | Experiment in single-federation (SF) mode using data distribution management (DDM)
    NDSF | Experiment in SF mode without using DDM

    Table 3. The index of the experiments

    Solution | TSO | RO | SYN
    NDSF | Experiment 1: TSO-NDSF | Experiment 4: RO-NDSF | Experiment 7: SYN-SF
    DSF | Experiment 2: TSO-DSF | Experiment 5: RO-DSF |
    MF | Experiment 3: TSO-MF | Experiment 6: RO-MF | Experiment 8: SYN-MF

    Note. For definitions of notations, see Table 2.

    Figure 6. CPU utilization comparison among TSO federate cloning mechanisms. For definitions of notations, see Table 2.
    [Chart not reproduced. Title: CPU Utilization Comparison Between Different Cloning Solutions using TSO Federates. X axis: Federate Pairs (5 to 14). Y axis: CPU Utilization (%), 0 to 100. Series: TSO-NDSF, TSO-DSF, TSO-MF.]

    CPU utilization percentage reports the processor activity in the computer. This counter sums the average non-idle time of all processors during the sample interval and divides it by the number of processors. The CPU utilization results in Figure 6 indicate that the TSO-NDSF scenarios consume much more system resources than the other two solutions. However, the TSO-DSF and TSO-MF scenarios also show an upward trend in CPU utilization. The CPU utilization of the TSO-NDSF scenarios reaches about 90% after the number of federate pairs exceeds 7. These experiments combine several complex operations, including the LBTS computation and the receiving and reflecting of TSO events. We attempt to isolate these two factors in the following experiments.

    4.2.2 Comparing Alternative Cloning Solutions Using Receive Order (RO) Federates

    To further investigate the computational complexity in these solutions, we disable the time feature of federates and reapply the three solutions to them. Execution time and CPU utilization are presented in Figure 7 and Figure 8.

    Similar to the previous experiments, the execution time of RO-NDSF scenarios increases rapidly with the number of federates. The peak execution time (1200 seconds, 14 pairs) is about six times the start value (200 seconds, 5 pairs). The execution time of RO-NDSF scenarios is always greater than that of RO-DSF and RO-MF scenarios. The RO-DSF and RO-MF scenarios have execution times that fluctuate slightly between 100 seconds and 200 seconds. As in the discussion of the TSO scenarios, the extra communication and reflection lower the performance of RO-NDSF solutions significantly.

    Figure 8 also shows that the RO-NDSF scenarios consume much more system resources than the other two solutions. The CPU utilization of the RO-NDSF scenarios jumps to 100% after the number of federate pairs exceeds 6. The RO-DSF and RO-MF scenarios have a very low CPU utilization (less than 10%) until there are more than 12 pairs of federates.

    4.2.3 Comparing Alternative Cloning Solutions by Using the Time Advancement Benchmark

    To have a better understanding of how another factor, the LBTS calculation, affects the performance, we re-examine the SF and MF solutions by introducing the standard time advancement benchmark federates (see http://sdc.dmso.mil). Execution time and CPU utilization of both solutions are shown in Figure 9 and Figure 10. As DDM does not intervene in the synchronization among federates, this series of benchmarks ignores the DDM mechanism.


    Figure 7. Execution time comparison between cloning solutions using RO federates (x-axis: federate pairs, 5 to 14; y-axis: execution time in seconds, 0 to 1400; curves: RO-NDSF, RO-DSF, RO-MF). For definitions of notations, see Table 2.

    Figure 8. CPU utilization comparison between cloning solutions using RO federates (x-axis: federate pairs, 5 to 14; y-axis: CPU utilization, 0 to 100%; curves: RO-NDSF, RO-DSF, RO-MF). For definitions of notations, see Table 2.

    We can conclude that the LBTS calculation does not make a significant difference in either the MF or SF scenarios as the number of federates increases. The RO-NDSF and TSO-NDSF scenarios show a rapidly increasing execution time. This means that reducing interactions is the key to optimizing the overall system performance in the simulation, as long as the number of federates stays in a reasonable range.

    Figure 9. Execution time comparison between cloning solutions using time advancement federates (x-axis: federate pairs, 5 to 14; y-axis: execution time in seconds, 499.5 to 503; curves: SYN-MF, SYN-SF). For definitions of notations, see Table 2.

    Figure 10. CPU utilization comparison between cloning solutions using time advancement federates (x-axis: federate pairs, 5 to 14; y-axis: CPU utilization, 0 to 100%; curves: SYN-MF, SYN-SF). For definitions of notations, see Table 2.

    5. Managing Scenarios

    Experimental results have indicated that the performance of the DDM-based approach is encouraging compared with a non-DDM approach. We only consider the single-federation solution in the following discussion, as it is relatively easier to manage clones and share computation among different scenarios. In this section, we focus on the issues related to the concurrent scenarios in the single-federation architecture. We propose two DDM-based alternative solutions to manage the scenarios, namely, the recursive region division solution and the point region solution. We give the details of the two solutions and also compare their advantages and drawbacks. The issues include coding scenarios, region specification for each scenario, and ensuring user transparency through the middleware approach.

    5.1 Scenario Tree

    In the context of distributed simulation cloning, each clone (federate) is an individual entity, whereas each scenario is a dynamic group that involves a changing combination of member clones. Each scenario reports simulation results independently; it is the basic unit in our consideration and discussion. Only active cloning can drive the creation of new scenarios. We use a tree data structure to represent the relationship and development of the scenarios. Active cloning of any federate will incur the coordination of all other related clone federates.

    Figure 11. An example of incremental cloning and scenario tree. Part (A) shows the cloning procedure along the simulation time axis (0, T1, T2, T3): B[0] performs active cloning, C[0] performs passive cloning (when X != X'), and C[1] performs active cloning. Part (B) shows the scenario tree with scenarios S[0], S[1], and S[2].

    Figure 11 gives an example of a cloning-enabled simulation in which fedA[0] operates as an event publisher and fedB[0] and fedC[0] exchange events. Part (A) illustrates the details of the overall cloning procedure. Part (B) gives an abstraction of the scenario tree, in which each parent node (full dot) represents an occurrence of active cloning. The leaf nodes (circles) stand for active scenarios at the current simulation time. An active cloning results in the spawning of new scenarios, which is reflected in the figure as parent nodes that have multiple children. Each scenario is marked as S[i] (i = 0, 1, 2, ...). The scenario tree grows along the simulation time axis as follows:

    At simulation time 0, there exists a single scenario S[0]. When simulation time is advanced to T1, fedB meets a decision point and performs active cloning, splitting into clones B[0] and B[1]. A new branch S[1] is created in the scenario tree at this point. An event generated by B[0] is named event X, while an event from B[1] is called event X'. FedC[0] remains intact, and this will continue as long as events X and X' remain the same. FedC[0] operates as a shared clone for the duration between time T1 and time T2.

    At simulation time T2, when an event X' deviates from event X, this incurs a passive cloning of fedC and results in the birth of clone C[1]. This passive cloning does not trigger any change in the scenario tree.

    At simulation time T3, fedC[1] performs an active cloning, spawning off clone C[2]. A new scenario is created and marked as S[2] in the scenario tree.
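    The growth of the scenario tree in this walk-through can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class name ScenarioNode and its methods are hypothetical, and the sketch assumes that S[2] branches from S[1] because fedC[1] belongs to that scenario.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioNode:
    """One node of the scenario tree; leaves are the currently active scenarios."""
    label: str
    children: list = field(default_factory=list)

    def active_clone(self, new_labels):
        # Active cloning turns this leaf into a parent node: the original
        # scenario continues as the first child, and each new scenario
        # becomes a sibling leaf.
        self.children = [ScenarioNode(self.label)]
        self.children += [ScenarioNode(name) for name in new_labels]

    def leaves(self):
        if not self.children:
            return [self.label]
        return [leaf for child in self.children for leaf in child.leaves()]


root = ScenarioNode("S[0]")      # time 0: a single scenario
root.active_clone(["S[1]"])      # T1: fedB clones actively, S[1] is created
# T2: fedC clones passively; the tree is intentionally left unchanged
root.children[1].active_clone(["S[2]"])  # T3: fedC[1] clones actively (assumed to sit in S[1])
print(root.leaves())
```

    Only the two active clonings touch the tree, which matches the rule that passive cloning never creates a scenario.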

    A combinatorial explosion of scenarios in clone-based distributed simulation may occur in some situations. The number of possible scenarios is determined by (1) the number of active cloning federates, (2) the times those federates perform active cloning, and (3) the candidate choices that each decision point represents. Human intervention may reduce the combinatorial explosion, but it is difficult to reach a general solution.
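    To get a feel for the growth, one can compute a simple upper bound. The sketch below rests on an assumption that is not stated in the article: every existing scenario reaches every decision point, so the scenario count is multiplied by the number of candidate choices each time. The function name is hypothetical.

```python
from math import prod


def max_scenarios(choices_per_decision_point):
    """Upper bound on concurrent scenarios, assuming every scenario
    reaches every decision point and branches on all of its choices."""
    return prod(choices_per_decision_point)


# Three decision points with 2, 3, and 4 candidate choices.
print(max_scenarios([2, 3, 4]))
```

    In the example of Figure 11, only one scenario reaches the second decision point, so the actual count (3) stays below this bound (4).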


    In practice, it is unlikely that such a combinatorial explosion will occur. For example, in a supply chain simulation, one company may wish to examine its own decision strategies concurrently. It is unlikely that one company could manipulate its partner's internal decision policies. So in this case, there will be only one or a few federates that perform active cloning while the remaining federates perform passive cloning at the request of the active cloning federates.

    From the above discussion, we observe that different scenarios may have common member clones. The relationship between scenarios and clones can be complex and highly dynamic. There is a need for an identification and partitioning mechanism to manage the concurrent scenarios in the distributed simulation cloning procedure. This requires us to:

    represent the relationship among scenarios,

    identify and control each scenario and clone,

    partition the event messages belonging to differentscenarios,

    support the sharing of clones between scenarios,

    manage regions and map each scenario to its associatedregion,

    provide reusability to user federates while enabling theformer functionalities.

    The concurrent scenarios can be represented as a tree developing along the simulation time axis. To illustrate the details of the DDM solutions, the example in Figure 12 is referred to in studying the coding scheme in the following sections. Seven scenarios (labeled "a" to "g") are present in the overall simulation following two active clonings at times T1 and T2, respectively.

    To minimize the computation involved in the DDM, we use a single routing space having a single dimension for cloning. To ease discussion, we assume the federates being studied do not use DDM services. However, federates that already use DDM services can also easily apply the solutions without changing the federate code. This can be achieved by associating another cloning dimension in the middleware to the existing DDM-enabled updates when necessary. The RTI offers a region modification method to enable the dynamic change of subspace without creating a different region. The change takes effect immediately after notifying the RTI. This feature enables the adjustment of a clone's characteristic region at runtime; thus, dynamic routing and filtering of events from one clone to different scenario combinations can be realized.

    The underlying DDM module subscribes and publishes the same region for each clone. Each scenario is associated with an exclusive region, and there is no overlap with any other scenario's region. Clones within the same scenario use a common scenario-specific region extent¹ unless they are shared clones. A shared clone has a merged region that exactly covers the scenario-specific extents of all the scenarios in which it operates. DDM also aids interactive control of the scenarios at runtime; external commands can be easily routed to the clones within a given scenario by associating the proper region to them. The approach will lessen the work of recognition and processing at the clone side.

    ¹ In this article, extent is used to mean the interval [Lower_Bound, Upper_Bound) that defines the region.

    Figure 12. An example of a scenario tree (vertical axis: simulation time, with active clonings at T1 and T2; leaf scenarios: a, b, c, d, e, f, g).

    Our potential solutions for managing scenarios aim to provide a standard interface of object management (OM) or DDM services while employing additional DDM methods in the underlying communication layer. Thus, it is possible to hide the DDM solution implementation behind the normal OM or DDM services interface in a transparent way. The solutions to be discussed focus on addressing the manipulation of regions and the mapping between each region and the scenarios.

    5.2 Recursive Region Division Solution

    The basic idea of the recursive region division solution is to divide the full dimension (i.e., [MIN_EXTENT, MAX_EXTENT)) from top to bottom. A federation is initialized with all federates having an associated region with extent [MIN_EXTENT, MAX_EXTENT); thereby, the initial scenario has a full-dimension region. Once new scenarios are created, each scenario inherits a subregion from its parent scenario under a region division algorithm. Thus, the active cloning federate should divide its original region for the child clones, while its partner may still keep the region unchanged. In the example shown in Figure 11, immediately after time T1, fedB[0]'s original region is split into two parts; fedB[0] and fedB[1] modify their region with the first and the second part, respectively. The new regions indicate that fedB[0] and fedB[1] belong to scenario S[0] and S[1], respectively, but fedC[0] will remain associated with the region of both scenarios S[0] and S[1]. Thus, the events from both fedB[0] and fedB[1] will be automatically routed to fedC[0] without any modification to its region. This solution requires minimal work for shared clones. Region division keeps taking place along the development of the scenario tree.

    Figure 13. Development of binary scenario tree (annotated example starting from the predecessor scenario: n = 5 = 2^2 + 1, k = 2 + 1 = 3, m = 1, new levels = 3).

    The scenario tree can be reconstructed as a binary tree to ease coding. When n new scenarios are developed from one scenario branch, the following rule will be applied: given that $2^{k-1} < n \le 2^{k}$, n is rewritten as $n = 2^{k-1} + m$, $m \le 2^{k-1}$. Then the branch extends k levels downwards, and the leftmost m nodes at level k - 1 will always be expanded as parent nodes.

    Following this rule, the scenario tree in Figure 12 is converted to a binary tree. Suppose that five new scenarios are created from the predecessor scenario node; the development of this branch is as indicated in Figure 13. The leaf nodes represent the newborn scenarios, including the original one.
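    The decomposition rule can be checked with a short sketch. The function names below are hypothetical, and the leaf codes are relative to the branch being expanded (0 for a left branch, 1 for a right branch); this is an illustration of the rule as stated, not the authors' code.

```python
def binary_expansion(n):
    """Return (k, m) with 2**(k-1) < n <= 2**k and n == 2**(k-1) + m."""
    if n < 2:
        raise ValueError("an active cloning yields at least two scenarios")
    k = (n - 1).bit_length()   # smallest k such that n <= 2**k
    m = n - 2 ** (k - 1)       # number of level k-1 nodes that get expanded
    return k, m


def leaf_codes(n):
    """Relative binary codes of the n leaves, listed left to right."""
    k, m = binary_expansion(n)
    codes = []
    for index in range(2 ** (k - 1)):
        prefix = format(index, "0{}b".format(k - 1)) if k > 1 else ""
        if index < m:
            # The leftmost m nodes at level k-1 become parents of two leaves.
            codes += [prefix + "0", prefix + "1"]
        else:
            codes.append(prefix)
    return codes


print(binary_expansion(5))
print(leaf_codes(5))
```

    For n = 5 this gives k = 3 and m = 1, in line with the annotated example of Figure 13.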

    As shown in Figure 14A, we code a node's left branch as "0" and the right branch as "1" accordingly. Based on the binary tree, we link the branch codes together along the path from the root to the given leaf node, and thus the exclusive code of each scenario is achieved. For example, scenario e is identified as "0011". The position of the scenario can be easily traced according to its identity.

    A shared clone needs to distinguish events from different clones that are spawned by the same federate. We may also need to control the cloning procedure or update the system state of one particular clone. Therefore, it is necessary to identify each clone accurately. A clone always joins one or multiple scenarios; naturally, the clone identity needs to cover this information. A direct scheme is to combine scenario codes and the federate name of the ancestor to this clone, with format "<scenario code>&&<federate name>". For example, when a clone joins scenario e and its ancestor federate is named "fab1", this clone may be coded as "0011&&fab1". In the case of shared clones, a delimiter symbol is placed between scenarios; if necessary, a wildcard is also introduced to reduce the identity length. The format is "<scenario code>:<scenario code>: ... :<scenario code>&&<federate name>", in which ":" is a delimiter and "*" is the wildcard. For example, if another clone of ancestor "fab2" is shared among scenarios a, b, and g, this clone is coded as "0000*:1&&fab2".
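    The identity format lends itself to simple string handling. The following sketch uses hypothetical helper names and assumes that the wildcard only appears at the end of a scenario code, where it matches every code sharing that prefix; the article does not spell out the wildcard semantics.

```python
def clone_identity(scenario_codes, ancestor):
    """Join scenario codes with ':' and append '&&' plus the ancestor name."""
    return ":".join(scenario_codes) + "&&" + ancestor


def parse_clone_identity(identity):
    """Split an identity back into its scenario codes and ancestor name."""
    codes, ancestor = identity.split("&&", 1)
    return codes.split(":"), ancestor


def in_scenario(identity, scenario_code):
    """True if the clone operates in the given scenario.
    Assumption: a trailing '*' matches any code with that prefix."""
    codes, _ = parse_clone_identity(identity)
    for code in codes:
        if code.endswith("*"):
            if scenario_code.startswith(code[:-1]):
                return True
        elif code == scenario_code:
            return True
    return False


shared = clone_identity(["0000*", "1"], "fab2")
print(shared)
print(in_scenario(shared, "00001"), in_scenario(shared, "0011"))
```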

    Figure 14B gives the region-coding scheme. The relationship between the position of a scenario node and its specific region is illustrated explicitly. The full dimension is segmented more and more densely with the increase in level depth.

    At the kth level (mapping the nodes in the binary tree with depth equal to k), the full dimension is evenly partitioned into $n = 2^{k}$ segments. Each segment is given a binary code according to its index, with the code length equal to the level depth. The segment code is length sensitive (e.g., the code "011" at level 3 differs from the code "0011" at level 4). For the purpose of illustration, assume that MIN_EXTENT = 0x00000000 and MAX_EXTENT = 0xFFFFFFFF; the following simple formula gives the calculation of the extent of the ith segment at level k:

$$
\begin{aligned}
\text{Lower Bound} &= i \times 2^{32-k}, \\
\text{Upper Bound} &= (i + 1) \times 2^{32-k} - 1.
\end{aligned}
\tag{1}
$$

    The code of each scenario coincides with the segment code perfectly. The extent of the scenario-specific region can be directly obtained from the scenario code. First, we assign the length of the scenario code to k; second, we calculate i = atol(scenario code), reading the code as a binary number, and then the extent of the region is available immediately from equation (1). For example, the code of scenario e is "0011", and then we have k = strlen("0011") = 4 and i = atol("0011") = 3:

$$
\begin{aligned}
\text{Lower Bound} &= 3 \times 2^{32-4}, \\
\text{Upper Bound} &= (3 + 1) \times 2^{32-4} - 1.
\end{aligned}
$$

    The region extent of scenario e is [0x30000000, 0x3FFFFFFF). The one-to-one map between a scenario code and the region extent is constant for any simulation. This feature implies that we do not have to record the map between a scenario ID and its specific region extent. Using the recursive region division solution avoids the need for a searching procedure.
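    Equation (1) translates directly into code. The sketch below is illustrative only; the function name is hypothetical, and the scenario code is parsed as a binary number, which is what the worked example (i = 3 for "0011") implies.

```python
def region_extent(scenario_code, bits=32):
    """Bounds of a scenario's region from its binary code, per equation (1)."""
    k = len(scenario_code)       # level depth equals the code length
    i = int(scenario_code, 2)    # segment index, code read as binary
    width = 1 << (bits - k)      # 2**(32 - k) points per segment at level k
    return i * width, (i + 1) * width - 1


lower, upper = region_extent("0011")   # scenario e
print(hex(lower), hex(upper))
```

    Because the code length enters the formula, "011" and "0011" map to different extents, which is the length sensitivity noted above. No lookup table is involved, so the map is the same in every run.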

    The full dimension [MIN_EXTENT, MAX_EXTENT) contains $2^{32}$ unique extents at most. Thereby, $2^{32}$ concurrent scenarios are allowed when making full use of the dimension. Such a large number is able to meet any practical requirement for classifying scenarios. However, the binary division algorithm may incur problems in some extreme situations. Let us look at a special example, in which, on active cloning, only the leftmost branch in the binary scenario tree splits into multiple new scenarios. Thus, the region extent of the leftmost scenario will shrink exponentially, much faster than that of other scenarios. It can be seen that region allocation is densely concentrated at the left end of the dimension, whereas only a few scenarios occupy the remainder of the dimension. As this continues, when the depth of the binary tree reaches 32, one scenario's region extent becomes a point, and child scenarios are not able to inherit extents any more. This potential limitation exists even though it is unlikely to occur. Once the extent is exhausted in the single-clone dimension, one possible solution is to redistribute the extents of the full dimension. Undoubtedly, the redistribution incurs extra complexity, and it damages the natural harmony of the one-to-one map between scenario identity and its region extent. Alternatively, we can specify a multiple-dimensional routing space for cloning beforehand. A clone region is created with these dimensions, and the recursive division algorithm is initially applied to the first dimension. When the extent is exhausted in this dimension, the subsequent dimensions are still available for extent allocation using the same recursive division algorithm. This approach is able to keep the coding scheme intact. As long as the number of clone dimensions is set large enough, it should meet any request in practice.

    Figure 14. Binary scenario tree and scenario region code. (A) Coding the binary scenario tree. (B) Coding the region extent and specifying the region for each scenario over [MIN_EXTENT, MAX_EXTENT). Scenario codes for a to g, from left to right: 00000, 00001, 0001, 0010, 0011, 01, 1.
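    The extreme case is easy to reproduce numerically. The sketch below assumes the worst case described above, in which the leftmost scenario (the all-zero code) is split again at every level, and counts how many levels pass before its extent collapses to a single point. The function name is hypothetical.

```python
def leftmost_extent(depth, bits=32):
    """Bounds of the all-zero scenario code of length `depth`, per equation (1)."""
    width = 1 << (bits - depth)
    return 0, width - 1


depth = 0
while leftmost_extent(depth)[1] > 0:
    depth += 1          # the leftmost scenario is divided once more
print(depth)
```

    Each division halves the extent, so the 32-bit dimension is exhausted after 32 levels on that branch even though almost all of the dimension is still assigned to only a few scenarios.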

    5.3 Point Region Solution

    An alternative solution, namely, the point region solution, is proposed to distribute the extents of the dimension as evenly as possible in a bottom-to-top manner. A point region has an extent defined as [Lower_Bound, Lower_Bound + 1). As the name implies, the point region is the element in associating a region with a scenario. The initial federates are assigned a start point region. Instead of inheriting any region from their predecessor, the new scenarios get different point regions from the dimension. A shared clone has to combine its original region with the new ones associated with additional scenarios.

    Thus, scenarios can be coded based on the scenario tree. Figure 15 illustrates a coded scenario tree. In this solution, on an active cloning, no matter how many new scenarios are created from a scenario, the scenario extends only one level down with all new scenarios and itself. From left to right, sibling branches of any scenario are labeled from 0 to n. For the scenarios that remain unsplit, the corresponding nodes in the scenario tree still extend one level down with one single child to keep consistency in representing current scenarios and the cloning history along the simulation time axis. By simply linking codes starting from the root, the identity of a scenario is obtained. To ease representing and resolving the scenario identity, we can record the label of each branch in a hexadecimal format, whereby each byte holds a sibling's code, and thus scenario e's identity is written as "0004". The clone identity follows the format defined in the previous recursive division solution. Given the same examples in section 5.2, we can code the descendant clone of original federate "fab1" as "0004&&fab1". As for another clone with ancestor "fab2", which is shared by scenarios a, b, and g, we may code it as "0000:0001:0200&&fab2". This shared clone has a region that is a union of region extents associated with scenarios a, b, and g.
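    The hexadecimal coding can be sketched as follows. The function name is hypothetical, and the sibling labels along each path are inferred from the example identities in the text (scenario e as the fifth child of the first branch, scenario g as the only child of the third branch).

```python
def scenario_identity(path):
    """Concatenate sibling labels along the path from the root,
    one byte (two hexadecimal digits) per tree level."""
    if any(not 0 <= label <= 0xFF for label in path):
        raise ValueError("each sibling label must fit in one byte")
    return "".join("{:02X}".format(label) for label in path)


scenario_e = scenario_identity([0, 4])
scenario_g = scenario_identity([2, 0])
shared_clone = ":".join(
    [scenario_identity([0, 0]), scenario_identity([0, 1]), scenario_g]
) + "&&fab2"
print(scenario_e, scenario_g, shared_clone)
```

    One byte per level caps a node at 256 children under this encoding; the article does not state what happens beyond that.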

    Figure 15. Coding the scenario tree (vertical axis: simulation time, with active clonings at T1 and T2; leaf scenarios: a, b, c, d, e, f, g).

    To optimize the usage of communication channels provided by DDM in RTI-NG, we divide the single cloning dimension into multiple segments evenly according to the NumPartitionsPerDimension (NPPD) in the RTI.rid file [10]. Each segment will contain an identical number of point regions. To ease discussion, here we still assume that MIN_EXTENT = 0x00000000 and MAX_EXTENT = 0xFFFFFFFF. The following matrix describes the distribution of all available point regions with the number of rows equal to NPPD and the number of columns equal to (MAX_EXTENT - MIN_EXTENT + 1)/NPPD. It is obvious that the points in one row belong to the same segment in the dimension.

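$$
\mathrm{Rgn} =
\begin{bmatrix}
R_{0,0} & R_{0,1} & \cdots & R_{0,\,2^{32}/\mathrm{NPPD}-1} \\
R_{1,0} & \cdots & \cdots & \vdots \\
\vdots & & & \vdots \\
R_{\mathrm{NPPD}-1,\,0} & \cdots & \cdots & R_{\mathrm{NPPD}-1,\,2^{32}/\mathrm{NPPD}-1}
\end{bmatrix},
\quad \text{where } R_{i,j} = \left(2^{32}/\mathrm{NPPD}\right) \times i + j
\tag{2}
$$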

    In total, there can be up to $2^{32}$ available point regions, in which the jth point region in the ith row is written as $[R_{i,j}, R_{i,j} + 1)$. The initial scenario starts with a region $[R_{0,0}, R_{0,0} + 1)$. A new scenario will always be assigned the first unused point region in the next row. Once the regions in the same column are fully used, the allocation will start from the beginning of the next column. Figure 16 illustrates this region-specifying flow, as indicated by the arrowed line. The scenario ID maps to its specified region in a one-to-one way. The scenario tree records the scenario ID and the region in the node for that scenario and can be used for resolving the region of one particular scenario from its identity by searching the tree. The scenario ID is resolved to locate the scenario node in the tree, and this search is performed along a given path instead of making a full search of the scenario tree. The mapping relationship in the point region solution may vary in different simulation runs.

    Figure 16. Point region allocation flow (the arrowed line runs down each column of the matrix in equation (2) and then continues at the top of the next column).

    The point region solution has an added advantage in that it maximizes the use of available communication channels. For each routing space, RTI-NG will create $\mathrm{NPPD}^{\text{Number of Dimensions}}$ reliable and best-effort channels. The channels are mapped to the dimension in a grid-like fashion. As we only define one dimension, the RTI provides exactly NPPD channels for the cloning space in our solution. The point regions in the same row (see the matrix in equation (2)) occupy a common data channel. The point region ensures that the update data it is associated with use a unique data channel and avoids the overhead of sending data through multiple channels unnecessarily. However, this solution does not support a direct conversion between the scenario ID and region; the scenario manager module needs to record this mapping. Furthermore, shared clones need to modify their regions on the latest active cloning. This incurs extra effort compared with the recursive region division solution.
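    The allocation order and the row-to-channel relationship described above can be sketched as follows. The function name is hypothetical, and the sketch assumes that NPPD divides $2^{32}$ evenly and that scenarios are numbered 0, 1, 2, ... in order of creation.

```python
def point_region(t, nppd, bits=32):
    """Point region of the t-th created scenario (t = 0 is the initial one).
    Allocation walks down a column of the matrix in equation (2), one row
    per new scenario, and then moves to the top of the next column."""
    per_row = (1 << bits) // nppd       # point regions per row (segment)
    if not 0 <= t < per_row * nppd:
        raise ValueError("no point region left in the dimension")
    i, j = t % nppd, t // nppd          # row (data channel) and column
    lower = per_row * i + j             # R[i][j] from equation (2)
    return i, j, (lower, lower + 1)


for t in range(6):
    print(t, point_region(t, nppd=4))
```

    With NPPD = 4, the first four scenarios land in four different rows and therefore on four different data channels; only the fifth scenario returns to row 0.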

    5.4 DDM in RTI++

    No matter what kind of solution is chosen to perform simulation cloning, a critical principle is the reusability of the user's program. Our solutions should minimize the modification to the user's existing code, even though it is difficult to eliminate all extra complexity when enabling simulation cloning. We propose the middleware approach to hide the complexity, as indicated in Figure 17.

    We extend the standard RTI to RTI++ to encapsulate cloning operations directly related to the RTI while presenting the standard RTI interface to the user. The user code still uses the standard RTI interface, while the enhanced functionality remains transparent to the user. The middleware sits between the user's code and the real RTI and contains a library for cloning management and the RTI++.


    Figure 17. Middleware for cloning (layers from top to bottom: user's program; standard RTI interface; middleware containing RTI++, the module library, DDM, and interactive control; real RTI). RTI = runtime infrastructure.

    As mentioned previously, either of the solutions discussed should work as an underlying control mechanism with transparency to federate codes. Hiding the implementation details while enabling the management of scenarios can maximize the reusability of the user's existing simulation model. Middleware exists between the RTI and federates and includes an implementation of the solutions addressed above. RTI++ software is built up to encapsulate the cloning-related modules while maintaining the same RTI interface to the calling federate code. Scenario manager and region manager modules are developed to implement the DDM solutions. Figure 18 gives the relationship between the cloning modules.

    The scenario manager module creates and stores the scenario tree, from which the identity and corresponding region of each scenario can be fetched. The scenario manager keeps the history of the overall cloning procedure. The cloning management module initiates the creation and update of the scenarios. The region manager module creates regions and manages them based on the aforementioned algorithm. The region manager will deal with any request for updating a region. The RTI++ object management services invoked by a federate are eventually executed via the corresponding DDM methods by associating the region obtained from the scenario manager. Figure 19 gives an example of pseudo-code for implementing the above inside the middleware.
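    The delegation pattern described here can be illustrated with a small stand-alone sketch. Every class and method name below is hypothetical and chosen for illustration only; none of them is an actual RTI or RTI-NG call, and RecordingRti merely stands in for the real RTI so that the example runs on its own.

```python
class ScenarioManager:
    """Keeps the scenario-to-region mapping (hypothetical stand-in)."""

    def __init__(self):
        self._regions = {}

    def add_scenario(self, scenario_id, extent):
        self._regions[scenario_id] = extent

    def region_of(self, scenario_id):
        return self._regions[scenario_id]


class RecordingRti:
    """Stand-in for the real RTI; it only records region-qualified updates."""

    def __init__(self):
        self.calls = []

    def update_with_region(self, object_name, values, extent):
        self.calls.append((object_name, values, extent))


class RtiPlusPlus:
    """Middleware facade: the federate issues a plain update, and the
    middleware attaches the region of the federate's scenario."""

    def __init__(self, real_rti, scenario_manager, scenario_id):
        self._rti = real_rti
        self._scenarios = scenario_manager
        self._scenario_id = scenario_id

    def update_attribute_values(self, object_name, values):
        extent = self._scenarios.region_of(self._scenario_id)
        self._rti.update_with_region(object_name, values, extent)


scenarios = ScenarioManager()
scenarios.add_scenario("0011", (0x30000000, 0x3FFFFFFF))
rti = RecordingRti()
federate_view = RtiPlusPlus(rti, scenarios, "0011")
federate_view.update_attribute_values("order-42", {"qty": 7})
print(rti.calls)
```

    The federate-facing call carries no region argument, which is the transparency property the middleware is meant to provide; switching a clone to another scenario only changes what the scenario manager returns.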

    6. Conclusions and Future Work

    In this article, we have investigated some research issues in cloning HLA-based distributed simulations. A federate can perform cloning actively or passively according to its own or its partner's requirements at the decision point. To make the most use of computation sharing, we have employed an incremental cloning mechanism in the design of our solutions to clone simulations. A middleware approach has been used to hide the complexity incurred by simulation cloning and maximize the reusability of users' simulation applications.

    Alternative candidate solutions have been compared in terms of efficiency, complexity, and robustness. The single-federation solution has an advantage in cloning control and clone sharing. However, the multiple-federation solution is superior in robustness and federate synchronization. The benchmark results indicate that, in general, MF exhibits much better performance than NDSF. The interaction among federates is the main factor that affects the execution speed. When there is a high volume of data exchange, the performance can be improved dramatically by applying DDM to NDSF. The calculation of LBTS makes no significant difference in either the MF or SF solutions. The results show that DSF performs as well as MF in terms of efficiency. Considering the reduction of implementation complexity and the convenience for scenario management, we chose the single-federation solution applying DDM for cloning distributed simulations in the current study.

    In this article, we have also studied issues involved in managing concurrent scenarios generated by distributed simulation cloning. Interactions need to be confined within each scenario to guarantee the correctness of simulation results. We also need to share computation for shared clones that span multiple scenarios. DDM services are used in an underlying module to route events and partition scenarios. We analyzed the internal mechanism of RTI-NG DDM services to improve the DDM performance. A middleware approach hides the scenario management details and maximizes the reusability of user federate code.
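The way region overlap can confine events to a scenario while still letting a shared clone serve several scenarios can be illustrated with a small C++ sketch. It models only the overlap test on a single dimension and is not RTI code: `Range`, `Federate`, `overlaps`, and `recipients` are hypothetical names, and the half-open integer ranges are an assumption made for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Half-open range [lower, upper) on a hypothetical cloning dimension.
struct Range {
    unsigned long lower;
    unsigned long upper;
};

// Two ranges overlap when each one starts before the other ends.
bool overlaps(const Range& a, const Range& b) {
    return a.lower < b.upper && b.lower < a.upper;
}

// A federate clone together with the range of scenarios it takes part in.
struct Federate {
    std::string name;
    Range region;
};

// Names of the federates that would receive an event from `sender`:
// only those whose range overlaps the sender's range.
std::vector<std::string> recipients(const Federate& sender,
                                    const std::vector<Federate>& federates) {
    std::vector<std::string> out;
    for (const Federate& f : federates) {
        if (f.name != sender.name && overlaps(sender.region, f.region)) {
            out.push_back(f.name);
        }
    }
    return out;
}
```

For example, with a shared clone `A` on `[0, 4)` and two clones `B1` on `[0, 2)` and `B2` on `[2, 4)`, an event from `B1` reaches `A` but not `B2`, while an event from `A` reaches both.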

    Furthermore, to address the complexity of the overall cloning-enabled distributed simulation due to increasing scenario spawning, we investigated an efficient and precise scheme to identify and represent scenarios. Alternative candidate solutions have been proposed to code scenarios as well as to manipulate region extents in the cloning dimension. These are a recursive region division solution and a point region solution. The recursive region division solution has advantages because (1) it minimizes the extra overhead for shared clones, and (2) it provides a natural one-to-one mapping between scenario identity and region extent. The coding mechanism harmonizes with the region specification perfectly. However, the recursive region division solution has a limitation in dealing with some extreme situations. The point region solution has advantages over the former solution in that (1) it can meet the region allocation requirements even in extreme situations, and (2) it optimizes the use of data channels.

    For our future work, we need to explore the mechanism of the cloning operation and guarantee the correctness of new clones. One major concern is the design and implementation of system state saving and recovery at both the simulation level and the RTI level. Another major concern is the interactive manipulation of cloning-enabled simulation in a distributed environment, where users are offered the flexibility to control and update the cloning online. How to achieve fault tolerance using the cloning mechanism is also a very important issue.

    Figure 18. RTI++ and internal modules. RTI = runtime infrastructure; DDM = data distribution management.

    // Get region associated with current clone federate
    currentRegion = scenManger->getCloneRegion();

    // Override standard RTI functions with DDM enabled services
    // For example:
    RTIambassadorPlus::subscribeObjectClassAttributes(theClass, attributeList, ...)
    {
        ...
        ((RTI::RTIambassador*)this)->subscribeObjectClassAttributesWithRegion(theClass, currentRegion, attributeList, ...);
        ...
    }
    ...

    Figure 19. Example for RTI++ implementation. RTI = runtime infrastructure; DDM = data distribution management.
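The interception pattern of Figure 19 can be tried out without an RTI installation by using stand-in classes. The sketch below is only an illustration under stated assumptions: `MockAmbassador`, `MockRegion`, `MockScenarioManager`, and `AmbassadorPlus` are hypothetical stand-ins, the integer handles and the log are invented for the example, and the simplified signatures are not those of the real RTI services.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a clone's DDM region (not the real RTI region type).
struct MockRegion {
    int id;
};

// Stand-in for the standard ambassador; it only records which service ran.
class MockAmbassador {
public:
    void subscribeObjectClassAttributes(int theClass, const std::vector<int>& attributeList) {
        (void)attributeList;
        log.push_back("plain:" + std::to_string(theClass));
    }
    void subscribeObjectClassAttributesWithRegion(int theClass, const MockRegion& region,
                                                  const std::vector<int>& attributeList) {
        (void)attributeList;
        log.push_back("region" + std::to_string(region.id) + ":" + std::to_string(theClass));
    }
    std::vector<std::string> log;
};

// Hypothetical scenario manager that knows the current clone's region.
class MockScenarioManager {
public:
    explicit MockScenarioManager(MockRegion region) : region_(region) {}
    const MockRegion& getCloneRegion() const { return region_; }
private:
    MockRegion region_;
};

// Middleware ambassador: keeps the plain service's signature, so federate
// code is unchanged, but forwards to the region-based service.
class AmbassadorPlus : public MockAmbassador {
public:
    explicit AmbassadorPlus(const MockScenarioManager& manager) : scenManager_(manager) {}

    void subscribeObjectClassAttributes(int theClass, const std::vector<int>& attributeList) {
        const MockRegion& currentRegion = scenManager_.getCloneRegion();
        MockAmbassador::subscribeObjectClassAttributesWithRegion(theClass, currentRegion,
                                                                 attributeList);
    }

private:
    const MockScenarioManager& scenManager_;  // must outlive this ambassador
};
```

A federate that calls `subscribeObjectClassAttributes` on an `AmbassadorPlus` object ends up invoking the region-based service with the clone's current region, which is the behavior the figure describes.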

    7. References

    [1] Gan, Boon Ping, Li Liu, Sanjay Jain, Stephen John Turner, Wentong Cai, and Wen-Jing Hsu. 2000. Distributed supply chain simulation across enterprise boundaries. In Proceedings of the 2000 Winter Simulation Conference, December, Orlando, FL, pp. 1245-51.

    [2] Turner, Stephen John, Wentong Cai, and Boon Ping Gan. 2001. Adapting a supply-chain simulation for HLA. In Proceedings of the Fourth IEEE International Workshop on Distributed Simulation and Real-Time Applications, August, San Francisco, pp. 67-74.

    [3] Dahmann, Judith S., Frederick Kuhl, and Richard Weatherly. 1998. Standards for simulation: As simple as possible but not simpler: The High Level Architecture for simulation. SIMULATION 71 (6): 378-87.


    [4] DMSO/DOD. 2002. RTI 1.3 Next Generation programmer's guide version 5. February, DMSO, Alexandria, VA.

    [5] Chen, Dan, Boon Ping Gan, Stephen John Turner, Wentong Cai, Nirupam Julka, and Junhu Wei. 2003. Evaluating alternative solutions for cloning in distributed simulation. In Proceedings of the 36th Annual Simulation Symposium, March, Orlando, FL, pp. 201-8.

    [6] Chen, Dan, Boon Ping Gan, Stephen John Turner, Wentong Cai, and Junhu Wei. 2003. Data distribution management in distributed simulation cloning. In Proceedings of 2003 European Simulation Interoperability Workshop, paper no. 03E-SIW-024, June, Stockholm, Sweden.

    [7] Hybinette, Maria, and Richard M. Fujimoto. 2001. Cloning parallel simulations. ACM Transactions on Modeling and Computer Simulation (TOMACS) 11:378-407.

    [8] Schulze, T., Steffen Straßburger, and Ulrich Klein. 2000. HLA-federate reproduction procedures in public transportation federations. In Proceedings of the 2000 Summer Computer Simulation Conference, July, Vancouver, Canada.

    [9] Morse, K. L., and M. D. Petty. 2001. Data distribution management migration from DoD 1.3 to IEEE 1516. In Proceedings of the Fifth IEEE International Workshop on Distributed Simulation and Real-Time Applications, August, Cincinnati, OH, pp. 58-65.

    [10] Hyett, Mark, and Roger Wuerfel. 2002. Implementation of the data distribution management services in the RTI-NG. In Proceedings of 2002 Spring Simulation Interoperability Workshop, paper no. 02S-SIW-044, March, Orlando, FL.

    Dan Chen is a research engineer with the Production and Logistics Planning Group at Singapore Institute of Manufacturing Technology in Singapore. He received a bachelor of science degree in applied physics from Wuhan University, China in 1994; a master of engineering degree in computer science from Huazhong University of Science and Technology, China in 1999; and a master of engineering degree in computer engineering from Nanyang Technological University, Singapore in 2002. His research interests are distributed simulation, networking, and other related technologies.

    Stephen John Turner joined Nanyang Technological University (Singapore) in 1999 and is currently an associate professor in the School of Computer Engineering and director of the Parallel and Distributed Computing Center. Previously, he was a senior lecturer in computer science at Exeter University (UK). He received his M.A. in mathematics and computer science from Cambridge University (United Kingdom) and his M.Sc. and Ph.D. in computer science from Manchester University (United Kingdom). His current research interests include parallel and distributed simulation, distributed virtual environments, grid computing, and multiagent systems.

    Boon Ping Gan is a research engineer with the Production and Logistics Planning Group at the Singapore Institute of Manufacturing Technology in Singapore. He received a bachelor of applied science degree in computer engineering and a master of applied science degree from the Nanyang Technological University of Singapore in 1995 and 1998, respectively. His research interests are parallel and distributed simulation, parallel program scheduling, and application of genetic algorithms.

    Wentong Cai is currently an associate professor at the School of Computer Engineering (SCE), Nanyang Technological University (Singapore). He received his B.Sc. in computer science from Nankai University (People's Republic of China) and Ph.D., also in computer science, from the University of Exeter (United Kingdom). He was a postdoctoral research fellow at Queen's University (Canada) from February 1991 to January 1993 and joined SCE as a lecturer in February 1993. He has served as a program committee member for many international conferences (e.g., PADS, DSRT, and PDCS). He is a member of IEEE, and his current research interests are mainly in the areas of parallel and distributed computing (particularly, parallel and distributed simulation and grid computing).

    Junhu Wei is working with Nanyang Technological University (Singapore) as a research fellow sponsored by SIMTech under the research manpower program. He received his B.E. in automatic control, M.E. in system engineering, and Ph.D. in control engineering from Xi'an Jiaotong University (China). His current research interests include parallel and distributed simulation, simulation, planning, and scheduling of manufacturing.

    Nirupam Julka is an associate research engineer with the Production and Logistics Planning Group at the Singapore Institute of Manufacturing Technology in Singapore. He received a bachelor of technology (honors) degree from the Indian Institute of Technology (IIT), Kharagpur in 1999 and a master of engineering degree from the National University of Singapore in 2002. His research interests are parallel and distributed simulation, supply chain optimization, and management and decision support systems.
