Consistency Control for Synchronous and Asynchronous Collaboration based on Shared Objects and Activities

Jürgen Vogel
University of Mannheim
Lehrstuhl Praktische Informatik IV
L 15, 16
68131 Mannheim, Germany
+49 621 181 2615
[email protected]

Werner Geyer, Li-Te Cheng and Michael Muller
IBM Thomas J. Watson Research Center
One Rogers Street
Cambridge, MA 02142, USA
+1 617 693 4791
{werner.geyer,li-te cheng,michael muller}@us.ibm.com

Abstract. We describe a new collaborative technology that bridges the gap between ad hoc collaboration in email and more formal collaboration in structured shared workspaces. Our approach is based on the notion of object-centric sharing, where users collaborate in a lightweight manner but aggregate and organize different types of shared artifacts into semi-structured activities with dynamic membership, hierarchical object relationships, as well as real-time and asynchronous collaboration. We present a working prototype that implements object-centric sharing on the basis of a replicated peer-to-peer architecture. In order to keep replicated data consistent in such a dynamic environment with blended synchronous and asynchronous collaboration, we designed appropriate consistency control algorithms, which we describe in detail. The performance of our approach is demonstrated by means of simulation results.

Keywords: Object-centric sharing, replication, consistency control, peer-to-peer, activity-centric collaboration, synchronous and asynchronous collaboration.

1. Introduction

Ad hoc collaboration systems, such as email and chat, are lightweight and flexible, and provide good dynamic support for short-term communication needs. However, for those collaborative activities which extend over longer periods of time, or over larger numbers of participants, these media rapidly become unmanageable. At the other extreme, structured shared workspaces provide good support for making sense of large corpora of messages and files. However, these environments are relatively labor-intensive to initiate, and discourage people from using them for small-scale or short-term collaborations.

Figure 1. From ad hoc communication to formal collaboration

Little work has been done to offer richer collaboration in email and to help support the progression of collaboration from ad hoc communication to more structured types of collaboration. Our research in the Instant Collaboration project (Geyer and Cheng, 2002; Geyer et al., 2003) investigates technologies that can help bridge these gaps between small vs. large numbers of participants, brief vs. extended collaborations, and informal vs. formal structures (see Figure 1). We have designed and built a peer-to-peer prototype system that supports lightweight and ad hoc forms of sharing information, which we believe are key in bridging these gaps because they do not overload the user with the overhead of manually creating shared workspaces or setting up conferences. Reports from a field trial of 33 users over a five-month period (using a similar design but a different architecture) support these claims (Muller et al., 2004; Millen et al., 2004).

This article introduces the design concept behind our prototype and then focuses on the implementation of this system. In Section 2, we introduce the notion of “object-centric” sharing, which is fundamental to our design. Object-centric sharing allows individuals to aggregate and organize shared artifacts into larger collaborative activities, providing an emerging context that evolves and engages a dynamic group of participants. Section 3 presents the prototype system from a user interface perspective. We illustrate how the prototype can be used within email to engage in lightweight activities. In Sections 4, 5, and 6, we focus on the architecture and implementation of this system. We not only decided to make this system feel peer-to-peer from a user perspective, but also implemented it based on a replicated peer-to-peer architecture. This decision poses various technical challenges. Keeping replicated data consistent in an architecture that supports real-time and asynchronous collaboration at the same time is not trivial, and relatively little research has been done in addressing this problem. Our approach enhances a popular consistency control algorithm, which was originally designed for real-time collaboration. These algorithms are evaluated by means of simulations in Section 7. Section 8 addresses the question of how the system can support the user in understanding the changes made in phases of asynchronous collaboration. In Section 9, we discuss our experiences when building the prototype as well as trade-offs between centralized and replicated architectures for blended synchronous and asynchronous collaborative systems. Sections 10 and 11 conclude with related work and a summary of this contribution.

2. Design Philosophy

The design of our system was mainly driven by the desire to combine the lightweight and ad hoc characteristics of email with the rich support for sharing and structure in shared workspace systems. A more detailed discussion of the design can be found in (Geyer et al., 2003).

2.1. Object-centric Sharing

A key design decision in our system was to allow sharing of resources in a fine-grained way. The basic building block for collaboration in our approach is a shared object. Shared objects hold one piece of persistent information and they define a list of people who have access to that content. Examples of shared objects are a message, a chat transcript, or a shared file. Sharing on an object level was motivated by the difficulty of predicting the scale of a new activity upfront when people start collaborating. Collaboration might be very short-term and instantaneous and involve only small amounts of data to be shared, few people, and few steps of interaction, e.g., exchanging one or more files, setting up a meeting agenda with people, or jointly annotating a document. These activities might or might not become part of a larger collaborative work process. However, people usually do not create heavyweight shared workspaces if they do not know that a simple chat or email will grow into a larger, more complex, more formal collaborative work process.

2.2. Object-Level Awareness

Our design deliberately does not make a distinction between asynchronous and synchronous types of collaboration. Each shared object supports real-time notifications that are used either to update other users’ views of this shared object (if they are currently viewing or editing this object) or to notify other users about current activity on this object. The latter provides fine-grained object-level awareness, i.e., people get to know who is working on what. Awareness about who is currently present in an object can serve as a trigger for opportunistic collaboration and can help keep collaborative activities moving forward.

2.3. Activity and Conversational Structure

To let users add structure to their collaboration, we allow them to combine and aggregate heterogeneous shared objects into structured collections as their collaboration proceeds. We call a set of related shared objects an activity thread, representing the context of a collaborative activity. We see this structure being defined by users’ ongoing conversation, i.e., each object added to an existing object can be considered as a “reply” to the previous one. We intentionally did not require any particular object as a structural container for a collaborative activity. Each individual shared object can function as the parent object for an activity thread, or for a sub-thread within a thread.

2.4. Dynamic Membership

Membership in collaborative activities can be dynamic and heterogeneous. Activities often spawn side activities that involve a different set of people, or they might require bringing in new people or excluding people from certain shared resources (Ducheneaut and Belotti, 2001). Dynamic membership within an activity thread comes as a by-product of object-centric sharing because each object has its own access control list. As collaboration proceeds, we allow users to include new members in, or to exclude old members from selected shared resources in the thread.

In many regards our approach is similar to threads in email or discussion databases, or thrasks (Belotti et al., 2003). However, it is richer because

− objects are shared (unlike in email or thrasks);

− activity threads may contain different types of objects, not only messages like in email or discussion databases;

− all objects are equal, unlike in email or discussion databases where attachments are subordinates contained in the message;

− membership is dynamic and may differ within an activity thread from object to object, unlike in discussion databases, and membership can be redefined after an object has been created, unlike in email;

− objects support synchronous collaboration through built-in real-time notifications and provide rich awareness information on who is currently working on what.

Figure 2. Prototype user interface: (A) Activity Thread Pane, (B) Email Inbox Pane, (C) Email Viewer Pane
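The object model of Sections 2.1 to 2.4 can be illustrated with a small Python sketch. It is not taken from the prototype; the class name `SharedObject`, its fields, and its helper methods are assumptions made purely for illustration. Each object carries its own member list, and any object can act as the parent of a thread or sub-thread.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SharedObject:
    """Hypothetical shared object: one piece of content plus its own member list."""
    kind: str                                   # e.g. "message", "chat", "file"
    content: str
    members: Set[str] = field(default_factory=set)
    children: List["SharedObject"] = field(default_factory=list)

    def add_reply(self, child: "SharedObject") -> "SharedObject":
        """Attach a new object below this one; any object can be a thread parent."""
        self.children.append(child)
        return child

    def thread(self) -> List["SharedObject"]:
        """Return this object and everything below it, depth first."""
        result = [self]
        for child in self.children:
            result.extend(child.thread())
        return result

    def visible_to(self, user: str) -> List["SharedObject"]:
        """Objects in this thread whose member list includes `user`."""
        return [obj for obj in self.thread() if user in obj.members]
```

Because membership is stored per object, a user who is added only to an object deep inside a thread sees that object without gaining access to its ancestors.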

3. User Experience

The user interface to our prototype system is integrated into an email client. The client supports sharing of five types of objects: message, chat transcript, file, annotated screen shot, and to-do item. These objects are managed through a simple tree-like user interface that is contained in the right pane (A) in Figure 2. Each “branch” in that tree represents an activity thread.

Users interact with shared objects by right-clicking on the nodes of the tree, which pops up a context menu. Users can create new objects, delete objects, add and remove members, etc. Our client supports POP3 email messaging: the upper left pane is the inbox (B) and the lower left pane a message viewer (C). In the following, we use a scenario to illustrate how shared objects as building blocks can be used to collaborate in an activity that starts from an email message. Please note that the activity thread displayed in Figure 2 is just a snapshot at the end of an activity from the perspective of one of the actors (Bob); the thread is built dynamically as the actors collaborate.

Bob is a project lead and he works with Dan on a project on “Casual Displays”. Catherine is a web designer in their company who is responsible for the external web site. Bob receives an email from Catherine containing a draft for a project description that she would like to put on their external web site (1). She wants some feedback from Bob. Before getting back to her, Bob wants to discuss the design of that web page with Dan. Instead of forwarding the message to Dan via email, Bob decides to start a new activity by creating a shared object based on this message. He right-clicks on the original message in his inbox, selects “share”, enters Dan’s email address, and hits “Share”. A new shared message object (with Bob and Dan as members) shows up in Bob’s activity tree in the right window pane (2). Bob right-clicks on the shared object and adds a new shared message to the initial one, because he wants to let Dan know that he would like to discuss this with him. Bob’s message shows up as a reply to the initial message similarly to a newsgroup thread (3).

A few hours later, Dan returns to his desktop, which is running the client, and notices Bob’s newly created shared messages. He opens one message and while he is reading it, Bob sees that Dan is looking at the messages because the shared object is lit green along with Dan’s name underneath the object (4). Bob takes this as an opportunity to begin a discussion with Dan within the context of the shared object. He right-clicks on the initial message and adds a chat object to this activity (5). A chat window pops up on Dan’s desktop and they chat. In their chat conversation, Bob and Dan continue talking about the web page over the phone. At some point during the discussion, Bob wants to show directly how to change the web page. He right-clicks on the chat object in his activity tree and adds a shared screen object (6). A transparent window allows Bob to select and “screen scrape” any region on his desktop. He freezes the transparent window over Catherine’s draft web page. The screen shot pops up on Dan’s desktop. Bob and Dan begin annotating the web page in real-time like a shared whiteboard (7). As they discuss a few changes, Bob asks Dan to integrate a project logo into the web page. Dan agrees but is pressured now to run to another meeting. He says good-bye to Bob and tells him that he will check with him the next day. Dan closes all his windows and as he leaves, his name turns gray throughout all of his shared objects displayed on Bob’s client.

Now alone, Bob continues annotating the web page. He also types in a few lines for Dan in the chat window before closing it. He then right-clicks on the chat object and creates a new shared file object. He picks the logo file from his local file system and the file object becomes part of Bob’s and Dan’s activity thread (8). Bob closes all windows and leaves. The next morning when Dan returns to his office, he finds Bob’s additional annotations, his chat message, and the project logo file. He starts working on the web page and a few hours later, he puts the reworked page into the activity thread as a shared file object (9) and adds a message with some comments (10). He also shares these two objects with Catherine (11) so that she can download and deploy the newly revised web page and logo.

This scenario demonstrates how our prototype moves seamlessly and effortlessly back and forth from private to public information, and from asynchronous to synchronous real-time collaboration, without manually creating a shared workspace or setting up a meeting. Collaboration starts off with a single shared object and evolves into a multi-object activity, which is structured by a dynamic group of participants as they create and add new shared objects. An activity thread provides the conversational context and awareness for an emerging collaboration; it allows aggregating a mix of different object types.

4. System Architecture

Our design approach and the envisioned use of the system contributed to the decision to implement our prototype as a peer-to-peer system. In particular, the high administrative cost of centralized shared workspace systems was a major factor in this decision. We wanted users to be able to create shared objects on the fly in order to facilitate instantaneous and effortless collaboration. Giving full control to the user implies that the system should function without any additional infrastructure.

Another major requirement is that users are able to collaborate synchronously as well as asynchronously. Asynchronous work may take place when people are online and their collaborators are offline or when they are disconnected from the network themselves. To provide offline access at any time, shared objects need not only be persistent but copies of these objects need to reside on the user’s local machine (desktop, laptop, or PDA). To keep local replicas synchronized, the system must provide appropriate communication and consistency control mechanisms. Since the membership of shared objects can be highly dynamic, the system also has to support late-joining users by providing them with sufficient initialization information. Replication not only provides offline access to data but also helps to achieve good responsiveness.

Figure 3. System architecture

Finally, object-centric sharing as described in Section 2 implies that membership management occurs individually for each shared object. This fine-grained access to and control of shared objects might entail a scalability penalty in a server-based system. A replicated system scales better with respect to the number of shared objects and the number of users.

With these considerations in mind, we opted against a client-server solution and decided to build a peer-to-peer system where each user runs an equal instance of the application locally. Figure 3 shows three application instances (A, B, and C). Each local instance consists of a client component that provides the user interface to the system (see description in Section 3) and a service component, called ActivityService, that maintains all shared objects that are relevant to the local user.

A shared object is defined by the current values of its attributes, e.g., the state of a message object includes a subject and a body (in the form of text). All shared objects together form the state of the application. We use a local database in each application instance to store and access this application state (see Figure 3).

The application state can be changed by user actions, e.g., when creating a new shared object or when modifying the attributes of an existing object. State changes need to be propagated to all participants of an activity so that their ActivityServices can keep the local copies of the shared objects synchronized. State changes can be encoded either as complete states that contain the updated values of all attributes, or as events that contain the modifications only. Our prototype encodes new objects as states and all other changes as events. In the following, we denote states and events as operations.
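The distinction between states and events can be made concrete with a short Python sketch. The `Operation` class, its field names, and the in-memory dictionary standing in for the local database are hypothetical and chosen only for illustration; the prototype's actual data structures are not described at this level of detail.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Operation:
    """Hypothetical operation: a complete state or an event with changes only."""
    object_id: str
    kind: str                      # "state" or "event"
    attributes: Dict[str, Any]


def apply_operation(store: Dict[str, Dict[str, Any]], op: Operation) -> None:
    """Apply one operation to a local replica (a dict stands in for the database)."""
    if op.kind == "state":
        # A state carries the values of all attributes and replaces the object.
        store[op.object_id] = dict(op.attributes)
    elif op.kind == "event":
        # An event carries only the modified attributes.
        store[op.object_id].update(op.attributes)
    else:
        raise ValueError(f"unknown operation kind: {op.kind}")
```

Creating a message would thus be sent as a state with subject and body, while a later change of the subject alone would be sent as an event.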

All algorithms discussed in the following sections are based on this abstract data model of states and events. Thus, they are independent of the shared object types, so that our prototype can be easily extended with new types (e.g., calendar items).

From the user’s perspective, this peer-to-peer architecture means that, apart from acquiring and running the application, no extra steps are necessary before new shared objects can be created or existing shared objects can be accessed. New peers can be invited and integrated easily, for example, by an email referencing a web link that automatically installs the application.

4.1. Communication Protocols

For the distribution of operations from their source to all peers that participate in an activity, appropriate communication protocols need to be designed. We decided to base the application-level protocol on XML (Extensible Markup Language (XML) 1.0 (Second Edition), 2000), mainly to allow rapid development and easy debugging. Preliminary tests with our prototype have shown that the resulting performance is sufficient, but should the need arise we plan to switch to a binary protocol.
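To give an impression of what an XML-encoded operation might look like, the following Python sketch serializes and parses an operation with the standard library. The element and attribute names (`operation`, `attribute`, `object`, `kind`, `name`) are invented for this example; the actual message schema of the prototype is not given here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple


def encode_operation(object_id: str, kind: str, attributes: Dict[str, str]) -> bytes:
    """Serialize an operation into a (hypothetical) XML message."""
    root = ET.Element("operation", {"object": object_id, "kind": kind})
    for name, value in attributes.items():
        element = ET.SubElement(root, "attribute", {"name": name})
        element.text = value
    return ET.tostring(root, encoding="utf-8")


def decode_operation(data: bytes) -> Tuple[str, str, Dict[str, str]]:
    """Parse a message produced by encode_operation."""
    root = ET.fromstring(data)
    attributes = {
        element.get("name"): element.text or ""
        for element in root.findall("attribute")
    }
    return root.get("object"), root.get("kind"), attributes
```

A text format of this kind can be inspected directly during debugging, which is the benefit named above; the cost is larger messages than a binary encoding would need.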

Since objects can be shared by an arbitrary number of users, application data usually needs to be delivered to more than one destination. Thus, the system has to employ a group communication protocol. We decided against using IP multicast due to its insufficient deployment (Diot et al., 2000), and because it would require the use of UDP as a transport protocol, which is blocked by firewalls in many organizations. Instead, a sender delivers application data directly via TCP to each receiver, forming a star-shaped distribution topology. Since we expect groups for most activities to be small (i.e., up to 10 participants), the network overhead imposed by this approach seems to be acceptable. An alternative would be to use an application-level multicast protocol (e.g., (Chu et al., 2001; Vogel et al., 2003b)), which remains an issue for future work.

4.2. Peer Discovery

Building the application’s communication on top of point-to-point unicast connections means that a sender has to contact each receiver individually. Therefore, a peer needs to know the IP addresses of all the peers it is collaborating with. Since our prototype is integrated into an email client, the natural contact information of a peer is its user’s email address. This has to be mapped to the IP address of the corresponding user’s computer. The address resolution is a dynamic process because an IP address might change when users connect via dial-up, work on different computers, or use dynamic IP.

To allow a peer to connect to the system for the first time, we use the existing instant messaging (IM) infrastructure in a company to resolve email addresses (see Figure 3): Each peer connects to its user’s corresponding instant messaging server. Each peer contacting the server also leaves its own address information, i.e., the IM infrastructure serves as a means for peer discovery and address resolution. All addresses resolved by a peer are saved in a local address table together with the time the information was obtained. This address table is persistent and used on the first attempt to establish a connection to another peer. Should this fail, the instant messaging server is contacted to inquire whether the peer’s IP address has changed. Once a peer makes contact to other peers, they can exchange addresses to complement their local address tables with new entries and to replace outdated addresses. This way communication with the address server can be limited and the peer-to-peer system will be able to function for some time in case the address server is not reachable.
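The lookup strategy described above (try the locally cached address first, fall back to the address server, and merge entries learned from other peers) can be sketched as follows. The class `AddressTable`, the callbacks `lookup_server` and `can_connect`, and the in-memory dictionary are assumptions for illustration; in particular, persistence of the table is omitted.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class AddressTable:
    """Hypothetical cache mapping email addresses to (IP address, time obtained)."""

    def __init__(self, lookup_server: Callable[[str], Optional[str]]):
        self._entries: Dict[str, Tuple[str, float]] = {}
        self._lookup_server = lookup_server   # stands in for the IM server query

    def learn(self, email: str, ip: str, obtained_at: Optional[float] = None) -> None:
        """Record an address unless a more recently obtained one is already known."""
        stamp = time.time() if obtained_at is None else obtained_at
        known = self._entries.get(email)
        if known is None or stamp >= known[1]:
            self._entries[email] = (ip, stamp)

    def merge(self, entries: Dict[str, Tuple[str, float]]) -> None:
        """Integrate entries received from another peer."""
        for email, (ip, stamp) in entries.items():
            self.learn(email, ip, stamp)

    def resolve(self, email: str, can_connect: Callable[[str], bool]) -> Optional[str]:
        """Try the cached address first; ask the server only if that fails."""
        known = self._entries.get(email)
        if known is not None and can_connect(known[0]):
            return known[0]
        fresh = self._lookup_server(email)
        if fresh is not None:
            self.learn(email, fresh)
        return fresh
```

Keeping the time at which each entry was obtained is what allows a peer to decide whether an address offered by another peer should replace its own.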

This article does not focus on peer discovery protocols. The aforementioned mechanisms could also be replaced with existing peer discovery protocols such as Gnutella (Gnutella and Limewire, 2004) and JXTA (JXTA, 2004). Alternatively, we could also exchange IP addresses through special emails that are filtered by our email client. The advantage of this approach would be that no additional infrastructure for address resolution is required. Its realization is an issue for future work.

5. Synchronous Collaboration

The replication of the application’s state as described in Section 4 requires explicit mechanisms to keep all state copies consistent. Much research has been done in keeping the state of synchronous multi-user applications such as whiteboards, shared editors, etc. consistent. Likewise our prototype requires consistency mechanisms when people are working on shared objects at the same time. However, by design our system also supports offline use when people are disconnected from the network, and people who are online are able to share objects with others who are currently offline. Little work has been done on algorithms that support consistency in blended synchronous and asynchronous collaborative applications. We have chosen and extended an existing consistency mechanism for synchronous collaboration so that it also supports asynchronous collaboration. In the following, we first describe consistency control in the “ideal” case when everyone is online before we cover the asynchronous case in Section 6.

5.1. Consistency Criteria

Consider the scenario described in Section 3: Let us assume that Bob decides to change the name of the shared screen object that he and Dan were annotating to “project homepage” (see (7) in Figure 2). In order to execute this local state change in Dan’s application, Bob’s application needs to propagate the pertinent information with an operation $O_a$. But since this transmission is subject to network delay, Dan could also change the object’s name to “first draft” with an operation $O_b$ in the brief time span before Bob’s update is received. In this case, Bob’s and Dan’s changes are concurrent, and they conflict semantically since they target the same aspect of the state. Without further actions, the name of Bob’s activity would be “first draft” and that of Dan’s “project homepage”, meaning that the object’s state is inconsistent. To prevent this from happening, the application needs to employ an appropriate consistency control mechanism.

To be more specific, there are two consistency criteria that the application should observe: causality and convergence (Ellis and Gibbs, 1989). Causality means that an operation $O_b$ that is issued at peer $i$ after another operation $O_a$ was executed at $i$ needs to be executed after $O_a$ at all peers, so that the cause is always visible before the effect. For example, the shared screen object has to be created before its name can be changed. Convergence demands that the state of all peers is identical after they have executed the same set of operations $\{O_i\}$. This means in our example that Bob and Dan should see the same name of the screen activity.

5.2. Consistency Control Algorithms

Consistency control mechanisms that fulfill these two criteria can be classified into pessimistic and optimistic. Pessimistic approaches seek to prevent conflicts, such as the one described above, from happening by allowing only one user to change a certain aspect of the application’s state at a certain point in time, e.g., by locking (Munson and Dewan, 1996). Their drawback is that they constrain collaboration and reduce the responsiveness of the system. In contrast, optimistic algorithms allow conflicts to happen and repair them afterwards in an efficient way. They work best under the assumption that conflicts occur only infrequently. Examples are object duplication (Sun and Chen, 2002), operational transformation (Ellis and Gibbs, 1989; Sun et al., 1998), and serialization (Sun et al., 1996). Object duplication creates a new version of an object when conflicting operations occur, thus presenting multiple versions of the same object to all users. We believe that this approach is confusing for the user and that it is opposed to the original goal of a collaborative application, i.e., to create a joint object. Operational transformation converts concurrent operations so that they can be executed in the order in which they are received and still lead to the same result. The main drawback of this approach is its complex implementation since it requires a transformation function for each pair of operations.

Thus, we decided to employ serialization for establishing causality and convergence. The basic idea is to execute a set of operations that targets a certain object in a specific order at all peers. Such an order can be defined on the basis of state vectors (Lamport, 1978).¹

5.3. State Vector Ordering

A state vector $SV$ is a set of tuples $(i, SN_i)$, $i = 1, \ldots, n$, where $i$ denotes a certain peer, $n$ is the number of peers, and $SN_i$ is the sequence number of peer $i$. When $i$ issues an operation $O$, $SN_i$ is incremented by 1 and the new $SV$ is assigned to $O$ as well as to the state that results after executing $O$. We define $SV[i] := SN_i$. With the help of state vectors, causality can be achieved as follows (Sun et al., 1998): Let $SV_O$ be the state vector of an operation $O$ issued at peer $a$ and $SV_b$ the state vector at peer $b$ at the time $O$ is received. Then $O$ can be executed at peer $b$ when (1) $SV_O[a] = SV_b[a] + 1$, and (2) $SV_O[i] \le SV_b[i]$ for all peers $i \ne a$. This means that prior to the execution of $O$ all other operations that causally precede $O$ have been received and executed. If this is the case, we call $O$ causally ready. But if $O$ violates this rule, it needs to be buffered until it is causally ready, i.e., until all necessary preceding operations have arrived and have been executed.
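The causal-readiness test and the buffering of operations that are not yet ready can be sketched in Python as follows. This is a minimal illustration of conditions (1) and (2), not the prototype's code; state vectors are modeled as dictionaries from integer peer ids to sequence numbers, and the names `causally_ready` and `drain` are hypothetical.

```python
from typing import Dict, List, Tuple

StateVector = Dict[int, int]   # peer id -> sequence number (assumed representation)


def causally_ready(sv_op: StateVector, sender: int, sv_local: StateVector) -> bool:
    """Check conditions (1) and (2) for an operation issued by `sender`."""
    # (1) The operation is the next one expected from its sender.
    if sv_op.get(sender, 0) != sv_local.get(sender, 0) + 1:
        return False
    # (2) Everything the sender had seen from other peers is already executed here.
    return all(
        count <= sv_local.get(peer, 0)
        for peer, count in sv_op.items()
        if peer != sender
    )


def drain(buffer: List[Tuple[int, StateVector]],
          sv_local: StateVector) -> List[Tuple[int, StateVector]]:
    """Execute buffered (sender, state vector) pairs as they become causally ready."""
    executed = []
    progress = True
    while progress:
        progress = False
        for item in list(buffer):
            sender, sv_op = item
            if causally_ready(sv_op, sender, sv_local):
                buffer.remove(item)
                sv_local[sender] = sv_op[sender]   # local vector advances
                executed.append(item)
                progress = True
    return executed
```

If an operation from peer 2 that depends on the first operation of peer 1 arrives before that operation, it stays in the buffer until peer 1's operation has been executed.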

Convergence can be achieved by applying the following ordering relation to all operations that are causally ready (Sun et al., 1998): Let O_a and O_b be two operations generated at peers a and b, SV_a the state vector of O_a and SV_b the state vector of O_b, and sum(SV) := ∑_i SV[i]. Then O_a < O_b if (1) sum(SV_a) < sum(SV_b), or (2) sum(SV_a) = sum(SV_b) and a < b.
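As an illustration, the ordering relation maps directly onto a lexicographic comparison of the pair (sum of the state vector, peer identifier). The sketch below assumes dictionary-based state vectors and peer identifiers that can be compared with <; the names sort_key and precedes are hypothetical.

```python
from typing import Dict, Tuple

StateVector = Dict[str, int]  # assumed representation: peer id -> sequence number


def sort_key(sv: StateVector, peer: str) -> Tuple[int, str]:
    """Key whose natural tuple order equals the ordering relation."""
    return (sum(sv.values()), peer)


def precedes(sv_a: StateVector, a: str, sv_b: StateVector, b: str) -> bool:
    """True if the operation of peer a is ordered before the operation of peer b."""
    # Tuples compare first by sum (rule 1), then by peer id (rule 2).
    return sort_key(sv_a, a) < sort_key(sv_b, b)
```

Because the key is an ordinary tuple, an operation history can be kept sorted with standard tools such as sorted() or the bisect module.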

1 Alternatively, operations can be ordered on the basis of their scheduled execution timestamps (Mauve, 2000), which is mainly done for shared objects with real-time characteristics, e.g., time-controlled animations. Such objects are currently not supported by our prototype. Moreover, a timestamp ordering would require that the clocks of all peers are synchronized, which increases the administrative overhead and creates dependencies on a time synchronization infrastructure. But the algorithms described below are independent of the chosen ordering relation, and switching to a timestamp-based ordering would be straightforward.


In the following, we denote the set of operations that was received by a peer and executed in the order defined above as operation history. Due to the propagation delay of the network, it might happen that an operation O_a that should have been executed before an operation O_b according to the ordering relation is received only after O_b has been executed, i.e., O_a would not be the last operation when sorted into the history. This means that applying O_a to the current state would cause an inconsistency. Thus, a repair mechanism is required that restores the correct order of operations.

5.4. The Timewarp Algorithm

Timewarp is such an algorithm and works as follows (Jefferson, 1985; Mauve et al., 2004): Each peer saves for each shared object the history of all local and remote operations. Moreover, snapshots of the current state are added periodically to the history. Let us assume that an operation O_a is received out of order. First, O_a is inserted into the history of the target object in the correct order. Then, the object’s state is set back to the last state saved before O_a should have been executed, and all states that are newer are removed from the history (since they are now outdated). After that, all operations following O_a are executed in a fast-forward mode until the end of the history is reached. Thus, the timewarp algorithm ensures that all peers execute all operations in the same order and therefore fulfills the convergence criterion defined above. To avoid confusion, only the repaired final state should be visible for the user.
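The steps of a timewarp can be sketched as follows. This is a simplified illustration and not the prototype's implementation: the class TimewarpObject, the callback apply_op (which returns the state after one operation), and the fixed snapshot interval are assumptions of the sketch, operations are assumed to be causally ready already, and duplicate operations are not handled.

```python
import bisect
import copy
from typing import Any, Callable, List, Tuple


class TimewarpObject:
    """One shared object with an operation history and periodic state snapshots."""

    def __init__(self, initial_state: Any, apply_op: Callable[[Any, Any], Any],
                 snapshot_every: int = 5) -> None:
        self.apply_op = apply_op
        self.snapshot_every = snapshot_every
        self.keys: List[Tuple[int, str]] = []  # sort keys (sum of state vector, peer)
        self.ops: List[Any] = []               # operations, parallel to self.keys
        # Each snapshot is (number of operations applied, state after them).
        self.snapshots: List[Tuple[int, Any]] = [(0, copy.deepcopy(initial_state))]
        self.state = copy.deepcopy(initial_state)
        self.timewarps = 0

    def receive(self, key: Tuple[int, str], op: Any) -> None:
        pos = bisect.bisect_right(self.keys, key)
        self.keys.insert(pos, key)
        self.ops.insert(pos, op)
        if pos == len(self.ops) - 1:
            # In order: apply directly to the current state.
            self.state = self.apply_op(self.state, op)
            self._maybe_snapshot(len(self.ops))
            return
        # Out of order: drop outdated snapshots, roll back, then fast-forward.
        self.timewarps += 1
        self.snapshots = [s for s in self.snapshots if s[0] <= pos]
        start, saved = self.snapshots[-1]
        self.state = copy.deepcopy(saved)
        for i in range(start, len(self.ops)):
            self.state = self.apply_op(self.state, self.ops[i])
            self._maybe_snapshot(i + 1)

    def _maybe_snapshot(self, applied: int) -> None:
        if applied % self.snapshot_every == 0:
            self.snapshots.append((applied, copy.deepcopy(self.state)))


# Usage: the third operation arrives late and is sorted between the first two.
obj = TimewarpObject([], lambda state, op: state + [op], snapshot_every=2)
obj.receive((1, "a"), "x")
obj.receive((3, "a"), "z")
obj.receive((2, "b"), "y")
```

In this sketch the snapshots recomputed during the fast-forward replace the deleted ones, so only the repaired state and its snapshots remain after the timewarp.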

Instead of deleting outdated states when executing a timewarp, these could also be updated while calculating the new state. This concept is also known as trailing states (Cronin et al., 2002; Mauve et al., 2004) and has the advantage that the operation history will contain more states as possible starting points for future timewarps.

The advantages of the timewarp algorithm are that it functions exclusively on local information and does not require additional communication among peers, that it is robust and scalable, that it is applicable to all peer-to-peer applications, and that the operation history can be reused for other purposes such as versioning and local recording (see Section 8).

One major drawback is the memory usage of the operation history, which is determined to a large part by the frequency of state snapshots. While a low frequency saves memory, it increases the average processing time for the execution of a timewarp. For shared objects with many lightweight operations and a large state, we recommend a low state snapshot frequency (e.g., one snapshot every 15-20 operations). For instance, if we design the shared screen object such that each point drawn on the user interface is transmitted as an individual event, the history will contain many operations, and a low frequency is advisable in order to save memory space. But if we design the shared screen object such that only entire strokes are propagated (as is the case in our prototype), a higher frequency of about one snapshot every 10 operations is appropriate. Instead of predefining the snapshot frequency, the system could also adapt it dynamically depending on the observed processing time for a timewarp and the available memory space.

Another drawback of the timewarp algorithm is the processing time to recalculate an object’s state. But since each shared object has its own operation history and since states are inserted frequently into the history, the time required for a timewarp is usually below 40 ms on an AMD 2.0 GHz PC, which is not noticeable for the user.

Finally, the actual effect of a timewarp might be disturbing for a user: It is possible that the recalculated state differs to a high degree from the object’s state that was visible before, e.g., when objects are deleted. In Section 8, we discuss possibilities to explain such visual artifacts to the user.

5.5. Improvements for the Timewarp Algorithm

In order to lower the memory space that is required for storing the operation history, the size of the history can be limited by analyzing the information included in the state vector SV_O of an operation O issued by peer j and received by i: SV_O[k] gives the sequence number of the last operation that j received from k, i.e., j implicitly acknowledges all older operations of k. Under the condition that O is causally ready, there can be no timewarp triggered by operations from j that are concurrent to operations from k with SN_k ≤ SV_O[k]. This means that, from j’s perspective, i can remove all operations of k with SN_k ≤ SV_O[k] from its history. Once i has received similar acknowledgments from all peers, old operations can be cleared. This process can be sped up by periodically exchanging status messages with the current state vectors of a peer.
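A minimal sketch of this pruning rule is given below. It assumes that each peer keeps, for every peer j (including itself), the most recent state vector it has seen from j in a table acks, and it reduces history entries to (origin, sequence number) pairs. The function names are hypothetical. A real implementation would presumably also have to keep a state snapshot that reflects the removed operations, which the sketch omits.

```python
from typing import Dict, Iterable, List, Tuple

StateVector = Dict[str, int]  # assumed representation: peer id -> sequence number


def acknowledged_by_all(acks: Dict[str, StateVector], peers: Iterable[str]) -> StateVector:
    """For each origin k, the highest sequence number every peer has acknowledged."""
    peers = list(peers)
    return {k: min(acks.get(j, {}).get(k, 0) for j in peers) for k in peers}


def prune_history(history: List[Tuple[str, int]], acks: Dict[str, StateVector],
                  peers: Iterable[str]) -> List[Tuple[str, int]]:
    """Keep only operations that at least one peer has not acknowledged yet."""
    limit = acknowledged_by_all(acks, peers)
    return [(k, sn) for (k, sn) in history if sn > limit.get(k, 0)]
```

A peer from which no state vector has been received yet acknowledges nothing in this sketch, so no operation is removed until every peer has reported.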

In the unlikely event that a new peer late-joins an ongoing collaboration on a shared object, obtains the current state, and issues a state change concurrently to the operation of another peer, the aforementioned improvement might fail since peers clearing their history have no knowledge about the late-joining peer. This can be prevented by keeping outdated operations for some extra time span or by using an additional repair mechanism such as a state request (Vogel and Mauve, 2001).


The performance of the timewarp algorithm can be improved considerably by a round-based execution and by applying application-level knowledge: With the first improvement, the application collects all operations during a certain period of time T, instead of treating each operation individually (Mauve et al., 2004). The smallest operation (according to the state vector ordering) received then determines whether the sequence of received operations can be applied to the current state or whether a timewarp is required. Thus, at most one timewarp per T is executed, independent of the number of late-arriving operations. As we show in (Mauve et al., 2004), this procedure reduces the complexity of the timewarp algorithm from O(n^3) to O(n^2), where n is the number of peers.
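The decision taken at the end of a round can be illustrated with a small helper. It assumes that operations are identified by sort keys of the form (sum of state vector, peer identifier) and that the history keys are kept in ascending order; the function name plan_round and its return values are hypothetical.

```python
from typing import List, Tuple

Key = Tuple[int, str]  # assumed sort key: (sum of state vector, peer id)


def plan_round(history_keys: List[Key], batch_keys: List[Key]) -> str:
    """Decide how the operations collected during one round T are integrated."""
    if not batch_keys:
        return "idle"
    # Only the smallest collected operation matters for the decision.
    if not history_keys or min(batch_keys) > history_keys[-1]:
        return "append"    # the sorted batch can be applied to the current state
    return "timewarp"      # a single timewarp repairs the whole batch
```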

The second improvement uses application-level knowledge to decide if the local state of a peer does not need to be changed during an update interval T: An operation O_a received out of order can be ignored under certain conditions, i.e., if there exists an operation O_b with O_a < O_b that overwrites the effects of O_a (Vogel and Mauve, 2001). This means that the state that would be reached after executing O_a and O_b is identical to the state that has been reached by executing O_b only. If there is at least one such O_b, then O_a can be ignored, and no timewarp is necessary. For instance, let O_a and O_b be the two “change name” events from the example above. Then the object’s name will be “first draft” after both O_a and O_b have been executed. Even if a timewarp was executed, Dan would miss the fact that the name was “project homepage” for a certain period of time, since only the state after executing the timewarp would be displayed. As we found in (Vogel and Mauve, 2001), this filtered timewarp approach can reduce the number of timewarps required by as much as 99%, depending on the application scenario.
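Whether an operation overwrites another depends on the semantics of the shared object. The following sketch assumes the simplest case, in which an operation is a set of attribute assignments, so that a later operation overwrites an earlier one if it assigns every attribute the earlier one assigned. The function names are hypothetical, and other operation types would need their own overwrite test.

```python
from typing import Any, Dict, Iterable

Assignment = Dict[str, Any]  # assumed operation model: attribute -> new value


def overwrites(later: Assignment, earlier: Assignment) -> bool:
    """True if `later` assigns every attribute that `earlier` assigned."""
    return set(earlier) <= set(later)


def can_skip_timewarp(late_op: Assignment, later_ops: Iterable[Assignment]) -> bool:
    """True if some operation ordered after `late_op` hides all of its effects."""
    return any(overwrites(op, late_op) for op in later_ops)
```

For the “change name” example, can_skip_timewarp({"name": "project homepage"}, [{"name": "first draft"}]) holds, so the late operation only needs to be recorded in the history.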

6. Asynchronous Collaboration

A key aspect of our prototype is that shared objects can be accessed by users anytime and independently of others. They are persistent even when all peers are offline, and they facilitate synchronous as well as asynchronous collaboration. The latter means for the scenario described in Section 3 that Bob can access and manipulate data of the shared screen object (see (7) in Figure 2) even when Dan is offline and his ActivityService is not running2. This raises the question of how Dan’s application can acquire the information that was missed once it is active again so that it possesses the current state and Dan is able to continue the collaboration.

2 In case Dan is not currently working on a shared object but his application is running, no additional actions are required since Dan’s ActivityService still receives all state updates.

For easier discussion, we denote the process of reconnecting after operations have been missed as late-join, the peer that missed operations as late-join client, and any peer that provides information to the late-join client as late-join server (Vogel et al., 2003a). Note that the roles of late-join client and server are assigned dynamically in a session, and that a peer can be both client and server for different shared objects.

6.1. Design Considerations

A late-joining peer needs to retrieve a sufficient amount of data so that it can update the local copies of its shared objects. More specifically, we have to address the following questions: First, in which form is update information provided to a late-join client? Update information can be delivered in the form of a state that holds the current values of a shared object’s attributes or by a replay of the operations that led to that state. Second, who provides information to a late-join client? Because of the application’s peer-to-peer architecture, data could be provided by a number of peers. Thus, one or more late-join servers need to be specified. Since we do not want to introduce a static infrastructure by using predefined late-join servers, the server selection process should be dynamic. Third, when does the transmission of information take place? Late-join data could be delivered immediately or at a later point in time, e.g., when the user accesses the concerned activity. Finally, how is update data distributed over the network? Since the communication model of our prototype relies on point-to-point connections, we decided to transmit update information also via unicast.

Before we discuss these design options in more detail and present our approach, we also need to consider a number of design criteria: First, the late-joining peer must reach a consistent state as defined in Section 5.1. Second, the algorithm should be robust against possible failures in the update process. Third, for the late-joining peer it is desirable that the update process takes only a limited amount of time. Fourth, the network load that is caused by the transmission of late-join data should be low. Finally, the application load in terms of memory space and processing power should be low.

Since our prototype allows a seamless transition between synchronous and asynchronous collaboration, it is likely that late-join situations occur frequently, and that update information is available only for certain periods of time at selected peers. This also means that the local state copies held at the individual peers might diverge to a certain degree before updates can be exchanged. Thus, our main design goals are a quick update of late-joining peers as well as consistency and robustness of the system.

6.2. State Transmission versus Operation Replay

The first design option is whether the late-join client should be provided with all operations missed or, alternatively, with a copy of the current state. The most efficient way to provide late-join data is to transmit object states. However, in the dynamic environment encountered here, it is likely that the local state of a peer that is selected as a late-join server is not up to date. Thus, several state transmissions might be necessary until a late-join client reaches a consistent state. Better suited is therefore a replay of the operation history, which enables a client to repair possible inconsistencies locally. Under the condition that all missing operations will be delivered eventually to the late-join client, this approach is very robust. Since in many late-join situations the client already holds a large part of the operation history, only the missing operations need to be exchanged, which reduces the initialization delay as well as the network and application load. The missing parts of the operation history can be identified by comparing the state vectors of late-join client and server. The client then integrates these operations into its operation history by applying the timewarp algorithm.
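The state vector comparison can be sketched as follows, assuming that every operation in the server's history records its originating peer and that peer's sequence number. The Operation type and the function missing_operations are hypothetical names introduced for this illustration.

```python
from typing import Dict, List, NamedTuple

StateVector = Dict[str, int]  # assumed representation: peer id -> sequence number


class Operation(NamedTuple):
    peer: str      # originating peer
    seq: int       # sequence number assigned by the originating peer
    payload: str   # application-specific content


def missing_operations(server_history: List[Operation],
                       client_sv: StateVector) -> List[Operation]:
    """Operations in the server's history that the client has not seen yet."""
    return [op for op in server_history if op.seq > client_sv.get(op.peer, 0)]
```

Calling the same function with the roles swapped yields the operations that the client holds and the server lacks.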

Peers that join an activity for the first time have an empty initial state. Replaying the complete operation history to these late-join clients would cause a high initialization delay and a high network load. But initializing a late-join client with a state copy again raises consistency issues as described above. Instead, the initialization can be started with an old state and a following replay of later operations. The starting state S should be consistent so that it allows the late-join client to reach a consistent state. Whether a state S from the operation history of the server j qualifies as a starting point can be determined by comparing its state vector SV_S with the vectors SV_i of all other peers i: If SV_S[k] ≤ SV_i[k] for all k and all i ≠ j, S contains the effects of all operations scheduled before S was taken and is a valid starting point. In practice, this implies that S might be quite old.
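The validity test for a starting state can be written down directly from this condition. The sketch assumes that the server j knows a recent state vector of every other peer (for example from received operations or status messages) and stores them in a dictionary; the function name is_valid_start is hypothetical.

```python
from typing import Dict

StateVector = Dict[str, int]  # assumed representation: peer id -> sequence number


def is_valid_start(sv_s: StateVector, server: str,
                   peer_vectors: Dict[str, StateVector]) -> bool:
    """True if every peer other than the server has reached the state S."""
    return all(
        sv_s[k] <= sv_i.get(k, 0)
        for i, sv_i in peer_vectors.items() if i != server
        for k in sv_s
    )
```

A server would typically test its saved states from newest to oldest and start the initialization with the first one that passes.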

6.3. Controlling Late-Join Data Delivery

Another design option determines if the data transmission is initiated by the late-join client or the server, and when the transmission takes place (Vogel et al., 2003a). Initiating and controlling the transmission by the late-join client is the obvious choice since the client has to ensure the consistency of its own state.


Since update information might be available only for a short period of time, the update process is started as soon as the ActivityService is running: The late-join client contacts all peers with whom it collaborates one by one and hands over the current state vectors of its shared objects. Comparing those state vectors to its own, a peer can decide which operations from its history need to be sent to the late-joining peer (if any)3. Thus, the late-join client can obtain missed state changes from any peer that participates in a shared object, not only from the original source of an operation. This is especially useful when that source is offline at the time the late-join client connects. Additionally, it increases the robustness of the system and spreads the burden of initializing clients to all peers.

Instead of contacting only a single peer, the late-join client contacts all peers in order to quickly discover operations missing in the client’s history. This lowers the time span until a consistent state is established and increases the robustness of the system. To further increase the probability that all missed operations are received, information about the current state of objects is also exchanged whenever a user joins a shared object. Both measures also increase the application and network loads, but do not pose a severe scalability problem since the number of peers that participate in an activity is typically below ten (Muller et al., 2004).

6.4. Late-Joining New Shared Objects

The mechanism described so far succeeds only if a peer that was already participating in a shared object missed some state changes. But another possibility is that a new shared object is created or that a new member is invited to join an existing activity (as in step (1) of the scenario in Section 3). Notice that either the peer creating a new shared object or the peer receiving a new shared object could be offline. So when reconnecting, the late-join client does not know about the new shared object and can therefore not ask for the transmission of missed operations. Even worse, it might be that the late-join client does not know the other peers that are members of this shared object if they did not collaborate before. Thus, in case a peer misses the creation of an object or the invitation to an existing object, the responsibility for the retransmission must be with the originator of that action. Should both peers connect for the transmission of any other data, the missed operations can be delivered to the late-join client. The late-join server also tries to contact the client periodically in order to deliver that data (depending on the current application and network load).

3 It might also be that the late-joining peer possesses operations that the contacted peer does not have. This can also be detected by a state vector comparison so that the missing operations can be transferred.

As in the previous case, the probability for quickly updating the late-join client can be increased by placing the responsibility for updating the late-join client not solely on the original source but on all members of a shared object. The more members a shared object has, the higher the likelihood for a quick update. For this purpose, the peer that created an object or invited new members to an object informs all available peer members that the relevant operations could not be delivered to one or more peers. All peers are assigned the task to try to transmit the operations to the late-join client either by making regular contact or by probing periodically. They stop as soon as they receive a state vector from the client, indicating that it possesses the required information. In case the late-join client receives the same operations more than once, it simply ignores the duplicates.

In summary, our approach to achieve a consistent application state (also in case of asynchronous/offline use) is based on the retransmission of missed operations, accomplished by the aforementioned mechanisms in our prototype. Retransmissions can be initiated by the late-joining peer, by the peer who was the source for an operation, or by other peer members of a shared object. All operations received by the late-joining peer are then executed in the order defined in Section 5. They might trigger a timewarp if necessary so that consistency is achieved. All algorithms are based on our abstract data model of states and operations and function independently of a shared object’s type.

7. Simulation Results

To evaluate the correctness and the performance of the algorithms described above, we simulated different scenarios for the synchronous and asynchronous collaboration of two and three peers respectively. In each scenario we randomly reproduce the activities of a typical work week with five days. During one simulated workday, the following actions take place on average: A total of three shared objects are created, each peer views the data of an object fifteen times (i.e., opens and closes the view window of an object fifteen times), and each peer modifies the state of an object ten times. The simulated scenarios differ in respect to the time span that each peer is working either online or offline. For easier handling, we simulated each workday in 60 seconds. Please note that this increases the probability for a timewarp when peers work synchronously. In our simulations, the improvements to the timewarp algorithm described in Section 5.5 were not applied. Because of the limited number of operations in our scenarios, we insert a state snapshot every five operations into the history.

7.1. Timewarp Algorithm

When all peers are collaborating synchronously over the entire simulation time, a timewarp can happen only under the rare condition that two or more operations targeting the same shared object are issued concurrently, i.e., within the time span that is necessary to propagate these operations. For a scenario with two peers, a total number of five timewarps was triggered with an average duration of 26 ms and an average number of 6.4 operations that needed to be executed in order to regain a consistent state4. When both peers are working online for only 80% of the simulated time, a timewarp becomes much more likely since concurrent actions can now happen over the whole time span where at least one peer is offline. Consequently, a total number of 27 timewarps occurred that took an average execution time of 28 ms and an average sequence of 6.7 operations to rebuild the application’s state. When the time that peers spend online is reduced further to 50%, 118 timewarps happened with an average time of 32 ms and 13.2 operations to be executed. In the last scenario, the two peers worked synchronously only for 20% of the simulated time. Since in this case many shared objects and subsequent actions are actually created when being offline, the total number of timewarps decreases to 44 with an average duration of 30 ms and 9.5 operations in a timewarp sequence. In all cases, the average time to execute a single timewarp is at most 32 ms, which is not noticeable for the user.

The effect of the algorithm to reduce the size of the operation history can be seen in Figure 4, which depicts the total number of operations exchanged in comparison with the actual size of the history of the two peers, assuming an online time of 80%. In this scenario, the algorithm is able to limit the size of the histories to approximately 25% of the total number of exchanged operations. For the other scenarios, this number lies between 20% and 40%.

7.2. Delivery of Late-Join Data

While a peer is working offline, operations originating from or targeting that peer cannot be delivered immediately. Instead, the algorithms described in Section 6 transmit the missed operations when the peer reconnects. Figure 5 shows the times peers are collaborating synchronously in the scenario where the three peers spend 80% of the simulation time online. The number of operations that actually need to be delivered belatedly because the receivers are unreachable is depicted in Figure 6. The late-join algorithm manages to update peers as soon as they are online again. The curves include those operations that are cached by all receivers that were online at the time they were issued (see Section 6). On average, peer 1 stores 5.3 operations (peer 2: 7.2, peer 3: 5.3) for the other peers, and it takes about 3.9 s (peer 2: 4.4 s, peer 3: 4.0 s) between generating and delivering a certain operation5. These numbers increase considerably for the scenario where peers are online for 50% of the simulation time: Peer 1 caches on average 25.4 operations (peer 2: 15.3, peer 3: 22.5) and missed operations are transmitted 22.7 s (peer 2: 13.5 s, peer 3: 20.5 s) after they were issued. In the last scenario where peers are online for only 20% of the simulation time, the first peer meets the other peers only rarely. Consequently, peer 1 stores on average 107.4 operations for 60.6 s (peer 2: 80.4 operations for 22.8 s, peer 3: 66.4 for 27.6 s). These numbers show that the state of a shared object might diverge to a considerable degree when peers are working offline for a longer period of time. The algorithm for the distributed caching of missed operations most likely will alleviate this problem if the number of members of a shared object increases. However, its performance also depends on the work patterns of the members of a shared object.

Figure 4. Size of the operation history (number of operations over simulation time [s], 0 to 300 s; curves: operations exchanged, operations in history of peer 1, operations in history of peer 2)

4 Please note that the sequence of operations to be executed in the case of a timewarp also includes operations that were not causally ready before (see Section 5.3).

5 Peers are offline at the end of all simulations (see Figure 5) and operations that have not been delivered are not included in the analysis of the delivery time span.

Figure 5. Online time of peers (peers 1, 2, and 3 over simulation time [s], 0 to 300 s)

Figure 6. Number of undeliverable operations (number of operations over simulation time [s], 0 to 300 s; curves: operations cached by peer 1, peer 2, and peer 3)

8. Conflict Management

The most challenging aspect of our peer-to-peer system is to make it work for asynchronous/offline collaboration as well as for real-time collaboration. Our approach builds upon an existing consistency mechanism for synchronous collaboration, the timewarp algorithm. In Section 6, we discussed several strategies that we implemented to make sure that this algorithm would also work in case of asynchronous/offline work. They are based on recovering the operation history after periods of offline/asynchronous work. While our simulation results in Section 7 indicate that our system design works and performs adequately, this approach also has some shortcomings.

The time for reaching a consistent state after periods of offline use depends very much on the user behavior. In the worst case, local states of a shared object might diverge for a very long period of time, e.g., if two users are alternating between being offline and online. Each user then works on a state that differs from the one that would be reached if all operations were exchanged immediately, and after eventually merging the operation histories, they might see an unexpected result. This reflects a major problem of this approach: While the state is now consistent for both users, the machine cannot guess what their actual intention was when they were independently modifying the shared objects. The algorithm only makes sure that both can see the same end result. However, for the user it is difficult to understand how the resulting state of the object came to be. This is in particular the case when some of the operations exchanged conflict semantically, i.e., when they modify correlated aspects of a shared object. Even though the system can detect semantic conflicts in operations (Vogel and Mauve, 2001), it is not able to resolve them automatically. We therefore believe that it is crucial to supplement our automatic update algorithms with some feedback mechanism that allows users to understand the past course of action and to become aware of conflicts.

Our goal therefore is a visualization mechanism that provides detailed awareness information about missed and conflicting operations: Which objects were modified and how? Which peers are responsible for the changes? What is their current task or motivation? Which operations conflicted? And what would an alternative state look like? Some of this information can be extracted automatically from the operations exchanged (e.g., the peers and objects involved), while other information requires explicit user feedback (e.g., about a user’s motivation). One possibility would be to display a summarizing report for each shared object when reconnecting after a period of asynchronous work, naming peers, their modifications, and possible conflicts. Analyzing such reports might be inefficient for larger activities with many shared objects and operations.

Another possibility would be to let the user compare different versions of a shared object, e.g., the versions before and after exchanging missed operations. The application could also point out differences in two states, similar to diff that highlights differences in text files (Neuwirth et al., 1992), and provide mechanisms to resolve conflicts in coordination with all peers involved (e.g., by voting). Displaying different versions of an object can be implemented easily on the basis of the operation history that is held by each peer (see Section 5.4).

Finally, the operation history can not only be used to access past states but also to demonstrate the current state’s evolution: By replaying the history, the user can review the past course of actions so that the effects of (missed) operations become visible (Edwards and Mynatt, 1997; McCaffrey, 1998; Geyer et al., 2001). As depicted in Figure 7, we implemented a replay prototype for shared screen objects (Vogel, 2004). It is controlled with a VCR-like interface that is displayed underneath the object’s workspace: After pressing play (1), operations are replayed in the order given by the history, and the appropriate state is displayed in the workspace window. The replay can run either in the original time lapse or in a fast-forward mode. The skip buttons (2) change the current position in the history. The user can also browse through the history by means of a slider (3). This is possible for going forwards and backwards in time. When the slider is moved, the shared object’s state is updated accordingly. In preliminary experiments, this fast browsing was found to be very effective when analyzing large sequences of missed operations.

Figure 7. Visualization of the operation history

Exploring the operation history has only effects on the local user’s view and does not disturb other peers. While a past state is displayed, shared objects cannot be modified. All remote operations received in the meantime are appended to the history, but their effect is not visible until the replay reaches their position.

In addition to this replay functionality, we visualize the awareness information identified above in a timeline representation of the operation history. This timeline contains an icon for each operation. The most recent operation is shown on the right, and new operations are appended as soon as they are obtained. Each icon on the timeline encodes different information depending on its shape, color, and background: Operations issued by the local user have an outgoing arrow (4), while all remote operations show an incoming arrow (5). The operation that was executed last is marked by a green background (6), i.e., the state currently displayed shows the effects of all operations up to (6). All operations not yet executed are displayed with gray icons (7), instead of black ones (4). If operations are detected that conflict semantically, they are marked by a dot that is either red (8) or orange (9) in order to distinguish among different conflict sequences. In (8), one remote and one local operation conflict semantically. Thus, the user becomes aware of conflicts and can analyze them via the replay. Although it would be possible to encode even more information into an operation icon such as its type (e.g., move, draw, etc.) or the responsible peer, this would increase the representation’s complexity. Instead, detailed textual descriptions for each operation are provided via tool-tip windows.
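The icon encoding described above can be summarized in a small sketch. It is only an illustration and not the prototype's code: the names `Icon` and `icon_for` are hypothetical, and alternating red and orange by conflict-sequence number is an assumption made here to keep neighboring conflict sequences distinguishable.

```python
from typing import NamedTuple, Optional


class Icon(NamedTuple):
    arrow: str                  # "outgoing" (local) or "incoming" (remote)
    color: str                  # "black" (executed) or "gray" (not yet executed)
    background: Optional[str]   # "green" marks the operation executed last
    dot: Optional[str]          # "red" / "orange" marks a semantic conflict


def icon_for(index: int, issuer: str, local_peer: str, last_executed: int,
             conflict_sequence: Optional[int] = None) -> Icon:
    """Derive the timeline icon for the operation at position `index`."""
    arrow = "outgoing" if issuer == local_peer else "incoming"
    color = "black" if index <= last_executed else "gray"
    background = "green" if index == last_executed else None
    dot = None
    if conflict_sequence is not None:
        # Assumption: consecutive conflict sequences alternate between colors.
        dot = ("red", "orange")[conflict_sequence % 2]
    return Icon(arrow, color, background, dot)


# A remote operation that was executed last and belongs to conflict sequence 0.
print(icon_for(index=3, issuer="bob", local_peer="alice",
               last_executed=3, conflict_sequence=0))
```

Type and responsible peer are deliberately left out of `Icon`, mirroring the decision to move such details into tool-tips.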

The replay functionality can be implemented on the basis of the timewarp algorithm: A timewarp is executed when a state older than the current one is to be displayed (e.g., when moving the slider to the left or when jumping to the history’s beginning). When moving forward in time, the respective next state can be calculated by applying the next operation to the current state.
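A minimal sketch of this seek logic is given below. It assumes that a timewarp can be realized by falling back to a saved initial state and re-executing operations from there; the actual algorithm may instead use intermediate state snapshots. The names `ReplayView`, `seek`, and `move` are hypothetical.

```python
from typing import Callable, Dict, List

State = Dict[str, int]                # hypothetical state of a shared object
Operation = Callable[[State], State]  # an operation maps a state to a new state


class ReplayView:
    """Local, read-only view onto an operation history (illustrative only)."""

    def __init__(self, initial_state: State, history: List[Operation]) -> None:
        self.initial_state = dict(initial_state)
        self.history = list(history)
        self.position = 0                 # operations applied to self.state
        self.state = dict(initial_state)

    def append(self, op: Operation) -> None:
        # Remote operations arriving during replay are only recorded;
        # they become visible once the replay reaches their position.
        self.history.append(op)

    def seek(self, target: int) -> State:
        target = max(0, min(target, len(self.history)))
        if target < self.position:
            # Backwards in time: timewarp to the saved state and re-execute.
            self.state = dict(self.initial_state)
            self.position = 0
        while self.position < target:
            # Forwards in time: apply the next operation to the current state.
            self.state = self.history[self.position](self.state)
            self.position += 1
        return self.state


def move(dx: int) -> Operation:
    """Hypothetical operation: shift the object's x coordinate by dx."""
    return lambda state: {**state, "x": state["x"] + dx}


view = ReplayView({"x": 0}, [move(1), move(2), move(3)])
print(view.seek(3))   # jump to the end of the history
print(view.seek(1))   # slider moved left: takes the timewarp path
```

Re-executing from the initial state keeps the sketch short but costs time linear in the history length for every backward step; periodic snapshots would bound that cost.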

A more detailed discussion of the replay functionality and the timeline representation of the operation history can be found in (Vogel, 2004).

9. Lessons Learned

Our design philosophy and usage scenarios were a major factor in the decision to build a peer-to-peer system that would function without much administrative overhead. From a technical point of view, building such a system is rather challenging and requires sophisticated consistency control algorithms. Since the states of shared objects might diverge to a certain degree before updates can be exchanged, the system might also behave unexpectedly from the user’s point of view. Aside from the visualization mechanisms discussed above, there are also two more technical solutions that would alleviate this problem. First, adding servers that cache operations and update late-joining peers would help to decrease the time between resynchronizations of states. Whenever peers are connected to the system, their application can send operations to the caching server if the collaborating peers are currently offline. When the other peers connect, they first contact the caching server to collect missed operations. The drawback of this approach is that it requires additional infrastructure that needs to be maintained.

Another technical approach could be to put more semantics into the consistency control algorithm itself. The ordering of operations in state vectors is based on sequence numbers, i.e., when people work offline for long periods of time, the system only looks at the sequence numbers to restore a consistent state. But the sequence numbers do not reflect when operations actually took place. If two users work offline at different times, ordering of operations could be prioritized by the recency of the state change, which would probably help users in better understanding the resulting state after merging the operation histories. This could be achieved either by using globally synchronized clocks as a means for ordering (which again requires infrastructure) or, more elegantly, by using a modified state vector scheme that incorporates local time into the numbering.
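To illustrate the difference, the following sketch compares plain sequence numbers with a numbering that takes local time into account. It is one possible reading of the idea, not the scheme we would necessarily adopt: each peer numbers an operation with the maximum of its previous number plus one and its local clock, so numbers stay strictly increasing per peer even if the clock jumps backwards. The names `Peer`, `stamp`, and `merged_order` are hypothetical, and times are abstract integers.

```python
from typing import List, Tuple

Stamp = Tuple[int, str]   # (number, peer id); the peer id breaks ties


class Peer:
    def __init__(self, name: str, use_time: bool) -> None:
        self.name = name
        self.use_time = use_time   # False: plain sequence numbers
        self.last = 0

    def stamp(self, local_time: int) -> Stamp:
        """Number a new local operation issued at `local_time`."""
        number = self.last + 1
        if self.use_time:
            # Assumption: fold local time into the numbering.
            number = max(number, local_time)
        self.last = number
        return (number, self.name)


def merged_order(*histories: List[Stamp]) -> List[Stamp]:
    """Total order applied by every peer after exchanging histories."""
    return sorted(stamp for history in histories for stamp in history)


def demo(use_time: bool) -> List[Stamp]:
    alice, bob = Peer("alice", use_time), Peer("bob", use_time)
    a = [alice.stamp(100), alice.stamp(101)]   # alice works offline first
    b = [bob.stamp(200), bob.stamp(201)]       # bob works offline later
    return merged_order(a, b)


print(demo(use_time=False))   # operations of both users interleave
print(demo(use_time=True))    # alice's earlier work is ordered before bob's
```

With plain sequence numbers the two offline sessions interleave, although they took place one after the other. With time-aware numbers the merged history follows the recency of the changes, as long as the peers' local clocks are roughly comparable.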

It is also noteworthy that when we started this project, we did not anticipate the complexity of such a peer-to-peer system despite the fact that some of the authors of this article had several years of experience in developing distributed systems. The implementation of the consistency mechanisms alone took three full person-months, even though we already had a server-based prototype available (see footnote 6) before starting to work on the peer-to-peer aspects.

When building a peer-to-peer system, the advantages of this approach have to be carefully traded off against the benefits that a server-based solution offers. We believe that there is no definite answer to the question of peer-to-peer versus server-based. The design philosophy and the user experience of our system require offline use. Hence, we need replication of data to the local machine. However, offline use would also greatly benefit from having a server that helps synchronize divergent copies of the distributed state, as discussed above. We are currently investigating hybrid architectures that use servers for resynchronization, caching, address resolution, and other useful, more lightweight services. While the servers hold a master copy of the state, local applications would still communicate directly peer-to-peer. In this topology, the server would be just another peer with a different role. The presence of a server facilitates synchronization. However, even with a server, there is still a need for consistency algorithms like the one described in Section 5. When working offline, local replicas and operations need to be merged with the master copy on the server. With some modifications, the algorithms described earlier in this article could also be used in a hybrid architecture. We are aware that a hybrid architecture again comes with the problem of an additional infrastructure that needs to be introduced and accepted within an organization, which may take a long time. In the meantime, pure peer-to-peer systems help pave the way.

We believe that ultimately the user experience has to be peer-to-peer. Email is actually a very good example of such a hybrid system that feels peer-to-peer from a user’s perspective but is implemented through a network of mail servers (that talk peer-to-peer to one another). Most email clients today also support offline use by keeping local replicas of their inboxes. However, email does not support shared states and real-time collaboration and thus does not face the consistency challenges that our system does.

6 Our prototype is implemented in Java 1.4 using SWT/JFace user interface widgets from the Eclipse framework (Eclipse Project, 2004).

10. Related Work

Our notion of activity-centric collaboration based on shared objects has been inspired by previous work in this area. This includes a variety of studies on email and activities, collaborative improvements to email, and “peer-to-peer like” systems featuring replicated or synchronous shared objects.

The use of email has been widely studied, and research indicates that email is the place where collaboration emerges (e.g., (Ducheneaut and Bellotti, 2001; Ducheneaut and Belotti, 2001; Mackay, 1988; Whittaker and Sidner, 1996; Minassian et al., 2004)). Ducheneaut et al. report on how people manage work activities within their email (Ducheneaut and Belotti, 2001), confirming Bernstein’s description of emerging group processes (Bernstein, 2000). Their findings include: The outcome of an activity is often unpredictable, membership in activities is fluid, activities can evolve from the informal to the formal, and late-joiners of activities are poorly supported because they have no access to the history.

A number of recent collaboration systems illustrate concepts similar to activity threads. Rohall et al. and Kerr show how automatic analysis of subject headers and reply-relationships in different emails can group them into a coherent thread of messages (Rohall et al., 2001; Kerr, 2003). They also present how carefully designed thread visualizations can reduce inbox clutter. However, their threads are not shared, nor do they reflect any collaborative activity beyond the exchange of messages.

Bellotti et al. (Belotti et al., 2003) introduce the notion of a “thrask” as a means for better organizing email activities. Thrasks are threaded task-oriented collections that contain different types of artifacts such as messages, hyperlinks, and attachments and treat such content at an equal level. Thrasks can be manually created, with contents assigned by users to meaningful activities. While all of these attributes are similar to our use of activity threads, thrasks are not shared: They are private collections of related artifacts, whose organization is specific only to the owner of the thrasks, and they lack any awareness of how others are manipulating the artifacts.

Along the same lines, Kaptelinin presents a system that helps users organize resources into higher-level activities (“project-related pools”) (Kaptelinin, 2003). The system attempts to include not just email, but also any desktop application. It allows manual addition of resources of any type but also monitors user activities and automatically adds resources to the current activity. However, the system has been designed for personal information management only and does not offer full synchronous/asynchronous sharing and awareness. Rohall’s and Kaptelinin’s work demonstrates semi-automatic management of activities and their artifacts. This is something we should investigate in our future work to help make managing activity threads more lightweight.

A variety of collaborative systems implement replicated shared objects and “collaborative building blocks” similar to our prototype. One example is Groove (Groove Networks, 2004), a peer-to-peer system which features a large suite of collaborative tools (chat, calendar, whiteboard, etc.) and email integration. However, Groove is workspace-centric (although creation of shared workspaces is quick), and collaboration is centered on the tools placed in the workspace (e.g., all shared files appear in the shared files tool), except for persistent chat which appears across all tools. In contrast, our system focuses collaboration around artifacts, which can be organized in a single connected hierarchy of different types. While Groove’s design philosophy is different from ours, the architecture is very similar. Their system has to deal with problems similar to the ones described in Section 5. However, we have no technical information to compare the two approaches.

Another example is Eudora’s Sharing Protocol (Eudora Sharing Protocol (ESP), 2004), which offers a replicated shared file architecture completely based on email and leverages special MIME headers. Kubi Spaces also offers replicated shared objects, with a richer suite of object types, including to-dos, contacts, and timelines (Kubi Software, 2004). Microsoft Outlook has a server-based “shared email folder” capability (Microsoft Outlook, 2004). Lotus Notes (Lotus Notes, 2004) offers a replicated object architecture, although typically email objects are copies, not shared amongst different users. A powerful benefit of these “shared email solutions” is that no additional infrastructure beyond email is needed. However, synchronization only occurs on a triggered refresh interval and depends on a non-real-time message delivery between intermediary email servers. Thus, collaboration is entirely asynchronous (e.g., users cannot work simultaneously on a whiteboard in real-time) without real-time presence awareness.

11. Conclusions

We described a new collaboration technology that targets lightweight collaborative activities sitting mid-way between ad hoc communication in email and more formal collaboration in shared workspace systems. As a major new design principle, we introduced the notion of object-centric sharing, which allows people to aggregate and organize shared objects into activity threads, providing an emerging context that evolves and engages a dynamic group of participants. Being “placeless,” our approach imposes little overhead and allows for lightweight, ad hoc collaboration. A study to better understand the usefulness and usability of this new approach was recently completed and detailed results are reported in (Muller et al., 2004; Millen et al., 2004).

This work focused on describing technical aspects and implementation challenges of a peer-to-peer prototype system supporting the notion of object-centric sharing. As an implication of our design philosophy, we need to maintain consistency in a blended synchronous and asynchronous collaborative system. Our approach enhances a popular consistency algorithm, which had been originally designed for real-time collaboration. It is based on an abstract data model of states and operations and can be adapted easily to any type of shared object. Our simulation results indicate that this approach performs well. However, a major difficulty is that during long phases of asynchronous work the application state might diverge significantly. To alleviate this problem, we are currently looking into improved versions of our consistency algorithm and visual feedback mechanisms for providing awareness information, and we are also investigating new hybrid architectures that use lightweight caching servers to make the system more robust.

References

Belotti, V., N. Ducheneaut, M. Howard, and I. Smith: 2003, ‘Taking Email to Task: The Design and Evaluation of a Task Management Centered Email Tool’. In: Proc. ACM SIGCHI, Ft. Lauderdale, FL, USA. pp. 345–352.

Bernstein, A.: 2000, ‘How Can Cooperative Work Tools Support Dynamic Group Processes? Bridging the Specificity Frontier’. In: Proc. ACM CSCW, Philadelphia, PA, USA. pp. 279–288.

Chu, Y., S. G. Rao, S. Seshan, and H. Zhang: 2001, ‘Enabling Conferencing Applications on the Internet using an Overlay Multicast Architecture’. In: Proc. ACM SIGCOMM, San Diego, CA, USA. pp. 55–67.

Cronin, E., B. Filstrup, S. Jamin, and A. R. Kurc: 2002, ‘An Efficient Synchronization Mechanism for Mirrored Game Architectures’. In: Proc. NetGames, Braunschweig, Germany. pp. 67–73.

Diot, C., B. N. Levine, B. Lyles, H. Kassem, and D. Balensiefen: 2000, ‘Deployment Issues for the IP Multicast Service and Architecture’. IEEE Network 14(1), 78–88.

Ducheneaut, N. and V. Bellotti: 2001, ‘E-Mail as Habitat: An Exploration of Embedded Personal Information Management’. ACM Interactions 8(5), 30–38.

Ducheneaut, N. and V. Belotti: 2001, ‘A Study of Email Work Activities in Three Organizations’. Technical report, Working Paper, PARC, CA, USA.

Eclipse Project: 2004. URL http://www.eclipse.org.

Edwards, W. K. and E. D. Mynatt: 1997, ‘Timewarp: Techniques for Autonomous Collaboration’. In: Proc. ACM SIGCHI, Atlanta, GA, USA. pp. 218–225.

Ellis, C. A. and S. J. Gibbs: 1989, ‘Concurrency Control in Groupware Systems’. In: Proc. ACM SIGMOD, Portland, OR, USA. pp. 399–407.

Eudora Sharing Protocol (ESP): 2004. URL http://www.eudora.com/email/features/esp.html.

Extensible Markup Language (XML) 1.0 (Second Edition): 2000. W3C Recommendation, available at http://www.w3c.org/TR/REC-xml.

Geyer, W. and L.-T. Cheng: 2002, ‘Facilitating Emerging Collaboration through Light-Weight Information Sharing’. In: Proc. ACM CSCW, New Orleans, LA, USA. pp. 221–230.

Geyer, W., H. Richter, L. Fuchs, T. Frauenhofer, S. Davijavad, and S. Poltrock: 2001, ‘A Team Collaboration Space Supporting Capture and Access of Virtual Meetings’. In: Proc. ACM SIGGROUP, Boulder, CO, USA. pp. 188–196.

Geyer, W., J. Vogel, L.-T. Cheng, and M. Muller: 2003, ‘Supporting Activity-centric Collaboration through Peer-to-Peer Shared Objects’. In: Proc. ACM SIGGROUP, Sanibel Island, FL, USA. pp. 115–124.

Gnutella and Limewire: 2004. URL http://www.limewire.com.

Groove Networks: 2004. URL http://www.groove.net.

Jefferson, D. R.: 1985, ‘Virtual Time’. ACM Transactions on Programming Languages and Systems 7(3), 404–425.

JXTA: 2004. URL http://www.jxta.org.

Kaptelinin, V.: 2003, ‘UMEA: Translating Interaction Histories into Project Contexts’. In: Proc. ACM SIGCHI, Ft. Lauderdale, FL, USA. pp. 353–360.

Kerr, B.: 2003, ‘Thread Arcs: An Email Thread Visualization’. In: Proc. IEEE InfoVis, Seattle, WA, USA. pp. 27–35.

Kubi Software: 2004. URL http://www.kubisoft.com.

Lamport, L.: 1978, ‘Time, Clocks, and the Ordering of Events in a Distributed System’. Communications of the ACM 21(7), 558–565.

Lotus Notes: 2004. URL http://www.lotus.com/notes.

Mackay, W. E.: 1988, ‘More Than Just a Communication System: Diversity in the Use of Electronic Mail’. In: Proc. ACM CSCW, Portland, OR, USA. pp. 344–353.

Mauve, M.: 2000, ‘Consistency in Continuous Distributed Interactive Media’. In: Proc. ACM CSCW, Philadelphia, PA, USA. pp. 181–190.

Mauve, M., J. Vogel, V. Hilt, and W. Effelsberg: 2004, ‘Local-lag and Timewarp: Providing Consistency for Replicated Continuous Applications’. IEEE Transactions on Multimedia 6(1), 45–57.

McCaffrey, L.: 1998, ‘Representing Change in Persistent Groupware Environments’. Technical report, Grouplab Report, Department of Computer Science, University of Calgary, Canada.

Microsoft Outlook: 2004. URL http://www.microsoft.com/outlook/.

Millen, D. R., M. Muller, W. Geyer, B. Brownholtz, and E. Wilcox: 2004, ‘Media Choices in an Activity-centric Collaborative Environment’. Submitted to: ACM CSCW, Chicago, IL, USA.

Minassian, S., S. Rohall, and D. Gruen: 2004, ‘Lessons from the ReMail Prototypes’. Submitted to: ACM CSCW, Chicago, IL, USA.

Muller, M., W. Geyer, B. Brownholtz, E. Wilcox, and D. R. Millen: 2004, ‘One Hundred Days in an Activity-centric Collaboration Environment’. In: Proc. ACM SIGCHI, Vienna, Austria.

Munson, J. and P. Dewan: 1996, ‘A Concurrency Control Framework for Collaborative Systems’. In: Proc. ACM CSCW, Cambridge, MA, USA. pp. 278–287.

Neuwirth, C. M., R. Chandhok, D. S. Kaufer, P. Erion, J. Morris, and D. Miller: 1992, ‘Flexible Diff-ing in a Collaborative Writing System’. In: Proc. ACM CSCW, Toronto, Ontario, Canada. pp. 183–195.

Rohall, S. L., D. Gruen, P. Moody, and S. Kellerman: 2001, ‘Email Visualizations to Aid Communications’. In: Proc. IEEE InfoVis, San Diego, CA, USA. pp. 12–15.

Sun, C. and D. Chen: 2002, ‘Consistency Maintenance in Real-Time Collaborative Editing Systems’. ACM Transactions on Computer-Human Interaction 9(1), 1–41.

Sun, C., X. Jia, Y. Zhang, Y. Yang, and D. Chen: 1998, ‘Achieving Convergence, Causality Preservation and Intention Preservation in Real-Time Cooperative Editing Systems’. ACM Transactions on Computer-Human Interaction 5(1), 63–108.

Sun, C., Y. Yang, Y. Zhang, and D. Chen: 1996, ‘Distributed Concurrency Control in Real-Time Cooperative Editing Systems’. In: Proc. of the Asian Computing Science Conference, Singapore. pp. 85–95.

Vogel, J.: 2004, ‘Conflict Visualization for Collaborative Multi-user Applications’. Technical Report TR-04-003, Department for Mathematics and Computer Science, University of Mannheim, Germany.

Vogel, J. and M. Mauve: 2001, ‘Consistency Control for Distributed Interactive Media’. In: Proc. ACM Multimedia, Ottawa, Canada. pp. 221–230.

Vogel, J., M. Mauve, V. Hilt, and W. Effelsberg: 2003a, ‘Late Join Algorithms for Distributed Interactive Applications’. ACM/Springer Multimedia Systems 9(4), 327–336.

Vogel, J., J. Widmer, D. Farin, M. Mauve, and W. Effelsberg: 2003b, ‘Priority-Based Distribution Trees for Application-Level Multicast’. In: Proc. NetGames, Redwood City, CA, USA. pp. 140–149.

Whittaker, S. and C. Sidner: 1996, ‘Email Overload: Exploring Personal Information Management of Email’. In: Proc. ACM SIGCHI, Vancouver, BC, Canada. pp. 276–283.