
Future Generation Computer Systems 19 (2003) 849–859

Distributed, on-demand, data-intensive and collaborative simulation analysis

Arthurine Breckenridge a,∗, Lyndon Pierson a, Sergiu Sanielevici b, Joel Welling b, Rainer Keller c, Uwe Woessner c, Juergen Schulze c

a Sandia National Laboratories, Albuquerque, NM, USA
b Pittsburgh Supercomputing Center, Pittsburgh, PA, USA

c High Performance Computing Center, Stuttgart, Germany

Abstract

Distributed, on-demand, data-intensive, and collaborative simulation analysis tools are being developed by an international team to solve real problems such as bioinformatics applications. The project consists of three distinct focuses: compute, visualize, and collaborate. Each component utilizes software and hardware that performs across the International Grid. Computers in North America, Asia, and Europe are working on a common simulation program. The results are visualized in a multi-way 3D visualization collaboration session where additional compute requests can be submitted in real-time. Navigation controls and data replication issues are addressed and solved with a scalable solution.
Published by Elsevier B.V.

Keywords: Compute; Visualize; Collaborate; Data-intensive; Simulation analysis; Bioinformatics applications

1. Introduction

The international collaboration presented in this paper represents a snapshot in time of ongoing work. Since our team has proudly participated in all three International Grid conferences, the tremendous advantage of participation can be shown in the project chronology. Initially, the G7 Economic Summit defined a task force to enhance international networking and electronic collaborations. In 1996, the demonstration of our G7 team was presented only locally; no external-to-conference-exhibit network functionality was available. The various international high-speed links were lacking due to a physical gap in the wiring. In 1997, partially due to work on StarTap by Tom DeFanti 1 et al., the first complete transatlantic Asynchronous Transfer Mode (ATM) high-speed network connection (i.e., 1997 metrics, 90 Mbps) was established for the Supercomputing (SC'97) conference. In our demonstration, HLRS, SNL, and PSC 2 were networked. Next, a general call for participation was issued to have international projects demonstrated in a common booth, the first International Grid, at SC'98. Our team provided a two-way Trans-Atlantic immersive collaborative session including force feedback, based on a dataset distributed and computed at two supercomputer centers. Next, at the Second International Grid 2000, our team provided a three-way Trans-Pacific immersive collaborative session including force feedback, based on a dataset distributed and computed at seven supercomputer centers (see footnote 2) using the high-speed network (i.e., 2000 metrics, still 90 Mbps). Now, our accomplishments will be documented using the Third International Grid high-speed network (i.e., 2002 metrics, 10 Gbps).

∗ Corresponding author. E-mail addresses: [email protected] (A. Breckenridge), [email protected] (L. Pierson), [email protected] (S. Sanielevici), [email protected] (J. Welling), [email protected] (R. Keller), [email protected] (U. Woessner), [email protected] (J. Schulze).

1 Electronic Visualization Laboratory, University of Illinois at Chicago, http://www.startap.net.

2 Sandia National Laboratories (SNL), US; Pittsburgh Supercomputing Center (PSC), US; Manchester Computing Centre (MCC), England; National Center for High-Performance Computing (NCHC), Taiwan; High Performance Computing Center (HLRS), Germany; Tsukuba Advanced Computing Center (TACC), Japan; Japan Atomic Energy Research Institute (JAERI), Japan.

2. Related work

Our project consists of three distinct focuses: compute, visualize, and collaborate. The challenge is how to provide ubiquitous functionality by distributing, on-demand, data-intensive applications and then provide an environment for collaboration and analysis of the information. Numerous teams are working to provide innovations; related work in each area will be mentioned at the beginning of each subsection. Also, our tools can potentially be applied to any content area. Our team has demonstrated some of our previous iGrid work in volume, finite element, surface, and VRML rendering applied to architectural design. 3 At iGrid 2002, our demonstration application was bioinformatics, specifically intron/exon splice sites [7], graphically shown after computing the secondary structure with a parallel version of the RNAfold modeling and simulation code.

3 http://www.cs.sandia.gov/ilab and http://www.hlrs.de.

3. Focus area: compute

3.1. Related work

With contemporary computers, the ab initio computation of the folding of long primary sequences into their corresponding tertiary (3D) structures is prohibitive; the current status of which problems are tractable and/or NP-complete is given in Fig. 1. The degrees of freedom per base are just too high. Finding this geometry is essential for understanding the functionality. Since research suggests that folding from a single-stranded chain to the tertiary structure is a hierarchic process [2], the tertiary structure may be derived through the computation of the planar secondary structure. Still, the calculation of the secondary structure of RNA is computationally expensive. To speed up the calculation, high-performance parallel computers were used with the MPI-parallel version of the program RNAfold. This program was derived from the ViennaRNA package, developed at the University of Vienna [4,9], and improved within the framework of this project to be both efficiently parallel and grid distributable. For the minimum free energy of the sequence, all entries of two triangular matrices FB and FM of size (n × n), with n being the length of the sequence, have to be computed. The overall complexity of the algorithm is in O(n³). The memory requirement for the matrices in the minimum free energy calculation module is in O(n²). The work is equally divided among all processors along the diagonal of the matrices. In each of the n time steps i, each of the m processors computes (n/m − i) values. The memory requirement regarding the matrices within the minimum free energy calculation module is

M = \frac{4d^{2} + 2d(u_{\max} + 2)}{2N} \cdot \mathrm{sizeof(int)}\ \text{bytes};

additional memory is consumed by several linear arrays.
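To make the diagonal-wise work division concrete, the following C/MPI sketch (our own illustration, not the RNAfold source; the variable names and the example sequence length are assumptions) splits the entries of each diagonal of the triangular matrices evenly across the m ranks:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, m;
        const int n = 1000;                    /* example sequence length (illustrative) */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &m);

        /* d walks along the diagonals of the (n x n) triangular matrices;
         * the entries of each diagonal are split evenly across the m ranks. */
        for (int d = 1; d < n; ++d) {
            int entries = n - d;               /* entries on this diagonal */
            int chunk   = (entries + m - 1) / m;
            int first   = rank * chunk;
            int last    = (first + chunk < entries) ? first + chunk : entries;

            for (int i = first; i < last; ++i) {
                /* compute FB[i][i + d] and FM[i][i + d] here (omitted) */
            }
            /* after each diagonal, neighbouring ranks exchange the rows/columns
             * they need for the next one (see Section 3.2) */
        }

        MPI_Finalize();
        return 0;
    }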

3.2. Parallelism, communication speedup

Our research was applied to the RNAfold code, with minor enhancements to the input and output formats to allow alignment to a single coordinate space for comparisons in a spatial visualization program.

The serial version of the RNAfold program had a severe limitation on the number of bases that could be computed using the resources of one machine, such as memory. Also, the initial parallel version had not been kept up to date with the serial versions, since high communication requirements, as shown in Fig. 2, also severely limited the number of bases. After completing a diagonal, each processor previously did one send and one receive on the order of O(n).

Fig. 1. Status of simulations.

First, the serial version of the code with its improved energy parameters was ported to the parallel version. Second, our primary work included major enhancements to remove the limitation on the number of bases by re-programming the code using strong grid or distributed programming techniques. The pre-grid code either sends a row to or receives a column from its right neighbor, or sends a row to or receives a column from its left neighbor. Also, rows are completed on process 0, which then distributes each row to the corresponding storing process. To examine the communication pattern, the first computations were done with VAMPIR [8] on the Cray T3E installed at the High Performance Computing Center in Stuttgart. Fig. 2 shows the amount of communication between each of the eight processors calculating a 1000-base test case. The main communication pattern is between processors along the diagonal, sending the rows/columns needed by the adjacent processor in a non-blocking way, i.e., communication may be hidden by the computation of the diagonal. The number of messages was exorbitantly high, while the amount of data being sent is small; by far most messages sent were 40 bytes long, as seen in Fig. 3.

Fig. 2. Message statistics: data sent between processes (eight processes, 1000 bases).

Fig. 3. Message statistics: number of messages sent, sorted by length (eight processors, 1000 bases).
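The non-blocking exchange described above could look roughly like the following sketch (again our own illustration, not the actual code): the transfers for the next diagonal are posted first, the current diagonal is computed, and only then does the rank wait for completion, so the communication is hidden behind the computation.

    #include <mpi.h>

    void exchange_and_compute(int *row_out, int *col_in, int len,
                              int right_neighbor, int left_neighbor)
    {
        MPI_Request req[2];

        /* post both transfers before doing any work on the diagonal */
        MPI_Isend(row_out, len, MPI_INT, right_neighbor, 0,
                  MPI_COMM_WORLD, &req[0]);
        MPI_Irecv(col_in,  len, MPI_INT, left_neighbor, 0,
                  MPI_COMM_WORLD, &req[1]);

        /* ... compute this rank's share of the current diagonal here ... */

        /* the exchanged data is only needed for the next diagonal, so waiting
         * here overlaps the transfer with the computation above */
        MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
    }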

At the end of the computation, process 0 reassembles the folding of minimum energy using values of the matrices FB and FM. Sending a request of three integers to the process storing a particular value, which sends back one integer, was done several times. Furthermore, these inefficient communications were often requested twice; some values were asked for even more often. The maximum we observed in the 1000-base-pair test case was 700 times.

Our grid-enabled code improves the communication pattern to increase communication efficiency:

• Communication overhead was reduced by coalescing six messages into one message.

• Sends of completed columns from process 0 to the corresponding process storing the column were merged.

• Two “receive” operations unnecessarily used the MPI “any-source” semantics.

• Data that is sent over the network several times is now cached.

Benefits from caching the values were gained even on HPC machines with a fast internal network. Examination of the request pattern for the matrices FB and FM showed that the accesses follow a strict pattern. The FM access pattern may be seen in Fig. 4. It shows linear accesses with a temporal property in every possible direction (top to bottom and vice versa, and left to right and vice versa).

An algorithm was implemented for predicting a particular access pattern, prefetching a fixed number of values, and caching them. The access pattern of FB was different. The requested values did not follow a strict regional property; still, they were mostly within a square-sized boundary, as shown in Fig. 5. For this matrix, a fixed-sized square of values was requested and cached.
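The following sketch illustrates the kind of block prefetching and caching described above for remote FM values. It is a simplified, hypothetical client-side fragment: the block size, the message tags, and the owner_of() helper are our own assumptions, not part of the RNAfold implementation.

    #include <mpi.h>

    #define PREFETCH 64                /* illustrative block size */

    static int cache_vals[PREFETCH];   /* one cached run of matrix entries */
    static int cache_row = -1;
    static int cache_col = -1;         /* where the cached run starts */

    /* Hypothetical helper: rank that stores FM[row][col]. */
    extern int owner_of(int row, int col);

    int get_fm(int row, int col)
    {
        /* hit: the value lies inside the cached run */
        if (row == cache_row && col >= cache_col && col < cache_col + PREFETCH)
            return cache_vals[col - cache_col];

        /* miss: ask the owning rank once for PREFETCH consecutive values,
         * so repeated requests for nearby entries stay local */
        int req[3] = { row, col, PREFETCH };
        MPI_Send(req, 3, MPI_INT, owner_of(row, col), 0, MPI_COMM_WORLD);
        MPI_Recv(cache_vals, PREFETCH, MPI_INT, owner_of(row, col), 1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        cache_row = row;
        cache_col = col;
        return cache_vals[0];
    }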

3.3. Distributed computing

The enhanced MPI version of the program was run on multiple grid-enabled platforms distributed worldwide. In addition to the Stuttgart equipment (a Cray T3E and an Origin 2000), the Terascale Computing System (TCS) at PSC was used for our calculations. For example, the RNAfold calculation of the 78,067 bases of the human MSH6 gene ran in approximately 44 min on 512 processors of the TCS. With its 3000 Alpha EV68 processors, the TCS offers a peak performance of six TF. Not only does it offer the computational power, but it also fits the memory requirement of our code. For further tests, the TCS will also be integrated into the International Computational Grid at SC2002 using PACX-MPI [3].


Fig. 4. Access pattern of matrix FM.

Fig. 5. Access pattern of matrix FB.


4. Focus area: visualize

4.1. Related work

The current RNAfold basically had two types of outputs: (a) a PostScript file and/or (b) a circular visual representation, limited to 3000 bases [9]. Both representations could be viewed with traditional 2D image tools such as the Adobe™ suite or Acrobat readers. Our team created an extended output file format that contains the following information (a sketch of a matching record structure follows the list):

• Comments for locus, description, and accession.
• Original primary sequences.
• Calculated secondary sequence in parenthesis/dot syntax (i.e., ((..).).).
• Coordinates in the X,Y plane.
• Pairs of nucleotides.
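As referenced above, a C record structure matching these fields might look like the following sketch; the field names and types are our own assumptions, not the actual file layout.

    /* Illustrative only: a record matching the fields of the extended
     * output format listed above. */
    typedef struct {
        char  *locus;         /* comment: locus */
        char  *description;   /* comment: description */
        char  *accession;     /* comment: accession */
        char  *primary;       /* original primary sequence */
        char  *secondary;     /* parenthesis/dot string, e.g. "((..).)" */
        float *x, *y;         /* per-base coordinates in the X,Y plane */
        int  (*pairs)[2];     /* indices of paired nucleotides */
        int    n_bases;       /* sequence length */
    } RnaFoldRecord;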

4.2. Graphical user interfaces (GUIs)

To visualize the folded secondary structure, the project used a common plug-in architecture in two separate 3D OpenGL visualization frameworks. The contribution from Sandia was a desktop application with a haptics interface, where the sense of touch was mainly used for six-degrees-of-freedom navigation. The plug-in was added to the e-Touch™ Graphical/Haptical User Interface [1]. The contribution from HLRS was an immersive virtual reality application with head tracking, used in the CAVE and/or on high-resolution power walls. The VR application was developed as a plug-in for COVER [6], the VR component of the collaborative visualization and simulation framework COVISE [5]. Both graphical environments allow easy extension through a flexible plug-in system and utilize a common set of visualization functionality (Fig. 6).

4.3. Common plug-in functionality

The customer can interact with the visualization through a 3D menu system and through direct interaction (i.e., 3D widgets, touch activated) with the abstract nucleotide representation of the objects. New components can be dynamically loaded into the system during runtime. The user can select multiple solution files and place them side by side or one over the other to analyze them visually. One can select from a variety of different display styles to differentiate between bond types or exon and intron regions. Each dataset maintains all the original primary sequence information, with menus to show one base, groups of bases, or all bases if desired. Advanced visualization features for level of detail and areas of enhanced functionality are applied. Optimization of the data representation ranges from two triangles for each nucleotide to 400 triangles for a complete XML/X3D representation of a base.

Fig. 6. Sample screen from the e-Touch™ GHUI.

4.4. Unique plug-in functionality/computational steering

An additional client/server plug-in was developed to allow the analysis of folded pre-mRNA or mRNA. The RNAfold simulation program, acting as the client, connects to the graphics application, in this demonstration COVER, acting as the server. The 3D GUI allows the customer to select between pre-computed results or to couple to a running simulation of parallel RNAfold and receive real-time computed results. Whenever the RNAfold simulation code is started without a specific dataset to compute and with a certain environment variable set, it connects to a visualization machine and waits for a simulation task to be scheduled for it. The customer in the VR environment can select from a list of available simulation hosts and assign a specific RNA sequence to these computers for simulation. As soon as the computation finishes, the result is transferred to the visualization machine and is available for display. The result is also stored on the visualization machine's disk so that it can later be viewed offline. Multiple simulations can connect to a visualization machine dynamically and simultaneously, which allows the customer to submit a whole series of jobs. Also, the GUI is available for the analysts to choose all parameters of RNAfold, such as temperature, lonely pairs, or disallowing G–U bonding. During runtime, the simulation reports its progress to the visualization machine with a percentage-complete widget, so the analyst knows when to expect the result of a certain simulation. The GUI thus has two grid functions: easy submission of batch jobs to the compute nodes and computational steering of smaller sequences in real-time.
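A minimal sketch of the client side of this coupling is given below: the simulation connects to the visualization host, waits for a task, and periodically reports a percentage-complete value. The host address, port, and message format are hypothetical; the actual protocol between RNAfold and COVER is not specified here.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in vis = { .sin_family = AF_INET,
                                   .sin_port   = htons(31000) };  /* assumed port */
        inet_pton(AF_INET, "192.0.2.10", &vis.sin_addr);          /* placeholder address */
        if (connect(fd, (struct sockaddr *)&vis, sizeof vis) < 0)
            return 1;

        char sequence[65536];
        ssize_t n = recv(fd, sequence, sizeof sequence - 1, 0);   /* wait for a task */
        if (n <= 0)
            return 1;
        sequence[n] = '\0';

        for (int percent = 0; percent <= 100; percent += 10) {
            /* ... fold the next part of the sequence here ... */
            char msg[32];
            snprintf(msg, sizeof msg, "PROGRESS %d\n", percent);
            send(fd, msg, strlen(msg), 0);  /* drives the percent-complete widget */
        }
        /* ... send the resulting structure/coordinates back here ... */
        close(fd);
        return 0;
    }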

4.5. Unique plug-in functionality/automatic feature detection

An additional hardware plug-in was developed to allow automatic feature detection of folded pre-mRNA or mRNA. The haptics plug-in was developed to show a visual “Start Here” arrow. The arrow has a magnetic field to draw the analyst to the beginning of a sequence in a 3D environment. Thus, the human's unique skill to detect sequences and motifs is used. The arrow widget allows for easy manipulation of one sequence for alignment with another.

5. Focus area: collaborate

5.1. Related work

In addition to computing and visualizing the RNA structure, the primary goal was to gain insight into the information. The project concentrated on linking biologists from around the world to perform analysis, and then addressed the extended functionality needed in a collaborative environment.

5.2. Tele-presence issues

Each International Grid partner shared a base set of functionality for collaboration. For audio and video conferencing, the AccessGrid infrastructure 4 was used.

In the COVER and e-Touch™ environments, audio conferencing was used extensively to communicate with other analysts. Traditional video conferencing was not activated in either environment; instead, in the COVER environment, a graphical avatar representation was used, which consisted of stereo glasses, a hand with a tracking pointer, and a grid on the floor (Fig. 7).

Fig. 7. Avatar.

5.3. Navigation control issues

Support for collaborative work means that multiple people at different sites can join a visualization session, either from a desktop or from a virtual environment like a CAVE or a power wall. The results of an online simulation are transferred automatically to all participants in a collaborative session. The same applies to stored datasets: if they are not available at one of the participating sites, they are transferred automatically and stored for later use. In COVER, different collaboration modes that offer different strengths of synchronization are implemented.

4 http://www.accessgrid.org.


Fig. 8. Collaboration environment at International Grid 2002.

Master/slave: In this mode, all virtual worlds are tightly coupled and only one of the participants, the master, is allowed to interact with both the user interface and the objects in the scene. This mode is especially well suited for presentations.

Tight coupling: This mode also synchronizes all features of the application and the user interface, but it allows all collaborators to interact. This mode works well for joint work on a small dataset, like a smaller gene sequence, or if all participants want to have an overview of a large dataset.

Loose coupling: The state of the virtual world stays synchronized, but different users can have different views of the dataset. All partners are allowed to navigate independently of each other, and not all features of the application are synchronized. The different users are displayed as avatars.
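The three modes could be captured by a small piece of state such as the following sketch (illustrative only, not COVER's API), which gates who may interact and whether navigation stays synchronized:

    /* Illustrative sketch of the three coupling modes described above. */
    typedef enum { MASTER_SLAVE, TIGHT_COUPLING, LOOSE_COUPLING } CouplingMode;

    /* May this participant interact with the scene and the user interface? */
    int may_interact(CouplingMode mode, int is_master)
    {
        return mode != MASTER_SLAVE || is_master;
    }

    /* Is the camera/navigation state forced to be identical everywhere? */
    int navigation_synchronized(CouplingMode mode)
    {
        return mode != LOOSE_COUPLING;   /* loose coupling: independent views */
    }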

5.4. Collaborative analysis scenario

At International Grid 2002, our demonstration coupled three CAVEs in Amsterdam (SARA), Chicago (EVL), and Stuttgart (HLRS). In this collaborative session, analysts connected online to supercomputers at PSC, SARA, and HLRS. The amount of data transferred for synchronization of the three CAVEs was about 40 Kbps; additional bandwidth was used for audio conferencing (∼250 Kbps) and optionally video conferencing (∼400 Kbps). Network bandwidth is not a central issue for the navigation controls of an application, while latency definitely has to be taken into account. To deal with this, we implemented our navigation and object manipulation tasks in such a way that one partner never has to wait for acknowledgements from all partners to start an interaction (Fig. 8).

5.5. Data replication issues

Another major issue addressed by this project in building a collaborative environment is access to the data. Our team's robust multi-connection virtual environment has for several years fully replicated the data at each site. This severely complicates the administration of such an environment due to the level of hardware and hardware support needed at each site. This issue has been identified, and an initial solution has been prototyped that sends only a completely rendered graphical scene via multicast to all participants. Sandia-developed hardware was demonstrated to implement a low-latency digital video compression system for remote interactive visualization (Fig. 9).

Fig. 9. Interactive remote visualization hardware operational concept.

This system receives high-resolution DVI (or RGB) video data, compresses and formats the data for transmission over an IP/GBE network, and decompresses and displays the data on a DVI (or RGB) monitor at the distant end. The firmware currently supports the 1280 × 1024 × 60 Hz format, yet the hardware can support formats up to 1920 × 1024 × 30 Hz. The equipment captures video data directly from any video graphics accelerator and transports this data in compressed form over a Gigabit Ethernet network using standard Internet protocols. The equipment supports low-latency, interactive visualization by transporting keyboard and mouse inputs from the remote end back to the originating computer. Eventually, the participants will have a simple display, an interaction device, and the DVI/GBE decoder at each site.

The interactive remote visualization hardware system is designed to allow the remote visualization user to see a computer image as if sitting at the computer console. For a great many applications, one must have confidence that the image is a true representation of the data and not an artifact of the compression. The compression algorithm implemented delivers a faithful pixel-by-pixel rendering of the original image without compression artifacts. Building and maintaining hardware resources such as ultra-computers at some locations can frequently be impractical. For example, a viable solution may be to perform data computation and rendering at an existing supercomputing site, and to then transfer the rendered images to the remote location. However, problems arise when this is attempted because it is difficult to provide transfers of a quality resolution and with an acceptable display frame rate. The extent of user interactivity, and consequently of usability, of these transfers depends not only on the frame rate, but also on the delay with which the images are presented. The problem lies in developing an apparatus to handle the transfer of this information at a speed acceptable for human interaction while retaining images of high quality. Equipment currently available delivers either low-latency, low-resolution images, such as video conferencing systems, or high-latency, high-resolution images, such as those used by entertainment systems. The Sandia-developed hardware allows for the transfer of such information while maintaining an acceptable frame rate and resolution, as well as interaction with the visual data from a remote location.

Sandia's early experiments in this area used commercial hardware with frame updates at 30 fps, but at the relatively low resolution of 640 × 480 pixels in interlaced NTSC format. At this stage, the equipment transferred compressed video over a network at 8 Mbps. An upgraded implementation utilized four independent video feeds, which raised the resolution to 1280 × 960 pixels. However, this arrangement proved to be expensive and difficult to set up. The spatial reconstruction and color matching of the four quadrants of the image required careful and frequent adjustment. Switching from the NTSC format to the European PAL standard provided better color matching. This early hardware was developed to improve resolution and interactivity, and to eliminate the use of four separate feeds of the lower 640 × 480 resolution in the previous commercial hardware implementation. The conversion to analog RGB from the previous PAL format was made at this stage in the development to reduce scan conversion losses incurred primarily from interlaced sampling. The early prototype evolved into the hardware demonstrated at iGrid 2002, with several changes to the hardware. The current hardware interfaces to both Gigabit Ethernet and ATM networks, and incorporates both RGB and DVI video interfaces. The current hardware accepts keyboard and mouse input for transport to the remote location.


5.6. Remote visualization scenario

Using the interactive remote visualization hardware, RNA folding visualization generated in Chicago was manipulated from Amsterdam, and vice versa. The equipment achieved compression ratios of approximately 20:1 on complex changes. The 2.5 Gbps (1280 × 1024 × 60 Hz) of screen data taken directly from the GeForce4 graphics card's DVI interface was compressed into a 2/10 Gbps portion of an IP data stream carried over the Gigabit Ethernet interface. Of the 98 ms network round-trip delay between Chicago and Amsterdam, only 30 ms were added by the compression.
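As a rough check of the quoted raw rate (assuming approximately 32 bits transferred per pixel, an assumption of ours rather than a figure from the paper):

1280 \times 1024 \times 60\,\mathrm{Hz} \approx 7.9 \times 10^{7}\ \text{pixels/s}, \qquad 7.9 \times 10^{7}\ \text{pixels/s} \times 32\ \text{bit/pixel} \approx 2.5\ \text{Gbps}.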

6. Conclusion

The overall demonstration was very successful. The compute, visualize, and collaborate components all functioned as a unit. In establishing the connections, audio services with remote partners were very important. The compute component needed to modify the base application to be efficient for grid computing. The visualize and collaborate components have established a valid environment to allow the biologist to fully examine RNA secondary structures in the COVER and e-Touch™ environments. Key issues in scaling and making this research readily available have been identified.

Acknowledgements

Special acknowledgements are in order for the networking and system administration teams at SARA that provided local support at the conference, and for the remote networking and system administration teams at HLRS, PSC, and Sandia. Special mention is in order for Alan Verlo, EVL, who provided local and remote support.

This work done by HLRS has been funded by the DAMIEN project (IST-2000-25406) and by the collaborative research centers (SFB) 374 and 382 of the German Research Council (DFG).

The computations were performed, in part, on the National Science Foundation Terascale Computing System at the Pittsburgh Supercomputing Center. The Pittsburgh Supercomputing Center is a joint effort of Carnegie Mellon University and the University of Pittsburgh together with the Westinghouse Electric Company. It was established in 1986 and is supported by several Federal agencies, the Commonwealth of Pennsylvania, and private industry.

Special acknowledgements are in order for the Sandia hardware design team, consisting of Perry J. Robertson, Karl Gass, Lyndon Pierson, Ron Olsberg, John Eldridge, Tom Tarman, Tom Pratt, Ed Witzke, John Burns, and Larry Puckett.

The work at Sandia National Laboratories has been funded by the Department of Energy's Advanced Simulation Computing Program (ASCI) and Mathematics, Information, and Computer Science (MICS). Sandia is a multi-program laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under contract DE-AC04-94AL85000.

References

[1] T. Anderson, FLIGHT: a 3D human–computer interface and application development environment, in: Proceedings of the Second PHANTOM Users Group Workshop, Cambridge, MA, 1997.

[2] R.L. Baldwin, G.D. Rose, Is protein folding hierarchic? Localstructure and peptide folding, January 1999.

[3] E. Gabriel, M. Resch, T. Beisel, R. Keller, Distributed computing in a heterogeneous computing environment, in: Lecture Notes in Computer Science, Springer, Berlin, 1998, pp. 180–188.

[4] I.L. Hofacker, W. Fontana, L.S. Bonhoeffer, M. Tacker, P. Schuster, Vienna RNA Package. http://www.tbi.univie.ac.at/∼ivo/RNA/.

[5] D. Rantzau, U. Lang, R. Rühle, Collaborative and interactive visualization in a distributed high performance software environment, in: Proceedings of the International Workshop on High Performance Computing for Graphics and Visualization, Swansea, Wales, 1996.

[6] D. Rantzau, K. Frank, U. Lang, D. Rainer, U. Wössner, COVISE in the CUBE: an environment for analyzing large and complex simulation data, in: Proceedings of the Second Workshop on Immersive Projection Technology (IPTW'98), Ames, IA, 1998.

[7] M. Spingola, L. Grate, D. Haussler, M. Ares, Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae, RNA 5 (1999) 221–234.

[8] Pallas GmbH, VAMPIR—visualization and analysis of MPI resources. http://www.pallas.com/e/products/vampir.

[9] M. Zuker, The use of dynamic programming algorithms in RNA secondary structure prediction, in: Waterman, M.S. (Ed.), Mathematical Methods for DNA Sequences, CRC Press, Boca Raton, FL, 1989, Chapter 7, pp. 159–184.


Arthurine Breckenridge is a Principal Member of Technical Staff in the Visualization Department at Sandia National Laboratories in Albuquerque, New Mexico. Arthurine has been involved in many aspects of networking and visualization during her 16 years at Sandia. Her work has won recognition with the Interop Award, and she is a leading researcher in haptics technology. She holds BS and MS degrees in computer science from Oklahoma State University.

Lyndon Pierson is a Senior Scientist in the Advanced Networking Integration Department at Sandia National Laboratories in Albuquerque, New Mexico. Lyndon has been involved in many aspects of computer and communication security during his 27 years in inter-site secure communications at Sandia. His work in high-speed encryption won recognition with an R&D 100 Award in 1991 and again in 1996. His current interests include the scaling of wide-area communication technologies to achieve Supra-Gigabit/second secure network performance. He holds a BSEE from New Mexico State University and an MSEE from Stanford.

Sergiu Sanielevici is the Assistant Director of Scientific Applications and User Support at the Pittsburgh Supercomputing Center. He is responsible for high-performance, scalable, parallel code development.

Joel Welling is a senior scientific software developer in biomedical and scientific visualization at the Pittsburgh Supercomputing Center. He is specifically interested in distributed volume-rendering applications.

Rainer Keller is now a Research Scientist at the High Performance Computing Center, Stuttgart (HLRS). He received his diploma degree in computer science at the University of Stuttgart in 2001. Since then, he has been working on his PhD in the working group "Parallel and Distributed Systems" on the metacomputing library PACX-MPI. His research interests include MPI, parallel I/O, networking, and bioinformatics in general.

Uwe Woessner received his Diploma degree in mechanical engineering from the University of Stuttgart in 1999. From 1993 to 1999, he was working in several European projects as well as the G7 GWAAT Project together with Sandia National Labs. Since 1996 he has been working in the Collaborative Research Center Rapid Prototyping, established at the University of Stuttgart, in the field of VR-based virtual prototyping. He is now Head of the Hybrid Prototyping Working Group, which focuses on the integration of physical and virtual prototypes as well as collaborative virtual environments.

Juergen Schulze received his Master's degrees in computer science from the University of Massachusetts at Dartmouth (1998) and from the University of Stuttgart (1999). He has been a PhD student in the visualization group at the High Performance Computing Center in Stuttgart since 1999. His work is focused on volume rendering in virtual environments. It involves the development of rendering routines on graphics hardware and remote rendering using parallel computers.

