Computers & Graphics 28 (2004) 15–24
ARTICLE IN PRESS
*Corresponding author. Tel.: +49-6151-155-645; fax: +49-6151-155-139. E-mail address: [email protected] (J. Sahm).
0097-8493/$ - see front matter © 2003 Elsevier Ltd. All rights reserved. doi:10.1016/j.cag.2003.10.014
Efficient representation and streaming of 3D scenes
J. Sahm*, I. Soetebier, H. Birthelmer
Fraunhofer Institut f .ur Graphische Datenverarbeitung, Abteilung f .ur Animation und Bildkommunikation,
Fraunhoferstr. 5, Darmstadt 64283, Germany
Abstract
In the last few years, several approaches (Proceedings of the Second International Workshop on Distributed Interactive Simulation and Real-Time Application, 1998, pp. 88–91; EUROGRAPHICS 2001, vol. 20(3), 2001) have been presented that address the transmission and visualization of 3D scenes on distributed devices with different capabilities. These techniques are of interest to a wide range of applications such as virtual chatrooms, product presentations, CAVE visualizations, 3D simulations, or 3D online games. In order to adapt the immense amount of data to the capabilities of the devices and the networks, two basic problems have to be solved: the selection of the information according to the user's interest and the reduction of the selected data. Usually the first problem is restricted to visual aspects, which can be handled by visibility culling algorithms. This paper concentrates on the second problem and introduces a system to stream the data of even large 3D scenes to remote devices.
© 2003 Elsevier Ltd. All rights reserved.
Keywords: Curve, surface, solid, and object representations; Distributed data structures; Distributed/network graphics; Distributed
virtual environment
1. Introduction
Due to advances in 3D visualization resulting from more powerful hardware, distributed 3D graphics has become more and more interesting for a wide range of industrial activities. Consequently, applications such as virtual chatrooms, product presentations, CAVE visualizations, 3D simulations, or 3D online games have become an important economic factor.
Unfortunately, most of these techniques have their drawbacks: virtual chatrooms [1] typically lack graphics quality and detail, product presentations
concentrate on very simple 3D scenes with only a few
elements [2,3], CAVE visualizations require very ex-
pensive hardware, and 3D simulations and 3D online
games [4] do not make use of the transmission of
graphical information. Although most of these applica-
tions provide level of detail (LOD) concepts, they do not support an efficient and exact adaptation of the amount of data and the data itself to the capabilities of the
devices. For that reason mobile devices such as laptops
or palmtops are still not considered by many software
systems. It is just as unsatisfying if the capabilities of the user's PC or workstation are not fully exploited.
In order to adapt the data to capabilities such as
memory size, computing power, graphics support, and
bandwidth, two basic approaches can be identified. The
first approach is to select for transmission or visualization only those small parts of the scene which are currently in the user's interest [5]. Usually this problem is restricted to visual aspects, so visibility culling algorithms [6] can
be used for the computation of these areas. The second
approach is to reduce the amount of data by modifying the
elements of a scene, for example with the help of multi-
resolution techniques [7]. This paper concentrates on the
second approach and introduces a client–server system
for the representation, transmission, and visualization of
distributed 3D scenes even on mobile devices.
1.1. Content
The rest of the paper is structured as follows: First, an
overview of related work is given. This is followed by the
concept, which starts with the requirements of the
system. Furthermore, the concept explains how the system fulfills these requirements. After that, the
implementation will be discussed, followed by the results
in comparison to other approaches. Finally, the paper
ends with a conclusion.
1.2. Related work
There are already some approaches to distribute 3D
graphics over networks. Such 3D environments depend heavily on the network connection, so it is an important design criterion to represent scene information in an
efficient way to keep network traffic as low as possible.
Furthermore, the representation has to adapt to the
capabilities of the client devices.
Some virtual environments use a peer-to-peer-based
network topology. Because there is no central server, the
management of the scene is distributed on all devices
which are involved in the virtual environment. An example of such an environment is NPSNET [8]. It was developed for a large number of users in military training and is based on the DIS protocol, which uses multicast IP. It uses a paradigm called players and ghosts, where each participant controls its own 'player', which is replicated on all other participants as a 'ghost'. An advancement of NPSNET is Bamboo [9]. Another peer-to-peer-based environment is presented by Broll [10].
The ASCII-based virtual reality modeling language
(VRML) is used as scene description. The scene
description of the distributed interactive virtual environ-
ment ðDIVEÞ [11] is stored in a database, which isreplicated on each client. The scalable platform for large
interactive network environments ðSplineÞ [12] is a toolkitfor creating large scale multi-user environments. Its
world model is stored in an object-oriented database. A
remarkable application built with Spline is Diamond-
Park [12]. The virtual environment operating shell
ðVEOSÞ [13] is a multi-user environment, which uses apeer-to-peer architecture without the multicast option. It
is a system for rapid prototyping of virtual environ-
ments.
Distributed 3D graphics environments, which are
based on client–server communication, are able to use a
central scene description. One example is the AVIARY VR system presented by Snowdon et al. [14]. The virtual society system from Lea et al. [15] is another client–server-based virtual environment. Its 3D worlds are designed using VRML and are distributed using its own communication protocol called the virtual society client protocol (VSCP). A programming toolkit for creating multi-user environments is the MR ToolKit [16]. For distributing the 3D world information it uses a client–server-based shared-memory abstraction. MacIntyre et al. [17] proposed COTERIE. It distributes shared objects by replication using a client–server topology.
Another virtual environment for a large number of users
using client–server communication is Ring presented by
Funkhouser [18]. The network graphics framework (NGF) [19] is an adaptive framework for transmitting 3D graphics over networks. It considers several properties, such as the capabilities of the network, server, and client or user preferences, to choose the appropriate transmission method. A client–server-based approach with a central scene description is also presented by Teler et al. [20]. In contrast to other virtual environments, it transfers an impostor-based representation of the scene's objects to achieve a fast response to the user's navigation. HOUCOM [21] is a framework for
creating cooperative applications and groupware appli-
cations. For this, it provides basic services, that can be
extended by an application developer with application-
specific functionality. A possible application could be a cooperative virtual environment, but in contrast to the presented system, HOUCOM is not specialized in
distributing 3D information to several user devices with
different capabilities.
Commercial frameworks for virtual environments are, for example, the WorldToolKit [22] by Sense8 and
OpenGL Vizserver by SGI [23]. Together with the
World2World and the WorldUp extensions, the World-
ToolKit provides a development environment for client–
server-based virtual environments. A different approach
is used by OpenGL Vizserver. Though it is a client–server solution, all rendering is done on the server, which is an SGI supercomputer. The images are then transferred to the client. This offers high-quality graphics, but the approach needs a very powerful and therefore expensive server, especially if the number of concurrent clients is very high.
Some general research on virtual environments is presented by Benford et al. [24]; a survey of existing virtual environments was presented by Meehan [25].
2. Concept
This section introduces the concept of the client–
server system.
2.1. Requirements
The basic requirement is the transmission of a 3D
scene from a server system to multiple clients. The
complexity of the 3D scenes should range from only a few elements up to several thousand. Elements can represent simple primitives such as lines, triangles, or cubes, but also more complex structures such as houses, cars, or robots. In contrast to the approaches of
transferring server-rendered frames or detail levels (see
[20,23]) the system should transmit the elements’ data
containing vertices, normals, textures, colors, etc. There
are several reasons for this requirement: the rendering of complete frames or even of the elements' lowest level of detail demands excessive computing power if applied for multiple clients. With each registered client the load of the server increases dramatically, so a very expensive graphics-able server is necessary. Furthermore, the transmitted frames or bitmaps represent only 2D information and lack interaction and transformation possibilities.
Another important requirement is the ability to adapt
the amount of scene data to the capabilities of the clients
precisely. For that reason the server should not only
provide some precalculated levels of detail for each
element, but a scene representation, which allows a fast
and accurate on-the-fly-determination of the best fitting
data set for a specific client. As a consequence of the adaptation requirement, the clients usually hold only a
subset of the server’s scene information. This subset
changes dynamically with the user’s view position and
view direction.
Since even the complete transmission of the optimal data set may require several seconds, it should be possible to visualize the received data on the client side before the transmission has finished. So the scene and element representations have to be refineable and renderable simultaneously on the client side in real time.
2.2. Element representation
An element of a 3D scene can contain multiple pieces of information representing different media types such as vertices, normals, texture coordinates, textures, or
colors. While vertices, normals, and texture coordinates
belong to the geometrical information, textures and
color arrays describe image information. Another
important information is the connectivity of the vertices,
which is denoted as the topological information of an
element in the following.
The visual appearance of a single element is described with the help of an Element Graph.

Fig. 1. An example of an Element Graph. Since the Vertex Array Node and the Normal Array Node represent geometrical information, they have the same color. While the Indexed Triangle Set Node references the topological information, the Transform Node contains a transformation matrix. The Group Node works similarly to the group concept of Open Inventor.

The idea behind the
Element Graph is similar to other scene graph APIs such
as Open Inventor [26] or IRIS Performer [27]. Each node
of the Element Graph represents a specific data type, e.g.
the Vertex Array Node in Fig. 1 points to the vertices of
the element. Properties such as normals or texture
coordinates can be defined per vertex or per facet, which is indicated by different node types. While Array Nodes
reference per vertex information, Facet Nodes point to
per facet data (e.g. the Color Facet Node in Fig. 1). In
order to render an Element Graph to file, network, or
frame buffer, the Element Graph is traversed in a DFS
order from left to right. Each time a node is visited its
state is set and remains valid until it is redefined by
another visited node of the same type. This principle matches state engines such as OpenGL.
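As an illustration of this traversal principle, the following sketch (all node and type names are invented for this example and are not taken from the paper's implementation) walks a small Element Graph depth-first from left to right; each visited node redefines the state of its own data type, and a topology node emits a draw call from the currently bound state:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Mimics a state engine such as OpenGL: a data type (e.g. "VertexArray")
// maps to the currently bound data set until redefined.
struct RenderState {
    std::unordered_map<std::string, std::string> bound;
};

// Simplified Element Graph node: a type, a reference into a pool,
// and child nodes (only Group Nodes own children in this sketch).
struct Node {
    std::string type;
    std::string dataRef;
    std::vector<Node> children;
};

// Depth-first, left-to-right traversal: set this node's state, emit a
// draw call when a topology node is reached, then recurse.
void traverse(const Node& n, RenderState& state,
              std::vector<std::string>& drawCalls) {
    if (!n.type.empty() && n.type != "Group")
        state.bound[n.type] = n.dataRef;   // redefine state of this type
    if (n.type == "IndexedTriangleSet")    // topology triggers drawing
        drawCalls.push_back(state.bound["VertexArray"] + "+" + n.dataRef);
    for (const Node& c : n.children)
        traverse(c, state, drawCalls);
}
```

The left-to-right order matters: the Vertex Array Node must be visited before the Indexed Triangle Set Node so that the topology is drawn with the correct vertex state.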
2.3. Scene representation
Since a 3D scene may contain thousands of elements,
the scene representation has to manage an according
number of Element Graphs. In order to avoid data
redundancy between the elements of a 3D scene, the
nodes of an Element Graph do not contain the data
directly (with the exception of Group Nodes and
Transform Nodes). Instead the information is stored in
several pools and the nodes only reference the entries of
these pools. For example, multiple Vertex Array Nodes
of different Element Graphs can share the identical
vertex data by pointing to the same pool entry as
illustrated in Fig. 2. According to the node types of the
Element Graph the system provides appropriate pool
types, e.g. a Geometry Pool for vertices, normals, or
texture coordinates, a Texture Pool for textures, a Color
Pool for colors, and a Topology Pool for the topological
information. In concern to the basic requirement of
Section 2.1 the pools provide a very helpful feature,
because they encapsulate the complete visual informa-
tion of a 3D scene in a compact and memory efficient
manner. So if the system is able to transmit the content
of the pools efficiently, then the requirements are
GroupNode
ndexediangleSetNode
TransformNode
and the Normal Array Node represent geometrical information,
nces the topological information, the Transform Node contains
concept of Open Inventor.
ARTICLE IN PRESS
GroupNode
VertexArrayNode
ColorArrayNode
NormalArrayNode
IndexedTriangleSet
Node
GroupNode
VertexArrayNode
TextureNode
TextureCoordFacet
Node
IndexedTriangleSet
Node
GeometryPool TopologyPoolColorPool TexturePool
ElementE2
ElementEn
ElementE1
3D Scene
Fig. 2. The two Element Graphs are identical with the exception, that the left Element Graph is colored per facet and the right one is
textured. So these Element Graphs can share the same geometry and topology information. Multiple elements are allowed to reference
the same Element Graph.
J. Sahm et al. / Computers & Graphics 28 (2004) 15–2418
almost fulfilled. Important information outside of the
pools are the element specific data such as the element’s
identification number (ID) and the Element Graph.
Since the clients typically hold only a subset of the 3D
scene (see Section 2.1), is not necessary to transmit the
elements’ spatial arrangement inside of the server’s scene
representation (e.g. a k-D-tree or an octree [28]). In fact,
the clients manage their own spatial arrangement.
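The pool concept can be sketched as follows; the names `GeometryPool`, `EntryId`, and the `add` interface are hypothetical and chosen only for this example, under the assumption that entries are keyed by an ID (the paper later uses a 64-bit checksum for this):

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using EntryId = std::uint64_t;

// A pool stores each data set exactly once; nodes of different Element
// Graphs reference entries by ID instead of owning copies.
struct GeometryPool {
    std::unordered_map<EntryId, std::vector<float>> entries;

    // Inserting an already-known ID is a no-op, so identical data
    // submitted by several elements is stored only once.
    EntryId add(EntryId id, std::vector<float> data) {
        entries.emplace(id, std::move(data));
        return id;
    }
    std::size_t size() const { return entries.size(); }
};

// A node holds a reference into the pool, not the data itself.
struct VertexArrayNode {
    EntryId vertices;
};
```

Two Vertex Array Nodes of different Element Graphs that reference the same ID therefore share one physical vertex array, which is exactly the redundancy avoidance illustrated in Fig. 2.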
2.4. Server scene preparation
Because of the adaptation requirement in Section 2.1, the server's scene representation has to be prepared in order to determine the optimal data set for each registered client quickly and accurately. This preparation
is done by a progressive simplification algorithm, which
traverses all Element Graphs of the given 3D scene. The
algorithm memorizes all pool entries referenced by the
visited nodes of the current Element Graph. Each time a
node with topological information is reached, the
simplification is started. It is possible to apply the
preparation not only as a precalculation process but also
during runtime. As a consequence, the simplification algorithm has to be very fast (see (13)). In return the
system is able to add new Element Graphs or to modify
existing Element Graphs and their information on-the-
fly. A restriction of the preparation is that the elements have to be represented by triangle meshes, i.e. the nodes
with the topological information inside of the Element
Graphs have to be Indexed Triangle Set Nodes.
The simplification is based on an edge-collapse operation, which removes one vertex and two triangles per simplification step from the mesh
represented by the current Vertex Array Node and the
current Indexed Triangle Set Node of the Element
Graph. If the Element Graph references additional data
such as normals or colors, then the per vertex informa-
tion is treated in the same way as the vertices and the per
facet data in the same way as the triangles. In order to
determine the removal sequence (priority queue) of the vertices, an error value is calculated for each vertex with the help of an error metric. The vertices with the lowest error values are removed first, followed by a recalculation of the error values of the remaining vertices affected by this operation. For performance reasons, the calculation of the error metric has to be very cheap.
Fig. 3 outlines how the error metric calculates an error
value for each vertex. First, the normals of the triangles are calculated. They are not normalized; the length of each normal vector is proportional to the area of the triangle. Then, the normals are moved to one point, and an axis-aligned bounding box containing all
normals is defined. For this bounding box, the diagonal and the volume are calculated. The error value results from the diagonal, the volume, and the number of triangles that surround the vertex:

e_vertex = (V_BoundingBox^2 · L_Diagonal^2) / (number of triangles)

Fig. 3. The left side illustrates the metric used for calculating an error value per vertex. The right side shows the basic operation of the simplification process, the edge collapse.

Fig. 4. All data structures that are not colored are created temporarily by the simplification. The colored structures represent the pool entries. The Pool Triangle Array is a pool entry inside of the Topology Pool and contains indices to the vertices inside of the Pool Vertex Array.
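Reading the formula as the squared bounding-box volume times the squared diagonal, divided by the number of surrounding triangles (the operator glyph is garbled in the source, so the product is an assumption), the per-vertex error could be computed like this:

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Vec3 = std::array<double, 3>;

// normals: one unnormalized, area-proportional normal per triangle that
// surrounds the vertex. The error combines the axis-aligned bounding box
// of these normal vectors (volume and diagonal) with the triangle count.
double vertexError(const std::vector<Vec3>& normals) {
    Vec3 lo = normals.front(), hi = normals.front();
    for (const Vec3& n : normals)
        for (int i = 0; i < 3; ++i) {
            lo[i] = std::min(lo[i], n[i]);
            hi[i] = std::max(hi[i], n[i]);
        }
    const double dx = hi[0] - lo[0], dy = hi[1] - lo[1], dz = hi[2] - lo[2];
    const double volume = dx * dy * dz;         // V of the bounding box
    const double diag2 = dx*dx + dy*dy + dz*dz; // squared diagonal L^2
    return volume * volume * diag2 / static_cast<double>(normals.size());
}
```

The intuition carries over from the text: a flat neighborhood produces nearly identical normals, a degenerate bounding box, and thus a near-zero error, so flat vertices are collapsed first.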
In addition to the information inside of the pool entries
referenced by the Vertex Array Node and the Indexed
Triangle Set Node the simplification is in need of some
further temporary data structures, which are illustrated
in Fig. 4. For each vertex v inside of the Pool Vertex Array, the Vertex Pointer Array contains a reference to a structure with additional information about v. This structure stores the index of v inside of the Pool Vertex Array, an occurrence list indicating the triangles containing v, and references to the per vertex data (normal, color, etc.). If a vertex v2 (see Fig. 3) is removed from the mesh, then its representation inside of the Pool Vertex Array is exchanged with the last vertex of the Pool Vertex Array.
Although the new last vertex v2 is not deleted from the Pool Vertex Array, the size of the Pool Vertex Array is decreased by one. The size minus one indicates the currently last entry of the array. The v2 entry inside of the Vertex Pointer Array is deleted. In contrast to the Pool Vertex Array, the position of a vertex inside of the Vertex Pointer Array never changes. For that reason the priority queue references the entries of the Vertex Pointer Array in order to determine the next vertex to be removed. With the help of the occurrence list of the removed vertex v2, the simplification identifies the affected triangles. The processing of these triangles is analogous to that of the vertices: the triangle representations of
the two removed triangles T1 and T2 inside of the Pool
Triangle Array are exchanged with the last representa-
tions of the Pool Triangle Array. The size of the Pool
Triangle Array is decreased by two. The corresponding entries of T1 and T2 inside of the Triangle Pointer Array are deleted. Due to the edge collapse, the v2 indices of all other affected triangles inside of the Pool Triangle Array are replaced by the index of v1. The position of the per vertex data inside of the appropriate pools (e.g. the Normal Array Pool) is modified analogously to the Pool Vertex Array; the position of the per facet data is changed analogously to the Pool Triangle Array.
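A minimal sketch of this swap-with-last bookkeeping (all structure and member names are invented for the example) might look as follows: the pool array stays dense and renderable, while stable slots keep the priority queue's references valid:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct PoolArray {
    std::vector<int> data;       // pool entry: dense, no holes
    std::size_t size = 0;        // entries in [0, size) form the mesh

    std::vector<std::size_t> slotToIndex;  // stable slot -> index in data
    std::vector<std::size_t> indexToSlot;  // inverse mapping

    // Append a value and hand out a stable slot id for it.
    std::size_t add(int value) {
        data.push_back(value);
        slotToIndex.push_back(data.size() - 1);
        indexToSlot.push_back(slotToIndex.size() - 1);
        ++size;
        return slotToIndex.size() - 1;
    }

    // Remove the entry behind a stable slot: exchange it with the last
    // active entry and shrink the active size; the data itself remains
    // in the array beyond the size limit (it becomes progressive data).
    void remove(std::size_t slot) {
        const std::size_t i = slotToIndex[slot];
        const std::size_t last = size - 1;
        std::swap(data[i], data[last]);
        const std::size_t movedSlot = indexToSlot[last];
        slotToIndex[movedSlot] = i;
        indexToSlot[i] = movedSlot;
        slotToIndex[slot] = last;   // removed entry now sits at 'last'
        indexToSlot[last] = slot;
        --size;
    }
};
```

Because removal only swaps and shrinks the active size, the first `size` entries always describe a consistent mesh, which is the property the refinement and rendering steps rely on.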
After the simplification the sizes of the Pool Vertex
Array, the Pool Triangle Array, and all other affected
pool entries indicate the base mesh M0 (notation of Hoppe [29]), i.e. all data below the size limits represent M0. The sequence of the data above the size limits, which is called the progressive data in the following,
corresponds to the order, in which the vertices or
triangles, respectively, were removed. In order to
differentiate between the data of the base mesh M0
and the progressive data, the server creates a new pool
entry inside of each affected pool, which references the
progressive data. Furthermore each node of the
simplified Element Graph gets a progressive partner
node (see Fig. 5). The original nodes reference the pool
entries containing the data of the base mesh and the
progressive nodes point to the pool entries with the
progressive data.
2.5. Transmission and client refinement
Because the progressive data reflects the removal
sequence, the original mesh can be restored on the client
side by applying the reverse operations of the simplification process.

Fig. 5. The simplification algorithm modifies the scene representation on the server side. So the Element Graphs contain progressive entries, which store the data in a sequential order. The appropriate nodes in front of their progressive partners contain the data of the model M0.

The vertices and triangles are inserted into
the base mesh in the reverse order of the removal
sequence, i.e. the last removed vertex is inserted at first.
To do so, the vertex is added to the end of the Pool Vertex Array and exchanged with the same vertex with which it was exchanged during the simplification. In that way the simplification's position-change operations inside of the pool arrays are reversed, too.
The same procedure is applied to the Pool Triangle
Array and all other affected pool entries. The insert and
the exchange operation are fast and simple procedures,
so they are applicable even on weak devices.
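Assuming each progressive record carries the vertex data together with the index it was exchanged with on the server (a simplification of the paper's full per-vertex record), the client-side refinement step can be sketched as:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One progressive record: the re-inserted vertex data and the index it
// occupied before the server-side exchange (names are hypothetical).
struct ProgressiveVertex {
    float value;             // stands in for the full per-vertex data
    std::size_t swapIndex;   // index the vertex was exchanged with
};

struct ClientMesh {
    std::vector<float> vertices;   // dense array, always renderable

    // Undo one collapse: append the vertex at the end, then exchange it
    // with the position it originally occupied. Both operations are
    // cheap, so refinement works even on weak devices.
    void refine(const ProgressiveVertex& pv) {
        vertices.push_back(pv.value);
        std::swap(vertices.back(), vertices[pv.swapIndex]);
    }
};
```

Since each step exactly reverses one recorded exchange, replaying the progressive records in reverse removal order restores the original vertex positions inside the array.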
Since the data are not inserted arbitrarily but in the
reverse order of the simplification, it is possible to refine
a mesh (i.e. the visual appearance of an element) vertex
by vertex or triangle by triangle, respectively. For that
reason the server is able to adapt the amount of data to the
clients’ capabilities fast and accurately: If the server has
to transfer an element to a client, the server identifies the
according Element Graph and pool entries. Taking the
client's capabilities into account, the server determines the number of triangles nt, which should be transmitted to the client. The number of triangles nt implies the number of vertices nv. Since the server always transmits the base mesh M0, the numbers nt and nv have to be greater than the according sizes nt0 and nv0 of the base mesh. Usually M0 only contains a few vertices and triangles, so it is renderable even on weak devices. After the transmission of the M0 data, the server transmits the first nv–nv0 entries of the Pool Vertex Array and of all
pool entries with per vertex data to the client.
Furthermore, the server transmits the first nt–nt0 entries
of the Pool Triangle Array and of all pool entries with
per facet data. Because of the progressive data, the server does not need to send the information in one data package. Instead, the information can be transferred step by step. After receiving the M0 data, the client is able to refine the mesh with each incoming package (Fig. 6).
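This adaptation step can be sketched as follows (names assumed): the client's triangle budget is clamped between the base-mesh size nt0 and the full resolution, which yields the number of progressive entries to stream:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Result of the server-side planning: the base mesh is always sent,
// followed by a chosen number of progressive entries.
struct ProgressivePlan {
    std::size_t baseTriangles;      // nt0, transmitted with M0
    std::size_t progressiveToSend;  // nt - nt0, streamed step by step
};

ProgressivePlan planTriangles(std::size_t budget,
                              std::size_t baseTriangles,
                              std::size_t totalTriangles) {
    // nt must lie between the base-mesh size and the full resolution.
    const std::size_t nt = std::clamp(budget, baseTriangles, totalTriangles);
    return {baseTriangles, nt - baseTriangles};
}
```

A weak client with a tiny budget still receives the complete base mesh, while a powerful client simply gets all progressive entries; either way the transfer can be split into packages of any size.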
A remarkable property of the Pool Vertex Array, the Pool Triangle Array, and all other affected pool entries is that, in contrast to the pointer arrays, these data structures do not contain any holes or deleted items. So the pool entries always provide a consistent representation of the mesh, even during the simplification and the refinement process.
Thinking of output render pipelines such as OpenGL
and its interleaved array implementation, the pool
entries are always renderable, because the system only
has to pass the references to the pool entries’ data to the
pipeline.
2.6. Client scene reconstruction
As mentioned in Section 2.1, not only single elements
should be transmitted but complete 3D scenes. Fig. 2
illustrates the scene representation on the server side,
which avoids data redundancies by using the pool
concept. Furthermore multiple elements can share the
Fig. 6. The simplified Stanford bunny with texture, texture coordinat
68 000 triangles, the second model has ca. 34 000 triangles, the third m
Element Eu
Client Cu1,..., Cun
Element Ev
Client Cv1,..., Cvn
Element Ew
Client Cw1,..., Cwn
ElementGra
Client Cx1,.
ElementGra
Client Cy1,.
Fig. 7. The server creates a dependency graph in order
same Element Graph. This representation has to be
restored on the client side.
According to the view positions and view directions of
the clients the server selects the elements, which have to
be transmitted to a specific client. If an element already
exists on the client, then it is ignored. Each selected
element gets a priority, which depends on the element’s
distance to the client’s view point. As mentioned before
the pools contain the complete visual information of the
3D scene. With the help of the selected elements’
Element Graphs the server is able to identify the pool
entries, which are shared by multiple elements. These
pool entries have to be transferred only once. For that
reason the server creates a dependency graph, which is
illustrated in Fig. 7. This dependency graph is processed
by the server from left to right: At first the elements’
specific data such as the ID is transmitted to the
appropriate clients. After that, the modified Element Graphs including the progressive nodes are sent. Since
some elements may share the same Element Graph, the
Element Graph is marked as SENT for a specific client
after its first transmission. In the next step the pool
entries referenced by the Element Graphs are trans-
mitted as explained in Section 2.5. Analogously to the Element Graphs, shared pool entries are marked as
SENT. If a client does not support the content of a pool
entry (e.g. a mobile device may not support textures),
then this pool entry is ignored by the server. In addition
the corresponding nodes inside of the Element Graph are
skipped during the transmission of the Element Graph.
Upon receiving the incoming element information (e.g. the ID), the client creates a similar dependency graph. If an
Element Graph is received, then the client traverses the
Element Graph and creates for each base mesh node a
pool entry inside of the appropriate pool. With the help
of the progressive nodes the client identifies the pool
entries, which are expecting progressive data. After
traversing the Element Graph all progressive nodes are
removed from the Element Graph, in order to improve
the rendering efficiency of the Element Graph.
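The SENT marking used in the dependency-graph transmission can be sketched as follows (client and resource identifiers are invented for the example): a shared Element Graph or pool entry is transferred at most once per client.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Tracks which (client, resource) pairs have already been transmitted,
// so shared Element Graphs and pool entries are not sent twice.
struct TransferLog {
    std::set<std::pair<int, std::string>> sent;

    // Returns true if the resource still has to be sent to this client;
    // on the first call the pair is recorded as SENT.
    bool markSent(int client, const std::string& resource) {
        return sent.insert({client, resource}).second;
    }
};
```

The server consults this log while walking the dependency graph from left to right, so each shared resource causes network traffic only on its first occurrence per client.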
3. Implementation
The complete software is implemented in C++ using OpenGL for the graphical output and the adaptive communication environment (ACE) library for all multi-threading and network aspects. Because the pools are implemented as key–value maps, each pool entry needs a key or an ID, respectively. This ID is generated by a 64-bit CRC checksum algorithm, which processes
the data of a pool entry before the simplification. If an
entry’s ID is identical to the ID of another entry, then
the data of these entries is supposed to be equal. For
that reason the server can check for data redundancy
very fast. Besides the ID, each pool entry contains a list
of codec identifications, which are set by the designer of
the element. Codecs are responsible for encoding and
decoding the data of a specific media type and are used
for the transmission of the data. The codecs are
implemented as dynamic link libraries (DLL), which
can be loaded by the system dynamically. The topology
entries are processed by a codec, which is implemented
with the help of the zlib [30]. Because the zlib does not compress floating-point values well, all entries that contain floating-point values are handled by a codec
with a slightly modified zlib algorithm. Typical pool
entries of this category are all geometry and color
entries. Textures are encoded and decoded with a
wavelet transformation, provided by a modified DjVu
[31] library (Fig. 8).
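The ID-based redundancy check can be sketched as below; the text describes a real CRC-64 over the raw entry data, while this example substitutes std::hash purely to stay self-contained:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

using PoolId = std::uint64_t;

struct KeyValuePool {
    std::unordered_map<PoolId, std::string> entries;

    // The ID is a checksum of the data (std::hash as a stand-in for the
    // paper's 64-bit CRC). If the ID already exists, the data is assumed
    // to be identical and no second copy is stored.
    PoolId insert(const std::string& data) {
        const PoolId id = static_cast<PoolId>(std::hash<std::string>{}(data));
        entries.emplace(id, data);
        return id;
    }
};
```

Checking redundancy thus costs one hash computation and one map lookup per entry, instead of comparing the raw data of all existing entries.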
Fig. 8. The left image is the original texture with 650 × 300 × 24 BPP and ca. 572 KB (TGA format). The right image represents the texture's lowest LOD with ca. 1 KB (modified DjVu format).
4. Results
Software systems concerning distributed 3D graphics
can be divided into two basic approaches: applications
of the first approach render the scene on the server side
and transmit the resulting frames via video streaming to
the clients. Systems of the second approach transfer the
3D scene’s data including geometry, colors, and textures
to the clients, where the data is visualized. Another
possibility is to combine these two basic approaches into
a hybrid solution. Since the presented technology belongs to the second approach, it is compared to representatives of the video-streaming and the hybrid approach.
The video streaming approaches are represented by
SGI’s OpenGL Vizserver and the hybrid systems by the
solution of Teler et al. [20].
4.1. SGI Vizserver
Using SGI’s Vizserver the application’s processing is
transferred from the client side to the server. For that
reason the user is able to start applications, which
exceed the capabilities of his device. Since the client is
only required to decode and visualize the video stream,
the resulting feedback is of high graphics quality even on
weak computer systems such as mobile devices. As
another advantage, the application's data does not have to be adapted to the client's capabilities, but only the quality and resolution of the video stream. Furthermore, this
principle is not restricted to the visualization of 3D
scenes. It is applicable to almost all kinds of software
systems. Because the server has to compute, render, and
visualize the application’s results for several clients
simultaneously, SGI offers a super computer (e.g. the
Origin-3900-Server, 4.5 million Euro) in combination
with the software. One of the Vizserver's biggest drawbacks is that the server's load increases dramatically with each registered client. Why should the clients not process tasks of the application within their constraints? It is unsatisfying if a high-end client is used in the same manner as a palmtop, namely for the rendering of 2D bitmaps. Another
point are the restricted interaction possibilities of
these 2D bitmaps in comparison to transferred
3D models.
4.2. Teler et al.
In contrast to the Vizserver software, Teler et al. only
render the lowest LOD of an element (a so-called
impostor) on the server and transmit the resulting image
to the client, while the other LODs are transferred as 3D
models. The idea is that the rendering time of the lowest
LOD plus the streaming time of the rasterized image
results in a faster response than the transmission of a 3D
model. Similar to the video streaming approach, this
solution requires a graphics-able server. Since the
visualization of even the lowest LOD can become very
expensive when applied for several elements and multiple
clients simultaneously, the server's work can hardly be
done by a standard personal computer. Another
problem is that the images of the lowest LOD look
rather inhomogeneous in comparison to the elements'
3D model visualizations. Because Teler et al. concentrate
on the precalculation and the determination of the
user's visual area of interest, they do not mention any
scene representation.
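The trade-off that motivates this hybrid scheme can be illustrated with a simple cost model (a hypothetical sketch only; the function, the constants, and the example numbers are illustrative and are not taken from Teler et al.):

```python
def impostor_is_faster(render_time_lowest_lod: float,
                       image_bytes: int,
                       mesh_bytes: int,
                       bandwidth_bytes_per_s: float) -> bool:
    """Return True if rendering the lowest LOD on the server and
    streaming the rasterized image is expected to respond faster
    than transmitting the element's full 3D model."""
    impostor_cost = render_time_lowest_lod + image_bytes / bandwidth_bytes_per_s
    mesh_cost = mesh_bytes / bandwidth_bytes_per_s
    return impostor_cost < mesh_cost

# On a slow link a small impostor image usually wins over a large mesh:
# e.g. 50 ms server rendering + 20 KB image vs. a 500 KB mesh at 64 KB/s.
print(impostor_is_faster(0.05, 20_000, 500_000, 64_000))  # True
```

On fast links or for small meshes the inequality flips, which is why the higher LODs are transferred as 3D models.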
4.3. Presented approach
As an example of the second basic approach, the
presented client–server system has to adapt the data
effort of the 3D scenes to the client's capabilities.
Consequently, the visualization quality on devices with
less performance and graphics support is significantly
lower than in the Vizserver scenario (e.g. the
visualization on palmtops is almost restricted to wireframe
models). On the other hand, the server of the
presented approach does not need to render images
for multiple clients. So, in contrast to the solutions
above, the server does not have to be graphics-able. Because
the transmitted elements are represented by 3D models
and not by 2D images, the clients can handle user
interactions such as transformation requests themselves.
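How such an adaptation "accurate to a triangle" could work is sketched below (an illustrative Python fragment, not the actual system's code; the record layout, field names, and the assumption that each vertex split adds two triangles, as in Hoppe-style progressive meshes [29], are assumptions):

```python
def adapt_to_client(refinement_stream, triangle_budget):
    """Select the longest prefix of a progressive refinement stream
    whose resulting triangle count stays within the client's budget."""
    sent = []
    triangles = 0
    for record in refinement_stream:
        if triangles + record["adds_triangles"] > triangle_budget:
            break  # the next refinement would exceed the budget
        sent.append(record)
        triangles += record["adds_triangles"]
    return sent, triangles

# Example: each vertex split adds two triangles; a budget of
# 7 triangles therefore admits exactly three splits.
stream = [{"op": "vsplit", "adds_triangles": 2} for _ in range(10)]
prefix, tris = adapt_to_client(stream, triangle_budget=7)
print(len(prefix), tris)  # 3 6
```

Because the stream is cut per record rather than per precalculated level of detail, each client receives exactly as much geometry as its budget allows.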
Table 1
An overview of the approaches in table form

                 SGI Vizserver                  Teler et al.                                  Presented approach
Technique        Server rendered frames         Geometry and impostors                        Geometry
Costs            SGI super computer             Graphics-able server                          Personal computer
Accuracy         No adaption                    Progressive meshes                            Accurate to a triangle
Compression      Image compression              Not supported but integrable                  Media specific codecs
Image quality    High quality on all devices    Inhomogeneous impostors                       Low quality on weak devices
Load balancing   Server based                   Server and clients                            Server and clients
Load increase    Proportional to client number  Not proportional (except for the impostors)   Not proportional
Response time    Depends on frame               Depends on impostors                          Depends on elements' M0
Interaction      2D images                      Impostors and 3D models                       3D models
Additionally, the presented approach provides the
following novelties or benefits:

* A memory-efficient progressive representation not
only of single elements but of complete 3D scenes,
which supports the adaption of the scene's data
to the clients' capabilities accurate to a triangle.

* The server does not provide a few precalculated levels
of detail, but a continuous resolution of the scene's
elements depending on their number of vertices and
triangles. Since the simplification process works very
fast because of the new metric (the Stanford bunny
requires ca. 1 s on an AMD Athlon 1333 MHz), this
resolution can be generated as a precalculation step
or on-the-fly. For that reason it is possible to add
new Element Graphs to the scene or to modify even
the existing meshes.

* The simplification does not generate one progressive
stream per element including vertices, colors,
normals, etc., but preserves the separation of the pool
entries and their according media types. So it is
possible to process each media type with a specific
codec, which results in a better encoding and
decoding efficiency. If a client does not support a
media type, the server simply ignores the appropriate
pool entries. In consequence there is no
unnecessary transmission of unsupported data. Finally,
the separation allows the restoration of the
memory-efficient representation on the client side.

* The system always provides a consistent representation
of the scene on server and client, even during the
simplification and the refinement process. So the
elements can be modified and rendered simultaneously
[32]. It is not necessary to convert the
elements' representations between progressive
simplification data structures and render data structures
as in other algorithms. Furthermore, the refinement
process is very fast (ca. 100,000 triangles/s on an AMD
Athlon 1333 MHz), so it is applicable even on weak
devices (Table 1).
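The separation of pool entries by media type described above can be sketched as follows (illustrative Python only; zlib merely stands in for the media-specific codecs, and all names, the entry layout, and the codec table are assumptions, not the system's actual interfaces):

```python
import zlib

# Hypothetical codec table: one encoder per media type. The real system
# could plug in a different, specialized codec for each type.
CODECS = {
    "geometry": zlib.compress,
    "colors":   zlib.compress,
    "normals":  zlib.compress,
}

def encode_for_client(pool_entries, supported_types):
    """Encode each media type with its own codec and skip every pool
    entry whose type the client does not support, so unsupported data
    is never transmitted."""
    streams = {}
    for media_type, payload in pool_entries.items():
        if media_type not in supported_types:
            continue  # server just ignores these pool entries
        streams[media_type] = CODECS[media_type](payload)
    return streams

entries = {"geometry": b"\x00" * 1024,
           "colors":   b"\x01" * 512,
           "normals":  b"\x02" * 512}
client_streams = encode_for_client(entries, supported_types={"geometry", "colors"})
print(sorted(client_streams))  # ['colors', 'geometry']
```

Keeping the media types in separate streams is what allows a per-type codec choice and, on the client side, the restoration of the memory-efficient representation.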
5. Conclusion
In this paper, a new 3D scene representation was
introduced in order to transmit the visual information
from a server to a client. The scene representation
provides a memory-efficient data management and
includes a progressive data format, which is generated
by a progressive simplification algorithm. This algorithm
separates the information of a scene element into
several streams according to the element's data types. As
explained in Section 4, these streams can be handled in a
very flexible way, so it is possible to adapt the data
effort to the clients' capabilities precisely. In
contrast to the Vizserver software and the approach of
Teler et al., the presented system does not make use of
server-based image rendering and video streaming. For
that reason the server does not have to be graphics-able and
can be represented by a standard personal computer.
Acknowledgements
This work was funded by the Heinz-Nixdorf-Foundation.
References
[1] Active worlds, http://www.activeworlds.com.
[2] Kaon, http://www.kaon.com.
[3] O2c, http://www.o2c.de.
[4] Ryzom, http://www.ryzom.com/.
[5] Hesina G, Schmalstieg D. A network architecture for
remote rendering. Proceedings of Second International
Workshop on Distributed Interactive Simulation and
Real-Time Applications; Montreal, Canada; 1998.
p. 88–91.
[6] Cohen-Or D, Chrysanthou Y, Silva C. A survey of
visibility for walkthrough applications. EURO-
GRAPHICS 2000, Course Notes; 2000.
[7] Klein R. Multiresolution representations for surface
meshes. Technical report, Wilhelm-Schickard-Institut,
GRIS, Universität Tübingen, Germany; 1997.
[8] Macedonia M, Zyda M, Pratt D, Barham P, Zeswitz S.
NPSNET: a network software architecture for large scale
virtual environments. Presence 1994;3(4):265–87.
[9] Watsen K, Zyda M. Bamboo—supporting dynamic
protocols for virtual environments. Image Conference;
Scottsdale, Arizona, USA; 1998.
[10] Broll W. Distributed virtual reality for everyone—a
framework for networked VR on the internet. IEEE
Virtual Reality Annual International Symposium 1997
(VRAIS’97), Albuquerque, NM, USA; 1997.
[11] Carlsson C, Hagsand O. DIVE—a Multi-user virtual
reality system. IEEE Virtual Reality Annual Symposium;
Seattle, USA, 1993.
[12] Waters R, Anderson D, Barrus J, Brogan D, Casey M,
McKeown S, Nitta T, Sterns I, Yerazunis W. Diamond
Park and Spline: a social virtual reality system with 3D
animation, spoken interaction and runtime modifiability.
Technical Report TR-96-02a. Mitsubishi Electric Research
Laboratories; 1996.
[13] Bricken W, Coco G. The VEOS Project. Presence 1994;
1(2):111–29.
[14] Snowdon D, West A. The AVIARY VR-system. A
prototype implementation. Sixth ERCIM Workshop,
Stockholm, Sweden; 1994.
[15] Lea R, Honda Y, Matsuda K, Hagsand O, Stenius M.
Issues in the design of a scalable shared virtual environ-
ment for the internet. Proceedings of the HICSS’97;
Hawaii; 1997.
[16] Shaw C, Green M, Liang J, Sun Y. Decoupled simulation
in virtual reality with the MR toolkit. ACM Transactions
on Information Systems 1993;11(3):287–317.
[17] MacIntyre B, Feiner S. Language level support for
exploratory programming of distributed virtual environ-
ments. Symposium on User Interface Software and
Technology, ACM UIST’96; Seattle, WA, USA; 1996.
[18] Funkhouser T. RING: a client-server system for multi-user
virtual environments. ACM Symposium on 3D Graphics;
Monterey, CA, USA; 1995. 85–92.
[19] Schneider B, Martin I. An adaptive framework for 3D
graphics over networks. Computers and Graphics
1999;23(6):867–74.
[20] Teler E, Lischinski D. Streaming of complex 3D scenes for
remote walkthroughs. EUROGRAPHICS 2001;
Manchester, UK; 2001;20(3).
[21] Schiffner N, Ruehl C. HOUCOM framework for
collaborative environments. SPIE International Symposium on
Voice, Video and Data Communications; Hynes Convention
Center, Boston, USA; 1999.
[22] Sense8, ‘‘WorldToolKit’’. http://www.sense8.com, 1997.
[23] Silicon Graphics Inc., ‘‘OpenGL Vizserver 3.1’’, http://
www.sgi.com/software/vizserver, 2003.
[24] Benford S, Greenhalgh C, Rodden T, Pycock J.
Collaborative virtual environments. Communications of the
ACM 2001;44(7):79–85.
[25] Meehan M. Survey of multi-user distributed virtual
environments. Course Notes: Developing Shared
Virtual Environments, SIGGRAPH’99; Los Angeles,
CA, USA; 1999.
[26] Strauss PS, Carey R. An object-oriented 3D graphics
toolkit. SIGGRAPH’92; Chicago, IL, USA; 1992.
341–9.
[27] IRIS Performer, http://futuretech.mirror.vuurwerk.net/
performer.html.
[28] Chang AY. A survey of geometric data structures for ray
tracing. Technical Report TR-CIS-2001-06, CIS Depart-
ment, Polytechnic University; 2001.
[29] Hoppe H. Progressive meshes. SIGGRAPH 1996. New
York: ACM; 99–108.
[30] zlib, http://www.gzip.org/zlib.
[31] DjVu, http://www.djvuzone.org/.
[32] Birthelmer H, Soetebier I, Sahm J. Efficient representation
of triangle meshes for simultaneous modification and
rendering. Proceedings of the International Conference on
Computational Science 2003 (ICCS 2003), Springer,
Berlin, Heidelberg; 2003. p. 925–34.