

PUODARSI http://www.kaemart.it/puodarsi


PUODARSI: Product User-Oriented Development based on Augmented Reality and Interactive Simulation

STATE OF THE ART AND SCENARIOS

Summary: This document surveys the technologies and the software libraries in the fields of interest of the project.

Deliverable n°: D1
Version n°: 0.1

Keywords:

Virtual Prototyping, Virtual Reality, Augmented Reality, Mixed Reality, scientific visualization, haptics, Reverse Engineering, multimodal interaction, test cases.


File name: D1.doc
Deliverable n°: D1
Release date: 31/10/2007

Authors:
Monica Bordegoni; Francesco Ferrise (Politecnico di Milano)
Giuseppe Monno; Antonello Uva; Michele Fiorentino (Politecnico di Bari)
Fabio Bruno; Francesco Caruso (Università della Calabria)
Piero Mussio; Stefano Valtolina; Loredana Paralisiti (Università degli Studi di Milano)
Francesco Caputo; Giuseppe Di Gironimo; Salvatore Gerbino; Massimo Martorelli; Adelaide Marzano; Stefano Papa; Fabrizio Renno; Domenico Speranza; Andrea Tarallo (Università di Napoli Federico II)


Contents

1 Introduction
2 Augmented Reality and Mixed Reality systems
   2.1 AR and MR systems
      2.1.1 Tracking systems
      2.1.2 Visualization systems
   2.2 Visualization libraries
      2.2.1 Java 3D
      2.2.2 Open Inventor
      2.2.3 VTK
      2.2.4 OpenSG
      2.2.5 OpenSceneGraph
   2.3 References
3 Haptic systems
   3.1 Haptic technology
      3.1.1 A taxonomy of current haptic technologies
      3.1.2 Possible dimensions in the taxonomy
      3.1.3 Size scales
      3.1.4 Degrees of freedom (DOFs)
      3.1.5 Grounding and kinematics
      3.1.6 Drive type
      3.1.7 Control type
      3.1.8 Contact type
   3.2 Haptic devices
      3.2.1 Existing force feedback displays
      3.2.2 1-DOF and 2-DOF displays
      3.2.3 3-DOF and 6-DOF displays
      3.2.4 Exoskeleton or humanoid type
      3.2.5 Existing grasping displays
      3.2.6 Existing vibro-tactile and friction displays
      3.2.7 Conclusion on the state of the art in large-scale haptic displays
   3.3 Haptic libraries
      3.3.1 CHAI3D
      3.3.2 OpenHaptic
      3.3.3 H3D/VHTK
      3.3.4 Haptik
      3.3.5 OpenSceneGraph Haptic
      3.3.6 Haptic libraries overview
   3.4 References
4 Interactive simulation systems
   4.1 CFD analysis technologies
      4.1.1 Deal II
      4.1.2 OpenFlower
      4.1.3 Comsol Multiphysics
      4.1.4 Benchmark of the selected CFD solvers
   4.2 FEM analysis technologies
      4.2.1 Introduction
      4.2.2 Software
5 Reverse Engineering systems
   5.1 Introduction
   5.2 3D Scanning techniques
      5.2.1 Contact digitizers
      5.2.2 Mixed CMM-Optical digitizers
      5.2.3 Line and Spot Scanners (based on triangulation)
      5.2.4 Probes based on Conoscopic Holography
      5.2.5 Dual-Capability Systems
      5.2.6 Other Types of Laser Systems
      5.2.7 Other Types of Tracking Systems
      5.2.8 Photogrammetry
      5.2.9 Specifications and application criteria
   5.3 Critical issues related to "Puodarsi" RE systems
   5.4 Conclusions
   5.5 References
   5.6 Reverse Engineering Software: technical specs
   5.7 Reverse Engineering Hardware: technical specs
      5.7.1 Mechanical Touch Probe Systems
      5.7.2 Line Scanners/Triangulation
      5.7.3 Laser Trackers
      5.7.4 Optical Radar
      5.7.5 Color Capable Systems
      5.7.6 3D Metrology Systems for Manufacturing
      5.7.7 Scanners for Very Large Objects and Surveying Applications
6 Multimodal Annotations
   6.1 Introduction: Paper annotation, electronic annotation and web annotation
   6.2 Annotation in 2D environments
      Tools for Collaborative Annotation: a Comparison among Three Annotation Styles developed by Unimi (University of Milano)
         A. SyMPA annotation activities
         B. T.Arc.H.N.A annotation activities
         C. BANCO annotation activities
      Del.icio.us, Digg, BlinkList
      Pliny and traditional scholarly practice
      DesignDesk ViewLink
      Eroiica Edit
      eReview
   6.3 Annotation in 3D environments
      The Virtual Annotation System
      CATIA 3D Functional Tolerancing & Annotation 2 (FTA)
      Annotation Authoring in Collaborative 3D Virtual Environments
      Composing PDF Documents with 3D Content from MicroStation
      A Direct-Manipulation Tool for JavaScripting Animated Exploded Views Using Acrobat 7.0 Professional
      NX I-deas Master Notation: For documenting solid model designs
      Immersive redlining and annotation of 3D design models on the Web
      Post Processing Tips & Hints: Annotation in ANSYS
      Boom Chameleon: Simultaneous capture of 3D viewpoint, voice and gesture annotations on a spatially-aware display
      ANNOT3D description
      Drawing for Illustration and Annotation in 3D
      Markup and Drawing Annotation Tools
      Space Pen: Annotation and sketching on 3D models on the Internet
   6.4 XML DBMS
      6.4.1 Non-native XML databases: CACHE, eXtremeDB, Informix, Matisse, OpenInsight, Oracle, SQL Server, Sybase ASE 12.5, UniVerse
      6.4.2 Native XML databases: Berkeley DB XML, DB2, dbXML, eXist, SQL/XML-IMDB, Tamino, TEXTML Server, TigerLogic XML Data Management Server (XDMS), Timber, XQuantum XML Database Server
      6.4.3 Tables: non-native databases; native databases
   6.5 Bibliography
   6.6 Sitography
7 Data transmission
   7.1 SimLib: a library for inter-process communication
   7.2 References


1 Introduction

This report presents the state of the art in the research domains addressed by the PUODARSI project: Augmented Reality and Virtual Reality systems, haptic systems, interactive simulation systems, Reverse Engineering systems, and multimodal annotations. For each domain, the report analyses the relevant hardware and software technologies. Finally, the report includes a description of the scenarios of the PUODARSI system and of the test case for the evaluation and testing of the system.

2 Augmented Reality and Mixed Reality systems

2.1 AR and MR systems A standard virtual reality system aims to completely immerse the user in a computer generated environment. This immersion can be effective only if motions or changes made by the user result in the appropriate changes in the perceived virtual world. From this point of view, an augmented reality system could be considered the ultimate immersive system. The task is now to register the virtual frame of reference with what the user is seeing in the real world. This registration is more critical in an augmented reality system because we are more sensitive to visual misalignments than to the type of vision-kinesthetic errors that might result in a standard virtual reality system. Figure 1 shows the multiple reference frames that must be related in an augmented reality system.

Figure 1: Components of an Augmented Reality System.

The scene is viewed by an imaging device, which in this case is depicted as a video camera. The camera performs a perspective projection of the 3D world onto a 2D image plane. The intrinsic (focal length and lens distortion) and extrinsic (position and pose) parameters of the device determine exactly what is projected onto its image plane. The generation of the virtual image is done with a standard computer graphics system. The virtual objects are modeled in an object reference frame. The graphics system requires information about the imaging of the real scene so that it can correctly render these objects: these data control the synthetic camera that is used to generate the image of the virtual objects. This image is then merged with the image of the real scene to form the augmented reality image.

The research activities in augmented reality center around two aspects of the problem. The first is to develop methods to register the two distinct sets of images and keep them registered in real time; some recent work in this area has started to make use of computer vision techniques. The second direction of research is display technology for merging the two images.

Performance Issues in an Augmented Reality System

Augmented reality systems are expected to run in real time, so that a user can move about freely within the scene and see a properly rendered augmented image. This places two performance criteria on the system:

• the update rate for generating the augmented image;
• the accuracy of the registration of the real and virtual image.

Visually, the real-time constraint means that the user views an augmented image in which the virtual parts are rendered without any visible jumps; to appear without jumps, the graphics system must render the virtual scene at least 10 times per second. For the virtual objects to appear realistically part of the scene, more photorealistic rendering is required. Current graphics technology does not support fully lit, shaded, and ray-traced images of complex scenes; fortunately, there are many applications of augmented reality in which the virtual part is either not very complex or does not require a high level of photorealism.

Failures in the second performance criterion have two possible causes. The first is a misregistration of the real and virtual scene because of noise in the system: the position and pose of the camera with respect to the real scene must be sensed, and any noise in this measurement can appear as errors in the registration of the virtual image with the image of the real scene. Fluctuations of the values while the system is running cause jitter in the viewed image. The second cause of misregistration is time delay in the system: if there are delays in calculating the camera position or the correct alignment of the graphics camera, the augmented objects will tend to lag behind motions in the real scene. The system design should minimize these delays to keep the overall system delay within the requirements for real-time performance.
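To make the registration requirement concrete, the following minimal C++ sketch (not from the report; all names and values are illustrative) projects a virtual 3D point through the sensed extrinsic parameters (R, t) and the intrinsic parameters (fx, fy, cx, cy) of the real camera. This is exactly the computation the synthetic camera must reproduce for the overlay to stay aligned with the real scene; lens distortion is omitted for simplicity.

```cpp
// Minimal pinhole-camera sketch: extrinsics (R, t) map a world point into the
// camera frame, intrinsics (fx, fy, cx, cy) map it onto the image plane.
#include <array>
#include <cstdio>

struct Pose {                                   // extrinsics: world -> camera
    std::array<std::array<double,3>,3> R;
    std::array<double,3> t;
};

struct Intrinsics {                             // no lens distortion modeled
    double fx, fy;                              // focal lengths in pixels
    double cx, cy;                              // principal point
};

// Project a world-space point onto the image plane of the tracked camera.
std::array<double,2> project(const Pose& P, const Intrinsics& K,
                             const std::array<double,3>& Xw) {
    std::array<double,3> Xc{};                  // Xc = R * Xw + t
    for (int i = 0; i < 3; ++i)
        Xc[i] = P.R[i][0]*Xw[0] + P.R[i][1]*Xw[1] + P.R[i][2]*Xw[2] + P.t[i];
    // Perspective division followed by the intrinsic mapping.
    return { K.fx * Xc[0] / Xc[2] + K.cx,
             K.fy * Xc[1] / Xc[2] + K.cy };
}

int main() {
    Pose P;
    P.R = {{{1,0,0},{0,1,0},{0,0,1}}};          // identity rotation
    P.t = {0,0,5};                              // origin 5 units before camera
    Intrinsics K{800, 800, 320, 240};           // 640x480 image (illustrative)
    auto uv = project(P, K, {0.1, 0.2, 0.0});
    std::printf("virtual point lands at pixel (%.1f, %.1f)\n", uv[0], uv[1]);
}
```

Any noise or delay in the sensed (R, t) shifts the computed pixel directly, which is why tracking noise appears as jitter and tracking latency as lag in the augmented image.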

2.1.1 Tracking systems

One of the biggest challenges in developing AR systems is retrieving the position of the user or of the camera relative to the external environment. When virtual objects have to be merged with images taken from the real world, the graphic rendering must be aligned with the real objects. To obtain this kind of alignment it is necessary to use devices which can track objects in the 3D real environment. A tracker is a system capable of measuring the position and orientation of a sensor relative to a known fixed point. The data acquired by the tracker feed the 3D rendering system in order to produce, in real time, graphical results which can be combined realistically with images of the real world. Many papers have been published on this subject, such as those by Welch and Foxlin [1], Holloway and Lastra [2], and Azuma [3].

This section will cover the most popular tracking technologies.

Classification

In order to classify tracking technologies, we first need to define several interrelated physical properties. Holloway and Lastra enumerate the fundamental ones:

- Accuracy: how closely the tracker's measurement matches the actual value. Static errors can be measured when objects are not moving, while dynamic errors vary with the objects' motion.
- Resolution: the minimum displacement that the tracker can measure.
- Latency: the time between the acquisition of the input by the tracker and the availability of the processed measurement to the computer. When latency is too large, the delay between the reported position of an object and its actual one can become unacceptable.
- Refresh rate: the number of position and/or orientation values the tracker can produce per second. The higher the rate, the smoother the animation in the virtual environment.
- Infrastructure: tracker measurements are relative to a reference. The reference position and orientation must be measured with respect to the real environment in order to map the coordinates received from the tracker to the real world.
- Operative range: trackers often work only in a limited space, since the signals sent by the sensors decay with distance.
- Price: some technologies offer higher precision and accuracy, but can be very expensive.
- DOF: the degrees of freedom provided by the tracker.
- Coordinate type: some trackers measure velocity or acceleration, which must be integrated to produce positional values. The integration process amplifies errors and can make the simulation diverge after some time; trackers that measure positional values directly keep the simulation stable.
- Mobility: many applications require the user to wear the tracking device or to carry it in an external environment, so weight and size are key properties.
- Power consumption: strictly connected to mobility.

MOST COMMON TRACKING TECHNOLOGIES

ACCELEROMETERS

Accelerometers measure the forces produced by mass inertia and derive relative positions by means of a double integration. They can also measure two absolute angular coordinates using the force of gravity. Accelerometers are small and easy to build; lately they have been realized in MEMS (Micro Electro-Mechanical Systems) technology, which allows integration into small components. They do not require complex hardware, they are cheap, and they have a high update rate.
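A minimal sketch (not taken from the report) of this double integration, showing why a small constant bias in the measured acceleration makes the position error grow quadratically with time; all values are illustrative:

```cpp
// Double integration of accelerometer readings, with a constant sensor bias.
// The device is at rest, yet the integrated position drifts steadily.
#include <cstdio>

int main() {
    const double dt   = 0.01;   // 100 Hz update rate (illustrative)
    const double bias = 0.02;   // constant bias in m/s^2 (illustrative)
    double v = 0.0, x = 0.0;    // integrated velocity and position

    for (int step = 1; step <= 1000; ++step) {  // simulate 10 s at rest
        double a = 0.0 + bias;  // true acceleration is zero; sensor reads bias
        v += a * dt;            // first integration:  acceleration -> velocity
        x += v * dt;            // second integration: velocity -> position
        if (step % 200 == 0)
            std::printf("t = %4.1f s  drift = %.3f m\n", step * dt, x);
    }
    // After 10 s the 0.02 m/s^2 bias alone has produced about 1 m of position
    // error, which is why pure inertial tracking needs periodic absolute fixes.
}
```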

GPS

At the present time, GPS (Global Positioning System) is the best available technology for navigation over wide outdoor areas. GPS accuracy varies between 5 and 25 meters. While this accuracy is acceptable for some applications (like navigation), it is far too low for AR applications, where accuracy should be close to millimeters: an AR system would be useless if the computer-generated graphics were projected meters away from where they should be. Accuracy can be improved using more GPS signals or differential GPS systems, but it usually remains no better than one meter. Real-Time Kinematic GPS reaches an accuracy of one centimeter.

HiBall

Tracking is usually simpler in small spaces than in wide areas. Researchers at the University of North Carolina at Chapel Hill developed a high-precision system able to track objects in an area of about 50 m². The system, named the HiBall Tracking System, uses optoelectronic technology and consists of two parts:

- six optical sensors carried by the user;
- infrared LEDs mounted on special panels on the ceiling.

The system uses the known LED positions to retrieve the user's position and orientation. It achieves an accuracy of about 0.2 mm and 0.03 degrees, with an update rate higher than 1500 Hz and a latency under one millisecond.


Figure 2. The HiBall Tracking System.

Why choose optical tracking

Optical tracking has evolved recently thanks to research in computer graphics. The computational speed necessary for this kind of real-time image processing appeared at the beginning of the '90s, but it was mainly the availability of low-cost CCD sensors that gave a big push to research: ten years ago nobody could even think of buying a PAL-resolution digital video camera for 200 euros in a consumer electronics shop, as is possible today.

Unfortunately, as in every aspect of industrial design, price is the parameter which mainly influences all the others. Very precise tracking systems are usually very expensive, sometimes hard to use, and often limited in operative range. Optical tracking is nowadays one of the technologies that best satisfies all the requirements.

The following table lists these requirements and compares them with the characteristics obtained by optical tracking.

Requirement          | Obtained            | Required            | Notes
Accuracy             | 3% of distance      | 10% of distance     | Orientation error is higher than position error; position error behaves differently on the z axis
Resolution           | Up to 1 mm          | About some mm       | Resolution on x and y depends on z and on the camera FOV
Latency              | Very low            | < 1 second          | Computational time is very low
Refresh rate         | > 15 Hz             | 15 Hz               | Same as the video frame rate
Operative range      | About some meters   | 1 meter             | Related to the marker dimensions
Stability            | Stable              | Quite stable        | Using a robust pose estimator
Mobility             | Easy to carry       | Quite easy to carry | Uses a video camera and a processing system (such as a laptop)
Ease of installation | Easy                | Medium              | After calibrating the system, the installation is easy
Price                | Low                 | Medium              | Uses consumer hardware

Table 1. Requirements for optical tracking systems.

Optical tracking can be implemented using active elements (markers) or passive features (real-world images), and it can provide relative or absolute data according to the technology used. Human beings trust their sight when walking in an environment: they capture images of the world and compare them with those they hold in memory to evaluate their position. They also use known information about the world, such as occlusions, relative dimensions, height, atmospheric and perspective effects, and stereo vision [4]. Human beings can estimate the dimensions and relative positions of visible objects. Some works are studying these behaviours in order to replicate them using artificial intelligence and computer vision systems, but these systems are not yet mature enough to be used under arbitrary conditions [5].

Markerless systems

Special techniques have been developed to track the position and orientation of arbitrary objects in an arbitrary environment. Some examples of these marker-free technologies can be found in the works of Simon [6], Simon and Berger [7], Genc [8] and Chia [9].

These works have developed algorithms to extract features that characterize a scene and to cope with unexpected movements of non-static objects, images blurred by fast motion, and variable lighting conditions.

Marker-based systems

Even though research on markerless systems is highly advanced, some kind of marker must be introduced into the scene when a high level of robustness is required. After some works in which hybrid technologies based on magnetic and optical tracking were used to improve position and orientation precision, the first "robust" system for AR applications was developed by Kato: named ARToolKit, it is based on the analysis of physical patterns [10].

Other marker recognition technologies have been developed (ARTag, ARDEV, etc.), but they have not been released as open source. They will be discussed and compared at the end of this chapter.

Tracking libraries

ARToolKit

ARToolKit is a software library used to compute in real time the camera position and orientation relative to a physical marker placed in the scene.

A simple marker, called a "fiducial marker", printed on paper, is used to extract position and orientation from each video frame. This kind of marker is usually made of a black non-symmetrical pattern on a white-bordered background. The analysis of the borders allows the estimation of the perspective deformation, the rotation of the pattern, and the distance from the camera, and this information can be extracted in real time. The algorithm is simple and requires a single marker to provide 6 DOF. Despite that, many claim that this system cannot be used when accuracy is the primary goal.

ARToolKit was initially developed by Dr. Hirokazu Kato and is now developed by the Human Interface Technology Laboratory at the University of Washington [http://www.hitl.washington.edu/], by the HIT Lab NZ at the University of Canterbury, New Zealand [http://www.hitlabnz.org/], and by ARToolworks, Inc., Seattle [http://www.artoolworks.com].

ARToolKit remains one of the most widely used libraries in the research field, and it is available both under the GNU Public License and under a commercial license.


ARToolkitPLUS

ARToolkitPLUS is an extended version of ARToolKit which adds some important features but breaks compatibility because of a different API structure.

ARToolkitPLUS has been entirely developed at Graz University of Technology as part of the "Handheld AR" project [http://studierstube.icg.tu-graz.ac.at/handheld_ar] and is available at [http://studierstube.icg.tu-graz.ac.at/handheld_ar/artoolkitplus.php].

Even though ARToolkitPLUS is based on the ARToolKit code, it requires some C++ programming experience; it is therefore not an easy-to-use solution but a tool for programmers. No ready-to-use binary or executable code is provided, no GUI is available for compilation, and no IDE supports the development. In the following we will always refer to ARToolkitPLUS (ARTK+), even where its features are inherited from ARToolKit.

ARTK+ added features are the following:

- C++ API
- ID-encoded markers
- Support for more image formats
- Variable internal black border thickness
- Optimizations for mobile devices (smartphones, PDAs, etc.)
- New pose estimation algorithm: RPP (Robust Pose)
- Support for the MATLAB calibration toolbox files
- Automatic exposure correction for dark scenes
- Tools: pattern file generator, camera calibrator, etc.

Some of the above features will be discussed later in this chapter.

Structure of ARTK+

The basics of the ARTK+ tracking concept are the following:

Figure 3. Structure of ARToolKit

1) Camera acquisition of the scene;
2) Marker search, using the contrast between the black internal border and the white external border;
3) Estimation of the marker position and orientation;
4) Marker identification through the marker database;
5) Virtual object rendering from the camera point of view;
6) Virtual object overlay onto the video frame acquired at 1).
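As a concrete illustration of steps 1) to 6), the following hedged sketch follows the pattern of the classic ARToolKit simpleTest example. The camera and window initialization is omitted, and the pattern file name, marker width and threshold value are illustrative; exact signatures vary between library versions, so treat this as an outline rather than reference code.

```cpp
// Sketch of the classic ARToolKit detection loop (C-style API, compiles as
// C++). Rendering and full initialization are omitted for brevity.
#include <AR/ar.h>
#include <AR/video.h>
#include <AR/gsub.h>

static int    patt_id;                        // id returned by arLoadPatt()
static double patt_width     = 80.0;          // marker side in millimetres
static double patt_center[2] = {0.0, 0.0};
static double patt_trans[3][4];               // estimated camera transform

static void mainLoop(void) {
    ARUint8      *dataPtr;
    ARMarkerInfo *marker_info;
    int           marker_num, thresh = 100;   // illustrative threshold

    if ((dataPtr = arVideoGetImage()) == NULL) return;  // step 1: grab frame

    // Steps 2-4: threshold the frame, find candidate squares, match patterns.
    if (arDetectMarker(dataPtr, thresh, &marker_info, &marker_num) < 0) return;

    for (int k = 0; k < marker_num; ++k) {
        if (marker_info[k].id != patt_id) continue;     // not our marker
        // Step 3/5: estimate the marker pose relative to the camera.
        arGetTransMat(&marker_info[k], patt_center, patt_width, patt_trans);
        // Step 6: patt_trans now drives the OpenGL modelview matrix used to
        // render the virtual object over the video frame (rendering omitted).
    }
    arVideoCapNext();                         // release the frame buffer
}

int main(int argc, char *argv[]) {
    // arVideoOpen(), camera parameter loading via arInitCparam() and the
    // graphics setup via argInit() are required here (omitted).
    patt_id = arLoadPatt("Data/patt.hiro");   // illustrative pattern file
    argMainLoop(NULL, NULL, mainLoop);        // gsub helper: runs the loop
    return 0;
}
```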

The process is actually the inverse of the texture-mapping process performed by video boards; we could say that optical tracking is the inverse form of texture mapping. Many publications exploit this concept for markerless tracking, since they use object "textures" as image features.

At a deeper level, we can focus on the tracking process alone. The sequence of operations from the video frame acquisition to the estimated marker position and orientation matrix is the following.


Figure 4. Operational steps of ARTK+.

1) Format conversion from the various device formats to a uniform one;
2) Thresholding to reveal image borders;
3) Marker detection (template matching), which can be performed in several ways, as will be explained;
4) Lens distortion correction;
5) Position and orientation estimation.
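A minimal sketch of the thresholding in step 2), assuming an 8-bit grayscale frame; this is illustrative code, not ARTK+ source. The fixed threshold plays the same role as the user-tunable value the library exposes, and its sensitivity to lighting is what the automatic exposure correction listed above mitigates.

```cpp
// Fixed-threshold binarization: every pixel of an 8-bit grayscale frame is
// mapped to black or white, making the dark marker border easy to trace.
#include <cstdint>
#include <vector>

std::vector<uint8_t> thresholdFrame(const std::vector<uint8_t>& gray,
                                    uint8_t thresh /* e.g. 100, tunable */) {
    std::vector<uint8_t> bin(gray.size());
    for (std::size_t i = 0; i < gray.size(); ++i)
        bin[i] = gray[i] < thresh ? 0 : 255;  // dark border -> 0, paper -> 255
    return bin;
}
```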

2.1.2 Visualization systems

The combination of real and virtual images into a single image presents new technical challenges for designers of augmented reality systems, since augmentation requires special display technologies. In fact, the simple monitor-based display shown in Figure 5 gives the user only a poor feeling of being immersed in the augmented environment. In order to increase the sense of presence, other display technologies are needed.


Figure 5. Monitor based display.

Display technologies for Augmented Reality

Augmented reality displays are image-forming systems that use a set of optical, electronic, and mechanical components to generate images somewhere on the optical path between the observer's eyes and the physical object to be augmented. Figure 6 illustrates the different possibilities of where the image can be formed to support augmented reality applications, and where the displays are located with respect to the observer and the real object [11].

Figure 6. Image generation technologies for AR.

Head-attached displays, such as retinal displays, head-mounted displays, and head-mounted projectors, have to be worn by the observer. While some displays are hand-held, others are spatially aligned and completely detached from the users.

Head-Attached Displays

Head-attached displays require the user to wear the display system on his or her head. Depending on the image generation technology, three main types exist [11]: a) simple head-mounted displays that use miniature displays in front of the eyes; b) retinal displays that use low-power lasers to project images directly onto the retina of the eye; c) head-mounted projectors that make use of miniature projectors or miniature LCD panels with backlighting and project images onto the surfaces of the real environment.

a) Head-Mounted Displays

Head-mounted displays are currently the dominant display technology within the AR field. They support mobile applications, and multi-user applications if a large number of users need to be equipped. Head-mounted displays (HMDs) have been widely used in virtual reality systems. Augmented reality researchers have been working with two types of HMD, called video see-through and optical see-through. The "see-through" designation comes from the need for the user to be able to see the real-world view even when wearing the HMD. Video see-through uses an opaque HMD to display a merged video of the virtual environment and the view from cameras mounted on the HMD. A diagram of a video see-through system is shown in Figure 7. This is actually the same architecture as the monitor-based display described above, except that the user now has a heightened sense of immersion in the display.

Figure 7. Video see-through system.

Optical see-through HMDs [12] eliminate the video channel that looks at the real scene: as shown in Figure 8, the merging of the real world and the virtual augmentation is done optically. This technology is similar to the head-up displays (HUDs) that commonly appear in military airplane cockpits and, recently, in some experimental automobiles; in this case, however, the optical merging of the two images is done on the head-mounted display, rather than on the cockpit window or the windshield.

Figure 8. Optical see-through system.

There are advantages and disadvantages to both display systems. The first thing to notice is that video see-through displays can be used for both VR and AR applications, while the application field of optical see-through displays is limited to AR. However, there are also some performance issues to be highlighted. The video see-through approach is a bit more complex than optical see-through, requiring proper placement of the cameras; however, video composition of the real and virtual worlds is much easier. In fact, both display systems introduce a forced delay between the real and the augmented environment visualization, due to the video merging operation. Since video see-through isolates the user completely from the real scene, this delay can be compensated by correctly timing the camera image with the virtual image generation; with an optical see-through display, on the other hand, the view of the real world is instantaneous, so it is not possible to compensate for such system delays. Furthermore, with video see-through displays a video camera is viewing the real scene, so the image generated by the camera is also available to the system to provide tracking information; the optical see-through display does not provide this additional information, except for the HMD position itself. In general, several disadvantages can be noted in the use of head-mounted displays as augmented reality devices. These shortcomings are summarized in the following:

- Lack of resolution, due to the limitations of the applied miniature displays. In the optical see-through case only the graphical overlays suffer from a relatively low resolution, while the real environment is perceived at the resolution of the human visual system; for video see-through devices, both the real environment and the graphical overlays are perceived at the resolution of the video source or display.
- Limited field of view, due to the limitations of the applied optics.
- An imbalanced trade-off between heavy optics (which result in cumbersome and uncomfortable devices) and ergonomic devices with a low image quality.
- Visual perception issues due to the constant image depth. For optical see-through, since objects within the real environment and the image plane attached to the viewer's head are sensed at different depths, the eyes are forced either to continuously shift focus between the different depth levels or to perceive one depth level as unsharp. This is known as the fixed focal length problem, and it is more critical for see-through than for closed-view head-mounted displays. For video see-through, only one focal plane exists: the image plane.
- Conventional optical see-through devices are incapable of providing consistent occlusion effects between real and virtual objects. This is due to the mirror beam combiners, which reflect the light of the miniature displays, interfering with the transmitted light of the illuminated real environment.

b) Retinal Displays

Retinal displays utilize low-power semiconductor lasers (or, in the future, special light-emitting diodes) to scan modulated light directly onto the retina of the human eye, instead of providing screens in front of the eyes (see Figure 9). This produces a much brighter and higher-resolution image, with a potentially wider field of view than a screen-based display.

Figure 9. Retinal display system.

Current retinal displays share many shortcomings with head-mounted displays. However, some additional disadvantages can be identified for existing versions:

- Only monochrome (red) images are presented, since cheap low-power blue and green lasers do not yet exist.
- The sense of ocular accommodation is not supported, because the ocular motor system is completely bypassed by scanning directly onto the retina; consequently, the focal length is fixed.
- Commercial stereoscopic versions do not yet exist.

The main advantages of retinal displays are their brightness and contrast and their low power consumption, which make them well suited to mobile outdoor applications. Future generations also hold the potential to provide dynamic re-focus, full-colour stereoscopic images, and an extremely high resolution and large field of view.

c) Head-Mounted Projectors

Head-mounted projective displays (HMPDs) redirect the frustum of miniature projectors with mirror beam combiners so that the images are beamed onto retro-reflective surfaces located in front of the viewer (see Figure 10). A retro-reflective surface is covered with many thousands of micro corner cubes. Since each micro corner cube has the unique optical property of reflecting light back along its incident direction, such surfaces reflect brighter images than normal surfaces, which diffuse light. Note that this is similar in spirit to the holographic films used for transparent projection screens; however, those films are back-projected, while retro-reflective surfaces are front-projected. Other displays that utilize head-attached projectors are projective head-mounted displays (PHMDs) [13]. These beam the generated images onto regular ceilings, rather than onto special surfaces facing the viewer, and two half-silvered mirrors are used to integrate the projected stereo image into the viewer's visual field. Their functioning differs from that of head-mounted projectors and can rather be compared to optical see-through head-mounted displays; in this case, however, the images are not displayed on miniature screens but projected onto the ceiling before being reflected by the beam splitter.

Figure 10. Head-mounted projective displays.

Head-mounted projective displays decrease the effect of the inconsistency between accommodation and convergence that affects HMDs. Both head-mounted projective displays and projective head-mounted displays also address other problems related to HMDs: they provide a larger field of view without the application of additional lenses that introduce distorting aberrations, and they prevent the incorrect parallax distortions caused by inter-pupil distance (IPD) mismatch that occur when HMDs are worn incorrectly (e.g., if they slip slightly from their designed position).

However, current prototypes also have the following shortcomings:

- The integrated miniature projectors/LCDs offer limited resolution and brightness.
- Head-mounted projective displays might require special display surfaces (i.e., retro-reflective surfaces) to provide bright images, stereo separation, and multi-user viewing.
- For projective head-mounted displays, the brightness of the images depends strongly on the environmental light conditions.
- Projective head-mounted displays can only be used indoors, since they require the presence of a ceiling.

Such displays technically tend to combine the advantages of projection displays with those of traditional head-attached displays. However, the need to cover the real environment with a special retro-reflective film material makes HMPDs more applicable to virtual reality applications than to augmenting a real environment.


Hand-held displays

Conventional examples of hand-held displays, such as Tablet PCs or cell phones, generate images within arm's reach. All of these devices combine processor, memory, display, and interaction technology into a single unit, and aim at supporting wireless and unconstrained mobile handling. Video see-through is the preferred concept for such approaches: integrated video cameras capture live video streams of the environment, which are overlaid with graphical augmentations before being displayed (see Figure 11).

Figure 11. Video see-through hand-held displays.

However, optical see-through hand-held devices also exist. Stetten et al. [14], for instance, introduced a device for overlaying real-time tomographic data. It consists of an ultrasound transducer that scans ultrasound slices of the objects in front of it. The slices are displayed time-sequentially on a small flat-panel monitor and are then reflected by a planar half-silvered mirror in such a way that the virtual image is exactly aligned with the scanned slice area. Stereoscopic rendering is not required in this case, since the visualized data is two-dimensional and appears at its correct three-dimensional location. Hand-held mirror beam combiners can be used in combination with large, semi-immersive or immersive screens to support augmented reality tasks with rear-projection systems [15]. Tracked mirror beam combiners act as optical combiners that merge the reflected graphics, displayed on the projection plane, with the transmitted image of the real environment.

Yet another interesting display concept is described by Raskar [16]. It proposes the application of a hand-held and spatially aware video projector to augment the real environment with context-sensitive content. A combination of a hand-held video projector and a camera was also used by Foxlin and Naimark of InterSense to demonstrate the capabilities of their optical tracking system. This concept represents an interesting application of AR to the fields of architecture and maintenance.

There are disadvantages to each individual approach:

• The image analysis and rendering components are processor- and memory-intensive. This is critical for low-end devices such as PDAs and cell phones and might result in an excessively high end-to-end system delay and low frame rates. Such devices often lack a floating point unit, which makes precise image processing and fast rendering even more difficult.

• The limited screen size of most hand-held devices restricts the covered field of view. However, moving the image to navigate through an information space that is essentially larger than the display device exploits a visual perception phenomenon known as the Parks effect [17]: moving a scene on a fixed display is not the same as moving a display over a stationary scene, because of the persistence of the image on the viewer's retina. Thus, if the display can be moved, the effective size of the virtual display can be larger than its physical size, and a larger image of the scene can be left on the retina.


• The optics and image sensor chips of the cameras integrated in consumer hand-held devices are targeted at other applications and consequently provide limited quality for image processing tasks (e.g., usually high barrel distortion). For example, they do not allow the focus to be modified, and fixed-focus cameras are only effective within a certain depth range. This also applies to the image output of hand-held projectors when they are used to augment surfaces with a certain depth variance, since projected images can only be focused on a single plane.

• Compared to head-attached devices, hand-held devices do not provide a completely hands-free working environment.

Hand-held devices, however, represent a real alternative to head-attached devices for mobile applications. Consumer devices, such as PDAs and cell phones, have a large potential to bring AR to a mass market. More than 500 million mobile phones were sold worldwide in 2004, and it has been estimated that by the end of 2005 over fifty percent of all cell phones would be equipped with digital cameras. Today, a large variety of communication protocols allows the transfer of data between individual units, or access to larger networks, such as the Internet.

Leading graphics board vendors are about to release new chips that will enable hardware-accelerated 3D graphics on mobile phones, including geometry processing and per-pixel rendering pipelines. Variable-focus liquid lenses will enable dynamic and automatic focus adjustment for mobile phone cameras, supporting better image processing. Some exotic devices even support auto-stereoscopic viewing (e.g., Sharp), GPS navigation, or the scanning of RFID tags. Due to the rapid technological advances of cell phones, the distinction between PDAs and mobile phones might soon be history. Compact hand-held devices, especially mobile phones, are clearly becoming platforms with the potential to bring augmented reality to a mass market. This will influence application areas such as entertainment, edutainment, service, and many others.

Spatial Displays

In contrast to body-attached displays, spatial displays detach most of the technology from the user and integrate it into the environment. Three different approaches exist, which mainly differ in the way they augment the environment: using either video see-through, optical see-through, or direct augmentation.

Figure 12. Example of a screen-based video see-through display: the locomotion of a dinosaur is simulated over a physical footprint [18].


Screen-Based Video See-Through Displays

Such systems make use of video mixing and display the merged images on a regular monitor (see Figure 12). Within an augmented reality context, the degree of immersion into an augmented real environment is frequently expressed by the amount of the observer's visual field (i.e., the field of view) that can be superimposed with graphics. In the case of screen-based augmented reality, the field of view is limited and restricted to the monitor size, its spatial alignment relative to the observer, and its distance from the observer. For screen-based augmented reality, the following disadvantages exist:

• Small field of view that is due to relatively small monitor sizes. However, the screen-size is scalable if projection displays are used.

• Limited resolution of the merged images (especially dissatisfying is the limited resolution of the real environment’s video image).

• Mostly remote viewing, rather than supporting a see-through metaphor.

Screen-based augmentation is a common technique when mobile applications or optical see-through do not have to be supported. It probably represents the most cost-efficient AR approach, since only off-the-shelf hardware components and standard PC equipment are required.

Spatial Optical See-Through Displays

In contrast to head-attached or hand-held optical see-through displays, spatial optical see-through displays generate images that are aligned within the physical environment. Spatial optical combiners, such as planar or curved mirror beam combiners, transparent screens, or optical holograms [19], are essential components of such displays. Spatial optical see-through configurations have the following shortcomings:

• They do not support mobile applications because of the spatially aligned optics and display technology.

• In most cases, the applied optics prevents a direct manipulative interaction with virtual and real objects that are located behind the optics. Exceptions are reach-in configurations either realized with see-through LCD panels [20] or mirror beam combiners [21].

• The number of observers that can be supported simultaneously is restricted by the applied optics.

• Mutual occlusion between the real and virtual environments is not supported, for the same reasons as for optical see-through head-mounted displays.

• Due to the limited size of screens and optical combiners, virtual objects outside the display area are unnaturally cropped. This effect is known as window violation and is also present for semi-immersive virtual reality displays.

The general advantages of spatial optical see-through displays are easier eye accommodation and vergence, higher and scalable resolution, a larger and scalable field of view, improved ergonomic factors, easier and more stable calibration, and a better controllable environment (e.g., tracking, illumination, etc.). This can lead to more realistic augmented environments.

Projection-Based Spatial Displays

Projector-based spatial displays use front-projection to seamlessly project images directly onto physical objects' surfaces, instead of displaying them on an image plane (or surface) somewhere within the viewer's visual field (see Figure 13). Single static or steerable projectors, or multiple projectors, are used to increase the potential display area and enhance the image quality.


Figure 13. ShaderLamps with the Taj Mahal: the white wooden model is illuminated; the scanned geometry of the Taj Mahal is augmented to add texture and material properties; the geometry is then registered to the real Taj Mahal and displayed from the projector's viewpoint [22].

A stereoscopic projection, and consequently the technology to separate the stereo images, is not necessarily required if only the surface properties (e.g., color, illumination, or texture) of the real objects are changed by overlaying images: in this case, correct depth perception is still provided by the physical depth of the objects' surfaces. However, if three-dimensional graphics are to be displayed in front of or behind the objects' surfaces, a view-dependent, stereoscopic projection is required, as for other oblique screen displays [23].

Projector-based spatial displays introduce several new problems:

• Shadow-casting of the physical objects and of interacting users, due to the front-projection used. Multi-projector configurations can solve this problem.
• Restriction of the display area, which is constrained to the size, shape, and colour of the physical objects' surfaces (for example, no graphics can be displayed beside the objects' surfaces if no projection surface is present). Multi-projector configurations can solve this problem.
• Restriction to a single user in case virtual objects are displayed with non-zero parallax. Multi-user projector configurations can solve this problem.
• Conventional projectors can focus only on a single focal plane located at a constant distance, so projecting images onto non-planar surfaces causes blur. Exceptions are laser projectors, which do not suffer from this effect.
• Increased complexity of consistent geometric alignment and colour calibration as the number of applied projectors increases.

On the other hand, projector-based spatial displays overcome some of the shortcomings related to head-attached displays: improved ergonomics, a theoretically unlimited field of view, a scalable resolution, and easier eye accommodation (because the virtual objects are typically rendered near their real-world location).

2.2 Visualization libraries

• Java 3D
• Open Inventor
• VTK
• OpenSG
• OpenSceneGraph


2.2.1 Java 3D

Brief description: Compared to other solutions, Java 3D is not just a wrapper around lower-level graphics APIs such as OpenGL and Direct3D, but an interface that encapsulates graphics programming in a truly object-oriented way. A scene is constructed using a scene graph, a representation of the objects to be shown, structured as a tree containing the elements necessary to display those objects. Additionally, Java 3D offers extensive spatialized sound support.

Features:
• Multithreaded scene graph structure
• Platform independent
• Generic real-time API, usable for both visualization and gaming
• Support for retained, compiled-retained, and immediate mode rendering
• Includes hardware-accelerated JOGL, OpenGL and Direct3D renderers (depending on platform)
• Sophisticated virtual-reality-based view model with support for stereoscopic rendering and complex multi-display configurations
• Native support for head-mounted displays
• CAVE support (multiple screen projectors)
• 3D spatial sound
• Programmable shaders, supporting both GLSL and Cg
• Stencil buffer
• Importers for most mainstream formats, such as 3DS, OBJ, VRML, X3D, NWN, and FLT

Technical specs:
Website: https://java3d.dev.java.net
Standards: OpenGL, Direct3D
Compatible platforms: platform independent
License: http://java.sun.com/products/java-media/3D/java3d-1_3-license.html

2.2.2 Open Inventor

Brief description: Open Inventor™ is an object-oriented 3D toolkit offering a comprehensive solution to interactive graphics programming problems. It presents a programming model based on a 3D scene database that simplifies graphics programming. It includes a rich set of objects such as cubes, polygons, text, materials, cameras, lights, trackballs, handle boxes, 3D viewers, and editors that speed up programming time and extend the 3D programming capabilities. Open Inventor:

• is built on top of OpenGL®
• defines a standard file format for 3D data interchange
• introduces a simple event model for 3D interaction
• provides animation objects called Engines
• provides high performance object picking
• is window system and platform independent
• is a cross-platform 3D graphics development system
• supports PostScript printing
• encourages programmers to create new customized objects
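To give a flavour of the scene-database model, here is a minimal "Hello Cone" sketch in the style of the Inventor Mentor examples; it assumes the SoXt window binding (other Inventor ports use SoWin or SoQt instead), so treat the exact headers and viewer class as version-dependent.

```cpp
// Minimal Open Inventor scene: a separator node groups a material and a cone,
// and an examiner viewer supplies camera, light and trackball interaction.
#include <Inventor/Xt/SoXt.h>
#include <Inventor/Xt/viewers/SoXtExaminerViewer.h>
#include <Inventor/nodes/SoSeparator.h>
#include <Inventor/nodes/SoMaterial.h>
#include <Inventor/nodes/SoCone.h>

int main(int argc, char **argv) {
    Widget window = SoXt::init(argv[0]);      // initialize Inventor and Xt
    if (window == NULL) return 1;

    SoSeparator *root = new SoSeparator;       // scene graph root
    root->ref();                               // reference-counted nodes

    SoMaterial *material = new SoMaterial;     // red diffuse material
    material->diffuseColor.setValue(1.0f, 0.0f, 0.0f);
    root->addChild(material);
    root->addChild(new SoCone);                // the geometry to display

    SoXtExaminerViewer *viewer = new SoXtExaminerViewer(window);
    viewer->setSceneGraph(root);
    viewer->setTitle("Hello Cone");
    viewer->show();

    SoXt::show(window);
    SoXt::mainLoop();                          // hand control to the event loop
    return 0;
}
```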


Technical specs:
Website: http://oss.sgi.com/projects/inventor
Standards: OpenGL
Compatible platforms: platform independent
License: http://oss.sgi.com/projects/inventor/license.html (GNU)

2.2.3 VTK

Brief description: The Visualization ToolKit (VTK) is an open source, freely available software system for 3D computer graphics, image processing, and visualization. VTK consists of a C++ class library and several interpreted interface layers, including Tcl/Tk, Java, and Python, and is accompanied by a textbook published by Kitware. VTK has been implemented on nearly every Unix-based platform, on PCs (Windows 95/98/NT/2000/XP), and on Mac OS X Jaguar and later. The design and implementation of the library have been strongly influenced by object-oriented principles. The graphics model in VTK is at a higher level of abstraction than rendering libraries like OpenGL or PEX, which makes it much easier to create useful graphics and visualization applications. VTK applications can be written directly in C++, Tcl, Java, or Python; using the interpreted languages Tcl or Python with Tk, or Java with its GUI class libraries, it is possible to build useful applications very quickly. VTK supports a wide variety of visualization algorithms, including scalar, vector, tensor, texture, and volumetric methods, and advanced modeling techniques such as implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation. Moreover, dozens of imaging algorithms have been integrated directly into the system, so 2D imaging and 3D graphics algorithms and data can be mixed. Although VTK is freely available, commercial support is provided by Kitware. Dozens of other companies, ranging from large US government research labs to small firms selling custom postprocessors, use VTK, and it is widely used in academia for research and in courses on visualization and graphics.

Technical specs:
Website: http://www.vtk.org
Standards: OpenGL
Compatible platforms: platform independent
License: http://www.vtk.org/copyright.php
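To illustrate the level of abstraction, the following minimal C++ example builds the canonical VTK source-mapper-actor pipeline; the class names are standard VTK ones, while the project/build setup against the VTK libraries is omitted.

```cpp
// Canonical VTK pipeline: a cone source feeds a mapper, the mapper feeds an
// actor, and the actor is placed in a renderer shown in an interactive window.
#include <vtkConeSource.h>
#include <vtkPolyDataMapper.h>
#include <vtkActor.h>
#include <vtkRenderer.h>
#include <vtkRenderWindow.h>
#include <vtkRenderWindowInteractor.h>

int main() {
    vtkConeSource *cone = vtkConeSource::New();   // procedural geometry source
    cone->SetResolution(32);

    vtkPolyDataMapper *mapper = vtkPolyDataMapper::New();
    mapper->SetInputConnection(cone->GetOutputPort());  // connect the pipeline

    vtkActor *actor = vtkActor::New();            // places the mapped data
    actor->SetMapper(mapper);

    vtkRenderer *renderer = vtkRenderer::New();
    renderer->AddActor(actor);

    vtkRenderWindow *window = vtkRenderWindow::New();
    window->AddRenderer(renderer);

    vtkRenderWindowInteractor *interactor = vtkRenderWindowInteractor::New();
    interactor->SetRenderWindow(window);
    interactor->Initialize();
    interactor->Start();                          // hand control to the user

    // VTK objects are reference counted; Delete() releases our references.
    interactor->Delete(); window->Delete(); renderer->Delete();
    actor->Delete(); mapper->Delete(); cone->Delete();
    return 0;
}
```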

2.2.4 OpenSG
Brief description: OpenSG is a portable scenegraph system for creating real-time graphics programs, e.g. for virtual reality applications. It is developed following Open Source principles and can be used freely. It runs on Windows, Linux, Solaris and MacOS X and is based on OpenGL. OpenSG is built on the widely used and successful scene graph metaphor, which can be seen as an object-oriented representation of everything that needs to be displayed; the goal is to abstract away the complexities of lower-level APIs like OpenGL. Because the scene graph holds a full representation of the whole scene, it can also do many things a graphics driver cannot, such as high-level optimizations, or hiding the complexities of modern graphics algorithms, e.g. advanced lighting and shading. It is designed to be very general, suiting many different kinds of applications, from architectural and manufacturing scene visualization to highly dynamic and interactive virtual worlds. The main design and implementation features are:

Performance: OpenSG uses a wide variety of optimization techniques to make efficient use of the available horsepower.

Multi-threading: OpenSG supports a very general and flexible multi-threading model that gives totally independent threads write access to the scene graph without interfering with each other.

Clustering: Just a few years ago a large-screen stereo projection was affordable only for large companies or research centers. With the advent of cheap boardroom projectors and powerful graphics cards for standard PCs, the components for setting up large, high-quality display systems at reasonable prices are available. One limitation of current PC systems, however, is the number of outputs: most modern graphics cards have two outputs, and even with two cards in a system, four outputs are the limit. For large display systems, or to dedicate the full power of a graphics card to each projector, that is far from sufficient, and a cluster of PCs needs to be employed. In addition to driving large displays, clusters can also be used to render large scenes by merging the outputs from all cluster nodes onto a single screen; the power of a cluster far exceeds that of a single machine, even a high-end one. Cluster-aware software, however, is significantly more involved to write. OpenSG takes the pain out of clustering by supporting serialization of arbitrary scene graph changes: making a standalone application cluster-capable and having it drive a display like the HEyeWall typically takes less than an hour.

Extensibility: Owing to the ubiquity of high-powered 3D graphics cards, the number of application areas for interactive 3D graphics is growing steadily, and with it the number of requirements for a general scenegraph system. It is not possible to foresee which requirements future applications might have, so it is important to design a system that is open to application-specific and non-specific extensions. Being Open Source is a good first step, but it is not enough: some extensions might be too application-specific for the maintainers of the system to be comfortable integrating them, and if the system required source code changes for extensions, those changes would have to be remade by the application developer for each new release of the scenegraph system. OpenSG is designed to avoid this situation by using highly dynamic and flexible structures that can be easily extended or adapted by an application.

Portability: One of the motivating factors of the OpenSG development was the desire for a system usable on a wide variety of platforms. To that end, OpenSG is based on portable components such as OpenGL (or OpenGL ES) and Boost. It has support for some system-specific windowing options, but does not depend on them: if there is a way to open an OpenGL-capable window, there is a way to make OpenSG work.

Technical specs:
Website: http://opensg.vrsource.org/trac
Standards: OpenGL
Compatible platforms: Windows, Linux, Solaris and MacOS X
License: GNU Library or Lesser General Public License (LGPL)
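As a flavour of the API, the following minimal sketch follows the structure of the introductory OpenSG 1.x tutorials (GLUT binding assumed; header names and helper functions are those of the 1.x series and may differ in later releases):

#include <OpenSG/OSGGLUT.h>
#include <OpenSG/OSGConfig.h>
#include <OpenSG/OSGSimpleGeometry.h>
#include <OpenSG/OSGGLUTWindow.h>
#include <OpenSG/OSGSimpleSceneManager.h>

OSG_USING_NAMESPACE

SimpleSceneManager* mgr;

void display(void)                  { mgr->redraw(); }
void reshape(int w, int h)          { mgr->resize(w, h); glutPostRedisplay(); }
void mouse(int b, int s, int x, int y)
{
    if (s) mgr->mouseButtonRelease(b, x, y);
    else   mgr->mouseButtonPress(b, x, y);
}
void motion(int x, int y)           { mgr->mouseMove(x, y); glutPostRedisplay(); }

int main(int argc, char** argv)
{
    osgInit(argc, argv);                          // initialize OpenSG

    glutInit(&argc, argv);                        // open a GLUT window
    glutInitDisplayMode(GLUT_RGB | GLUT_DEPTH | GLUT_DOUBLE);
    int winid = glutCreateWindow("OpenSG");
    glutDisplayFunc(display);
    glutReshapeFunc(reshape);
    glutMouseFunc(mouse);
    glutMotionFunc(motion);

    GLUTWindowPtr gwin = GLUTWindow::create();    // wrap the GLUT window for OpenSG
    gwin->setId(winid);
    gwin->init();

    NodePtr scene = makeTorus(0.5, 2.0, 16, 16);  // a simple built-in test scene

    mgr = new SimpleSceneManager;                 // ties scene and window together
    mgr->setWindow(gwin);
    mgr->setRoot(scene);
    mgr->showAll();                               // frame the scene with the camera

    glutMainLoop();
    return 0;
}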

2.2.5 OpenSceneGraph
Brief description: The OpenSceneGraph is an open source, cross-platform graphics toolkit for the development of high-performance graphics applications such as flight simulators, games, virtual reality and scientific visualization. Based around the concept of a scene graph, it provides an object-oriented framework on top of OpenGL, freeing the developer from implementing and optimizing low-level graphics calls, and provides many additional utilities for rapid development of graphics applications.

Features: Written entirely in Standard C++ and OpenGL, it makes full use of the STL and design patterns, and leverages the open source development model to provide a development library that is legacy-free and focused on the needs of end users. The key strengths of OpenSceneGraph are its performance, scalability, portability and the productivity gains associated with using a fully featured scene graph. In more detail:

Performance: Supports view-frustum culling, occlusion culling, small-feature culling, Level of Detail (LOD) nodes, state sorting, vertex arrays and display lists as part of the core scene graph. OpenSceneGraph also supports easy customization of the drawing process, such as implementation of Continuous Level of Detail (CLOD) meshes on top of the scene graph (see the Virtual Terrain Project and Delta3D).

Productivity: The core scene graph encapsulates the majority of OpenGL functionality, including the latest extensions, provides rendering optimizations such as culling and sorting, and comes with a whole set of add-on libraries that make it possible to develop high-performance graphics applications very rapidly. The application developer is freed to concentrate on content, and on how that content is controlled, rather than on low-level coding. Combining lessons learned from established scene graphs like Performer and Open Inventor with modern software engineering methods like design patterns, along with a great deal of feedback early in the development cycle, has made it possible to design a library that is clean and extensible; this has made it easy for users to adopt OpenSceneGraph and to integrate it with their own applications. For reading and writing databases, the database library (osgDB) adds support for a wide variety of formats via an extensible dynamic plugin mechanism; the distribution now includes 45 separate plugins for loading various 3D database and image formats. 3D database loaders include OpenFlight (.flt), TerraPage (.txp) including multi-threaded paging support, LightWave (.lwo), Alias Wavefront (.obj), Carbon Graphics GEO (.geo), 3D Studio MAX (.3ds), Performer (.pfb), Quake Character Models (.md2), DirectX (.x), Inventor Ascii 2.0 (.iv)/VRML 1.0 (.wrl), Designer Workshop (.dw), AC3D (.ac) and the native .osg ASCII format. Image loaders include .rgb, .gif, .jpg, .png, .tiff, .pic, .bmp, .dds (including compressed mip-mapped imagery), .tga and QuickTime (under OS X). A full range of high-quality, anti-aliased fonts can also be loaded via the FreeType plugin. The scene graph also has a set of Node Kits, separate libraries that can be compiled into applications or loaded at runtime, which add support for particle systems (osgParticle), high-quality anti-aliased text (osgText), a special-effects framework (osgFX), OpenGL shader language support (osgGL2), large-scale geospatial terrain database generation (osgTerrain), and navigational light points (osgSim).

The community has also developed a number of additional Node Kits, such as osgNV (which includes support for NVIDIA's vertex, fragment and combiner extensions and NVIDIA's Cg shader language), Demeter (CLOD terrain plus integration with OSG), osgCal (which integrates Cal3D and OSG) and osgVortex (which integrates the CM-Labs Vortex physics engine with OSG); a whole set of libraries integrating the leading windowing APIs can be found in the WindowingToolkits section.

Portability: The core scene graph has been designed to have minimal dependency on any specific platform, requiring little more than Standard C++ and OpenGL. This has allowed the scene graph to be rapidly ported to a wide range of platforms: originally developed on IRIX, it was ported to Linux, then to Windows, then FreeBSD, Mac OS X, Solaris and HP-UX. The core scene graph library being completely windowing-system independent makes it easy for users to add their own window-specific libraries and applications on top. The distribution already includes the osgProducer library, which integrates with Open Producer, and in the Community/Applications section of the website one can find examples of applications and libraries written on top of GLUT, Qt, MFC, wxWindows and SDL; users have also integrated it with Motif and X.

Scalability: The scene graph will not only run on portables all the way up to Onyx InfiniteReality machines, but also supports the multiple graphics subsystems found on machines like a multipipe Onyx. Display lists, texture objects, and the cull and draw traversals have been designed to cache rendering data locally and use the scene graph almost entirely as a read-only structure. This allows multiple cull-draw pairs to run on multiple CPUs bound to multiple graphics subsystems.

Multi-language support: Java and Python bindings for the OpenSceneGraph are also available as community projects.

Technical specs:
Website: http://www.openscenegraph.com
Standards: OpenGL
Compatible platforms: IRIX, Linux, Windows, FreeBSD, Mac OSX, Solaris, HP-UX
License: GNU
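The typical usage pattern is just a few lines: load a model through an osgDB plugin and hand it to a viewer. The sketch below uses the osgViewer library of later OSG releases (the distribution contemporaneous with this report used osgProducer instead, with a very similar structure):

#include <osg/Node>
#include <osg/ref_ptr>
#include <osgDB/ReadFile>
#include <osgViewer/Viewer>

int main(int argc, char** argv)
{
    // Load any format for which an osgDB plugin exists (.osg, .flt, .3ds, ...)
    osg::ref_ptr<osg::Node> model =
        osgDB::readNodeFile(argc > 1 ? argv[1] : "cow.osg");
    if (!model) return 1;

    osgViewer::Viewer viewer;          // window, camera and event handling in one object
    viewer.setSceneData(model.get());
    return viewer.run();               // frames the model and enters the render loop
}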

3 Haptic systems

3.1 Haptic technology
Several reports on haptic technology have been published in the context of other research projects and are publicly available. The aim of this section is therefore to propose an update of what has already been published, which is duly referenced hereafter. The section starts by proposing a taxonomy of haptic technology for classifying the available technologies and devices along various dimensions; the following section provides examples of technologies belonging to the taxonomy.

3.1.1 A taxonomy of current haptic technologies
Current haptic technology is a varied field: devices range from the very small to the very large, and from the very simple to the very complex. This section provides an original, systematic taxonomy of all currently known types of haptic devices. Every large overview of a new field faces the problem of organizing the material in a logical and coherent way. In a very new and inherently unorganized field, a chronological, historical overview may be the logical choice: such an overview shows the winding paths that previous workers have taken, reviewing the devices produced so far without passing final judgment on the most suitable type of technology as yet. The excellent overview produced by the ENACTIVE Network of Excellence [1] is a prime example of this strategy. A more systematic approach may be taken if the purpose of the overview is limited to a clear subset of devices: the T'nD project produced a ranking of the devices in its applicable subset along dimensions of performance metrics [2]. Hayward [3] orders his own recent work on scales of size and level of rendering detail. We will follow this idea, but apply it much more rigorously, to the whole field of haptics known today.

3.1.2 Possible dimensions in the taxonomy
The common denominator of all haptic devices is that they all seek to make an impression on the tactile or kinesthetic senses of the user. These senses act on different size scales: the smallest scale is perceived within the skin (tactile sense), and the larger scales mostly in tendons, joints and muscles (kinesthetic senses, usually called haptic in the more limited sense of the word). A closely linked issue is the relationship between the total physical workspace of the device and the part of that workspace that is displayed simultaneously at any given point in time. This is related to the necessary number of degrees of freedom (DOFs) of the device, which we discuss in some detail later. The level of detail in the haptic world could be termed the "spatial frequency content" of the haptic experience.

Another important frequency is the temporal one. Devices vary in their frequency response, i.e. the speed with which they can react to external forces, or display virtual forces to the user. This quality is closely linked to the range of haptic experiences that a single device can stably provide, ranging from the sensation of free air to that of a high-stiffness surface; the range of impedances that a device can render is sometimes called the "Z-width" of the device. Other important issues are the absolute size of the workspace and the maximum forces, as well as the co-location in time and space (co-registration) of the haptic modality with other modalities, typically graphical and auditory. Lastly, there are a number of implementation issues, including the type of control and the type of drive used, the kinematics and whether the device is grounded to the world or to the user, and the question whether the device is of the full-contact type or of the encountered type. We discuss the various dimensions of this taxonomy in some more detail in this section, and then give an overview of the haptic field ordered along these dimensions in the next section.

3.1.3 Size scales
Human users can haptically perceive the world around them on scales ranging from the reach of the human arm (order of 1 meter), through that of the spreading and grasping fingers of the hand (order of 0.1 meter), via details felt by the skin of the fingertips and palm of the hand (order of 0.01 to 0.001 meter, i.e. down to 1 millimeter), all the way down to surface roughness or texture, which may live at the micrometer level and can only be explored by sliding the fingers over a surface. The first scale is perceived by movements of the arm, with receptors in joints, tendons and muscles; it is usually referred to as the kinesthetic or haptic scale. The second scale is felt by exploring and grasping movements of the fingers, perceived through a combination of kinesthetic cues and the skin deformation indicating the orientation of the surface touched by the fingers; it is usually referred to as tactile/haptic. The third scale is explored by various receptors in the skin, and referred to as tactile. The fourth scale is perceived by high-frequency receptors in the skin, and is often referred to as vibrotactile. A complete haptic interface would be able to display all of these scales.

3.1.4 Degrees of freedom (DOFs)
The number of degrees of freedom indicates the level of detail that the device can render simultaneously at any given time.

a) Point-based force feedback interfaces

Classical haptic devices are point-based, i.e. they give force feedback at a single point. A general force has components in the three directions X, Y and Z, so these devices are 3-DOF; they typically have three motors (or brakes, in the case of passive devices). The natural extension is to render torques at the same point of interest. This leads to 6-DOF devices, usually joystick-like interfaces that exhibit torques as well as forces. The six degrees of freedom can also be used to render forces at two closely spaced points, which the user can grasp with two fingers: the torque around the line connecting the two points is traded for control over the distance between the two fingers, so pinching and grasping forces can be rendered. Such devices do not exist as products, but they can be built by combining two 3-DOF devices, although this usually leads to workspace conflicts.

b) Tilting (and curving) surfaces

Some experimental devices add local degrees of freedom to the end-effector of point-based force feedback devices. These may take the shape of small plates tilting under the finger, or thimbles tilting on a movable roller. Sometimes these effects take the place of "normal" degrees of freedom, substituting a haptic illusion for true 3-D motion (Morpheotron).

c) Shape and contour interfaces
Shape interfaces exist at various scales, from the larger contour devices to tactile "pincushion"-type devices. They are usually based on some form of grid. The number of degrees of freedom often equals the number of grid points, but in some cases each grid point can do more than just rise vertically, and the number of DOFs may be even larger. In any case, the number of DOFs of shape interfaces is usually much larger than in point-based force feedback interfaces, but the workspace of each individual actuator is usually much smaller.

3.1.5 Grounding and kinematics
Haptic devices can be either fixed to the inertial world (grounded) or worn by the user, i.e. attached to the user's shoulder, arm or finger ("exoskeleton" or "wearable"). Examples will be shown and discussed in the next section.

A mechanical division which is sometimes relevant is that between "serial" and "parallel" mechanisms, depending on whether the actuators all follow each other link by link, as in a classical robotic arm, or all link directly to the end-effector, as in a classical six-leg flight simulator motion platform. Hybrid forms do exist, and a well-known 3-DOF parallel robot is the Force Dimension "Delta" robot (see Appendix A for details).

Mechanical redundancy is sometimes used to increase the rotational workspace, or to avoid difficult poses known as "singularities" or "gimbal lock". Redundancy means that more actuators are used than the final number of DOFs at the end-effector would strictly require. A special form of redundant, parallel mechanisms is formed by the "wire" robots like the SPIDAR, in which the legs are replaced by cables; one or two extra cables are always needed to keep the other cables pretensioned.

3.1.6 Drive type
Haptic devices can be active, passive or hybrid. Active devices have some form of drive capable of adding energy to the device; this drive can be electrical, hydraulic, pneumatic, etc. We leave out some exotic devices based on magnetic levitation or other technologies which, from first principles, will never be able to reach the levels of force or workspace relevant to the applied technology community. Passive devices employ some form of braking, e.g. electro-mechanical or electro-rheological. Hybrid devices use braking for large forces and stiffness, and power for the more subtle effects. A special case is formed by nonholonomic devices, which use the sideways blocking action of actively oriented wheels to define surfaces passively. The name "cobots" has been coined for such robots, to emphasize their safety in the presence of human users. Since the non-side-slipping wheels act in a way similar to brakes, cobots form a special family of the hybrid devices.

3.1.7 Control type
Force feedback devices come in two types: impedance controlled and admittance controlled. The majority of devices are impedance controlled. Impedance-controlled devices are mechanically designed to "render" free air, i.e. low mass and low friction when passive, and to render virtual walls by energizing the motors to give the user a resisting force. Their causality paradigm is: the user inputs displacement into the device, and the device responds with force. Admittance controllers are the dual: they carry a force sensor at the interface to the user, and their causality paradigm is: the user inputs force into the device, and the device responds with a displacement. Admittance-controlled devices are usually built much stiffer and more robust, since their internal displacement control loop can cancel their own friction and, to a large extent, also their own inertia.
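To make the impedance causality concrete, the following is a minimal C++ sketch of a virtual-wall servo loop; the device I/O functions and the numerical gains are hypothetical placeholders, not the API of any particular device:

// Hypothetical device I/O stubs: replace with the driver calls of a real device.
double readDevicePositionZ()             { return 0.0; }  // probe z position [m]
void   commandDeviceForceZ(double /*N*/) {}               // commanded z force [N]

int main()
{
    const double wall_z    = 0.0;     // location of the virtual wall surface [m]
    const double stiffness = 1500.0;  // [N/m], bounded by the device's stable "Z-width"
    const double damping   = 5.0;     // [N·s/m], damps contact oscillations
    const double rate      = 1000.0;  // servo rate [Hz]; haptic loops typically run at ~1 kHz

    double prev_z = readDevicePositionZ();
    for (;;)  // in a real driver this loop is paced by the device scheduler
    {
        double z  = readDevicePositionZ();
        double vz = (z - prev_z) * rate;   // finite-difference velocity estimate
        prev_z = z;

        double fz = 0.0;
        if (z < wall_z)                    // the probe has penetrated the wall
            fz = stiffness * (wall_z - z) - damping * vz;

        commandDeviceForceZ(fz);           // impedance causality: position in, force out
    }
}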

3.1.8 Contact type
A major dichotomy in haptic devices is whether the user is in continuous contact with the device even when not touching anything in the virtual world, or whether the device is only contacted when something is "touched" in the virtual world. The latter type is called the "encountered" type of device. Most force feedback devices are of the first type and most shape and tactile devices are of the latter type, but there are exceptions to this rule.

The classical VR application of a control stick or a steering wheel in a flight simulator is technically an encountered device, although the user stays in full contact with it during most of the simulation. The force feedback part of the device is more like a classical full-contact device, but the shape of the stick and the first contact with the surface are definitely encountered. Similar considerations apply to medical simulators for minimally invasive surgery, which typically use scissors handles as an interface to the user.

More general encountered devices come in two distinct varieties. There are those devices which try to copy the whole virtual object, which remains stationary in the workspace. These devices need a great level of detail, but do not need to adapt very fast.

Then there are devices which copy the shape of a small part of the virtual object: the device needs to adapt its shape quickly, but does not need the full number of DOFs to render the whole object. Pincushion-type tactile devices typically fall into this category, but larger shape displays also qualify. The shape of the part of the virtual object that is momentarily rendered depends on the position of the user's hand in the workspace. A limiting case is a very simple shape display, perhaps just a flat surface oriented in the same direction as the surface of the virtual object. The taxonomy dimensions are summarized in the following table.

Taxonomy dimension        Description
SIZE SCALES               Human perception of the world according to scaling factors.
DOF                       Level of detail that the device can render simultaneously at any given time.
GROUNDING AND KINEMATICS  Whether the haptic device is fixed to the inertial world or worn by the user.
DRIVE TYPE                Active, passive or hybrid device.
CONTROL TYPE              Impedance-controlled or admittance-controlled device.
CONTACT TYPE              Continuous contact of the user with the device, or "encountered" type of device.

Table 2: Taxonomy of haptic devices

3.2 Haptic devices

3.2.1 Existing force feedback displays
A number of different haptic devices are currently available on the market and in research, most of them oriented to specific tasks or designed for a specific interaction. This section provides a short overview of the most relevant technologies available, divided by the function they perform.

3.2.2 1-DOF and 2-DOF displays
There are a number of mostly experimental designs of haptic machines with a very limited number of degrees of freedom; these devices are mainly relevant as design exercises for devices with three or more degrees of freedom. The most renowned at the moment is the commercially available knob produced by Immersion Corp. and mounted on BMW vehicles under the name iDrive. Because of their intrinsically limited functionality, which is essentially limited to graphical menu navigation, these devices are not listed here extensively.

3.2.3 3-DOF and 6-DOF displays
The largest workspaces of all haptic devices are typically spanned by classical point-based force feedback devices. These devices typically have a low number of degrees of freedom, ranging from 3 (point-based force feedback interfaces) to 6 (including torques). The level of spatial detail that these devices can render may be quite high, but this detail is not rendered simultaneously. This is why we consider these devices to be at one extreme end of our taxonomical continuum: they have the largest workspace, and the lowest level of simultaneous detail, of all haptic devices considered.

a) Moog-FCS HapticMaster
The Moog-FCS HapticMaster is a 3-DOF haptic device and the only admittance-controlled haptic device available on the market. Typically this device is used for rehabilitation tasks (for measuring the motion performance of the patient) or for training simulation tasks (welding simulation, for example). Recently this device has been employed in shape modelling operations within the European Touch and Design project. The great benefit of this device is the large force it can display (250 N), which, besides its intrinsic spatial accuracy, makes it one of the strongest devices available. The working volume, compared to other grounded displays, is extremely large, encompassing approximately 400 mm × 300 mm × 4000 mm.

Figure 14: Moog-FCS HapticMASTER

b) Force Dimension Delta
The Force Dimension Delta and Omega devices are well-engineered commercial implementations of a 3-DOF parallel robot architecture called the "Delta robot". The Delta offers 3 active degrees of freedom in translation and was designed to display high-quality kinesthetic information: it can convey a large range of forces (20 N) over a relatively large workspace (360 mm × 300 mm) while maintaining a high accuracy (0.03 mm), thanks to its electromagnetic brakes.

Figure 15: Force Dimension Delta

c) SensAble PhanTom
The PhanTom was the very first haptic device developed and marketed. Two different versions are currently on the market: the PhanTom Omni and the PhanTom Desktop. Both are 6-DOF haptic devices (3 degrees in translation and 3 in rotation) with a working space of 160 mm × 120 mm × 120 mm (data referring to the PhanTom Desktop model), a maximum exertable force of 7.9 N and a nominal resolution of 0.03 mm. Contrary to the Moog-FCS HapticMaster, it is impedance controlled; consequently the user perceives some inertial forces while performing the task. What makes this tool particularly interesting is that it is commercially available for a relatively low price (around 2000 €) and is applicable to a wide range of simulation and modelling tasks.

Figure 16: SensAble Phantom Desktop

d) SPIDAR G
Sato [Sato; 2004] created one of the best-known examples of the parallel cable-type device, the SPIDAR (SPace Interface Device for Artificial Reality), an efficient yet cost-effective haptic interface with 7 DOFs (version G). This device is based on 4 servo motors positioned at the edges of the working cage plus (in the latest version) another motor positioned in the end-effector (typically a ball), providing a further rotational degree of freedom. The great benefit of this device is its scalable architecture, which provides different workspace dimensions (from desktop-based applications to immersive ones) with different force ranges. Currently this haptic interface is not marketed; it is a research prototype developed at the Tokyo Institute of Technology.

Figure 17: Haptic interface SPIDAR G

3.2.4 Exoskeleton or humanoid type
Two main typologies of haptic interfaces belong to this category, depending on their architecture: they can be grounded or wearable.

a) Grounded

The serial, grounded "exoskeleton" or humanoid robots at first sight do not distinguish themselves very clearly from the serial, grounded "opposing" type discussed before. However, we distinguish them here on the basis of their typical "cooperating" type of setup. This type of exoskeleton or humanoid arm is typically grounded near the user's shoulder, and its kinematics and number of degrees of freedom closely follow those of the human arm and, in some cases, the human hand. The number of actuator degrees of freedom is typically larger than in a serial device of the "opposing" type, and sometimes redundancy is present in the number of actuators relative to the number of degrees of freedom of the final end-effector.

PERCRO Arm Exoskeleton
The PERCRO grounded exoskeleton and hand is one of the highest-quality haptic research devices found to date. Virtual objects can be touched more or less convincingly, but exploring their surface, especially with more than one finger, is still unsatisfactory.

Figure 18: PERCRO Arm Exoskeleton

b) Wearable
Wearable exoskeletons are usually of the hand or finger type. Reaction forces are set off on the lower arm or shoulder of the user, so high forces cannot be rendered realistically. Some of the less spectacular wearable haptic arm types will be mentioned further down under combination devices. Active wearable devices are almost always of the hand-interface type, and we discuss them under the grasping interfaces.

3.2.5 Existing grasping displays
Grasping displays are typically the next step down from point-based interfaces in terms of workspace, and the next step up in (extra) number of degrees of freedom. The workspace is typically that of the human hand, and the number of degrees of freedom is that of the number of fingers. There are two common types of grasping display: one strategy is to use two or more point-based interfaces to render forces to index finger and thumb; the other strategy is to add brakes or motors to the bending of each individual finger. These devices are typically wearable, and the reaction forces are typically set off on the wrist or lower arm of the wearer.

PERCRO Pure Form hand exoskeleton
The Pure Form project uses a combination of wearable and grounded haptic devices to present force feedback to two separate fingers in a large workspace. The "wearable" two-finger exoskeleton can be mounted on the grounded "Exos" arm exoskeleton to create a grounded 5-DOF haptic display for two fingers.

CyberGrasp
The CyberGrasp device is a powered glove in which the fingers are driven by Bowden cables (push-pull cables in a sheath). It can be worn on the lower arm, although a lot of flexible cabling to ground restricts the movements of the user. It can also be combined with the CyberForce haptic arm. The overall impression of both devices is very poor: neither strong nor precise forces can be rendered, and the impression of touching a virtual object is never achieved; only some random jerky motions are perceived when approaching a virtual object.

Figure 19: Immersion Corporation CyberGrasp

Existing tactile displays
Tactile displays are invariably of the rectangular grid type. The most common is the "pincushion" type of vertical pins with small (millimeter) vertical displacement. Electrical, pneumatic-suction and vibrating stimulation also occur. Hayward's sideways-moving "comb"-type pins are a recent innovation.

Harvard tactile display using RC servomotors

The Harvard tactile display design [4] uses an array of stimulators that contact the skin to achieve a force distribution on the fingertip. This system, using RC servomotors, can achieve a high bandwidth, high actuator density, large vertical displacement, and firm static response for a relatively low cost and simple construction. As for resolution, it vertically actuates a 6 × 6 array of 36 mechanical pins at 2 mm spacing over a height range of 2 mm with 4-bit resolution, and with a relatively fast response: for a vertical displacement of 2 mm, the 10%-to-90% rise time is 41 ms.

Figure 20: Harvard Tactile Display

SmartTouch

This is an electro-cutaneous skin stimulator from the University of Tokyo [5], substituting electrical impulses to the tactile nerves for physical pins. The SmartTouch prototype is composed of two layers: the top layer has a 4 × 4 array of stimulating electrodes on the front side of a thin plate, while the bottom layer has optical sensors on the reverse side of the plate. Visual images captured by the sensor are translated into tactile information and displayed through electrical stimulation.

Figure 21: SmartTouch tactile display

STReSS The STReSS device is of the pin-cushion type, but with sideways displacement of skin. The device is an interesting variation on the normal pin cushion type of display. The underlying idea is that skin sensations can be aroused by stretching the skin sideways, instead of pushing it in and out. A comb-like set of piezo strips is pulled sideways to generate tactile sensations. The principle is promising, but prototypes still leave something to be desired.

The idea that tactile sensation is generated purely by shear deformation of the skin, not by any actual depth cues, is supported by psychophysical experiments in which the skin is locally sucked in by a small air orifice in a flat surface. The resulting sensation is that of a pin prick. Apparently the skin has no way of knowing whether it is being stretched by an inward or an outward local "bulge", and the default expectation is that it is being pushed in by an object from outside.

3.2.6 Existing vibro-tactile and friction displays
The sensation of "incipient slip" is important for grasping and lifting (virtual) objects: we tend to pick up things by pinching just hard enough to prevent them from slipping through our fingers. The phenomenon of slipping is sometimes referred to as "tactile flow" in the psychophysical literature, and some research has been performed on barber-pole-type illusions that can occur in the same way as in optical flow. No really feasible device has yet been developed to display the tactile flow sensation. Two approaches seem open: either the sensation is represented by vibratory inputs to the skin of the fingers with a "conventional" tactile display, or the actual slip condition is physically applied. The latter could conceivably be done by a miniaturised treadmill device. Full-size treadmills are sometimes fitted onto moving-base platforms to form a combination device for "displaying" an undulating road for subjects to walk on. Similarly, a "miniature belt sander" could be mounted upside down on a haptic device to display the slipping surface, or to keep the display surface stationary in inertial space while the haptic base moves under it. Changing the direction of slip movement quickly would remain an unsolved problem with a belt-type device. The Johns Hopkins University did some experiments with an inverted trackball.

Exeter vibratory tactile display
Summers created a vibratory tactile display capable of rendering vibrations in the range from 25 Hz up to 400 Hz. It uses a 10 × 10 array of points, individually driven by piezo-electric actuators. The device has been used for psychophysical research into skin stimulation using patterns of vibration. No explicit effort to simulate the (incipient) slip condition was reported.

Johns Hopkins slip display
The Johns Hopkins tactile haptic slip display is a motor-driven trackball: the user's finger rests on the ball and experiences slip when the ball rotates. The device has been mounted onto a Phantom haptic device and used for some basic psychophysical research on JNDs (Just Noticeable Differences) in tactile slip perception. The reference also points to a publication by Chen and Marcus on an earlier "belt sander" type of 1-DOF slip display.

3.2.7 Conclusion on the state of the art in large-scale haptic displays

When we survey the field of existing haptic devices, we come to the following conclusions. No satisfactory demonstration of the sense of touching and exploring the shape of an object using point-based devices with powered gloves and/or tactile interfaces has been reported to date. Exploration of simple, generic predefined objects in virtual space has been demonstrated by Tachi, but its usefulness is very limited. In fact, exploration of an arbitrary virtual surface by hand has never been demonstrated by any haptic device. Preliminary results using haptic illusions have been obtained for very low curvature surfaces by Hayward [6]. The European Touch and Design project has shown an early prototype of a shape display for surfaces with local order of curvature up to G2, i.e. with a single degree of curvature in two arbitrary principal directions. The area is very new indeed.

3.3 Haptic libraries
• CHAI3D
• OpenHaptics Toolkit
• H3D/VHTK
• Haptik
• OpenSceneGraph Haptics

3.3.1 CHAI3D
Brief description: CHAI 3D is an open source, freely available set of C++ libraries for computer haptics, visualization and interactive real-time simulation. CHAI 3D supports several commercially available three- and six-degree-of-freedom haptic devices, and makes it simple to support new custom force feedback devices. CHAI 3D is especially suitable for education and research purposes, offering a light platform on which extensions can be developed. CHAI 3D's support for multiple haptic devices also makes it easy to send applications to remote sites that may use different hardware. In short, CHAI 3D takes an important step toward developer-friendly creation of multimodal virtual worlds, by tightly integrating the haptic and visual representations of objects and by abstracting away the complexities of individual haptic devices.

Features
• OpenGL-based mono or stereo graphics
• Graphic and haptic rendering and collision detection for triangle meshes
• Importing of model files in .obj and .3ds format
• Haptic proxy for smooth interaction with surfaces
• A virtual haptic device, for running and testing haptic applications when a physical device is not available
• Communication with several digital/analog I/O boards
• Integration with ActiveX for web-embedded haptic applications
• Integration with ODE (Open Dynamics Engine) for haptic interaction with rigid body simulations (beta)

Supported Hardware
• Force Dimension DELTA and OMEGA haptic devices
• SensAble PHANToM haptic devices
• MPB Freedom 6S haptic devices
• Stereo glasses
• Servotogo boards
• Sensoray 626 boards

Supported Environments
• Microsoft Visual Studio 6
• Microsoft Visual Studio .NET
• Borland C++ Builder
• Cygwin / gcc
• Linux / gcc (beta)

Technical specs:
Website: http://www.chai3d.org
License: GNU GPL

3.3.2 OpenHaptics
Brief description: The OpenHaptics™ toolkit enables software developers to add haptics and true 3D navigation to a broad range of applications, including 3D design and modeling, medical, games, entertainment, visualization, and simulation. The toolkit is patterned after the OpenGL® API, making it familiar to graphics programmers and facilitating integration with OpenGL applications. Using the OpenHaptics toolkit, developers can leverage existing OpenGL code for specifying geometry and supplement it with OpenHaptics commands to simulate haptic material properties such as friction and stiffness. The extensible architecture enables developers to add functionality to support new types of shapes, and the toolkit is also designed to integrate third-party libraries such as physics/dynamics and collision detection engines. The OpenHaptics toolkit supports the range of SensAble™ PHANTOM® devices, from the low-cost PHANTOM® Omni™ device to the larger PHANTOM Premium devices. The current version of the toolkit supports Microsoft® Windows® XP and 2000. The OpenHaptics toolkit includes the Haptic Device API (HDAPI), the Haptic Library API (HLAPI), utilities, PHANTOM Device Drivers (PDD), and source code examples. The HDAPI provides low-level access to the haptic device, enables haptics programmers to render forces directly, offers control over configuring the runtime behavior of the drivers, and provides convenient utility features and debugging aids. The HLAPI provides high-level haptic rendering and is designed to be familiar to OpenGL® API programmers; it allows significant reuse of existing OpenGL code and greatly simplifies synchronization of the haptics and graphics threads. The PHANTOM Device Drivers support all shipping PHANTOM devices.
Technical specs:
Website: http://www.freeform-modeling.com/products/phantom_ghost/OpenHapticsToolkit-intro.asp
Compatible platforms: Microsoft® Windows® XP and 2000
License: free for academic users
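As a flavour of the HLAPI pattern described above, the fragment below captures ordinary OpenGL drawing calls into a haptic shape; it is a sketch based on the publicly documented HL calls, with drawTeapot() standing in for the application's own OpenGL geometry (hypothetical):

#include <HD/hd.h>
#include <HL/hl.h>

void drawTeapot() { /* the application's ordinary OpenGL drawing calls (stub) */ }

HHD    hHD;
HHLRC  hHLRC;
HLuint shapeId;

void initHaptics()
{
    hHD   = hdInitDevice(HD_DEFAULT_DEVICE);  // open the PHANTOM through HDAPI
    hHLRC = hlCreateContext(hHD);             // create an HLAPI rendering context
    hlMakeCurrent(hHLRC);
    shapeId = hlGenShapes(1);                 // reserve an id for one haptic shape
}

void renderFrame()
{
    hlBeginFrame();
    hlMaterialf(HL_FRONT_AND_BACK, HL_STIFFNESS, 0.7f);        // surface properties,
    hlMaterialf(HL_FRONT_AND_BACK, HL_STATIC_FRICTION, 0.2f);  // analogous to glMaterial

    // Capture the OpenGL geometry into the haptic feedback buffer
    hlBeginShape(HL_SHAPE_FEEDBACK_BUFFER, shapeId);
    drawTeapot();
    hlEndShape();

    hlEndFrame();
}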

3.3.3 H3D/VHTK
Brief description: H3D API is an open-source, cross-platform, scene-graph API. H3D is written entirely in C++ and uses OpenGL for graphics rendering and OpenHaptics for haptic rendering. H3D is built using many industry standards, including:

• X3D - http://www.web3d.org - the Extensible 3D file format that is the successor to the now outdated VRML standard. X3D, however, is more than just a file format - it is an ISO open standard scene-graph design that is easily extended to offer new functionality in a modular way.

• XML - http://www.w3.org/XML - Extensible Markup Language, XML is the standard markup language used in a wide variety of applications. The X3D file format is based on XML, and H3D comes with a full XML parser for loading scene-graph definitions.

• OpenGL - http://www.opengl.org - Open Graphics Library, the cross-language, cross-platform standard for 3D graphics.

• STL - The Standard Template Library is a large collection of C++ templates that support rapid development of highly efficient applications.

H3D has been carefully designed to be a cross-platform API. The currently supported operating systems are Windows XP, Linux and Mac OS X, though the open-source nature of H3D means that it can easily be ported to other operating systems. Unlike most other scene-graph APIs, H3D is designed chiefly to support a special rapid development process. By combining X3D, C++ and the scripting language Python, H3D offers three ways of programming applications, giving the best of both worlds: execution speed where performance is critical, and development speed where it is less critical. According to its developers, this blend of X3D, C++ and Python can cut development time by more than half compared with using only C++.
Technical specs:
Website: http://www.h3d.org
Compatible platforms: Windows XP, Linux and Mac OS X
License: GNU GPL, but also commercially available

3.3.4 Haptik
Brief description: The Haptik library is an open source library with a component-based architecture that acts as a hardware abstraction layer to provide uniform access to haptic devices. It does not contain graphic primitives, physics-related algorithms or complex class hierarchies; instead it exposes a set of interfaces that hide differences between devices from the applications. In this way any dependency on a particular hardware device, driver version or SDK-related DLLs is removed from both code and executables, ensuring that an application will always run on any system, no matter which drivers are installed (and even if no real device is present).

Support for each class of devices is provided through dynamically loaded plugins, so it is easy to add support for new devices and features. At runtime, and only when needed, Haptik loads the plugins, queries each one for the set of supported devices, and presents this information to the application. Dynamic loading removes compile-time dependencies on a particular set of DLLs; in this way, applications always use the highest available driver version for a particular device but can run even on older ones. For example, Phantom device support is provided by three plugins (Phantom42, Phantom40, Phantom31), each tailored to a particular driver version.

Haptik has a very simple API: three lines of code are enough to start using any device, and some useful features, such as device enumeration, automatic default selection, device info querying and auto-recalibration, are already built in. The Haptik library is easy to adopt even in existing projects because it has an absolutely non-invasive API, and it supports many different execution models (callback-based or polled access) and callback schemes (procedures, object methods) to best fit the requirements of different applications. It integrates easily with both OpenGL- and DirectX-based applications by supporting both coordinate systems and providing data in a format that can be used directly by both APIs. The currently bundled plugins support devices from SensAble, Force Dimension and MPB Technologies, software-only virtual devices to be used when no hardware is attached, and some extended features such as recording/playback of device movements, as well as network transparency over any TCP/IP-based network.

Haptik is written in C++ but can be used from many different languages/environments, such as Matlab and Simulink, and from Java applications and applets.

Current Feature List
• Device Enumeration
• Device Info
• Default device automatic selection
• Automatic Driver Fallback
• Multithreaded-aware Internal Synchronization
• Support for both callbacks and polling
• Support for direct Object/Method invocation
• Configurable Callback Rates
• Configurable Force Feedback Cutoff
• Support for left-handed (DirectX) and right-handed (OpenGL) coordinate systems
• Per-application and per-system configuration
• Centralized Logging Facilities
• Debug Output to development IDEs
• Dynamic Loading/Unloading of unused Plugins
• Torque Support
• Bundled Plugins for all SensAble PHANToM devices (PHANToM Desktop, Premium, Premium 6DoF and Omni) on driver versions 4.2 (required for Omni), 4.0, 3.1
• Auto recalibration on PHANToM Desktops
• Bundled Plugin for Force Dimension Delta 3DoF/6DoF and Omega devices on driver versions 3.1, 3.0
• Bundled Plugin for MPB Technologies Freedom and Cubic devices
• Bundled Plugin for Haption Virtuose devices
• Multithreaded mouse-based virtual device
• Remote devices support through TCP/IP networks
• Recording/playback capabilities
• Java JNI Interface
• Matlab Mex Interface
• Simulink Interface
• Python binding

Technical specs:
Website: http://www.haptiklibrary.org
Compatible platforms: platform independent
License: source distribution under GNU GPL version 2; the binary distribution is released under a modified MIT license

3.3.5 OpenSceneGraph Haptics
Brief description: The OpenSceneGraph Haptics Library (osgHaptics) incorporates haptic rendering into one of the most versatile scene graphs available. osgHaptics is written in portable C++ on top of OpenHaptics, a commercially available low-level toolkit from SensAble. Together with the OpenSceneGraph Audio Library (osgAL), a complete multimodal development framework for real-time applications is available. osgHaptics and osgAL are developed at VRlab, Umeå University, Sweden (http://www.vrlab.umu.se). osgHaptics depends on OpenSceneGraph and OpenHaptics, and has been compiled with OSG 1.2; to build it, download and build OpenSceneGraph and install the header and library files where the compiler can find them.

Technical specs:
Website: http://sourceforge.net/projects/osghaptics
Compatible platforms: OS portable (source code works with many OS platforms)
License: GNU Library or Lesser General Public License (LGPL), but depends on OpenHaptics

3.3.6 Haptic libraries overview

                                   CHAI3D        OpenHaptics Toolkit   H3D/VHTK               Haptik                    OSG Haptics
License                            GNU GPL       commercial            GNU GPL / commercial   GNU GPL                   depends on OpenHaptics
Development environment            C++           C++                   Python/XML/C++         Java/Matlab/C++           C++
Integration with other libraries   yes           yes                   yes                    yes                       yes
Working devices                    all but FCS   only PHANToM          PHANToM and FCS        all but FCS (scheduled)   only PHANToM

3.4 References
[1] ENACTIVE, European Network of Excellence - haptic interfaces reports & publications; web page: https://www.enactivenetwork.org/index.php?68/enactive-haptic-interfaces; page accessed on: Apr 20, 2007.
[2] T'nD project (2005). "Deliverable 1, Hardware and software technology and cognitive ergonomics update". Politecnico di Milano, Think3, FCS, Université de Provence; published on 8/4/2004, pp. 1-54; report number: TnD-1-PoliMI-R-04001-1.
[3] Hayward, V. (2001). "Survey of Haptic Interface Research at McGill University". In Proceedings of the Workshop on "Advances in Interactive Multimodal Telepresence Systems"; Hieronymus Buchreproduktions GmbH; Munich, Germany.
[4] Wagner, C.R.; Lederman, S.J.; Howe, R.D. (2002). "A tactile shape display using RC servomotors". In Proceedings of the Symposium on Haptic Interfaces for Virtual Environment and Teleoperator Systems; IEEE; Orlando, USA.
[5] Kajimoto, H.; Inami, M.; Kawakami, N.; Tachi, S. (2003). "SmartTouch - augmentation of skin sensation with electrocutaneous display". In Proceedings of the Haptics Symposium; March, Los Angeles, USA.
[6] Hayward, V.; Astley, O.R.; Cruz-Hernandez, M.; Grant, D.; Robles-De-La-Torre, G. (2004). "Tutorial: Haptic interfaces and devices". Sensor Review, Vol. 24, No. 1, pp. 16-29.

4 Interactive simulation systems

4.1 CFD analysis technologies
Computational fluid dynamics (CFD) is the branch of fluid mechanics that uses numerical methods and algorithms to solve and analyze problems involving fluid flows. The fundamental basis of any CFD problem is the set of Navier-Stokes equations, which describe any single-phase fluid flow; these equations can be simplified by removing the terms describing viscosity, yielding the Euler equations.
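For reference, the incompressible Navier-Stokes equations read, in standard LaTeX notation:

\rho \left( \frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} \right) = -\nabla p + \mu \nabla^{2}\mathbf{u} + \mathbf{f}, \qquad \nabla \cdot \mathbf{u} = 0,

where \mathbf{u} is the velocity field, p the pressure, \rho the density, \mu the dynamic viscosity and \mathbf{f} the body forces per unit volume. Dropping the viscous term \mu \nabla^{2}\mathbf{u} yields the incompressible Euler equations.

To approach an interactive CFD analysis, three different problems have to be studied separately: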

1. Definition of the geometry and mesh generation: starting from geometries modelled in CAD software, a control volume around the geometry is created. The volume of fluid to be analyzed can be generated as the Boolean difference between the control volume and the geometry of the object. Then the 3D mesh of the fluid volume is generated. Moreover, it should be possible to modify the geometry and thereby change the position of the nodes of the mesh.

2. Solution of the CFD problem: once the mesh and the boundary conditions are defined, the software has to solve the Navier-Stokes equations and generate the velocity field around the geometry. Since the simulation has to be interactive, a fast solver is needed, so that the streamlines can be updated quickly when the geometry changes.

3. Post-processing of the solution data: it is useful to save the post-processing data – the velocity field, with x, y and z components for each node of the mesh – in ASCII format, so that a custom visualization can be created.
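For reference, the equations mentioned above can be written as follows (the standard incompressible form, recalled here for convenience rather than taken from the project documents):

\[
\rho\left(\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}\right) = -\nabla p + \mu\,\nabla^{2}\mathbf{u} + \mathbf{f},
\qquad \nabla\cdot\mathbf{u} = 0 .
\]

Dropping the viscous term \(\mu\,\nabla^{2}\mathbf{u}\) yields the (incompressible) Euler equations, which is the simplification referred to above.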

In order to find appropriate software for interactive simulations, different CFD solvers have been analyzed, looking at the compatibility of input files, performance (speed of the analysis) and the structure of the output data. The three CFD solvers analyzed are:

1. Deal II (www.dealii.org)
2. OpenFlower (http://openflower.sourceforge.net)
3. Comsol Multiphysics (www.comsol.com)

4.1.1 Deal II
Deal II (Differential Equations Analysis Library) is a C++ programming library for the computational solution of partial differential equations using adaptive finite elements. It uses the Discontinuous Galerkin Method (DGM) as discretization algorithm, and it is therefore one of the fastest solvers available. However, it works only with meshes made up of quadrilateral (2D) or hexahedral (3D) elements. The lack of triangular and tetrahedral elements is crucial for our target, since tessellated objects are quite often made up of triangles. Therefore, even though this solver is very fast, it cannot be used at all.

4.1.2 OpenFlower
OpenFlower is an open-source CFD software library written in C++. It must be used together with another tool – Gmsh – which is used to create the geometry, generate the mesh and, after the solution has been computed with OpenFlower, inspect the post-processing data.


Figure 22: A sphere inside the control volume in Gmsh

Figure 23: Mesh generation in Gmsh

Gmsh does not support CAD file import, and it reads only input files with the .geo extension. GEO files are written in ASCII and their syntax is very simple:

Point(1)={x-coord, y-coord, z-coord}; //definition of a point

Line(3)={1,2}; // definition of a line between two points

Surface(6)={3,4,5}; // definition of a surface between three lines.

Unfortunately, CAD software does not support this file format, so it is not possible to open a geometry file created with other software directly in Gmsh. For this reason, a C++ program to convert STL files into GEO files has been created. When saving a geometry in STL format, the software divides the surfaces into triangles (as shown in Figure 24):


Figure 24: A simple STL geometry used to test the STL2GEO converter

In STL files each triangle is defined by a loop:

outer loop
  vertex -4.556993e-001 8.901338e-001 -6.123032e-017
  vertex 2.487240e-002 9.996907e-001 -6.123032e-017
  vertex 2.312606e-002 9.295006e-001 -3.680947e-001
endloop

To convert these files into GEO files, all the loops are read and, for each loop, the vertices, lines and surfaces are defined (written to an output file), checking each time whether a vertex or line has already been defined:

for (i = 0; i < geodata.vVertex.size(); i++) {
    // One Point() statement per unique vertex; GEO indices are 1-based
    // and the trailing 0.1 is the characteristic mesh element size.
    outss << "Point(" << i+1 << ") = {" << geodata.vVertex[i].x << " , "
          << geodata.vVertex[i].y << " , " << geodata.vVertex[i].z
          << " , 0.1};" << endl;
}
for (j = i; j < geodata.vLine.size() + i; j++) {
    // One Line() statement per unique edge; entity numbering continues
    // after the points, while j-i indexes the edge list from zero.
    outss << "Line(" << j+1 << ") = {" << geodata.vLine[j-i].a+1 << " , "
          << geodata.vLine[j-i].b+1 << "};" << endl;
}
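The "check whether it was defined before" step lends itself to a lookup table. The following self-contained sketch (our illustration; the GeoData, vVertex and vLine names merely mirror the snippet above, not the project's actual converter) deduplicates vertices and undirected edges while assigning the indices later written to the GEO file. Exact floating-point keys work here because ASCII STL repeats shared vertices with identical text, hence identical values:

#include <algorithm>
#include <cstdio>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

struct Vertex { double x, y, z; };

struct GeoData {
    std::vector<Vertex> vVertex;                          // unique points
    std::vector<std::pair<int,int>> vLine;                // unique edges (0-based)
    std::map<std::tuple<double,double,double>, int> vIdx; // vertex lookup table
    std::map<std::pair<int,int>, int> lIdx;               // edge lookup table

    // Return the index of the vertex, inserting it only if unseen.
    int addVertex(double x, double y, double z) {
        auto key = std::make_tuple(x, y, z);
        auto it = vIdx.find(key);
        if (it != vIdx.end()) return it->second;
        vVertex.push_back({x, y, z});
        return vIdx[key] = (int)vVertex.size() - 1;
    }

    // Return the index of the undirected edge a-b, inserting it only if unseen.
    int addLine(int a, int b) {
        auto key = std::make_pair(std::min(a, b), std::max(a, b));
        auto it = lIdx.find(key);
        if (it != lIdx.end()) return it->second;
        vLine.push_back(key);
        return lIdx[key] = (int)vLine.size() - 1;
    }
};

int main() {
    GeoData geo;
    // Two STL triangles sharing the edge v0-v1: shared entities are stored once.
    int v0 = geo.addVertex(0, 0, 0), v1 = geo.addVertex(1, 0, 0);
    int v2 = geo.addVertex(0, 1, 0), v3 = geo.addVertex(1, 1, 0);
    int tris[2][3] = { { v0, v1, v2 }, { v1, v0, v3 } };
    for (auto& t : tris)
        for (int e = 0; e < 3; e++)
            geo.addLine(t[e], t[(e + 1) % 3]);
    std::printf("%zu unique vertices, %zu unique lines\n",
                geo.vVertex.size(), geo.vLine.size());   // prints 4 and 5
    return 0;
}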

Moreover, the code has been written so that, once the conversion is done, a control volume is defined around the input geometry; Figure 25 shows an example of conversion from STL to GEO.


Figure 25: The STL sphere converted into the GEO file format and imported in Gmsh

After the conversion, and once the mesh has been created, it is necessary to write a text file with the .flw extension, in which all the boundary conditions are defined, together with the name of the mesh file (*.msh) created with Gmsh and the post-processing preferences:

...
Mesh gmsh {
  file filename.msh
  frontiers {
    surface 600001 inlet
    surface 600002 outlet
    surface 600000 walls
  }
}
...
boundary_condition wall walls 0. 0. 0.
boundary_condition velocity inlet 100. 0. 0.
boundary_condition pressure_outlet outlet 0.
...

To start the analysis, a simple command line is needed:

OpenFlower filename.flw

Once the analysis has ended, the post-processing files created (*.pos extension) can be visualized with Gmsh:


Figure 26: The result of the CFD analysis generated by OpenFlower and visualized by glyphs

At this point it is necessary to export the numerical data in order to visualize them in an external post-processing environment.

4.1.3 Comsol Multiphysics
Comsol Multiphysics is a modeling package for the simulation of many physical processes that can be described with partial differential equations. Comsol is definitely more user-friendly than OpenFlower, and it can be used together with Matlab. In Comsol it is possible to create the desired geometry directly, but it is also possible to import various CAD file formats (IGES, STEP, STL, ...). Comsol can be used through its Graphical User Interface, but also by running a script. In the following images an example of simulation is shown.

Creation of geometry and control volume
This is one of the main problems of Comsol because, even if it can theoretically import a wide range of CAD files, it often generates errors while importing tessellated file formats (e.g. STL, WRL, etc.).

Figure 27: A sphere created in Comsol.


Generation of the mesh
Comsol has powerful meshing functionalities, but we experienced many problems when the 3D model is imported as a tessellated geometry. In other words, at present we cannot mesh the control volume obtained by Boolean difference between a larger control volume and the imported tessellated geometry.

Figure 28: A simple mesh correctly generated by Comsol.

Figure 29: A tessellated geometry wrongly imported by Comsol.

Simulation
As regards the simulation, Comsol is quite flexible: the PDE system can be modified by hand, and the resolution algorithm can be selected from a wide variety of solver types. We employed the GMRES algorithm because it offers the best compromise between accuracy and speed.

Visualization of post-processing data
In Comsol it is also possible to save the velocity components of all the nodes in a text format, making it possible to create custom visualizations. In order to automatically export the results of the simulation, the Matlab script interface can be used. Through this interface it is possible to generate three text files containing the information about the points, the information about the cells, and the result of the simulation (e.g. velocity or pressure) for each point.
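As an illustration of how such exported text files could be consumed by a custom visualization module, the following minimal C++ sketch reads one "x y z u v w" record per line. Both the single-file layout and the file name are assumptions made for the example, not Comsol's documented export format:

#include <fstream>
#include <iostream>
#include <vector>

struct Node { double x, y, z; };   // mesh point coordinates
struct Vel  { double u, v, w; };   // velocity components at that point

// Read one whitespace-separated "x y z u v w" record per line;
// returns false if the file cannot be opened.
bool loadSolution(const char* path, std::vector<Node>& pts, std::vector<Vel>& vel) {
    std::ifstream in(path);
    if (!in) return false;
    Node n; Vel v;
    while (in >> n.x >> n.y >> n.z >> v.u >> v.v >> v.w) {
        pts.push_back(n);
        vel.push_back(v);
    }
    return true;
}

int main() {
    std::vector<Node> pts;
    std::vector<Vel> vel;
    if (loadSolution("solution.txt", pts, vel))   // file name is hypothetical
        std::cout << pts.size() << " nodes loaded\n";
    return 0;
}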


Figure 30: A post-processed solution generated by Comsol.

4.1.4 Benchmark of the selected CFD solvers
In order to establish which of the previously mentioned solvers should be employed, a simple benchmarking session has been carried out. The test was performed on a sphere immersed in a flow. The boundary conditions, applied to a control volume generated all around the sphere, are the inlet velocity and the outlet pressure. The meshes differ because mesh generation was done with different software: for Comsol the mesh was generated inside Comsol itself, while Gmsh was employed to generate the mesh for OpenFlower. In both cases the reported time refers only to the numerical simulation, i.e. the meshing time and the post-processing time are not included. We employed the GMRES algorithm because it seems the best compromise between accuracy and speed. From the results below it can be inferred that OpenFlower is substantially faster than Comsol: it solved a mesh three times larger (4728 vs. 1572 elements) in roughly one ninth of the time (1.58 s vs. 14.23 s).

Comsol benchmarking test result:
• 1572 elements
• U0 = 1 m/s
• P1 = 1 atm
• Solver: GMRES
• Computation time: 14.23 s

OpenFlower benchmarking test result:
• 4728 elements
• U0 = 1 m/s
• P1 = 1 atm
• Solver: GMRES
• Computation time: 1.58 s

Machine features:
• Intel Core 2 Duo E6400, 2.13 GHz
• 2 GB RAM
• Linux 64 bit


4.2 FEM analysis technologies

4.2.1 Introduction
The Finite Element Method (FEM), or Finite Element Analysis (FEA), is a computer-based numerical technique for calculating the strength and behavior of engineering structures. It can be used to calculate deflection, stress, vibration, buckling behavior and many other phenomena. It can analyze either small or large-scale deflection under loading or applied displacement, as well as elastic deformation or "permanently bent out of shape" plastic deformation. The power and low cost of modern computers have made Finite Element Analysis widely available.

In the finite element method, a structure is broken down into many small, simple blocks or elements. The behavior of an individual element can be described with a relatively simple set of equations. Just as the set of elements is joined together to build the whole structure, the equations describing the behaviors of the individual elements are joined into an extremely large set of equations that describes the behavior of the whole structure. The computer solves this large set of simultaneous equations, and from the solution it extracts the behavior of the individual elements. From this, the stress and deflection of all the parts of the structure can be obtained. The stresses are compared to the allowed stress values for the materials to be used, to check whether the structure is strong enough.

The term "finite element" distinguishes the technique from the use of infinitesimal "differential elements" used in calculus, differential equations and partial differential equations. The method is also distinguished from finite difference methods, in which, although the steps into which space is divided are finite in size, there is little freedom in the shapes the discrete steps can take. Finite element analysis is a way to deal with structures that are more complex than can be handled analytically using partial differential equations. FEA deals with complex boundaries better than finite difference methods do, and gives answers to "real world" structural problems. It has been substantially extended in scope during the roughly 40 years of its use.

Finite Element Analysis is done principally with commercially purchased software. These commercial programs can cost roughly €1,000 to €50,000 or more. Software at the high end of the price scale features extensive capabilities like plastic deformation, and specialized analyses such as metal forming or crash and impact analysis. Finite element packages may include pre-processors that can be used to create the geometry of the structure, or to import it from CAD files generated by other software. The FEA software includes modules to create the element mesh, to analyze the defined problem and to review the results of the analysis. Output can be in printed form, as well as plotted results such as contour maps of stress, deflection plots and graphs of output parameters.

The choice of a computer to run the FEA application is based principally on the kind of structure to be analyzed, the detail required of the model, the type of analysis (e.g. linear versus nonlinear), the economic value of timely analysis, and the analyst's salary and overhead. An analysis can take minutes, hours or days. Extremely complex models will be run on supercomputers, but many things can be analyzed in good detail on computers costing roughly €2,000; higher prices buy more memory, larger hard drives, and one or more high-speed processors.
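As a minimal worked instance of the assembly idea described above (standard textbook material, not specific to this project): consider two identical 1D bar elements of stiffness k connected in series through nodes 1-2-3. Each element contributes the same 2x2 stiffness block, and summing the contributions at the shared node 2 produces the global system

\[
k\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix},
\]

where the diagonal entry 2 at node 2 is exactly the sum of the two element contributions. Real FE codes repeat this element-by-element summation for millions of unknowns.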
The background of a finite element analyst includes an understanding of engineering mechanics (strength of materials & solid mechanics) as well as the fundamentals of the theory underlying the finite element method. The analyst must appreciate the basics of numerical methods. An engineering degree is typical, though not an absolute requirement. Use of a particular finite element program requires familiarity with the interface of the program in order to create and load the models, and to review the results.

4.2.2 Software
There are an impressive number of FEM/FEA software packages available; some are free, some are commercial. The main requirement for our application is the communication among the solver, the interaction module and the visualization module. Among all the available software, we took into consideration, for performance testing, two of the most used (and best supported) packages: ANSYS and COMSOL.

ANSYS offers a comprehensive product solution for structural linear/nonlinear and dynamics analysis. The product offers a complete set of element behaviors, material models and equation solvers for a wide range of engineering problems. Information such as nodal coordinates and mechanical and thermal loading is entered into a text editor and uploaded into ANSYS, or is entered directly into the ANSYS program using the main menu tools. An analysis application is then run, and ANSYS generates a solution for the uploaded problem.

COMSOL Multiphysics (formerly FEMLAB) is a finite element analysis and solver software package for various physics and engineering applications, especially coupled phenomena, or multiphysics. COMSOL Multiphysics also offers an extensive and well-managed interface to MATLAB and its toolboxes for a large variety of programming, preprocessing and postprocessing possibilities. A similar interface is offered to COMSOL Script. The packages are cross-platform (Windows, Mac, Linux, Unix). In addition to conventional physics-based user interfaces, COMSOL Multiphysics also allows entering coupled systems of partial differential equations (PDEs). The PDEs can be entered directly or using the so-called weak form (see the finite element method for a description of the weak formulation).

Our main issue is to have the solution ready to achieve a 5 Hz refresh rate in our visualization/simulation package. This means that the solution must be computed in 0.2 s, including the mesh "preparation" and the output access. ANSYS (like almost all the other packages) streams its output and input through the hard drive, making it virtually impossible to achieve the 5 Hz refresh, since I/O from the hard drive is a "slow" operation, as shown by our preliminary tests. COMSOL, on the contrary, is based on MATLAB scripts, which can directly access any portion of the solver. In other words, within COMSOL we can access the data directly in RAM, which is much faster than using I/O on hard drives. We decided to test the performance of COMSOL with different meshes, changing the number of nodes and the element type, and also with different hardware.


5 Reverse Engineering systems

5.1 Introduction
In the last decade, the growing interest in shape acquisition systems (commonly called 3D scanning systems) able to digitize physical objects has mainly been due to the increased accuracy and repeatability they offer in reproducing the geometric profiles of objects, even complex ones, and to their adaptability to many different applications. Nowadays, typical fields of application of such systems range from Reverse Engineering (RE) for the analysis of the competition to the quality control of industrial products, and from the creation of models for virtual reality environments to the reconstruction and restoration of objects belonging to the cultural heritage [ASW 2007]. The term "Reverse Engineering" commonly refers to the whole process from the acquisition of the physical object to its digital reconstruction. Depending on the field of application, the digital model can be a large point data set, a tessellated model (like an STL model), a set of curves, a set of 3D patches, or a CAD model to be used, for example, to re-design the object. Recently, a parametric CAD model of the acquired object is also often required. Typical RE applications include [ASW 2007], [BRA 2005], [TOR 2003]:

– Creating data to manufacture or restyle a part of which no CAD data is available, or for which the data has become obsolete or lost.
– Inspection and/or quality control – comparing a fabricated part to its CAD description or to a standard item.
– Product design – tool making.
– Package design.
– Benchmarking.
– Creating 3D data from a model or sculpture for animation in games and movies.
– Creating 3D data from an individual, model or sculpture for creating, scaling or reproducing artwork.
– Documentation and/or measurement of cultural objects or artefacts in archaeology, palaeontology and other scientific fields.
– Fitting clothing or footwear to individuals and determining the anthropometry of a population.
– Generating data to create dental or surgical prosthetics, tissue-engineered body parts, or for surgical planning.
– Documentation and reproduction of crime scenes.
– Architectural and construction documentation and measurement.

Particular attention should be paid to the specific field of use because many current systems can be applied to several applications, so it is important to highlight the different key parameters that may lead to the right choice. With the growing interest in 3D scanning, in fact, the need for assistance in qualifying the technology has arisen. 3D scanners have inner variables that are unmatched by traditional tools. Combined with a limited understanding of the technology, this creates a challenge in identifying the key parameters to evaluate when looking for a scanning solution. Service providers and system manufacturers report that a majority of new users place arbitrary demands on the scanning and output quality. In many cases, these specifications limit the technology selection, force the use of expensive systems or impede the use of 3D scanning. For example, vendors frequently receive specifications for ultra-tight tolerances, with no viable justification, when a realistic and usable data accuracy could be lower. While the technology has been available for more than a decade, it is only over the last few years that scanner technology, software capability and system affordability have progressed to the point of being a tool for companies of all types and sizes. So, while the technology is not new, it appears as such to the majority of organizations in industries as diverse as manufacturing, medicine and the arts [CHE 2000], [CLO 1998], [KAR 1999]. The growing demand for 3D scanner systems on the market today can be satisfied thanks to several acquisition techniques, which can be classified according to Figure 31.


With reference to Figure 31, at the top level a broad classification can be made into passive and active systems. Passive image-based methods (e.g. photogrammetry or computer vision) simulate the functioning of the human visual apparatus and acquire 3D measurements from single images or multi-station setups; they use projective geometry or the perspective camera model; they are very portable and the sensors are not expensive. On the other hand, 3D active systems (mainly those based on structured light, interferometry or lasers) use some kind of energy interacting with the object to capture its shape (or just a few points), and are becoming a common approach for objects and a standard source of geometric input data. They provide millions of points and often the associated colour as well. However, these sensors are quite expensive, designed for specific applications, and depend on the reflective characteristics of the surface. The active acquisition systems can be classified into two categories [BAR 2007], as shown in Figure 31:
1) Contact-based systems: there is physical contact between the probe and the surface to scan;
2) Contactless-based systems: the hardware uses physical phenomena like light, sound or electromagnetic waves.
Manual measurement systems, mechanical arms and CMMs belong to the first group of acquisition systems, while systems based on magnetic, acoustical and optical methods belong to the category of contactless systems. The state of the art of the most common techniques and systems, with their key parameters and their strengths and weaknesses, is shown and analyzed in the following paragraphs.

Figure 31: Active 3D acquisition methods

5.2 3D Scanning techniques
The shape acquisition methodologies available on the market today can be classified into intrinsically 3D methods and one/two-dimensional methods integrated with positioning systems that define the orientation in 3D space. The latter category includes two widespread systems: CMMs (Coordinate Measurement Machines) and laser trackers.

[Content of Figure 31 – classification tree:
Digitizing Methods
– Contact Methods (physical contact between the probe and the surface to scan): manual measurement systems; mechanical arms; Coordinate Measuring Machines (CMM).
– Contactless Methods (based on magnetic, acoustical and optical methods): Magnetic Methods; Acoustic Methods; Optical Methods, subdivided into Triangulation (mono or multi-line projection; fringe or coded pattern projection; moiré effect), Stereoscopy, and Time Delay (time of flight; interferometry; continuous modulation).]


CMMs are based on the use of a mechanical probe and of a Cartesian or anthropomorphic (mechanical arm) positioning system, while laser trackers mount a laser interferometer on a theodolite.

5.2.1 Contact digitizers
Contact digitizers, or touch probes (Figure 32), are often very accurate over a wide measurement volume (Table 3), and some instruments in this class are among the most affordable devices available.
Strengths:
• Accuracy.
• Low-cost instruments available.
• Measures deep slots and pockets.
• Not affected by color or transparency.
Weaknesses:
• Manual operation.
• Slow for complex surfaces.
• Can distort soft objects.

There are contact digitizers that are manually positioned to yield a single measurement at a time, or that may be moved across a surface to gather many measurements automatically. Touch-probe systems are also available which can automatically scan an object using a variety of mechanical drive means. Contact digitizers often take the form of an articulated arm that allows multiple degrees of freedom of movement. The position of each section of the arm is determined by encoders, glass scales, or, in the case of the more inexpensive devices, by potentiometers mounted in each joint. Other mechanical arrangements besides arms are also used.

Figure 32: Examples of Cartesian CMM (left) and Mechanical Arm (right)

Contacting devices can, however, distort soft objects such as car upholstery, can be too slow, or may require much labour to scan complex curved surfaces. On the other hand, they are not affected by surface properties such as color, transparency or reflectivity. Moreover, while slow, they may actually be the fastest way to digitize simple surfaces where just a few data points need to be gathered. Manually positioned devices can also make it easier to digitize areas of an object such as narrow slots or pockets.


Table 3: Mechanical Touch Probe

Technology | Representative Vendor | Model | Measuring Range | Accuracy | Color | Speed | Price
Mechanical Arm | Faro Technologies | FaroArm - Platinum | 1.2 m | ±0.018 mm | Not applic. | Manual operations | $19,900
Mechanical Arm | Faro Technologies | FaroArm - Platinum | 3.7 m | ±0.86 mm | Not applic. | Manual operations | –
Mechanical Arm | Faro Technologies | FaroArm - Titanium | from 1.2 m to 3.7 m | from ±0.036 mm to ±0.172 mm | Not applic. | Manual operations | –
Mechanical Arm | Immersion | MicroScribe G2 | 1.27 m | 0.38 mm | Not applic. | Manual operations | $3,495
Mechanical Arm | Immersion | MicroScribe G2LX | 1.67 m | 0.30 mm | Not applic. | Manual operations | $5,495
Guided Probe | Roland DGA Corp. | PIX-4 | 150 x 100 x 60 mm | Minimum scan pitch of 0.05 mm | Not applic. | No info provided | $1,995
Guided Probe | Roland DGA Corp. | PIX-30 | 300 x 200 x 60 mm | Minimum scan pitch of 0.05 mm | Not applic. | No info provided | $3,495

5.2.2 Mixed CMM-Optical digitizers
A mixed CMM-optical system consists of an active or passive mobile probe with a high-resolution CCD camera and a portable PC for system control. The probe is equipped with a measuring tip to touch object points. During the measurement, the camera is held facing a field of control points located nearby, either on portable or on fixed panels. Pushing the button on the probe triggers a measurement: the operator places the ruby ball tip in contact with the feature being measured, and the result is immediately shown on the computer. Figure 33 shows two systems, by the Aicon and Metronor companies (see section 5.7.1).

Figure 33: Mixed CMM-optical system by Aicon (left) and Metronor (right)

5.2.3 Line and Spot Scanners (based on triangulation)
The two major classes of non-contact scanners are those based on laser technology and those based on some form of non-coherent, white, or broadband light source. Laser scanners most often use straightforward geometric triangulation to determine the surface coordinates of an object. A laser line is scanned over the target object and individual sensors image the line, usually simultaneously from each side of the line. Where the image of the laser line falls on each sensor (most often a CCD array), its position is easily determined, and the rules of trigonometry are then applied to calculate the position of the target surface at each point on the laser line. The simple basic concept of this technique, and its ability to fairly quickly digitize a substantial volume with good accuracy and resolution, have made laser line scanners a popular choice. Products are supplied both as complete systems and as self-contained measuring heads for mounting on standard touch-probe arms or in other ways, including customized mechanical fixtures for specialized applications [KIM 2004]. Laser systems are based on a low-power laser source that, through an adequate optic system, generates coherent and monochromatic light detected by a CCD and then processed with the triangulation method (Figure 34).

Figure 34: laser scanner Minolta Vivid 9i

The laser ray makes it possible to minimize the divergence angle and, in the case of large projection distances, to generate beams that are not very thick. Moreover, band-pass filters centred on the emission frequency of the laser can be placed on the CCD, making the probe more resistant to parasitic light sources [CHE 2000].

Figure 35: Triangulation method
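For completeness, the geometry behind the triangulation method of Figure 35 can be summarized as follows (a standard textbook relation, not taken from the figure itself): if b is the baseline between the laser emitter and the CCD camera, α the projection angle and β the observation angle measured at the two ends of the baseline, the law of sines gives the distance z of the illuminated point from the baseline:

\[
z \;=\; b\,\frac{\sin\alpha \,\sin\beta}{\sin(\alpha+\beta)} .
\]

A small error in the measured position of the laser spot on the CCD translates into an error in β, which is why the triangulation angle and the stand-off distance directly affect the attainable accuracy.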

The best results (Table 4, Table 5) are obtained by angling the lens with respect to the CCD so as to position the plane defined by the beam, allowing the best focus along the whole depth of the measurement [KAR 1999], [HAL 1989]. Laser systems may be affected by some surface properties such as color, transparency or reflectivity. As more experience has been gained over the years, however, users have become adept at work-arounds for surface problems which may cause errors. Among laser line scanners, the model VI-9i by Minolta (Figure 34) uses a new technology which allows the acquisition of objects with regions ranging from dark to very bright. It uses a dynamic range magnification mode that reduces the need for surface processing of objects with high-contrast surfaces (both very light and very dark areas). This feature enables a measurement to be completed in a single operation.
Strengths:
• Non-contacting.
• Fast digitizing of substantial volumes.
• Good accuracy and resolution.
• Color capability available.
Weaknesses:
• Possible limitations for colored or transparent surfaces.
• Laser cautions apply.

5.2.4 Probes based on Conoscopic Holography
Conoscopic Holography is an implementation of a polarized-light interference process based on crystal optics. In the basic interference set-up, a point of light is projected onto a diffuse object, creating a light point which diffuses light in every direction [CON 1992]. In a conoscopic system, a complete solid angle of the diffused light is analyzed. The measurement process retrieves the distance of the light point from a fixed reference plane (Figure 36).

(Schematic components of the probe: laser source, splitter, polarizing filter, crystal, lens, CCD probe.)

Figure 36: Conoprobe

A single ray, at a given angle, emitted by a light source point impinges on the first face of the crystal. It is split into two rays propagating inside the crystal at different velocities along almost the same geometrical path. The velocity of one ray is isotropic, so this ray is called ordinary, while the other has an anisotropic velocity, i.e. it is extraordinary. Thus two superimposed rays emerge from the crystal with a phase difference at orthogonal polarizations. In order for both rays to interfere, an analyzer (polarizer) aligns the directions of the electrical fields. As the measurement can be repeated at a rate of several thousand Hertz, measurements can be acquired at correspondingly high speed. The monochromatic lighting of the scene is obtained by means of a laser diode that projects, coaxially to the lens, a light point onto the scanned surface. The most interesting feature of these probes is the possibility of taking measurements through the limited opening of the observation cone of the probe lens, thus making it possible to measure holes; this is impossible with triangulation-based systems. Furthermore, these systems are barely sensitive to the reflectivity of the analyzed surface, to its angular position, and even to a not negligible speed of the surface [SIR 1985], [LOM 2001].


Table 4: Laser Line Scanner

Technology | Representative Vendor | Model | Measuring Range | Accuracy | Color | Speed | Price
Laser Meas. Head | Laser Design, Inc. | RPS-120 | Not applic. (measuring head); 89 mm stand-off distance | ±0.00635 mm | No | 14,400 points/sec | –
Laser Meas. Head | Laser Design, Inc. | RPS-450 | Not applic. (measuring head); 200 mm stand-off distance | ±0.0254 mm | No | 14,400 points/sec | –
Laser Meas. Head | KREON Technologies | Zephyr KZ 25 | Not applic. (measuring head); mounted to arm or machine tool; triangulation angle 30 deg | Vertical resolution 3 microns | No | 30,000 points/sec | –
Laser Meas. Head | KREON Technologies | Zephyr KZ 100 | Not applic. (measuring head); mounted to arm or machine tool; triangulation angle 30 deg | Vertical resolution 11 microns | No | 30,000 points/sec | –
Laser Scanner / Aux. Video | Cyberware | Model 15 Desktop Scanner | 250 (X) x 150 (min Y) x 75 (Z) mm | Resolution X: 300 micron to 1 mm, Y: 300 micron, Z: 50 to 200 micron | Yes | 14,580 points/sec | –
Laser Scanner / Aux. Video | Minolta | Vivid 910 | 111 x 84 x 40 mm (tele lens) to 1200 x 903 x 400 mm (wide lens) | ±0.008 mm accuracy; ±0.008 mm precision | Yes | 307,000 pixels in 2.5 sec (Fine mode, tele lens) | –
Laser Scanner / Aux. Video | Minolta | Vivid 9i | X: 93–463 mm (tele), 165–823 mm (middle), 299–1495 mm (wide); Y: 69–347 mm (tele), 124–618 mm (middle), 224–1121 mm (wide); Z: 26–680 mm (tele), 42–1100 mm (middle), 66–1750 mm (wide) | ±0.05 mm accuracy; ±0.008 mm precision | Yes | 340,000 pixels in 2.5 sec (Fine mode, tele lens) | €50,000
Laser Scanner | Nextec | Table Top Hawk | Linear travel: 240 mm (9.4 inch); scanning range: ±5 mm | Resolution: 1 micron; total measuring accuracy: 10 microns (1 sigma) | No | 40 points/sec | –
Laser Scanner / Aux. Video | Riegl | LMS-Z210i | 4 to 400 m range x 80 deg vert. x 360 deg rot. | 5 mm resolution; ±15 mm accuracy (averaged) or ±25 mm single shot | Yes, optional | 12,000 points/sec; up to 15 deg/sec for horizontal scan | –
Laser Scanner | Riegl | LMS-Z420i | 2 to 800 m range x 80 deg vert. x 360 deg rot. | 5 mm resolution; ±5 mm accuracy (averaged) or ±10 mm single shot | No | 12,000 points/sec; up to 15 deg/sec for horizontal scan | –
Laser Scanner | Roland DGA Corp. | LPX-600 | Plane scanning: width 254.0 mm, height 406.4 mm; rotary scanning: diameter 254 mm, height 406.4 mm | ±0.05 mm | No | No info provided | $9,995
Laser Scanner | ShapeGrabber | LM600 System | 600 x 160 x 165 mm | 0.03 mm | No | 15,000 to 100,000 points/sec | –
Conoscopic holography | Optimet | Conoprobe 1000 | 180 x 80 x 60 mm | < 1 µm | No | 850 points/sec | –

5.2.5 Dual-Capability Systems
Many companies that make contact digitizing instruments, as well as many of those that make laser scanners, provide turnkey products that have both of these quite complementary capabilities. Broad areas can be quickly scanned using a laser device mounted on the arm, and features which might be geometrically problematic for the laser can be contact-probed. Some companies provide instruments which can carry both a contact probe and a laser head simultaneously (Figure 37).

Figure 37: Dual-capability systems

A few companies provide colour laser scanning technology. Arius 3D uses a multiplexed arrangement of red, green and blue lasers to simultaneously gather colour and geometric data. The company mainly offers scanning services, however, rather than selling equipment. Other companies, such as Minolta and Cyberware, use laser scanning to gather surface measurements and combine those data with colour video information gathered separately. Most colour scanners use structured white light or broadband sources.

5.2.6 Other Types of Laser Systems
Several additional laser technologies are also utilized, including time of flight, optical radar and laser tracking. In general, these methods offer good accuracy combined with the capability of making measurements from a long distance away from the subject, in some cases tens of meters. This so-called "stand-off" distance is important for applications such as digitizing large machinery, buildings and the like, which represent a large fraction of the applications for these technologies. Time-of-flight systems measure how long it takes for light emitted by a laser to return to a sensor located near its source. Optical radar systems are similar in operation, and both are analogous to standard radar systems, which measure the return time of a radio wave. Time-of-flight and radar systems do not usually require retro-reflectors mounted on the object to be measured and can operate at very high rates to quickly capture entire scenes or objects. In contrast, laser trackers look for a signal in their field of view from a retro-reflector placed or held on the object.
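A simple worked example of the time-of-flight principle (basic physics, not taken from the report): the distance is obtained from half the round-trip time of the light pulse,

\[
d = \frac{c\,\Delta t}{2},
\qquad\text{e.g. } d = 100\ \text{m} \;\Rightarrow\; \Delta t = \frac{2\cdot 100\ \text{m}}{3\times 10^{8}\ \text{m/s}} \approx 0.67\ \mu\text{s},
\]

so millimetre-level range resolution correspondingly requires timing electronics able to resolve a few picoseconds, which helps explain why these systems trade some accuracy for their very large stand-off distances.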


The main advantage these systems offer is high precision over a large working volume; they are frequently used for aligning large pieces of machinery or verifying the as-built dimensions of large objects (Table 5).

Table 5: Other Types of Laser Systems

Technology | Representative Vendor | Model | Measuring Range | Accuracy | Color | Speed | Price
Laser time of flight | Callidus Precision Systems GmbH | CT900 | 1.6 m W x 1.4 m H | 50 microns resolution on axis of turntable; 70 microns at far end of measuring range | No | 4,000 points/sec | –
Laser time of flight | Callidus Precision Systems GmbH | CT180 | 350 mm W x 375 mm H | 25 microns at near end to 70 microns at far end of measuring range | No | 4,000 points/sec | –
Laser tracking | Automated Precision, Inc. | Smart Trak 6D | 40 m range | Static accuracy: 0.025 mm at 200 mm | No | Up to 2,000 measurements/sec | –

Strengths:
• Non-contacting.
• Fast digitizing of substantial volumes.
• Good accuracy and resolution.
• Trackers are very precise.
• Large stand-off distance for large objects.
Weaknesses:
• Possible limitations for colored or transparent surfaces.
• Trackers are fairly slow and may require targets.
• Laser cautions apply.

5.2.7 Other Types of Tracking Systems
In addition to laser-based tracking systems, a number of companies make LED-based and other types of tracking systems (Table 6), such as magnetic trackers. These technologies generally have smaller working envelopes than laser-based systems and may not be quite as accurate. They are most frequently used in human and other types of motion studies, but are also useful for reverse engineering. A probe with one or more LEDs is touched or attached to the object to be digitized. Sensors, most often utilizing CCD chips in a dual-camera arrangement, image the LEDs in their field of view. As with laser scanners, trigonometry is then used to calculate the position of the probe on the surface of the object. Encoding schemes based on high-speed modulation of the light emitted by the LEDs allow some instruments to simultaneously track the position of hundreds of LEDs.

Table 6: Tracking Systems, Other Than Laser Types

Technology | Representative Vendor | Model | Measuring Range | Accuracy | Color | Speed | Price
LED-based tracking | Boulder Innovation Group, Inc. | 3D Creator | 1 m sphere | 0.2 mm mean volumetric accuracy | No | 300 points/sec | –
LED-based tracking | Metronor | Solo Measurement System | 10.5 m range | 0.02 to 0.1 mm measurement uncertainty | No | Manual operations | –
Magnetic tracking | Northern Digital | Aurora | 500 mm cubic volume | 1.3 mm RMS; 2.1 mm at 95% confidence level | No | 22 to 45 points/sec depending on number of sensor coils | –


Magnetic trackers offer the added benefit of being able to digitize points on objects that are not within a direct line of sight. Instead of an LED probe, these systems use a small wire coil as a target. One company that makes magnetic trackers, Polhemus, combines this technology with laser scanning. The result is a system that has many of the same features as laser scanners mounted on mechanical arms, while providing very great freedom of movement. In RE applications, these systems provide good accuracy over a substantial volume at moderate speeds. They are not affected by surface quality or color. On the down-side, they require a contacting probe or marker and can be slow when digitizing complex surfaces.

Numerous instruments are available that use one form or another of structured light to measure the geometry of an object (Figure 38). Most use white light, but some use other broadband sources. Perhaps the simplest to understand are those that project a pattern of lines onto the object to be digitized. The pattern is distorted by the object's three-dimensional nature, and the deviation from the original pattern is translated into a surface measurement at each point in the field of view of the instrument. Triangulation is used to calculate the surface data, and nearly all systems use CCD cameras for sensing. Several variations are available, including systems that project moiré patterns or fringes, structured-colour light patterns, and polarized-light interferometry.

Figure 38: Structured-light systems

Structured light scanning is based on a mix of the "gray coding" method and the PMP (Phase Measurement Profilometry) technique, also called the "Phase Shift" method (Figure 39). The first method obtains a fast scan with rough accuracy (~1/100 of the measured volume diagonal) in a few simple steps; the second method then improves the results up to a high accuracy (1/25,000).


Figure 39: 3-D system based on structured light: scheme of the projected fringes

The gray coding phase is based on the projection, and simultaneous acquisition and recording, of a sequence of images with white and black fringes whose period is progressively halved. As the projection and acquisition directions are at an angle, each point of the analyzed object will appear illuminated in some images and not in others, according to its position in 3D space. In this way, a binary code is obtained for each pixel of the acquired image. In particular, each projected image corresponds to one bit, whose value (0 or 1) depends on whether the examined pixel frames an illuminated area of the surface. An adequate calibration of the measurement volume then allows the binary codes to be converted into three-dimensional positions. By means of the Phase Shift it is possible to improve the results obtainable with the gray coding method, following this procedure: an image is created using parallel fringes with a sinusoidal profile and a pitch close to half the pitch used in the last projection of the gray coding; this image is then projected while varying the phase angle of the fringes, so that they appear to move across the object surface. During the projection, the camera acquires the image sequence; the gray tone of each pixel varies sinusoidally, with a frequency equal to that of the projected fringes and a phase angle that depends on the position of the examined surface area. Combining the results of this analysis with those of the gray coding, it is possible to resolve the ambiguity between positions and phases, which arises because the phases are computable only up to multiples of 2π [CLO 1998], [PER 1995]. Structured-light systems (Table 7) are very fast and can digitize hundreds of thousands to millions of points per second. These features have made them strongly favored for digitizing human beings. A wide selection of application-specific instruments is available for digitizing complete human bodies, and for more specialized needs such as faces, teeth, feet, breasts, etc. A few manufacturers of broadband-source systems provide color capability, but laser scanners that provide color information from ancillary video sources, such as those made by Minolta and Cyberware, are also strong contenders in the body-scanning field.

(Text boxes accompanying Figure 39:)
Gray Coding methodology: projecting in sequence the images with the shown fringes and acquiring them at the same time with the camera, every CCD pixel appears either illuminated or not, depending on whether it falls in a white or a black fringe. Therefore, projecting five images, a 5-bit binary code that is univocally related to the position of the examined area can be obtained for each CCD pixel.

Improvement by means of the Phase Shift technique: during the projection of the shown images, the gray tone of each CCD pixel varies sinusoidally, with a period equal to the period of the projected fringes and a phase that depends on the height of the examined area. To fix the positions of the phases, this analysis is combined with the rough one obtained by means of the Gray Coding.
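The decoding step described above can be sketched in a few lines of code. In the usual implementation the projected patterns are ordered so that the per-pixel bit string forms a Gray code; the following self-contained C++ sketch (a standard decode, our illustration rather than any specific scanner's implementation; all names and the example values are assumptions) converts the Gray word to a stripe index and combines it with the wrapped phase to remove the 2π ambiguity:

#include <cstdio>

// Decode a per-pixel Gray-code word (one bit per projected fringe image)
// into the plain binary stripe index.
unsigned grayToBinary(unsigned gray) {
    unsigned bin = gray;
    while (gray >>= 1)
        bin ^= gray;
    return bin;
}

// Combine the coarse stripe index with the wrapped phase (in [0, 2*pi))
// measured by the phase-shift step into a sub-stripe position.
double decodePosition(unsigned grayWord, double wrappedPhase, double stripePitch) {
    const double twoPi = 6.283185307179586;
    // The Gray code identifies which fringe period the phase belongs to,
    // resolving the ambiguity of phases known only up to multiples of 2*pi.
    return (grayToBinary(grayWord) + wrappedPhase / twoPi) * stripePitch;
}

int main() {
    // Example: 5 projected patterns -> 5-bit Gray word 01101b (0x0D),
    // wrapped phase of a quarter period, stripe pitch of 8 units.
    std::printf("position = %.2f\n", decodePosition(0x0D, 1.5707963, 8.0));
    return 0;
}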


Table 7: Structured-light and Broadband Source Systems

Technology | Representative Vendor | Model | Measuring Range | Accuracy | Color | Speed | Price
Arm-mounted structured white light scanner | GOM mbH | ATOS II | 175 x 140 - 2000 x 1600 mm² | Resolution 0.12 - 1.4 mm | No | 1.4 million points in 1 sec | –
Arm-mounted structured white light scanner | GOM mbH | ATOS III | 150 x 150 - 2000 x 2000 mm² | Resolution 0.07 - 1.0 mm | No | 4 million points in 2 sec | –
Moiré white light pattern | Inspeck | 3D Mega Capturor II / Small Field | 401 x 321 mm | X & Y: 0.3 mm; Z: 0.4 mm resolution | Yes | 1.3 million points in 0.7 sec | –
Moiré white light pattern | Inspeck | 3D Mega Capturor II / Large Field | 1140 x 910 mm | X & Y: 0.9 mm; Z: 1.0 mm resolution | Yes | 1.3 million points in 0.7 sec | –
Moiré white light pattern | Inspeck | Capturor II / Small Field | 352 x 264 mm | X & Y: 0.6 mm; Z: 0.5 mm resolution | Yes | 0.3 million points in 0.4 sec | –
Moiré white light pattern | Inspeck | Capturor II / Large Field | 1195 x 896 mm | X & Y: 1.9 mm; Z: 1.0 mm resolution | Yes | 0.3 million points in 0.4 sec | –
Projected white light patterns | Genex Technologies, Inc. | Rainbow 3D Camera 25 | Capacities from 32 x 25 x 20 mm³ | Accuracy to 25 microns | Yes | Captures 768 x 576 pixel image (442,368 pixels) in < 1 sec | –
Moiré-coded white light triangulation | Steinbichler Optotechnik GmbH | Comet 5 | 80 x 80 - 800 x 800 | > 0.001 mm | No | 1600 x 1200 / 2000 x 2000 / ... in < 1 sec | –

On the down-side, broadband-source systems are in general somewhat less accurate than laser systems and are for the most part limited to smaller scanning volumes, typically a cubic meter or less. This may not be a strong limitation, however, since scans can be merged to completely cover very large objects, although doing so may take considerable time and effort.
Strengths:
• Can digitize hidden points in some cases.
• Can be used to track the motion of many points simultaneously.
• Magnetic devices can measure outside the line of sight.
• Not affected by color, transparency or surface quality.
Weaknesses:
• Requires targets or contacting probes.
• Smaller volumes.
• Lower accuracy.
• Slow for complex surfaces.

5.2.8 Photogrammetry
Among the passive methods for obtaining measurements from a physical object, photogrammetry is the most mature technology, able to extract the 3D coordinates of points from two or more common 2D images. It is a contactless measurement method which uses projective geometry and the perspective camera model through stereo-restitution or bundle adjustment of overlapping images. Similarly to the optical methods described before, photogrammetry also makes use of the triangulation principle. By taking photographs from at least two different locations, so-called "lines of sight" can be developed from each camera to points on the object. These lines of sight (sometimes called rays) are mathematically intersected using epipolar geometry to produce the three-dimensional coordinates of the points of interest (Figure 40). Particular attention should be paid to the camera calibration process, which calculates the intrinsic parameters of the camera using only the information available in the images taken by that camera. The measurement step can be performed with manual or automatic procedures. Automated photogrammetric matching algorithms can produce very dense point clouds, but mismatches, irrelevant points and missing parts may be present in the results, requiring a post-processing check of the data. These automated procedures usually do not take into consideration the geometrical conditions of the object's surface and mainly work with smoothing constraints: it is therefore often quite difficult to turn randomly generated point clouds into polygonal structures of high quality without losing important information. On the other hand, if the measurements are done in manual or semi-automatic mode, the measures are more reliable but a smaller number of points describes the object.
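The ray intersection can be stated compactly in standard computer-vision notation (a textbook formulation, not taken from this report): with camera projection matrices P and P' and matched homogeneous image points x and x', the unknown object point X satisfies

\[
\mathbf{x} \times (P\mathbf{X}) = \mathbf{0}, \qquad \mathbf{x}' \times (P'\mathbf{X}) = \mathbf{0},
\]

a homogeneous linear system that is solved in the least-squares sense (typically via SVD) because, with noisy measurements, the two lines of sight never intersect exactly.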

Figure 40: Photogrammetry technique (by Geodetic Services Inc.)

Photogrammetry can be used in the manufacturing process in-line or off-line, and it can be applied in the development process of a product (research and development). Some companies use photogrammetric methods to generate a highly accurate reference coordinate system, especially for the measurement of very large objects. The output of a photogrammetric process can be 3D point coordinates, topographical maps, or rectified photographs (orthophotos) [GER 2004]. Photogrammetry can be classified as follows:
– Far-range photogrammetry (areas of m²) is used to produce topographical maps, to reconstruct the shape of buildings, and so on.
– Close-range and very close-range techniques are preferred for industrial applications. In this case the dimensions of the physical object and of its features can vary between 1 cm² and 1 mm².
The user places scale bars around the object to be scanned and attaches reference markers to the object. With a dedicated digital camera, the operator takes pictures of the object, being sure to include both the coded markers and the dimension-controlled scale bars. Based on these pictures, the coordinates of the reference markers are determined with high accuracy, creating a sparse point cloud of the object. With a well-calibrated, high-resolution camera, an accuracy of 1 part in 120,000 can be reached (see www.geodetic.com). The best results can be obtained using retro-reflective targets or coded targets such as those made by AICON (www.aicon.de). Photogrammetry is often used in combination with other optical methods to capture very large objects, by measuring the exact position of some sparse targets placed all around the object and using them to align point clouds acquired, for example, with laser or structured light scanners.


5.2.9 Specifications and application criteria
Digitizers have numerous specifications and application criteria:

– Accuracy

The degree to which the scanned data matches the physical object. While accuracy is the word commonly used in the design world, the better choice is uncertainty, a statistical term used in the inspection and quality realms. Uncertainty (a.k.a. accuracy) is the deviation band within which there is a high degree of confidence that a measurement will fall. For example, if an object is 1.000 inch long and the uncertainty is ±0.005 in., the scanned data can reasonably be expected to fall between 0.995 and 1.005 in.

– Resolution
The spacing between the sampled points in scanned data. Manufacturers will specify a resolution that is that of the CCD (charge-coupled device) or similar sensing element. However, the important factor is the spacing of data points on the physical object. This spacing is dictated by both the CCD and the distance of the object from the focal point of the scanner. For a complete data capture, resolution should be ≤ ½ of the size of the smallest feature to be scanned. Note: do not confuse resolution with accuracy. While resolution contributes to overall accuracy, there is no direct correlation.

– Mobility
Portability of a scanning system, including consideration of set-up time and calibration. Consideration should also be given to the amount of equipment (e.g. computer, tripod), its size and weight, and the method of transport (shipped, carry-on, checked baggage). For large objects or parts that are in service, portability is often a critical factor.

– Range
Also called field of view (FOV) and depth of field: the minimum and maximum distance of an object from the scanner and the associated XY scanning area. If objects exceed the range, stationary scanners will require multiple scans to capture the entire length or width of an object. A closely associated criterion is coverage, which refers to the ability of a scanner to address line-of-sight constraints; examples include deep channels, narrow holes and undercuts. In general, handheld or arm-mounted scanners offer greater coverage.

– Time
For scanners, there are two components of time: set-up and scanning. Both may be important for any scanning project. Set-up includes the time to mount or position the scanner stably, the time to calibrate the scanner and the time to prepare the object. Scanning time encompasses the duration required to capture all features of an object. Note that since software is not addressed here, its time component, which can be substantial, is not discussed.

– Ease-of-Use
The degree to which an inexperienced employee can prepare and set up an object for scanning and execute the scanning process. Typically, as ease-of-use increases, the degree of operator control decreases. The steps in the scanning process that influence ease-of-use are object preparation (surface coating, targeting), scanning and data processing. Processing the scan data can be more involved than either of the other two steps. While a discussion of software data processing is not included, it cannot be ignored; one aspect of data processing, the stitching of individual scans, is often coupled with the scanning hardware. For scanners that capture an XY patch with each exposure, the operator will use the software to carefully align and stitch each scan. For handheld scanners that paint an object, or scanners that rotate the object, alignment is virtually eliminated. This removes one step from the process, which makes scanning easier for the operator.

– Versatility
The degree to which a scanner can accommodate a wide variety of objects, in terms of size, complexity and material properties; accommodate a wide variety of operating conditions (environment, ambient light, vibration); and address common scanning challenges (line-of-sight, shadowing, high aspect ratio). The importance of versatility increases when owning multiple scanners is not desired.


Four tables with the main features of the most widespread shape digitizing systems are shown to present the state of the art, covering: 1) 3D systems, 2) one-dimensional sensors, 3) two-dimensional sensors and 4) positioning systems. The differentiation between the sensors and the positioning system is needed to remark that different sensors (one- or two-dimensional) allow many alternative results to be obtained, producing systems with different performances. The first three tool categories are evaluated on four parameters: (I) Sensitivity, which synthesizes the accuracy, repeatability and resolution of the measurement; (II) Speed, which considers both the acquisition time of a single datum and the time needed to calibrate the system; (III) Robustness, which considers the sensitivity of the system to noise due to external lights and vibrations, as well as its versatility and the eventual need for specialized operators; and finally (IV) Performance/cost ratio, which represents the valuation of the hardware relative to the best measuring performance achieved.

Table 8: 3-D Systems, qualitative evaluation
3-D Systems | Sensitivity / Speed / Robustness / Performance-cost
Photogrammetry: + – + – + –
Structured Light: + + + + – + +
Moiré: + – + – –
Interferometry/Holography: + + + + – + –
Fourier transform profilometry: – + + – + +

Table 9: 1-D probes, qualitative evaluation (columns: Sensitivity, Speed, Robustness, Performance/cost)
- Mechanical probes: + – + + +
- Triangulation optoelectronic probes: + – + + – + +
- Ultrasonic probes: + – + – +
- Conoscopic holography probes: + + + + – –
- Laser interferometry: + + + + – – –
- Laser radar: + + – + – – –

Table 10: 2-D probes, qualitative evaluation (columns: Sensitivity, Speed, Robustness, Performance/cost)
- Laser triangulation probes: + – + + + – +
- Confocal microscope: + + – + –

For the comparison between the different positioning systems the following requirements were considered: 1) performance/cost, 2) accuracy, 3) ease-of-use, with particular attention to complex shapes, and 4) transportability, which considers the ratio between the dimensions of the system and those of the largest measurable object.

Table 11: Positioning systems, qualitative evaluation (columns: Accuracy, Ease of use, Transportability, Performance/cost)
- Cartesian mechanical: + + + – – + –
- Mechanical (theodolite): + + – + + +
- "Faro Arm" manual arms: + – + + + +
- Industrial robots: – – + – – –
- Photogrammetry with IR diodes: + – + – + +
- Inverse photogrammetry (Aicon – ProCam): + + + – +
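
One way to make the qualitative marks in Tables 8-11 comparable is to map each mark to a number and sum per row, as in the minimal sketch below. The numeric mapping, and the grouping of the signs into four cells, are our own assumptions for illustration; the report itself stays qualitative.

```python
# Hypothetical numeric equivalents of the qualitative marks.
MARKS = {"++": 2, "+": 1, "+-": 0, "-": -1, "--": -2}

def total(row_marks) -> int:
    """Sum the numeric equivalents of one row of qualitative marks."""
    return sum(MARKS[m] for m in row_marks)

# One hypothetical reading of two rows of Table 11
# (Accuracy, Ease of use, Transportability, Performance/cost).
rows = {
    "Manual arms":       ["+-", "+", "+", "++"],
    "Industrial robots": ["--", "+", "-", "--"],
}
for name, marks in rows.items():
    print(name, total(marks))
```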

In applications where the 3D scan data will be the basis for redesign and modification, accuracy and resolution take precedence. A scanned object represents the as-built or as-exists condition, from which the design intent is determined. To do so, tight tolerances and high resolution are required. In environments where a technician or specialist conducts the scanning, all other criteria become secondary. If, on the other hand, the designer performs his own scanning, ease-of-use, time and versatility are additional considerations.

For design applications, the scan data is processed and imported into 3D CAD software. The imported data becomes the baseline, or reference, from which to construct new features or modify elements. Since this data represents the as-built condition, not the design intent, high uncertainty (low accuracy) will lead to an inaccurate interpretation of the as-designed state. Additionally, low accuracy decreases the overall value of the scanned data for product design. Even so, the sub-micron accuracies specified by inexperienced users are rarely needed: realistic accuracy specifications fall between 0.05 and 0.25 mm, and for some projects 0.7 – 1.0 mm is acceptable. As with accuracy, inexperienced users often arbitrarily specify ultra-fine resolution, which is unnecessary and only serves to swell the size of the point cloud file. Organizations experienced in generating scan data for design applications report that a resolution of 0.1 – 1.0 mm is appropriate for most projects.

Ease-of-use, rapid scanning and versatility combine to make scanning a simple, efficient and accessible process. When the scanner is a tool used by the designer, these characteristics are important: without them, the designer may opt for other methods that are simpler and more efficient. Speed, ease and versatility prevent the creation of artificial barriers to the productive use of the 3D scanner. Ease, convenience and accessibility are the keys to successful use in graphic design. Versatility is another factor for success: in this environment, a graphic designer may be scanning people, clay models, organic objects or a multitude of other pieces, and, with the exception of those who relish new technology, there will be little interest in having a separate scanner for each of these types of scanning project. For those times when resolution becomes important, a reasonable request is in the 0.5 – 1.2 mm range, but for most projects a resolution of 2.5 – 10.0 mm will satisfy the demands of creating visually pleasing digital objects.
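
A quick back-of-the-envelope check shows why over-specifying resolution "swells the point cloud": the number of points grows with the inverse square of the point spacing. The part size below is a hypothetical example, not a value from this report.

```python
def point_count(surface_area_mm2: float, resolution_mm: float) -> int:
    """Approximate number of points needed to sample a surface at a given spacing."""
    return int(surface_area_mm2 / resolution_mm ** 2)

area = 200 * 300  # hypothetical 200 x 300 mm panel, one side
for res in (0.1, 0.5, 1.0, 5.0):
    print(f"{res} mm spacing -> ~{point_count(area, res):,} points")
# 0.1 mm -> ~6,000,000 points; 1.0 mm -> ~60,000 points: a 100x difference.
```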

5.3 Critical issues related to "Puodarsi" RE systems

A wide range of commercial RE systems can be used for the applications required by this project. Research efforts are currently oriented towards non-contact systems which, with respect to coordinate measuring machines (CMMs), have the advantages of higher speed, the portability of many systems, medium-to-high accuracy and relatively low cost. At the moment, among the contactless techniques analyzed, laser and structured light are the most used. Structured-light systems are very fast, digitizing hundreds of thousands to millions of points per second, and they do not use a laser; these two features have made them strongly favored for digitizing human beings. A wide selection of application-specific instruments is available for digitizing complete human bodies, and for more specialized needs such as faces, teeth, feet, breasts, etc. Laser systems based on triangulation, instead, are widespread in the industrial field and very attractive for the market.

In particular, our choice for the project – the contactless 3D digitizer VI-9i by Konica Minolta (technical specifications are shown in Table 12) – offers high speed, high precision and a measurement accuracy of ±50 µm. The VI-9i requires only 2.5 seconds per scan to acquire accurate 3D data. Consequently, it is well suited to accuracy verification and shape inspection of parts in several fields of application. The VI-9i can be used for reverse engineering, to reflect the shape and dimension data of mock-ups or prototypes in design drawings, or for checking part shapes, checking molds, inspecting quality, etc. in prototype- or mass-production processes. Among laser line scanners, the VI-9i uses a new technology that acquires objects with high-contrast surfaces in a single operation, whereas other systems require tuning the laser power to obtain the best acquisition results.

Table 12 – VI-9i Technical Specifications

Type: Non-contact 3D digitizer
Measuring method: Triangulation light block method
Light-receiving lenses (interchangeable): TELE focal distance f = 25 mm; MIDDLE f = 14 mm; WIDE f = 8 mm
Scan range: 0.6 to 1.0 m (Standard mode); 0.5 to 2.5 m (Extended mode)
Laser scan method: Galvanometer-driven rotating mirror
Laser class: Class 2 (IEC 60825-1); Class 1 (FDA)
X direction input range (Extended mode): TELE 93 to 463 mm; MIDDLE 165 to 823 mm; WIDE 299 to 1495 mm
Y direction input range (Extended mode): TELE 69 to 347 mm; MIDDLE 124 to 618 mm; WIDE 224 to 1121 mm
Z direction input range (Extended mode): TELE 26 to 680 mm; MIDDLE 42 to 1100 mm; WIDE 66 to 1750 mm
Accuracy (X, Y, Z): ±0.05 mm (TELE lens at a distance of 0.6 m, with Field Calibration System, Konica Minolta standard, at 20 °C)
Precision (Z, σ): ±0.008 mm (TELE lens at a distance of 0.6 m, Konica Minolta standard, at 20 °C)
Input time (per scan): 2.5 s
Transfer time to host computer: approx. 1.5 s
Ambient lighting condition: office environment, 500 lx or less
Imaging element: 3D data: 1/3-inch frame-transfer CCD (307,200 pixels); color data: shared with 3D data (color separation by rotary filter)
Number of output pixels: 3D data/color data: 640 x 480
Output format: 3D data: Konica Minolta format, plus STL, DXF, OBJ, ASCII points and VRML (converted by the Polygon Editing Software, a standard accessory); color data: RGB 24-bit raster scan data
Data file size: total 3D and color data: 3.6 MB per scan
Viewfinder: 5.7-inch LCD (320 x 240 pixels)
Output interface: SCSI II (DMA synchronous transfer)
Power: commercial AC power, 100 to 240 V (50/60 Hz), rated current 0.6 A (at 100 V AC)
Dimensions: 221 (W) x 412 (H) x 282 (D) mm
Weight: approx. 15 kg (with lens attached)
Operating temperature/humidity range: 10 °C to 40 °C, relative humidity 65% or less, no condensation
Storage temperature/humidity range: 0 °C to 40 °C, relative humidity 85% or less (at 35 °C), no condensation
Regulatory approvals: UL 61010A-1, CSA C22.2 No. 1010-1, etc.
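
The "triangulation light block method" in the table is a variant of the classic triangulation principle: a laser stripe is projected onto the part and observed by a camera offset by a known baseline, and depth follows from the geometry. The sketch below shows the textbook single-point case with parallel optics; it is an illustrative simplification, not Konica Minolta's actual calibration model, and all numbers are hypothetical.

```python
def triangulated_depth(baseline_mm: float, focal_mm: float,
                       image_offset_mm: float) -> float:
    """Depth z = f * b / d for a camera-laser pair with parallel optical axes.

    baseline_mm:     distance between laser emitter and camera centre
    focal_mm:        focal length of the receiving lens (e.g. 25 mm TELE)
    image_offset_mm: lateral displacement of the laser spot on the sensor
    """
    return focal_mm * baseline_mm / image_offset_mm

# Hypothetical values: 100 mm baseline, TELE lens, 4.17 mm spot offset.
print(f"z = {triangulated_depth(100.0, 25.0, 4.17):.0f} mm")  # ~600 mm
```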

5.4 Conclusions

This chapter gave an overview of the current state of Reverse Engineering systems.

The 3D shape of an object can be digitized using many contact or contactless techniques, which vary in hardware cost and in the quality of the acquired points in terms of accuracy and resolution. Contact-based systems require a slow and complex acquisition process. Optical techniques make it possible to digitize three-dimensional shapes without contact with the surface of the object and in a very short time; in particular, the active stereovision techniques are based on structured light and optical triangulation methods. In this report we focused our attention on the most widespread systems, especially those suitable for industrial applications. Laser scanners are particularly attractive, since they can digitize a substantial volume fairly quickly with good accuracy and resolution. With reference to the specification and application criteria for RE systems presented in Section 5.2.9, the Konica Minolta VI-9i represents a good compromise. Our attention was focused on this system, and the Naples unit is going to purchase it. It can be integrated with the PSC-1 Photogrammetry System, a 3D data integration tool exclusively for the VI-9i that uses photogrammetric technology by Photomodeler. By means of this tool, relatively large parts such as doors, bumpers, interior panels, etc., for which merging measurement data was difficult with previous devices, can be rapidly measured and the data merged with very high accuracy.

5.6 Reverse Engineering Software: technical specs

3D3 Solutions (http://www.3d3solutions.com): Makes FlexScan3D, a relatively low-cost program for digitizing objects using one's own digital SLR camera and TV projector.
3D-SHAPE GmbH (Germany) (http://www.3d-shape.com): SLIM3D makes it possible to create a model from a set of different single 3D views. The registration, mesh reconstruction and visualization processing steps can also be run almost entirely automatically via Visual Basic Script.
Alias Wavefront (http://www.aliaswavefront.com): Spider is a module for the Studio Tools suite for editing, translating and operating on point cloud data in a variety of formats.
Anatomics Pty. Ltd. (http://www.qmi.asn.au/anatomics): BioBuild converts volumetric imaging data to rapid prototyping file formats. While designed for medical bio-modeling, it can also be used for reverse engineering applications.
C4W (France) (http://www.c4w.com): 3D Shop ModelFix is composed of 3D Shop ModelDesign plus three plug-in applications: 3D Doctor, which repairs shells and B-rep faces; 3D STL, which allows the creation of airtight meshes from approximate geometry; and 3D ReBuilder, which allows reverse engineering from point clouds and repair of mesh files.
Creative Dezign Concepts (http://www.dezignworks.net): Makes DezignWorks, reverse engineering software for SolidWorks. It automatically creates 2D and 3D features and provides interactive editing of complex curves.
Delcam PLC (UK) (http://www.delcam.com): CopyCAD creates complex CAD models from digitised data.
Floating Point Solutions (http://www.fpsols.com): Provides Meshworks for generating meshes from point clouds and manipulating and editing the resulting files, and PointCloud for reverse engineering applications.
Geometry Systems Inc. (http://www.geometrysystems.com): Offers GSI Studio and GSI Mesh Studio for entertainment and engineering applications. The software can directly control Minolta or Roland digitizers and has tools for mesh alignment, merging, smoothing, repair and other operations. Also provides GSI Crystal Studio for 2D and 3D laser sub-surface engraving applications, and GSI Viewer Studio for general 3D model viewing and display.

HighRES, Inc. (http://www.reverse-it.com): Offers plug-in software for CAD programs such as Unigraphics, Solid Edge, Pro/Engineer, SolidWorks and others for directly interfacing CMMs and other digitizers.
InnovMetric Software Inc. (Canada) (http://www.innovmetric.com): PolyWorks/Modeler 10, 3D modeling software for milling, rapid prototyping, reverse engineering and finite-element analysis applications.
Materialise (http://www.materialise.com): The Magics RP and RT suite can be used for reverse engineering applications.
Robert McNeel & Associates (http://www.rhino3d.com): Starting with a sketch, drawing, physical model or an idea, Rhino provides tools to model designs for rendering, animation, drafting, engineering, analysis and manufacturing. It creates, edits, analyzes and translates NURBS curves, surfaces and solids in Windows. The application also supports polygon meshes and point clouds.
Paraform Software (http://www.paraform.com/index.html): Paraform allows users to build parameterized surfaces from a wide variety of unstructured, complex, organic 3D polygonal objects. It can be used to create freeform surfaces from STL data, imports industry-standard file formats and handles extremely large or complex source geometry.
Raindrop Geomagic, Inc. (http://www.geomagic.com): Geomagic enables the creation of STL files and 3D models from point clouds from any source.
Rapidform, Inc. (http://www.rapidform.com): Produces RapidForm XO Verifier for comparing scan and CAD data, and RapidForm XOR Redesign for generating parametric CAD models from scan point clouds.
Simpleware Ltd. (UK) (http://www.simpleware.com): ScanFE provides a fully automated and robust conversion of 3D data such as CT, MRI and ultrasound into rapid prototyped and finite element models. It generates volumetric meshes which can be exported as STL and/or finite element files. Applications range from medical engineering to manufacturing and reverse engineering.
Virtual Grid (http://www.vrmesh.com): VRMesh Studio v4.0 is designed for reverse engineering and mesh modeling and includes a set of tools for smoothing, decimating, merging and marking point clouds.
VX Corporation (http://www.vx.com): Provides modularized software for mechanical design, mold design, CAM/CNC and reverse engineering. Modules include VX Mechanical, VX Modeler, VX Designer, VX Mold & Die and VX Machinist; all modules also include file translation capabilities.

5.7 Reverse Engineering Hardware: technical specs

5.7.1 Mechanical Touch Probe Systems

Aicon (http://www.aicon.de) [Mechanical probes and optical 3D measurement systems].
AXILA Inc. (http://www.axila.com) [Mechanical probe with optional laser sensor head].
FARO Technologies Inc. (http://www.faro.com) [Probe arms can be used with laser sensor heads; also provides long-range laser tracking systems].
Garda S.r.l. (Italy) (http://www.garda-misure.com) [Mechanical probe with optional laser sensor head].
Immersion, Inc. (http://www.immersion.com) [Manufactures the Microscribe line of inexpensive touch probe systems].
Lemoine (France) (http://www.lemoine.fr) [Mechanical probes].
Maxnc (http://www.desktopcnc.com/mfg_pages/maxnc.htm) [Mechanical probes].
Metronor (http://www.metronor.com) [Mechanical probes].
Microscribe (http://www.immersion.com) [Mechanical probes].
Renishaw, Inc. (UK) (http://www.renishaw.com) [Mechanical probe with optional laser sensor head].
Roland DGA Corp. (http://www.rolanddga.com) [Piezo mechanical probe and laser-based systems].
ROMER CimCore (http://www.romer.com) [Mechanical probe with optional laser sensor head].
xystum S.r.l. (Italy) (http://www.xystum.it) [Provides combination scanning and small milling machines using both touch-probe and laser technologies].

5.7.2 Line Scanners/Triangulation

3D Digital Corp. (http://www.3ddigitalcorp.com)
3rdTech, Inc. (http://www.3rdtech.com) [Large volume laser scanning; crime scene reconstruction a specialty. Can combine ranging data with separately gathered color video data].
Aicon (http://www.aicon.de) [Mechanical probes and optical 3D measurement systems; offers optimized solutions for measuring and inspection procedures].
Arius3D Inc. (http://www.arius3d.com) [The company primarily acts as a service bureau; uses proprietary NRC technology for colinear scanning of geometry and color].
Callidus Precision Systems GmbH (Germany) (http://www.callidus.de/en/index.html) [Makes laser scanning systems for anything from small parts up to the architectural and construction survey level].
Cyberware, Inc. (http://www.cyberware.com) [Laser video, color capability].
FARO Technologies Inc. (http://www.faro.com) [Probe arms can be used with laser sensor heads; also provides long-range laser tracking systems].
HandyScan (Canada) (http://www.handyscan3d.com) [A division of Creaform, Inc.; produces a hand-held, moderate-resolution scanner that sells in the range of US$30,000].
KREON Technologies (France) (http://kreon3d.com)
Laser Design, Inc. (http://www.laserdesign.com) [Manufacturer of high-accuracy 3D laser scanning systems since 1987, with offices in Detroit, Minneapolis, Seattle and India and distribution throughout Asia and Europe. The technology is used for inspection and reverse engineering of complex-shaped parts of all sizes].
MENSI, Inc. (http://www.mensi.com)
Minolta Corp. (http://www.minolta.com; see also http://www.minolta3d.com) [Can be used to digitize large objects; color capability also available; special versions for face and head scanning].
Metron Systems, Inc. (http://www.metronsys.com)
NexTec Group (http://www.nextec-wiz.com)
NextEngine, Inc. (http://www.nextengine.com) [Makes a desktop scanner with high resolution and reasonable speed that sells for US$2,495; also capable of scanning in color].
Perceptron (http://www.perceptron.com) [ScanWorks is a 3D laser triangulation sensor with controller and software. Configurations are available for use on portable CMMs, CMMs, etc. The ScanWorks software works with third-party point cloud software and CAD for applications such as reverse engineering, quality inspection and tool path generation].
Polhemus, Inc. (http://www.polhemus.com)
Renishaw, Inc. (UK) (http://www.renishaw.com) [Mechanical probe with optional laser sensor head].
RIEGL Laser Measurement Systems GmbH (Austria) (http://www.riegl.com) [Primary applications are large objects, architecture, etc.; optional color capability; can also be used as laser tracking systems].
Roland DGA Corp. (http://www.rolanddga.com) [Piezo mechanical probe and laser-based systems].
ROMER CimCore (http://www.romer.com) [Mechanical probe with optional laser sensor head].
Scan Technology (http://www.scantech.net)
ShapeGrabber (http://www.shapegrabber.com)
3D Scanners Ltd. (UK) (http://www.3dscanners.co.uk/homenew.htm)
Steinbichler Optotechnik GmbH (Germany) (http://www.steinbichler.com) [Laser line scanners and moiré-coded white light triangulation technologies].
Steintek GmbH (Germany) (http://www.steintek.de) [Laser line and white light point measurement systems].
Third Dimension Software Ltd. (UK) (http://www.third.com) [Makes specialized scanners for measuring small gaps in products and for large items such as palletized shipping containers].
xystum S.r.l. (Italy) (http://www.xystum.it) [Provides combination scanning and small milling machines using both touch-probe and laser technologies].
Z Corporation (http://www.zcorp.com) [Distributes Creaform's Handyscan 3D high-resolution, hand-held scanner].

5.7.3 Laser Trackers

FARO Technologies Inc. (http://www.faro.com) [Probe arms can be used with laser sensor heads; also provides long-range laser tracking systems].
Automated Precision (http://www.apisensor.com) [Laser tracker/touch probe].
ZETT-MESS Präzisions-Messmaschinen (http://www.zettmess.de)

5.7.4 Optical Radar

MetricVision (http://www.metricvision.com) [Near-IR coherent laser radar].
Leica Geosystems HDS, Inc. (http://www.cyra.com; http://www.leica-geosystems.com) [Formerly Cyra Technologies; HDS = High Definition Surveying division of Leica Geosystems. The technology is based on a laser radar, time-of-flight technique].
Surphaser (http://www.surphaser.com) [Instruments based on phase-shift detection ranging; color available from an auxiliary digital camera].

5.7.5 Color Capable Systems

3rdTech, Inc. (http://www.3rdtech.com) [Large volume laser scanning; crime scene reconstruction a specialty. Can combine ranging data with separately gathered color video data].
Arius3D Inc. (http://www.arius3d.com) [The company mainly acts as a service bureau; uses proprietary NRC technology for colinear scanning of geometry and colour].
Cyberware, Inc. (http://www.cyberware.com) [Laser video, color capable].
Genex Technologies, Inc. (http://www.genextech.com) [Projected white light patterns and CCD camera sensors; face, breast, ear and other body part scanning is a specialty; color capability].
InSpeck, Inc. (http://www.inspeck.com) [Moiré white light pattern generator with CCD camera sensors; face and body scanning a specialty; color-capable systems].
Minolta Corp. (http://www.minolta.com; see also http://www.minolta3d.com) [Can be used to digitize large objects; color capability also available; special versions for face and head scanning].
NextEngine, Inc. (http://www.nextengine.com) [Makes a desktop scanner with high resolution and reasonable speed at a very low price; also capable of scanning in colour].
RIEGL Laser Measurement Systems GmbH (Austria) (http://www.riegl.com) [Primary applications are large objects, architecture, etc.; optional color capability; can also be used as laser tracking systems].
Surphaser (http://www.surphaser.com) [Instruments based on phase-shift detection ranging; color available from an auxiliary digital camera].
VITRONIC Dr.-Ing. Stein Bildverarbeitungssysteme GmbH (Germany) (http://www.vitronic.com) [Structured white light systems for RE, machine vision, factory automation and QC; also produces specific instruments for body, face and other medical scanning applications; color capability available].
VX Technologies (Canada) (http://www.vxtechnologies.com) [Produces the StarCam series of light-based, color-capable digitizing systems, able to capture a surface in less than a second with an accuracy of 0.004 inches. The software automatically aligns and stitches multiple images, and the patented Internal Optical Reference (IOR) capability of the FW-3 series corrects for temperature changes and mechanical shocks to the camera].

5.7.6 3D Metrology Systems for Manufacturing

4DI Group / Brooks Automation (http://www.4dionline.com) [Structured laser light systems for automatic inspection].
Acuity Research (http://www.acuityresearch.com) [Laser time-of-flight-based systems].

LMI Technologies (http://www.lmint.com) [Laser-based machine vision systems for aluminum and iron manufacturing, automotive, robotic and lumber applications].
Steinbichler Optotechnik GmbH (Germany) (http://www.steinbichler.com) [Laser line scanners and moiré-coded white light triangulation technologies; tire testing and auto body QC are specialties].
VITRONIC Dr.-Ing. Stein Bildverarbeitungssysteme GmbH (Germany) (http://www.vitronic.com) [Structured white light systems for RE, machine vision, factory automation and QC; also produces specific instruments for body, face and other medical scanning applications].

5.7.7 Scanners for Very Large Objects and Surveying Applications

3rdTech, Inc. (http://www.3rdtech.com) [Large volume laser scanning; crime scene reconstruction a specialty. Can combine ranging data with separately gathered color video data].
Automated Precision (http://www.apisensor.com) [Laser tracker/touch probe].
Callidus Precision Systems GmbH (Germany) (http://www.callidus.de/en/index.html) [Makes laser scanning systems for anything from small parts up to the architectural and construction survey level].
Laserdenta AG (Switzerland) (http://www.laserdenta.com)
Leica Geosystems HDS, Inc. (http://www.cyra.com) [Formerly Cyra Technologies; HDS = High Definition Surveying division of Leica Geosystems. The technology is based on a laser radar, time-of-flight technique. See also Leica Geosystems AG].
MetricVision (http://www.metricvision.com) [Near-IR coherent laser radar].
Minolta Corp. (http://www.minolta.com; see also http://www.minolta3d.com) [Can be used to digitize large objects; color capability also available; special versions for face and head scanning].
RIEGL Laser Measurement Systems GmbH (Austria) (http://www.riegl.com) [Primary applications are large objects, architecture, etc.; optional color capability; can also be used as laser tracking systems].
Steintek GmbH (Germany) (http://www.steintek.de) [Provides the laser line scanner-based MobileScan3D optical mobile coordinate measurement system].
Surphaser (http://www.surphaser.com) [Instruments based on phase-shift detection ranging; color available from an auxiliary digital camera].
Vexcel Corporation (http://www.vexcel.com) [Provides a wide variety of data and hardware products for manipulating and interpreting data from 3D sources such as synthetic aperture radar, satellites, architectural scanning, etc.].

6 Multimodal Annotations

6.1 Introduction: paper annotation, electronic annotation and web annotation

This report illustrates the state of the art of interactive systems in which users can add knowledge to the system itself by different annotation techniques. The document is organized as follows: first, annotation is defined and some multimodal annotation tools for 2D documents are reviewed; next, 3D multimodal interactive systems based on annotation are presented; finally, several XML DBMSs are outlined in order to present different storage strategies for annotation documents.

An annotation is "a note, added by way of comment or explanation" [1] to a document or to a part of a document. The entity being annotated is called the base of the annotation. The base is often made evident by a visual identifier, e.g. a curve surrounding a part of the document. In some cases, the human annotating a base makes the link between annotation and base explicit using a visual link, for example by tracing an arrow between the two [2]. On the whole, a human creates an annotation using a set of graphical elements which s/he considers meaningful (letters, elements of an alphabet, icons), and uses visual identifiers and visual links whenever s/he considers it necessary to make the link between annotation and base clearer. In the paper-based world, annotations can exist 'within the document', such as in-line, in the margin or at the top or bottom of the page, or 'stand alone', such as written on a separate piece of paper [3].

When documents and annotations become "electronic", they are no longer recorded on a permanent support, but exist 'virtually' as the result of the interpretation of a program by a computer. An electronic annotation is a multimedia note that is associated with, and provides a comment or an explanation to, an e-document or a part of it [4], [5]. Users can access and annotate e-documents because the computational process generates physical representations perceivable by them, for example images on a screen. These physical representations exist and are perceivable only as long as the electronic machinery maintains them. E-documents and e-annotations are thus less persistent than paper-based ones, but this dependence on a machine offers some advantages: interactive computers allow e-documents and e-annotations to be managed and adapted by their users more easily than paper-based ones, and they can evolve during their usage and adapt to their users. The physical representation of documents and annotations results from a mapping of the content of the document into output events perceivable by users (e.g. images on the screen or speech through a loudspeaker). The process of creating the content of a document is separate from the process of its materialization (physical representation), which may be multimodal. The materialization can be adapted to the culture, skills and abilities of the current user without altering the content.

When a document becomes a web document, it can be formed by parts stored in different archives on the web, yet it appears to the users as a single entity [6]. Similarly, a web annotation is a document associated with an element of the root document. Being a web document, the web annotation can in turn be stored in different archives and can be created through the cooperative activity of users of the root document, who may be geographically dispersed and work asynchronously.
Note that e-annotation and web annotation can now be performed via visual interfaces and the document being annotated can be a 2D image as well as a 3D structure [1][7][8][9].
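
The definition above suggests a simple data model: an annotation carries a base (the entity it comments on), a body, an author, and optional visual identifier/link information. The sketch below makes this concrete; the field names and example values are our own, not from any standard cited in this report.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Annotation:
    base: str                                 # id of the (part of a) document annotated
    body: str                                 # the note itself: text, or a media URI
    author: str
    visual_identifier: Optional[str] = None   # e.g. a curve surrounding the base
    visual_link: Optional[str] = None         # e.g. an arrow from note to base

note = Annotation(base="report.pdf#page=3",
                  body="Check this tolerance against the CAD model.",
                  author="reviewer1",
                  visual_identifier="curve around paragraph 2")
print(note)
```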

6.2 Annotation in 2D environments

Tools for Collaborative Annotation: a comparison among three annotation styles developed by UNIMI (University of Milano)

An annotation system consists of a set of services which first allow the creation of metadata describing a collection of digital objects; such metadata are then used to create, store, retrieve and present documents and annotations to the users. An annotation system allows the users to manage both the documents and the annotations depending on their role and skill level.

Since each annotation system determines an annotation style, this section investigates, by means of a comparison between three systems, which annotation styles perform better depending on goals, context and users. The section builds on findings on annotation systems achieved by the University of Milan (UNIMI) group through its participation in three different annotation projects: DELOS, T.Arc.H.N.A and BANCO.

Within DELOS [10], the UNIMI unit collaborated with the "Laboratory of Distributed Multimedia Information Systems and Applications" of the Technical University of Crete (TUC/MUSIC) to integrate SyMPA (System for Multimedia Presentation Authoring), an annotation tool developed by UNIMI, with a framework developed by TUC. The integrated system allows the annotation of multimedia documents by means of descriptors (a set of domain-specific indexes) or structured XML annotations based on MPEG-7/21. In T.Arc.H.N.A [11,12], the UNIMI unit collaborated with several European groups – including archaeologists as domain experts – to develop a system which allows the annotation of archaeological documents by means of narrations, a type of annotation characterized by the capability of being assembled at run time. In the BANCO project [13,14], the UNIMI group collaborated with domain experts and ICT companies to develop a system which allows the annotation of multimedia documents by means of pure annotations, developed by domain experts in the field.

A. SyMPA annotation activities

Annotation in SyMPA is a tool to create metadata in order to retrieve and organize collections of multimedia digital objects. A domain expert is in charge of performing the annotation, while a system component uses the metadata to organize the collection, aggregating objects into structured sets automatically. Annotation activities in SyMPA are based on three types of annotations: labels, descriptors, and automatically extracted low-level multimedia object features, such as MIME type, file size or bit rate. Descriptors are used by the system to group objects into structured sets based on the content relationships between them. SyMPA's annotation activities are:
a. labels: creation, storage, retrieval;
b. descriptors: creation, storage, retrieval;
c. automatically extracted metadata: creation, storage, retrieval;
d. structured sets of objects: creation, storage, retrieval.
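
The third annotation type above (automatically extracted low-level features) can be illustrated with a minimal sketch using only standard-library facilities; SyMPA's own extractor is not described in this report, and the file name below is hypothetical.

```python
import mimetypes
import os

def extract_low_level_features(path: str) -> dict:
    """Return automatically derivable metadata for a media file."""
    mime, _ = mimetypes.guess_type(path)
    return {
        "mime_type": mime or "application/octet-stream",
        "file_size": os.path.getsize(path) if os.path.exists(path) else None,
    }

print(extract_low_level_features("clip.mp4"))  # hypothetical file name
```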

B. T.Arc.H.N.A annotation activities

The approach proposed in the T.Arc.H.N.A project considers the Cultural Heritage (CH) as a set of objects automatically retrievable from several federated museum archives. To contextualize the related artefacts and enrich their semantics according to their historical and anthropological meanings, the T.Arc.H.N.A project introduces the concept of "narration". A narration is a document written by domain experts, able to contextualize cultural data in a novel way by relating different types of artefacts retrieved from different databases. In the adopted knowledge system, a narration is composed of multiple components: a text and a context. The context can be viewed as a particular kind of annotation.

The approach supports two different types of narration: direct and indirect. In a direct narration, the text is a detailed interpretation by a domain expert, such as an explanation of a given finding or monument, or of a given, invariable and limited group of objects, and its context is defined by means of an explicit description of the relations between the text and the objects involved in it. For instance, a direct narration might concern the function of a specific trumpet as a musical instrument, or the role of a given object as a votive offering. In an indirect narration, instead, the text is a more general document that deals with arguments involving a (possibly large) group of artefacts. Such a narration typically focuses on a given topic, such as a description of an iconography and its formal origin, a class of monuments or paintings, the role of a given group of objects, and so on. In this case the context expresses the network of relations that connect the text with the set of artefacts related to it. For instance, an indirect narration might deal with the theme of devotional practices or with music in ancient Etruria.

It is important to note that each archive in the federation of databases can be updated, for example because the institution maintaining it acquires a new exhibit; moreover, the federation can be joined by new member institutions that contribute new databases. In all these cases, the set of artefacts collected as the result of interpreting the context of an indirect narration differs from the previously collected sets; hence the context, and the narration itself, are document systems that vary in time. Under this perspective, in a direct narration the context is an extensional description that connects the text with a well-defined set of objects – for example, a narration describing the winged horses or a specific vase located in the Louvre museum. In this case the context is statically defined. In an indirect narration, instead, the context is an intensional description dealing with topics involving a larger and not pre-defined group of objects – for example, a narration describing the use of bronze in Etruria, the funerary rituals, or Etruscan music. In this case the context varies in time because the contents of the databases change over time.
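
The extensional/intensional distinction drawn above can be made concrete in a few lines: a direct narration's context is a fixed list of artefacts, while an indirect narration's context is a query re-evaluated against the federation at read time, so its artefact set changes as the archives change. The tiny in-memory "federation" below is a stand-in for the real databases, and all names are invented.

```python
federation = [
    {"id": "vase-17", "material": "bronze", "site": "Cerveteri"},
    {"id": "trumpet-2", "material": "bronze", "site": "Tarquinia"},
]

direct_context = ["trumpet-2"]          # extensional: a fixed set of objects

def indirect_context(archives):
    """Intensional context: a query evaluated each time it is read."""
    return [a["id"] for a in archives if a["material"] == "bronze"]

print(indirect_context(federation))     # ['vase-17', 'trumpet-2']
federation.append({"id": "mirror-5", "material": "bronze", "site": "Vulci"})
print(indirect_context(federation))     # now includes the new exhibit
print(direct_context)                   # unchanged by the update
```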

C. BANCO annotation activities

BANCO's main goal is to support cooperative work and knowledge enhancement through annotation. Annotations are created and managed by end-users, who can arrange them in threads and create static links between annotations or parts of a multimedia document. Annotation indexing is based on links and descriptors (automatically extracted or user-defined); automatic extraction of low-level document features is also supported. BANCO is an approach to the social representation of a community's emotional and social moods about a theme that can be mapped onto an image (such as a thematic map of a territory, medical images and body maps in medicine, images depicting social relations, etc.). The BANCO approach is based on the development of a map-based wiki (MPW) which is maintained and updated by its users through multimedia annotations. MPW users can act as authors or as consumers:

1. Users as knowledge consumers can:
− navigate in the wiki, which is organized as an engine that permits access to a federation of databases concerning the domain of knowledge of interest.

2. Users as knowledge producers can:
− signal the presence of an "expression of mood" by creating a graphical tag (GT), also called a visual link; the graphical tag is an active button whose selection opens a thread of discussion consisting of one or more notes;
− add a multimedia text to the thread of discussion.

BANCO's annotation features and activities are the following.

Graphical tag creation and meaning
- The creation consists of selecting a GT from a tag palette and dragging it to a point on the map; the GT becomes associated with that point;
- the GT is the synthetic representation of an emotional mood (e.g. <dangerous, disappointing, neutral, enjoyable, gorgeous>);
- the GT is represented according to the reader's culture (i.e. adopting localized colour, graphic and articulatory languages);
- the GT acts as a title: it is the most abstract representation of the mood emerging from the multimedia text in the thread.

Association of a thread of discussion with a map element
Users can associate their thread of discussion with a map element (including the map itself).
− If the element is a point element:
  • if it is an existing point element, users can graphically link a GT to the element;
  • else, if the point element does not exist in the map, users can create a point element according to the visual language adopted to create the map, and link the GT to it.
− If the element is a region element:
  • users identify the region by tracing its freehand boundary, defining a colour that identifies the area and a level of transparency; colour and transparency are defined according to the reader's culture;
  • the defined region is an active element that opens a thread of discussion when selected;
  • the defined region can be associated with, or contain, one or more GTs.

Note creation and meaning
By selecting a GT, users can access the note thread associated with that GT.
• If they agree with the mood expressed by the GT, they can add a note consisting of a text, a visual, video or audio element, or a combination of them;
• else they can open a new thread, creating a new GT that summarizes their mood.

Implementation issues
The implementation is based on the use of several archives:
- an archive of mood definitions;
- a set of user culture profile archives, i.e. graphical, colour and action definitions, one for each culture taken into account;
- a set of (pre-existing) archives that store knowledge about the territory at hand, which are federated using a domain ontology.

The implementation relies on:
- a mechanism that associates mood definitions with local archives at user login time; at login time, this mechanism generates a version of the BANCO environment localized to the culture, situation and context in which the user operates;
- an engine that uses the domain ontology to retrieve data in the federated databases.
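
As a minimal sketch, the BANCO objects described above reduce to a graphical tag anchored to a map point that opens a thread of multimedia notes. The mood vocabulary mirrors the text, but the class and field names are our own, not BANCO's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MOODS = ("dangerous", "disappointing", "neutral", "enjoyable", "gorgeous")

@dataclass
class GraphicalTag:
    mood: str                        # one of MOODS
    position: Tuple[float, float]    # map point the GT is dragged to
    thread: List[str] = field(default_factory=list)  # notes: text or media URIs

gt = GraphicalTag(mood="enjoyable", position=(45.46, 9.19))
gt.thread.append("Great spot for the new playground.")  # an agreeing note
print(gt.mood, len(gt.thread))
```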

Fig 1. A sketch of a typical BANCO implementation, in which two different users access a common environment through two localized BANCO instances.

In the following, several other approaches for annotating documents in 2D environments are presented.

Del.icio.us, Digg, BlinkList

These three tools are tagging systems intended to organize information gathered from the web. The tag is a particular form of annotation. An annotation can be defined as:
1. a set of keywords (tags);
2. a free text;
3. a set of pointers to other resources.

Del.icio.us, Digg and BlinkList accept annotations in forms 1 and 2 and differ in how they organize the annotations. Recently there has been an explosion of systems aimed at supporting social tagging/bookmarking for photos (e.g. flickr.com), videos (e.g. youtube.com) or web resources (e.g. Del.icio.us, Digg, BlinkList). These systems provide a means for users to generate tags and comments (keywords and free texts) that can then be browsed and searched. Tags are defined using free-form keywords and can be organized to provide a meaningful navigation structure and user-centric knowledge management. [ws1, ws2, ws3]
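
The tagging model just described amounts to free-form keywords attached to resources and browsable in both directions (tag to resources, resource to tags). The sketch below mirrors that idea in a few lines; it is a generic illustration, not any one site's API.

```python
from collections import defaultdict

tag_to_resources = defaultdict(set)
resource_to_tags = defaultdict(set)

def tag(resource: str, *keywords: str) -> None:
    """Attach free-form keywords to a resource (form 1 annotation)."""
    for k in keywords:
        tag_to_resources[k].add(resource)
        resource_to_tags[resource].add(k)

tag("http://example.com/article", "reverse-engineering", "scanners")
tag("http://example.com/photo", "scanners")
print(sorted(tag_to_resources["scanners"]))  # both resources are found
```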

Pliny and traditional scholarly practice

Pliny is developed in a plug-in framework that supports the on-screen integration of independently built tools. The plug-in model allows tools to share screen space in ways that promote integrated thinking by the end user. In the example of figure 2, the user has chosen to annotate a selected image. The annotations, although appearing in the VLMA object viewer, are actually elements of the Pliny annotation framework, and in that context they coexist with annotations created in the other environments Pliny supports – web pages, PDF documents, images, etc. (figure 2). [ws4]

Fig 2. Pliny in operation with notes attached to an image.

DesignDesk ViewLink

DesignDesk VIEWLink provides the flexibility to organize, view, locate and annotate CAD drawings. VIEWLink does not require a CAD system to locate, view (both raster and vector), print, annotate or redline drawings. CAD users are also able to immediately launch viewed drawings within their CAD system, or launch the appropriate drawing based on a part number from their ERP system. DesignDesk Viewer (which works without CAD) provides convenient tools to query and retrieve drawings. Users can locate drawings by common attribute data such as drawing number, title, project, name or version number. A query feature also allows selecting the desired drawing from a bird's-eye view of the drawings that meet the search criteria. Once users have found their drawings, they can save the queries for reuse. [ws5]

Eroiica Edit

Facilitates viewing, annotating, redlining and editing of large, engineering-size raster and vector file formats such as ACIS, AutoCAD DWG, DXF, HPGL, CALS, CGM, DGN, ME10, PNG, etc. [ws6]

eReview

eReview allows users to view and mark up many different types of documents and drawings without having or using the application that created them. Mark-ups and annotations are non-destructive, i.e. the original document is never changed; rather, the mark-ups, annotations and comments are saved in a corollary file, providing a history of suggested changes. This process prevents uncontrolled changes to a document or drawing while protecting the integrity of the document under review. Collaboration simplifies and enhances the review process: what used to be sequential, time-consuming and difficult to accomplish becomes intuitive, natural and real-time within a fully collaborative web environment. Real-time collaboration saves time, money and travel, and allows better and quicker decisions. [ws7]
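
The non-destructive "corollary file" idea above can be sketched as follows: the original file is never touched, and annotations accumulate in a sidecar file next to it, preserving a review history. The JSON layout and file naming are assumptions for illustration, not eReview's actual format.

```python
import json
import os
import time

def add_markup(document_path: str, author: str, comment: str) -> None:
    """Append a mark-up to a sidecar file; the document itself is unchanged."""
    sidecar = document_path + ".markup.json"
    history = []
    if os.path.exists(sidecar):
        with open(sidecar) as f:
            history = json.load(f)
    history.append({"author": author, "comment": comment, "time": time.time()})
    with open(sidecar, "w") as f:
        json.dump(history, f, indent=2)

add_markup("drawing.dwg", "reviewer1", "Move this rib 5 mm inboard.")
```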

6.3 Annotation in 3D environments

The Virtual Annotation System

Annotation is a key operation for developing an understanding of complex data or extended spaces. The "Virtual Annotation System" (VAnno) is a flexible set of annotation tools that can be embedded in a variety of applications. These tools offer a full set of capabilities for inserting, iconizing, playing back and organizing annotations in a virtual space, together with an intuitive, easy-to-use interface for employing these capabilities while immersed in the virtual environment. VAnno is a complete set of tools for making, finding and reviewing annotations while immersed in virtual environments, and it shows how a simple collection of controls, icons and menus can serve as an intuitive, easy-to-use interface for annotations. Since any object in the scene can be annotated (including controls and menus), the system supports the development of training environments; it can also easily be extended to display annotations containing text or images. By selecting annotations organized in the annotation menu, the user can review important points or take tours of the data and the virtual space. [15]

CATIA 3D Functional Tolerancing & Annotation 2 (FTA)

CATIA 3D Functional Tolerancing and Annotation 2 (FTA) is a new-generation CATIA product that addresses the easy definition and management of tolerance specifications and annotations of 3D parts and products. Its intuitive interface provides an ideal solution for new CATIA customers in small and medium-size industries, reduces reliance on 2D drawings and increases the use of 3D as the master representation. 3D annotations can be extracted using the annotation plane concept in the CATIA Generative Drafting products. A syntactic and semantic verification command makes it possible to check the correctness of all the annotations of the active product or part against the standard used. Finally, the 3D annotations can be reviewed using the ENOVIA DMU Dimensioning & Tolerancing Review 1 product (DT1) or the DELMIA DMU Dimensioning and Tolerancing Review 2 product (MTR), which offer comprehensive tools for interpreting annotations and tolerances on specific areas of the design or across the complete digital mock-up. [ws8]

Annotation Authoring in Collaborative 3D Virtual Environments

Annotations may be thought of as author-attributable content placed within a 3D scene in association with (or reference to) a particular element of that scene. For example, an expert in archaeology may enter an environment that contains a 3D model of an archaeological artefact in need of some form of commentary. Using scene annotation capabilities, this user should be able to add commentary in association with a specific aspect (or aspects) of the artefact. This new content would persist in the environment and be made available to other users as a form of annotation attributed to the author. It should also be searchable, so that other users could locate all commentary on a specific artefact or created by a specific author. To this end, basic conventions have been developed by which annotations can be created independently of the form of the media; it is possible to create an annotation even if the object was not originally designed to be annotated. This is important in order to provide the richest possible set of tools for expressing comments and annotations. Users can, of course, make a new annotation object using, for example, a text input interface. [16]

Composing PDF Documents with 3D Content from MicroStation

MicroStation V8 2004 Edition includes the ability to print 3D design file geometry directly to 3D annotations within a PDF file. This is an exciting addition for MicroStation users, as it lets them easily distribute 3D designs within PDF documents. Typically, PDF documents printed from MicroStation contain 3D annotations that encapsulate everything required to visualize a design, including model geometry, materials, lighting and texture maps. 3D annotations can also contain animations, both of the model geometry and "fly-through" animations of the viewing camera. The documentation includes suggestions for integrating 3D annotations into existing documents, and instructions for adding links and bookmarks that let the user interactively control the viewing of the 3D content. MicroStation's 3D printing produces either separate PDF documents with one 3D annotation each, or (with batch printing) a multi-page PDF document with separate 3D annotations on each page. While these can be useful in their own right, the real power of 3D annotations within PDF lies in the ability to include them within an existing PDF document such as a marketing brochure, a design portfolio or a technical manual. [ws9]

A Direct-Manipulation Tool for JavaScripting Animated Exploded Views Using Acrobat 7.0 Professional

This 3D annotation approach is animated by a JavaScript which interpolates the positions of CAD model parts (meshes) over time. A JavaScript tool is used to arrange these parts visually, establishing the relative positions of the exploded view. Once the meshes are arranged, their translations can be printed as lines of code in the console, then copied and pasted into the document-side JavaScript, where they become a permanent part of the PDF document. [ws10]
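
The interpolation at the heart of such an exploded-view animation is simple: each part translates linearly from its assembled position to its exploded position as a time parameter runs from 0 to 1. The sketch below shows the idea in the document's chosen illustration language rather than Acrobat's JavaScript; the part names and offsets are invented.

```python
def lerp(a, b, t):
    """Linear interpolation between two 3D positions."""
    return tuple(a_i + (b_i - a_i) * t for a_i, b_i in zip(a, b))

exploded_offsets = {            # hypothetical per-part translations
    "housing": (0.0, 0.0, 0.0),
    "rotor":   (0.0, 0.0, 80.0),
    "cover":   (0.0, 0.0, 160.0),
}

for t in (0.0, 0.5, 1.0):       # three frames of the animation
    frame = {part: lerp((0.0, 0.0, 0.0), off, t)
             for part, off in exploded_offsets.items()}
    print(t, frame["cover"])    # the cover moves 0 -> 80 -> 160 along z
```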

NX I-deas Master Notation: for documenting solid model designs

Master Notation provides a palette of annotation tools that can be used within Master Modeler for documenting parts, or within Master Assembly for documenting assemblies. Master Modeler and Master Assembly provide capabilities for adding key/driving dimensions, annotation dimensions and notes to the design. Master Notation adds optional capabilities, with additional annotation symbology and organizational tools, and provides many ways to attach annotation, with or without leaders, or stacked upon other symbols or notes. NX I-deas goes beyond simple attachment and also lets the user select other associated geometry: it is a simple task to query an annotation and see what geometry it is associated with. Annotation can be attached and associated to multiple geometric entities, and subsequent queries of the annotation or geometry will highlight these associations, providing design intent not available in standalone documentation. [ws11]

Immersive redlining and annotation of 3D design models on the Web

Using this system, users can view a shared design model stored on a central server from their VRML-enabled web browsers. They can add text annotation objects to the design model, which appear as small coloured spheres with numbers on them. The spheres are linked to text that is maintained in a log on the server, and the colours of the spheres indicate the authors' identities. The annotation objects (and all additions and changes made by users) are stored in separate VRML files on the server, which are loaded on top of the base-model VRML file containing the original design. User annotations are available for all to see and comment on. [17]
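
An annotation overlay like the one described above can be emitted as a small VRML file loaded on top of the base model: a coloured sphere at the picked point, with the author identity encoded as the colour. The sketch below generates generic VRML97, not the cited system's actual output, and all coordinates are hypothetical.

```python
VRML_SPHERE = """Transform {{
  translation {x} {y} {z}
  children Shape {{
    appearance Appearance {{
      material Material {{ diffuseColor {r} {g} {b} }}
    }}
    geometry Sphere {{ radius 0.01 }}
  }}
}}
"""

def annotation_sphere(x, y, z, rgb=(1.0, 0.0, 0.0)) -> str:
    """Return a standalone VRML97 file containing one annotation marker."""
    r, g, b = rgb  # author identity encoded as the sphere colour
    return "#VRML V2.0 utf8\n" + VRML_SPHERE.format(x=x, y=y, z=z, r=r, g=g, b=b)

print(annotation_sphere(0.12, 0.30, 0.05))
```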

Post Processing Tips & Hints: annotation in ANSYS

The first thing to understand about annotation in ANSYS is that it comes in two flavours: 2D and 3D. A 2D annotation is writing on a transparent window on top of the graphics; a 3D annotation creates objects that are attached to ANSYS entities such as nodes, areas or lines. This means that when a user moves the model by rotating or panning, 2D annotations stay put while 3D annotations move with the entities they are attached to. With the controls for entity types for 3D annotation, the user specifies where to place the object by one of three methods, depending on the chosen option: specifying the location type in the second drop-down and clicking on a model entity; specifying an X,Y,Z position; or picking a point on the screen. If the user clicks on a model entity, ANSYS converts that entity's location into an X,Y,Z position; if the user clicks on the screen, the point is placed on the view plane directly under the pick. [18]

Boom Chameleon: simultaneous capture of 3D viewpoint, voice and gesture annotations on a spatially-aware display

The Boom Chameleon is a novel input/output device consisting of a flat-panel display mounted on a tracked mechanical armature. The display acts as a physical window into 3D virtual environments, through which a one-to-one mapping between real and virtual space is preserved. The Boom Chameleon is further augmented with a touch-screen and a microphone/speaker combination. The 3D annotation application exploits this unique configuration to simultaneously capture viewpoint, voice and gesture information. The results of an informal user study show that the Boom Chameleon annotation facilities have the potential to be an effective and intuitive system for reviewing 3D designs. [19]

ANNOT3D DESCRIPTION Annot3D is a 3D annotation system that allows users to add various 3D annotations to 3D volume data. Users can then interact with this data (rotate, translate, scale, pick annotations, etc.) via a simple viewing window. The underlying rendering system is the Visualization ToolKit (VTK). Annot3D can be run locally on Windows and Linux. It can also be run as a web server on Windows. Annot3D is designed to be a simple and integrated method for annotating and packaging 3D visualizations for educational purposes. Specifically, Annot3D is meant to be used as a tool for human anatomy education. The current version of Annot3D fulfils the following requirements:
- read scanned data into a 3D rendering system;
- incorporate a set of 3D annotations that have been requested by medical instructors;
- include an authoring tool for adding 3D annotations to the model.
Annot3D provides the ability to add the following annotations to models: buttons (dots and boxes), lines, boundaries, text, clipping planes, and spheres/boxes that allow a portion of the data volume to be highlighted or visualized by removal of surrounding data. The Annot3D authoring system currently consists of XML files generated by hand to contain all information related to a dataset and its annotations. [ws12].

Drawing for Illustration and Annotation in 3D While 2D annotation is widespread, few systems provide a straightforward way to do this in 3D. In a typical user session, the system initially loads the 3D model. Drawing is then integrated with the model in the same way as for local surfaces: if a stroke is drawn on the model, it is set to lie on it. Line strokes can be used for adding annotation symbols (e.g., arrows, text, etc.). Figure 5 (b) shows an example of annotation: adding a coarse landscaping sketch around an architectural model. Figure 5 (a) shows annotation used for educational purposes: a heart model is annotated in a classroom-like context, for instance during an anatomy course. The use of a well-chosen clipping plane gives an inside view of the model and allows the user to draw anatomical details inside it. Another example could be the use of the system in collaborative design sessions, where annotation could be employed to coarsely indicate which parts of the model should be changed and to exchange ideas in a brainstorming context. [20].

Fig 5. Examples of annotations in 3D: (a) a heart model annotated for an anatomy course; (b) a coarse landscaping sketch around an architectural model

Markup and Drawing Annotation Tools Hypercosm Teleporter Pro for SketchUp allows users to add markup to 3D models. By adding markers and annotations to the 3D scene, it is possible to point out certain features of a model or to call attention to a particular part of it. Since these markers are actually part of the 3D scene, when the view is changed they stay with the part of the 3D model they belong to. It is also possible to annotate the model just by "scribbling" on it as if drawing on a piece of paper. The "draw line" control allows the user to place the individual vertices of a line, like carefully drawing lines with a ruler or laying out a piece of string. The "freehand" control allows the user to casually scribble right on top of the 3D object, just like a graffiti artist with a can of spray paint. Together, these tools give additional freedom for creatively expressing ideas superimposed on a 3D model. [ws13].

Space Pen Annotation and sketching on 3D models on the Internet Space Pen works directly on any regular web browser with Java 1.3 capabilities. When it is completed, architects, clients and/or contractors will be able to use it to review and comment on a 3D representation of the architect’s design proposal. Participants log on to the Space Pen web site to view the project in 3D in their own web browsers. They move around the building as they would once it has been built and leave drawing marks and annotations directly onto the 3D model, and save those as part of the model. Each user chooses a different pen colour to represent their comments or drawing. Figures 6 and 7 show the Space Pen applet window. Once logged on, the user becomes immersed in the virtual 3D model. On the left is a set of different pen colours for drawing annotations on the model. The user can navigate around the model using the arrow keys of the keyboard. [21].

Fig 6. Drawing marks left by the first user.

Fig 7. Drawing marks left by a second user on top of the first ones.

6.4 XML DBMS

6.4.1 Non-native XML Databases

CACHE Cache is a "post-relational" database that stores data in multi-dimensional arrays and provides SQL, object-oriented, and multi-dimensional access to that data. Cache supports XML in three ways. First, it provides an implementation of XML data binding, which uses the object-oriented view of the data. This contains tools for creating DTDs and XML Schemas from classes stored in Cache and for creating classes from XML Schemas. Data can then be transferred between XML documents and objects stored in the database. Second, methods on any class stored in Cache can be exposed as Web services. Third, Cache's Web application server (Cache Server Pages) can use the data binding functionality of Cache classes to expose data as XML over the Web. Database type: Post-relational, Commercial [ws14]

eXtremeDB eXtremeDB is an object-oriented in-memory database. It supports a standard navigational API, as well as an application-specific API generated from the database schema. An SQL implementation is available as an add-on. eXtremeDB is available in three versions: a standard version (which is available for conventional or shared memory), a single-threaded version, and a high-availability version. Transaction logging is available as an add-on. eXtremeDB supports XML through XML data binding. From a database schema, it can generate methods to create new objects from XML documents, serialize existing objects as XML documents, and update existing objects from XML documents. In addition, it can generate XML Schema documents for the generated XML documents. The mapping from objects to an XML Schema was "developed in accordance with the W3C SOAP encoding recommendations" so that serialized objects can be used inside SOAP messages. Database type: Object-oriented, Commercial [ws15]

Informix Informix supports XML through its Object Translator and through the Web DataBlade. The Object Translator generates object code, including the ability for objects to transfer their data to and from the database. It also supports functionality such as transactions and optimistic and pessimistic locking. XML support is provided through generated methods that transfer data between objects and XML documents. A GUI tool allows users to create object-relational mappings from XML documents to the database, also specifying how to construct intermediate objects. Version 2.0 of the Object Translator (beta expected in Sept. 2000) is expected to support SOAP as well as be able to generate XML DTDs from object schema and relational schema.

The Web DataBlade is an application that creates XML documents from templates containing embedded SQL statements and other scripting language commands. It is run from a Web server and supports most major Web server APIs. Database type: Relational, Commercial [ws16]

Matisse Matisse is an object-oriented database that has three different APIs. First is native object support, which is available in a number of languages, including C++, Java, C#, Eiffel, Perl, Python and PHP. Second is an implementation of SQL 99, including support for stored procedures and triggers. This is available through JDBC and ODBC. Of interest, this can be used to enforce the referential integrity of objects. The third interface is XML. XML is supported through XML data binding, in which XML documents are mapped to objects. The mapping is done by a utility that generates class schema from a DTD. Data can then be transferred between XML documents and objects in the database according to this mapping, and data loaded into objects is validated against the object schema. Matisse also provides "an object API to manipulate XML documents", although it is not clear if this is the DOM. Matisse supports transactions and uses versioning instead of logs as a way to improve update speed. (Versions are also directly available to the user, so historical queries can be performed.) Other features include support for .NET and J2EE, a database administration API, "fast joins" through "pre-computed relationships", multi-threading, and client-side and server-side caching. Database type: Object-oriented, Commercial [ws17]

OpenInsight OpenInsight is a development environment that can access data in its own "native tables", as well as Oracle, Lotus Notes, and ODBC and OLE/DB databases. The native tables, which are the same as those used by the DOS-based Advanced Revelation development environment, are a multi-valued database. It appears that XML support is limited to data stored in the native tables. The mapping between XML documents and native tables appears to be table-based, in which a single document can contain multiple rows from the same table. Because columns in the native tables can be multi-valued, child elements in the XML document can occur multiple times. It also appears that only child elements (and not attributes) are supported. Users can map XML documents to the database in one of two ways. Either they can map individual child elements to individual columns, or they can map the entire document to a single column and map child elements to calculated columns. OpenInsight can retrieve data from the database in three ways. First, it can execute a query and return the data as XML. Second, it can execute XPath queries against the database, presumably viewing the database as a set of virtual XML documents defined by a particular mapping. Third, it can retrieve data from specific columns in a particular row according to the element types to which those columns have been mapped. An XML Schema is required to store data from an XML document in the database. (DTDs are not supported.) OpenInsight can generate an XML Schema for a given table. OpenInsight includes GUI-based tools for importing XML documents into native tables, retrieving the result of queries as XML, viewing data stored in the database as XML, and executing XPath queries against the database. Database type: Multi-valued, Commercial [ws18]

Oracle Oracle provides support for an XML data type, SQL/XML, XQuery, XSLT, the DOM, XML indexes, the XML DB Repository, and XML Schemas. XML support starts with the XMLType data type, which implements the XML data type defined by SQL/XML. Data in XMLType columns can be stored in one of two ways: with object-relational storage or as a CLOB. Application code is the same regardless of which option is chosen, although changing from one storage type to the other requires a database export and import. In addition, there are a number of significant differences between the storage mechanisms. For example:
- CLOB storage round-trips documents exactly, while object-relational storage round-trips documents at the level of the DOM.
- CLOB storage can store any document, while object-relational storage can only store documents that conform to a particular XML Schema.
- Object-relational storage can often perform node-level updates, while CLOB storage always performs document-level updates.
- CLOB storage has fewer indexing options and may be significantly slower to query than object-relational storage.

Object-relational storage is defined with an annotated XML Schema. Annotations can be made by the user or generated automatically. Annotating schemas by hand generally results in more efficient mappings and also allows users to create XML "views" over existing relational data. In either case, when the schema is registered with the database, the database automatically creates any tables and columns needed to store documents conforming to the schema. Unlike most software that uses an object-relational mapping to store XML in relational tables, Oracle can round-trip XML documents at the level of the DOM. To do this, it uses hidden columns to store information that is not directly modelled by SQL, such as sibling order, processing instructions, and comments. XMLType columns can be constrained to a particular XML Schema. In addition to limiting the documents that can be stored in the column to those that conform to the schema, this allows the query engine to optimize queries based on the information in the schema. However, because constraints are limited to a single schema, updating that schema means that all existing data must be migrated to the new schema.

XMLType columns can be indexed in one of four ways. First, when object-relational storage is used, the columns to which the XML elements and attributes are mapped can be indexed using B-tree indexes. Second, XMLIndex indexes can be constructed over any XML values, regardless of how they are stored. These index paths, values, and order information. Function-based indexes can also be used over any XML values. These use proprietary functions such as extractValue to identify values in the XML document to be indexed and are (apparently) evaluated by matching XPath expressions in queries to those used to define the index. Finally, Oracle Text indexes provide full-text indexing of the values in elements and attributes.

XML values can be queried with XQuery, which is called from SQL statements with the proposed SQL/XML XMLQUERY function. They can also be queried with the proprietary existsNode and extract functions in SQL, which accept an XPath expression and, respectively, test the existence of a node or return a set of nodes. XML values can be used as a source of tabular data for SQL statements via the proposed SQL/XML XMLTABLE function and the proprietary extractValue function in SQL, which returns a value identified by an XPath expression. Updates are performed via proprietary SQL functions for inserting, appending, and deleting nodes as well as updating node values. These identify nodes by XPath expressions. Oracle also defines an ora:contains function in XQuery for performing full-text searches over values in XML documents. This uses Oracle Text indexes if they are available. Relational values can be used to construct XML values via the SQL/XML publishing functions (XMLELEMENT, XMLATTRIBUTE, XMLFOREST, XMLAGG, etc.). They can also be queried with XQuery by first constructing an XML view of the data, then querying the view with XQuery. Views can be constructed with the SQL/XML publishing functions, an annotated XML Schema, or the proprietary ora:view function in XQuery, which (apparently) uses the table-based mapping defined by SQL/XML to map a table to a virtual XML document.

Oracle includes two implementations of XQuery: a mid-tier implementation and an implementation that is part of the database engine. The mid-tier implementation is commonly used to query individual XML values, such as XML messages, while the engine implementation is optimized for querying values stored in the database. The engine implementation attempts to rewrite both XPath and XQuery expressions as SQL statements. This is possible only when the XML value is stored using object-relational storage or the expression can be resolved by querying an index. Otherwise, the query is evaluated directly; for XML values stored as CLOBs this can be "very expensive", as it involves parsing the XML and (apparently) building a DOM tree.

Oracle also features the XML DB Repository, which provides a file-system-like view over objects in the database. (Although designed specifically for accessing XML values, the XML DB Repository can be used to access any object stored in the database.) Objects are assigned a path and corresponding URL and can be accessed via HTTP, WebDAV, and FTP, as well as JDBC, PL/SQL, OCI, and ODP.NET. (SQL-based access uses the exists_path and under_path operators to query the RESOURCE_VIEW and PATH_VIEW views of the tables describing the repository.) In addition, the repository maintains system-defined metadata for each object (owner, creation date, etc.) and optional user-defined metadata in the form of an XML document, as well as providing versioning, access control lists, and hierarchical indexes for optimizing repository access.

The support described above is for Oracle 10g release 2. The major XML features added in Oracle 10g release 2 are XQuery support and XMLIndex indexes. Most of the other features have been supported since Oracle 9i release 2. Oracle 8i and Oracle 9i release 1 support the XML SQL Utility for Java, the Internet File System (iFS) (which provides functionality similar to the XML Repository), and searching XML documents stored as CLOBs with Oracle Intermedia XML Search. In addition, Oracle 9i release 1 supports the XMLType data type (which is stored as a CLOB) and XPath searches over XMLType columns. Database type: Relational, native XML, Commercial [ws19]
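
A hedged JDBC sketch of the SQL functions named above (the table, column, and document content are invented, and retrieval details vary between Oracle driver versions):

    // Hedged sketch of Oracle XML DB usage over JDBC (Oracle 10g R2 era).
    // Table, column, and document names are invented for illustration.
    import java.sql.*;

    public class OracleXmlDemo {
        public static void main(String[] args) throws SQLException {
            Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/orcl", "scott", "tiger");
            try (Statement st = con.createStatement()) {
                // XMLType column; the storage model (CLOB vs object-relational)
                // is chosen at table-creation time, as discussed above.
                st.execute("CREATE TABLE purchase_orders (id NUMBER, doc XMLTYPE)");
                st.execute("INSERT INTO purchase_orders VALUES (1, " +
                           "XMLTYPE('<po><item qty=\"2\">bolt</item></po>'))");

                // extractValue returns the scalar identified by an XPath
                // expression; existsNode tests for the presence of a node.
                // XMLQUERY could be used instead for full XQuery expressions.
                ResultSet rs = st.executeQuery(
                    "SELECT extractValue(doc, '/po/item/@qty') " +
                    "FROM purchase_orders " +
                    "WHERE existsNode(doc, '/po/item') = 1");
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // prints: 2
                }
            } finally {
                con.close();
            }
        }
    }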

SQL Server Microsoft SQL Server 2000 supports XML in three ways: the FOR XML clause in SELECT statements, XPath queries that use annotated XML-Data Reduced schemas, and the OpenXML function in stored procedures. SELECT statements and XPath queries can be submitted via HTTP, either directly or in a template file.

The FOR XML clause has three options, which specify how the SELECT statement is mapped to XML. RAW models the result set as a table, with one element (named "row") returned for each row. Columns can be returned either as attributes or as child elements. AUTO is the same as RAW, except that: 1) the row elements are named after the tables they come from, and 2) the resulting XML is nested in a linear hierarchy in the order in which tables appear in the select list. EXPLICIT allows the user to model an XML document using a series of SELECT statements that are UNIONed together. In its simplest form, each SELECT statement is numbered and includes the number of its parent statement. The results of an individual statement are modelled as a table and an element is created for each row. This is placed in the XML document beneath the appropriate parent element. Assuming there is a relation between the result sets (for example, each contains a sales order number), the children are nested as one would expect. EXPLICIT allows the user to create canonical object-relational mappings from the database to an XML document, but supports more sophisticated queries as well.

Annotated XML-Data Reduced schemas, also known as mapping schemas, contain extra attributes that map elements and attributes to tables and columns. These specify an object-relational mapping between the XML document and the database, and are used to query the database using a subset of XPath. A tool exists to construct mapping schemas graphically.

The OpenXML function uses a table-based mapping to extract any part of an XML document as a table and use it in most places a table name can be used, such as the FROM clause of a SELECT statement. This can be used in conjunction with an INSERT statement to transfer data from an XML document to the database. An XPath expression identifies the element or attribute that represents a row of data. Additional XPath expressions identify the related elements, attributes, or PCDATA that comprise the columns in each row, such as the children of the row element.

Inserts, updates, and deletes are done through specially formatted XML documents called "updategrams". These contain the before and after data (both in the case of an update, only the after data in the case of an insert, and only the before data in the case of a delete). By default, updategrams use table-based mappings. They can use object-relational mappings by specifying an annotated schema. Database type: Relational, Commercial [ws20]
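
A minimal, hedged illustration of the FOR XML clause issued over JDBC (the table and column names are invented, and the exact shape of the result depends on the option chosen):

    // Hedged sketch: SQL Server FOR XML over JDBC. The Customers and
    // Orders tables are invented; the JDBC URL assumes Microsoft's driver.
    import java.sql.*;

    public class ForXmlDemo {
        public static void main(String[] args) throws SQLException {
            Connection con = DriverManager.getConnection(
                "jdbc:sqlserver://localhost;databaseName=Sales", "sa", "secret");
            try (Statement st = con.createStatement()) {
                // AUTO names each element after its table and nests Orders
                // under Customers, following the order of the select list.
                ResultSet rs = st.executeQuery(
                    "SELECT Customers.Name, Orders.OrderId " +
                    "FROM Customers JOIN Orders " +
                    "ON Customers.CustId = Orders.CustId " +
                    "FOR XML AUTO, ELEMENTS");
                // Long XML streams may come back split across several rows.
                StringBuilder xml = new StringBuilder();
                while (rs.next()) {
                    xml.append(rs.getString(1));
                }
                System.out.println(xml);
            } finally {
                con.close();
            }
        }
    }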

Sybase ASE 12.5 Sybase supports XML in two ways. First, the ResultSetXml class can transfer data between an XML document and the database. A ResultSetXml object can be created from an XML document or a SELECT statement. Among other things, applications can modify the data in a ResultSetXml object, serialize the data to an XML document, or create an SQL script to create a table for the data and store the data in the database. The XML document used by ResultSetXml has a proprietary format that contains a set of ColumnMetaData elements followed by a set of Row and Column elements. Sybase also has native XML capabilities. It can store XML documents in a pre-parsed, indexed form in BLOB columns. These can then be queried with XQL. (The XQL engine can also be used to query XML documents stored elsewhere, such as in the file system or on the Web). Updates must be performed at the level of whole documents. Database type: Relational, Commercial [ws21]

UniVerse UniVerse is a nested relational database, which is a relational database that does not follow first normal form. That is, it allows columns to have multiple values for a single row. Multi-valued columns may be grouped together into "associations", which are effectively nested tables. There are two different types of multi-valued columns -- multi-valued and multi-subvalued -- and these types determine the degree of nesting in an association. The multi-valued columns in an association effectively form a sub-table, while the multi-subvalued columns in an association effectively form a sub-sub-table; deeper nesting is not supported. Note that a table may contain multiple associations and that multi-valued columns are not required to be in an association. When retrieving data from the database, UniVerse uses what is effectively a table-based mapping, although this is modified to handle the nested tables that can occur in UniVerse tables. As in a table-based mapping, one element is created in the XML document for each row in the table. Because UniVerse tables can effectively contain sub-tables and sub-sub-tables, child and grandchild elements are created for each sub-table "row" and sub-sub-table "row" as needed. There are three options for mapping data to an XML document. In an attribute-centric mapping (the default), data from single-valued columns is stored as an attribute of the row element, data from multi-valued columns is stored as an attribute of the association (sub-table) element, and data from multi-subvalued columns is stored as an attribute of the sub-sub-table element. The element-centric mapping is the same as the attribute-centric mapping except that data is stored in child elements instead of attributes. In both the attribute- and element-centric mappings, UniVerse generates element and attribute names from table, association, and column names. The last mapping option is "mixed" mapping, in which the user specifies whether to map columns to attributes or elements using an XML-based mapping language. Users can also specify element and attribute names (including namespace name), how data is to be converted and formatted before it is placed in the XML document, and whether to include the root and/or sub-table elements. Applications specify that they want data returned as XML with a TOXML clause on the LIST command in RetrieVe or the SELECT statement in UniVerse SQL. This clause allows users to specify whether they want to use an attribute-centric, element-centric, or mixed mapping, as well as whether to include a DTD or XML Schema in the resulting XML document. UniVerse also supports an XML-based object-relational mapping language. This allows multiple tables to be mapped to a single XML document, with primary key / foreign key relationships in the database determining the nesting structure in the document. The language handles single- and multi-valued columns in the same way as the language described above. That is, single-valued columns can be mapped to child elements or attributes of a row element, and multi-valued columns can be mapped to child elements or attributes of an association element. Applications can use object-relational mapping documents with TCL and UniVerse BASIC. In TCL, there are commands to transfer the data for entire documents to/from the database. In BASIC, there are functions to transfer data from entire documents to/from the database, as well as the XMAP API, which gives applications row-by-row control over what data is transferred.

Database type: Nested relational, Commercial [ws22]

6.4.2 Native XML Databases

Berkeley DB XML Berkeley DB XML is a native XML database built on top of Berkeley DB, adding an XML parser, XML indexes, and an XQuery engine. From Berkeley DB it inherits a storage engine, transaction support (including XA), automatic recovery, and other features. Berkeley DB XML stores XML documents in logical groupings called containers, which are the same as collections in other native XML databases. Users can specify a number of properties on a per-container basis, including whether to validate documents, whether to store documents whole or as individual nodes, and what indexes to create (element, attribute, or metadata). It is worth noting that schemas are specified through schemaLocation hints in documents, rather than being associated with the container as a whole. In addition to storing XML documents, Berkeley DB XML can store non-XML documents (in the underlying Berkeley DB data store) as well as metadata for XML documents. The latter take the form of user-specified property-value pairs and can be queried as if they were child elements of the root element, although they do not actually appear in stored XML documents. Berkeley DB XML supports XQuery as its query language. It provides an API for updating documents that uses XQuery to identify a set of nodes to update and allows users to append a new child to a target node, insert a new node before or after a target node, remove a target node, rename a target node, or change the value of a target node. Updates are performed at the node level. Like Berkeley DB, Berkeley DB XML is a library that is linked directly to applications, rather than being used in client-server mode. It has a command-line interface as well as APIs for C++, Java, Tcl, Perl, Python, and PHP. Third-party APIs for other languages are available as well. Database type: Key-value, Open Source [ws23]
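
The Java fragment below sketches the container/XQuery workflow just described, using the com.sleepycat.dbxml package; the container and document names are invented, and minor API details (e.g. configuration objects) may differ between releases:

    // Hedged sketch of Berkeley DB XML's Java API (com.sleepycat.dbxml).
    // Container and document names are invented for illustration.
    import com.sleepycat.dbxml.*;

    public class BdbXmlDemo {
        public static void main(String[] args) throws XmlException {
            XmlManager mgr = new XmlManager();
            // A container is Berkeley DB XML's unit of storage, equivalent
            // to a collection in other native XML databases.
            XmlContainer container = mgr.createContainer("parts.dbxml");

            // Whole-document vs node-level storage and the indexes to build
            // are per-container options, as described above.
            container.putDocument("bolt-17",
                "<part><name>bolt</name><qty>17</qty></part>",
                mgr.createUpdateContext());

            // XQuery over the container's documents.
            XmlResults results = mgr.query(
                "collection('parts.dbxml')/part[qty > 10]/name",
                mgr.createQueryContext());
            while (results.hasNext()) {
                System.out.println(results.next().asString());
            }
            container.delete(); // release native resources
            mgr.delete();
        }
    }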

DB2 DB2 version 9 supports XML in the base DB2 product, in the Net Search Extender, and in its Web Services Object Runtime Framework for DB2 (DB2 WORF). In addition, DB2 Express-C is a free version of DB2 that includes full XML support. XML support in DB2 itself (known as pureXML) consists of native XML storage, XQuery, the publishing functions in SQL/XML (XMLELEMENT, XMLATTRIBUTE, XMLFOREST, XMLAGG, etc.), various other SQL/XML functions (XMLVALIDATE, XMLQUERY, XMLEXISTS, XMLCAST, etc.), and a decomposition engine. Native XML storage is used to store columns whose type is XML and is implemented through a proprietary, hierarchical storage mechanism. XML values are indexed by identifying the nodes to be indexed (using a subset of XPath) and specifying whether they are to be indexed as strings, numeric values, dates, or timestamps. XML values can be queried with XQuery, which can be used as a standalone query language or called from SQL statements with the SQL/XML XMLQUERY function. XML values can also be used as a source of tabular data for SQL statements via the SQL/XML XMLTABLE function. Relational values can be queried with XQuery by first constructing an XML view of the data with the SQL/XML publishing functions, then querying the view with XQuery. Node-level updates are supported through a stored procedure that uses a path expression to identify the nodes to be updated and an XML value (element, attribute, etc.) to specify the new value. The decomposition engine in DB2 is used to extract values from an XML document and store them in columns in relational tables. Values are transferred according to mappings defined by annotations in XML Schema documents. These annotations support object-relational mappings, with some additional capabilities, such as mapping multiple elements or attributes to the same table/column or mapping a single element or attribute to more than one table/column. Additional features include support for user-defined functions to modify values before insertion and conditional expressions to insert rows only if their values meet certain criteria. Other XML support in DB2 includes a repository for managing XML Schemas to be used for validation and document decomposition, use of DTDs for resolving external entities and retrieving default values (but not for validation), and access to XML values through a variety of APIs, including JDBC 4.0, SQLJ, .NET, CLI, embedded SQL, and PHP. The DB2 Net Search Extender contains a variety of search technologies, such as fuzzy searches, synonym searches, and searches by sentence or paragraph. It can be used when an XML document is stored in a single column and is XML-aware to the extent that searches can be limited to specific sections of XML documents, identified by their paths. DB2 WORF allows users to define Web services through DADX documents. DADX documents extend the functionality of DAD documents (as used in the XML Extender) and describe how a Web service accesses data in the database. Supported functionality includes storing and retrieving documents with the XML Extender, executing SQL queries, and calling stored procedures. DB2 WORF can also generate WSDL documents from DADX documents. Earlier versions of DB2 support the SQL/XML publishing functions and the XML Extender. The XML Extender can store XML documents in a single column as character data, extracting values into "side tables", which can then be queried with SQL statements. It can also decompose documents using a proprietary, XML-based mapping language called DAD (Data Access Definition). DAD supports object-relational mappings for transferring data to and from the database and a template-based language with embedded SELECT statements for transferring data from the database to XML. The XML Extender also supports MQSeries, validation, and XSLT. Database type: Relational, native XML [ws24]
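
As a hedged sketch of pureXML usage, the JDBC fragment below stores a document in an XML column and queries it with the SQL/XML XMLQUERY and XMLEXISTS functions mentioned above; the table, column, and element names are invented:

    // Hedged sketch of DB2 9 pureXML over JDBC; table, column, and element
    // names are invented for illustration.
    import java.sql.*;

    public class Db2PureXmlDemo {
        public static void main(String[] args) throws SQLException {
            Connection con = DriverManager.getConnection(
                "jdbc:db2://localhost:50000/SAMPLE", "db2admin", "secret");
            try (Statement st = con.createStatement()) {
                // Native XML storage: the column type is simply XML, and a
                // string literal is implicitly parsed on insertion.
                st.execute("CREATE TABLE products (id INT, info XML)");
                st.execute("INSERT INTO products VALUES (1, " +
                           "'<product><price>9.50</price></product>')");

                // XMLEXISTS filters rows; XMLQUERY extracts a fragment.
                ResultSet rs = st.executeQuery(
                    "SELECT XMLQUERY('$i/product/price' PASSING info AS \"i\") " +
                    "FROM products " +
                    "WHERE XMLEXISTS('$i/product[price < 10]' PASSING info AS \"i\")");
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            } finally {
                con.close();
            }
        }
    }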

dbXML dbXML is a native XML database that supports four different data stores. The first of these is a proprietary data store that uses B trees. The second is an in-memory data store, which is used for temporary storage and whose contents are deleted when the database is stopped. The third is the file system. And the fourth is a mapping to a relational database (it is not known what mapping is used). Which data store to use is specified on a per-collection basis. dbXML has a directory-like collection model. Collections can be nested and can store documents that match any XML schema, although it is suggested that a single collection contain documents that match a single XML schema to simplify indexing and querying. Collections can also contain binary streams (such as JPEG files), although a collection cannot contain both binary streams and XML documents. dbXML supports XPath, XSLT, XUpdate, and full-text searches. XPath and XSLT have been extended for use against collections, and both XSLT and full-text searches can be run against the results of an XPath query. dbXML supports three different types of indexes. Name indexes index element and attribute names. Value indexes index element and attribute values and support strings, characters, bytes, integers, real numbers, and booleans. Full text indexes index tokens in element and attribute values. They are case insensitive and actually index word stems; for example, both "happening" and "happen" have the same stem. Individual indexes are associated with a particular collection and users specify what to index according to an XPath-like expression. dbXML supports triggers. These are user-specified Java classes that can be fired before or after an insert, update, delete, or data retrieval. They can be used to do such things as validating documents on insertion or modifying documents on retrieval. dbXML also supports extensions to the server through Java classes. dbXML supports transactions and security. Security options are no security, a single user name and password for the entire database, and role-based security (the default). dbXML has four different APIs: the direct API, the client API, XML:DB, and Web services. The direct API allows applications to work directly with dbXML. The client API allows applications to use dbXML in client-server fashion. This can be done where both client and server are in the same process, or through XML-RPC. The Web services interface supports both XML-RPC and REST (URL encoding). dbXML comes with a set of command line tools for connecting to the database, managing collections, indexes, security, triggers, and extensions, and storing and retrieving documents. NOTE: dbXML is a complete rewrite of the code that became Xindice and is therefore different from that product.

Database type: Proprietary, Open Source [ws25]

eXist eXist is a native XML database that uses a proprietary data store (B+ trees and paged files). It can be run as a standalone database server, as an embedded database, or in the servlet engine of a Web application. Documents are stored in a hierarchy of collections. Collections can contain child collections and do not constrain documents to any particular schema or document type. eXist supports XQuery/XPath 2.0 and XQuery statements can query any combination of collections and documents. eXist does not support strong data typing but does provide a number of extensions to XQuery. In particular, eXist's implementation of XQuery can execute full text searches, call the XML:DB API (such as to store query results in the database), execute dynamically constructed XQuery statements, apply XSLT stylesheets to a node, work with HTTP, and execute arbitrary Java methods. eXist also provides partial support for XInclude and XPointer. Updates are primarily supported through XUpdate. When eXist is being used as an embedded database, live DOM trees are supported as well. eXist supports the XML:DB API, with additional services for preparing and executing XQuery statements, managing users, managing multiple database instances, and querying indexes. DOM and SAX are supported for documents returned through the XML:DB API. eXist can also be called via XML-RPC, a REST-style Web services API, SOAP, and WebDAV. eXist automatically indexes all element and attribute structure. By default, it creates full text indexes over all text and attribute values, but users can turn this off for selected parts of a document. It supports concurrent read/write access for multiple users, as well as access control at both the collection and document level. It does not currently support transactions. Database type: Proprietary, Open Source [ws26]
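
The fragment below sketches access to eXist through the XML:DB API mentioned above; the server URL, credentials, and query content are placeholders:

    // Hedged sketch: querying eXist through the XML:DB API (org.xmldb.api).
    // Server URL, credentials, and query content are placeholders.
    import org.xmldb.api.DatabaseManager;
    import org.xmldb.api.base.*;
    import org.xmldb.api.modules.XPathQueryService;

    public class ExistDemo {
        public static void main(String[] args) throws Exception {
            // Register eXist's XML:DB driver.
            Database db = (Database) Class.forName(
                "org.exist.xmldb.DatabaseImpl").newInstance();
            DatabaseManager.registerDatabase(db);

            Collection col = DatabaseManager.getCollection(
                "xmldb:exist://localhost:8080/exist/xmlrpc/db", "admin", "");

            // XPath/XQuery service over the collection.
            XPathQueryService service = (XPathQueryService)
                col.getService("XPathQueryService", "1.0");
            ResourceSet result = service.query("//part[qty > 10]/name");

            ResourceIterator it = result.getIterator();
            while (it.hasMoreResources()) {
                System.out.println(it.nextResource().getContent());
            }
            col.close();
        }
    }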

SQL/XML-IMDB SQL/XML-IMDB is an in-memory database with both native XML and relational data stores. While both data stores organize data in tables, a "table" in the XML data store is what most other native XML databases refer to as a collection, with one XML document per "row". Tables can be created as either local to a particular process or shared among processes and use compression to minimize memory use. Both types of tables are indexed with TST-trees, which "combine the speed advantage of a hash table with the ordered access of a binary tree", and XML tables are also indexed with "Reverse-Lookup" and "Token-Segment-Build-Up" mechanisms. While there does not appear to be a way to directly store the entire database to disk, individual relational tables can be saved as text files and individual XML tables can be saved as XML documents. SQL/XML-IMDB supports both XQuery and a "significant subset" of SQL92. This allows XML queries against XML data and SQL queries against relational data. In addition, it extends XQuery so that users can mix XML and relational data. To do this, it allows SQL statements in "any part of [an] XQuery statement where an expression is allowed". From a practical standpoint, it appears that this means SELECT statements are used anywhere except in a RETURN clause and INSERT, UPDATE, and DELETE statements are used in RETURN clauses. When a SELECT statement is used, the returned result set is mapped to an XML document with a table-based mapping. That is, each row in the result set is mapped to a <row> element and each column is mapped to a child of that <row> element. This allows XQuery variables to be bound to individual rows or columns in the result set. When any type of SQL statement is used, it can include XQuery variables. For example, these can be used in the WHERE clause of a SELECT statement to correlate relational and XML data, or in the VALUES clause of an INSERT statement to transfer data from XML documents to relational tables. SQL/XML-IMDB also extends XQuery with operators to update XML documents. Supported operations include deleting nodes, renaming nodes, updating node values, replacing nodes, and inserting new nodes before or after existing nodes. Note that these operations cannot be performed inside a transaction.

SQL/XML-IMDB has a proprietary API for interacting with the database. This includes functions for preparing and executing SQL and XQuery statements, beginning, committing, and rolling back transactions, transferring data between internal tables and external files or application variables, and bi-directional iteration over result sets. It is worth noting that XQuery results are returned in result sets just like SQL results. Each item in an XQuery sequence is returned as a separate column, with atomic values mapped to columns of the appropriate data type and nodes mapped to XML strings. When an XQuery statement returns multiple sequences, these are mapped to multiple rows in the result set. SQL/XML-IMDB can be used from Microsoft .NET, Visual C++, Visual Basic, Office, and IIS/ASP, Borland C++ and Delphi, Perl, and PHP. Database type: Proprietary XML store plus relational store, Commercial [ws27]

Tamino Tamino XML Server is a suite of products built in three layers -- core services, enabling services, and solutions (third-party applications) -- which may be purchased in a variety of combinations. Core services include a native XML database, an integrated relational database, schema services, security, administration tools, and Tamino X-Tension, a service that allows users to write extensions that customize server functionality. The XML engine uses the Data Map, which describes where the data in a given XML document is stored. This allows individual XML documents to be composed of data from multiple, heterogeneous sources, such as the native XML data store, relational databases, and the file system. Since the connections to external data (made through the X-Node module) are live and bidirectional, Tamino may thus be used to perform heterogeneous joins and updates. Tamino's XML support includes the DOM, JDOM, SAX, and XML:DB APIs, an extended XPath implementation called X-Query (not to be confused with W3C XQuery, which it predates), full-text retrieval, processing of XML documents with server-side XSL and CSS, and limited support for SOAP. It can store schema-less documents and can use schema information (including a subset of XML Schemas) if it is available. The internal SQL engine is directly addressable through ODBC, JDBC, and OLE DB. However, when addressed via these APIs, it cannot integrate data from the internal XML data store or from external data sources. (As noted above, the reverse is true. That is, with the help of the X-Node, the XML engine can integrate data from the XML data store and other databases, including the internal SQL engine.) Enabling services include X-Port, X-Plorer, X-Application, various APIs (mentioned above), X-Node (also mentioned above), and the WebDAV Server. X-Port provides URL-based data transfer through various standard HTTP servers, X-Plorer is a browser-based navigation tool for documents stored in Tamino, and X-Application is a set of JSP tags for accessing Tamino through Web pages. The WebDAV Server adds namespace management (nested collections or directories), additional properties (such as last-modified, content length or content type) and overwrite protection (persistent locking) to the existing Tamino XML Server functionality. This allows Tamino to serve as a virtual file system (Web folder) where the information can be stored and retrieved using a standard Web browser and the common drag and drop metaphor. (Note: In spite of rumors to the contrary, Tamino is not built on top of Adabas, a hierarchical database from Software AG. Instead, the Tamino data store was built from the ground up as a native XML database, obviously drawing on the knowledge gained from developing Adabas). Database type: Proprietary. Relational through ODBC, Commercial [ws28]

TEXTML Server TEXTML Server is a native XML database that stores, indexes, and retrieves whole XML documents. A TEXTML Server installation consists of one or more document bases, each of which consists of a document repository and a set of indexes. The document repository is organized as a hierarchical set of collections and can store both XML and non-XML documents. All documents are stored intact. The major difference between XML and non-XML documents is that XML documents are parsed at insert time to create indexes. While non-XML documents are not parsed, they can be associated with an XML document that provides indexable metadata for the non-XML document. Unlike most native XML databases, the indexes in TEXTML Server effectively form an additional schema layer on top of the documents stored in the database. This is because indexes are defined using one or more XPath expressions. Since these can refer to any document in the database, the effect is that a single index can refer to more than one field. For example, an author index might refer to the AuthorName element in one set of documents and the StoryAuthor attribute in another set of documents. Furthermore, because indexes are defined using XPath expressions, it is possible to transform values and index the transformed values. TEXTML Server supports five different types of indexes: word (token), string, numeric, date, and time. TEXTML Server has its own, XML-based query language. Queries are defined as a series of boolean tests over specific indexes or the full text of the documents. Tests are generally for equality. In addition, numeric, date, and time indexes support range tests, and word and string indexes support wild-card tests. Tests can then be joined with a number of operators, including And, Or, And Not, Near, adjacency, and frequency. Queries return whole documents and can sort results based on index values, document properties, and hit counts. In addition to being able to associate XML documents with non-XML documents, TEXTML Server also has a Universal Converter that can convert more than 225 file formats (word processor, spreadsheet, presentation, drawing, bitmap, and so on) to XML. This uses Stellent's Outside In XML Export and extracts document "contents, presentation information, and metadata". Extracted information is stored in a document that uses the SearchML schema, also defined by Stellent. Converted documents can then be searched directly or associated with the original documents as indexing documents. Other features of TEXTML Server include check-in/check-out, versioning, support for plug-ins that are run at insert time, and COM, Java, .NET, WebDAV, and OLE DB APIs. Security can be specified at the document, collection, or document-base level. System features include fault tolerance, replication, load management, and automated recovery. Database type: Proprietary (Document-based), Commercial [ws29]

TigerLogic XML Data Management Server (XDMS) TigerLogic XML Data Management Server (XDMS) is a database designed to store multiple kinds of data, including "structured, XML, and unstructured information". (Examples of the latter are office documents, email, and graphics.) Data is stored in the TigerLogic Native XML Data Store, which "leverages the Pick Universal Data Model". As XML documents are inserted into the database, an XML Profiler reads the incoming documents and gathers information to build indexes. These are used by the query processor, which supports XPath. TigerLogic XDMS also supports XSLT. TigerLogic XDMS has a Java API and is also accessible over SOAP, HTTP, and JCA. It supports both DTDs and XML Schemas. Of interest, it supports XA transactions, and provides "on-line backup and recovery". Database type: Pick, Commercial [ws30]

Timber Timber is a native XML database that has an architecture "as close as possible to that of a relational database," in order to "reuse, where appropriate, the technologies developed for relational databases over the past several decades". The basis of Timber is "an XML algebra that manipulates sets of ordered, labeled trees". The primary difficulties of such an algebra include the "complex and variable structure of trees in a set, and issues of ordering." By default, Timber uses Shore as its underlying data store. It can also use Berkeley DB. It supports a number of different types of indexes, including element, attribute, text, inverted, parent, and join indexes. Timber supports a subset of XQuery. Users can enter queries either as XQuery expressions or as logical or physical query plans using Timber's logical or physical plan syntax. The latter allows advanced users to optimize queries by hand, as well as to perform some operations not supported through XQuery. Timber extends XQuery with functions for deleting nodes or their contents, updating the contents of a node, and inserting elements or attributes. In addition, Timber has a command line option for appending the contents of an XML document to a document already in the database. Timber has command line, GUI, SOAP, and Web interfaces for performing both queries and administrative functions. Database type: Shore, Berkeley DB, Open Source (for non-commercial users) [ws31]

XQuantum XML Database Server XQuantum XML Database Server is a native XML database built on a proprietary data store. It supports a subset of XQuery, a subset of the XQuery full-text specification, and XSLT. XQuantum optimizes queries with a cost-based algorithm, which uses statistics about the data to optimize the search process. The query processor also relies on "recursive XML indexing" (a schemaless indexing method), lazy query evaluation, and stream processing of queries. XQuantum supports static typing through its own typing mechanism, which "generalizes XQuery's sequence type syntax to include full regular expression types" and is used instead of XML Schemas. Types (effectively schemas for individual XML documents) can be declared in the prolog of an XQuery query or in external type modules. They are applied in the query through explicit validation and are used to provide type information to the query processor. XQuantum includes a Web server, which allows it to use HTTP as its API. That is, queries are embedded in URLs and results are returned as an XML stream. Queries can also be placed in XQuery Server Pages. These are preferable for URLs exposed to the public, as they are more secure (the query is not visible) and less fragile (the query can be changed without changing the URL). XQuantum is also available as the XQuantum XML Database Appliance, a dedicated server running Linux and XQuantum. Database type: Proprietary, Commercial [ws32]

6.4.3 Tables

Non-native XML Databases

Product | Developer | License | DB Type
Access 2002 | Microsoft | Commercial | Relational
Cache | InterSystems Corp. | Commercial | Post-relational
eXtremeDB | McObject | Commercial | Navigational
FileMaker | FileMaker | Commercial | FileMaker
FoxPro | Microsoft | Commercial | Relational
Informix | IBM | Commercial | Relational
Matisse | Matisse Software | Commercial | Object-oriented
Objectivity/DB | Objectivity | Commercial | Object-oriented
OpenInsight | Revelation Software | Commercial | Multi-valued
Oracle | Oracle | Commercial | Relational, native XML
PostgreSQL | PostgreSQL Global Development Group | Open Source | Relational
SQL Server | Microsoft | Commercial | Relational
Sybase ASE 12.5 | Sybase | Commercial | Relational
UniVerse | IBM | Commercial | Nested relational
Versant enJin | Versant Corp. | Commercial | Object-oriented
View500 | eB2Bcom | Commercial | Proprietary (LDAP)

Native XML Databases

Product | Developer | License | DB Type
4Suite, 4Suite Server | FourThought | Open Source | Object-oriented
DB2 | IBM | Commercial | Relational, native XML
Berkeley DB XML | Oracle | Open Source | Key-value
Birdstep RDM XML | Birdstep | Commercial | Object-oriented
Centor Interaction Server | Centor Software Corp. | Commercial | Proprietary
dbXML | dbXML Group | Open Source | Proprietary
Dieselpoint | Dieselpoint, Inc. | Commercial | None (indexes only)
DOMSafeXML | Ellipsis | Commercial | File system(?)
eXist | Wolfgang Meier | Open Source | Proprietary
eXtc | M/Gateway Developments Ltd. | Commercial | Post-relational
Extraway | 3D Informatica | Commercial | Files plus indexes
GoXML DB | XML Global | Commercial | Proprietary (Model-based)
Infonyte DB | Infonyte | Commercial | Proprietary (Model-based)
Ipedo | Ipedo | Commercial | Proprietary
Lore | Stanford University | Research | Semi-structured
MarkLogic Server | Mark Logic Corp. | Commercial | Proprietary(?)
myXMLDB | Mladen Adamovic | Open Source | MySQL
Natix | data ex machina | Commercial | File system(?)
NaX Base | Naxoft | Commercial | Proprietary
Neocore XMS | Xpriori | Commercial | Proprietary
ozone | ozone-db.org | Open Source | Object-oriented
Sedna XML DBMS | ISP RAS MODIS | Free | Proprietary
Sekaiju / Yggdrasill | Media Fusion | Commercial | Proprietary
SQL/XML-IMDB | QuiLogic | Commercial | Proprietary (native XML and relational)
Sonic XML Server | Sonic Software | Commercial | Object-oriented (ObjectStore); relational and other data through Data Junction
Tamino | Software AG | Commercial | Proprietary; relational through ODBC
TeraText DBS | TeraText Solutions | Commercial | Proprietary
TEXTML Server | IXIASOFT, Inc. | Commercial | Proprietary (Text-based)
TigerLogic XDMS | Raining Data | Commercial | Pick
Timber | University of Michigan | Open Source (non-commercial only) | Shore, Berkeley DB
TOTAL XML | Cincom | Commercial | Object-relational?
Virtuoso | OpenLink Software | Commercial | Proprietary; relational through ODBC
XDBM | Matthew Parry, Paul Sokolovsky | Open Source | Proprietary (Model-based)
XDB | ZVON.org | Open Source | Relational (PostgreSQL only?)
XediX TeraSolution | AM2 Systems | Commercial | Proprietary
X-Hive/DB | X-Hive Corporation | Commercial | Proprietary; relational through JDBC
Xindice | Apache Software Foundation | Open Source | Proprietary (Model-based)
XML Transactional DOM | Ontonet | Commercial | Object-oriented
XpSQL | Makoto Yui | Open Source | Relational (PostgreSQL)
XQuantum XML Database Server | Cognetic Systems | Commercial | Proprietary
XStreamDB Native XML Database | Bluestream Database Software Corp. | Commercial | Proprietary (Model-based)

6.5 Bibliography

[1] Volkmer, T., Smith, J. R., Natsev, A., A web-based system for collaborative annotation of large image and video collections: an evaluation and user study. In Proceedings of the 13th Annual ACM International Conference on Multimedia (MULTIMEDIA '05), Singapore, November 6-11, 2005, ACM, New York, NY, pp. 892-901.

[2] Padula, Annotation in cooperative work: from paper-based to the web one. Annotation for Collaboration Workshop, Paris, France, 2005.

[3] Ovsiannikov, I. A., Arbib, M. A., McNeill, T. H., Annotation Technology. Int. J. Human-Computer Studies, 50, 1999, pp. 329-362.

[4] Fogli, D., Fresta, G., Marcante, A., Mussio, P., Two-way exchange of knowledge through visual annotation. In Proceedings of the 2004 International Conference on Distributed Multimedia Systems (DMS 2004), San Francisco, USA, 2004.

[5] Marshall, C. C., Toward an ecology of hypertext annotation. In Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia: Links, Objects, Time and Space (HYPERTEXT '98), Pittsburgh, Pennsylvania, USA, June 20-24, 1998, ACM, New York, NY, pp. 40-49. DOI: http://doi.acm.org/10.1145/276627.276632

[6] Raman, R., Venkatasubramani, S., Annotation tool for semantic web. In Proceedings of WWW2002, 2002.

[7] Bottoni, P., Levialdi, S., Labella, A., Panizzi, E., Trinchese, R., Gigli, L., MADCOW: a visual interface for annotating web pages. In Proceedings of the Working Conference on Advanced Visual Interfaces (AVI '06), Venezia, Italy, May 23-26, 2006, ACM, New York, NY, pp. 314-317.

[8] Bilasco, I. M., Gensel, J., Villanova-Oliver, M., Martin, H., On indexing of 3D scenes using MPEG-7. In Proceedings of the 13th Annual ACM International Conference on Multimedia (MULTIMEDIA '05), Singapore, November 6-11, 2005, ACM, New York, NY, pp. 471-474. DOI: http://doi.acm.org/10.1145/1101149.1101254

[9] Costabile, M. F., Mussio, P., Parasiliti Provenza, L., Piccinno, A., Co-Evolution of Users and Interactive Systems in the Web. In Proceedings of DMS 2007, San Francisco Bay, USA, September 2007.

[10] Marcante, A., Martin, A., Mussio, P., Perego, A., Valtolina, S., Tools for Collaborative Annotation: a comparison among three annotation styles. In Proceedings of the 3rd Italian Research Conference on Digital Libraries (IRCDL 2007), Padova, Italy, 2007.

[11] Valtolina, S., Franzoni, S., Mazzoleni, P., Building knowledge networks using panoramic images. System and Information Sciences Notes, Vol. 2, N. 1, September 2007. ISSN 1753-2310 (Print), ISSN 1753-2329 (CD-ROM).

[12] Valtolina, S., Mussio, P., Mazzoleni, P., Franzoni, S., Bagnasco, G., Geroli, M., Ridi, C., Media for Knowledge Creation and Dissemination: Semantic Model and Narrations for a New Accessibility to Cultural Heritage. 6th Creativity & Cognition Conference (CC 2007), Washington DC, USA, June 13-15, 2007.

[13] Costabile, M. F., Fogli, D., Mussio, P., Piccinno, A., IEEE Transactions on Systems, Man and Cybernetics, Part A, Vol. 37, Issue 6, Nov. 2007, pp. 1029-1046.

[14] Costabile, M. F., Malerba, D., Hemmje, M., Paradiso, A., Building Metaphors for Supporting User Interaction with Multimedia Database. In Proceedings of the 4th IFIP 2.6 Working Conference on Visual Database Systems (VDB 4), L'Aquila, Italy, May 27-29, 1998.

[15] Harmon, R., Patterson, W., Ribarsky, W., Bolter, J., The Virtual Annotation System. In Proceedings of IEEE VRAIS '96, IEEE CS, 1996, pp. 239-245.

[16] Kadobayashi, R., Lombardi, J., McCahill, M. P., Stearns, H., Tanaka, K., Kay, A., Annotation authoring in collaborative 3D virtual environments. In Proceedings of the 2005 International Conference on Augmented Tele-existence, Christchurch, New Zealand, 2005, pp. 255-256.

[17] Jung, T., Yi-Luen Do, E., Gross, M. D., Immersive redlining and annotation of 3D design models on the Web. In Proceedings of the CAAD Futures 1999 Conference, Georgia Institute of Technology, Atlanta, Georgia, USA, June 7-8, 1999, pp. 81-98.

[18] Miller, E., Post Processing Tips & Hints: Annotation in ANSYS. The Focus, PADT, Issue 31, September 27, 2004.

[19] Tsang, M., Fitzmaurice, G. W., Kurtenbach, G., Khan, A., Buxton, B., Boom Chameleon. In Proceedings of the SIGGRAPH 2003 Conference, Annual Conference Series, ACM Transactions on Graphics, 22(3), p. 698.

[20] Bourguignon, D., Cani, M. P., Drettakis, G., Drawing for Illustration and Annotation in 3D. Computer Graphics Forum, Vol. 20, N. 3, 2001.

[21] Jung, T., Gross, M. D., Yi-Luen Do, E., Sketching annotations in a 3D web environment. In CHI '02 Extended Abstracts on Human Factors in Computing Systems, Minneapolis, Minnesota, USA, 2002, pp. 618-619.


7 Data transmission

The communication between the visualization and the simulation environment is an emerging topic in VR research, because VR software needs to be connected with highly specialized simulation code in order to obtain increasingly realistic simulations inside the Virtual Environment.

One of the few approaches devoted to the integration of a simulation package with VR has been presented by Kirner and Kirner [KIRNER 2005], who developed VR-SIM, an object-oriented C++ library able to couple a Real-Time Systems (RTS) simulator with VR technologies. Using VR-SIM involves creating both the system to be validated and a virtual environment related to this system. Their case study is a robot arm coupled with an automatic conveyor belt, used in a factory for piling up boxes. The work demonstrates that VR technology is applicable and useful to support RTS simulations as a means to evaluate the correctness of such systems. However, VR-SIM is a tool addressed to software engineers responsible for the development of real-time process-control systems: it requires code development for the implementation of the virtual product, and it is not suitable for use by industrial or mechanical engineers in the PDP.

A similar work has been presented by Sánchez et al. [SÁNCHEZ 2005], who developed Easy Java Simulations (Ejs, http://fem.um.es/Ejs), a software tool designed to create interactive simulations in Java using models created with Simulink. Ejs communicates with the Simulink model and tightly controls its execution; the communication takes place through a DLL written in C and a set of Java classes. The main advantage of this approach is that Ejs creates Java applets that are independent and multi-platform, can be visualized in any Web browser, read data across the network, and be controlled by scripts within HTML pages. Ejs can also be used with Java 3D to create interactive 3D virtual products, but it has been conceived mainly for educational purposes and cannot be efficiently integrated into a product development process, because Java 3D is not suitable for the visualization of complex models.

In the mechatronics field, product simulation can be carried out with software for Computer Aided Control Engineering (CACE) or simulation packages like Simulink, which allow engineers to model and simulate the behaviour of a product across multiple domains (mechanics, electronics, software, etc.). In order to obtain a complete virtual prototype, the electronic and software components of the product also have to be simulated in VR. At present, the only way to achieve this goal is to reproduce the behaviour of the product within the VR application, by developing a simplified working model and implementing it in the application code. This means that the programmer has to define how the product responds to each of the user's actions, taking into account all electronic, mechanical and software components. This approach has some limits. First, it requires a large amount of code writing, which may not be compatible with the time-to-market of the product. Moreover, each modification made to any component of the product requires a modification of the FDMU and, finally, it is not certain that the behaviour of the product can be replicated in the programming language used for the VR application.
In order to build an FDMU that strictly reproduces the behaviour of the product (as defined by the mechanical, electronic and software engineers), one may create a connection between the VR application and the simulation package. This means that the VR application sends user-generated events to the simulation package. The simulation software processes the user's actions and computes the positions of the FDMU parts according to the mathematical model implemented by the designer; in other words, it computes the current configuration of every degree of freedom (DOF) of the assembly. Subsequently, the simulation software sends the data about every DOF of the 3D assembly to the VR software, which updates the status of the 3D models. In this way the FDMU is continuously updated by the simulation model. In [BARBIERI 2007] such an approach has been tested by implementing two software modules that allow a VR application (Virtools Dev) to dialogue and exchange data with a simulation package (Simulink). The connection between the solver and the VR system is an inter-process


communication (IPC) channel based on network sockets. The first module has been implemented as two Building Blocks (BBs) inside Virtools:

• The VR_FDMU_Receiver BB, which sets the 3D model in agreement with the data coming from Simulink. Its outputs are the float values read through the IPC channel.

• The VR_FDMU_Sender BB, which sends, through the IPC channel, the messages generated by the user interaction in the VR environment to Simulink. Its inputs are the signal-generating sources in the 3D environment, e.g. a collision detection BB.

Since both of them use network sockets for the inter-process communication with Simulink, the simulator and the VR system can be kept on different machines: the user can set the network parameters (simulation host and socket port number) within the VR environment. Afterwards, the user connects every output of the VR_FDMU_Receiver to the corresponding DOF in the 3D model, and every input of the VR_FDMU_Sender to the corresponding signal-generating BB (e.g. a BB for collision detection). The second module is made up of two Simulink S-Functions:

• The SL_FDMU_Sender S-Function, which sends the values computed for every DOF through the IPC channel.

• The SL_FDMU_Receiver S-Function, which switches between different simulation parameters according to the signals received from the VR system.

Like all Simulink user-defined functions, their parameters can be set in the Function Block Parameters window. The final result is a bi-directional communication channel, which allows setting the position and orientation of the 3D model parts according to the simulated data and, conversely, sending signals to the simulation environment, such as the collision between two 3D entities in the virtual environment. To define and test the way the interface between Simulink and Virtools works, we have chosen the simple example of a pendulum connected to an electric motor controlled by the user (see the sketch and the figures below). The pendulum model has been realized with the SimMechanics toolbox, with one rigid body connected to the ground through a revolute joint; the motor applies a torque on the joint when the user activates a manual switch. A simple 3D scene representing the same pendulum experiment has been modelled and imported into Virtools. A picking sensor has been applied to the red button, so that when the user presses the button a signal is sent to Simulink, which activates or deactivates the motor and sends back to Virtools the angle of the pendulum; Virtools then updates the orientation of the pendulum with the data received from the physical simulation.
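As a rough illustration of this exchange (the actual Virtools and Simulink code is not reproduced here; the message layout, host address and port in this C sketch are invented for illustration), the VR-side logic amounts to a socket dialogue of the following kind:

/* Sketch of the VR-side socket dialogue for the pendulum example.
   Hypothetical message layout: one byte for the button event sent to
   Simulink, one double with the pendulum angle coming back. */
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sim = {0};
    sim.sin_family = AF_INET;
    sim.sin_port = htons(5000);                      /* assumed port     */
    inet_pton(AF_INET, "127.0.0.1", &sim.sin_addr);  /* simulation host  */
    if (fd < 0 || connect(fd, (struct sockaddr *)&sim, sizeof sim) < 0)
        return 1;

    char pressed = 1;                      /* the picking sensor fired       */
    send(fd, &pressed, 1, 0);              /* -> Simulink: toggle the motor  */

    double angle;                          /* <- Simulink: current angle     */
    if (recv(fd, &angle, sizeof angle, MSG_WAITALL) == sizeof angle)
        printf("rotate pendulum to %f rad\n", angle); /* update 3D model */

    close(fd);
    return 0;
}

In the real modules this dialogue is wrapped inside the VR_FDMU_Sender/Receiver BBs and the S-Functions, and it runs continuously rather than once.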

Figure 41: A pendulum model implemented in Simulink


Figure 42: The pendulum experiment in Virtools

Similar problems are approached in [BRUNO 2007], where a visualization and simulation environment is presented. This environment allows runtime communication between the visualization framework and different solvers, so that a multi-domain simulation is fairly simple to obtain: each solver can be the specialized solver for a specific domain. This approach, known in the literature as heterogeneous co-simulation, has the advantage of a faster integration of the simulation model within the overall framework, since no conversion to a common solver is needed. The disadvantage is that it requires communication among the solvers, since the output of one solver may be the input of another. A further issue of this approach is the need for a scheduler to synchronize the several solvers. To achieve both these tasks we used an open source middleware called OpenMASK, which simplifies the creation of a simulation tree in which each solver acts as a simulated object, i.e. a node of the tree. The simulation tree ends with a simulated object called VisSimulator, which is responsible for sending the simulated data to the visualization environment. The visualization framework is IFXOpenSG. This framework, developed at Fraunhofer IGD, is a powerful and extendable post-processor that uses OpenSG as the scene-graph for visualization. The idea of using a post-processor makes it possible to visualize data from several numerical simulators (structural, CFD and so on); it is therefore a general-purpose environment offering both graphics performance and scientific visualization tools. The overall architecture of the system is represented in the figure below.

Figure 43: Overall architecture of the framework

The actors of this scenario are:

• the VisSim OpenMASK application
• the solvers
• the IFXOpenSG application

They are explained in more detail in the following sections.
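Although the OpenMASK code is not shown in this report, the role of the scheduler can be illustrated with a schematic, self-contained C sketch (all names and the toy numerical models below are invented): each solver is stepped in a fixed order at the common simulation time, its output is routed to the input of the next solver, and the last node plays the role of the VisSimulator.

/* Schematic heterogeneous co-simulation loop (illustrative only, not
   OpenMASK code): the scheduler advances each solver to the common time
   and forwards outputs to the next solver's inputs; the final values are
   handed to the visualization side, as the VisSimulator does. */
#include <stdio.h>

#define N_SOLVERS 2

typedef struct {
    const char *name;
    double in, out;                        /* single-signal ports, for brevity */
    void (*step)(double t, const double *in, double *out);
} Solver;

static void ctrl_step(double t, const double *in, double *out) { *out = -0.5 * (*in); }
static void mech_step(double t, const double *in, double *out) { *out = *in + 0.1 * t; }

static void vis_send(double t, double value) {     /* stand-in VisSimulator */
    printf("t=%.2f -> visualization gets %.3f\n", t, value);
}

int main(void) {
    Solver solvers[N_SOLVERS] = {
        { "control",   0.0, 0.0, ctrl_step },
        { "mechanics", 0.0, 0.0, mech_step },
    };
    const double dt = 0.01;
    for (double t = 0.0; t < 0.05; t += dt) {
        double signal = 1.0;                   /* initial stimulus           */
        for (int i = 0; i < N_SOLVERS; i++) {  /* scheduler: fixed order     */
            solvers[i].in = signal;            /* route previous output in   */
            solvers[i].step(t, &solvers[i].in, &solvers[i].out);
            signal = solvers[i].out;           /* becomes next solver input  */
        }
        vis_send(t, signal);                   /* end of the simulation tree */
    }
    return 0;
}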

7.1 SimLib: A library for the inter-process communication

This section describes a high-level software library for inter-process communication (IPC) that has been developed to let the simulation software communicate with the VR environment.


Figure 44: Framework architecture (the VR environment and the simulator, holding the geometric and the numerical model respectively, linked by an IPC channel)

We chose Matlab/Simulink as the simulation software because Comsol and other simulation packages can be controlled from Matlab, and because this environment is almost a standard for general-purpose simulation: it is widespread and versatile, and many optional packages (toolboxes) provide further sets of high-level operations for specific tasks.

The sender (i.e. Matlab/Simulink) transmits the simulation data in asynchronous mode using non-blocking sockets, without waiting for the visualization environment to acknowledge reception; in this way the simulator can go on computing the results of the next simulation cycles without stopping. The VR software instead adopts synchronous communication, using blocking sockets: it stops its execution until it receives the message from Matlab/Simulink, then it moves the 3D model using the received information and sends the request for new data. A dedicated thread takes care of this communication, so the user can still interact with the 3D scene in the meantime. In other words, Matlab/Simulink does not send the data computed for the next simulation cycles until a VR request occurs, while the VR software waits for the complete data set from Matlab/Simulink before sending the next request to the simulator.

SimLib must provide an easy-to-use IPC channel for the communication between the VR environment and the solver. We therefore implemented it as a versatile library that can easily be adapted to all the foreseen test cases. SimLib uses TCP sockets, so the simulator and the VR application can run on different machines. The library is made up of a few functions that implement the code for TCP/IP communication and synchronization, so the developer does not have to deal with sockets and threads directly. The functions of the library are:

• SimLib_Channel* SIMServerOpen(unsigned short port, int connections, u_long non_blocking)
• SimLib_Channel* SIMClientOpen(const char* host, unsigned short port, u_long non_blocking)
• int SIMClose(SimLib_Channel* s)
• SimLib_Channel* SimLib_Accept(SimLib_Channel* server)
• int SimLib_synchroSend(SimLib_Channel* to, SimLib_Data* r)
• int SimLib_synchroReceive(SimLib_Channel* s, SimLib_Data* r)
• void SimLib_SendNextData(SimLib_Channel* s)
• void SimLib_StartListener(SimLib_Channel* server, void (*ptActionCB)(const SimLib_Data))
• void SimLib_StopListener()

The SIMServerOpen function creates a server communication channel listening on the specified port; it can be set as a blocking or a non-blocking channel. The SIMClientOpen function creates a client communication channel that attempts to connect to the specified server; it too can be set as blocking or non-blocking. The SIMClose function closes a communication channel and removes it from memory. The SimLib_Accept function establishes a connection between the server specified as parameter and a connecting client, returning a new communication channel to be used for the communication with that client. The SimLib_synchroSend and SimLib_synchroReceive functions send and receive data using a common protocol for the IPC. The developer must specify the SimLib_Data data structure used for the communication; the struct is then sent and received through the IPC channel. The data structure may also be large and complex, although in that case the connection becomes slower.
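A minimal usage sketch of the VR side follows (the SimLib_Data layout is application-defined, and the host name, port and return conventions below are assumptions): it blocks on the data of one simulation step, applies them to the 3D model, then requests the next step.

/* Hypothetical VR-side receive loop built on SimLib. The developer
   defines SimLib_Data, so a sample layout is declared here before
   including the (assumed) library header. */
#include <stdio.h>

typedef struct {
    char  part[32];                    /* name of the part being driven   */
    float dof;                         /* current value of the part's DOF */
} SimLib_Data;

#include "SimLib.h"                    /* assumed SimLib header           */

int main(void) {
    /* Blocking client channel towards the simulator (host/port assumed). */
    SimLib_Channel *ch = SIMClientOpen("sim-host", 5000, 0 /* blocking */);
    if (ch == NULL) return 1;

    for (int step = 0; step < 1000; step++) {
        SimLib_Data d;
        if (SimLib_synchroReceive(ch, &d) != 0)   /* assumed: 0 on success */
            break;                                /* blocks until data     */
        printf("move %s to %f\n", d.part, d.dof); /* update the 3D model   */
        SimLib_SendNextData(ch);     /* presumably requests the next step  */
    }
    SIMClose(ch);
    return 0;
}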



The SimLib_StartListener and SimLib_StopListener functions respectively create and destroy the listener thread. The listener thread can be created by a server communication channel to handle the connections with the clients; if there is a listener, the server communication channel must be in blocking mode. For each connecting client, the listener thread spawns a further thread (called the answer thread), which receives the data from the client and then calls a callback function (if specified) that can use the received data.

The Simulink S-Function for the communication

The Matlab/Simulink environment can be extended through user-defined S-Functions, which can be used within a Simulink model as conventional Simulink building blocks with a user-defined behaviour and set of actions. In our case, the S-Function is responsible for the communication between Matlab/Simulink and the VR software; the communication makes use of the SimLib library and relies on the IPC channel it provides. Its main task is the asynchronous sending of data to VR, i.e. the S-Function sends the simulation data to VR without stopping the simulation. To obtain a consistent visualization, however, all the transmitted data must belong to the same time-step: since more than one part of the model is governed by the simulator, sending data of different time-steps would produce an incorrect visualization. To this end we used the Matlab ssGetT function, which returns the current simulation time. As shown in the finite-state diagram below, once the OK message from VR is obtained, all the blocks send their data in sequential order; a sketch of such a sender follows.
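The skeleton below uses the standard Simulink C S-Function API; only ssGetT and the S-Function callbacks are standard Simulink, while the SimLib calls, the payload layout and the gating policy are assumptions reconstructed from the description above.

/* Skeleton of a sender S-Function: transmits its input (one DOF) once per
   time-step, using ssGetT() to avoid mixing data of different time-steps. */
#define S_FUNCTION_NAME  sl_fdmu_sender
#define S_FUNCTION_LEVEL 2
#include <string.h>
#include "simstruc.h"

typedef struct { char part[32]; float dof; } SimLib_Data; /* as in the 7.1 sketch */
#include "SimLib.h"                                       /* assumed header       */

static SimLib_Channel *ch = NULL;

static void mdlInitializeSizes(SimStruct *S) {
    ssSetNumSFcnParams(S, 0);
    if (!ssSetNumInputPorts(S, 1)) return;
    ssSetInputPortWidth(S, 0, 1);
    ssSetInputPortDirectFeedThrough(S, 0, 1);
    if (!ssSetNumOutputPorts(S, 0)) return;
    ssSetNumSampleTimes(S, 1);
}

static void mdlInitializeSampleTimes(SimStruct *S) {
    ssSetSampleTime(S, 0, CONTINUOUS_SAMPLE_TIME);
    ssSetOffsetTime(S, 0, 0.0);
}

static void mdlOutputs(SimStruct *S, int_T tid) {
    static time_T lastSent = -1.0;      /* real code: per-block PWork state */
    time_T now = ssGetT(S);             /* current simulation time          */
    if (ch == NULL)                     /* non-blocking channel towards VR  */
        ch = SIMClientOpen("vr-host", 5000, 1 /* non-blocking, assumed */);
    if (ch != NULL && now > lastSent) { /* new time-step: send exactly once */
        InputRealPtrsType u = ssGetInputPortRealSignalPtrs(S, 0);
        SimLib_Data d;
        strcpy(d.part, "pendulum");     /* part governed by this block      */
        d.dof = (float)(*u[0]);
        SimLib_synchroSend(ch, &d);     /* assumed not to stall on a        */
        lastSent = now;                 /* non-blocking channel             */
    }
}

static void mdlTerminate(SimStruct *S) {
    if (ch) { SIMClose(ch); ch = NULL; }
}

#include "simulink.c"                   /* MEX glue */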

Figure 45: Finite state diagram of the S-Function (State 0: waiting for a new simulation cycle; State 1: sending data; the transitions are guarded by conditions on the simulation time, the requests received from VR and the data already sent)

The configuration of each communication block is quite easy: the network parameters and the part name can be set via a GUI, as shown in the next figure.

Figure 46: GUI for the S-Function configuration



The parameters needed by each block are:

• the IP address and port of the host to which the data are sent
• the name of the part governed by the block
• the number of blocks in the Simulink model.
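Rather than hard-coding the network settings as in the sketch above, a real block would read them from its dialog parameters. A hypothetical fragment follows (the parameter order is an assumption; ssGetSFcnParam, mxGetString and mxGetScalar are the standard MEX/Simulink calls, and mdlInitializeSizes would declare ssSetNumSFcnParams(S, 4)):

/* Hypothetical mdlStart reading the block parameters set via the GUI.
   Assumed order: 0 = host IP, 1 = port, 2 = part name, 3 = number of
   blocks in the model. Extends the skeleton shown earlier. */
#define MDL_START
static void mdlStart(SimStruct *S) {
    char host[64], part[32];
    mxGetString(ssGetSFcnParam(S, 0), host, sizeof host);   /* host IP     */
    unsigned short port = (unsigned short)mxGetScalar(ssGetSFcnParam(S, 1));
    mxGetString(ssGetSFcnParam(S, 2), part, sizeof part);   /* part name   */
    int nblocks = (int)mxGetScalar(ssGetSFcnParam(S, 3));   /* block count */
    /* ...open the SimLib channel towards host:port and register this
       block as one of nblocks senders... */
    (void)nblocks;
}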

7.2 References

[KIRNER 2005] Kirner T.G., Kirner C., Simulation of real-time systems: an object-oriented approach supported by a virtual reality-based tool. In Proceedings of the 38th Annual Simulation Symposium, 4-6 April 2005, San Diego, California.

[SÁNCHEZ 2005] Sánchez J., Esquembre F., Martín C., Dormido S., Dormido-Canto S., Canto R.D., Pastor R., Urquía A., Easy java simulations: An open-source tool to develop interactive virtual laboratories using MATLAB/Simulink. International Journal of Engineering Education, 21(5):798-813, 2005.

[BARBIERI 2007] Barbieri L., Bruno F., Caruso F., Muzzupappa M., Innovative integration techniques between Virtual Reality systems and CAx tools, The International Journal of Advanced Manufacturing Technology, http://dx.doi.org/10.1007/s00170-007-1160-3, 2007.

[BRUNO 2007] Bruno F., Caruso F., Muzzupappa M., Stork A., An experimental environment for the runtime communication among different solvers and visualization modules. In Proceedings of the 19th European Modeling and Simulation Symposium, 4-6 October 2007, Bergeggi (SV), Italy.