
Infocarve: A Framework for Volume Visualization on Commodity Augmented Reality Displays

Lingqiang Ran
Geotechnical and Structural Engineering Research Center
Shandong University, Jinan 250061, Shandong, China
[email protected]

John Dingliana
Graphics Vision and Visualisation Group
Trinity College Dublin, Dublin, Ireland
[email protected]

Abstract— We present a framework for interactive real-time visualization of three-dimensional volume data on commodity augmented reality (AR) displays. In particular, we address the problem of seamlessly blending internal anatomy data sets with real-world objects. One key challenge, particularly relevant to this scenario, is conveying the correct sense of the relative depths of virtual and real-world objects. To address this issue, we exploit information captured by a depth sensor to build a mask which is used as a weighting parameter to correctly combine rendered volume images with real imagery in a depth-preserving manner. Results obtained on prototype AR hardware devices indicate improvements to relative depth perception. Furthermore, we address performance challenges and provide solutions to ensure that the framework is applicable to a range of different AR devices, many of which have limited computational and graphical rendering capabilities.

Keywords— Volume rendering, Augmented reality, Mixed reality, Real-time graphics rendering, Scientific visualization.

I. INTRODUCTION

Augmented Reality (AR) can be defined as a combination of Virtual Reality (VR) elements interacting with real-world inputs (e.g. head tracking) or composited with live images of the real world. AR has received renewed interest in recent years with highly functional commodity-priced display technologies becoming available in the market. However, the industry is still in its infancy, particularly with regard to serious non-leisure applications that can effectively use these platforms to benefit the end user. This paper addresses the problem of effective interactive visualization of three-dimensional (3D) volume data on commodity see-through head-mounted AR displays. At a high level, the design of our proposed visualization framework is driven by two key objectives, addressing issues of computational performance and visual perception.

Firstly, typical state-of-the-art volume visualization is a computationally complex process, which poses challenges for commodity AR platforms, the majority of which are mobile devices or lightweight, highly portable systems such as head-mounted displays untethered from a desktop system. Invariably, this implies solutions applicable to devices with limited Graphical Processing Units (GPUs) and general processing capabilities.

Secondly, the goal of all visualization is to improve the ability of the user to perceive and effectively process visually-presented information. This is important in volume rendering, which concerns visually-overlapped 3D information, and is compounded in AR by an additional layer of information about the real world captured interactively by sensors. Thus, it is important that virtual elements do not excessively occlude the scene and that they are rendered in a manner that ensures users have a reliable sense of how embedded virtual objects are located in relation to the real-world scene (see Fig. 1).

Our main contribution in this paper is the discussion of a framework, INFOCARVE, for depth-preserving volume visualization in AR. Specifically, it is designed to leverage commodity depth sensors to achieve this goal and is adapted to accommodate the reduced graphical processing capabilities of lightweight mobile and head-mounted AR devices.

The rest of the paper is structured as follows: Section II discusses related work in the areas of volume visualization, augmented reality and depth perception. Section III provides an overview of our framework, highlighting unique elements of our volume rendering system. Section IV discusses our AR blending technique, designed to preserve a sense of the relative depths of virtual and real objects. In Section V we summarize our results, and we conclude with findings and avenues for further development in Section VI.

This publication has emanated from research supported by Science Foundation Ireland under Grant Numbers 14/TIDA/2349 and 13/IA/1895.

Fig. 1. Sample result obtained using our framework. Left: real-world head model; Middle: blended with a naive flat overlay of MRI Head rendering; Right: blended using our framework to create illusion of containment.


II. RELATED WORK

Volume rendering is a subset of computer graphics that deals with 3D data in the form of a discretely sampled regular grid, or acquired as a 3D stack of 2D images such as from an MRI scan. Although many techniques of varying complexity for rendering volumetric data have been developed over the past three decades, advancements in graphics processing unit (GPU) technologies have led to real-time volume ray-casting [1] becoming widely accepted as the preferred approach in the graphics community. Ray-casting has many advantages, such as its generality, flexibility and reduced pre-processing requirements; however, it is performance-intensive, typically requiring a powerful graphical system with 3D texture support in order to achieve real-time frame rates. Many mobile and web-based graphical systems still have limited support for this feature, and thus until recently have been limited to employing techniques such as slice-based rendering [2] [3].

Blending layers of information to enhance visibility and depth perception is a strategy of general interest in scientific and information visualization, and significant research has been published in this area. For example, Bruckner et al. [4] presented a framework to composite interactively rendered 3D layers, extending digital layering to 3D without giving up the advantages of 2D methods. de Moura Pinto and Freitas [5] proposed a front-to-back equation for fragment composition that allows precise and smooth importance-based visibility control during the blending of different portions of a volume dataset. Based on quantitative perception models, Zheng et al. [6] presented an energy function to enhance perceptual depth-ordering of layers in rendered volume images. Considering the real-world image as an additional layer of information, we can apply similar strategies to visualization in AR.

Several previous works have explored perceptual issues in augmented reality [7] [8]. A recurring issue is seamless integration of real and virtual elements, achieved by simulating camera artefacts [9], photorealistic rendering [10] or stylizing the real world image [11]. However, in many serious applications, it could be argued that it is not the illusion of realism that is important but rather the fidelity of information conveyed by the resulting image. For instance, Fukiage et al. [12] employed a metric to improve blending of virtual and real images towards achieving a target level of visibility in blended components. Kalkofen et al. [13] presented an interactive focus and context approach to control the amount of augmented information, preserving relevant occlusions and landmarks whilst minimizing visual clutter. Related to this, Macedo and Apolinario [14] presented a focus and context visualization based on volume clipping that they found to improve the visual quality of medical AR scenes.

Various techniques have been applied to improve, specifically, depth perception in AR, such as controlling illumination [15], depth-of-field effects [16], stereopsis [17] and other factors. The importance of correct depth perception in medical augmented reality is discussed by Bichlmeier et al. [18], who presented a method for manipulating transparency for in-situ visualization of anatomy. Their technique is based on polygonal datasets, and they indicate that GPU ray-casting would render the technique non-interactive. Wither and Höllerer [19] evaluated a number of pictorial depth cues, such as shadow planes and color-encoded markers, for outdoor mobile augmented reality. Zollmann et al. [20] presented an X-ray visualization technique that analyzes camera images of a physical scene and uses the extracted information for occlusion management. They also compared three different methods to achieve X-ray visualization [21], concluding that an image-based approach achieved the best results amongst those they tested. Mendez and Schmalstieg [22] proposed the use of a static mask to enhance the perception of spatial arrangement in a scene. Choi et al. [23] proposed seamless switching between VR and AR to provide an indication of real distance. Bork et al. [24] combined Temporal Distance Coding and auditory feedback to improve 3D perception of an augmented camera view.

Whilst numerous works have clearly dealt with depth perception, due to the interactive requirements of AR and the computational expense of volume rendering there are relatively few studies combining the two, and fewer still directly address AR devices currently available to consumers. This will be the main focus of our discussion.

III. VISUALIZATION FRAMEWORK

An overview of our framework is illustrated in Fig. 2. In this section, we describe the design of the volume rendering component of the system, adapted to cater for a range of heterogeneous commodity display devices capable of AR display. The specifics of the augmented reality component are described in Section IV.

Fig. 2. Overview of the INFOCARVE framework, showing key components discussed in this paper. Note that dashed lines indicate an optional component, employed in the case of limited GPU resources. The diagram comprises three stages: Scanning & Tracking (sensor/camera, marker tracking, depth and color images); Volume Rendering (volume data, e.g. MRI, sampled into a 3D texture or 2D texture atlas, GPU ray-cast volume rendering or thin-client rendering using the tracked modelview and projection matrices, and visualization operations such as the transfer function); and AR Blending (normalised depth mask, null-region mask, closing and smoothing, importance map, and compositing into the final image).


Fig. 3 shows some of the devices employed during development of our framework. We were particularly interested in AR head-mounted displays (HMDs) with in-built camera and depth sensor systems, as this appears to be the popular setup that will be followed by devices such as the Microsoft Hololens and Magic Leap that have been widely publicized in the media. At the time of development, we had access to only one device that had both of these components integrated, namely the Meta1, a binocular AR HMD developed by Meta Company [25], which is tethered to a desktop PC and equipped with an embedded camera and depth sensor system developed by SoftKinetic [26]. Thus we also performed development on the Epson Moverio BT-200 [27], a binocular HMD without a depth sensor (Epson has since integrated a depth sensor into its BT-2000 model), based on an Android mobile platform; an Intel RealSense [28] F200 sensor, a standalone camera and depth-sensor unit connected to a desktop PC; and the Google Project Tango tablet [29], an Android mobile device with depth-sensing capabilities and an NVIDIA Tegra K1 GPU.

Our core software implementation is based on OpenGL and the GLSL shader language [30], with OpenGL ES used on mobile platforms. In order to develop for the Meta1, we were required to use the device's proprietary SDK, which extends Unity 3D and has no specific support for volume rendering. However, employing Unity in OpenGL mode meant that the majority of the codebase for volume ray-casting, which resides mainly within GLSL shaders, could be almost directly ported, with essentially only the scene setup performed in Unity-specific code. As the Android development kit allows for embedding of native C++ code, minimal effort was also required to retarget the implementation to the mobile platforms.

A. Volume Rendering

GPU ray-casting is the preferred method of volume rendering in our framework for several reasons. In addition to having become the best-in-class technique in recent years, its generality and limited pre-processing requirements make it ideal for interactive AR in particular. The ability to change lighting conditions and the volume transfer function (which maps input values to appearance attributes such as color and opacity) on the fly makes it amongst the most flexible techniques. This in turn provides opportunities for better embedding virtual objects in the real world. For instance, the depths of individual voxels of the 3D volume can be compared to discrete points in the real-world scene to allow correct occlusion or blending, and the model can be relit consistently with the real-world scene.

In our system we employ an approach that is largely based on [1] (see the example in Fig. 4, left), with key differences discussed in this section. The system provides functionality for loading and editing different data sets, changing the volume transfer function and switching between a range of shaders, in order to achieve effects such as two-level volume visualization and illustrative visualization, based on methods well-established in the literature. We will focus our discussion on elements that are particularly relevant to visualization in the AR context.
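To make the rendering stage concrete, the core of such a ray-caster reduces to a front-to-back compositing loop in a fragment shader. The sketch below is illustrative only and follows the standard two-pass formulation of [1] rather than reproducing our exact shaders; the uniform and sampler names (volumeTex, transferFunc, rayStartTex, rayEndTex, numSamples) are assumptions made for the example.

```glsl
// Minimal front-to-back GPU ray-casting sketch, in the spirit of [1].
// Ray entry/exit points are assumed to have been rasterised into two
// screen-aligned textures in volume coordinates (two-pass approach);
// all uniform and sampler names are illustrative assumptions.
uniform sampler3D volumeTex;      // scalar volume data
uniform sampler1D transferFunc;   // maps a scalar value to RGBA
uniform sampler2D rayStartTex;    // ray entry points (front faces of proxy cube)
uniform sampler2D rayEndTex;      // ray exit points (back faces of proxy cube)
uniform int numSamples;           // samples per ray (e.g. 2000 on desktop)

varying vec2 uv;                  // screen-space texture coordinate

void main() {
    vec3 start = texture2D(rayStartTex, uv).xyz;
    vec3 end   = texture2D(rayEndTex, uv).xyz;
    vec3 delta = (end - start) / float(numSamples);

    vec4 dst = vec4(0.0);         // accumulated colour and opacity
    vec3 pos = start;
    for (int i = 0; i < numSamples; ++i) {
        float value = texture3D(volumeTex, pos).r;
        vec4 src = texture1D(transferFunc, value);
        // front-to-back "over" compositing
        dst.rgb += (1.0 - dst.a) * src.a * src.rgb;
        dst.a   += (1.0 - dst.a) * src.a;
        if (dst.a > 0.99) break;  // early ray termination
        pos += delta;
    }
    gl_FragColor = dst;
}
```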

B. Support for Lightweight Clients

As previously mentioned in Section II, the ray-casting technique has a high computational complexity and is only achievable in real-time by exploiting parallel graphics processing capabilities. Typically, this requires 3D texture support, which is mandatory only from version 3.0 of the OpenGL ES standard onwards. At the time of development of our framework, many mobile and AR devices based on the Android platform had support only for OpenGL ES 2.0, in which 3D texture support is provided as an optional extension. We found that this extension was not supported on several test platforms, including the Moverio BT200. In order to guarantee support for ray-casting in our framework, we thus unroll the 3D volume into a 2D texture atlas [31] and use a custom GLSL shader to access the corresponding texels in the 2D atlas, applying interpolation to effectively achieve the functionality of a 3D texture lookup. It has been reported that the use of texture atlases provides potential benefits for mobile display platforms [32]. We have also integrated temporary support for slice-based volume rendering as an alternative to the ray-caster.
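As an illustration of the atlas lookup described above (not the exact shader code of our framework), the following GLSL helper emulates a trilinear 3D texture fetch by sampling the two nearest slices in the 2D atlas and interpolating between them; the packing convention and uniform names are assumptions made for the example.

```glsl
// Emulated 3D texture lookup from a 2D texture atlas (illustrative sketch).
// Assumes the volume's Z-slices are packed row-major into a grid of
// atlasCols x atlasRows tiles inside a single 2D texture; sampleVolume()
// would replace texture3D() calls in the ray-casting loop.
uniform sampler2D volumeAtlas;
uniform float atlasCols;   // tiles per row in the atlas
uniform float atlasRows;   // tile rows in the atlas
uniform float numSlices;   // number of Z-slices (<= atlasCols * atlasRows)

vec2 sliceOffset(float slice) {
    // Top-left corner of a given slice's tile, in atlas texture coordinates.
    float col = mod(slice, atlasCols);
    float row = floor(slice / atlasCols);
    return vec2(col / atlasCols, row / atlasRows);
}

float sampleVolume(vec3 texCoord) {
    float slicePos = texCoord.z * (numSlices - 1.0);
    float slice0 = floor(slicePos);
    float slice1 = min(slice0 + 1.0, numSlices - 1.0);

    // Scale the XY coordinate into a single tile and add the tile offset.
    vec2 inTile = texCoord.xy / vec2(atlasCols, atlasRows);
    float v0 = texture2D(volumeAtlas, sliceOffset(slice0) + inTile).r;
    float v1 = texture2D(volumeAtlas, sliceOffset(slice1) + inTile).r;

    // Linear interpolation between adjacent slices completes the trilinear
    // filtering (XY filtering within a tile is done in hardware).
    return mix(v0, v1, slicePos - slice0);
}
```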

As further support for mobile platforms without adequate GPU capabilities, we developed a thin-client solution for volume rendering that exploits a standard desktop workstation as a server for ray-cast volume rendering. A lightweight software client running on a mobile AR device sends camera parameters to the server, which performs a fully-fledged ray-cast rendering from a corresponding virtual camera position at the client's requested resolution. Image-space blending is performed to integrate the volume rendering with the real-world image; in the case of see-through displays, however, no blending is required. Fig. 4 (right) shows an example of the thin-client rendering mode running on the Epson Moverio BT200 HMD, which is powered by a lightweight hand-held Android device with limited graphical capability.

Fig. 3. Devices tested in the development of our framework (from left to right): Epson Moverio BT200, Metavision Meta1, Intel Realsense F200 Sensor.

Fig. 4. Volume rendering output. Left: volume ray-casting example on Meta1 AR-HMD. Right: photo of thin-client rendering on the refractive screen of the Moverio BT200 (inset: a close-up view).


IV. AUGMENTED REALITY

Augmented reality typically involves the interplay between technologies for interaction or real-world scene processing and display technologies that deliver the visual output augmenting the real world. Our main focus in this paper is on the latter, i.e. improving the display components; thus, we leverage existing technologies for tracking the position of the real-world object. We exploit depth sensor data, obtained through off-the-shelf devices, in order to improve the perception of depth when mixing complicated virtual volume data with the real world.

A. Tracking

In order to embed our volume data correctly in the real world, we use functionality available in existing AR libraries for tracking a two-dimensional marker in the real world. Specifically, we use Metaio [33] on the desktop test platform, and Vuforia [34] on the Android mobile platforms, including the Moverio BT200. The Meta1's proprietary SDK (unrelated to Metaio) provides additional built-in functionality for tracking.

In an initial setup step, the position and scale of the volume needs to be calibrated to the position of the marker (or a pre-identified image or tag), which is rigidly attached to the real-world object (see Fig. 6). At run-time, the tracking system scans for the marker in the real world and relays its relative position and orientation to the rendering system, more specifically sending the OpenGL modelview and projection matrices to the volume renderer. The rendering system renders the volume data from the appropriate view into a 2D image using a GLSL shader, as shown in Fig. 5(a).
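As a simple illustration of how the tracked pose drives the renderer, the proxy geometry of the volume (its bounding cube) can be transformed by the matrices supplied by the tracker in an ordinary vertex shader. The sketch below is hypothetical in its naming; only the use of the tracked modelview and projection matrices is taken from our pipeline.

```glsl
// Vertex shader for the volume's proxy cube, driven by the tracked pose.
// The modelview and projection matrices are supplied each frame by the
// marker-tracking component (uniform names here are illustrative).
uniform mat4 trackedModelView;   // marker pose relative to the camera
uniform mat4 trackedProjection;  // projection matching the device camera
attribute vec3 position;         // proxy-cube vertex in volume space

varying vec3 volumeCoord;        // passed on to the ray-casting fragment stage

void main() {
    volumeCoord = position;      // unit-cube position doubles as a 3D texture coordinate
    gl_Position = trackedProjection * trackedModelView * vec4(position, 1.0);
}
```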

B. Real-world Capture

A sensor mounted on the display device captures both a color image and a depth image representing the real-world scene. In recent years, several commodity devices have become available that capture this combination of data, referred to as an RGB-D image. The RGB component is used for tracking purposes (as described in the previous subsection) and also as the background real-world image, except in the case of see-through HMDs, where the viewer sees the real world directly rather than a captured RGB image of it.

The depth image is used as the basis for generating a 2D mask image. However, as the depth image includes not only the target 3D object but also other background objects that are not needed for the composition process, we segment the depth image to remove the extraneous information. We achieve this by intersection with a secondary mask delimiting areas of the volume data that do not contain any visible elements. As the volume data is already aligned with the real world, this secondary mask is obtained by simply thresholding the volume-rendered image to provide a binary mask representing non-null regions in the volume scene (Fig. 5(b)). We used OpenCV [35] for a number of common image processing operations.

As can be seen in Fig. 5(c), the fidelity of the depth image is quite poor, due to the limited pixel resolution of the depth camera (the figure is based on a depth image of 640 x 480 resolution from the Intel RealSense F200 sensor) as well as the limited accuracy of the depth information. In this example, the border of the head area is not complete and there are small black holes inside the head area; these issues influence the final composition result. In order to ameliorate these problems, we employ a morphological close operation to remove the black holes. A further dilate operation is then used to expand and refine the border of the head area.
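In our implementation these steps are performed with OpenCV on the CPU. Purely to illustrate the operation itself, in the same shader language used elsewhere in this paper, a single 3x3 dilation pass over the binary mask could be expressed as follows (a morphological close is then a dilation pass followed by an erosion pass that takes the minimum instead of the maximum); the texture and uniform names are assumptions.

```glsl
// One 3x3 dilation pass over the binary mask (illustrative GPU alternative
// to the OpenCV operations used in the framework). A morphological close
// is a dilation pass followed by an erosion pass (min() instead of max()).
uniform sampler2D maskTex;    // binary mask: 1 inside the object, 0 outside
uniform vec2 texelSize;       // 1.0 / mask resolution, e.g. (1/640, 1/480)

varying vec2 uv;              // screen-space texture coordinate

void main() {
    float result = 0.0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            vec2 offset = vec2(float(dx), float(dy)) * texelSize;
            result = max(result, texture2D(maskTex, uv + offset).r);
        }
    }
    gl_FragColor = vec4(vec3(result), 1.0);
}
```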

C. Composition

Finally, the pixel values of the mask image are normalized to the 0 to 1 range in order to obtain the final importance map, Fig. 5(d), before being used in the final mixing process. Then, given the rendered volume image, denoted $I_v$, the input RGB image, $I_c$, and the importance map, $I_m$, we achieve the final blended result using the following simple equation, which is implemented as a GLSL fragment shader: $I = I_m \times I_c + (1 - I_m) \times I_v$.
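This compositing step maps directly onto a small fragment shader. The sketch below assumes the three inputs are available as screen-aligned textures; the sampler names are illustrative rather than taken from our implementation.

```glsl
// Final compositing pass: I = Im * Ic + (1 - Im) * Iv,
// where Im is the importance map, Ic the camera (RGB) image and
// Iv the rendered volume image (sampler names are illustrative).
uniform sampler2D importanceMap;  // Im, normalised to [0, 1]
uniform sampler2D cameraImage;    // Ic
uniform sampler2D volumeImage;    // Iv

varying vec2 uv;                  // screen-space texture coordinate

void main() {
    float im = texture2D(importanceMap, uv).r;
    vec3 ic  = texture2D(cameraImage, uv).rgb;
    vec3 iv  = texture2D(volumeImage, uv).rgb;
    gl_FragColor = vec4(im * ic + (1.0 - im) * iv, 1.0);
}
```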

In the case of a see-through AR HMD such as the Meta1 or Hololens, the RGB color image is never displayed, but instead the volume rendering occludes, at corresponding levels of opacity, areas that need to be blended.

The manner in which we derive the importance map based on the depth information is stylistically motivated, and designed for consistency with a number of strategies observed in illustrative visualization [36] [18] [37]. Gradually increasing opacity inwards from the edge exposes interior objects in the center and creates an effect akin to the fall-off in specularity observed in spherical objects such as a fish tank or crystal ball. In our framework, the fall-off is dependent on the geometry of the real world object, and this was found to create an effective illusion of containment. In future work, we plan to conduct formal perceptual trials, to study and refine this effect.

Fig. 6. Example image targets used for tracking. Left: the Metaio SDK used on MS Windows 8, with a marker at the base of the head model; Right: Vuforia used on Android, with a smaller target affixed to the head.

Fig. 5. (a) Example of rendered volume data; (b) corresponding mask of non-null regions; (c) depth image; (d) depth image intersected with (b) and transformed to get the final importance map for blending.


V. RESULTS

Fig. 1 shows a static example of a head model with an overlaid digital MRI rendering of the head. From visual inspection, it is clear that the right-most image, based on our framework, conveys a better sense that the virtual model is embedded within the head. In comparison, the flat overlay (middle) is clearer but appears to float in front of the “real” head. A second example is provided in Fig. 7, which shows a number of frames captured from a live simulation. Again, the importance-mapped approach (bottom row) seems to convey more clearly that the brain model is contained within the head than in the case of the flat overlay (top row).

The middle row in Fig. 7 shows the depth map before the smoothing and morphological close operations are applied to reduce noise. Some temporal artefacts persist in the final result (bottom) but they are greatly reduced in comparison. Our current results are limited mainly by the fact that the prototype devices we tested had limited pixel resolution and accuracy in their depth sensors. The depth channel of the RealSense F200 has a pixel resolution of 640 x 480, whilst the SoftKinetic sensor in the Meta1 has half this resolution (320 x 240); this is further compounded by the fact that its proprietary SDK makes the depth data available only in a color-coded format that requires an additional processing pass, and by some inconsistencies across different platforms. We believe that the accuracy of our approach will scale up proportionally with the resolution of the depth sensors in upcoming AR products.

We tested our framework on several different AR hardware devices, ranging from mobile tablet and HMD to desktop platforms. Real-time performance of over 20 frames per second (FPS) was obtained for rendering the MRI Head and Brain datasets, both of 256 x 256 x 256 voxel resolution, using a standard GPU ray-caster with 2000 samples per ray on a desktop system (Dell Precision T1700 with Intel Xeon E3-1246 3.10 GHz CPU, 16 GB memory, AMD FirePro W4100, Windows 8) equipped with an Intel Realsense F200 sensor, and on an identical system using the Meta1 as sensor and output device.

The NVIDIA Tegra K1 system of the Google Project Tango tablet, equipped with 4 GB RAM and Android 4.4, was able to handle the rendering in real-time with a reduced ray-casting sampling rate. In our unoptimized system, we found the sampling rate had to be more than halved in comparison to the desktop solution in order to guarantee frame rates of greater than 15 FPS for rendering and tracking. We did not test the compositing operations on the Tango due to differences in the way the Tango handles depth data, as a point cloud.

The Moverio BT200's display is driven by a very lightweight hand-held hardware client (TI OMAP 4460 1.2 GHz dual-core CPU, 1 GB RAM, Android 4.0.4). The device did not support 3D texture mapping, and performance was significantly outside of interactive rates for the tested models using the texture-atlas renderer. However, it was possible to use slice-based rendering and the thin-client mode (see Fig. 4 (right)) for real-time display of the volumes tracked to an image target, using a rendering server with the same specification as the desktop system discussed above. The BT200 was not equipped with a depth sensor; however, Epson have subsequently released a more extensive model, the BT2000, with this feature added.

VI. CONCLUSIONS AND FUTURE WORK

We discussed a framework for rendering 3D volumetric data on augmented reality displays. The system exploits depth sensors that are becoming available in most augmented reality HMDs and mobile devices. We provided adaptations to our framework to accommodate devices with relatively low-end computational capabilities, as is common in mobile and HMD platforms designed to be highly portable. The framework achieves real-time performance across a number of platforms and provides improvements that preserve the sense of interior volumes being embedded inside real-world objects.

We were unable to test the fully-featured depth-based composition on all devices due to the lack of some hardware features; however, with our main targets being upcoming HMD devices such as the Microsoft Hololens and Meta Company's next HMD version, the Meta 2, we believe our results provide useful insights for future development once these become available. Furthermore, a number of standalone depth sensor devices are becoming available in the market, such as the Structure sensor [39], and these could facilitate integration of our full framework into a range of different portable devices.

In this paper, we presented results with rigid anatomy models. A key challenge that must be addressed in applying our approach to a real human body is maintaining correct calibration of the real-world object reference with the volume data. This is a popular area of research in computer vision, and AR solutions are becoming available with increasing support for multiple image targets or 3D targets, which we believe will soon allow us to maintain robust interactive alignment with non-rigid and highly dynamic objects. Our main objective here was to address the graphical part of the AR challenge, and we believe that our solution is applicable to a subset of other problems concerning more rigid models, such as displaying interior features of machinery or electronic devices, as in [38].

Fig. 7. Frames captured from live animation. Top row: flat overlay of volume rendering; middle: naive importance mask before smoothing; bottom: depth-preserving importance mask after closing and smoothing operations.


Our framework employs an image-based solution to achieve the final AR blend at the final stages of the pipeline. This presents several advantages, such as convenience for the purposes of the thin-client solution, as well as efficiency, in that the blending is simple to implement in a straightforward fragment shader. Blending at the ray-cast stage could have certain additional advantages, such as partial occlusion of the volume data by real objects embedded within the virtual volume. For the present, our assumption is that the volume data represent interior structures underlying the hull of a solid real-world object, e.g. visualizations of bone under the skin of a human subject, which our current approach is able to handle.

We plan to conduct user studies in the future to obtain quantitative results on the improvements to depth perception and frame-to-frame coherency resulting from our approach. In addition, such an experiment could inform modifications of the shaders that will lead to further improvements to depth perception. However, such studies involve a non-negligible investment of time, and it was difficult to justify extensive trials at this stage as only prototype versions of the AR devices were available to us. We expect to exploit the better resolution of the depth sensors in the Hololens, Meta 2 and RealSense R200 to achieve correspondingly better rendering quality. We plan to acquire and port our solution to these upcoming devices before commencing user studies. In addition, we believe that higher depth resolution and better processing capabilities will provide opportunities to refine our approach in other ways, such as by using gradient information in the depth image for further improvements to the importance map discussed in Section IV.

ACKNOWLEDGMENT

The authors would like to thank Niall Mullally for practical contributions to the framework described in this paper. The MRI Head dataset is obtained from Stefan Roettger's volume library at Erlangen University; the torso dataset seen in the framework diagram was obtained from the Osirix DICOM sample data page. The Brain dataset was a down-sampled version of data obtained from the Harvard Surgical Planning Library.

REFERENCES

[1] K. Engel, M. Hadwiger, J. Kniss, C. Rezk-Salama and D. Weiskopf, Real-time Volume Graphics, CRC Press, 2006.
[2] T. T. Elvins, "A Survey of Algorithms for Volume Visualization," ACM SIGGRAPH Computer Graphics, vol. 26, no. 3, pp. 194-201, August 1992.
[3] K. Brodlie and J. Wood, "Recent Advances in Volume Visualization," in Eurographics State of the Art Reports, 2000.
[4] S. Bruckner, P. Rautek, I. Viola, M. Roberts, M. C. Sousa and M. E. Gröller, "Hybrid Visibility Compositing and Masking for Illustrative Rendering," Computers and Graphics, vol. 34, no. 4, pp. 361-369, 2010.
[5] F. de Moura Pinto and C. M. D. S. Freitas, "Illustrating volume data sets and layered models with importance-aware composition," The Visual Computer, vol. 27, no. 10, pp. 875-886, 2011.
[6] L. Zheng, Y. Wu and K.-L. Ma, "Perceptually-based depth-ordering enhancement for direct volume rendering," IEEE Transactions on Visualization and Computer Graphics, vol. 19, no. 3, pp. 446-459, 2013.
[7] D. Drascic and P. Milgram, "Perceptual issues in augmented reality," in SPIE Vol. 2653: Stereoscopic Displays and Virtual Reality Systems III, San Jose, CA, 1996.
[8] E. Kruijff, J. E. Swan II and S. Feiner, "Perceptual issues in augmented reality revisited," in ISMAR, 2010.
[9] G. Klein and D. Murray, "Compositing for small cameras," in Proceedings of the 7th IEEE/ACM International Symposium on Mixed and Augmented Reality, 2008.
[10] K. Agusanto, L. Li, Z. Chuangui and N. W. Sing, "Photorealistic Rendering for Augmented Reality using Environment Illumination," in IEEE and ACM International Symposium on Mixed and Augmented Reality, 2003.
[11] J. Fischer, D. Cunningham, D. Bartz, C. Wallraven, H. Bülthoff and W. Strasser, "Measuring the Discernability of Virtual Objects in Conventional and Stylised Augmented Reality," in Eurographics Symposium on Virtual Environments, 2006.
[12] T. Fukiage, T. Oishi and K. Ikeuchi, "Visibility-based blending for real-time applications," in IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2014.
[13] D. Kalkofen, E. Mendez and D. Schmalstieg, "Interactive Focus and Context Visualization for Augmented Reality," in 6th IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2007.
[14] M. C. F. Macedo and A. L. Apolinario, "Focus plus context visualization based on volume clipping for markerless on-patient medical data visualization," Computers & Graphics, vol. 53, pp. 196-209, 2015.
[15] M. Haller, S. Drab and W. Hartmann, "A real-time shadow approach for an augmented reality application using shadow volumes," in ACM Symposium on Virtual Reality Software and Technology, 2003.
[16] P. Kán and H. Kaufmann, "Physically-based Depth-of-Field in Augmented Reality," in Eurographics Short Papers, 2012.
[17] M. Kersten, N. Troje, R. Ellis et al., "Enhancing depth perception in translucent volumes," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 5, pp. 1117-1124, 2006.
[18] C. Bichlmeier, F. Wimmer, S. M. Heining and N. Navab, "Contextual Anatomic Mimesis: Hybrid In-Situ Visualization Method for Improving Multi-Sensory Depth Perception in Medical Augmented Reality," in 6th IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR), 2007.


[19] J. Wither and T. Höllerer, "Pictorial depth cues for outdoor augmented reality," in Ninth IEEE International Symposium on Wearable Computers, 2005.
[20] S. Zollmann, D. Kalkofen, E. Mendez and G. Reitmayr, "Image-based ghostings for single layer occlusions in augmented reality," in 9th IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2010.
[21] S. Zollmann, R. Grasset, G. Reitmayr and T. Langlotz, "Image-based X-ray visualization techniques for spatial understanding in outdoor augmented reality," in Proceedings of the 26th Australian Computer-Human Interaction Conference on Designing Futures: the Future of Design, 2014.
[22] E. Mendez and D. Schmalstieg, "Importance Masks for Revealing Occluded Objects in Augmented Reality," in Proceedings of the 16th ACM Symposium on Virtual Reality Software and Technology, New York, NY, USA, 2009.
[23] H. Choi, B. Cho, K. Masamune, M. Hashizume and J. Hong, "An effective visualization technique for depth perception in augmented reality-based surgical navigation," The International Journal of Medical Robotics and Computer Assisted Surgery, 2015.
[24] F. Bork, B. Fuerst, A.-K. Schneider, F. Pinto, C. Graumann and N. Navab, "Auditory and Visio-Temporal Distance Coding for 3-Dimensional Perception in Medical Augmented Reality," in IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2015.
[25] Meta Company, "Meta Vision - Augmented Reality," 2016. [Online]. Available: http://www.metavision.com. [Accessed 8 June 2016].
[26] SoftKinetic, "SoftKinetic - A Sony Group Company," 2016. [Online]. Available: http://www.softkinetic.com. [Accessed 8 June 2016].
[27] Seiko Epson Corporation, "Moverio BT-200 Smart Glasses - Epson," 2016. [Online]. Available: https://www.epson.co.uk/products/see-through-mobile-viewer/moverio-bt-200. [Accessed 8 June 2016].
[28] Intel Corporation, "Intel RealSense Technology," 2016. [Online]. Available: http://www.intel.ie/content/www/ie/en/architecture-and-technology/realsense-overview.html. [Accessed 8 June 2016].
[29] Google, "Google's Project Tango," 2016. [Online]. Available: https://www.google.com/atap/project-tango/. [Accessed 8 June 2016].
[30] Khronos Group, "OpenGL - The Industry's Foundation for High Performance Graphics," 2016. [Online]. Available: http://www.opengl.org. [Accessed 8 June 2016].
[31] K. Beets, "Developing 3D Applications for PowerVR MBX Accelerated ARM Platforms," in ARM Developers Conference, 2005.
[32] M. Moser and D. Weiskopf, "Interactive Volume Rendering on Mobile Devices," in Vision, Modelling and Visualization, 2008.
[33] Metaio, "Metaio | Home," [Online]. Available: http://www.metaio.com. [Accessed 5 March 2015].
[34] PTC Inc., "Vuforia," 2016. [Online]. Available: http://www.vuforia.com/. [Accessed 8 June 2016].
[35] Itseez, "OpenCV," 2016. [Online]. Available: http://openCV.org. [Accessed 8 June 2016].
[36] S. Bruckner, S. Grimm, A. Kanitsar and M. E. Gröller, "Illustrative Context-Preserving Volume Rendering," in IEEE VGTC Symposium on Visualization, 2005.
[37] B. Csébfalvi, B. Tóth, S. Bruckner and E. M. Gröller, "Illumination-Driven Opacity Modulation for Expressive Volume Rendering," in Vision, Modelling and Visualization, 2012.
[38] P. Mohr, B. Kerbl, M. Donoser, D. Schmalstieg and D. Kalkofen, "Retargeting Technical Documentation to Augmented Reality," in Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, New York, NY, USA, 2015.
[39] Structure, "Structure Sensor - 3D Scanning, Augmented Reality and More for Mobile Devices," 2016. [Online]. Available: http://structure.io.