Above Surface Interaction for Multiscale Navigation in Mobile Virtual Reality

Tim Menzner* Travis Gesslein† Alexander Otte‡ Jens Grubert§

Coburg University of Applied Sciences and Arts

Figure 1: Relative, rate-controlled mapping for multiscale navigation between a touchscreen and VR headset: a) A user points her finger in the zoom out space (between the maximum finger height and halfway down to the minimum height). b) While she holds her finger in this zoom out space the virtual information space scales down until it eventually reaches the size of the touch screen (with a rate corresponding to the height of the finger). c) The user points her finger in the zoom in space (between the minimum height and halfway up to the maximum height). d) While she holds her finger in this zoom in space the virtual information space scales up until it eventually reaches 1:1 scale (with a rate corresponding to the height of the finger). Target selection happens through touch on the touch screen.

ABSTRACT

Virtual Reality enables the exploration of large information spaces. In physically constrained spaces such as airplanes or buses, controller-based or mid-air interaction in mobile Virtual Reality can be challenging. Instead, the input space on and above touchscreen-enabled devices such as smartphones or tablets could be employed for Virtual Reality interaction in those spaces.

In this context, we compared an above surface interaction technique with traditional 2D on-surface input for navigating large planar information spaces such as maps in a controlled user study (n = 20). We find that our proposed above surface interaction technique results in significantly better performance and user preference compared to pinch-to-zoom and drag-to-pan when navigating planar information spaces.

Index Terms: Human-centered computing—Visualization—Visualization techniques—Treemaps; Human-centered computing—Visualization—Visualization design and evaluation methods

1 INTRODUCTION

Recent progress in Virtual Reality (VR) technology has enabled the possibility of allowing users to conduct knowledge work in mobile scenarios. While the vision of a spatial user interface supporting knowledge work has been investigated for many years (e.g. [84, 85]), the recent emergence of consumer-oriented VR headsets now makes it feasible to explore the design of portable user interface solutions [35].

Lately, commercial VR head-mounted displays (HMDs) are progressing to 'inside-out' tracking using multiple built-in cameras. Inside-out tracking allows a simple setup of the VR system, and

*e-mail: [email protected]
†e-mail: [email protected]
‡e-mail: [email protected]
§e-mail: [email protected]

the ability to work in un-instrumented environments. Further, 3D finger tracking might be supported by traditional computing devices such as smartphones [114], which at the same time provide complementary high accuracy 2D touch input.

Given the capability of spatial hand and finger sensing, it is possible to see VR extending the input space of existing computing devices such as touch screens to a space around the devices, enhancing their usage. For example, knowledge workers can utilize existing 2D surface tools, new 3D tools, as well as the space around the devices and in front of the screen, reachable while sitting, to represent and manipulate additional information.

However, while VR is promising for mobile work it also brings its own additional challenges. For example, mobile information workers might often be confined to a small space, such as when traveling in an airplane or a bus. While they can view potentially very large information spaces through their HMD, the ability to interact with those spaces might be limited in constrained physical spaces, restricting the suitability of ample spatial gestures [35]. Hence, we see it as important to examine interactions in potentially small input spaces. Also, while VR lends itself to interaction with three-dimensional data, many knowledge worker tasks are still bound to two-dimensional information surfaces [35].

In this context, this paper explores how to efficiently navigate multiscale information spaces in a small input space using a combination of a VR HMD and a touch surface. Specifically, we explore the navigation of 2D planar information spaces such as maps using small spatial finger movements above a touchscreen while simultaneously allowing users to view beyond the boundaries of the touchscreen using a VR HMD.

Our contribution lies in the design and evaluation of such a spatial interaction technique for the navigation of large planar information spaces. In a user study (n = 20) we show that the proposed technique outperforms traditional 2D surface gestures.

2 RELATED WORK

Our work builds on the areas of spatial and around device interaction as well as multiscale navigation, which will be reviewed next.


2.1 Spatial Interaction

In the context of spatial user interfaces, a large number of techniques for selection, spatial manipulation, navigation and system control have been proposed [58]. Regarding object selection, Argelaguet et al. [4] presented a survey on 3D selection techniques. For a recent survey on 3D virtual object manipulation we refer to Mendes et al. [67]. Also, for spatial interaction across mobile devices (mostly in non-VR settings) recent surveys are available [18, 33].

For VR, spatial pointing techniques are of special interest, often relying on virtual hand or raycasting techniques. In the context of raycasting, Mayer et al. [66] investigated the effects of offset correction and cursor on mid-air pointing and Schwind et al. [94] showed that avatar representations can have an effect on ray-casting-based pointing accuracy. For virtual hand-based pointing, Barrera et al. [9] indicated effects of stereo display deficiencies. Chan et al. [21] explored the use of visual and auditory feedback for facilitating spatial selection. Teather et al. [109] also investigated visual aids for 3D selection in VR. Lubos et al. [64] and Yu et al. [119] investigated 3D selection in VR headsets, where Lubos found that visual perception has a larger effect on target acquisition than motor actions. Stuerzlinger and Teather proposed guidelines for targets in 3D pointing experiments [106]. The performance of 2D touch and mid-air selection on stereoscopic tabletop displays has been explored by Bruder et al. [16, 17]. Further, a number of techniques have been proposed for 3D object selection in the presence of object occlusion (e.g., [3, 11, 28, 102, 104]).

Besides unimodal techniques, the combination of touch with mid-air input has drawn attention from researchers. For example, outside of VR, Muller et al. [70] investigated the use of touch and mid-air interaction on public displays, and Hilliges et al. [44] studied tabletop settings. Multiple works have proposed to utilize handheld touchscreens in spatial user interfaces for tasks such as sketching, ideation and modelling (e.g., [5, 26, 31, 46, 83]), navigation of volumetric data [98], 3D data exploration [63] or system control [13]. Spatial manipulation has seen particular interest in single-user settings (e.g., [8, 48, 61, 65, 69, 107, 108]) but also in collaborative settings [32].

Evolving from the magic lens [12] and tangible interaction concepts [112], tangible magic lenses allow users to access and manipulate otherwise hidden data in interactive spatial environments. A wide variety of interaction concepts have been proposed within the scope of information visualization (for surveys we refer to [110, 111]). Both rigid shapes (e.g., rectangular [99] or circular [101]) and flexible shapes (e.g., [103]) have been utilized, as well as various display media (e.g., projection on cardboard [20, 99]), transparent props [15, 88], handheld touchscreens [36, 60], or virtual lenses [62] and sizes [73]. Also, the combination of touch and gaze has seen recent interest for interaction in spatial user interfaces (e.g., [45, 55, 79, 86, 93]).

Recently, Satriadi et al. and Austin et al. [6, 87] investigated hand and foot gestures for map navigation within Augmented Reality in tabletop settings. Our work is related in its use of above surface interaction, but focuses on interaction in constrained spatial environments.

Within this work, we concentrate on the use of mid-air interaction in conjunction with touch screen interaction for navigating potentially large planar information spaces such as maps.

2.2 Around Device Interaction

Along with the reduction of the size and weight of mobile and wearable devices, the need for complementary interaction methods evolved. Research began investigating options for interaction next to [72], above [30, 53], behind [25, 115], across [22, 89], or around [116, 123] the device. The additional modalities either substitute or complement the devices' capabilities.

Different sensing solutions have been proposed such as infrared sensors (e.g., [19]), cameras (e.g., [7, 118]), acoustic sensing (e.g., [40, 41, 71]), piezo sensors (e.g., [116]), depth sensors (e.g., [23, 54, 95]), electric field sensing ([124]), or radar-based sensing (e.g., [114]). Several approaches also investigated how to enable around-device interaction on unmodified devices (e.g., [34, 90, 96, 97]).

In VR, around-device interaction has been studied in conjunction with physical keyboards for text entry (e.g., [68, 74, 75, 91]).

This prior work, alongside commercialization efforts (e.g., for radar-based sensing [114], which recently made its way into commercial products such as Google Motion Sense1), motivates us: around-device interaction on mobile devices has the chance to become widely available for consumers.

2.3 Multiscale Navigation

To deal with large information spaces, multiscale interfaces [10, 39, 78] introduce the scale dimension, sometimes called Z2. Cockburn et al. [24] give a comprehensive overview of overview+detail, zooming and focus+context interfaces including techniques for multiscale navigation. Many techniques make use of a separate zoom/scale dimension and a scroll/pan dimension to select a target value in the multiscale space [37]. Prior work indicated that multiscale navigation with 2D interfaces obeys Fitts' law [38, 39]. Pelurson et al. [77] investigated bimanual techniques for multiscale navigation. Appert et al. [2] investigated 1D multiscale navigation with a mouse-based zooming technique.

In VR, multiscale collaborative virtual environments have been studied (e.g., [29, 56, 59, 80, 81, 121]) and various metaphors (such as world-in-miniature) have been investigated for navigation tasks (e.g. [1, 51, 117]). Further, Zhang et al. [57, 105, 120] introduced a progressive mental model supporting multiscale navigation and also indicated that navigation in multiscale virtual environments can be challenging [122].

For mobile devices, Jones et al. [47] investigated mid-air interactions for multiscale navigation afforded by mobile depth sensors and discussed design implications for mid-air interactions and depth-sensor design. Kister et al. [50] combined spatially-aware mobile displays and large wall displays for graph exploration. Kratz et al. [52] proposed a semi-automatic zooming technique as an alternative to multi-touch zooming but could not indicate significant performance benefits. For mouse-based interaction in desktop environments, Igarashi and Hinckley proposed rate-based scrolling with automatic zooming to obtain a constant perceived scrolling speed in screen space. To this end, when the user scrolled fast, the view automatically zoomed out. The authors indicated a comparable performance to the use of scroll bars.

Spindler et al. [100] compared pinch-to-zoom and drag-to-pan with mid-air spatial navigation utilizing absolute mappings and found that navigating a large information space by moving the display devices outperforms navigation with 2D surface gestures. Pahud et al. [76] compared 2D surface gestures with spatial navigation but arrived at different findings (spatial navigation was slower than 2D navigation). Relatedly, Hasan et al. [43] compared dynamic peephole pointing and direct off-screen pointing with on-screen 2D gestures for map-based analytic tasks. They found that while the spatially aware techniques resulted in up to 30% faster navigation times, 2D on-screen gestures allowed for more accurate retrieval of work space content.

Our study also uses a 2D surface gesture baseline technique, but compares it to a relative, rate-controlled above surface interaction technique, which is suitable for a wide variety of usage scenarios, including interaction in physically constrained environments such as trains or touchdown places.

1https://www.blog.google/products/pixel/new-features-pixel4/ Last access November 21st, 2019

2In this paper, we use the words scale and zoom interchangeably.

3 ABOVE SURFACE INTERACTION FOR MULTISCALE NAVIGATION IN MOBILE VR

Within this research, we set out to support efficient navigation of large information spaces using a potentially small interaction space (such as the space on and above a smartphone or a tablet) through a combination of spatial interaction and a touch surface. In this joint interaction space between touchscreens and VR headsets, the touchscreen can be utilized for fine-grained 2D touch interactions, while the VR headset allows content to be visualized beyond the device boundary of the touchscreen. Within this scenario, spatial tracking of users' hands and fingers could be achieved either by camera-based inside-out tracking on the VR headset (which at the same time could spatially track the position of the smartphone through marker-based or model-based tracking), by smartphone-integrated sensing, or a combination of both.

Inspired by the large body of prior work, we followed an iterative approach with multiple design loops consisting of conceptualization, implementation and initial user tests (dogfood tests) [27, 113]. For prototyping purposes we relied on an external outside-in tracking system (OptiTrack Prime 13) in conjunction with an HTC Vive Pro headset.

First, we explored an absolute, position-controlled mapping of finger height to the scale of the virtual information space. With this technique, the height of the finger above the smartphone display, up to a previously empirically determined maximum height (5 cm above the display), was directly mapped to a scale level of the virtual display. The scaling changed linearly between minimum and maximum finger height. While initial user tests indicated that this mapping felt natural, the accuracy of the technique is dependent on the maximum scale of the information space.
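To make the mapping concrete, the following sketch expresses the absolute, position-controlled mapping in Python (our prototype was built in Unity; this is an illustrative re-implementation, and the assumption that the lowest finger height corresponds to the 1:1 scale follows Figure 1 rather than an explicit statement in the text):

    MAX_FINGER_HEIGHT_M = 0.05  # empirically determined maximum finger height (5 cm)

    def absolute_scale(finger_height_m, min_scale, max_scale=1.0):
        """Map finger height linearly to a scale level: finger on the display
        yields the 1:1 scale (max_scale), finger at the maximum height yields
        the fully zoomed-out view (min_scale)."""
        t = min(max(finger_height_m / MAX_FINGER_HEIGHT_M, 0.0), 1.0)
        return max_scale + t * (min_scale - max_scale)

    # The limitation noted above: the full scale range of the information space
    # has to fit into 5 cm of finger travel, so accuracy drops for large spaces.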

To overcome this limitation, we implemented a relative, rate-controlled mapping, in which the scale level was constantly increased or decreased with a speed dependent on the height of the finger above the display (see Figure 1, a-d). For this approach, the volume above the smartphone display was separated into two smaller volumes of the same size. The first volume extended from the display surface to half of a previously defined maximum height. The second volume extended from this height to the defined maximum height. If the finger was in the lower volume, the scale level increased. If the finger was in the upper volume, the scale level decreased. The point between those volumes at half maximum height would theoretically result in a scale change of 0. We experimented with a "dead zone" in the center (i.e. expanding the volume in which no scale change would occur). Initial user tests indicated that the scale change around the center was sufficiently low that no extended dead zone was necessary. We also experimented with a dead zone directly above the touch surface but came to similar conclusions.

For zooming out, a zoom base speed parameter determining the maximum reachable zoom speed was multiplied with the finger height above the display, normalized between half maximum height and maximum height. The result was used to decrease the map size by this factor. Similarly, for zooming in, the height was normalized between half maximum height and the minimum finger height, see Figure 1. If the user touched the screen, however, the zooming process was stopped so she could still conduct touchscreen input without constantly zooming in or moving on the other axes.
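A minimal sketch of this relative, rate-controlled zoom mapping (Python for illustration; the per-update application of the rate and all names are our assumptions, the constants are those reported further below):

    MAX_FINGER_HEIGHT_M = 0.05              # maximum finger height above the display (5 cm)
    HALF_HEIGHT_M = MAX_FINGER_HEIGHT_M / 2
    ZOOM_BASE_SPEED = 0.05                  # maximum reachable zoom speed

    def zoom_rate(finger_height_m, touching):
        """Signed zoom rate: positive zooms in, negative zooms out, zero while
        the screen is touched (touch input never changes the scale)."""
        if touching:
            return 0.0
        h = min(max(finger_height_m, 0.0), MAX_FINGER_HEIGHT_M)
        if h <= HALF_HEIGHT_M:
            # lower volume: zoom in, height normalized between half height and the display
            return ZOOM_BASE_SPEED * (HALF_HEIGHT_M - h) / HALF_HEIGHT_M
        # upper volume: zoom out, height normalized between half height and the maximum height
        return -ZOOM_BASE_SPEED * (h - HALF_HEIGHT_M) / HALF_HEIGHT_M

    def update_scale(scale, finger_height_m, touching, min_scale, max_scale=1.0):
        """Apply the rate once per update step and clamp to the valid scale range."""
        scale *= 1.0 + zoom_rate(finger_height_m, touching)
        return min(max(scale, min_scale), max_scale)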

The movement on the two other axes was handled identically in both techniques. The position of the finger on the x axis (respectively the y axis) relative to the display middle point was multiplied with a plane base speed parameter and used to determine the speed in this direction, resulting in a rate-controlled change of the x, y position. In order to make sure that the user could always see where exactly the fingertip was hovering, a small flat white disc with a 5 mm diameter indicated where a ray cast perpendicular to the touch surface would hit the display. This point also functioned as the pivot point for zooming.
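The rate-controlled panning can be sketched analogously (again illustrative; we assume the finger offsets from the display center are given in metres):

    PLANE_BASE_SPEED = 0.001  # empirically determined plane base speed

    def pan_rate(finger_offset_x_m, finger_offset_y_m):
        """Per-update change of the viewport position: the further the finger is
        from the display center, the faster the view moves in that direction."""
        return PLANE_BASE_SPEED * finger_offset_x_m, PLANE_BASE_SPEED * finger_offset_y_m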

Initial user tests indicated that the rate-controlled mapping both resulted in better performance (i.e. task completion time) and was preferred over the absolute one. Especially on the larger map we tested (13.92 m * 6.55 m and a touchscreen width to virtual map width ratio of 1:132, compared to 1.392 m * 0.655 m and a ratio of 1:13.2), the user would have to zoom more to get to the minimum scale level. Specifically, the range between the display surface and a physically possible maximum finger height was too small to provide a smooth scale change. A small change in finger height would map to a large scale change. This made accurate and precise navigation very difficult. Compared to this, the relative movement was perceived as much smoother and more precise on all map sizes.

All base speed parameters were empirically determined. For the zoom base speed, this value was 0.05 and for the plane base speed, this value was 0.001. Please note that the zoom speed constant of 0.05 is dependent on the maximum finger height (in our case 5 cm) and the plane base speed parameter is dependent on the smartphone display size. For other finger heights or display sizes those parameters can be linearly interpolated. Also, the maximum finger height of 5 centimeters above the display empirically produced the best results for various users.

4 USER STUDY

We aimed to compare the relative, rate-controlled mapping, which outperformed the absolute, position-controlled mapping in initial tests, with a commonly used technique for navigating 2D information spaces on touch screens. To this end, we utilized the well-known pinch-to-zoom and drag-to-pan interaction techniques and conducted a multiscale target search and selection task in a map navigation scenario.

4.1 Study Design

We followed a within-subjects design with two independent variables: INTERFACE and MAP SIZE.

INTERFACE had two levels: 3D NAVIGATION utilized the relative above surface interaction technique described in the previous section. The BASELINE condition utilized 2D pinch-to-zoom and drag-to-pan known from mobile map applications. The parameters for pan and zoom were chosen to mimic the Google Maps pan and zoom experience on the smartphone used in the user study (see Apparatus section), including inertia and pivot point for zooming. The BASELINE technique utilized the smartphone touch screen for sensing. We also verified that the navigation performance when wearing a VR HMD matches performance when not wearing an HMD but interacting with a mobile phone-only version of the application. A small white flat disc was used to visualize the pivot point of zooming, in this case the middle point between the two fingers used for zooming. The users' fingers used for interaction (typically thumb and index finger of the dominant hand) were visualized as colored spheres with 10 mm diameter (same as the target size).
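For reference, the BASELINE zoom follows the standard pinch gesture with the midpoint between the two touch points as pivot. A generic sketch (not the tuned Google-Maps-like implementation used in the study; inertia is omitted and the coordinate convention is an assumption):

    import math

    def pinch_step(p0, p1, prev_p0, prev_p1, scale, offset):
        """One pinch-to-zoom update. Touch points are (x, y) screen coordinates;
        a map point c is drawn at offset + scale * c. The map point under the
        midpoint of both fingers (the pivot) stays fixed on screen."""
        d_prev = math.dist(prev_p0, prev_p1)
        d_now = math.dist(p0, p1)
        if d_prev == 0.0:
            return scale, offset
        factor = d_now / d_prev
        pivot = ((p0[0] + p1[0]) / 2.0, (p0[1] + p1[1]) / 2.0)
        new_offset = (pivot[0] + (offset[0] - pivot[0]) * factor,
                      pivot[1] + (offset[1] - pivot[1]) * factor)
        return scale * factor, new_offset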

There were two MAP SIZEs: SMALL MAP had a size of 1.45 m * 0.69 m, resulting in a ratio between touch screen width and virtual map width of 1:13.8. LARGE MAP had a size of 144.71 m * 69.11 m, resulting in a ratio between touch screen width and virtual map width of 1:1380. In a pilot study, we evaluated an additional map size (with a touchscreen width to virtual map width ratio of 1:138), but found no additional insights compared to the large map size and, hence, excluded it from the main study.

Figure 2: Left: View of the VR scene in the 3D NAVIGATION condition at a small scale. The active target is shown as a blue dot, inactive targets as red dots. The pink sphere indicates the fingertip position, the white disc below it the pivot point for zooming. The touch screen area of the smartphone is shown as a semi-transparent blue rectangle with a red border. Right: View of the VR scene in the BASELINE condition at 1:1 scale. The circular targets are complemented with 3D flag symbols to indicate that targets can be selected. The two fingertips used for zooming are indicated by yellow and pink spheres. The pivot point indicated by the white disc is visualized half-way between both fingertips.

Within each of the four conditions (2 INTERFACEs x 2 MAP SIZEs), participants needed to select targets at three TARGET DISTANCEs: SMALL DISTANCE, MEDIUM DISTANCE and LARGE DISTANCE. The order of TARGET DISTANCE was randomized within each condition, ensuring that each of the three TARGET DISTANCEs appeared exactly five times per condition. The target distances were defined relative to the length of the diagonal of each MAP SIZE, corresponding to 0.125 * map diagonal for SMALL DISTANCE, 0.25 * map diagonal for MEDIUM DISTANCE and 0.5 * map diagonal for LARGE DISTANCE. Hence, SMALL DISTANCE corresponded to 0.2 m on the SMALL MAP and 20 m on the LARGE MAP. MEDIUM DISTANCE corresponded to 0.4 m on the SMALL MAP and 40 m on the LARGE MAP. LARGE DISTANCE corresponded to 0.8 m on the SMALL MAP and 80 m on the LARGE MAP.
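As a quick sanity check, the target distances follow directly from the map diagonals (map sizes are those of the study; the rounding to the reported figures is ours):

    import math

    SMALL_MAP = (1.45, 0.69)      # width x height in metres
    LARGE_MAP = (144.71, 69.11)

    def target_distances(width, height):
        """Return (SMALL, MEDIUM, LARGE) distances as 0.125/0.25/0.5 of the map diagonal."""
        diagonal = math.hypot(width, height)
        return 0.125 * diagonal, 0.25 * diagonal, 0.5 * diagonal

    print(target_distances(*SMALL_MAP))  # approx. (0.20, 0.40, 0.80) metres
    print(target_distances(*LARGE_MAP))  # approx. (20.0, 40.1, 80.2) metres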

The conditions were blocked by MAP SIZE, i.e. the starting order of INTERFACE was counter-balanced within each MAP SIZE but both INTERFACEs were conducted sequentially for a single MAP SIZE. The order of MAP SIZE was also counterbalanced (e.g., participant i starting with SMALL MAP, participant i+1 starting with LARGE MAP).

4.2 Task

Participants were asked to conduct a target search and acquisition task. To this end, they had to identify a target on the map, navigate to the target and eventually select it.

In each condition, the participants were asked to find and click a total of 15 circular targets with a diameter of 10 mm on a map. This target size was scale-independent (i.e. the target was always displayed with a 10 mm diameter), but the target could only be selected at the largest map scale (zoomed in to the maximum scale level). The active target was highlighted in blue, while all other potential targets were colored red. In order to click the target, the participant had to navigate close enough so that the target was inside the touch screen bounds (and at scale level 1:1). When this was achieved, it was possible to select it by dwelling the finger on the real-world smartphone display at the target location for one second. This delay time was used to separate an accidental touch from an intended click. If the participant failed to hit the target at the first attempt, an error was logged. In order to progress to the next marker, the participant still had to eventually achieve a successful hit afterwards. After successfully selecting the target, a new target was highlighted. The new target was always selected to fit a certain distance to the previous target, respectively the origin for the first target. Those distances were separated into three categories: near, middle and far. Each target distance was randomly used 5 times, adding up to 15 total targets per condition. The task started at the maximum scale level.
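The dwell-based selection can be sketched as a small per-frame state machine (illustrative only; the one-second threshold is the one reported above, everything else is an assumption):

    import time

    DWELL_TIME_S = 1.0  # dwell duration separating accidental touches from intended clicks

    class DwellSelector:
        """Fires a selection once the finger has rested on the target for DWELL_TIME_S."""

        def __init__(self):
            self.touch_start = None

        def update(self, touching, on_target, now=None):
            """Call once per frame; returns True exactly in the frame the selection triggers."""
            now = time.monotonic() if now is None else now
            if not (touching and on_target):
                self.touch_start = None   # finger lifted or left the target: reset the timer
                return False
            if self.touch_start is None:
                self.touch_start = now    # a touch on the target just began
                return False
            if now - self.touch_start >= DWELL_TIME_S:
                self.touch_start = None   # trigger once, then require a new dwell
                return True
            return False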

4.3 Procedure

After welcoming, participants were asked to fill out a demographic questionnaire. This was followed by a short tutorial video, which gave initial information about the tasks to be fulfilled and the two implemented interaction techniques. Thereafter, we conducted a training phase for the first condition, in which participants clicked 6 targets, 2 of each distance type, to get used to the controls. This training phase was later repeated before each condition with the respective technique. After finishing this training phase, the actual test condition was carried out. After each condition, the participant filled out three questionnaires, the System Usability Scale SUS [14], the NASA Task Load Index TLX [42] and the Simulator Sickness Questionnaire SSQ [49]. The conditions were blocked by MAP SIZE. The starting order of the blocks (SMALL MAP and LARGE MAP) as well as the order of the INTERFACEs in each block were balanced across participants. After each block, the participants answered an additional questionnaire concerning their personal preferences related to both methods. After the first block, the participants had the possibility to take a 10 minute break before proceeding with the next block. Following both blocks and all questionnaires, the participants were asked to participate in a semi-structured interview, in which further feedback was collected. Finally, participants were thanked and compensated with a 10 Euro voucher.

The duration of each condition was strongly dependent on the speed of the participant, the map size and the navigation method used. Together with the questionnaires and the short interview, the overall average duration of the experiment was 60 minutes.

4.4 Apparatus

For capturing the touch input and sending it via WiFi to the main application on a PC, we used an Android application on an Amazon Fire Phone running Fire OS 4.6.3. The smartphone dimensions were 139.2 mm x 66.5 mm x 8.9 mm (width x height x depth), with a touchscreen of size 105 mm x 60 mm. For 3D tracking, we used an OptiTrack Prime 13 system, consisting of 8 cameras, of which 3 were Prime 13w wide field-of-view cameras, and Motive version 1.10.3. The main study application was running on a PC with Windows 10, an Intel Xeon E5-1650 CPU, an Nvidia GeForce GTX 1070 graphics card and 64 GB RAM. The VR headset used was the HTC Vive Pro. Both applications, the one on the smartphone as well as the one running on the PC, were created in Unity 2018.3.2f1. While the HTC Vive Pro and the fingertips of users could have been tracked with mobile sensing solutions [92], the HMD and fingers were equipped with retro-reflective markers for increased accuracy, see Figure 4.
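The paper does not specify the wire format between the phone and the PC application; purely as an illustration, touch samples could be streamed as small UDP datagrams like this (address, port, and message fields are hypothetical):

    import json
    import socket

    PC_ADDRESS = ("192.168.0.10", 9000)  # hypothetical address of the study PC

    def send_touch_sample(sock, x_norm, y_norm, touching):
        """Send one touch sample with coordinates normalized to the 105 mm x 60 mm screen."""
        payload = json.dumps({"x": x_norm, "y": y_norm, "down": touching})
        sock.sendto(payload.encode("utf-8"), PC_ADDRESS)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_touch_sample(sock, 0.5, 0.5, True)  # e.g. a touch at the center of the screen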

4.5 Participants

We recruited 20 participants from a university campus with diverse study backgrounds. All participants were familiar with touch-sensitive screens. Of the 20 participants (7 female, 13 male, mean age 24.4 years, sd = 2.973; mean height 175.8 cm, sd = 9.463), 15 indicated prior Virtual Reality experience: 5 of those only once, 3 rarely but more than once, 5 occasionally and 2 often. Three participants indicated they do not play video games, 1 participant only once, 4 rarely, 6 occasionally, 3 frequently and 3 very frequently. Five participants wore contact lenses or glasses with corrected-to-normal vision. Nineteen participants were right-handed while 1 was left-handed. Nineteen participants used their right hand to conduct the task and 1 used the left hand. For zooming in the 2D BASELINE interface, 15 participants used their thumb together with their index finger and five participants used their index finger in combination with their middle finger.

Figure 3: Left: Target acquisition time per target in seconds with significant differences between all conditions. Right: Number of errors with no significant differences between any condition. 2D: BASELINE, 3D: 3D NAVIGATION, SM: SMALL MAP, LM: LARGE MAP. The boxes encompass the interquartile range, the whiskers extend to 1.5 times the interquartile range, respectively the minimum and maximum data points, if those do not exceed the interquartile range. The solid vertical lines indicate the median.

Figure 4: Left: Study location with OptiTrack Prime 13 and a user wearing the HTC Vive Pro and interacting with the smartphone. The monitor shows the user's view of the scene. Top right: user pinches to zoom in condition BASELINE. Bottom right: user zooms in condition 3D NAVIGATION.

4.6 Results

Statistical significance tests for log-transformed target acquisition time were carried out using general linear model repeated measures analysis of variance (RM-ANOVA) with Holm-Bonferroni adjustments for multiple comparisons at an initial significance level α = 0.05. We indicate effect sizes whenever feasible (ηp²). We had to exclude performance data (target acquisition time and errors) for one participant due to logging errors.

For subjective feedback, or data that did not follow a normal distribution or could not be transformed to a normal distribution using the log-transform (errors), we employed the Aligned Rank Transform before applying RM-ANOVA.

The results in the following sections can be summarized as follows: Participants acquired targets significantly faster with 3D NAVIGATION compared to BASELINE. 3D NAVIGATION also resulted in significantly higher SUS scores, significantly lower demand ratings (as indicated by NASA TLX) and was preferred by more participants. No significant differences for the number of errors or for simulator sickness ratings were detected between conditions.

4.6.1 Target Acquisition Time

The times to acquire individual targets averaged over all target distances are depicted in Figure 3, left (for individual distances, see Appendix). An omnibus test revealed significant main effects of INTERFACE (F1,18 = 59.91, p < .001, ηp² = 0.77, 1−β = 1.0), and (as expected) of MAP SIZE (F1,18 = 199.05, p < .001, ηp² = 0.92, 1−β = 1.0) and TARGET DISTANCE (F2,36 = 45.83, p < .001, ηp² = 0.72, 1−β = 1.0).

As expected, significant interactions were indicated between INTERFACE and MAP SIZE (F1,36 = 29.61, p < 0.001, ηp² = 0.622, 1−β = 0.99) and between MAP SIZE and TARGET DISTANCE (F2,36 = 11.08, p < 0.001, ηp² = 0.38, 1−β = 0.99), but not between INTERFACE and TARGET DISTANCE (F2,36 = 3.06, p = 0.059, ηp² = 0.15, 1−β = 0.56). We did not observe any asymmetrical effects [82].

Holm-Bonferroni adjusted post-hoc testing revealed that there were significant differences between each level for target distance (p < 0.001, as expected), map size (p < 0.001, as expected) and between BASELINE (mean target acquisition time over all map sizes and target distances: 59.62 seconds, sd = 23.95) and 3D NAVIGATION (mean target acquisition time over all map sizes and target distances: 33.52 seconds, sd = 8.10) (p < 0.001).

Also, Holm-Bonferroni adjusted pairwise t-tests between BASELINE and 3D NAVIGATION for each combination of MAP SIZE and TARGET DISTANCE indicated that 3D NAVIGATION resulted in significantly faster target acquisition times compared to BASELINE.

Similar results were obtained when excluding one participant whose target acquisition times were substantially larger than the times of the other participants (depicted as an outlier in Figure 3, left). These results are omitted for brevity.

In other words, participants acquired targets significantly faster with 3D NAVIGATION compared to BASELINE.

4.6.2 Errors

We looked at two different error types. First, we looked at the number of wrongly selected targets. No target selection error was made in any condition. Second, we counted the occasions when a target was not hit at first touch down (i.e. the finger was dwelling outside of the target circle for more than a second), see Figure 3, right. No significant main effects or interactions were indicated.

In other words, no significant differences between 3D NAVIGATION and BASELINE were detected.

4.6.3 Workload

The descriptive statistics for workload (as measured by the unweighted NASA TLX [42]) are depicted in Table 1.

Table 1: Mean and standard deviation (in parentheses) for the NASA TLX dimensions. 2D: BASELINE, 3D: 3D NAVIGATION, SM: SMALL MAP, LM: LARGE MAP. Bold headings indicate dimensions with significant differences.

                    2D-SM          3D-SM          2D-LM          3D-LM
Mental Demand       38.50 (26.01)  28.25 (16.33)  41.75 (23.64)  30.75 (19.89)
Physical Demand     26.00 (21.56)  19.50 (17.24)  32.75 (25.73)  18.50 (16.47)
Temporal Demand     41.00 (25.00)  31.50 (19.54)  55.50 (30.78)  30.25 (21.30)
Performance         37.75 (23.98)  32.25 (16.74)  58.25 (23.64)  25.25 (19.50)
Effort              35.75 (25.41)  25.25 (18.03)  49.50 (23.84)  24.75 (21.30)
Frustration         31.00 (20.81)  17.75 (12.41)  40.00 (19.53)  15.25 (12.08)
Overall Demand      35.00 (16.96)  25.75 (10.88)  46.29 (14.80)  24.13 (12.38)

Significant main effects for INTERFACE were indicated for mental demand (F1,19 = 4.71, p = 0.043, ηp² = 0.20), physical demand (F1,19 = 6.42, p = 0.02, ηp² = 0.25), temporal demand (F1,19 = 22.82, p < 0.01, ηp² = 0.55), performance (F1,19 = 39.422, p < 0.01, ηp² = 0.67), effort (F1,19 = 18.30, p < 0.01, ηp² = 0.49), frustration (F1,19 = 66.18, p < 0.01, ηp² = 0.78) and overall demand (F1,19 = 39.95, p < 0.01, ηp² = 0.68). For MAP SIZE a significant main effect was indicated for overall demand (F1,19 = 5.44, p = 0.031, ηp² = 0.22). No further main effects were indicated.

Significant interactions were indicated for performance (F1,19 = 22.79, p < 0.01, ηp² = 0.55), effort (F1,19 = 7.78, p = 0.01, ηp² = 0.29), frustration (F1,19 = 4.89, p = 0.039, ηp² = 0.20) and overall demand (F1,19 = 11.77, p < 0.01, ηp² = 0.38).

In other words, 3D NAVIGATION led to a significantly lower workload compared to BASELINE.

Figure 5: Ratings for the system usability scale SUS. 2D: BASELINE, 3D: 3D NAVIGATION, SM: SMALL MAP, LM: LARGE MAP.

4.6.4 Usability

Results from the SUS questionnaire [14] are depicted in Figure 5. All conditions but BASELINE-LARGE MAP resulted in above-average SUS ratings (> 68), with BASELINE-LARGE MAP resulting in a mean score of 66.

A significant main effect for INTERFACE was indicated for the SUS score (F1,19 = 28.14, p < 0.001, ηp² = 0.60). A significant interaction between INTERFACE and MAP SIZE was also indicated (F1,19 = 6.09, p = 0.02, ηp² = 0.24).

In other words, 3D NAVIGATION resulted in significantly higher usability scores than BASELINE.

Figure 6: Ratings for the simulator sickness questionnaire SSQ. D: Disorientation, N: Nausea, O: Oculo-Motor, TS: Total Severity, 2D: BASELINE, 3D: 3D NAVIGATION, SM: SMALL MAP, LM: LARGE MAP.

4.6.5 Simulator Sickness

Results from the simulator sickness questionnaire SSQ [49] are depicted in Figure 6. Omnibus tests revealed no statistically significant main effects or interactions for any SSQ dimension.

In other words, no significant difference for simulator sickness could be indicated between 3D NAVIGATION and BASELINE.

4.6.6 Preferences and Open Comments

When asked to rank the interaction techniques, all 20 participants preferred 3D NAVIGATION for LARGE MAP. For SMALL MAP, 15 participants preferred 3D NAVIGATION and 5 preferred BASELINE.

As benefits of 3D NAVIGATION, 8 participants mentioned it to be faster, with one saying "I have the feeling that it is faster" and two mentioning it was also fast to learn. Three participants mentioned that they felt it enabled more precise navigation. Five participants mentioned it was easier to learn and "more fluid", with one mentioning "With 2D [BASELINE] I need more steps to zoom in or out. 3D [3D NAVIGATION] just feels better" and four mentioning that using only a single finger for navigation felt more intuitive than the two-finger pinch gesture.

Regarding drawbacks of the 3D NAVIGATION technique, one participant mentioned that "the fast zooming gave me slight dizziness on the large map, but all in all it's better, because it is faster". Another participant mentioned "I was losing the feeling on zoom range on single colored map parts" and 2 participants mentioned that they accidentally touched the screen, which led to an interruption of the zoom process.

Regarding benefits of BASELINE, one participant mentioned that this technique helped better in the final stage of the target acquisition phase by stating "For me it was more precise because the frame is stable in the last step of target acquisition". Two participants mentioned that they liked that a hovering finger does not invoke zooming. Three participants mentioned that they liked the technique because they were used to it and three mentioned that they felt the technique was fast enough on the small map. In contrast, two participants also mentioned that the technique was too slow for navigating on the large map as well as too imprecise. Another participant mentioned "my fingertips start to hurt if I touch the displays or things [in BASELINE] for too long, 3D on the other hand feels nice".

We further noticed differences regarding the amount of zooming used by the participants in the different conditions. Overall, the participants zoomed less on SMALL MAP for both interfaces. On a normalized scale, where 0 equals the minimum zoom scale (i.e. the whole map is visible within the smartphone boundaries) and 1 equals the maximum scale, the average scale across all participants was 0.425 for 2D Navigation and 0.357 for 3D Navigation on SMALL MAP, compared to 0.065 and 0.064 for 2D Navigation and 3D Navigation on LARGE MAP (please remember that participants started the task at the maximum scale 1 and had to be at that scale in order to select a target). In 70 instances for BASELINE and 30 instances for 3D NAVIGATION on SMALL MAP, the participants found and clicked the highlighted target without zooming out at all. However, this was never the case in either condition on LARGE MAP. Those differences can be attributed to the fact that on SMALL MAP, a larger proportion of the map was already in the participants' field of view when zoomed in.

5 DISCUSSION AND LIMITATIONS

Our study indicated that the proposed relative, rate-controlled technique outperformed the BASELINE technique in terms of task completion time on both MAP SIZEs. No differences in terms of errors were detected. While we assumed that differences between the conditions could potentially appear for simulator sickness due to the faster scale change in 3D NAVIGATION, this assumption was not confirmed by the SSQ ratings.

The indications from workload and usability ratings were more nuanced. While for workload, 3D NAVIGATION resulted in significantly lower frustration and overall demand for both MAP SIZEs, for the TLX dimensions temporal demand, performance and effort significant differences were solely indicated for the LARGE MAP. No significant differences were indicated for mental or physical demand. One potential explanation for this might be that while holding the finger in mid-air (even with the palm lying on a resting surface) can induce fatigue, so can the repetitive use of pinch-to-zoom gestures.

Also, regarding usability ratings, 3D NAVIGATION was only rated significantly higher for the LARGE MAP. These quantitative results are also echoed by the qualitative feedback. While for LARGE MAP all participants preferred 3D NAVIGATION, for SMALL MAP, five out of 20 preferred the BASELINE condition. This is also reflected by one participant stating "2D [baseline] on the small map is manageable but on the big map it is too much trouble".

With these findings in mind, we suggest that the proposed relative, rate-controlled navigation technique can serve as a viable option for navigating multiscale planar information spaces on and above touchscreens.

As a limitation, we note that in the experiment the smartphone was lying on a surface, which also functioned as a resting surface for both interaction techniques. While this represents well the intended target usage in physically constrained spaces such as an airplane seat with a tray, other use cases, such as free-hand usage or usage while walking, could result in other findings, specifically regarding the expected fatigue of the techniques. Further, in such scenarios, approaches that utilize the touchscreen pose for spatially navigating an information space (such as the work by Spindler et al. [100]) could be used and potentially outperform the proposed approach. Also, the results might change if other form factors such as tablets are employed. Further, while we took care to select a representative range of information space sizes, the results might change when investigating further scales. Finally, one could argue that given spatial finger tracking on modern VR headsets, there is no need for a supporting touchscreen at all. While this is true for spatial interaction, and, eventually, for sliding motions on physical surfaces such as tables, selecting items through nuanced touches on the surface without touchscreens (or other contact-based sensors) will remain a challenge for the foreseeable future. Not using a touchscreen also strips away further potential interaction possibilities (which we did not investigate in this work) that could arise due to the tangible nature of smartphones and tablets [107].

6 CONCLUSION

The combination of physical touchscreen devices such as smartphones and tablets with Virtual Reality headsets enables unique interaction possibilities such as the exploration of large information spaces.

In this context, a controlled laboratory study with 20 participants indicated that a relative, rate-controlled multiscale navigation technique resulted in significantly better performance and user preference compared to pinch-to-zoom and drag-to-pan when navigating planar information spaces.

In future work, we plan to investigate the technique in further usage scenarios (such as when standing or walking) and to explore the rich interaction space that opens up when combining traditional computing devices such as smartphones, tablets or notebooks with VR headsets capable of spatial tracking of the HMD, traditional input devices such as smartphones, and the user's hands.

REFERENCES

[1] P. Abtahi, M. Gonzalez-Franco, E. Ofek, and A. Steed. I'm a giant: Walking in large virtual environments at high speed gains. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, p. 522. ACM, 2019.

[2] C. Appert and J.-D. Fekete. Orthozoom scroller: 1d multi-scale navigation. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 21–30. ACM, 2006.

[3] F. Argelaguet and C. Andujar. Efficient 3d pointing selection in cluttered virtual environments. IEEE Computer Graphics and Applications, 29(6):34–43, 2009.

[4] F. Argelaguet and C. Andujar. A survey of 3d object selection techniques for virtual environments. Computers & Graphics, 37(3):121–136, 2013.

[5] R. Arora, R. Habib Kazi, T. Grossman, G. Fitzmaurice, and K. Singh. Symbiosissketch: Combining 2d & 3d sketching for designing detailed 3d objects in situ. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, p. 185. ACM, 2018.

[6] C. R. Austin, B. Ens, K. A. Satriadi, and B. Jenny. Elicitation study investigating hand and foot gesture interaction for immersive maps in augmented reality. Cartography and Geographic Information Science, pp. 1–15, 2020.

[7] D. Avrahami, J. O. Wobbrock, and S. Izadi. Portico: tangible interaction on and around a tablet. In Proceedings of the 24th annual ACM symposium on User interface software and technology, pp. 347–356. ACM, 2011.

[8] T. Babic, H. Reiterer, and M. Haller. Pocket6: A 6dof controller based on a simple smartphone application. In SUI'18: 6th ACM Symposium on Spatial User Interaction, pp. 2–10, 2018.

[9] M. D. Barrera Machuca and W. Stuerzlinger. The effect of stereo display deficiencies on virtual hand pointing. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, p. 207. ACM, 2019.

[10] B. B. Bederson and J. D. Hollan. Pad++: a zooming graphical interface for exploring alternate interface physics. In Proceedings of the 7th annual ACM symposium on User interface software and technology, pp. 17–26. ACM, 1994.

[11] S. Bhowmick and K. Sorathia. Explorations on body-gesture based object selection on hmd based vr interfaces for dense and occluded dense virtual environments. 2018.

[12] E. A. Bier, M. C. Stone, K. Pier, W. Buxton, and T. D. DeRose. Toolglass and magic lenses: the see-through interface. In Proceedings of the 20th annual conference on Computer graphics and interactive techniques, pp. 73–80. ACM, 1993.

[13] D. A. Bowman and C. A. Wingrave. Design and evaluation of menu systems for immersive virtual environments. In Proceedings IEEE Virtual Reality 2001, pp. 149–156. IEEE, 2001.

[14] J. Brooke et al. SUS: a quick and dirty usability scale. Usability Evaluation in Industry, 189(194):4–7, 1996.

[15] L. D. Brown and H. Hua. Magic lenses for augmented virtual environments. IEEE Computer Graphics and Applications, 26(4):64–73, 2006.

[16] G. Bruder, F. Steinicke, and W. Stuerzlinger. Touching the void revisited: Analyses of touch behavior on and above tabletop surfaces. In IFIP Conference on Human-Computer Interaction, pp. 278–296. Springer, 2013.

[17] G. Bruder, F. Steinicke, and W. Sturzlinger. To touch or not to touch?: comparing 2d touch and 3d mid-air interaction on stereoscopic tabletop surfaces. In Proceedings of the 1st symposium on Spatial user interaction, pp. 9–16. ACM, 2013.

[18] F. Brudy, C. Holz, R. Radle, C.-J. Wu, S. Houben, C. N. Klokmose, and N. Marquardt. Cross-device taxonomy: Survey, opportunities and challenges of interactions spanning across multiple devices. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, p. 562. ACM, 2019.

[19] A. Butler, S. Izadi, and S. Hodges. Sidesight: Multi-"touch" interaction around small devices. In Proceedings of the 21st Annual ACM Symposium on User Interface Software and Technology, UIST '08, pp. 201–204. ACM, New York, NY, USA, 2008. doi: 10.1145/1449715.1449746

[20] L. K. Chan and H. Y. Lau. Magicpad: the projection based 3d user interface. International Journal on Interactive Design and Manufacturing (IJIDeM), 6(2):75–81, 2012.

[21] L.-W. Chan, H.-S. Kao, M. Y. Chen, M.-S. Lee, J. Hsu, and Y.-P. Hung. Touching the void: direct-touch interaction for intangible displays. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2625–2634. ACM, 2010.

[22] X. Chen, T. Grossman, D. J. Wigdor, and G. Fitzmaurice. Duet: exploring joint interactions on a smart phone and a smart watch. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 159–168. ACM, 2014.

[23] X. Chen, J. Schwarz, C. Harrison, J. Mankoff, and S. E. Hudson. Air+ touch: interweaving touch & in-air gestures. In Proceedings of the 27th annual ACM symposium on User interface software and technology, pp. 519–525. ACM, 2014.

[24] A. Cockburn, A. Karlson, and B. B. Bederson. A review of overview+detail, zooming, and focus+context interfaces. ACM Computing Surveys (CSUR), 41(1):2, 2009.

[25] A. De Luca, E. von Zezschwitz, N. D. H. Nguyen, M.-E. Maurer, E. Rubegni, M. P. Scipioni, and M. Langheinrich. Back-of-device authentication on smartphones. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI '13, pp. 2389–2398. ACM, New York, NY, USA, 2013. doi: 10.1145/2470654.2481330

[26] T. Dorta, G. Kinayoglu, and M. Hoffmann. Hyve-3d and the 3d cursor: Architectural co-design with freedom in virtual reality. International Journal of Architectural Computing, 14(2):87–102, 2016.

[27] A. Drachen, P. Mirza-Babaei, and L. E. Nacke. Games user research. Oxford University Press, 2018.

[28] A. Olwal and S. Feiner. The flexible pointer: An interaction technique for selection in augmented and virtual reality. In Proc. UIST'03, pp. 81–82, 2003.

[29] C. Fleury, A. Chauffaut, T. Duval, V. Gouranton, and B. Arnaldi. A generic model for embedding users' physical workspaces into multi-scale collaborative virtual environments. 2010.

[30] E. Freeman, S. Brewster, and V. Lantz. Towards usable and acceptable above-device interactions. In Proceedings of the 16th International Conference on Human-computer Interaction with Mobile Devices & Services, MobileHCI '14, pp. 459–464. ACM, New York, NY, USA, 2014. doi: 10.1145/2628363.2634215

[31] D. Gasques, J. G. Johnson, T. Sharkey, and N. Weibel. Pintar: Sketching spatial experiences in augmented reality. In Companion Publication of the 2019 on Designing Interactive Systems Conference 2019 Companion, pp. 17–20. ACM, 2019.

[32] J. G. Grandi, H. G. Debarba, L. Nedel, and A. Maciel. Design and evaluation of a handheld-based 3d user interface for collaborative object manipulation. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 5881–5891. ACM, 2017.

[33] J. Grubert, M. Kranz, and A. Quigley. Challenges in mobile multi-device ecosystems. mUX: The Journal of Mobile User Experience, 5(1):5, 2016.

[34] J. Grubert, E. Ofek, M. Pahud, M. Kranz, and D. Schmalstieg.Glasshands: Interaction around unmodified mobile devices usingsunglasses. In Proceedings of the 2016 ACM International Confer-ence on Interactive Surfaces and Spaces, pp. 215–224. ACM, 2016.

[35] J. Grubert, E. Ofek, M. Pahud, P. O. Kristensson, F. Steinicke, andC. Sandor. The office of the future: Virtual, portable, and global.IEEE computer graphics and applications, 38(6):125–133, 2018.

[36] J. Grubert, M. Pahud, R. Grasset, D. Schmalstieg, and H. Seichter.The utility of magic lens interfaces on handheld devices for touristicmap navigation. Pervasive and Mobile Computing, 18:88–103, 2015.

[37] Y. Guiard and M. Beaudouin-Lafon. Target acquisition in multiscaleelectronic worlds. International Journal of Human-Computer Studies,61(6):875–905, 2004.

[38] Y. Guiard, M. Beaudouin-Lafon, J. Bastin, D. Pasveer, and S. Zhai.View size and pointing difficulty in multi-scale navigation. In Pro-ceedings of the working conference on Advanced visual interfaces, pp.117–124. ACM, 2004.

[39] Y. Guiard, M. Beaudouin-Lafon, and D. Mottet. Navigation as multi-scale pointing: extending fitts’ model to very high precision tasks. InProceedings of the SIGCHI conference on Human Factors in Comput-ing Systems, pp. 450–457. ACM, 1999.

[40] T. Han, K. Hasan, K. Nakamura, R. Gomez, and P. Irani. Soundcraft:Enabling spatial interactions on smartwatches using hand generatedacoustics. In Proceedings of the 30th Annual ACM Symposium onUser Interface Software and Technology, pp. 579–591. ACM, 2017.

[41] C. Harrison. Appropriated interaction surfaces. Computer, 43(6):86–89, 2010.

[42] S. G. Hart and L. E. Staveland. Development of nasa-tlx (task load index): Results of empirical and theoretical research. In Advances in Psychology, vol. 52, pp. 139–183. Elsevier, 1988.

[43] K. Hasan, D. Ahlstrom, and P. P. Irani. Comparing direct off-screen pointing, peephole, and flick & pinch interaction for map navigation. In Proceedings of the 3rd ACM Symposium on Spatial User Interaction, pp. 99–102, 2015.

[44] O. Hilliges, S. Izadi, A. D. Wilson, S. Hodges, A. Garcia-Mendoza, and A. Butz. Interactions in the air: adding further depth to interactive tabletops. In Proceedings of the 22nd annual ACM symposium on User interface software and technology, pp. 139–148. ACM, 2009.

[45] T. Hirzle, J. Gugenheimer, F. Geiselhart, A. Bulling, and E. Rukzio. A design space for gaze interaction on head-mounted displays. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, p. 625. ACM, 2019.

[46] K. Huo, K. Ramani, et al. Window-shaping: 3d design ideation by creating on, borrowing from, and looking at the physical world. In Proceedings of the Eleventh International Conference on Tangible, Embedded, and Embodied Interaction, pp. 37–45. ACM, 2017.

[47] B. Jones, R. Sodhi, D. Forsyth, B. Bailey, and G. Maciocci. Around device interaction for multiscale navigation. In Proceedings of the 14th international conference on Human-computer interaction with mobile devices and services, pp. 83–92. ACM, 2012.

[48] N. Katzakis, R. J. Teather, K. Kiyokawa, and H. Takemura. Inspect: extending plane-casting for 6-dof control. Human-centric Computing and Information Sciences, 5(1):22, 2015.

[49] R. S. Kennedy, N. E. Lane, K. S. Berbaum, and M. G. Lilienthal. Simulator sickness questionnaire: An enhanced method for quantifying simulator sickness. The International Journal of Aviation Psychology, 3(3):203–220, 1993.

[50] U. Kister, K. Klamka, C. Tominski, and R. Dachselt. Grasp: Combining spatially-aware mobile devices and a display wall for graph visualization and interaction. In Computer Graphics Forum, vol. 36, pp. 503–514. Wiley Online Library, 2017.

[51] R. Kopper, T. Ni, D. A. Bowman, and M. Pinho. Design and evaluation of navigation techniques for multiscale virtual environments. In IEEE Virtual Reality Conference (VR 2006), pp. 175–182. IEEE, 2006.

[52] S. Kratz, I. Brodien, and M. Rohs. Semi-automatic zooming for mobile map navigation. In Proceedings of the 12th international conference on Human computer interaction with mobile devices and services, pp. 63–72, 2010.

[53] S. Kratz and M. Rohs. Hoverflow: exploring around-device interaction with ir distance sensors. In Proceedings of the 11th International Conference on Human-Computer Interaction with Mobile Devices and Services, p. 42. ACM, 2009.

[54] S. Kratz, M. Rohs, D. Guse, J. Muller, G. Bailly, and M. Nischt. Palmspace: continuous around-device gestures vs. multitouch for 3d rotation tasks on mobile devices. In Proceedings of the International Working Conference on Advanced Visual Interfaces, pp. 181–188. ACM, 2012.

[55] M. Kyto, B. Ens, T. Piumsomboon, G. A. Lee, and M. Billinghurst. Pinpointing: Precise head- and eye-based target selection for augmented reality. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, p. 81. ACM, 2018.

[56] E. Langbehn, G. Bruder, and F. Steinicke. Scale matters! analysis of dominant scale estimation in the presence of conflicting cues in multi-scale collaborative virtual environments. In 2016 IEEE Symposium on 3D User Interfaces (3DUI), pp. 211–220. IEEE, 2016.

[57] J. J. LaViola Jr, D. A. Feliz, D. F. Keefe, R. C. Zeleznik, et al. Hands-free multi-scale navigation in virtual environments. SI3D, 1:9–15, 2001.

[58] J. J. LaViola Jr, E. Kruijff, R. P. McMahan, D. Bowman, and I. P. Poupyrev. 3D user interfaces: theory and practice. Addison-Wesley Professional, 2017.

[59] M. Le Chenechal, J. Lacoche, J. Royan, T. Duval, V. Gouranton, and B. Arnaldi. When the giant meets the ant: an asymmetric approach for collaborative and concurrent object manipulation in a multi-scale environment. In 2016 IEEE Third VR International Workshop on Collaborative Virtual Environments (3DCVE), pp. 18–22. IEEE, 2016.

[60] S.-w. Leigh, P. Schoessler, F. Heibeck, P. Maes, and H. Ishii. Thaw: tangible interaction with see-through augmentation for smartphones on computer screens. In Proceedings of the Ninth International Conference on Tangible, Embedded, and Embodied Interaction, pp. 89–96. ACM, 2015.

[61] H.-N. Liang, C. Williams, M. Semegen, W. Stuerzlinger, and P. Irani. An investigation of suitable interactions for 3d manipulation of distant objects through a mobile device. International Journal of Innovative Computing, Information and Control, 9(12):4737–4752, 2013.

[62] J. Looser, R. Grasset, and M. Billinghurst. A 3d flexible and tangible magic lens in augmented reality. In 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality, pp. 51–54. IEEE, 2007.

[63] D. Lopez, L. Oehlberg, C. Doger, and T. Isenberg. Towards an understanding of mobile touch navigation in a stereoscopic viewing environment for 3d data exploration. IEEE Transactions on Visualization and Computer Graphics, 22(5):1616–1629, 2015.

[64] P. Lubos, G. Bruder, and F. Steinicke. Analysis of direct selection in head-mounted display environments. In 2014 IEEE Symposium on 3D User Interfaces (3DUI), pp. 11–18. IEEE, 2014.

[65] A. Marzo, B. Bossavit, and M. Hachet. Combining multi-touch input and device movement for 3d manipulations in mobile augmented reality environments. In Proceedings of the 2nd ACM symposium on Spatial user interaction, pp. 13–16. ACM, 2014.

[66] S. Mayer, V. Schwind, R. Schweigert, and N. Henze. The effect of offset correction and cursor on mid-air pointing in real and virtual environments. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, p. 653. ACM, 2018.

[67] D. Mendes, F. M. Caputo, A. Giachetti, A. Ferreira, and J. Jorge. A survey on 3d virtual object manipulation: From the desktop to immersive virtual environments. In Computer Graphics Forum, vol. 38, pp. 21–45. Wiley Online Library, 2019.

[68] T. Menzner, A. Otte, T. Gesslein, J. Grubert, P. Gagel, and D. Schneider. A capacitive-sensing physical keyboard for vr text entry. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 1080–1081. IEEE, 2019.

[69] A. Mossel, B. Venditti, and H. Kaufmann. 3dtouch and homer-s: intuitive manipulation techniques for one-handed handheld augmented reality. In Proceedings of the Virtual Reality International Conference: Laval Virtual, p. 12. ACM, 2013.

[70] J. Muller, G. Bailly, T. Bossuyt, and N. Hillgren. Mirrortouch: combining touch and mid-air gestures for public displays. In Proceedings of the 16th international conference on Human-computer interaction with mobile devices & services, pp. 319–328. ACM, 2014.

[71] R. Nandakumar, V. Iyer, D. Tan, and S. Gollakota. Fingerio: Using active sonar for fine-grained finger tracking. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 1515–1525. ACM, 2016.

[72] I. Oakley and D. Lee. Interaction on the edge: Offset sensing for small devices. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI ’14, pp. 169–178. ACM, New York, NY, USA, 2014. doi: 10.1145/2556288.2557138

[73] J.-y. Oh and H. Hua. User evaluations on form factors of tangible magic lenses. In 2006 IEEE/ACM International Symposium on Mixed and Augmented Reality, pp. 23–32. IEEE, 2006.

[74] A. Otte, T. Menzner, T. Gesslein, P. Gagel, D. Schneider, and J. Grubert. Towards utilizing touch-sensitive physical keyboards for text entry in virtual reality. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 1729–1732. IEEE, 2019.

[75] A. Otte, D. Schneider, T. Menzner, T. Gesslein, P. Gagel, and J. Grubert. Evaluating text entry in virtual reality using a touch-sensitive physical keyboard. In 2019 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), pp. 387–392. IEEE, 2019.

[76] M. Pahud, K. Hinckley, S. Iqbal, A. Sellen, and B. Buxton. Toward compound navigation tasks on mobiles via spatial manipulation. In Proceedings of the 15th international conference on Human-computer interaction with mobile devices and services, pp. 113–122. ACM, 2013.

[77] S. Pelurson and L. Nigay. Bimanual input for multiscale navigation with pressure and touch gestures. In Proceedings of the 18th ACM International Conference on Multimodal Interaction, pp. 145–152. ACM, 2016.

[78] K. Perlin and D. Fox. Pad: An alternative approach to the computer interface. In Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH ’93, pp. 57–64. ACM, New York, NY, USA, 1993. doi: 10.1145/166117.166125

[79] K. Pfeuffer, B. Mayer, D. Mardanbegi, and H. Gellersen. Gaze + pinch interaction in virtual reality. In Proceedings of the 5th Symposium on Spatial User Interaction, pp. 99–108. ACM, 2017.

[80] T. Piumsomboon, G. A. Lee, B. Ens, B. H. Thomas, and M. Billinghurst. Superman vs giant: a study on spatial perception for a multi-scale mixed reality flying telepresence interface. IEEE Transactions on Visualization and Computer Graphics, 24(11):2974–2982, 2018.

[81] T. Piumsomboon, G. A. Lee, A. Irlitti, B. Ens, B. H. Thomas, and M. Billinghurst. On the shoulder of the giant: A multi-scale mixed reality collaboration with 360 video sharing and tangible interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, p. 228. ACM, 2019.

[82] E. Poulton and P. Freeman. Unwanted asymmetrical transfer effects with balanced experimental designs. Psychological Bulletin, 66(1):1, 1966.

[83] D. Ramanujan, C. Piya, K. Ramani, et al. Mobisweep: Exploring spatial design ideation using a smartphone as a hand-held reference plane. In Proceedings of the TEI ’16: Tenth International Conference on Tangible, Embedded, and Embodied Interaction, pp. 12–20. ACM, 2016.

[84] R. Raskar, G. Welch, M. Cutts, A. Lake, L. Stesin, and H. Fuchs. The office of the future: A unified approach to image-based modeling and spatially immersive displays. In Proceedings of the 25th annual conference on Computer graphics and interactive techniques, pp. 179–188. ACM, 1998.

[85] J. Rekimoto and M. Saitoh. Augmented surfaces: a spatially continuous work space for hybrid computing environments. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems, pp. 378–385. ACM, 1999.

[86] K. Ryu, J.-J. Lee, and J.-M. Park. Gg interaction: a gaze–grasp pose interaction for 3d virtual object selection. Journal on Multimodal User Interfaces, pp. 1–11, 2019.

[87] K. A. Satriadi, B. Ens, M. Cordeil, B. Jenny, T. Czauderna, and W. Willett. Augmented reality map navigation with freehand gestures. In 2019 IEEE Conference on Virtual Reality and 3D User Interfaces (VR), pp. 593–603. IEEE, 2019.

[88] D. Schmalstieg, L. M. Encarnacao, and Z. Szalavari. Using transparent props for interaction with the virtual table. SI3D, 99:147–153, 1999.

[89] D. Schmidt, J. Seifert, E. Rukzio, and H. Gellersen. A cross-device interaction style for mobiles and surfaces. In Proceedings of the Designing Interactive Systems Conference, DIS ’12, pp. 318–327. ACM, New York, NY, USA, 2012. doi: 10.1145/2317956.2318005

[90] D. Schneider and J. Grubert. Towards around-device interaction using corneal imaging. In Proceedings of the 2017 ACM International Conference on Interactive Surfaces and Spaces, pp. 287–293. ACM, 2017.

[91] D. Schneider, A. Otte, T. Gesslein, P. Gagel, B. Kuth, M. S. Damlakhi, O. Dietz, E. Ofek, M. Pahud, P. O. Kristensson, and J. Grubert. Reconviguration: Reconfiguring physical keyboards in virtual reality. IEEE Transactions on Visualization and Computer Graphics, 25(11):3190–3201, 2019.

[92] D. Schneider, A. Otte, A. S. Kublin, A. Martschenko, P. O. Kristensson, E. Ofek, M. Pahud, and J. Grubert. Accuracy of commodity finger tracking systems for virtual reality head-mounted displays. In 2020 IEEE Conference on Virtual Reality and 3D User Interfaces (VR). IEEE, 2020.

[93] R. Schweigert, V. Schwind, and S. Mayer. Eyepointing: A gaze-based selection technique. In Proceedings of Mensch und Computer 2019, pp. 719–723. ACM, 2019.

[94] V. Schwind, S. Mayer, A. Comeau-Vermeersch, R. Schweigert, and N. Henze. Up to the finger tip: The effect of avatars on mid-air pointing accuracy in virtual reality. In Proceedings of the 2018 Annual Symposium on Computer-Human Interaction in Play, pp. 477–488. ACM, 2018.

[95] R. S. Sodhi, B. R. Jones, D. Forsyth, B. P. Bailey, and G. Maciocci. Bethere: 3d mobile collaboration with spatial input. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 179–188. ACM, 2013.

[96] J. Song, G. Soros, F. Pece, S. R. Fanello, S. Izadi, C. Keskin, and O. Hilliges. In-air gestures around unmodified mobile devices. In Proceedings of the 27th annual ACM symposium on User interface software and technology, pp. 319–329. ACM, 2014.

[97] J. Song, G. Soros, F. Pece, and O. Hilliges. Real-time hand gesture recognition on unmodified wearable devices. In IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[98] P. Song, W. B. Goh, C.-W. Fu, Q. Meng, and P.-A. Heng. Wysiwyf: exploring and annotating volume data with a tangible handheld device. In Proceedings of the SIGCHI conference on human factors in computing systems, pp. 1333–1342. ACM, 2011.

[99] M. Spindler and R. Dachselt. Paperlens: advanced magic lens interaction above the tabletop. In Proceedings of the ACM International Conference on Interactive Tabletops and Surfaces, p. 7. ACM, 2009.

[100] M. Spindler, M. Schuessler, M. Martsch, and R. Dachselt. Pinch-drag-flick vs. spatial input: rethinking zoom & pan on mobile displays. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1113–1122. ACM, 2014.

[101] M. Spindler, C. Tominski, H. Schumann, and R. Dachselt. Tangible views for information visualization. In ACM International Conference on Interactive Tabletops and Surfaces, pp. 157–166. ACM, 2010.

[102] A. Steed and C. Parker. 3d selection strategies for head tracked and non-head tracked operation of spatially immersive displays. In 8th International Immersive Projection Technology Workshop, pp. 13–14, 2004.

[103] J. Steimle, A. Jordt, and P. Maes. Flexpad: highly flexible bending interactions for projected handheld displays. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 237–246. ACM, 2013.

[104] F. Steinicke, T. Ropinski, and K. Hinrichs. Object selection in virtual environments using an improved virtual pointer metaphor. In Computer Vision and Graphics, pp. 320–326. Springer, 2006.

[105] R. Stoakley, M. J. Conway, and R. Pausch. Virtual reality on a wim: interactive worlds in miniature. In CHI, vol. 95, pp. 265–272, 1995.

[106] W. Stuerzlinger and R. J. Teather. Considerations for targets in 3d pointing experiments. Proceedings of HCI Korea, pp. 162–168, 2014.

[107] H. B. Surale, A. Gupta, M. Hancock, and D. Vogel. Tabletinvr: Exploring the design space for using a multi-touch tablet in virtual reality. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, p. 13. ACM, 2019.

[108] Z. Szalavari and M. Gervautz. The personal interaction panel – a two-handed interface for augmented reality. In Computer Graphics Forum, vol. 16, pp. C335–C346. Wiley Online Library, 1997.

[109] R. J. Teather and W. Stuerzlinger. Visual aids in 3d point selection experiments. In Proceedings of the 2nd ACM symposium on Spatial user interaction, pp. 127–136. ACM, 2014.

[110] C. Tominski, S. Gladisch, U. Kister, R. Dachselt, and H. Schumann. A survey on interactive lenses in visualization. In EuroVis (STARs). Citeseer, 2014.

[111] C. Tominski, S. Gladisch, U. Kister, R. Dachselt, and H. Schumann. Interactive lenses for visualization: An extended survey. In Computer Graphics Forum, vol. 36, pp. 173–200. Wiley Online Library, 2017.

[112] B. Ullmer and H. Ishii. The metadesk: models and prototypes for tangible user interfaces. In Proceedings of Symposium on User Interface Software and Technology (UIST 97), ACM, 1997.

[113] R. Unger and C. Chandler. A Project Guide to UX Design: For user experience designers in the field or in the making. New Riders, 2012.

[114] S. Wang, J. Song, J. Lien, I. Poupyrev, and O. Hilliges. Interacting with soli: Exploring fine-grained dynamic gesture recognition in the radio-frequency spectrum. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, pp. 851–860. ACM, 2016.

[115] D. Wigdor, C. Forlines, P. Baudisch, J. Barnwell, and C. Shen. Lucid touch: A see-through mobile device. In Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology, UIST ’07, pp. 269–278. ACM, New York, NY, USA, 2007. doi: 10.1145/1294211.1294259

[116] R. Xiao, G. Lew, J. Marsanico, D. Hariharan, S. Hudson, and C. Harrison. Toffee: enabling ad hoc, around-device interaction with acoustic time-of-arrival correlation. In Proceedings of the 16th international conference on Human-computer interaction with mobile devices & services, pp. 67–76. ACM, 2014.

[117] Z. Yan, R. W. Lindeman, and A. Dey. Let your fingers do the walking: A unified approach for efficient short-, medium-, and long-distance travel in vr. In 2016 IEEE Symposium on 3D User Interfaces (3DUI), pp. 27–30. IEEE, 2016.

[118] C. Yoo, I. Hwang, E. Rozner, Y. Gu, and R. F. Dickerson. Symmetrisense: Enabling near-surface interactivity on glossy surfaces using a single commodity smartphone. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 5126–5137. ACM, 2016.

[119] D. Yu, H.-N. Liang, F. Lu, V. Nanjappan, K. Papangelis, and W. Wang. Target selection in head-mounted display virtual reality environments. J. UCS, 24(9):1217–1243, 2018.

[120] X. Zhang. A multiscale progressive model on virtual navigation. International Journal of Human-Computer Studies, 66(4):243–256, 2008.

[121] X. Zhang and G. W. Furnas. mcves: Using cross-scale collaboration to support user interaction with multiscale structures. Presence: Teleoperators & Virtual Environments, 14(1):31–46, 2005.

[122] X. L. Zhang. Multiscale traveling: crossing the boundary between space and scale. Virtual Reality, 13(2):101, 2009.

[123] C. Zhao, K.-Y. Chen, M. T. I. Aumi, S. Patel, and M. S. Reynolds. Sideswipe: Detecting in-air gestures around mobile devices using actual gsm signal. In Proceedings of the 27th Annual ACM Symposium on User Interface Software and Technology, UIST ’14, pp. 527–534. ACM, New York, NY, USA, 2014. doi: 10.1145/2642918.2647380

[124] J. Zhou, Y. Zhang, G. Laput, and C. Harrison. Aurasense: enabling expressive around-smartwatch interactions with electric field sensing. In Proceedings of the 29th Annual Symposium on User Interface Software and Technology, pp. 81–86. ACM, 2016.