8/3/2019 Depth Camera in Computer Vision and Computer Graphics EXCELENTE
http://slidepdf.com/reader/full/depth-camera-in-computer-vision-and-computer-graphics-excelente 1/12
482 Journal of Frontiers of Computer Science and Technology 计算机科学与探索 2011, 5(6)
made in traditional 3D measuring systems. Recently, significant improvements have been made in order to achieve
low-cost and compact depth camera devices that have the potential to revolutionize many fields of research, including
computer vision, computer graphics and human computer interaction (HCI). These technologies are also starting to
attract many researchers working for academic or commercial purposes. This paper gives an overview of recent
developments in depth camera technology and discusses the current state of the integration of this technology into
various related applications in computer vision and computer graphics.
Key words: depth camera; computer vision; computer graphics
1 Introduction
Acquiring 3D geometric information from real
environments is an essential task for many applications
in computer vision and computer graphics. Numerous tasks, such as cultural heritage preservation, augmented reality and human computer interaction, clearly favor simple and accurate devices for real-time range image acquisition. Unfortunately, even for static scenes, there exists no low-priced off-the-shelf system that can provide good quality, high resolution distance information in real time. Laser scanning techniques, which sample a scene row by row with a single laser device, are rather time-consuming and therefore infeasible for dynamic scenes. Stereo vision systems are rather limited: they are known to be quite fragile in practice (e.g., due to lack of texture).
As a newly developed distance measuring device, the depth camera opens a new epoch for 3D geometric information acquisition. Unlike other 3D systems, the depth camera is very compact and already offers most of the features stated above that are desired for real-time distance measurement, such as full-field range imaging and a high frame rate.
There are two main approaches currently employed in depth camera technology. The first one is based on the time-of-flight (ToF) principle, measuring the time delay between the emission of a light pulse and the reception of its reflection. Some solutions utilize modulated, incoherent light with a radio frequency (RF) carrier and measure the phase shift of that carrier on the receive side (e.g., the Photonic Mixer Devices (PMD)[1] and Swiss Ranger 4000[2]). With phase unwrapping algorithms, the maximum unambiguous range can be increased. The Swiss Ranger 4000 (http://www.mesa-imaging.ch, Fig.1 (a)) has ranges of 5 or 10 meters, with 176×144 pixels. The PMD (http://www.pmdtec.com, Fig.1 (b)) can provide ranges up to 60 meters. On the other hand, the 3DV Inc. cameras (http://www.3dvsystems.com)[3] and Canesta 3D cameras (http://www.canesta.com) are range-gated systems using Medina’s design[4], and indirectly measure the time of flight using a fast shutter technique.
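The phase-shift principle described above can be illustrated with a short sketch. It uses the standard continuous-wave relations d = c·Δφ / (4π·f) and d_max = c / (2·f); the 30 MHz modulation frequency is a hypothetical value chosen for illustration and is not taken from any device data sheet.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def unambiguous_range(f_mod_hz):
    """Largest distance that can be measured before the phase wraps around."""
    return C / (2.0 * f_mod_hz)


def phase_to_distance(phase_rad, f_mod_hz):
    """Distance corresponding to a measured phase shift in [0, 2*pi)."""
    return C * phase_rad / (4.0 * math.pi * f_mod_hz)


f_mod = 30e6  # hypothetical modulation frequency (assumption)
print(round(unambiguous_range(f_mod), 3))           # 4.997
print(round(phase_to_distance(math.pi, f_mod), 3))  # 2.498
```

Under these relations, halving the modulation frequency doubles the unambiguous range, which is the trade-off that phase unwrapping tries to avoid.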
The second approach is based on light coding: a known infrared pattern is projected onto the scene, and depth is determined from the pattern’s deformation as captured by an infrared CMOS imager. This approach, driven by a single-chip custom-silicon solution, e.g., PrimeSensor (http://www.primesense.com, Fig.1 (c)), can produce
Fig.1 Different types of depth camera
图 1 不同种类深度相机示意图
向学勤 等:深度相机在计算机视觉与图形学上的应用研究 483
depth images of up to 640×480 pixels with a maximum throughput of 60 f/s. The recently popular Microsoft Kinect sensor (http://www.xbox.com/kinect, Fig.1 (d)) also uses light coding for depth measurement.
The overview gives a summary of the depth camera measurement principles (Section 1). Sections 2 and 3 discuss sensor calibration issues and basic concepts in terms of image processing and sensor fusion. Section 4 focuses on applications for geometric reconstruction, human-oriented applications, and interaction based on depth cameras. Finally, Section 5 draws a conclusion and gives a perspective on future work in depth camera related research and applications.
2 Calibration
Depth cameras use standard optics to focus the reflected active light onto the chip. Thus, classical intrinsic calibration is required to compensate for effects like shifted optical centers and lateral distortion. For depth cameras with relatively high resolution, i.e., 176×144, standard calibration techniques[5] can be used. For low resolution sensors, Beder[6] has proposed an optimization approach based on analysis-by-synthesis.
To evaluate the error of the depth camera, the acquisition of reference data (“ground truth”) is a non-trivial task. Previous approaches use track lines[7], which unfortunately require cost-intensive experiments. Alternative techniques use an image-based approach to estimate the extrinsic parameters of the sensor with respect to a reference plane, e.g., a checkerboard[8].
Considering the systematic measurement error, the first approach[9] assumed a linear deviation with respect to the object’s distance. This systematic depth error can then be corrected using look-up tables[10] or B-splines[5]. Since the systematic error behaves quite similarly for different sensor types[11], it was a significant improvement when Zhu et al.[10] combined a ToF sensor with passive stereo (See Fig.2) to obtain high accuracy depth maps. Their approach is based on the observation that ToF sensors have error characteristics which are complementary to those of passive stereo.
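A look-up-table correction of the systematic depth error, as mentioned above, might be sketched as follows. The table values are invented for illustration, and linear interpolation between table entries is an assumption; the cited works may build and apply their tables differently.

```python
import numpy as np

# Hypothetical calibration table (assumption): for each measured distance in
# meters, the systematic error (measured minus true) observed at that distance.
lut_measured = np.array([0.5, 1.0, 2.0, 3.0, 4.0])
lut_error = np.array([0.02, 0.01, -0.01, 0.015, 0.03])


def correct_depth(measured):
    """Subtract the linearly interpolated systematic error from measurements."""
    measured = np.asarray(measured, dtype=float)
    return measured - np.interp(measured, lut_measured, lut_error)


corrected = correct_depth([1.0, 1.5, 3.5])
print(corrected)
```

A B-spline fit would replace the piecewise-linear interpolation with a smooth curve through the same calibration samples.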
Unfortunately, the captured range data are typically contaminated by noise. The noise level of the distance measurement depends on the amount of incident active light. Also, an additional depth error related to the intensity is observed[11], i.e., object regions with low near-infrared (NIR) reflectivity have a non-zero mean offset compared to regions with high reflectivity. In [8] the systematic and the intensity-related errors were compensated using a bivariate correction function based on B-splines directly on the distance values, assuming both effects to be coupled. Alternatively, Chan et al.[12] proposed an adaptive multi-lateral filter that takes into account the inherently noisy nature of real-time depth data.
Fig.2 Multi-sensor calibration in [10]
图 2 文献[10]中多传感器标定示意图
Regarding multiple reflections, the authors of [13−14] proposed a model for multiple reflections as well as a technique for correcting the related measurements. It is assumed that the perturbation components due to multiple reflections outside and inside the camera depend on the scene and the camera construction, respectively. The spatial spectrum of these perturbations consists mainly of low spatial frequencies, which can be compensated using a model that treats the signal as a complex number with the amplitude as modulus and the distance (encoded as phase) as argument. In short, this model is useful if an additional light pattern can be projected onto the object.
The device manufacturers also attempt to reduce the motion artifacts, which are mainly caused by the latency between the individual exposures for the four phase images. However, the problem remains and might be solved by motion-compensated integration of the individual measurements or by motion deblurring methods[15].
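To make the role of the four phase images concrete, the following sketch recovers phase, amplitude and offset from four samples. It assumes the common sampling model a_k = offset + amplitude·cos(phase + k·π/2); actual sensors may use a different sign or ordering convention, so the formula is illustrative rather than a description of any specific device.

```python
import numpy as np


def demodulate(a0, a1, a2, a3):
    """Recover phase, amplitude and offset from four phase images.

    Assumes a_k = offset + amplitude * cos(phase + k * pi / 2), k = 0..3.
    Inputs may be scalars or arrays of equal shape.
    """
    i = a0 - a2  # equals 2 * amplitude * cos(phase)
    q = a3 - a1  # equals 2 * amplitude * sin(phase)
    phase = np.mod(np.arctan2(q, i), 2.0 * np.pi)
    amplitude = 0.5 * np.hypot(i, q)
    offset = 0.25 * (a0 + a1 + a2 + a3)
    return phase, amplitude, offset


# Synthetic static pixel (all values are illustrative assumptions).
true_phase, true_amp, true_off = 1.2, 0.5, 2.0
samples = [true_off + true_amp * np.cos(true_phase + k * np.pi / 2)
           for k in range(4)]
phase, amplitude, offset = demodulate(*samples)
print(phase, amplitude, offset)
```

The derivation assumes that all four samples see the same scene point. If the scene moves between exposures, the samples no longer share one phase, which is the source of the motion artifacts discussed above.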
3 Range Image Processing and Multi-Sensor Fusion
Before the range data from a depth camera can be used, some pre-processing of the input data is usually required. In the current generation, these sensors provide noise-contaminated range data of comparably low image resolution (e.g., only up to 176×144 for the Swiss Ranger 4000). To remove outliers caused by random noise, a bilateral filter is typically used to refine the range data[16].
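A minimal bilateral filter for a depth map can be sketched as below. This is a plain textbook formulation, not the specific filter of [16]; the window radius and the two sigma values are illustrative assumptions that would need tuning for a real sensor.

```python
import numpy as np


def bilateral_filter(depth, radius=2, sigma_s=1.5, sigma_r=0.05):
    """Edge-preserving smoothing of a 2D depth map (values in meters).

    Each output pixel is a weighted mean of its neighbors. Weights fall off
    with spatial distance (sigma_s, in pixels) and with depth difference
    (sigma_r, in meters), so depth discontinuities are not blurred.
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    padded = np.pad(depth, radius, mode="edge")
    acc = np.zeros_like(depth)
    weight_sum = np.zeros_like(depth)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            w_space = np.exp(-(dx * dx + dy * dy) / (2.0 * sigma_s ** 2))
            w_range = np.exp(-((shifted - depth) ** 2) / (2.0 * sigma_r ** 2))
            weight = w_space * w_range
            acc += weight * shifted
            weight_sum += weight
    return acc / weight_sum


# Synthetic step edge (1 m next to 2 m) with mild noise.
rng = np.random.default_rng(0)
clean = np.hstack([np.full((8, 4), 1.0), np.full((8, 4), 2.0)])
noisy = clean + rng.normal(0.0, 0.01, clean.shape)
filtered = bilateral_filter(noisy)
```

Because the range weight is close to zero across the 1 m step, the two surfaces are smoothed independently and the edge stays sharp.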
To upsample the resolution of a depth camera, most approaches rely on the assumption that depth discontinuities are often related to color changes in the corresponding color image. In [17], a Markov random field (MRF) was first designed based on the low resolution depth maps and the high resolution camera images. Unfortunately, this method gives promising spatial resolution enhancement only up to 10×. Yang et al.[18] then presented a method that models a cost volume of depth probability and iteratively applies a bilateral filter[16] to refine the cost volume. Another recent method[2] utilized depth maps exclusively, without the aid of color images: a sequence of low resolution depth maps of the same scene is aligned and then merged to obtain a single depth map with improved resolution. However, this method is restricted to the acquisition of static scenes. We therefore presented a simple pipeline[19] to enhance the quality as well as improve the spatial and depth resolution of range data in real time (See Fig.3). Similarly, by using information from one or more additional high resolution vision cameras, Tian et al.[20] considered the problem of upsampling a low resolution depth map generated by a range camera to provide an accurate high resolution depth map from the viewpoint of one of the vision cameras.
Fig.3 Depth camera data denoising
图 3 深度相机数据去噪处理
From a practical point of view, a higher resolution is needed for color than for depth information. Therefore, different combinations of high resolution video cameras and lower resolution depth cameras have been studied. Many researchers use a binocular combination of a depth camera with one[16] or several RGB cameras[21] to upsample the low resolution ToF data with high resolution color information. Such fixed sensor combinations make it possible to compute the rigid 3D transformation between the optical centers of both sensors (external calibration) and the intrinsic camera parameters of each sensor. Using this transformation, the 3D points provided by the depth camera are co-registered with the 2D image, so that color information can be assigned to each 3D point.
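The co-registration step can be sketched with a simple pinhole model. All intrinsic matrices and the rigid transformation below are hypothetical values; lens distortion is ignored, and the depth value is assumed to be measured along the optical axis (a ToF sensor that reports radial distance would need an extra conversion).

```python
import numpy as np


def backproject(u, v, z, K):
    """Depth pixel (u, v) with depth z -> 3D point in the depth camera frame."""
    x = (u - K[0, 2]) * z / K[0, 0]
    y = (v - K[1, 2]) * z / K[1, 1]
    return np.array([x, y, z])


def project(point, K):
    """3D point in a camera frame -> pixel coordinates (no distortion)."""
    uvw = K @ point
    return uvw[:2] / uvw[2]


# Hypothetical calibration results (assumptions for illustration only).
K_depth = np.array([[250.0, 0.0, 88.0],
                    [0.0, 250.0, 72.0],
                    [0.0, 0.0, 1.0]])
K_color = np.array([[1000.0, 0.0, 320.0],
                    [0.0, 1000.0, 240.0],
                    [0.0, 0.0, 1.0]])
R = np.eye(3)                  # rotation from depth frame to color frame
t = np.array([0.05, 0.0, 0.0])  # translation in meters

p_depth = backproject(88.0, 72.0, 2.0, K_depth)  # point on the optical axis
p_color = R @ p_depth + t                        # rigid 3D transformation
uv = project(p_color, K_color)                   # where to look up the color
print(uv)  # [345. 240.]
```

The color assigned to the 3D point is then read from the color image at the projected pixel, typically with bilinear interpolation.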
There are also a number of monocular systems, which combine a depth camera with a conventional image sensor. They have the advantage of making data fusion easier, but require more sophisticated optics, hardware and algorithms. The recently released Microsoft Kinect is a good example of a monocular 2D/3D camera aimed at video games. The device features an RGB camera and a depth sensor running proprietary software, which provides the capabilities of full-body 3D motion capture, facial recognition and voice recognition.
Another research direction investigates combining depth cameras with classical stereo techniques. In [22], it was first shown that a ToF-stereo combination can greatly speed up the stereo algorithm while helping to manage textureless regions. A global data fusion algorithm that incorporates belief propagation for depth from stereo images and the ToF depth data was proposed in [10]. The authors combine both depth estimates with an MRF to obtain a superior fused depth map. Readers interested in more technical details are referred to [23], where the authors built a hybrid camera system composed of a stereoscopic camera and a time-of-flight depth camera to generate high-quality and high-resolution video-plus-depth.
A recent technique[24] for improving the accuracy of range maps measured by ToF cameras is based on the observation that the range map and intensity image are not independent but are linked by the shading constraint: if the reflectance properties of the surface are known, a certain range map implies a corresponding intensity image (See Fig.4). The main limitation of this method is that it does not cope well with range discontinuities. However, this could be overcome by ignoring any mesh triangle that straddles a range discontinuity.
Fig.4 3D reconstruction of a human face using shading constraint
图 4 利用阴影约束对人脸进行三维重建
Fig.5 3D reconstruction based on depth camera
图 5 基于深度相机的三维重建
4 Applications of Depth Camera
4.1 Geometry Extraction and 3D Reconstruction
Depth cameras typically record their surroundings at a high frame rate, e.g., up to 30 f/s for Microsoft Kinect. Thus, these sensors are especially well suited for directly capturing 3D scene geometry in static and even dynamic environments. A 3D map of the environment can be captured by sweeping the depth camera and registering all scene geometry into a consistent reference coordinate system[25]. Kim et al.[26] have proposed an integrated multi-view sensor fusion approach that combines information from multiple color cameras and multiple ToF depth sensors. They first combine multi-view ToF sensor measurements to obtain a coarse but complete model. Then, the initial model is refined by means of a probabilistic multi-view fusion framework, optimizing over an energy function that aggregates ToF depth sensor information with multi-view stereo and silhouette constraints. Fig.5 (a) and (b) show a sample acquired with this kind of approach.
For high quality 3D reconstruction, Fuchs et al.[27] investigated how well the known 3D geometry of a cube was reconstructed from ToF sensor information. Guan et al.[28] presented a system that combines multiple ToF cameras with a set of video cameras to simultaneously reconstruct dynamic 3D objects with shape-from-silhouettes and range data. After defining sensing models for each type of sensor, they solved the reconstruction problem robustly using Bayesian inference. A probabilistic ad hoc fusion algorithm[29−30] was then derived in order to obtain relatively high quality 3D reconstruction results from the information of both the ToF camera and the stereo pair. According to experimental results, this approach led to a very accurate calibration suitable for the fusion algorithm, which, in turn, allowed for precise extraction of the depth information. On the other hand, multiple low resolution scans with a small field of view from a depth camera can be merged or aligned to exploit the complementary information among them. Cui et al.[31] described a method for 3D object scanning by aligning depth scans that are taken from around an object with a time-of-flight camera (See Fig.5 (c)). This new, easy-to-use object scanning approach is well suited to 3D reconstruction.
Also, high quality 3D reconstruction can be achieved by utilizing a structure from motion (SFM) approach[32−33]. The inherent problem of SFM, however, is that no metric scale is available. This can be solved by the metric properties of the depth measurements[34]. Thus, the SFM approach makes it possible to reconstruct metric scenes with high resolution at interactive rates, e.g., for 3D map building and navigation[35]. Since color and depth can be obtained simultaneously, free viewpoint rendering is easily incorporated using depth compensated warping[36]. We have also proposed a 3D reconstruction method for non-rigid objects using one depth camera[37], and then extended this method to scan hairstyles[38] (See Fig.6).
Fig.6 Hairstyle scanning using one depth camera
图 6 采用单个深度相机进行发型扫描示意图
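The missing metric scale of SFM noted above can, in the simplest case, be estimated by comparing up-to-scale SFM depths with metric depth camera measurements at the same pixels. The median-ratio estimator below is a deliberately simple stand-in chosen for illustration; it is not the method of [34], and the sample values are invented.

```python
import numpy as np


def metric_scale(sfm_depths, measured_depths):
    """Robust scale factor that maps up-to-scale SFM depths to meters.

    Pixels with a non-positive depth in either input are treated as invalid.
    The median makes the estimate insensitive to a few outliers.
    """
    sfm = np.asarray(sfm_depths, dtype=float)
    meas = np.asarray(measured_depths, dtype=float)
    valid = (sfm > 0) & (meas > 0)
    return float(np.median(meas[valid] / sfm[valid]))


# Illustrative data: the fourth measurement is an outlier, the fifth SFM
# depth is invalid (zero) and is therefore ignored.
sfm = np.array([0.8, 1.0, 1.6, 2.0, 0.0])
measured = np.array([2.0, 2.5, 4.0, 9.0, 3.0])
scale = metric_scale(sfm, measured)
print(round(scale, 6))  # 2.5
```

Multiplying the SFM reconstruction by this factor expresses it in the metric units of the depth camera.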
Simultaneous reconstruction of a scene with a wide field of view and dynamic scene analysis can be accomplished by mounting a depth/color camera pair on a computer-driven pan-tilt unit and scanning the environment in a controlled manner. When scanning the scene, a 3D panorama can be obtained by stitching both the depth and the color images into a common cylindrical or spherical panorama. From the center point given by the position of the pan-tilt unit, a 3D environment model can then be reconstructed in a preparation phase. Dynamic 3D scene content such as moving persons can then be acquired online by adaptive object tracking with the camera head[39].
4.2 Human-Oriented Analysis
A number of human-oriented applications based on depth cameras have been developed in the last few years. For example, ToF camera systems can be successfully used to detect respiratory motion of humans[40]. One possible application is emission tomography, where respiratory motion may be the main reason for image quality degradation. In such cases, ToF camera systems can detect three-dimensional, markerless, real-time respiratory motion with an accuracy of 0.1 mm. Thus, they are clearly competitive with other image-based approaches[41]. A further paper[42] used ToF cameras to monitor respiration during sleep and detect sleep apnea. More recently, ToF cameras were used in [43] for facial identification from single views of real depth images acquired with an “off-the-shelf” 3D time-of-flight depth camera.
Some medical applications such as cancer treatment require repositioning the patient to a previously defined position. Depth cameras have been used in such situations to solve the problem by segmenting the patient’s body and performing a rigid 3D-3D surface registration[44]. Also, in an iris capture scenario[45], a depth sensor (See Fig.7 (a)) was used in an iris deblurring algorithm to make iris capture more robust and less intrusive.
Fig.7 Human-oriented applications using depth camera
图 7 利用深度相机进行面向人的应用案例
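The core of a rigid 3D-3D registration, as used for patient repositioning above, is the least-squares estimate of a rotation and translation between two point sets. The sketch below uses the well-known SVD-based (Kabsch) solution and assumes that point correspondences are already known; a full surface registration pipeline would have to estimate them as well, and nothing here is claimed to match the implementation in [44].

```python
import numpy as np


def rigid_align(source, target):
    """Least-squares R, t such that target ~= source @ R.T + t.

    Assumes row i of `source` corresponds to row i of `target`.
    """
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    H = (source - src_c).T @ (target - tgt_c)  # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t


# Synthetic check with a known motion (all values are illustrative).
rng = np.random.default_rng(1)
source = rng.uniform(-1.0, 1.0, size=(50, 3))
angle = np.deg2rad(10.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle), np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.1, -0.2, 0.05])
target = source @ R_true.T + t_true
R_est, t_est = rigid_align(source, target)
```

In an iterative closest point (ICP) scheme, this closed-form step alternates with a nearest-neighbor search that updates the correspondences.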
Depth cameras are also useful in motion detection. In [46], Liao et al. first utilized a single depth camera to reconstruct complete 3D deformable models (e.g., a human body) over time, provided that most parts of the models are observed by the camera at least once. Unlike the well-studied structure from motion method, their approach can tackle time-varying objects deforming arbitrarily but predictably. Depth cameras have also been used as touch sensors to detect touch on a tabletop[47].
Automatic detection and pose estimation of humans is an important task in human computer interaction (HCI). In [48], Jain and Subramanian presented a model-based approach for detecting and estimating human pose by fusing depth and RGB color data from a monocular view. A further study was presented by Ganapathi et al. in [49], where they derive an efficient filtering algorithm for markerless tracking of human pose in real time, using a stream of monocular depth images (See Fig.7 (b)). The key idea of their approach is to combine an accurate generative model, which is achievable using programmable graphics hardware, with a discriminative model that feeds data-driven evidence about body part locations. Since the accurate real-time tracking of humans and other articulated bodies has attracted researchers for many years, their work opens the door to a large number of useful applications. Most recently, Shotton et al.[50] proposed a new method to quickly and accurately predict 3D positions of body joints from a single depth image, using no temporal information. By breaking the whole skeleton into parts, their system can run at 200 f/s on consumer hardware (i.e., Microsoft Kinect), while achieving state of the art accuracy.
Fig.8 Overview of the algorithm in [53]
图 8 文献[53]的算法框架图
4.3 User Interaction and User Tracking
Depth cameras have obvious potential for interactive systems such as alternative input devices, games, animated avatars, etc. In early work[51], Oggier et al. used a ToF camera to track the hand and thereby allow touch-free interaction with a large virtual interactive screen. Soutschek et al.[52] then presented a similar application for touch-free navigation in a 3D medical visualization.
User interaction often requires an image matting operation, which extracts an object of interest from its background by recovering per-pixel opacity. More recently, Zhu et al.[53] proposed an automatic matting technique by combining a ToF camera with a stereo camera. The key idea of their method is to fuse information from the ToF sensor and the stereo camera to jointly optimize the depth map and the alpha matte iteratively. Fig.8 shows the overview of their method. For more details we refer the reader to [53].
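For reference, the per-pixel opacity recovered by matting is usually defined through the standard compositing equation, which the text does not spell out. Here I_p is the observed color at pixel p, F_p and B_p are the unknown foreground and background colors, and α_p is the opacity; the symbols are introduced for illustration and are not taken from [53].

```latex
I_p = \alpha_p F_p + (1 - \alpha_p)\, B_p, \qquad \alpha_p \in [0, 1]
```

Since F_p, B_p and α_p are all unknown, the problem is under-constrained from color alone, which is why additional cues such as depth are helpful.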
Recently, many works have considered the application of depth cameras for user tracking and man-machine interaction. Tracking people in a smart room, i.e., a multi-modal environment where the audible and visible actions of people inside the room are recorded and analyzed automatically, can benefit from the use of ToF sensors[22]. A different tracking approach has been discussed in [54]. Here, only one ToF sensor is utilized to observe a scene at an oblique angle. As for tracking non-rigid objects, in particular human faces, Cai et al.[55] proposed a regularized maximum likelihood deformable model fitting (DMF) algorithm for 3D face tracking with a commodity depth camera. They regularized the noisy depth data in the ICP framework by using a novel l1 regularization scheme. Fig.9 demonstrates some tracking results using their algorithm.
Fig.9 Example tracking results using the algorithm in [55]
图 9 文献[55]算法跟踪结果示意图
based on the reflectance and depth image of a planar
object[J]. International Journal of Intelligent Systems
Technologies and Applications: Issue on Dynamic 3D
Imaging, 2008, 5(3/4): 285−294.
[7] Kahlmann T, Remondino F, Guillaume S. Range imaging
technology: new developments and applications for peo-
ple identification and tracking[J]. Proceedings of SPIE,
2007, 6491: 1−12.
[8] Lindner M, Kolb A. Calibration of the intensity-related
distance error of the PMD ToF-camera[J]. Proceedings of
SPIE, 2007, 6764: 35.
[9] Kuhnert K D, Stommel M. Fusion of stereo camera and
PMD-camera data for real-time suited precise 3D environ-
ment reconstruction[C]//Proceedings of the 2006 IEEE/RSJ
International Conference on Intelligent Robots and Sys-
tems (IROS ’06), Beijing, China, 2007: 4780−4785.
[10] Zhu J J, Wang L, Yang R G, et al. Reliability fusion of
time-of-flight depth and stereo for high quality depth
maps[J]. IEEE Transactions on Pattern Analysis and
Machine Intelligence (TPAMI), 2010.
[11] Rapp H. Experimental and theoretical investigation of correlating ToF-camera systems[D]. University of Heidelberg, Germany, 2007.
[12] Chan D, Buisman H, Theobalt C, et al. A noise-aware
filter for real-time depth upsampling[C]//Proceedings of
the Workshop on Multi-camera and Multi-modal Sensor
Fusion Algorithms and Applications (M2SFA2 ’08), Mar-
seille, France, October 12-18, 2008: 1−12.
[13] Falie D, Buzuloiu V. Distance errors correction for the
time of flight (ToF) cameras[C]//Proceedings of European
Conference on Circuits and Systems for Communications
(ECCSC ’08), Bucharest, Romania, July 10-11, 2008:
193−196.
[14] Falie D. 3D image correction for time of flight (ToF)
cameras[J]. Proceedings of SPIE, 2008, 7156: 133.
[15] Tai Y W, Kong N, Lin S, et al. Coded exposure imaging
for projective motion deblurring[C]//Proceedings of the
IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR ’10), San Francisco, USA,
June 13-18, 2010: 1−8.
[16] Huhle B, Schairer T, Jenke P, et al. Robust non-local
denoising of colored depth data[C]//Proceedings of the
IEEE International Conference on Computer Vision and
Pattern Recognition Workshop on Time of Flight Camera
Based Computer Vision (CVPRW ’08), Anchorage, Alaska,
USA, June 23-28, 2008.
[17] Diebel J, Thrun S. An application of Markov random
fields to range sensing[C]//Proceedings of the Conference
on Neural Information Processing Systems (NIPS ’05),
Vancouver, British Columbia, Canada, December 5-8, 2005.
[18] Yang Q X, Yang R G, Davis J, et al. Spatial-depth super
resolution for range images[C]//Proceedings of the IEEE
International Conference on Computer Vision and Pattern
Recognition (CVPR ’07), Minneapolis, Minnesota, USA,
June 18-23, 2007: 1−8.
[19] Xiang X Q, Li G X, Pan Z G, et al. Real-time spatial and
depth upsampling for range data[J]. LNCS Transactions
on Computational Science, 2011, 6670: 78−98.
[20] Tian C, Vaishampayan V, Zhang Y F. Upsampling range camera depth maps using high-resolution vision camera and pixel-level confidence classification[J]. Proceedings of SPIE, 2011, 7863.
[21] Gudmundsson S A, Larsen R, Aanaes H, et al. ToF imaging in smart room environments towards improved people tracking[C]//Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition Workshop on Time of Flight Camera Based Computer Vision (CVPRW ’08), Anchorage, Alaska, USA, June 23-28, 2008.
[22] Gudmundsson S A, Aanaes H, Larsen R. Fusion of stereo
vision and time-of-flight imaging for improved 3D esti-
mation[J]. International Journal of Intelligent Systems
Technologies and Applications: Issue on Dynamic 3D
Imaging, 2008, 5(3/4): 425−433.
[23] Kim S Y, Koschan A, Mongi A A, et al. Book chapter:
three-dimensional video contents exploitation in depth
camera-based hybrid camera system[M]//Signals and
Communication Technology, High-Quality Visual Ex-
perience. [S.l.]: Springer, 2010: 349−369.
[24] Böhme M, Haker M, Martinetz T, et al. Shading constraint improves accuracy of time-of-flight measurements[C]//Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR ’08), Anchorage, Alaska, USA, June 24-26, 2008: 1−8.
[25] Huhle B, Jenke P, Strasser W. On-the-fly scene acquisi-
tion with a handy multisensory system[J]. International
Journal of Intelligent Systems Technologies and Applica-
tions: Issue on Dynamic 3D Imaging, 2008, 5(3/4):
255−263.
[26] Kim Y M, Theobalt C, Diebel J, et al. Multi-view image
and ToF sensor fusion for dense 3D reconstruction[C]//
Proceedings of the IEEE Workshop on 3-D Digital Imag-
ing and Modeling (3DIM ’09), Kyoto, Japan, October 3-4,
2009: 1542−1549.
[27] Fuchs S, May S. Calibration and registration for precise
surface reconstruction[C]//Proceedings of the Dynamic
3D Imaging Workshop in Conjunction with DAGM
(Dyn3D), Heidelberg, Germany, September 2007.
[28] Guan L, Franco J S, Pollefeys M. 3D object reconstruc-
tion with heterogeneous sensor data[C]//Proceedings of
the 4th International Symposium on 3D Data Processing,
Visualization and Transmission (3DPVT ’08), Atlanta,
USA, June 18-20, 2008: 295−302.
[29] Mutto C D, Zanuttigh P, Cortelazzo G M. Accurate 3D
reconstruction by stereo and ToF data fusion[C]//Pro-
ceedings of the GTTI Meeting 2010, Brescia, Italy, May 2010.
[30] Mutto C D, Zanuttigh P, Cortelazzo G M. A probabilistic
approach to ToF and stereo data fusion[C]//Proceedings
of the 5th International Symposium on 3D Data Processing,
Visualization and Transmission (3DPVT ’10), Paris,
France, May 17-20, 2010.
[31] Cui Y, Schuon S, Chan D, et al. 3D shape scanning with a time-of-flight camera[C]//Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR ’10), San Francisco, USA, June 13-18, 2010: 1−8.
[32] Bartczak B, Koeser K, Woelk F, et al. Extraction of 3D freeform surfaces as visual landmarks for real-time tracking[J]. Journal of Real-Time Image Processing, 2007, 2(2/3): 81−101.
[33] Koeser K, Bartczak B, Koch R. Robust GPU-assisted
camera tracking using free-form surface models[J]. Jour-
nal of Real-Time Image Processing, 2007, 2(2/3): 133−147.
[34] Streckel B, Bartczak B, Koch R, et al. Supporting struc-
ture from motion with a 3D range-camera[C]//Lecture
Notes in Computer Science 4522: Proceedings of the 15th
Scandinavian Conference on Image analysis. Berlin,
Heidelberg: Springer-Verlag, 2007: 233−242.
[35] Prusak A, Melnychuk O, Roth H, et al. Pose estimation
and map building with a PMD-camera for robot naviga-
tion[J]. International Journal of Intelligent Systems Tech-
nologies and Applications: Issue on Dynamic 3D Imaging,
2008, 5(3/4): 355−364.
[36] Koch R, Evers-Senne J. View synthesis and rendering
methods[M]//3D Video Communication: Algorithms, Con-
cepts and Real-time Systems in Human Centered Com-
munication. [S.l.]: Wiley, 2005: 151−174.
[37] Tong J, Xiang X Q, Pan Z G, et al. 3D reconstruction of
non-rigid shapes using one ToF camera[J]. Journal of
Computer-Aided Design & Computer Graphics, 2011,
23(3): 377−384.
[38] Tong J, Zhang M M, Xiang X Q, et al. 3D body scanning
with hairstyle using one time-of-flight camera[J]. Journal
of Computer Animation and Virtual Worlds, 2011, 22(2/3):
203−211.
[39] Bartczak B, Schiller I, Beder C, et al. Integration of a
time-of-flight camera into a mixed reality system for han-
dling dynamic scenes, moving viewpoints and occlusions in
real-time[C]//Proceedings of the 4th International Sympo-
sium on 3D Data Processing, Visualization and Transmission
(3DPVT ’08), Atlanta, USA, June 18-20, 2008: 295−302.
[40] Schaller C, Penne J, Hornegger J. Time-of-flight sensor
for respiratory motion gating[J]. Medical Physics, 2008,
35(7): 3090−3093.
[41] Penne J, Schaller C, Hornegger J, et al. Robust real-time
3D respiratory motion detection using time-of-flight
cameras[J]. International Journal of Computer Assisted
Radiology and Surgery, 2008, 3(5): 427−431.
[42] Falie D, Ichim M, David L. Respiratory motion visualization and the sleep apnea diagnosis with the time of flight (ToF) camera[C]//Proceedings of International Conference on Visualization Imaging and Simulation (VIS ’08), Bucharest, Romania, November 7-9, 2008: 179−184.
[43] Ding H, Moutarde F, Shaiek A. 3D object recognition and
facial identification using time averaged single-views
from time-of-flight 3D depth-camera[C]//Proceedings of
the Eurographics Workshop on 3D Object Retrieval,
Norrköping, Sweden, May 3-7, 2010.
[44] Adelt A, Schaller C, Penne J. Patient positioning using
3D surface registration[C]//Proceedings of the Russian-
Bavarian Conference on Biomedical Engineering, Mos-
cow, Russia, July 8-9, 2008: 202−207.
[45] Huang X Y, Ren L, Yang R G. Image deblurring for less
intrusive iris capture[C]//Proceedings of the IEEE Inter-
national Conference on Computer Vision and Pattern
Recognition (CVPR ’09), Miami, Florida, USA, June
20-26, 2009: 1558−1565.
[46] Liao M, Zhang Q, Wang H M, et al. Modeling deformable
objects from a single depth camera[C]//Proceedings of the
IEEE International Conference on Computer Vision
(ICCV ’09), Kyoto, Japan, September 27 - October 4,
2009: 167−174.
[47] Wilson A. Using a depth camera as a touch sensor[C]//
Proceedings of the ACM International Conference on In-
teractive Tabletops and Surfaces (ITS ’10), Saarbrucken,
Germany, November 7-10, 2010: 69−72.
[48] Jain H P, Subramanian A. Real-time upper-body human
pose estimation using a depth camera, HPL-2010-190[R].
2010.
[49] Ganapathi V, Plagemann C, Koller D, et al. Real time
motion capture using a single time-of-flight camera[C]//
Proceedings of the IEEE International Conference on
Computer Vision and Pattern Recognition (CVPR ’10),
San Francisco, USA, June 13-18, 2010: 755−762.
[50] Shotton J, Fitzgibbon A, Cook M, et al. Real-time human
pose recognition in parts from single depth images[C]//
Proceedings of the IEEE International Conference on
Computer Vision and Pattern Recognition (CVPR ’11),
Colorado Springs, USA, June 20-25, 2011.
[51] Oggier T, Büttgen B, Lustenberger F, et al. SwissRanger
SR3000 and first experiences based on miniaturized 3D
ToF cameras[C]//Proceedings of the 1st Range Imaging
Research Day at ETH Zurich, 2005: 97−108.
[52] Soutschek S, Penne J, Hornegger J, et al. 3D gesture-
based scene navigation in medical imaging applications
using time-of-flight cameras[C]//Proceedings of the IEEE
International Conference on Computer Vision and Pattern
Recognition Workshop on Time of Flight Camera Based
Computer Vision (CVPRW ’08), Anchorage, Alaska, USA,
June 23-28, 2008.
[53] Zhu J J, Wang L, Yang R G, et al. Reliability joint depth
and alpha matte optimization via fusion of stereo and
time-of-flight sensor[C]//Proceedings of the IEEE Inter-
national Conference on Computer Vision and Pattern
Recognition (CVPR ’09), Miami, Florida, USA, June
20-26, 2009.
[54] Hansen D W, Hansen M S, Kirschmeyer M, et al. Cluster
tracking with time-of-flight cameras[C]//Proceedings of
the IEEE International Conference on Computer Vision
and Pattern Recognition Workshop on Time of Flight
Camera Based Computer Vision (CVPRW ’08), Anchorage,
Alaska, USA, June 23-28, 2008.
[55] Cai Q, Gallup D, Zhang C, et al. 3D deformable face
tracking with a commodity depth camera[C]//Proceedings
of the 11th European Conference on Computer Vision
(ECCV ’10), Crete, Greece, September 5-11, 2010: 229−242.
[56] Penne J, Soutschek S, Fedorowicz L, et al. Robust
real-time 3D time-of-flight based gesture navigation[C]//
Proceedings of the IEEE International Conference on
Automatic Face & Gesture Recognition (FG 2008),
Amsterdam, Netherlands, September 17-19, 2008.
[57] Holte M, Moeslund T. View invariant gesture recognition
using the CSEM SwissRanger SR-2 camera[J]. Interna-
tional Journal of Intelligent Systems Technologies and
Applications: Issue on Dynamic 3D Imaging, 2008,
5(3/4): 295−303.
[58] Holte M, Moeslund T, Fihl P. Fusion of range and inten-
sity information for view invariant gesture recogni-
tion[C]//Proceedings of the IEEE International Confer-
ence on Computer Vision and Pattern Recognition Work-
shop on Time of Flight Camera Based Computer Vision
(CVPRW ’08), Anchorage, Alaska, USA, June 23-28, 2008.
[59] Acharya S, Tracey C, Rafii A. System design of time-
of-flight range camera for car park assist and backup
applications[C]//Proceedings of the IEEE International
Conference on Computer Vision and Pattern Recognition
Workshop on Time of Flight Camera Based Computer Vision
(CVPRW ’08), Anchorage, Alaska, USA, June 23-28, 2008.
[60] Swadzba A, Beuter N, Schmidt J, et al. Tracking objects
in 6D for reconstructing static scene[C]//Proceedings of
the IEEE International Conference on Computer Vision
and Pattern Recognition Workshop on Time of Flight
Camera Based Computer Vision (CVPRW ’08), Anchor-
age, Alaska, USA, June 23-28, 2008.
[61] Ghobadi S E, Loepprich O E, Ahmadov F, et al. Real time
hand based robot control using 2D/3D images[C]//Pro-
ceedings of the International Symposium on Advances in
Visual Computing (ISVC ’08), Las Vegas, Nevada, USA,
December 1-3, 2008: 307−316.
[62] Dolson J, Baek J, Plagemann C, et al. Upsampling range
data in dynamic environments[C]//Proceedings of the
IEEE International Conference on Computer Vision and
Pattern Recognition (CVPR ’10), San Francisco, USA,
June 13-18, 2010: 1141−1148.
[63] Halit B S, Sonia C. Humanoid robot control using depth
camera[C]//Proceedings of the 6th International Confer-
ence on Human-Robot Interaction (HRI ’11), EPFL,
Lausanne, Switzerland, March 6-9, 2011: 401−402.
XIANG Xueqin was born in 1981 in Mayang, Hunan. He is a Ph.D. candidate at Zhejiang University. His
research interests include computer vision and depth cameras.
PAN Zhigeng was born in 1965 in Huai'an, Jiangsu. He is a professor and doctoral supervisor at Zhejiang
University and a senior member of CCF. His research interests include virtual reality, human animation,
human-computer interaction and edutainment.
TONG Jing was born in 1981 in Yangzhou, Jiangsu. He is a Ph.D. candidate at Zhejiang University. His
research interests include computer graphics and 3D animation.