Kinect-based Obstacle Detection for Manipulator

Panjawee Rakprayoon, Miti Ruchanurucks, and Ada Coundoul

Abstract: This paper presents a method to distinguish between obstacles and a manipulator when they share the same workspace. A Microsoft Kinect is used as the capturing device. A Kinect calibration method is explained. Furthermore, the calibration between the Kinect and the manipulator is addressed by an iterative least-squares method. A 3D model of the manipulator is generated using the OpenGL library. Finally, the manipulator surface is deleted from the scene by intersecting the manipulator model with its corresponding point cloud.

I. INTRODUCTION

In robotics, obstacle detection is important for a robot to accomplish tasks, especially in dynamic environments. To achieve this goal, obstacle information can be acquired using computer vision methods, and the camera system therefore plays a central role in obstacle detection.

For vision-based obstacle detection, there are many techniques to analyze the data, namely 2D, 2.5D, and 3D methods. To detect collisions in a 2D environment [1][2], obstacles and background are recognized based on their colors. However, this is prone to error if the robot and the obstacles contain similar colors.

In 2.5D, [3]-[6] generate surfaces inside a scene using disparity information. On the other hand, [7] reconstructs objects using epipolar parameterization instead of disparity images; however, its output is rougher than that of the surface reconstruction methods. In any case, the mentioned 2.5D methods still require the robot and its surroundings to have different colors/textures.

In 3D, there are many possible algorithms for surface reconstruction, for example, employing corner detection to reconstruct a 3D object model [8]. Also, shape-from-silhouette algorithms [9]-[11] construct a 3D surface by volume intersection from multiple views. Nevertheless, reconstructing a 3D scene in this way requires a lot of computation. Furthermore, all techniques described previously were demonstrated with stereo or multiple cameras, so the 3D surfaces usually contain some noise when the robot and the background are similar in color/texture.

Some papers even consider multiple types of input. For example, [12] uses 2D cameras and a 3D robot model to distinguish between the robot and its surroundings. However, their method requires a complicated mapping between the model and each camera's view to eliminate the robot from the 2D images.

Manuscript received August 10, 2011.
P. R. is with the Faculty of Engineering, Kasetsart University, Bangkok, Thailand (corresponding author; phone: +66-2-942-8555; fax: Ext. 1550; e-mail: [email protected]).
M. R. is with the Faculty of Engineering, Kasetsart University, Bangkok, Thailand (e-mail: [email protected]).
A. C. is with ESIREM, Universite de Bourgogne, Dijon, France (e-mail: [email protected]).

To avoid the mentioned problems, one can directly use a depth measuring device to reconstruct the 2.5D or 3D surface. Inexpensive yet versatile devices that produce less noise than stereo cameras have recently become available. The Kinect is one such device and is used to capture the depth image in this work. We transfer the depth image to world coordinates for obstacle detection. However, the resulting point cloud comprises both the obstacles and the manipulator itself.

In this paper, we focus on a technique for obstacle detection where parts of the robot are also present in the scene (Fig. 1). The technique relies on depth images captured from the depth camera, the Kinect. We show how to calibrate the Kinect. Furthermore, the calibration between the depth camera and the robot is addressed, and the error of this calibration is minimized using an iterative least-squares method. Ultimately, the robot's point cloud can be deleted from that of the whole scene, so obstacles in the scene can then be detected efficiently.

This paper is organized as follows. Section II explains how to calibrate the Kinect as well as how to convert its raw depth image into real-world coordinates. Section III shows how to calibrate between the depth camera and the manipulator. Section IV describes how to generate the 3D model of the manipulator and delete the point cloud of the robot from the scene. Section V shows the experimental results. Finally, the conclusion of this paper is presented in Section VI.

Fig. 1. A workspace consisting of obstacles (such as a human and a PLC plant) and, in the red circle, the manipulator, which is not an obstacle.

II. CALIBRATION OF DEPTH CAMERA

This work uses a Microsoft Kinect as the depth measuring device to detect obstacles in a scene, as it is powerful and yet inexpensive. The Kinect is a new, alternative gaming device that can capture a color image and a depth image at the same time (Fig. 2). It is composed of an RGB camera, a depth sensor, and a multi-array microphone (Fig. 3). The depth sensor consists of an infrared laser projector and a monochrome CMOS sensor. The resulting point cloud can be loaded to a computer using open-source libraries, which enable the Kinect to be operated under Windows, Linux, or Mac OS. The data connection to the computer is through a USB interface.

Fig. 2. Color and depth images captured from the Kinect device.

Fig. 3. The configuration and components of the Kinect device: RGB camera, depth sensor, and multi-array microphone.

In order to use the Kinect for depth measurement, one must calibrate it first. The important parameters are the intrinsic parameters, which comprise the focal lengths (f_x, f_y) and the principal point, i.e., the offset from the optic axis (c_x, c_y). Here we assume that the lens distortion is small enough to be negligible. The calibration method in [15] requires only a chessboard to be captured in the form of a depth image. The corners of the chessboard in the depth image (Fig. 4) are then detected to estimate the intrinsic parameters.

Fig. 4. Depth image of the chessboard; the detected chessboard corners are shown in red.
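For illustration, intrinsic calibration from chessboard corners can be sketched with standard OpenCV calls, as below. This is a generic sketch rather than the exact pipeline of [15]: the image file names, board dimensions, and square size are assumptions, and the distortion coefficients returned by OpenCV are simply ignored, in line with the paper's assumption of negligible lens distortion.

```python
# Minimal chessboard-based intrinsic calibration sketch (hypothetical inputs).
import cv2
import numpy as np

BOARD = (8, 6)      # inner corners per row/column (assumed)
SQUARE = 0.025      # chessboard square size in metres (assumed)

# 3D board points in the chessboard plane, reused for every view
obj = np.zeros((BOARD[0] * BOARD[1], 3), np.float32)
obj[:, :2] = np.mgrid[0:BOARD[0], 0:BOARD[1]].T.reshape(-1, 2) * SQUARE

obj_points, img_points = [], []
for path in ["view_%02d.png" % i for i in range(20)]:   # hypothetical files
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        continue
    found, corners = cv2.findChessboardCorners(gray, BOARD)
    if found:
        obj_points.append(obj)
        img_points.append(corners)

# Estimate f_x, f_y, c_x, c_y (distortion is estimated too, but neglected here)
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
```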

Note that the Kinect returns a raw depth image that contains integer values between 0 and 2047. The raw depth image must be transformed to derive the metric depth Z using equations (1) and (2) [13]:

\[
\mathrm{depth} = \frac{1}{f(z)} \tag{1}
\]

\[
f(z) = -0.0030711016\, z + 3.3309495161 \tag{2}
\]

where z is the raw depth value from the Kinect and f(z) is a linear function whose coefficients can be estimated by regression. After that, the intrinsic parameters are input to equations (4) and (5) to approximate the real-world coordinates (X, Y, Z) from the depth image (x, y, Z):

\[
Z = \mathrm{depth} \tag{3}
\]

\[
X = \frac{Z\,(x - c_x)}{f_x} \tag{4}
\]

\[
Y = \frac{Z\,(y - c_y)}{f_y} \tag{5}
\]

Fig. 5. A point p = (x, y, Z) on the image plane can be projected to a point P = (X, Y, Z) in world coordinates.
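A minimal Python sketch of equations (1)-(5) is given below: it converts a raw Kinect depth frame to metric depth and back-projects it to a point cloud. The intrinsics f_x, f_y, c_x, c_y are assumed to come from the calibration above, and treating the raw value 2047 as "no measurement" is an assumption added for the sketch, not something stated in the paper.

```python
# Sketch of equations (1)-(5): raw Kinect depth -> metric depth -> 3D points.
import numpy as np

def raw_to_metric(raw):
    """Equations (1)-(2): 11-bit raw depth values -> depth in metres."""
    f_z = -0.0030711016 * raw.astype(np.float64) + 3.3309495161
    depth = 1.0 / f_z
    depth[raw >= 2047] = 0.0          # assumed 'no measurement' marker
    return depth

def depth_to_points(depth, fx, fy, cx, cy):
    """Equations (3)-(5): depth image (x, y, Z) -> point cloud (X, Y, Z)."""
    h, w = depth.shape
    x, y = np.meshgrid(np.arange(w), np.arange(h))
    Z = depth
    X = Z * (x - cx) / fx
    Y = Z * (y - cy) / fy
    return np.dstack([X, Y, Z]).reshape(-1, 3)
```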

III. CALIBRATION BETWEEN DEPTH CAMERA AND MANIPULATOR

This section explains how to derive the homogeneous transformation between the depth camera and the robot's base, H^{cam}_{base}. This calibration requires a marker to be attached to the robot's end-effector, as shown in Fig. 6.

First, we define an equation for H^{cam}_{base} in the world coordinate system in (6):

\[
P^{cam} = H^{cam}_{base}\, H^{base}_{end\text{-}eff}\, P^{end\text{-}eff} \tag{6}
\]

where P^{cam} is the position of the marker in the depth camera coordinate system. This position can be acquired using the method described in Section II.

Fig. 6. Coordinate systems of the depth camera and the manipulator.

\[
P^{cam} = [X_c \;\; Y_c \;\; Z_c \;\; 1]^T
\]

P^{end-eff} is the position of the marker in the end-effector coordinate system. For convenience, we attach the marker at the center of the gripper:

\[
P^{end\text{-}eff} = [X_{end\text{-}eff} \;\; Y_{end\text{-}eff} \;\; Z_{end\text{-}eff} \;\; 1]^T = [0 \;\; 0 \;\; 0 \;\; 1]^T
\]

H^{base}_{end-eff} is the homogeneous matrix from the base of the robot to its end-effector. This matrix is known, as it can be acquired from the robot's forward kinematics:

\[
H^{base}_{end\text{-}eff} =
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

The unknown H^{cam}_{base} contains 12 parameters:

\[
H^{cam}_{base} =
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_{14} \\
r_{21} & r_{22} & r_{23} & t_{24} \\
r_{31} & r_{32} & r_{33} & t_{34} \\
0 & 0 & 0 & 1
\end{bmatrix}
\]

Equation (6) can then be represented as equation (7):

\[
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
=
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_{14} \\
r_{21} & r_{22} & r_{23} & t_{24} \\
r_{31} & r_{32} & r_{33} & t_{34} \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
\tag{7}
\]

To derive the unknowns, the first three rows of (7) can be rewritten in a different matrix form:

\[
A_i x = B_i \tag{8}
\]

where A is a known coefficient matrix,

\[
A =
\begin{bmatrix}
a_{14} & a_{24} & a_{34} & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & a_{14} & a_{24} & a_{34} & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & a_{14} & a_{24} & a_{34} & 0 & 0 & 1
\end{bmatrix},
\]

and B contains the position of the marker in the depth camera coordinate system,

\[
B = [X_c \;\; Y_c \;\; Z_c]^T.
\]

The index i denotes the data frame, i = 1, 2, 3, ..., n. Our unknown is the vector x, which contains the 12 parameters of H^{cam}_{base}:

\[
x = [r_{11} \; r_{12} \; r_{13} \; r_{21} \; r_{22} \; r_{23} \; r_{31} \; r_{32} \; r_{33} \; t_{14} \; t_{24} \; t_{34}]^T
\]

To solve for the 12 unknown parameters, i.e., 9 rotation and 3 translation terms, multiple robot configurations are required. Also, as A is not a square matrix, x can be derived using the pseudo-inverse as in (9):

\[
x = A_i^{+} B_i \tag{9}
\]

where A_i^{+} denotes the left pseudo-inverse, given by

\[
A_i^{+} = (A_i^T A_i)^{-1} A_i^T \tag{10}
\]

Error in the measured 3D position is a major cause of error in calculating H^{cam}_{base}. We can reduce such error by exploiting multiple inputs of the A and B matrices. Here, the iterative least-squares method [14] is used to accumulate and average all of the data in order to acquire a more accurate H^{cam}_{base}. It is more robust than linear least squares as well as QR decomposition in the sense that there is no limit on the number of inputs, unlike the two previous methods; thus, the error in the approximation of x can be averaged out. In this paper, a depth image is captured each time the configuration of the robot is changed. The iterative least-squares solution can be written as (11):

\[
x = \left( \sum_{i=1}^{N} A_i^T A_i \right)^{-1} \left( \sum_{i=1}^{N} A_i^T B_i \right) \tag{11}
\]

The parameters of H^{cam}_{base} are thus resolved from (11). H^{cam}_{base} will then be exploited to model the robot in the camera coordinate system in the next section.
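A minimal sketch of this calibration step in Python/NumPy is shown below. It assumes each sample pairs the marker position measured in the camera frame (from Section II) with the forward-kinematics matrix H^{base}_{end-eff} of the same configuration; the function and variable names are ours, not taken from the paper.

```python
# Sketch of equations (8)-(11): accumulate normal equations over N robot
# configurations and solve for the 12 parameters of H_cam_base.
import numpy as np

def solve_H_cam_base(P_cam_list, H_base_endeff_list):
    AtA = np.zeros((12, 12))
    AtB = np.zeros(12)
    for P_cam, H_be in zip(P_cam_list, H_base_endeff_list):
        t = H_be[:3, 3]                    # (a14, a24, a34): end-effector in base frame
        A = np.zeros((3, 12))
        A[0, 0:3] = t; A[0, 9] = 1.0       # row for X_c
        A[1, 3:6] = t; A[1, 10] = 1.0      # row for Y_c
        A[2, 6:9] = t; A[2, 11] = 1.0      # row for Z_c
        B = np.asarray(P_cam[:3])          # marker position in camera frame
        AtA += A.T @ A                     # accumulate sum of A_i^T A_i
        AtB += A.T @ B                     # accumulate sum of A_i^T B_i
    x = np.linalg.solve(AtA, AtB)          # equation (11)
    H = np.eye(4)
    H[:3, :3] = x[:9].reshape(3, 3)        # r11..r33
    H[:3, 3] = x[9:]                       # t14, t24, t34
    return H
```

Note that the nine rotation entries are estimated as free parameters, so the recovered 3x3 block is only approximately orthonormal; the paper does not discuss re-orthonormalization.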

IV. GENERATING 3D MODEL AND DELETING POINT CLOUD OF MANIPULATOR

To delete the point cloud of the robot from the depth image, the point cloud of the whole scene (manipulator + surroundings) and the 3D model of the manipulator must be in the same coordinate system.

First, the manipulator is modeled as a chain of cylinders (Fig. 7a) defined by the position of each robot joint (P^{robot}) with respect to its base. The chain of cylinders can be moved using the input joint angles of each corresponding configuration.

We then have to transfer the 3D model of the manipulator to the depth camera coordinate system (Fig. 7b) using the H^{cam}_{base} obtained previously from equation (11). The change of coordinate system is shown in Fig. 8. The equation for transferring a joint position of the model (P^{robot}) to the depth camera coordinate system is given in (12):

\[
P^{cam} = H^{cam}_{base}\, P^{robot} \tag{12}
\]

Fig. 7. 3D model of the manipulator (a) and mapping of the 3D model of the manipulator to the camera coordinate system (b).

Fig. 8. Coordinate system transformation from the manipulator to the depth camera.

As a result, the point cloud of the scene and the robot model can then be put in the same coordinate system. To eliminate the point cloud of the manipulator only, we consider the point clouds within the cylinder model to be the robot surface: any point that lies inside a cylinder is deleted. The final output is then only the point cloud of the surroundings, as required. The experiment in the next section confirms the usefulness of the method.
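A minimal Python sketch of this deletion step is shown below. It transforms the joint positions with H^{cam}_{base} as in equation (12) and then treats each link as a capsule (a cylinder with rounded ends) around the segment between consecutive joints, which is a convenient approximation of the paper's cylinder test; the link radii and the array layouts are assumptions.

```python
# Sketch of Section IV: map the joint chain into the camera frame and drop
# every scene point that lies within a link's radius of the joint segment.
import numpy as np

def joints_in_camera(H_cam_base, joints_base):
    """Equation (12): joint positions (J, 3) in base frame -> camera frame."""
    homog = np.hstack([joints_base, np.ones((len(joints_base), 1))])
    return (H_cam_base @ homog.T).T[:, :3]

def remove_robot_points(cloud, joints_cam, radii):
    """cloud: (N, 3) scene points; radii: one assumed radius per link."""
    keep = np.ones(len(cloud), dtype=bool)
    for (p0, p1), r in zip(zip(joints_cam[:-1], joints_cam[1:]), radii):
        axis = p1 - p0
        # Closest point on the link segment, clamped to its endpoints
        t = np.clip((cloud - p0) @ axis / (axis @ axis), 0.0, 1.0)
        closest = p0 + t[:, None] * axis
        dist = np.linalg.norm(cloud - closest, axis=1)
        keep &= dist > r                   # points inside the link are dropped
    return cloud[keep]
```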

V. EXPERIMENTAL RESULT

An articulated robot (AR6) from BOSCH, which consists of 6 axes, is used in this paper. The position of the end-effector (P^{end-eff}), to which the marker is attached, and multiple AR6 configurations (H^{base}_{end-eff}) (Fig. 9) are used to determine the relationship between the camera and the robot (H^{cam}_{base}). They are also used to generate the 3D model of the robot (P^{robot}) for deleting the point cloud within the model.

Fig. 9. The configuration of the BOSCH articulated arm robot [16].

The Kinect device is installed at some distance so as to cover the workspace of the manipulator (Fig. 6). Equations (1)-(2) are then applied. To transfer the depth information to a point cloud in 3D world coordinates, the OpenCV library is used to calibrate the depth camera and estimate the intrinsic parameters, and equations (3)-(5) are then applied. The reader may observe blank areas in Fig. 10 behind the robot. This is occlusion resulting from the fact that we use only one Kinect; such occlusion can be avoided by using more than one depth measuring device.

Then, the 3D model of the manipulator is generated (Fig. 7a) and transferred to the depth camera coordinate system (Fig. 7b) using H^{cam}_{base}. To verify the result, the point cloud of the whole scene as well as the model is displayed in 3D graphics using the OpenGL library. To get a better view of the workspace, we transform the whole scene again from the depth camera coordinate system to the OpenGL coordinate system, as shown in Fig. 11.

Fig. 11. Coordinate system transformation from the depth camera to OpenGL.

Finally, the point cloud of the manipulator is deleted from the depth surface using the 3D manipulator model. The experimental result, which consists of only the point cloud of the surroundings, is presented in Fig. 12.

VI. CONCLUSION

In scenes that consist of both obstacles and a manipulator, we need to distinguish between them before path planning. In order to do so, a 3D model of the manipulator is superimposed onto the depth images. System setup requires two calibration steps. First, the Kinect is calibrated to generate depth information in real-world coordinates; this first calibration requires a planar object of known size, such as a chessboard. Second, calibration between the depth camera and the manipulator is applied to find the homogeneous relationship (H^{cam}_{base}); this second calibration requires a marker attached to the robot's end-effector. The robot is then moved to generate multiple configurations in order to reduce the error of the homogeneous matrix as much as possible, and the iterative least-squares method is applied to this calibration. The experiment shows that the 3D model can delete the point cloud of the robot from the depth images. Also, the depth images acquired from the Kinect are cleaner than those from stereo matching methods. Furthermore, the proposed method requires no prior known object to remove the point cloud. However, a disadvantage of the proposed method is occlusion, which can be avoided by using multiple depth cameras.


REFERENCES

[1] S. Khan and M. Shah, "Object based segmentation of video using color, motion and spatial information," IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. II-746 - II-751, 2001.
[2] H. Shoji, K. Shirai, and M. Ikehara, "Shape from focus using color segmentation and bilateral filter," IEEE Digital Signal Processing Workshop and 12th Signal Processing Education Workshop, pp. 566-571, 2006.
[3] R. B. Rusu, A. Holzbach, M. Beetz, and G. Bradski, "Detecting and segmenting object for mobile manipulator," IEEE 12th International Conf. on Computer Vision Workshops, pp. 47-54, 2009.
[4] S. Fazl, H. M. Dehnavi, and P. Moallem, "A robust obstacle detection method in highly textured environments using stereo vision," 2nd International Conf. on Machine Vision, pp. 97-100, 2009.
[5] N. Suganuma, "Clustering and tracking of obstacles using stereo vision system," International Joint Conf. ICCAS-SICE, pp. 4623-4628, 2009.
[6] M. Marron, J. C. Garcia, M. A. Sotelo, D. Pizzarro, and I. B. Munoz, "Extraction of 3D feature from complex environments in visual tracking applications," IEEE Conf. on Instrumentation and Measurement Technology, pp. 1-5, 2007.
[7] C. Zhao and R. Mohr, "Epipolar parameterization for reconstructing a 3D rigid curve," International Symposium on Computer Vision, pp. 67-72, 1995.
[8] A. S. Vazquez, A. Adan, and F. Molina, "3D vision system for robot interaction in moving scenes," IEEE International Symposium on Intelligent Signal Processing, pp. 1-6, 2007.
[9] S. D. Roeck, N. Cornelis, and L. V. Gool, "Augmenting fast stereo with silhouette constraints for dynamic 3D capture," 18th International Conf. on Pattern Recognition, pp. 131-134, 2006.
[10] M. Toyoura, M. Liyama, T. Funatomi, K. Kakusho, and M. Minoh, "3D shape reconstruction from incomplete silhouettes in multiple frames," 19th International Conf. on Pattern Recognition, pp. 1-4, 2008.
[11] H. Y. Lin and J. R. Wu, "3D reconstruction by combining shape from silhouette with stereo," 19th International Conf. on Pattern Recognition, pp. 1-4, 2008.
[12] D. Henrich and T. Gecks, "Multi-camera collision detection between known and unknown objects," 2nd ACM/IEEE International Conf. on Distributed Smart Cameras, pp. 1-10, 2008.
[13] K. Conley. (2011, May 26). http://www.ros.org/wiki/kinect_node
[14] N. Ziraknejad, S. Tafazoli, and P. D. Lawrence, "Autonomous stereo camera parameter estimation for outdoor visual servoing," IEEE Workshop on Machine Learning for Signal Processing, pp. 157-162, 2007.

Fig. 10. Point cloud of each scene, which contains the point cloud of the manipulator inside the black circle (scale in cm).

Fig. 12. Point cloud of the surroundings when the point cloud of the manipulator is eliminated (scale in cm).


[15] N. Burrus. (2011, August 07). http://nicolas.burrus.name/index.php/Research/KinectCalibration
[16] Robot Controls - Robot Technology Training Manual IQ200-B & IQ200-P: Operating Using PHG and ROPS4 or RobotSuite, Programming with BAPSplus or RobotSuite, Bosch Rexroth AG, Berliner Strasse 25, Germany, 2000-2005, p. 42.
