



A Unified Framework for Grasping and Shape Acquisition Via Pretouch Sensing

Liang-Ting Jiang and Joshua R. Smith

Abstract— This paper presents a unified framework to enable automatic exploration with a pretouch sensor to reduce object shape uncertainty before grasping. The robot starts with only the incomplete object shape data it acquires from a Kinect depth sensor—it has no model of the object. Combining the Kinect pointcloud with prior probability distributions for occlusion and transparency, it makes inferences about portions of the object that the Kinect was not able to observe. Operating on the inferred shape of the object, an iterative grasp replanning and exploration algorithm decides when further exploration is required, and where to explore in the scene using the pretouch sensor. The information gathered by the exploration action is added directly to the robot's environment representation and is considered automatically in the next grasp planning iteration.

The pretouch sensor used is the first "seashell effect" pretouch sensor fully integrated into a robot (the PR2); the sensor replaces the PR2's touch sensor pads, and uses the PR2's existing hardware interfaces to eliminate all external wires for power and signal. This new non-contact sensor enables the robot to gather local object shape information without disturbing it.

The new sensor and framework were validated on the PR2 robot for their usefulness in sensor-driven grasping of un-modeled objects.

I. INTRODUCTION

Robotic grasping of unknown objects is a challenging task due to the uncertainty arising from imperfect perception. Vision and depth sensors are commonly used for sensing before grasping. However, when it comes time to execute a grasp, vision has shortcomings. The robot's hand and arm often occlude a head-mounted camera's view of the object to be manipulated. A head-mounted camera is not in the same coordinate frame as the hand, and thus even if the camera has an unrestricted view of the object, uncertainties associated with actuation can lead to manipulation errors. Optical sensors are also known to fail to detect transparent objects, and to fail under poor lighting conditions.

Tactile exploration is widely used to acquire local geometric information of the object [1][2], and is much less subject to occlusion. However, because touch sensing relies on contact between the manipulator and the object, touch sensing tends to displace objects, particularly light objects.

Pretouch sensing has been proposed to address the above perception problems [3][4]. The aim of pretouch sensing is to combine the desirable features of touch and vision sensing: like touch, it provides information about the relative geometry of the hand and the object, in the hand's coordinate frame; like vision, it is non-contact and does not disturb the object's position. In our previous work, we introduced two different pretouch modalities, E-field sensing [5][6][7] and seashell effect pretouch sensing [8]. We demonstrated the use of the seashell effect pretouch sensor to detect transparent and occluded objects before grasping. Recently, Maldonado et al. [4] also showed the use of an optical pretouch sensor to explore the back side of objects. However, in prior work the selection of uncertain areas for exploration was not automatic.

Liang-Ting Jiang is with the Department of Mechanical Engineering, University of Washington, Seattle, WA. [email protected]

Joshua R. Smith is with the Departments of Computer Science and of Electrical Engineering, University of Washington, Seattle, WA. [email protected]

In this paper we propose a probabilistic framework to enable automatic exploration with a pretouch sensor to reduce object shape uncertainty before grasping. The robot uses the information acquired by the depth sensor as initial data in a probabilistic mapping representation, and uses priors on occlusion and transparency to make inferences about possibly occluded and transparent portions of the object. An iterative grasp replanning and exploration algorithm decides when further exploration is required, and where to explore in the scene using pretouch. The information gathered by the exploration action is added directly to the robot's environment representation. The main contributions of this work are:

• A combined grasp planning and pretouch sensing framework. We propose an iterative grasp replanning and exploration algorithm, in which the grasp planner automatically determines when and where to do the pretouch sensing exploration to collect additional information about object shape.

• The first seashell effect pretouch sensor fully integrated into a robot (the PR2); the sensor replaces the PR2's touch sensor pads, and uses the PR2's existing hardware interfaces to eliminate all external wires for power and signal.

II. RELATED WORK

A common way to deal with grasping under uncertainty is to take a sequence of actions (sensing, pushing, etc.) which reduces the uncertainty about the object. Planning an optimal sequence of these actions can be modeled as a partially observable Markov decision process (POMDP) problem, which is known to be PSPACE-complete [9]. Therefore, POMDP-based action planning is usually used on abstracted state and action spaces [10].

Information-theoretic optimal action selection is widely applied to this set of problems. Hebert et al. [11] use Next-Best-Touch to estimate the pose of an object. Next-Best-View sensing with an in-hand Kinect [12] has the drawback that such a sensor is difficult to integrate with a manipulator.


Dogar et al. [13] demonstrated a push-grasp algorithm that addresses the uncertainty about object poses in clutter. Recently, Platt et al. [14] proposed to simultaneously localize and grasp known objects under uncertainty by creating plans in belief-space, the space of probability distributions over the underlying state of the system. The algorithm estimates the state of the system and attempts to reach a goal region with high probability. Javdani et al. [15] proposed a method to select a sequence of information-gathering actions to localize and grasp known objects by considering many hypotheses of object pose and greedily selecting actions expected to disprove the most hypotheses. This formulation is shown to be adaptive submodular.

In contrast to the above work, our framework aims to grasp unknown or novel objects. For known objects (ones where a model of object shape is available), the problem can be approximately reduced to a pose and state estimation problem. In our framework, the object's shape is not available to the robot in advance, and pretouch sensing collects additional information about object shape, rather than being used to estimate object pose. Dang et al. [16] utilized tactile feedback to predict the stability of a robotic grasp without knowing the object's model, but the action to take after identifying instability is not mentioned in their work. Our framework combines grasp planning and shape sensing naturally: the grasp planner automatically determines when and where to do the pretouch sensing exploration, and the information gathered during exploration is considered during the next iteration of the grasp planning, i.e., the shape uncertainty is reduced in an attempt to achieve a better grasp.

III. FRAMEWORK AND ENVIRONMENT REPRESENTATION

The framework consists of two main subsystems. The robot first infers the object shape from Kinect data in a probabilistic occupancy map representation. The constructed occupancy probability distribution is later used by the iterative grasp replanning algorithm to decide when further exploration is required, and where to explore in the scene. The robot executes the grasp when the algorithm terminates. The flowchart of the framework is shown in Figure 1.

To let the robot plan under uncertainty, a probabilistic environment representation is required. Hollinger et al. [17] used Gaussian process implicit surfaces to represent the uncertainty and formulated the active sensing problem as a problem of minimizing the average variance. We use OctoMap [18], a compact and efficient probabilistic 3-D grid environment representation which stores the occupancy probability at each grid cell, to represent the uncertainty around the tabletop and objects. OctoMap has become the standard collision map environment [19] for online motion planning. It provides easy conversion between occupancy grids and pointclouds, which makes it easy to use various algorithms from the Point Cloud Library (PCL) [20].
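To make the per-cell bookkeeping concrete, the following is a minimal sketch of the log-odds occupancy representation that OctoMap implements with an octree; the class name and the dense numpy storage are our own illustration, not OctoMap's API.

```python
import numpy as np

def logit(p):
    """Convert a probability to log-odds."""
    return np.log(p / (1.0 - p))

class OccupancyGrid:
    """Dense stand-in for OctoMap's per-cell log-odds storage.

    Cells start at p = 0.5 (log-odds 0), i.e. a uniform prior,
    matching the initialization described in Section IV.
    """
    def __init__(self, shape, resolution=0.005):
        self.resolution = resolution      # 5 mm cells, as in the paper
        self.logodds = np.zeros(shape)    # logit(0.5) == 0 everywhere

    def bayes_update(self, idx, p_meas):
        """Standard additive log-odds Bayesian update for one cell."""
        self.logodds[idx] += logit(p_meas)

    def probability(self, idx):
        """Recover occupancy probability from log-odds."""
        return 1.0 / (1.0 + np.exp(-self.logodds[idx]))
```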

Fig. 1. The flowchart of the probabilistic framework for pretouch exploration and grasping under uncertainty. (Blocks in the flowchart: the Kinect pointcloud feeds Object Shape Inference via ray casting with the occlusion and transparency priors, producing the Tabletop Probabilistic Occupancy Grids (OctoMap); Iterative Grasp Re-Planning and Exploration (Algorithm 1) runs Probabilistic Grasp Planning (Weighted PCA) and ComputeStability (Subroutine 1) on a grasp pose; if s < s_t, Pretouch Exploration with Bayesian updates is required, and if s > s_t the grasp is executed.)

IV. INFERRING OBJECT SHAPE FROM KINECT DATA

Glover et al. [21] proposed to learn probabilistic generative models of a class of objects from 2-D images, and the learned models can be used to infer the hidden parts to complete the object contour before manipulation. We take a different approach, using priors inspired by the sensor characteristics to infer the object shape in 3-D for the purpose of further pretouch sensing and grasping. No model training is required in our framework. We first construct an initial occupancy distribution from Kinect data, and then infer the object shape using prior probability distributions for occlusion and transparency in the 3-D space. Specifically, all occupancy grid cells are initialized with the uniform distribution (p = 0.5), and are updated using ray casting from the Kinect pointcloud to build the initial occupancy distribution. To make the robot aware of the uncertainty of the object shape, we use occlusion and transparency priors to extend the occupied grid cells detected by the Kinect, inferring potentially occluded and transparent parts of the object.

A. Ray Casting from the Kinect Pointcloud

A beam-based inverse sensor model of the Kinect is used to update the occupancy grid. The underlying ray-casting algorithm is a 3-D variant of the Bresenham algorithm [22]. The occupancy of the grid cells along each ray is updated by Bayesian updates using the Kinect's beam-based inverse sensor model (p_kinect,hit = 0.7 and p_kinect,miss = 0.4).
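As an illustration of this update, the sketch below integrates a single Kinect beam into a log-odds voxel array; the fixed-step ray walk is a stand-in for the 3-D Bresenham traversal, and the function name, array-based interface, and the assumption of a grid anchored at the world origin with non-negative coordinates are our own.

```python
import numpy as np

P_HIT, P_MISS = 0.7, 0.4            # Kinect inverse sensor model (paper values)

def logit(p):
    return np.log(p / (1.0 - p))

def integrate_beam(logodds, origin, endpoint, res=0.005):
    """Bayesian log-odds update of a voxel grid along one Kinect beam.

    logodds: 3-D numpy array of per-cell log-odds (0.0 means p = 0.5).
    Cells between the sensor origin and the measured endpoint get the
    'miss' update; the endpoint cell gets the 'hit' update.
    """
    origin = np.asarray(origin, float)
    endpoint = np.asarray(endpoint, float)
    direction = endpoint - origin
    length = np.linalg.norm(direction)
    step = res * direction / length
    end_idx = tuple((endpoint / res).astype(int))
    pos = origin.copy()
    for _ in range(int(length / res)):
        idx = tuple((pos / res).astype(int))
        if idx == end_idx:
            break
        logodds[idx] += logit(P_MISS)    # free space along the ray
        pos += step
    logodds[end_idx] += logit(P_HIT)     # occupied at the measured return
```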


B. Occlusion Prior

Occlusion is a common problem in object grasping. Often it causes underestimates of the object depth. To deal with the occlusion problem, the robot should have doubt about the backside of the object. Based on the physics of the depth sensor, grid cells along the ray behind each point sensed by the Kinect are unknown and possibly occupied. In the most general case, occluded portions of an object could appear anywhere from the detected point to the maximum range of the sensor. In the tabletop environment, we can truncate the distribution at the table surface. For the occupancy prior (along the ray from a sensed point to the table top), we use an exponential distribution. This corresponds to the simple assumption that for a grid cell i and its neighbor along the ray i−1, the conditional probability of occupancy p(i|i−1) is a constant. Another interpretation is that the exponential distribution models the distribution of the distance from the actually detected point to the first non-occupied grid cell along the ray, given the conditional independence mentioned above. The inferred occupancy probability along the ray is given by p = p0 × e^(−λ_occ x), where p0 is the occupancy probability at the actually detected point, x is the distance in meters from the detected point, and λ_occ is the exponential decay rate. In our experiments, we use λ_occ = 4.0. For example, if p_kinect,hit were 1.0, then the occupancy probability would have decayed to 0.5 at a distance of 17 cm. Essentially the decay rate in our occupancy prior encodes the robot's initial assumptions about the size of the object.
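A minimal sketch of how such a prior could be written into the grid, assuming the prior is applied as a floor on the stored log-odds (the paper does not specify the exact write rule); the function name and interface are illustrative only.

```python
import numpy as np

LAMBDA_OCC = 4.0                        # decay rate from the paper

def apply_occlusion_prior(logodds, hit_point, ray_dir, table_z,
                          p0=0.7, res=0.005):
    """Extend occupancy behind a Kinect return: p(x) = p0 * exp(-lambda*x).

    hit_point: detected point (meters); ray_dir: unit vector pointing
    away from the sensor. The prior is truncated at the table plane
    (z = table_z), as described in the text.
    """
    ray_dir = np.asarray(ray_dir, float)
    x = res
    point = np.asarray(hit_point, float) + x * ray_dir
    while point[2] > table_z:           # truncate at the table surface
        p = p0 * np.exp(-LAMBDA_OCC * x)
        if p <= 0.5:                    # decayed down to the uniform prior:
            break                       # nothing left to add
        idx = tuple((point / res).astype(int))
        logodds[idx] = max(logodds[idx], np.log(p / (1.0 - p)))
        x += res
        point += res * ray_dir
```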

C. Transparency Prior

Transparent and non-Lambertian objects are another common error source for depth sensors. One important observation when inspecting the Kinect data for transparent objects is that although the transparent parts are not detected by the sensor, they still block the ray, so nothing can be detected behind the transparent cell along the ray. Based on this observation, we can search for unknown areas in the tabletop scene, where points should be observed but nothing is detected. If we cast a ray from the Kinect sensor along a direction that has no depth data, there are likely to be transparent parts somewhere along the ray.

There are several possible ways to look for these suspicious areas. In the tabletop environment, we do pointcloud processing to find the holes (shadows) on the table, and then cast rays to those shadow points from the Kinect sensor frame. We also cast rays upward and downward from occupied grid cells, and if any of these rays intersects with the rays cast from the sensor to shadow points, the intersecting grid cells are potential transparent cells. We make a similar assumption that the likelihood of finding transparent grid cells decays exponentially with distance from occupied grid cells. The occupancy probability associated with transparent grid cells is thus defined as p = p0 × e^(−λ_trans x), where p0 is the occupancy probability at an occupied grid cell, x is the distance from that grid cell, and λ_trans is the exponential decay rate. In our experiments, we use λ_trans = 4.0.
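As an illustration of the shadow search, the sketch below rasterizes the detected table-surface points into a 2-D grid and reports the cells that received no return; casting rays from the Kinect toward these cells and applying the exponential prior would then follow as in the occlusion case. The interface is our own assumption, not the authors' code.

```python
import numpy as np

def find_table_shadows(table_pts, table_bounds, cell=0.005):
    """Return table-plane grid cells with no Kinect return ('shadows').

    table_pts: (N, 2) array of detected table-surface points (x, y).
    table_bounds: ((xmin, xmax), (ymin, ymax)) extent of the table.
    Cells inside the table extent with no detected point may lie in the
    shadow of a transparent object that absorbed the ray.
    """
    (xmin, xmax), (ymin, ymax) = table_bounds
    nx = int((xmax - xmin) / cell)
    ny = int((ymax - ymin) / cell)
    seen = np.zeros((nx, ny), dtype=bool)
    ij = np.floor((table_pts - [xmin, ymin]) / cell).astype(int)
    ok = (ij[:, 0] >= 0) & (ij[:, 0] < nx) & (ij[:, 1] >= 0) & (ij[:, 1] < ny)
    seen[ij[ok, 0], ij[ok, 1]] = True
    return np.argwhere(~seen)   # indices of shadow cells
```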

Fig. 2. (a) The pointcloud detected by the Kinect is missing the transparent portion of the bottle, and also underestimates the depth of the object. (b) The probabilistic map representation after tabletop uncertainty processing: the transparency and occlusion priors are used to extend the initial occupancy data obtained from the Kinect. (The color bar represents occupancy probability between 0.5 and 1.0 in logit space.)

D. Object Shape Inference Results

Figure 2 shows the visualization of the constructed probabilistic occupancy map after inferring object shape from Kinect data. The occupancy distribution was extended toward the potentially occluded and transparent portions of the object. In this particular example, the transparency prior matches the actual object shape even at the nozzle of the bottle. We can also see that grid cells are less likely to be occupied the further they are from the Kinect pointcloud.

V. SEASHELL EFFECT PRETOUCH SENSOR

We use the previously introduced seashell effect pretouch sensor [8] for the pretouch explorative sensing. Unlike the previous sensor, the sensor used in this paper is completely integrated into the Willow Garage PR2 robot. A customized printed circuit board (PCB) and fingertip structure were designed to hold all the electronic and mechanical components, including the microphone and the pipe. Figure 3(b) shows the PCB and fingertip fixture with all the components attached, and Figure 3(a) shows the sensor fingertip installed on the Willow Garage PR2 gripper. The sensor data is transmitted to the FPGA in the PR2's gripper, and is then accessible from the FPGA to the Robot Operating System (ROS) through the EtherCAT interface. The new embedded sensor design eliminates the constraints on robot arm motion caused by external wires and electronics, and thus broadens the applicability of the sensor. The design presented here could be adapted to integrate the sensor into other robotic platforms.


Fig. 3. (a) The integrated seashell effect pretouch sensing fingertip installed on the PR2 robot gripper. The microphone for the sensing channel is attached to an acoustic cavity to amplify (attenuate) the ambient sound. The microphone for the reference channel is used to collect the ambient sound for noise cancellation. (b) The sensor consists of a printed circuit board, a microcontroller (Atmel ATmega168), an audio amplifier and other electronic components, a microphone, and an acoustic cavity integrated into the PR2 fingertip. The sensor uses the existing hardware SPI interfaces on the gripper for power and data communication with the PR2 robot.

VI. ITERATIVE GRASP REPLANNING AND PRETOUCH EXPLORATION

After constructing the occupancy map by inferring the object shape from Kinect data, an iterative grasp replanning and exploration algorithm is used to decide when further exploration is required, and where to explore in the scene using a pretouch sensor. The information gathered by the exploration action is added directly to the robot's environment representation and is considered automatically in the next grasp replanning iteration. The algorithm terminates and executes the grasp when it finds a grasp with high stability for both the left finger and the right finger. Figure 1 outlines the schematic and flow of the framework. In the following subsections, we introduce the individual pieces required for the algorithm.

A. Probabilistic Grasp Planning

The grasp planner plans grasps based on the occupied grid cells inferred from the objects in the probabilistic occupancy map. We use a variant of the cluster-based grasp planner for unknown objects described in [23]. The original planner first finds an oriented bounding box of the object using principal component analysis (PCA) from the object's pointcloud, and then finds grasps aligning the PR2's parallel gripper with the principal axes.

Fig. 4. (a) The object. (b) Comparison of the bounding boxes computed using regular PCA on the original pointcloud and weighted PCA on the probabilistic pointcloud. (c) The gripper finger location and the k-nearest points in the occupancy map to the finger location for a given grasp pose, displayed with the original object pointcloud detected by the Kinect. (d) The same visualization displayed with the probabilistic occupancy map after inferring the object's shape from Kinect data. (Color bar: occupancy probability 0.5 to 1.0.)

Our modified version of this grasp planner takes the occupancy probability of each point in the object pointcloud into account. We first convert the probabilistic occupancy map into a probabilistic pointcloud, with a probability associated with each point, for the grid cells with occupancy probability p > 0.5. The planner then finds the object bounding box by computing the weighted PCA of the probabilistic pointcloud. Figure 4(b) compares the object bounding boxes computed using regular PCA on the original pointcloud from the Kinect and weighted PCA on the probabilistic pointcloud converted from the occupancy map. We can see that the PCA bounding box computed from the original pointcloud (red) does not cover the transparent parts of the object, and its dimension in the depth direction is too short due to the occlusion. In contrast, the weighted PCA bounding box computed from the probabilistic pointcloud (green) takes the occupancy probability in the potentially transparent and occluded portions into account, and thus gives the grasp planner the opportunity to plan a grasp around those areas.
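A minimal numpy sketch of the probability-weighted PCA bounding box; the function name and interface are ours, not the planner's actual code.

```python
import numpy as np

def weighted_pca_box(points, probs):
    """Oriented bounding box axes from probability-weighted PCA.

    points: (N, 3) probabilistic pointcloud (cells with p > 0.5);
    probs:  (N,)  occupancy probability of each point, used as weights
    so uncertain (inferred) points still pull the box toward the
    occluded and transparent regions, just with less influence.
    """
    w = probs / probs.sum()
    mean = (w[:, None] * points).sum(axis=0)        # weighted centroid
    centered = points - mean
    cov = (w[:, None] * centered).T @ centered      # weighted covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # columns = principal axes
    proj = centered @ eigvecs                       # extents along each axis
    extents = proj.max(axis=0) - proj.min(axis=0)
    return mean, eigvecs, extents
```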


Fig. 5. An example pretouch pose and direction. The robot uses the seashell effect pretouch sensor to probe toward the potential contact area on the object.

B. Grasp Stability Evaluation

To decide whether a planned grasp is ready to be executed or not, we evaluate its stability. We first find the potential contact areas on the object by finding the occupancy probabilities p_{l,i} and p_{r,i} of the k-nearest neighbor points, for i = 0, ..., k−1, in the probabilistic pointcloud to the left and right fingertip surface centroids. We define s_l and s_r, the stability of a given grasp at the left finger and the right finger, respectively, to be the mean occupancy probability: s_m = (∑_{i=0}^{k−1} p_{m,i}) / k for m = l, r. We further define a stability threshold s_t. If either s_l or s_r is smaller than the threshold s_t, the planned grasp is attempting to make contact with the object in an uncertain area. The robot then needs to explore that area using the pretouch sensor to reduce the uncertainty, instead of executing the grasp. The subroutine to compute the stability of a planned grasp is described in Subroutine 1.

C. Pretouch Exploration

When the planned grasp is not certain, pretouch exploration is required to reduce the uncertainty before grasping. The desired pose for the robot to probe using the pretouch sensor on its fingertip can be computed from the planned grasp: the pose points along the gripper closing direction of the planned grasp, toward the centroid of the k-nearest neighbor points. During the pretouch probe execution, the robot moves the sensor to the pretouch pose, and moves the end effector toward the potential contact area at constant speed. Figure 5 illustrates an example pretouch probe configuration. While probing, the robot continuously uses the pretouch sensor's inverse sensor model to update the occupancy probability of the grid cells within the sensing range of 4 mm (p_pretouch,hit = 0.95, p_pretouch,miss = 0.48). Figure 6 shows an example of how a single probe successfully updates the occupancy when it detects the object.
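A sketch of the probe-time update under these parameter values; the function signature and the fixed-step sweep over the 4 mm sensing range are our own simplification.

```python
import numpy as np

P_PT_HIT, P_PT_MISS = 0.95, 0.48        # pretouch inverse sensor model
SENSING_RANGE = 0.004                   # 4 mm, from the paper

def logit(p):
    return np.log(p / (1.0 - p))

def pretouch_update(logodds, fingertip, probe_dir, detected, res=0.005):
    """Update cells within the pretouch sensing range of the fingertip.

    detected=True applies the strong 'hit' update (p = 0.95); an empty
    reading applies the weak 'miss' update (p = 0.48, barely below 0.5),
    reflecting how little a non-detection says at this range.
    """
    p = P_PT_HIT if detected else P_PT_MISS
    pos = np.asarray(fingertip, float)
    step = res * np.asarray(probe_dir, float)
    for _ in range(int(SENSING_RANGE / res) + 1):
        idx = tuple((pos / res).astype(int))
        logodds[idx] += logit(p)        # Bayesian log-odds update
        pos += step
```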

D. Iterative Grasp Replanning and Pretouch Exploration

The complete procedure is described in Algorithm 1. The robot first uses the weighted-PCA cluster-based planner to find a grasp candidate. The stability of the planned grasp is computed using Subroutine 1. For each finger of the gripper, if the stability is lower than the stability threshold s_t, the robot uses the pretouch sensor to explore the potential contact areas, and then replans a grasp based on the updated occupancy grid cells. The algorithm terminates and executes the planned grasp when a stable grasp is found.

Subroutine 1 ComputeStability(m, Grasp)
Input: m: the side of the finger (l or r)
Input: Grasp: a planned grasp from the grasp planner
Output: s_m: the contact stability at finger m
 1: Find the k-nearest neighbors in the probabilistic pointcloud to finger m on Grasp
 2: for i ← 0 to k − 1 do
 3:     p_{m,i} ← occupancy probability of point i
 4: end for
 5: s_m ← (∑_{i=0}^{k−1} p_{m,i}) / k
 6: return s_m
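For concreteness, here is a Python rendering of Subroutine 1 using a k-d tree for the nearest-neighbor query; the interface (arrays of points and their occupancy probabilities) is our assumption about how the probabilistic pointcloud would be passed around.

```python
import numpy as np
from scipy.spatial import cKDTree

def compute_stability(finger_centroid, cloud_pts, cloud_probs, k=40):
    """Python rendering of Subroutine 1 (names are ours).

    Stability at one finger is the mean occupancy probability of the
    k nearest points in the probabilistic pointcloud to the fingertip
    surface centroid of the planned grasp.
    """
    tree = cKDTree(cloud_pts)               # k-NN index over the cloud
    _, idx = tree.query(finger_centroid, k=k)
    return float(np.mean(cloud_probs[idx]))
```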

Fig. 6. (a) The probabilistic occupancy map before the pretouch sensor detects the object; (b) the probabilistic occupancy map after the pretouch sensor detects the object and updates the nearby occupancy probability using its sensor model. (The color bar represents occupancy probability between 0.5 and 1.0 in logit space.)


Algorithm 1 Iterative Grasp Replanning and Exploration
 1: s_t ← stability threshold
 2: StableGraspFound ← False
 3: while not StableGraspFound do
 4:     Grasp ← probabilistic grasp planning results
 5:     for m = l, r do
 6:         s_m ← ComputeStability(m, Grasp)
 7:         if s_m < s_t then
 8:             execute pretouch probe to the contact area
 9:         end if
10:     end for
11:     if no pretouch is executed then
12:         StableGraspFound ← True
13:     end if
14: end while
15: execute Grasp
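The loop can be rendered compactly in Python; the four callables below are hypothetical hooks standing in for the planner, Subroutine 1, the probe motion, and grasp execution, none of which are named in the paper.

```python
def grasp_with_pretouch(plan_grasp, stability, probe, execute, s_t=0.7):
    """Sketch of Algorithm 1; all four callables are hypothetical hooks.

    plan_grasp() -> grasp        : probabilistic grasp planner (Sec. VI-A)
    stability(grasp, m) -> float : Subroutine 1 for finger m in {'l', 'r'}
    probe(grasp, m)              : pretouch exploration toward finger m's
                                   potential contact area (Sec. VI-C)
    execute(grasp)               : run the grasp on the robot
    """
    while True:
        grasp = plan_grasp()                 # replan on the updated map
        probed = False
        for m in ("l", "r"):
            if stability(grasp, m) < s_t:    # uncertain contact area:
                probe(grasp, m)              # reduce uncertainty first
                probed = True
        if not probed:                       # both fingers stable:
            break                            # a stable grasp was found
    execute(grasp)
```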

VII. EXPERIMENTAL RESULTS

We evaluate the proposed framework on the PR2 robot, using the seashell effect pretouch sensor described in Section V, for object grasping. The framework's parameters used in this set of experiments are: s_t = 0.7, p_kinect,hit = 0.7, p_kinect,miss = 0.4, p_pretouch,hit = 0.95, p_pretouch,miss = 0.48, λ_trans = 4.0, λ_occ = 4.0, and k = 40 for, respectively, the grasp stability threshold, the Kinect's inverse sensor model, the pretouch sensor's inverse model, the occlusion and transparency priors' decay rates, and the number of neighbor points used to evaluate the contact stability.


Fig. 7. A set of example system states (pointcloud, probabilistic map, pretouch, and grasp) during the progress of grasping using the proposed framework for the two objects: (a) a coffee press; (b) a partially transparent juice bottle.

The grasp stability threshold s_t was determined by testing it in combination with the other parameters. A higher threshold makes the framework more robust, but also makes the algorithm slower to converge. The OctoMap resolution we use for the tabletop environment is 5 mm.
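For reference, the full parameter set from this section gathered in one place (the dictionary itself is our illustration, not the authors' code):

```python
# All framework parameters reported in Section VII:
PARAMS = {
    "stability_threshold": 0.7,     # s_t
    "p_kinect_hit": 0.7,
    "p_kinect_miss": 0.4,
    "p_pretouch_hit": 0.95,
    "p_pretouch_miss": 0.48,
    "lambda_occlusion": 4.0,
    "lambda_transparency": 4.0,
    "knn_points": 40,               # k for stability evaluation
    "octomap_resolution_m": 0.005,  # 5 mm grid cells
}
```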

Two objects, a coffee press and a partially transparent bottle, were tested multiple times. Figure 7 shows a set of example system states during the progress of grasping for the two objects. The coffee press is almost completely transparent except right at the bottom. Therefore, the pointcloud from the Kinect covers only the bottom of the object, as we can see in Figure 7(a). Most grasp planners would fail in this case. Using our framework, the initial probabilistic map after inferring the object shape from the Kinect data successfully extends the occupancy prior from the bottom to the upper transparent portion.

TABLE I
RESULTS OF GRASPING TRIALS USING THE PROPOSED FRAMEWORK

               Avg. pretouch probes   Trials   Successful   Success
               before grasping                 grasps       rate
Coffee Press   1.54                   12       12           100%
Bottle         0.40                   10        9            90%

The grasp planner can thus plan a grasp based on this prior occupancy distribution, as introduced in Section VI-A. However, since the extended occupancy probability is lower than the occupancy probability at the actually sensed grid cells, the planned grasp is often required to make contact with some hypothesized portion of the object. In consequence, Algorithm 1 decides to probe those areas and thus reduces the uncertainty in the system at each iteration. In the example shown in Figure 7(a), the pretouch probe successfully increases (verifies) the occupancy probability of the grid cells near the grasping contact area, and the subsequent grasp was planned and executed at the verified location. A video of this example experiment is online at:

https://sensor.cs.washington.edu/pretouch/icra13/

The partially transparent bottle, on the other hand, requires much less pretouch sensing before grasp execution. The sensed pointcloud for the bottle is usually sufficient to plan a stable grasp, because the grasp planner plans the grasp in favor of the areas with higher occupancy probability using weighted PCA. In contrast, the grasp planner is forced to plan grasps on uncertain areas for the coffee press, because there are very few grid cells with high occupancy probability. We do notice that the extended occupancy prior also makes the grasp planner tend to favor front grasps with respect to the Kinect's coordinate frame, which can significantly reduce the errors arising from underestimating the object depth when doing side grasps. Figure 7(b) shows an example of a successful front grasp.

Table I summarizes the outcomes of the grasping trials on the two objects. The grasping tasks were mostly successful, except for one case of inverse kinematics failure. Not surprisingly, the frequency of pretouch probes is much higher when grasping the coffee press than when grasping the bottle. Nonetheless, the iterative grasp replanning and exploration algorithm usually terminates within 5 pretouch probes by reducing the shape uncertainty, and successfully grasps the coffee press.

VIII. CONCLUSIONS AND FUTURE WORK

We proposed and validated a unified probabilistic framework for pretouch exploration and grasping under uncertainty. The iterative grasp replanning and exploration reduces the uncertainty of the object shape at each iteration of the algorithm. We showed that under this framework the PR2 robot is able to grasp an almost completely transparent unknown object with a high success rate by using pretouch exploration, which is not possible when using traditional grasp planning on a deterministic pointcloud. The new fully-integrated seashell effect pretouch sensor enables the robot to gather local shape information of the object without disturbing it.

In the future, we plan to experiment with different probabilistic grasp planners. While the current planner we use already takes the object shape uncertainty into account, it may not be the grasp planner best suited to this framework. One could imagine designing a planner that favors previously explored areas to speed up the convergence of the iterative grasp replanning algorithm. Under the current framework, other sensing modalities, such as touch sensors, can be integrated smoothly. We plan to integrate our previously introduced E-field pretouch sensor into the current PR2 seashell effect pretouch finger sensor. Using multiple pretouch sensing modalities will make each pretouch exploration more informative (i.e., reduce more uncertainty), so that our algorithm can converge faster and become more robust with a higher stability threshold. To further exploit our approach, we also consider combining the grasp planning and stability evaluation with potentially graspable features in the voxel grid by matching against a large database [24].

REFERENCES

[1] R. Platt, F. Permenter, and J. Pfeiffer, "Inferring hand-object configuration directly from tactile data," in ICRA Workshop on Mobile Manipulation, Anchorage, USA, 2010.

[2] A. Petrovskaya and O. Khatib, "Global localization of objects via touch," IEEE Transactions on Robotics, vol. 27, no. 3, pp. 569–585, 2011.

[3] K. Hsiao, P. Nangeroni, M. Huber, A. Saxena, and A. Y. Ng, "Reactive grasping using optical proximity sensors," in IEEE International Conference on Robotics and Automation, Kobe, Japan, 2009.

[4] A. Maldonado, H. Alvarez, and M. Beetz, "Improving robot manipulation through fingertip perception," in IEEE/RSJ International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, 2012.

[5] J. R. Smith, E. Garcia, R. Wistort, and G. Krishnamoorthy, "Electric field imaging pretouch for robotic graspers," in IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, USA, 2007.

[6] R. Wistort and J. R. Smith, "Electric field servoing for robotic manipulation," in IEEE/RSJ International Conference on Intelligent Robots and Systems, Nice, France, 2008.

[7] B. Mayton, L. LeGrand, and J. R. Smith, "An electric field pretouch system for grasping and co-manipulation," in IEEE International Conference on Robotics and Automation, 2010.

[8] L.-T. Jiang and J. R. Smith, "Seashell effect pretouch sensing for robotic grasping," in IEEE International Conference on Robotics and Automation, Saint Paul, USA, 2012.

[9] S. Ross, J. Pineau, S. Paquet, and B. Chaib-draa, "Using Bayesian filtering to localize flexible materials during manipulation," The Journal of Machine Learning Research, vol. 32, 2008.

[10] K. Hsiao, L. Kaelbling, and T. Lozano-Perez, "Grasping POMDPs," in IEEE International Conference on Robotics and Automation, Roma, Italy, 2007.

[11] P. Hebert, J. W. Burdick, T. Howard, N. Hudson, and J. Ma, "Action inference: The next best touch," in Robotics: Science and Systems Workshop on Mobile Manipulation, Sydney, Australia, 2012.

[12] C. Potthast and G. S. Sukhatme, "A probabilistic framework for next best view estimation in a cluttered environment," in IEEE/RSJ International Conference on Intelligent Robots and Systems, San Francisco, USA, 2011.

[13] M. Dogar and S. Srinivasa, "A framework for push-grasping in clutter," in Proceedings of Robotics: Science and Systems, Los Angeles, USA, 2011.

[14] R. Platt, L. Kaelbling, T. Lozano-Perez, and R. Tedrake, "Non-Gaussian belief space planning: Correctness and complexity," in IEEE International Conference on Robotics and Automation, Saint Paul, USA, 2012.

[15] S. Javdani, M. Klingensmith, J. A. D. Bagnell, N. Pollard, and S. Srinivasa, "Efficient touch based localization through submodularity," Robotics Institute, Pittsburgh, PA, Tech. Rep. CMU-RI-TR-12-25, 2012.

[16] H. Dang, J. Weisz, and P. Allen, "Blind grasping: Stable robotic grasping using tactile feedback and hand kinematics," in IEEE International Conference on Robotics and Automation, Shanghai, China, 2011.

[17] G. Hollinger, B. Englot, F. Hover, U. Mitra, and G. Sukhatme, "Uncertainty-driven view planning for underwater inspection," in IEEE International Conference on Robotics and Automation, 2012.

[18] K. M. Wurm, A. Hornung, M. Bennewitz, C. Stachniss, and W. Burgard, "OctoMap: A probabilistic, flexible, and compact 3D map representation for robotic systems," in ICRA Workshop on Best Practice in 3D Perception and Modeling for Mobile Manipulation, Anchorage, USA, May 2010.

[19] "Collider ROS package," http://ros.org/wiki/collider.

[20] R. Rusu, "3D is here: Point Cloud Library (PCL)," in IEEE International Conference on Robotics and Automation, Shanghai, China, 2011.

[21] J. Glover, D. Rus, and N. Roy, "Probabilistic models of object geometry for grasp planning," in Robotics: Science and Systems Conference, 2009.

[22] J. Amanatides and A. Woo, "A fast voxel traversal algorithm for ray tracing," in Eurographics, Amsterdam, The Netherlands, August 1987.

[23] K. Hsiao, S. Chitta, M. Ciocarlie, and E. Jones, "Contact-reactive grasping of objects with partial shape information," in IEEE International Conference on Robotics and Automation, 2010.

[24] L. Zhang, M. Ciocarlie, and K. Hsiao, "Evaluation with graspable feature matching," in RSS Workshop on Mobile Manipulation: Learning to Manipulate, Los Angeles, USA, 2011.