7

Click here to load reader

Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

Embed Size (px)

Citation preview

Page 1: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

Segmentation of Pen Strokes Using Pen Speed

Thomas F. StahovichMechanical Engineering Department

University of CaliforniaRiverside, California 92521

[email protected]

Abstract

We present a technique for segmenting pen strokes intolines and arcs. The technique uses pen speed infor-mation to help infer the segmentation intended by thedrawer. To begin, a set of candidate segment pointsis identified. This set includes speed minima below athreshold computed from the average pen speed. It alsoincludes curvature maxima at which the pen speed isagain below a threshold. The ink between each pair ofconsecutive segment points is then classified as eithera line or arc, depending on which fits best. Finally, afeedback process is employed, and segments are judi-ciously merged and split as necessary to improve thequality of the segmentation. Formal user studies wereconducted, and our system was observed to perform ac-curately, even for new users.

IntroductionDespite the power and sophistication of modern engineeringdesign tools, engineers often avoid using them until late inthe design process. Instead, it is common for engineers todo much of their early work on paper, using sketches exten-sively. After the major design issues have been resolved, thesketched designs are then recreated on the computer in orderto take advantage of the capabilities of design software. Theproblem here, we believe, is the cumbersomeness of the tra-ditional user interface. When designs are in flux, the incon-venience of such user interfaces places too much overheadon the creative process.

In our research, we are working to change this by creatinguser interfaces that allow users to operate software by meansof familiar sketching skills. The ultimate goal is to createsoftware that is as easy to use as paper and pencil, yet is aspowerful as traditional software. Rather than the user havingto learn how to use software, software should be able to read,understand, and use the kinds of sketches people ordinarilydraw. For example, an engineer should be able operate amechanical simulation tool by drawing the kinds of simplesketches that he or she would draw when solving problemsby hand.

In attempting to create software that embodies the easeand freedom of sketching, care must taken to avoid placingnew constraints on the drawing process. For example, some

Copyright c© 2004, UCR Smart Tools Lab. All rights reserved.

existing sketch-based systems require that each pen strokerepresent a single shape, such as a single line or arc seg-ment (Igarashi et al. 1997; Shpitalni & Lipson 1996). Othersystems allow pen strokes to have more complicated shapes,but, each stroke must constitute a single symbol or gesture(Rubine 1991; Landay & Myers 2001). While these kinds ofconstraints on drawing facilitate shape recognition, they canresult in a less than natural drawing environment.

The work presented here concerns the low level process-ing of pen strokes necessary to overcome some of thesekinds of constraints. In particular, we present an approachfor automatically segmenting pen strokes into the intendedgeometric primitives. Our approach enables one to draw ashape with as few or as many stokes as desired. For example,one can draw a triangle with one, two, or three pen strokes.Likewise, it enables one to include parts of different shapesor symbols in the same pen stroke.

The challenge in segmenting a pen stroke is decidingwhich bumps and bends are intended, and which are acci-dents. We have found it difficult to determine this by con-sidering shape alone. The size of the deviation from an idealline or arc is not a reliable indicator of what was intended:sometimes small deviations are intended, while other timeslarge ones are accidents. Our approach to segmentation re-lies on examining the motion of the pen tip as the pen strokesare created. We have observed that it is natural to slow thepen when making many kinds of intentional discontinuitiesin a shape. For example, even if not drawn with four preciselines, the intended corners of a square can be easily identi-fied as points at which the pen speed is a local minimum.

Our segmenter begins by examining a pen stroke to iden-tify the segment points, the points that divide the stroke intodifferent primitives. The initial set of candidate segmentpoints includes speed minima below a threshold, which iscomputed from the average pen speed. Points at which cur-vature is a maximum are also included, but only if there iscorroborating pen speed information. Next, the ink betweeneach pair of consecutive segment points is classified as ei-ther a line or an arc, depending upon which best fits theink. Finally, a feedback process is employed, and segmentsare judiciously merged and split as necessary to improve thequality of the segmentation.

Our segmenter serves as a foundation for our broader re-search efforts to build sketch understanding systems. For

Page 2: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

example, we have constructed parsers that examine a streamof segmented pen strokes and extract individual symbols(Gennari, Kara, & Stahovich 2004; Kara, Gennari, & Sta-hovich 2004). We have also constructed recognizers thatclassify symbols by examining their segmented pen strokes(Calhoun et al. 2002; Gennari, Kara, & Stahovich 2004).We have combined all of these tools and techniques to buildvarious sketch-based engineering applications, such as a cir-cuit analysis tool (Gennari, Kara, & Stahovich 2004) and atool for analyzing vibratory systems (Kara, Gennari, & Sta-hovich 2004).

The next section provides an overview of related work.This is followed by a detailed, technical description of ourapproach. A user study evaluating the performance of oursegmenter is then presented. Finally, proposed future workis discussed, and conclusions are presented.

BackgroundSegmentation of pen strokes is similar to corner detectionin digital curves. Corner detection algorithms typically lo-cate corners by searching for points of maximum curva-ture. To suppress noise and false corners, the data must besmoothed. The main difficulty is selecting a reliable amountof smoothing. Early approaches (e.g., (Teh & Chin 1989))relied on a single scale, which created difficulties for curvescontaining both large and small features. Later work has ad-dressed the problem of curves containing features at variousscales. For example, Rattarangsi and Chin (1992) developeda scale-space approach, in which curvature maxima that per-sist across multiple scales indicate corner points. Likewise,Lee et al. (1995) developed a multi-scale algorithm basedon the wavelet transform. Sezgin has applied a multi-scaleapproach to sketches. His work suggests that curvature dataalone is inadequate for segmenting hand-drawn pen strokes(see below).

Yu (2003) has developed a pen stroke segmentation ap-proach in which the curvature and tangent angle are itera-tively smoothed. The resulting segmentation is compared tothe original ink, and if the fit is not precise, the stroke is re-cursively subdivided until it is. In our experiments, we havefound that a precise fit to the raw ink is often not what thedrawer intended.

The earliest report of using pen speed for segmenting wehave been able to find is the work of Herot (1976). His sys-tem found corners by identifying points at which pen speedwas a minimum. The author reported that the system didnot work well for all users and he concluded that the pro-gram contained a “model of human sketching behavior thatfit some users more closely than others.”

Sezgin, Stahovich, and Davis (2001) presented a tech-nique that uses speed and curvature to segment hand-drawnpen strokes. Segment points are located at points of minimalspeed and maximal curvature. The technique is suitable forsegmenting pen strokes into line segments, but it cannot han-dle arcs. Also, this technique iteratively adds segment pointsuntil the error of fit between the line segments and raw inkis less than a threshold. Our approach is far less concernedwith the error of fit, as a tight fit to the ink is often not whatwas intended. As a variant of this technique, Sezgin (2001)

explored the use of multi-scale methods for selecting speedminima and curvature maxima. However, he found that un-less the pen strokes were exceptionally noisy, there was littlebenefit in doing so.

Agar and Novins (2003) have developed a segmenter forpolygons. If the mouse is stationary for more than 30ms, thelocation is taken to be a segment point. This is analogousto our pen speed approach, but it requires that the mouse bepaused at each corner. Additionally, the approach can handleonly line segments and not arcs.

Dudek and Tsotsos (1997) have developed a novel ap-proach that directly searches for segments rather than seg-ment (corner) points. Energy minimization is used to com-pute an approximation curve that best matches the inputcurve while at the same time attempting to maintain a de-sired curvature. If a low energy approximation cannot befound, the approximation curve is subdivided and the pro-cess is iterated. This process is repeated with different val-ues of the desired curvature, and can result in overlappingsegments with different curvature values. The approach maynot be suited to sketches, as most shape recognition tech-niques assume that segments do not overlap.

Segmenting TechniqueTo begin the segmentation process, an initial set of candidatesegment points is identified. This set includes speed minimabelow a threshold computed from the average pen speed.It also includes curvature maxima at which the pen speed isbelow a threshold. The ink between each pair of consecutivesegment points is then classified as either a line or an arc,depending on which fits best.

Although the initial segmentation is usually accurate, itcan often be improved through feedback. For example, iftwo adjacent segments form pieces of the same arc, it islikely that they were intended to be so. In this case, thetwo are merged into a single segment. Conversely, if a seg-ment is a particularly poor fit for the ink, this suggests thata segment point may have been missed. This often occurswhen there is a smooth change in the sign of curvature, forexample, when moving from one lobe of an “S” shape to theother as shown in Figure 4. This kind of transition can bemade without slowing the pen, and so cannot be detected asa speed minimum. Consequently, if a segment is a poor fitfor the ink, points at which the curvature changes sign areconsidered as possible additional segment points.

The sections that follow describe the various steps of thesegmentation process including: initial processing of theink, identification of segment points, fitting of segments, andmerging and splitting.

Initial Processing of the InkOur software is designed work with a digitizing tablet andstylus, or other similar device, that provides time-stampedcoordinates. For example, we have used Wacom Cintiq andIntuos2 tablets, and a Tablet PC. During the initial process-ing phase, we use the time-stamped coordinates to computepen speed and curvature. The first step is to construct thearc length coordinate of each point. Arc length is measured

Page 3: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

along the path of the pen stroke, and is computed in the ob-vious way by summing straight line distances:

di =i∑

j=1

∥∥∥�Pj − �Pj−1

∥∥∥ (1)

where �Pj is the coordinates of the j th data point. The firstdata point has index j = 0 and d0 = 0.

We then use a centered finite difference approach to com-pute pen speed:

si =di+1 − di−1

ti+1 − ti−1(2)

where ti is the time-stamp of the ith point. The speed at thefirst and last point of a pen stroke is taken to be equal to thatat the second and penultimate points, respectively. Often,there is noise in the pen speed signal. To correct this, weapply a simple smoothing filter: The speed at each point isaveraged with that of the two points on either side.

There are various ways of computing curvature. For ex-ample, one could use the standard analytic geometry tech-nique for computing the curvature of parametric curves(Mortenson 1985). Our approach, however, is to computecurvature as the derivative of the tangent angle, θ, with re-spect to arc length:

C =∂θ

∂s(3)

We use this approach for several reasons. First, our sys-tem already computes an accurate tangent, which is used forother purposes. Second, this method naturally smoothes thedata so that no additional smoothing is needed.

To construct the tangent at a given point, we first constructa least squares line fit to a window of data points centeredaround that point. Using a window of points has the effectof smoothing noise. The larger the window, the larger thesmoothing effect. We have found that a window of elevenpoints provides adequate smoothing without loss of essen-tial information about the shape. If the least squares line isan accurate fit for the window of points, it is used as an ap-proximation of the tangent. Otherwise, a least squares circlefit is constructed, and the tangent is taken from the circle.

We could compute the derivative of the tangent angleby means of numerical derivatives, but this would requiresmoothing. To avoid this, we again use a least squares linefit, this time applied to the graph of the tangent angle ver-sus arc length. The slope of the least squares line gives thecurvature in units of radians per pixel. Here again, whencomputing the least squares line, we use a window of elevenpoints.

We have found that our approach to calculating curva-ture works well in practice. In fact, this approach is similarin spirit to the way draftspersons used to compute graphi-cal derivatives in the era before computers. In some sense,we are smoothing the data the way a draftsperson would byeye. Although not reported here, we have shown that ourapproach is comparable to the traditional curvature calcula-tion based on the methods of analytic geometry, combined

(a) (b) (c)

Figure 1: A hand-drawn pivot symbol. (a) Raw ink. (b)Segmented ink. (c) Raw and segmented ink overlayed.

with Gaussian smoothing. Thus, if desired, one could di-rectly implement our segmentation approach using the moretraditional technique.

Candidate Segment Points

Once the initial processing of the ink is completed, the nextstep is to compute the set of initial candidate segment points.The first and last points of a pen stroke are always includedin this set. The remaining segment points are identified byexamining speed and curvature.

Our most reliable criterion for selecting segment points isbased on pen speed. Segment points typically occur at loca-tions where speed is a local minimum. Consider, for exam-ple, the sketch of a pivot in Figure 1(a). This sketch, whichwas drawn with a single pen stroke, was intended to be threelines and an arc (Figure 1(b)). Figure 2 shows the speedprofile for the pen stroke. The intended segment points cor-respond to local speed minima as indicated by circles. Thereare other speed minima that do not correspond to intendedsegment points, but these are distinguishable by their higherspeed.

Our approach, therefore, is to locate segment points atspeed minima that are slower than some threshold. We se-lect the threshold as a fraction of the average speed along thepen stroke. (The ordinate in Figure 2 directly corresponds topossible values of the threshold.) In practice, a threshold ofbetween 25% and 100% of the average speed works well. Alarger threshold will decrease the number of intended seg-ment points that are missed, while a smaller value will de-crease the number of unintended segment points that are se-lected.

Interestingly, we have found that our approach is not verysensitive to the particular value of the threshold used. Forexample, our user studies (below) show little variation inthe overall accuracy of the segmentation over the range inthreshold from 25% to 100%. The reasons for this are dis-cussed in “Discussion and Future Work.”

We typically, use a small threshold (25%) because verylow pen speed is a clear indication of an intended segmentpoint. If a speed minimum is above the threshold, the pointmay still be a segment point, but additional information isrequired to be certain. In this case, we examine the curva-ture of the ink. In Figure 2, for example, segment points (i)and (ii) are detected with a threshold of 25%. Segment point(iii) is above this threshold, but, as shown in Figure 3 thispoint corresponds to a maximum of curvature, which pro-vides additional evidence about the existence of a segmentpoint.

Page 4: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

0 200 400 600 800 1000

Arc Length, pixels

Sp

ee

d/A

ve

rag

eS

pe

ed

(i) (ii)

(iii)

Figure 2: Pen speed normalized by the average speed for thepivot in Figure 1. Intended segment points are indicated bycircles.

One approach to identifying segment points would be toidentify points that are both a minimum of speed and maxi-mum of curvature. In practice, we have found it adequate tosimply identify points that are a maximum of curvature andwhich have low speed. This avoids problems when speedminima and curvature maxima are nearby, but not preciselycoincident.

Based on empirical studies, we have identified a reliablecriterion based on both curvature and pen speed: If a point isan extremum of curvature, the absolute value of the magni-tude of the curvature exceeds 0.75 degree / pixel, and the penspeed is less than 80% of the average pen speed, the point isincluded in the initial set of candidate segment points. Thesecond requirement helps with nearly straight lines. Oftenthe sign of the curvature fluctuates for such lines, result-ing in multiple extrema. However, because the ink is nearlystraight, the magnitude of the curvature at the extrema isquite small. The thresholds use here work well for the hard-ware we use, and have proven to work well for a wide rangeof users, but will likely need tuning for other hardware.

The speed-based and curvature-based segment points arealways included in the initial set of segment points. Thereis a third class of segment points that are not considered ini-tially. These are the points at which the curvature changessign. We define three qualitative “signs” for curvature: +1if the magnitude is greater than 0.1 degree / pixel, -1 if themagnitude is less than -0.1 degree / pixel, and 0 otherwise.The thresholds were determined empirically to eliminate ir-relevant fluctuations in the curvature that occur for nearlystraight lines. Again, these values work well for our hard-ware, but will require tuning for other hardware.

A change in curvature sign is not a reliable indication ofan intended segment point, thus such points are consideredonly when the other segment points result in a poor fit forthe ink. For example, it is common for there to be a changein curvature sign on each side of a 90 degree corner. Clearly,such changes in curvature sign do not correspond to intendedsegment points. For this reason, segment points based oncurvature sign are not part of the initial set of candidate seg-ment points. Instead, they are considered only during thesplitting process described below.

It is possible for there be to be small clusters of closely lo-cated segment points. For example, there may be two speedminima separated by only a few data points. Consequently,

0

2

4

6

8

10

12

14

0 200 400 600 800 1000

Arc Length, pixels

Curv

atu

re,

degre

e/p

ixel

(i)

(ii)

(iii)

Figure 3: Ink curvature for the pivot in Figure 1.

once the speed and curvature segment points have been cal-culated, nearly coincident segment points are eliminated.

Fitting SegmentsOnce the initial set of candidate segment points have beenidentified, the next step is to fit primitives to the segments.Least squares line and circle fits are constructed for the seg-ment between each pair of consecutive segment points, andthe errors of fit are noted. We define error of fit as the av-erage distance from the least squares line or arc to the datapoints in question. The segment is typically classified bywhichever shape fits it with the smallest error of fit. How-ever, it is common for nearly straight lines to be accuratelyfit by an arc with a large radius. Thus, even if a segmentis best fit by an arc, it is classified as such only if it wouldrepresent at least one tenth of a circle (36o).

If a segment is classified as a line segment, its end pointsare determined by constructing perpendiculars from the firstand last data points to the least squares line. Similarly, forarcs, the end points are determined by a constructing radiallines through the first and last data points.

Merging and SplittingAfter the initial segments have been computed, a qualitycontrol process is begun. The segments are compared to theoriginal ink, and segments are merged, split, and deleted asnecessary. In this fashion, feedback is used to improve theaccuracy of the segmentation.

If there is a very short segment adjacent to a long one, wehave found that, frequently, the short one was unintended.Thus, if a segment is shorter than 20% of the length of an ad-jacent segment, the program attempts to merge them. (Thisconstant, as well as all of the others used for merging andsplitting, were obtained empirically.) The program com-putes a new segment containing the data points of the twooriginal segments. The type (line or arc) of this new segmentis forced to be the same as that of the longer of the originaltwo. If the error of fit of the new segment is no more than10% greater than the sum of the errors of fit of the originaltwo segments, they are discarded and replaced with the newone. Otherwise, the new segment is discarded.

We have found that at the start and end of a pen stroke, thestylus often leaves small, unintended bits of ink that formsharp discontinuities. Therefore, we eliminate segments atthe start or end of a pen stroke containing fewer than 15

Page 5: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

points. Similarly, if the first or last segment is shorter than10% of the average length of the three immediate neighbors,it is discarded.

If adjacent segments are of the same type, the programchecks to see if they might reasonably be interpreted as thesame segment. For example, if two arcs are adjacent, theprogram computes a new arc containing the data points fromthe two original arcs. If the error of fit is no more than 10%greater than the sum of the original errors of fit, the two arcsare replaced by the new one. Note that the program consid-ers merging two segments only if their drawing directionsare consistent.

If a particular least squares line or arc does not fit the inkwell, the program attempts to improve the fit by includinga segment point based on a change in the sign of the cur-vature. The program splits a segment in this fashion if theerror of fit is greater than seven pixels. In other words, ifon average, the data points are at least seven pixels from theleast squares line or arc, the program attempts to split thesegment. This value was determined empirically to workwith our hardware, but will likely require tuning for use withother hardware.

Typically there are only a few curvature-sign segmentpoints in any given segment. Consequently, it is feasibleto exhaustively consider each of them. The program con-siders splitting the segment with each of the curvature-signsegments points, one at time. The best choice is the one inwhich the sum of the errors of fit for the two new segmentsis minimum. If this minimum is less than 65% of the origi-nal error of fit, the new segmentation is retained, otherwise itis rejected. This threshold is designed to require significantimprovement in the fit before a new segment point is added.

Figure 4 shows an example of how curvature-sign pointsare used. In the initial segmentation, curvature sign pointsare excluded, and the stroke is incorrectly segmented intoa single arc segment. Because the fit is poor, the programtests both curvature-sign points in the middle of the curveand finds that an improved segmentation can be achieved.The result is shown in Figure 4b.

We have found it useful to apply our merging and split-ting routines repeatedly. The special merging routine thathandles noise at the start and end of each stroke is appliedfirst. Next, the general routines for merging segments are

(a) (b)

Figure 4: (a) Candidate segment points for an “s-curve.”Segmentation is a single arc if curvature sign points (shownas small triangles) are ignored. (b) Final segmentation whencurvature sign points are considered.

Figure 5: The ten shapes used in the user studies.

applied, followed by two applications of the splitting rou-tine. It is possible that splitting may produce segments thatshould be merged with their neighbors. Thus, the final stepconsists of an additional application of the general mergingroutines.

User StudiesTo test our segmenter, we conducted two user studies inwhich multiple subjects were asked to draw the set of shapesshown in Figure 5. The subjects were instructed to drawaccurately but naturally, and were informed that the exper-iment was intended to evaluate the accuracy of our seg-menter. We specifically selected subjects who had no pre-vious experience with our system, but who did have at leastsome experience using a PDA or digitizing tablet. Subjectswere given only a minute or two to become familiar withthe system before providing samples for the study. Thus,our results reveal how well our software performs for a newuser. We have found that after one has gained moderate ex-perience with our system, one is able to achieve even higheraccuracy than demonstrated in these studies.

For both user studies, we used an Intuos2 digitizing tabletwith an inking stylus. This stylus leaves physical ink on apiece of paper placed over the tablet. The computer displayshowed the raw ink rather than the segmented ink, as we didnot want the subject to alter his or her drawing based on theprogram’s performance. In fact, the subjects were given nofeedback at all about how well the program performed.

The first user study evaluated the suitability of our speedthreshold for the typical user. For this study, the digitizingtablet was set to a resolution of 1024x768 (low resolutionmode). Five subjects were asked to draw the ten symbols inFigure 5 four times each. All were asked to draw the shapesat a size of approximately 3cm, which is a comfortable sizewhen viewing the ink on the computer display. (The seconduser study, described below, explored accuracy as a functionof symbol size.)

Table 1 shows the results of the first study. The perfor-mance of the system was evaluated in terms the number ofmissing and extra segment points. Missing points can occurfor one of three reasons: (1) no candidate segment point wasfound, (2) a candidate was found but was later eliminatedby merging of the two adjacent segments, or (3) a candidatewas found but was later eliminated during the clean up ofthe start or end of the pen stroke. Extra points are those thatwere not intended as segment points, but were labeled assuch by the program.

When evaluating the accuracy of the computed segmen-tation, we accounted for variations in the way each subjectdrew the shapes. For example, the number of intended “wig-gles” in the spring-like symbol varied from one subject to

Page 6: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

(a) Threshold = 25% (b) Threshold = 100%User 1 2 3 4 5 Ave 1 2 3 4 5 AveNumber of Seg Points 230 229 244 229 233 233 230 229 244 229 233 233Missing Seg Points 11 5 17 0 2 7.0 0 0 0 0 0 0.0Mistakenly Merged 2 1 2 0 0 1.0 2 0 1 0 0 0.6Missing Start/End 0 1 3 0 0 0.8 0 1 4 0 0 1.0Extra Seg Points 0 3 0 0 2 1.0 12 13 3 1 5 6.8Correct Seg Points, % 94.3 95.6 91.0 100.0 98.3 95.8 93.9 93.9 96.7 99.6 97.9 96.4Correct Symbols, % 82.5 82.5 67.5 100.0 90.0 84.5 72.5 75.0 87.5 97.5 90.0 84.5

Table 1: User study with speed thresholds of (a) 25% and (b) 100%.

the next. Table 1 tabulates the number of intended segmentpoints for each subject, which was typically about 230. Thesegmentation error for each subject is defined as the sumof the missing and extra segment points divided by the to-tal number of intended segment points. The segmentationaccuracy is defined as one minus this value.

Table 1(a) shows the results obtained with a speed thresh-old of 25%. The average accuracy across all five subjectswas 95.8%. Additionally, on average, 84.5% of the symbolshad no segmentation errors of any kind. The symbols in thisstudy contained an average of 6 segment points, thus thereare multiple ways for there to be an error in a given symbol.This is why this second measure of accuracy is lower thanthe first.

Most of the segmentation errors occurred because no can-didate segment point was identified. On average, there were7 such errors for each set of 40 symbols. (Note, again, thateach set of 40 symbols contained about 230 segment points.)Significantly fewer points were missed because of segmentmerging or start/end cleaning – there was approximately oneof each of these errors for each set of 40 symbols. We didnotice, however, that some subjects drew the square root andsummation symbols with very small serifs, which were in-correctly eliminated as start/end noise. (Some subject drewlarge serifs, while other did not draw them at all.)

To evaluate how sensitive our approach is to the speedthreshold, we resegmented the ink using a larger thresholdof 100% of the average speed (Table 1(b)). With a thresholdof 25%, there was on average 8.8 missing segment pointsand 1 extra segment point for each set of 40 examples. Withthe higher threshold, there was on average 1.6 missing seg-ment points and 6.8 extra ones. As one would expect, as thethreshold increases, the number of missing points decreasesand the number of extras increases.

For four of the subjects, accuracy decreased only a littlewith the increased threshold. This suggests that the accuracyis not overly sensitive to the threshold. For the third subject,however, there was a significant increase in accuracy withthe larger threshold. Later discussions with that subject re-vealed that he was a trained calligrapher and was skilled atmaintaining a constant pen speed so as to avoid ink blotches.

The second user study (Table 2) was intended to evaluatethe accuracy of the system for various sizes of the ten shapesin Figure 5. This study also employed five participants, onlyone of which (subject 1) had participated in the first study.Each subject was asked to draw each of the ten symbols atsizes of 1cm, 2cm, and 4cm. Because small symbols wereused in this test, it was necessary to operate the tablet at a

Symbol Size 1cm 2cm 4cm 4cmTablet Resolution High High High LowNum Seg Pts, Ave 58.0 57.8 58.4 57.0Missing Seg Pts, Ave 2.0 0.2 0.0 0.4Mistaken Merge, Ave 0.2 0.2 0.4 0.0Missing Start/End, Ave 1.2 0.4 0.2 1.0Extra Seg Pts, Ave 1.0 2.0 2.0 1.0Correct Seg Pts, Ave, % 92.4 95.1 95.6 95.8Correct Sym’s, Ave, %. 68.0 76.0 80.0 80.0

Table 2: Size study. Speed threshold 85%.

higher resolution setting of 2048x1536. To obtain a basisfor comparison with the first user study, each subject wasalso asked to draw the symbols at 4cm using the low reso-lution setting (1024x768). Overall, we found that the therewas only a small decrease in accuracy for the smaller sizedshapes. Similarly, on average, the accuracy for large sym-bols with the high resolution mode was the same as with thelow resolution mode.

Discussion and Future WorkOur goal is not to match the ink precisely, it is to matchthe drawer’s intent. We believe that this necessitates the useof predefined parameter values, such as speed and curva-ture thresholds, in our program. These parameters are, inessence, a model of what a person would perceive as im-portant in a hand-drawn sketch. We do not believe that theintrinsic properties of a curve alone, are adequate to indi-cate the drawer’s intent. Empirically determined parametervalues are essential. We take an engineering perspective onthis issue: if our program can produce the outcome the userexpects, then we have achieved our goals.

The parameter values used in our program are matched tothe drawing hardware we use. They are designed to workwith drawing hardware that is about the size of a standardsheet of letter paper. Applying our techniques to a large de-vice, such as an electronic whiteboard, or a smaller device,such as a PDA, would require tuning of the parameters.

We have found that our system is not very sensitive tothe speed threshold used. This comes from several factors.When the threshold is too small, segment points are missed.However, such points are typically identified as curvature-based segment points. These points effectively have a higherspeed threshold, but use curvature information to ensure ac-curacy. Conversely, when the threshold is too large, the extrasegment points are eliminated by merging.

One purpose of our segmenter is to support symbol andshape recognizers. For this purpose, it is essential that weconsistently identify just the intended segment points. Our

Page 7: Segmentation of Pen Strokes Using Pen Speed · Segmentation of Pen Strokes Using Pen Speed ... and freedom of sketching, ... corroborating pen speed information. Next, the ink between

user studies have indicated that our program can achieve anaccuracy of around 96%. We have found that this level ofperformance is adequate for reliable recognition. Interest-ingly, recognition may be able to provide feedback to helpimprove segmentation. Once a symbol has been recognized,knowledge of its typical geometry and topology can be usedto correct segmentation errors.

Our user studies show that our program works well fortypical users, even those who have had little experience withour system. In future work, however, we would like to de-velop training techniques to customize our system for eachuser. One approach would be to have the user provide a setof standard training examples, and the program could opti-mize its parameters to maximize segmentation accuracy. Al-ternatively, the system could be adaptive and incrementallyimprove its performance as the user corrects segmentationerrors during ordinary use.

ConclusionThe challenge in segmenting a pen stroke is to identify thegeometric primitives intended by the drawer. Often, the in-tent is not a literal interpretation of the stroke. Consequently,a segmentation technique driven by the objective of match-ing the ink is likely to produce poor results. Rather, our ap-proach uses pen speed information to help infer intent. Wehave observed that it is common for the drawer to slow thepen tip at points of intended discontinuities in a pen stroke.

Based on this insight, we have developed a technique forsegmenting hand-drawn pen strokes into lines and arcs. Tobegin the segmentation process, an initial set of candidatesegment points is identified. This set includes speed minimabelow a threshold, where the threshold is computed fromthe average pen speed along the pen stroke. It also includescurvature maxima at which the pen speed is again below athreshold. Once the initial set of candidates has been gen-erated, the ink between each pair of consecutive segmentpoints is classified as either a line or an arc. A feedback pro-cess is then employed, and segments are judiciously mergedand split as necessary to improve the quality of the segmen-tation.

Our system does employ empirically determined con-stants. They are tuned to the particular hardware we use,and will likely need adjustment for optimal performance onother hardware. We have found that these constants are suit-able for most users who have tested our system.

User studies of our segmenter indicate that it performswell, even for the new user. In these studies, our segmenterhad an accuracy of between 92% and 96%, depending on thetask. One interesting observation from these studies is thatthe accuracy is surprisingly insensitive to the speed thresh-old used for identifying segment points.

In summary, this work has demonstrated the utility of penspeed for inferring the intended segmentation of pen strokes.It also demonstrates that feedback can be used to signifi-cantly improve the initial segmentation. There are clearlystill opportunities to improve the performance of our seg-mentation approach. However, our user studies suggest thatour approach is sufficiently reliably for implementing usablesystems.

ReferencesAgar, P., and Novins, K. 2003. Polygon recognitionin sketch-based interfaces with immediate and continuousfeedback. In Proc. 1st international conf. on comp. graph-ics and interactive techniques in Austalasia and South EastAsia, 147–150.Calhoun, C.; Stahovich, T. F.; Kurtoglu, T.; and Kara, L. B.2002. Recognizing multi-stroke symbols. In AAAI ’02Spring Symposium, Sketch Understanding.Dudek, G., and Tsotsos, J. 1997. Shape representation andrecognition from multiscale curvature. CVIU 68(2):170–189.Gennari, L. M.; Kara, L. B.; and Stahovich, T. F. 2004.Combining geometry and domain knowledge to interprethand-drawn diagrams. In AAAI ’04 Fall Symp., MakingPen-Based Interaction Intelligent & Natural.Herot, C. F. 1976. Graphical input through machinerecognition of sketches. In Proc. 3rd annual conf. oncomp. graphics and interactive techniques, 97–102.Igarashi, T.; Matsuoka, S.; Kawachiya, S.; and Tanaka, H.1997. Interactive beautification: A technique for rapid ge-ometric design. In UIST ’97, 105–114.Kara, L. B.; Gennari, L. M.; and Stahovich, T. F. 2004. Asketch-based interface for the design and analysis of sim-ple vibratory mechanical systems. In 2004 ASME DesignEngineering Technical Conf., DETC’04.Landay, J. A., and Myers, B. A. 2001. Sketching inter-faces: Toward more human interface design. IEEE Com-puter 34(3):56–64.Lee, J.-S.; Sun, Y.-N.; and Chen, C.-H. 1995. Multi-scale corner detection by using wavelet transform. IEEETrans. on Image Processing 4(1):100–104.Mortenson, M. E. 1985. Geometric modeling. John Wiley& Sons, Inc.Rattarangsi, A., and Chin, R. T. 1992. Scale-based detec-tion of corners of planar curves. IEEE PAMI 14(4):430–339.Rubine, D. 1991. Specifying gestures by example. Com-puter Graphics 25:329–337.Sezgin, T.; Stahovich, T.; and Davis, R. 2001. Sketch basedinterfaces: Early processing for sketch understanding. InPerceptive UI Workshop, PUI’01.Sezgin, T. M. 2001. Feature point detection and curveapproximation for early processing of free-hand sketches.Master’s thesis, MIT.Shpitalni, M., and Lipson, H. 1996. Classification ofsketch strokes and corner detection using conic sectionsand adaptive clustering. ASME J. of Mechanical Design119(2):131–135.Teh, C. H., and Chin, R. T. 1989. On the detection ofdominant points on digital curves. IEEE PAMI 11(8):859–872.Yu, B. 2003. Recognition of freehand sketches using meanshift. In Proc. International Conf. on Intelligent User In-terfaces, IUI’03.