Active Contour Based Foreground Object
Segmentation
A THESIS
submitted by
SURYA PRAKASH
for the award of the degree
of
MASTER OF SCIENCE
(by Research)
DEPARTMENT OF COMPUTER SCIENCE &
ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY MADRAS
June 2008
Dedicated to my
beloved parents
&
respected teachers
THESIS CERTIFICATE
This is to certify that the thesis titled “Active Contour Based Foreground
Object Segmentation”, submitted by Surya Prakash, to the Indian Institute
of Technology, Madras, for the award of the degree of Master of Science, is a
bona fide record of the research work done by him under my supervision. The
contents of this thesis, in full or in parts, have not been submitted to any other
Institute or University for the award of any degree or diploma.
Dr. Sukhendu Das
Research Guide
Associate Professor
Department of Computer Science and Engineering,
IIT Madras, CHENNAI-600036.
Date: June 17, 2008
ACKNOWLEDGEMENTS
First and foremost, I would like to express my deep and sincere gratitude to my
research supervisor Dr. Sukhendu Das whose patient guidance, constant encour-
agement and excellent advice throughout the course of my MS program has made
this thesis possible. His expertise in the area and enthusiasm in the research have
been of great value for me. I am also thankful to him for placing the laboratory
facilities at my disposal.
I also take this opportunity to express my sincere thanks to my General Test
Committee members Prof. P. Sreenivasa Kumar, Prof. Hema A. Murthy and Dr.
V. Srinivasa Chakravarthy for their interest, encouragement, valuable suggestions
and thoughtful reviews.
I am also grateful to Prof. S. Raman (former HOD) and Prof. Timothy A.
Gonsalves (current HOD) for providing the best possible facilities to carry out
the research work. I am also grateful to Dr. C. Chandrashekhar, Prof. Hema A.
Murthy and Prof. B. Yegnanarayana for their role in building up the foundation
in subjects of Pattern Recognition and Artificial Neural Networks, useful for my
research.
My sincere thanks to Computer Science office and laboratory staff Mr. Na-
trajan, Mrs. Sharada, Ms. Poongodi, Mrs. Prema, Mr. Balu (at the department
library) for their valuable cooperation and assistance.
My special thanks to my lab-mates Sunando, Abhilash, Sundar, Deepti, Lalit,
Sreyasee, Manisha, Mirnalinee, Dyana, Aakanksha, Manika, Shivani, Poongodi,
Arpita, Vinod, Naresh, Uttara, Gyathri, Vidya for being tolerant and cooperative.
My special thanks to Sunando, Vinod, Aakanksha, Mirnalinee and Lalit for having
had long hours of research discussions with me and their advice during thesis
writing.
My stay at IITM has been made a memorable one by my friends Apoorve,
Sunando, Lalit, Abhilash, Sundar, Sai, Rakesh, Saurabh, Srini, Aditya, Shravan,
Rohit, Raj, Harendra.
Last but not the least, I would like to thank my parents, brother and sister
for being a source of encouragement and strength all throughout.
ABSTRACT
Image segmentation is an important component in many image analysis and com-
puter vision tasks. Particularly, the problem of efficient interactive foreground ob-
ject segmentation in still images is of great practical importance in image editing
and has been a subject of research for a long time. Classical image segmentation
tools use either texture, color or edge (contrast) information for the purpose of
segmentation. Deformable models, Graph-cut, GrabCut etc. are some prominent
methods used for the segmentation of a foreground object. Object segmentation
methods have helped in many computer vision areas, such as scene representation
& interpretation, content based image retrieval, object tracking in videos, medical
applications etc.
Most object segmentation techniques in computer vision are based on the prin-
ciple of boundary detection. These segmentation techniques assume a significant
and constant gray level change between the object(s) of interest and the back-
ground. However, this is not true in the case of textured images. In textured
images, there exist many local edges of the texture micro units (texels), due to
the basic nature of a texture image. In textured images, the object boundary
is defined as the place where the texture property changes. So, to perform
correct segmentation of such images, there is a need to incorporate
textural information in the segmentation process.
The objective of this work is to develop efficient methods for foreground ob-
ject(s) segmentation in a given image. In the first part of the work, we develop
techniques for the segmentation of single or multiple object(s) from an image in
presence of foreground and background textures. We use active contour models
for the task of texture object segmentation by incorporating texture features. We
model texture characteristics of the image by building the scalogram obtained
using the discrete wavelet transform.
In the second part of our work, we develop a technique for efficient segmenta-
tion of an object in a color image. This technique deals with the complex problem
of segmentation of an object which contains holes in it, and in addition, the color
distribution of a part of the object is similar to that of the background. Color
object segmentation techniques available in the literature, such as GrabCut,
often require post-corrective editing by the user to perform correct segmentation. Our
proposed technique is semi-automatic and only requires the user to define a rect-
angle (or polygon) around the object to be segmented, and does not require post-
corrective editing. The proposed method is based on a probabilistic framework
to integrate the outputs of Snake and GrabCut. We have demonstrated the effi-
ciency and correctness of our proposed methods using a set of sufficiently difficult
simulated and real world images.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS i
ABSTRACT iii
LIST OF TABLES viii
LIST OF FIGURES xiii
LIST OF ALGORITHMS xiv
ABBREVIATIONS xv
1 Introduction 1
1.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Motivation and Scope . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Purpose of Using Active Contour Based Methods for Segmentation of Textured Object(s) . . . . . . . . . . . . . . . . 6
1.4 Brief Description of Work Done . . . . . . . . . . . . . . . . . . 7
1.4.1 Texture Object Segmentation using Parametric Active Contours . . . . . . . . . . . . . . . . 8
1.4.2 Segmentation of Multiple Textured Objects using Geodesic Active Contours . . . . . . . . . . . . . . . . 9
1.4.3 SnakeCut: An Automatic Technique for Segmentation of a Foreground Object with Holes . . . . . . . . . . . . . . . . 10
1.5 Organization of the Thesis . . . . . . . . . . . . . . . . . . . . . 11
2 Literature Review 14
2.1 Object Segmentation Methods . . . . . . . . . . . . . . . . . . . 14
2.1.1 Energy Based Object Segmentation Methods . . . . . . . 15
2.2 Active Contour Models . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Parametric Active Contours (Snakes) . . . . . . . . . . . 18
2.2.2 Geometric (geodesic) Active Contours . . . . . . . . . . 19
2.3 Active Contour based Textured Object Segmentation . . . . . . 22
2.4 Graph-cut Based Methods . . . . . . . . . . . . . . . . . . . . . 25
2.5 Texture Representation and Analysis . . . . . . . . . . . . . . . 27
2.5.1 Texture Feature Extraction using Multiresolution Methods 28
2.5.2 Texture Feature Extraction using Spatial/Spatial-frequency Techniques . . . . . . . . . . . . . . . . 34
3 Textured Object Segmentation using Parametric Active Contours 38
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.1 Parametric Active Contour (Snake) Model . . . . . . . . 40
3.1.2 Discrete Wavelet Transform and Scalogram . . . . . . . . 41
3.2 Texture Feature Extraction . . . . . . . . . . . . . . . . . . . . 42
3.2.1 Scalogram estimation . . . . . . . . . . . . . . . . . . . . 43
3.2.2 Texture feature estimation . . . . . . . . . . . . . . . . . 44
3.3 Modeling of Texture Force . . . . . . . . . . . . . . . . . . . . . 45
3.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . 50
4 Segmentation of Multiple Textured Objects using Geodesic Active Contours 53
4.1 Geodesic Active Contour . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Multi-dimensional Texture Feature Extraction . . . . . . . . . . 58
4.2.1 Multi-dimensional Texture Feature Estimation . . . . . . 58
4.3 Geodesic Active Contours in Texture Feature Space . . . . . . . 60
4.3.1 Segmentation of multiple textured objects . . . . . . . . 62
4.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 65
4.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . 73
5 SnakeCut: An Automatic Technique for Segmentation of a Foreground Object with Holes 76
5.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.1.1 Parametric Active Contour (Snake) Model for Color Images 77
5.1.2 GrabCut . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2 Comparison of Active Contour and GrabCut Methods . . . . . . 81
5.3 SnakeCut: Integration of Active Contour and GrabCut . . . . . 85
5.4 SnakeCut Segmentation Results . . . . . . . . . . . . . . . . . . 91
5.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . 101
6 Conclusion 103
6.1 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.2 Future Scope of Work . . . . . . . . . . . . . . . . . . . . . . . . 104
LIST OF TABLES
2.1 Energy based approaches for the object segmentation (taken from Boykov and Funka-Lea, 2006) . . . . . . . . . . . . . . . . 15
2.2 Properties of parametric and geometric (geodesic) active contours (taken from Delingette and Montagnat, 2001) . . . . . . . . . . . . . . . . 22
3.1 Computational time required for the segmentation of textured objects in images shown in Fig. 3.4. Images are numbered from top to bottom in Fig. 3.4. . . . . . . . . . . . . . . . . 49
4.1 Computational time required for the segmentation of textured objects in images shown in Fig. 4.7. Images are numbered from top to bottom in Fig. 4.7. . . . . . . . . . . . . . . . . 69
5.1 Computational times for foreground object segmentation, required by Snake, GrabCut and SnakeCut for various images. . . . . . . . . . . . . . . . . 101
LIST OF FIGURES
1.1 Segmentation of foreground object: (a) input image, (b) segmented foreground object. . . . . . . . . . . . . . . . 2
1.2 Problem of segmentation of multiple objects in presence of foreground and background textures: (a) input image, (b) desired segmentation result of foreground objects, where object boundaries are shown using black contours. . . . . . . . . . . . . . . . 3
1.3 Problem of segmenting an object with holes: (a) input image. User wants to segment the central elliptical object present in the scene, which contains a rectangular hole in it, (b) desired segmentation result. In the result, background is visible through the hole of the object. . . . . . . . . . . . . . . . 4
1.4 Active Contour operating at a certain region of interest (ROI): (a) image with initial contour surrounding the ROI, (b) image showing the deforming contour, and (c) image with the converged contour. . . . . . . . . . . . . . . . 7
2.1 Categorization of filtering techniques used for texture representation. . . . . . . . . . . . . . . . 28
2.2 Real part of a 2-D Gabor filter with different combinations of scale (σ), orientation (θ) and frequency (ω): (a) σ = 6, θ = 0, ω = 2; (b) σ = 6, θ = 45, ω = 2.8. . . . . . . . . . . . . . . . 36
3.1 (a) Synthetic texture image with initial contour, (b) result obtained using normal intensity based snake, and (c) desired segmentation result. . . . . . . . . . . . . . . . 39
3.2 (a) Synthetic texture image, (b) magnified view of the 21 × 21 window of the texture cropped at point P shown in (a). . . . . . . . . . . . . . . . 42
3.3 (a) 1-D texture profile of the texture window shown in Fig. 3.2(b), (b) scalogram of the signal shown in (a), (c) texture feature image for the image shown in Fig. 3.2(a). . . . . . . . . . . . . . . . 43
3.4 Results of our proposed technique: first five texture images are composed of two Brodatz textures (Brodatz, 1966) and the last image is a natural Zebra image, (a) input texture images, (b) corresponding texture feature images, (c) segmentation results. Contour shown in images depicts the estimated boundary of the foreground object with texture. . . . . . . . . . . . . . . . 48
3.5 Comparative results: (I): results on a synthetic texture image: (a) from Sagiv et al. (2006), (b) our result, (II): results on Zebra-1 image: (a) from Paragios and Deriche (2002b), (b) from Rousson et al. (2003), (c) our result, (III): results on Zebra-2 image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) our result, (IV): results on Cheetah image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) result of the proposed method. . . . . . . . . . . . . . . . 51
3.6 Comparative results on Zebra image: (a) reproduced from Sagiv et al. (2002), (b) reproduced from Rousson et al. (2003), (c) reproduced from Awate et al. (2006), (d) reproduced from Sagiv et al. (2006), (e) obtained using the method presented in Gupta and Das (2006), (f) result of the proposed method. . . . . . . . . . . . . . . . 52
4.1 Problem of segmentation of multiple objects using a parametric active contour. . . . . . . . . . . . . . . . 55
4.2 Geometric interpretation of the attraction force in 1-D: (a) the original edge signal I, (b) the smoothed version of I, and (c) the derived stopping function g. The evolving contour is attracted to the valley created by ∇g · ∇u. (figure taken from Caselles et al., 1997) . . . . . . . . . . . . . . . 57
4.3 Demonstration of segmentation of multiple objects using the texture feature based geodesic active contour method: (a) evolution of level set surface (LSS), (b) evolving zero level set contour. First row shows the initial LSS and zero level set contour. Last row shows the final state of LSS and zero level set contour after convergence. . . . . . . . . . . . . . . . 63
4.4 A close look at the process of level set surface (LSS) evolution and contour splitting, from a different 3-D look angle, for the segmentation of multiple textured objects shown in Fig. 4.3. From (a)-(h), figures show the evolving LSS with zero level set contour (shown in red color on the surface). Fig. (a) shows the initial LSS and Fig. (h) shows the final LSS after segmentation. . . . . . . . . . . . . . . . 64
4.5 Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by inverse edge detector (Eqn. 4.8) using the scalogram based texture features, (c) input image with initial contour marked around the objects in black color, (d)-(i) intermediate positions of the evolving contour during segmentation process, (k) segmentation result where segmented objects' boundaries are shown in black color. . . . . . . . . . . . . . . . 66
4.6 Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by inverse edge detector using the scalogram based texture features, (c) input image with initial contour marked around the objects in black color, (d)-(i) intermediate positions of the evolving contour during segmentation process, (k) segmentation result where segmented objects' boundaries are shown in black color. . . . . . . . . . . . . . . . 67
4.7 Segmentation results of natural images: (a) input images where initial contours are marked around the object(s) in black color, (b) texture edge maps produced by inverse edge detector (Eqn. 4.8) using the scalogram based texture features for the respective images, (c) segmentation results of the image shown in column 1, where boundaries of the segmented objects are shown in black color. . . . . . . . . . . . . . . . 70
4.8 Contour evolution in the segmentation of Zebra image: (a) input image with initial contour marked around the objects in black color, (b)-(i) intermediate positions of the evolving contour during the segmentation process, (j) segmentation result where boundaries of the segmented objects are shown in black color. . . . . . . . . . . . . . . . 71
4.9 Comparative results (I): (a) input image, (b) result reproduced from Kim et al. (2002), (c) our result, (II): (a) input image, (b) result reproduced from Paragios and Deriche (2002a), (c) our result. . . . . . . . . . . . . . . . 72
4.10 Comparative results: (I) result on Zebra image: (a) reproduced from Paragios and Deriche (2002b), (b) reproduced from Rousson et al. (2003), (c) produced by our proposed technique; (II) result on Cheetah image: (a) reproduced from Kim et al. (2002), (b) reproduced from Rousson et al. (2003), (c) result produced by our proposed technique. . . . . . . . . . . . . . . . 73
4.11 Comparison of single textured object segmentation results obtained using the parametric active contour based technique (presented in section 3.3) and the geodesic active contour based technique (presented in section 4.3): (a) results obtained using parametric active contour based technique, (b) results obtained using geodesic active contour based technique. . . . . . . . . . . . . . . . 74
5.1 Estimation of gradient in color and gray level images: (a) input image, (b) gradient image of (a) estimated using Eqn. 5.2, (c) gray scale image of (a); (d) gradient image of (c). . . . . . . . . . . . . . . . 78
5.2 (a) Input image, elliptical object present in the image contains a rectangular hole at the center, (b) foreground initialization by user, (c) active contour segmentation result, and (d) GrabCut segmentation result. . . . . . . . . . . . . . . . 81
5.3 (a) Teapot image; segmentation results of (b) active contour and (c) GrabCut. . . . . . . . . . . . . . . . 83
5.4 (a) Image containing wheel; segmentation results of (b) active contour and (c) GrabCut. . . . . . . . . . . . . . . . 83
5.5 (a) Soldier image, segmentation results of (b) active contour and (c) GrabCut. . . . . . . . . . . . . . . . 83
5.6 Flow chart of the proposed SnakeCut technique. . . . . . . . . . . . . . . . 88
5.7 Segmentation of image shown in Fig. 5.2(a) using SnakeCut: (a) object boundary produced by active contour, (b) distance transform for the boundary contour shown in (a); (c) SnakeCut segmentation result. . . . . . . . . . . . . . . . 88
5.8 Effect of interval [a, b] on the non-linearity of the fuzzy distribution function (Eqn. 5.10). When a < b, transition from 0 (at a) to 1 (at b) is smooth. When a ≥ b, we have a step function with the transition at (a + b)/2. . . . . . . . . . . . . . . . 89
5.9 Demonstration of the impact of Pc and Ps values on the decision making in Algorithm 2: (a) teapot image with a few points marked on it, (b) values of Pc, Ps, p, and the decision obtained using Algorithm 2, for the points labeled in (a). Values used for ρ and T are 0.5 and 0.6 respectively. . . . . . . . . . . . . . . . 92
5.10 Demonstration of a SnakeCut result on a synthetic image, where Snake fails and GrabCut works: (a) input image with foreground initialized by the user (object contains a rectangular hole at the center), (b) Snake segmentation result (incorrect, output contains the hole as a part of the object), (c) GrabCut segmentation result (correct, hole is removed), and (d) SnakeCut segmentation result (correct, hole is removed). . . . . . . . . . . . . . . . 92
5.11 Demonstration of a SnakeCut result on a synthetic image, where Snake works and GrabCut fails: (a) input image with foreground initialized by the user, (b) Snake segmentation result (correct), (c) GrabCut segmentation result (incorrect, upper green part of the object is removed), and (d) correct segmentation result produced by SnakeCut. . . . . . . . . . . . . . . . 92
5.12 Segmentation of real pot image: (a) input real image, (b) active contour segmentation result (incorrect), (c) GrabCut segmentation result (correct), and (d) SnakeCut segmentation result (correct, background pixels visible through the handles of the pot are removed). . . . . . . . . . . . . . . . 93
5.13 SnakeCut segmentation results of (a) teapot (for the image in Fig. 5.3(a)), (b) wheel (for the image in Fig. 5.4(a)) and (c) soldier (for the image in Fig. 5.5(a)). . . . . . . . . . . . . . . . 95
5.14 Segmentation of cup image: (a) input real image, (b) segmentation result produced by Snake (incorrect, as background pixels visible through the cup's handle are detected as a part of the object), (c) GrabCut segmentation result (incorrect, as the spots present on the cup's handle are removed), and (d) correct segmentation result produced by SnakeCut. . . . . . . . . . . . . . . . 95
5.15 Segmentation of webcam bracket image: (a) input real image where the objective is to segment the lower bracket present in the image, (b) Snake segmentation result (incorrect, as background pixels visible through the holes present in the object are detected as part of the foreground object), (c) GrabCut segmentation result (incorrect, as large portions of the bracket are removed in the result), and (d) correct segmentation result produced by SnakeCut. . . . . . . . . . . . . . . . 96
5.16 Comparison of the results: (a) SnakeCut result for teapot, (b) GrabCut result for teapot with user interaction, (c) SnakeCut result for wheel, (d) GrabCut result for wheel with user interaction, (e) SnakeCut result for soldier, (f) GrabCut result for soldier with user interaction (reproduced from (Rother et al., 2004)), (g) SnakeCut result for webcam bracket, (h) GrabCut result for webcam bracket with user interaction. . . . . . . . . . . . . . . . 98
5.17 Example where SnakeCut fails: (a) input image with foreground initialized by user, (b) active contour segmentation result (correct), (c) GrabCut segmentation result (incorrect), and (d) SnakeCut segmentation result (incorrect). . . . . . . . . . . . . . . 100
LIST OF ALGORITHMS
1 Identification of significant subbands . . . . . . . . . . . . . . . . 44
2 Steps of SnakeCut . . . . . . . . . . . . . . . . 90
ABBREVIATIONS
WT Wavelet Transform
DWT Discrete Wavelet Transform
DCT Discrete Cosine Transform
DFT Discrete Fourier Transform
ACM Active Contour Model
RGB Red Green Blue
PDE Partial Differential Equation
DP Dynamic Programming
MRF Markov Random Field
GVF Gradient Vector Flow
FCM Fuzzy C-Means
GMM Gaussian Mixture Model
DT Distance Transform
CHAPTER 1
Introduction
An important component in many image analysis and computer vision tasks is
image segmentation, a process in which an image is partitioned into meaningful
constituent parts. Interactive segmentation methods have become increasingly
popular as a way to alleviate the problems inherent in fully automatic segmentation,
which is rarely perfect in practice. In particular, the problem of efficient
interactive foreground object segmentation in still images is of great practical im-
portance in image editing and has been a subject of research for a long time. It
can be associated with the problem of boundary detection and integration, where
a boundary is roughly defined as a curve or surface separating “homogeneous”
regions. Fig. 1.1 shows an example of foreground object segmentation in an image.
Deformable models, Graph-cut, GrabCut etc. are some prominent methods
used for segmentation of foreground object, which use either color or edge (con-
trast) information for the segmentation purpose. In the recent past object seg-
mentation methods have helped in many computer vision areas, such as scene
representation & interpretation, content based image retrieval (CBIR), object
tracking in videos, medical applications etc. Common challenges in the object
segmentation process are effects of uneven sample illumination, shadowing, par-
tial occlusion, clutter, noise, subtle object to background differences and changes
etc. In fully automatic object segmentation methods, correct detection and
segmentation of a semantic object in an image is a challenging problem.

Fig. 1.1: Segmentation of a foreground object: (a) input image, (b) segmented
foreground object.

Interactive
(semi-automatic) techniques that take some input from the user naturally have
an advantage over automatic ones, as the input significantly improves object
extraction.
1.1 Problem Definition
The main goal of this thesis is to develop efficient methods for segmentation of
foreground object(s) in a given image. In the first part of the work, we develop
techniques for the segmentation of single or multiple object(s) from an image, in
the presence of foreground and background textures. In the second part of our
work, we develop a technique for efficient segmentation of an object which contains
holes in it. This problem is further complicated by the fact that color distribution
of some parts of the object may also be similar to a part of the background. Figs.
1.2 and 1.3 show the inputs and the desired outputs in these segmentation tasks.
The two problems addressed in this thesis are as follows:
• Segmentation of single or multiple textured objects, in the presence of a back-ground texture, using a representation based on the scalogram (Clerc and Mallat, 2002) of the DWT as a texture feature.

Fig. 1.2: Problem of segmentation of multiple objects in presence of foreground and
background textures: (a) input image, (b) desired segmentation result of foreground
objects, where object boundaries are shown using black contours.
• Integration of a Snake (parametric active contour) (Kass et al., 1988) and GrabCut (Rother et al., 2004) using a probabilistic framework for the segmentation of a foreground object with holes in color images. We call the integrated technique “SnakeCut”.
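To illustrate why a scalogram-style feature discriminates textures, the sketch below (our own minimal construction, not the thesis implementation; the function name and test signals are hypothetical) decomposes a 1-D texture profile with a Haar DWT and records the detail-coefficient energy at each scale. Textures of different coarseness concentrate their energy at different scales, which is what makes the feature useful for separating foreground from background texture.

```python
import numpy as np

def haar_dwt_energies(signal, levels=3):
    """Decompose a 1-D texture profile with a Haar DWT and return the
    energy of the detail coefficients at each scale -- a crude stand-in
    for one column of a scalogram."""
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:                       # pad odd-length signals
            approx = np.append(approx, approx[-1])
        pairs = approx.reshape(-1, 2)
        detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
        energies.append(float(np.sum(detail ** 2)))
    return energies

# A fine (period-2) texture concentrates its energy at the first scale,
# a coarse (period-8) texture at the third scale:
fine = np.tile([1.0, -1.0], 16)
coarse = np.tile(np.repeat([1.0, -1.0], 4), 4)
print(haar_dwt_energies(fine))     # energy only at scale 1
print(haar_dwt_energies(coarse))   # energy only at scale 3
```

Stacking such per-scale energies for a sliding window over the image gives a multi-dimensional texture descriptor in the spirit of the scalogram feature used in this thesis.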
1.1.1 Assumptions
We assume the following conditions for our proposed methods of foreground tex-
tured object segmentation:
1. Initialization provided by the user does not include any background object.
2. We assume that there are no occlusions in case of multiple objects.
3. In the SnakeCut segmentation technique, we assume that the initialization for the foreground includes only a single object.
4. No part of the foreground object has similarity with part of the background.
5. Input images are noise free.
Fig. 1.3: Problem of segmenting an object with holes: (a) input image. The user
wants to segment the central elliptical object present in the scene, which contains a
rectangular hole in it, (b) desired segmentation result. In the result, the background
is visible through the hole of the object.
1.2 Motivation and Scope
Most object segmentation techniques in computer vision are based on the prin-
ciple of boundary detection. These segmentation techniques assume a significant
gray level change between the object(s) of interest and the background. However,
this is not true in the case of textured images. In textured images, many edges
of texture micro units (texels) exist due to the nature of the texture. So the
object segmentation techniques relying on intensity gradient are likely to fail in
such situations. In case of textured images, the object boundary is defined as the
place where texture property changes. So to perform the correct segmentation in
case of textured images, there is a need to incorporate the textural information
in the segmentation process. In the literature, only a few contour-based techniques
perform object segmentation in the presence of texture (Paragios and
Deriche, 1999b,a; Sagiv et al., 2000, 2006). This motivates us to develop efficient
object segmentation techniques that can operate in the
presence of texture. The segmentation methods developed in this work are quite
robust and efficient, and are able to perform object segmentation task in the case
of both synthetic and natural texture images.
We also developed an object segmentation technique for non-textured color
objects with holes. The developed technique is also very useful in the case where
the color distribution of some parts of the foreground object is similar to the image
background and the object contains holes within it. Object segmentation techniques
(such as Rother et al. (2004)) available in the literature for colored images, often
require post-corrective editing by the user to achieve correct segmentation. Hence
there is a need for segmentation techniques that can perform
the task with minimal user interaction in such cases. Our proposed
technique is semi-automatic and only requires the user to define a rectangle or
polygon around the object to be segmented, and no post-corrective editing is re-
quired. We have demonstrated the efficiency and correctness of our method using
sufficiently difficult images.
1.2.1 Applications
In the recent past, object segmentation methods have helped in many computer
vision tasks. A few of them are listed below:
1. Object recognition,
2. Object segmentation and tracking in videos,
3. Object oriented video coding,
4. Surveillance and tracking,
5. Medical imaging applications, diagnosis and surgery,
6. Modeling focus of attention in visual perception,
7. Bin-picking problem in robotics,
8. Target detection and identification.
1.3 Purpose of Using Active Contour Based Meth-
ods for Segmentation of Textured Object(s)
The task of segmentation of textured object(s) is to divide an image into two
parts: foreground object(s) and background. There are many texture segmenta-
tion algorithms (based on active contour, clustering etc.) which can perform such
segmentation task. The following points favor the use of active contours for
segmentation of textured objects over the alternatives.
• An active contour has the ability to operate on a certain part of the image (Fig. 1.4) and does not need to consider the entire image, so the segmentation process is fast.

• An active contour solves the segmentation problem by considering an object boundary as a single, connected structure. It exploits a priori knowledge of object shape and inherent smoothness, usually formulated as internal deformation energies, to compensate for noise, gaps and other irregularities in the object boundaries. Its underlying geometric representation provides a compact analytical description of an object.

• In an object segmentation problem, the objective is to find the closed boundary of the object, using which the object can be cropped. If a clustering algorithm is used for object segmentation, it will suffer from the following problems:

– First, the boundary produced by the algorithm may not be a closed contour, so some algorithm has to be applied for edge linking.

– Second, boundaries detected by clustering algorithms are inaccurate most of the time, so linking of edges also does not help.

• Boundary information produced by an active contour can be used as a shape feature for the representation of the object in many applications, such as content based image retrieval.

• An active contour can interpret sparse, incomplete and redundant information and is selective with respect to false image features.
Fig. 1.4: Active Contour operating at a certain region of interest (ROI): (a) image
with initial contour surrounding the ROI, (b) image showing the deforming contour,
and (c) image with the converged contour.
Thus the purpose behind the use of active contour for textured object segmentation is very apparent. Other segmentation algorithms can also be used to produce the same results, but active contour supersedes them in terms of simplicity and ease of use, and in many cases provides an efficient result for object segmentation.
1.4 Brief Description of Work Done
The work presented in this thesis is explained below in three sections: (1) textured
object segmentation using parametric active contours, (2) multiple textured ob-
jects segmentation using geodesic active contour and (3) SnakeCut: an automatic
technique for segmentation of a foreground object with holes.
1.4.1 Textured Object Segmentation using Parametric Active Contours
In this part of the work, we focus on parametric active contours, which synthe-
size parametric curves within image domain and allow them to move towards
the desired image features under the influence of internal and external forces.
The internal force serves to impose a piecewise continuity and smoothness con-
straint whereas external force pushes the contour towards salient image features
like edges, lines and subjective contours. External force in the traditional active
contour is defined in terms of the image gradient. In the presence of such external
force, the snake is attracted towards large image gradients, i.e. towards the edges in the image. So if it is applied to textured images, it will often get stuck on local texel (micro-units or cells of a texture) edges and converge at a non-object boundary.
We extend the energy based parametric active contour model for the segmentation of textured objects. To overcome the above-mentioned effect, we use a new formulation of the external force in the active contour for textured images, which we name "texture force". Texture force does not use the image pixel intensity values
directly for modeling but incorporates texture information present in the image.
We use scalogram (Clerc and Mallat, 2002) obtained from discrete wavelet trans-
form (DWT) for the extraction of texture information present in the image. The
active contour in presence of texture force runs over the texture image surface and
detects the object boundary of a texture surface, on a background texture. The
following section briefly presents our proposed texture feature extraction method.
Texture Feature Extraction
Extraction of texture features at a point in an image involves two steps: scalogram
estimation and texture feature estimation. To obtain scalogram (Clerc and Mallat,
2002) at a particular point (pixel) in the image, an n × n window is considered
around the point of interest. Intensities of the pixels in this window are arranged
in the form of a vector of length n2 whose elements are taken column-wise (or
row-wise) from the n×n cropped intensity matrix. This intensity vector (signal),
which basically represents the textural pattern around the pixel, is subsequently
used in the estimation of scalogram. Once the scalogram of the texture profile at a
particular point is obtained, a post-processing operation is carried out to eliminate
the non-significant subbands of the scalogram, and only the significant subbands
are used for further processing. This is done since significant subbands contain
major texture features information. Removal of non-significant subbands helps in
removing the redundant information and makes the computation fast. For texture
feature estimation at a particular point, an energy measure is calculated by taking
the mean of the scalogram coefficients from the significant subbands. This energy
measure, computed for all pixels in an image, constitutes the "texture feature image",
which is further used in texture force modeling.
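The two steps above (scalogram estimation over an n × n window, then an energy measure over the significant subbands) can be sketched as follows. This is an illustrative sketch only: it uses a plain 1-D Haar decomposition in place of the scalogram, and the `keep_ratio` threshold for selecting significant subbands is a hypothetical choice, not the selection rule used in this work.

```python
def haar_dwt(signal):
    # Full 1-D Haar decomposition: one detail subband per level,
    # plus the final approximation (length must be a power of 2).
    subbands, approx = [], list(signal)
    while len(approx) > 1:
        avg = [(approx[k] + approx[k + 1]) / 2 for k in range(0, len(approx), 2)]
        det = [(approx[k] - approx[k + 1]) / 2 for k in range(0, len(approx), 2)]
        subbands.append(det)
        approx = avg
    subbands.append(approx)
    return subbands

def texture_feature(window_vector, keep_ratio=0.5):
    # Energy per subband; subbands whose energy falls below a fraction
    # of the maximum are treated as non-significant and discarded.
    subbands = haar_dwt(window_vector)
    energies = [sum(c * c for c in sb) / len(sb) for sb in subbands]
    threshold = keep_ratio * max(energies)
    significant = [sb for sb, e in zip(subbands, energies) if e >= threshold]
    # Feature = mean of the (absolute) coefficients from significant subbands.
    coeffs = [abs(c) for sb in significant for c in sb]
    return sum(coeffs) / len(coeffs)
```

Computing `texture_feature` on the n²-length intensity vector around every pixel yields the texture feature image used in texture force modeling.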
1.4.2 Segmentation of Multiple Textured Objects using
Geodesic Active Contours
Though parametric active contours are fast, efficient and easy to implement, they
are not designed to handle topological changes during the process of segmentation. Hence, they are not intrinsically suitable for segmenting multiple objects: special topology-handling procedures must be added to handle topological changes. Therefore, to segment multiple objects in the presence of foreground and background textures, we extend the concept of the geodesic active contour, which can handle topological changes naturally.
We have developed an efficient algorithm based on the geodesic active contour
for segmentation of multiple objects in the presence of foreground/background
textures. The proposed technique is based on the generalization of the geodesic active contour from a one-dimensional intensity based feature space to a multidimensional texture feature space. In our approach, the image is represented in an n-dimensional
texture feature space, which is derived from the image using DWT and scalogram
(Clerc and Mallat, 2002). We formulate an edge indication function, used in the
geodesic active contour, from the texture feature space of the image, by viewing
texture feature space as a Riemannian manifold (Sagiv et al., 2000, 2006).
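The role of the edge indication function can be illustrated with a small sketch. Treating the texture feature space as a map F : (x, y) → Rⁿ, the Beltrami-style induced metric is h = I + JᵀJ with J the Jacobian of F; det(h) grows across texture edges. The reciprocal 1/det(h) used below is one simple choice of stopping function for illustration, not necessarily the exact form derived in this work.

```python
def edge_indicator(features):
    # features[y][x] is an n-dimensional texture feature vector.
    # Induced metric h = I + J^T J (J: Jacobian of the feature map);
    # det(h) is large across texture edges, so 1/det(h) -> 0 there,
    # which stops the evolving geodesic active contour.
    H, W = len(features), len(features[0])
    g = [[1.0] * W for _ in range(H)]
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            h11 = h22 = 1.0
            h12 = 0.0
            for i in range(len(features[y][x])):
                fx = (features[y][x + 1][i] - features[y][x - 1][i]) / 2.0
                fy = (features[y + 1][x][i] - features[y - 1][x][i]) / 2.0
                h11 += fx * fx
                h22 += fy * fy
                h12 += fx * fy
            g[y][x] = 1.0 / (h11 * h22 - h12 * h12)
    return g
```

On a flat feature map the indicator is 1 everywhere; wherever any feature channel changes rapidly, det(h) exceeds 1 and the indicator drops below 1.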
1.4.3 SnakeCut: An Automatic Technique for Segmenta-
tion of a Foreground Object with Holes
We develop a technique for the segmentation of objects with holes in color images. We call our integrated technique "SnakeCut". This technique mainly deals with the segmentation of an object which contains holes, where the color distribution of some parts of the object is also similar to the background. The
proposed technique is based on Snake (parametric active contour) (Kass et al.,
1988) and GrabCut (Rother et al., 2004). Snake is a deformable contour which segments an object boundary using boundary discontinuities, by minimizing the
energy function associated with the contour. GrabCut is an interactive tool based
on iterative graph-cut for foreground object segmentation in still images. GrabCut
provides a convenient way to encode color features as segmentation cues to ob-
tain foreground segmentation from local pixel similarities using modified iterated
graph-cuts. Since active contour uses gradient information (boundary discontinu-
ities) present in the image to estimate the object boundary, it cannot penetrate
inside the object to detect a hole. It thus cannot remove any pixels inside the
object boundary which do not belong to the object. GrabCut, on the other hand,
works on the basis of the pixel color (intensity) distribution, and hence has the ability to remove interior pixels which are not part of the object. The major problem with GrabCut is as follows. If some parts of the foreground object have
color distribution similar to the image background, then those parts will also be
eliminated by GrabCut. In GrabCut algorithm (Rother et al., 2004), missing
foreground data is recovered by user interaction. In SnakeCut, we present a novel
formulation, based on a probabilistic framework, to integrate these two complementary techniques to obtain an automatic segmentation of a foreground object
with holes.
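The intuition behind the integration can be sketched with a toy decision rule. This is only an illustration of the idea, not the actual probabilistic formulation of SnakeCut: the band width `band` and the linear weighting are hypothetical choices.

```python
def snakecut_decision(inside_snake, grabcut_fg, dist_to_contour, band=5.0):
    # Near the snake boundary, trust the snake (boundary cue); deep inside
    # the contour, trust GrabCut (color cue), which can carve out holes.
    # dist_to_contour: distance of the pixel from the snake boundary.
    if not inside_snake:
        return False                                   # outside the snake: background
    w = min(max(dist_to_contour / band, 0.0), 1.0)     # 0 near boundary, 1 deep inside
    p_fg = (1.0 - w) * 1.0 + w * (1.0 if grabcut_fg else 0.0)
    return p_fg >= 0.5
```

Near the boundary, where GrabCut may wrongly drop object parts colored like the background, the snake decision wins; deep inside the contour, where the snake cannot reach, GrabCut removes hole pixels.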
1.5 Organization of the Thesis
In this thesis, the problem of textured object segmentation has been attempted
by using the Snake (parametric active contour) (Kass et al., 1988) and geodesic
active contours (Caselles et al., 1997). We have also addressed the problem of
segmentation of an object with holes in color images using the integration of
Snake and GrabCut (Rother et al., 2004). The rest of the thesis is organized as
follows:
Chapter 2: Literature Review - This chapter discusses the various techniques
present in the literature for the segmentation of textured and non-textured objects.
Chapter 3: Textured Object Segmentation using Parametric Active
Contours - This chapter presents a method for segmentation of a textured ob-
ject using the parametric active contour. It also discusses the texture feature
extraction technique based on scalogram obtained using DWT.
Chapter 4: Segmentation of Multiple Textured Objects using Geodesic
Active Contours - This chapter presents a technique for segmentation of mul-
tiple textured objects using the geodesic active contour. The presented approach can handle segmentation of multiple objects simultaneously in the presence of foreground
and background textures. This approach is based on the generalization of geodesic
active contour model from one dimensional intensity based feature space to mul-
tidimensional texture feature space.
Chapter 5: SnakeCut: An Automatic Technique for Segmentation of
a Foreground Object with Holes - This chapter discusses the limitations of
active contour (Snake) and GrabCut segmentation techniques, and proposes an
efficient, semi-interactive method based on the integration of these two popular
techniques for the segmentation of an object in color images. The proposed technique, termed "SnakeCut", is particularly useful for the segmentation of an object which contains holes within it, where the color distribution of some parts of the object is the same as the background.
Chapter 6: Conclusion - This chapter concludes the thesis with discussion on
contribution and future scope of work.
CHAPTER 2
Literature Review
In this chapter, a review of techniques proposed for the segmentation of textured
and non-textured objects is presented. It also presents a short summary on texture
feature extraction and representation techniques.
2.1 Object Segmentation Methods
In the last two decades, the computer vision community has produced a num-
ber of useful algorithms for localizing object boundaries in images. Parametric
active contours (snakes) (Kass et al., 1988; Cohen, 1991), geodesic active con-
tours (Caselles et al., 1997; Yezzi et al., 1997), graph-cuts (Freedman and Zhang,
2005), GrabCut (Rother et al., 2004), intelligent scissors (Mortensen and Barrett,
1998), “shortest path” techniques (Mortensen and Barrett, 1998; Falcao et al.,
1998) and many other methods exist for partitioning an image into two segments:
“foreground object” and “background”. These methods integrate model-specific
visual cues and contextual information in order to accurately describe a particular
object. Each method comes with its own set of features.
Table 2.1: Energy based approaches for object segmentation (taken from Boykov and Funka-Lea, 2006)

Explicit boundary representation:
– Variational methods (optimization in R∞): snakes and active contours (variational formulations) (e.g. Kass et al., 1988; Cohen, 1991)
– Combinatorial methods (optimization in Zn): dynamic programming and "path-based" graph methods (2-D only), intelligent scissors (e.g. Amini et al., 1990; Geiger et al., 1995; Mortensen and Barrett, 1998; Jermyn and Ishikawa, 1999)

Implicit boundary representation:
– Variational methods (optimization in R∞): level-sets, geodesic active contours (e.g. Caselles et al., 1997; Sethian, 1999; Sapiro, 2001; Osher and Paragios, 2003)
– Combinatorial methods (optimization in Zn): combinatorial graph-cuts (Boykov and Jolly, 2001)
2.1.1 Energy Based Object Segmentation Methods
Most of the object segmentation techniques available in the literature are based
on some kind of energy formulation. A brief overview of these segmentation
techniques is given in Table 2.1. Energy-based segmentation methods can be
distinguished by the type of energy function they use and by the optimization
technique for minimizing it. The majority of standard algorithms can be divided
into two large groups:
A. Energy functional defined on a continuous contour or surface
B. Energy functional or cost function defined on a discrete set of variables
The standard methods in group A formulate segmentation problem in the do-
main of continuous functions R∞. For optimization, most of them rely on a vari-
ational approach and gradient descent. Numerical techniques for such variational
approaches are based on finite differences or on finite elements. A few examples of the methods of group A include snakes (Kass et al., 1988; Cohen, 1991), region
competition (Zhu and Yuille, 1996), geodesic active contours (Caselles et al., 1997),
and other methods based on level-sets (Sethian, 1999; Osher and Fedkiw, 2002;
Sapiro, 2001; Osher and Paragios, 2003). Typically, continuous surface function-
als incorporate various “regional” and “boundary” properties of segments, some
of which can be geometrically motivated (Caselles et al., 1997; Vasilevskiy and
Siddiqi, 2002). In most cases, methods in group A use variational optimization
techniques that can guarantee finding only a local minimum of the corresponding energy functional.
The segmentation methods in group B either directly formulate the prob-
lem as a combinatorial optimization in finite dimensional space Zn or optimize
some discrete energy function whose minimum approximates the solution of some
continuous problem. Most of the discrete optimization methods for object seg-
mentation minimize an energy defined over a finite set of integer-valued variables.
Such variables are usually associated with graph nodes corresponding to image
pixels or control points. All previous combinatorial methods for object segmen-
tation use discrete variables whose values encode “direction” of a path along the
graph. Many path-based methods use Dynamic Programming (DP) to compute
the optimal paths. For example, intelligent scissors (Mortensen and Barrett, 1998)
and live-wire (Falcao et al., 1998) use Dijkstra's algorithm, while DP-snakes (Amini et al., 1990) use the Viterbi algorithm. Note that all path-based methods can naturally encode boundary-based segmentation cues, while the incorporation of region
properties in segments is less obvious (Jermyn and Ishikawa, 1999). In any case,
all path-based methods are limited to 2-D applications because object boundary
in 3-D volumes cannot be represented by a path.
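The path-based idea can be made concrete with a minimal Dijkstra sketch in the spirit of intelligent scissors / live-wire. The 4-connectivity and the simple per-pixel cost map are simplifying assumptions; actual implementations use richer local costs and 8-connectivity.

```python
import heapq

def livewire_path(cost, start, goal):
    # Dijkstra on the pixel grid: cost[i][j] is the local cost of entering
    # pixel (i, j), chosen to be low along strong object edges.  Returns the
    # minimum-cost 4-connected path from start to goal.
    H, W = len(cost), len(cost[0])
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue                    # stale queue entry
        i, j = u
        for v in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= v[0] < H and 0 <= v[1] < W:
                nd = d + cost[v[0]][v[1]]
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(pq, (nd, v))
    # Walk predecessors back from the goal to recover the boundary path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

The returned path hugs the low-cost corridor, which is exactly how a live-wire tool snaps the user's seed points onto the object boundary.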
Global vs. Local Optimization
In general, global solutions are attractive because of their potentially better sta-
bility. For example, imperfections in a globally optimal solution can be directly
related to the cost function rather than to the numerical problem during mini-
mization. Thus, global methods can be more reliable and robust. Some versions of
active contours (Cohen and Kimmel, 1997), shortest path algorithms (Mortensen
and Barrett, 1998; Falcao et al., 1998), ratio regions (Cox et al., 1996), and some
other segmentation methods (Jermyn and Ishikawa, 1999) compute a globally op-
timal solution (in 2-D applications), in the case when the segmentation boundary is
a 1-D curve.
Since our study is based on active contours and graph-cut based segmentation methods, we review the segmentation methods based on these two techniques in detail.
2.2 Active Contour Models
Active contours (Kass et al., 1988) are extensively used in computer vision and
image understanding applications, particularly to locate object boundaries. They
are energy minimizing deformable contours that converge at the boundary of an
object in an image. Deformation of the contour is caused by internal and external forces acting on it. The internal force is derived from the contour itself and the external force is derived from the image. The internal and external forces are defined so
that the snake will conform to object boundary or other desired features within
the image. Snakes are widely used in many applications such as segmentation
(Leymarie and Levine, 1993), shape modeling (Terzopoulos and Fleischer, 1988),
edge detection (Kass et al., 1988), motion tracking (Terzopoulos and Szeliski,
1992) etc. Active contours can be classified as either parametric active contours
(snake) (Kass et al., 1988; Cohen, 1991) or geodesic (geometric) active contours
(Caselles et al., 1993, 1997) according to their representation and implementa-
tion. In particular, parametric active contours explicitly represent curves in their
parametric form in a Lagrangian framework whereas the geodesic active contours
implicitly represent model shape as the zero level set of a two-dimensional scalar
function. Geodesic active contours evolve in an Eulerian framework based on front propagation (Malladi et al., 1995), using the theory of curve evolution.
2.2.1 Parametric Active Contours (Snakes)
Parametric active contours synthesize parametric curves within image domain
and allow them to move towards the desired image features under the influence
of internal and external forces. The internal force serves to impose a piecewise
continuity and smoothness constraint whereas external force pushes the snake
towards salient image features like edges, lines and subjective contours. External
force in the traditional snake is defined as the negative of the image gradient.
The traditional active contour model (Kass et al., 1988) suffers from the problems of contour initialization and poor convergence to cavities. The initial contour is required to be placed near the object boundary by the user to achieve correct convergence.
Many attempts have been made to improve this model. Cohen (Cohen, 1991)
proposes a new external force, the balloon (inflation) force, to solve the contour
initialization problem. In the presence of this pressure force, the snake behaves like a balloon and gets inflated; the initial curve no longer needs to be close to the object boundary. Xu et al. (Xu and Prince, 1997, 1998a,b) introduce another new external force, the gradient vector flow (GVF) field, to allow flexible initialization of the snake. GVF also encourages snake convergence to boundary cavities. Paragios et al. (Paragios et al., 2004) present an improved version of the GVF snake which gives better convergence when vector flows are tangent to the contour or diverge within a neighborhood. Many researchers have suggested neural network based minimization of the active contour energy. Tsai et al. (Tsai et al., 1993) use a Hopfield network to minimize the snake energy. Venkatesh and Rishikesh (Venkatesh and Rishikesh, 1997, 2000) present a new implementation of the snake (Kass et al., 1988) using self-organizing neural networks. Chiou et al. (Chiou and Hwang, 1995) present a stochastic active contour based on neural
networks. Many techniques have been developed in recent years using multiple
snakes for object segmentation (Srinark and Kambhamettu, 2001, 2006).
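A single gradient-descent iteration of the snake energy illustrates how the internal and external forces interact. This is a bare explicit-Euler sketch; the original formulation uses a semi-implicit pentadiagonal solve, and the coefficients here are illustrative.

```python
def snake_step(pts, force, alpha=0.1, beta=0.01, gamma=1.0):
    # pts: list of (x, y) points on a closed contour; force[i]: external
    # force (fx, fy) sampled at point i.  The internal force is the descent
    # direction of the membrane/thin-plate energy: alpha*v'' - beta*v''''.
    n = len(pts)
    out = []
    for i in range(n):
        p = []
        for c in (0, 1):  # x, then y
            d2 = pts[(i - 1) % n][c] - 2 * pts[i][c] + pts[(i + 1) % n][c]
            d4 = (pts[(i - 2) % n][c] - 4 * pts[(i - 1) % n][c] + 6 * pts[i][c]
                  - 4 * pts[(i + 1) % n][c] + pts[(i + 2) % n][c])
            p.append(pts[i][c] + gamma * (alpha * d2 - beta * d4 + force[i][c]))
        out.append(tuple(p))
    return out
```

With zero external force the internal terms dominate, so a contour shrinks and smooths; in practice the external (image) force halts this shrinkage at object edges.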
2.2.2 Geometric (geodesic) Active Contours
Geometric active contours were independently introduced by Caselles et al. (Caselles
et al., 1993) and Malladi et al. (Malladi et al., 1995). These models were based
on curve evolution theory (Kimia et al., 1995) and level set method (Osher and
Fedkiw, 2002). The basic idea is to represent contours as the zero level set of
an "implicit function" defined in a higher dimensional space, usually referred to as the level set function, and then evolve the level set function according to a partial
differential equation (PDE). This approach presents the following advantages over the
traditional parametric active contours:
1. The contours represented by the level set function may break or merge naturally during the evolution, and topological changes are thus automatically handled. For parametric contours, it is in general not possible to achieve any automatic topological changes. However, several algorithms have been proposed to overcome this limitation (Leitner and Cinquin, 1991; McInerney and Terzopoulos, 1996; Lachaud and Montanvert, 1999).

2. The level set function always operates on a fixed grid, which allows efficient numerical schemes.
Early geometric (geodesic) active contour models (Caselles et al., 1993; Malladi
et al., 1995; Caselles et al., 1997) are typically derived using a Lagrangian formu-
lation that yields a certain evolution PDE of a parameterized curve. This param-
eterized PDE is then converted to an evolution PDE for a level set function, using
the related Eulerian formulation from level set methods. As an alternative, the
evolution PDE of the level set function can be directly derived from the problem
of minimizing a certain energy functional defined on the level set function. Such variational methods are known as variational level set methods (Chan and
Vese, 2001; Vemuri and Chen, 2003; Zhao et al., 1996). Compared with pure PDE
driven level set methods, the variational level set methods are more convenient
and natural for incorporating additional information, such as region-based infor-
mation (Chan and Vese, 2001) and shape-prior information (Vemuri and Chen,
2003), into energy functionals that are directly formulated in the level set do-
main, which produces more robust results. For example, Chan and Vese (Chan
and Vese, 2001) proposed an active contour model using a variational level set
formulation. By incorporating region-based information into energy functional as
20
an additional constraint, their model has a much larger convergence range and flex-
ible initialization. Vemuri and Chen (Vemuri and Chen, 2003) proposed another
variational level set formulation. By incorporating shape-prior information, their
model is able to perform joint image registration and segmentation.
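The region-based information referred to above can be illustrated by the data-fitting term of the Chan-Vese functional: the contour is driven to a partition in which each side is well approximated by its mean intensity. Below is a 1-D toy version over a flattened pixel list with a binary inside/outside mask; the λ weights are the usual user-chosen constants.

```python
def chan_vese_fit(img, inside_mask, lam1=1.0, lam2=1.0):
    # Sum of squared deviations from the mean intensity inside and outside
    # the contour; minimal when the mask separates two homogeneous regions.
    inside = [v for v, m in zip(img, inside_mask) if m]
    outside = [v for v, m in zip(img, inside_mask) if not m]
    c1 = sum(inside) / len(inside)    # mean intensity inside the contour
    c2 = sum(outside) / len(outside)  # mean intensity outside the contour
    return (lam1 * sum((v - c1) ** 2 for v in inside)
            + lam2 * sum((v - c2) ** 2 for v in outside))
```

Because the term depends on region statistics rather than gradients, it remains informative even without sharp edges, which is what enlarges the convergence range.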
Implementation of geodesic active contours is easier compared to parametric
active contours but they suffer from speed limitations. This is because the up-
date of an implicit contour requires the update of at least a narrow band around
each contour. Furthermore, on parametric contours, vertex sampling may not be
constant or uniform, whereas on implicit contours the resolution is constrained by
the resolution of the regular grid. In the recent past, many attempts have been made to improve the speed of geodesic active contours. Speed-up algorithms for
the level set methods were proposed either by constraining the contour evolution
through the Fast-Marching method (Sethian, 1999) or by the asynchronous up-
date of the narrow-band (Paragios and Deriche, 1998). In (Paragios and Deriche,
1998), Paragios et al. apply some heuristic steps to perform asynchronous update
of the narrow-band. Weickert (Weickert, 1998) uses a multiresolution approach
to overcome the problem of speed in level set methods. Goldenberg et al. in
(Goldenberg et al., 2001) introduce a new method to maintain the numerical con-
sistency and make the geodesic active contour model computationally efficient.
They achieve this by: (1) canceling the limitation on the time step in the nu-
merical scheme, (2) limiting the computation to a narrow band around the active
contour, and (3) applying an efficient re-initialization technique. Their method
combines the narrow band level set method, with adaptive operator splitting and
the fast marching. Li et al. (Li et al., 2005a) present a variational formulation
Table 2.2: Properties of parametric and geometric (geodesic) active contours (taken from Delingette and Montagnat, 2001)

                         Parametric active contours   Geodesic active contours
Efficiency               Good                         Poor
Ease of implementation   Easy                         Moderately difficult
Topology change          No                           Yes
Open contours            Yes                          No
Interactivity            Good                         Poor
for geodesic active contours without re-initialization. Their approach can be eas-
ily implemented using a simple finite difference scheme and is computationally
very efficient compared to traditional level set methods. Table 2.2 summarizes the
properties of the parametric and geometric (geodesic) active contours.
2.3 Active Contour based Textured Object Seg-
mentation
The problem of textured object(s) segmentation (sometimes referred to simply as texture segmentation in this section) has been studied by many researchers (Paragios and Deriche, 1999b,a; Sagiv et al., 2000; Paragios and Deriche, 2002b; Sagiv et al., 2006; Lu and Pavlidis, 2007). In general, texture segmentation algorithms combine the following four major components:
1. First, a texture representation space is selected. Common choices are windowed Fourier transforms, the Gabor representation (Jain and Farrokhnia, 1991), discrete wavelet transforms (Chang and Kuo, 1993; Laine and Fan, 1993; Arivazhagan and Ganesan, 2003; Bashar et al., 2003), local histograms (Hofmann et al., 1998), the local structure tensor (Rousson et al., 2003), etc.

2. In the second step, texture features are extracted, e.g. the magnitude of the response of the Gabor filters, wavelet coefficients, or particular moments which are calculated from local histograms (Jain and Farrokhnia, 1991; Lu et al., 1997; Sagiv et al., 2001).

3. In the third step, a measure of the characteristic texture features is defined. The measure indicates how much variability is characteristic of the texture. Kullback-Leibler divergence, mutual information, gradients and other distance measures are typical examples for this stage.

4. Finally, some objective function is defined using the texture features, and the segmentation is formulated as an optimization, minimization or clustering problem. In the case of region-based algorithms, the third and fourth stages become inseparable.
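As an example of the third component, a discrete Kullback-Leibler divergence between local texture histograms can serve as the texture-change measure; the smoothing constant `eps` is an ad hoc guard against empty histogram bins.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    # D(p || q) between two normalized local histograms: near zero when the
    # two windows contain the same texture, large across a texture edge.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```

Evaluating this between the histograms of two windows on either side of a candidate boundary gives a per-pixel texture-discontinuity score that the fourth component can then optimize over.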
In this section, we review some of the textured object segmentation methods
proposed in the literature. We focus on those methods which are based on active
contours. In section 2.5, we present a short summary on the texture feature
extraction and representation methods.
As we have seen in section 2.2, in the formulations for both types of active
contour (parametric and geometric), the external image force (traditionally) is
derived from edge or image gradient information. In the presence of such an
external force, the snake is attracted towards large image gradients, i.e. towards the edges in the image. However, this makes the models unfit for finding boundaries of objects with complex large-scale texture patterns, due to the presence of large variations in image gradient and many local edges inside the object. So if it is applied to textured images, it will often get stuck on local texel (micro-units or cells of a texture) edges and converge at a non-object boundary. In the case of
textured object segmentation, the snake should be able to find the points where the texture characteristics change. There have been efforts to address this problem
and many active contour based methods have been suggested in the literature in
the recent past to solve this problem (Zhu and Yuille, 1996; Paragios and Deriche,
1999b,a; Sagiv et al., 2000; Paragios and Deriche, 2002b; Pujol and Radeva, 2004;
Sagiv et al., 2006). These methods incorporate the texture features of the image
to determine the texture boundary.
Zhu et al. (Zhu and Yuille, 1996) proposed an approach called region com-
petition which performs texture segmentation by combining region growing and
active contours using multi-band input after applying a set of texture filters. The
method assumes multivariate Gaussian distributions on the filter response vector
inputs.
Geodesic Active Regions (Paragios and Deriche, 1999b) deal with supervised
texture segmentation in a frame partition framework using a level-set implementation of a deformable model. There are, however, several assumptions in this supervised method: the number of regions in an image must be known beforehand, and the statistics for each region are learned off-line using a mixture of Gaussian approximations. These assumptions limit the applicability of the
method to a large variety of natural images.
Paragios and Deriche in (Paragios and Deriche, 1999a) proposed an approach
based on geodesic active contour using Gabor filters, analyzing their responses as
multi-component conditional probability density functions. The texture segmen-
tation is obtained here by minimizing a geodesic active contour model objective
function, where boundary based information is expressed via discontinuities on
the statistical space associated with the multi-modal texture feature space.
Another texture segmentation approach in a deformable model framework (Pu-
jol and Radeva, 2004) is based on deforming an improved active contour model (Xu
and Prince, 1998a,b) on a likelihood map instead of heuristically constructed edge
map. However, because of the artificial neighborhood operations, the results from
this approach suffer from blurring of the likelihood map, which causes the detected object boundary to be a "dilated" version of the true boundary. The dilated zone could be small or large depending on the neighborhood size parameter.
Lorigo et al. (Lorigo et al., 1998) extended vector-valued geodesic active con-
tour (Sapiro, 1996, 1997) for texture segmentation. Texture information in their
approach is incorporated into active contour framework through the use of vector-
valued geodesic active contour with local variance as a second value at each pixel,
in addition to intensity.
Sandberg et al. (Sandberg et al., 2002) applied a “vector-valued active contour
without edges” (Chan et al., 2000) mechanism to the Gabor filtered images. Sagiv
et al. in (Sagiv et al., 2000, 2006) use Beltrami framework (Sochen et al., 1998)
based multi-valued geodesic active contour algorithm in the Gabor feature space.
The presence of a texture edge in their approach is estimated by viewing the Gabor feature space as a manifold. The determinant of this manifold's metric is interpreted as a measure of the presence of a gradient on the manifold.
2.4 Graph-cut Based Methods
Boykov et al. (Boykov and Jolly, 2001) proposed a new technique for general pur-
pose interactive segmentation of N-dimensional images. In their technique, the user marks certain pixels as "object" or "background" to provide hard constraints for
segmentation. Additional soft constraints incorporate both boundary and region
information. The technique then uses graph-cuts to find the globally optimal segmen-
tation of an N-dimensional image. The obtained solution gives the best balance of
boundary and region properties among all possible segmentations satisfying the
constraints. The topology in this segmentation method is unrestricted and both
“object” and “background” segments may consist of several isolated parts. The
effectiveness of formulating the object segmentation problem via binary graph-
cuts is demonstrated by a large number of recent publications in computer vision
and graphics that directly build upon the basic concept outlined by Boykov et
al. (Boykov and Jolly, 2001). Researchers have extended the graph-cut based object segmentation technique (Boykov and Jolly, 2001) in a number of interesting directions. Here we briefly review a few of them.
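The energy these methods minimize has a common two-term form: regional (unary) costs derived from the user's hard/soft constraints, plus boundary penalties between neighboring pixels with different labels. A sketch of evaluating such an energy for a candidate labeling follows; the actual methods find the minimizing labeling via max-flow/min-cut rather than by enumeration.

```python
def segmentation_energy(labels, unary, pairwise):
    # labels[p] in {0, 1} (background / object); unary[p][l] is the regional
    # cost of giving pixel p label l; pairwise maps neighbor pairs (p, q)
    # to the penalty paid when labels[p] != labels[q].
    regional = sum(unary[p][l] for p, l in enumerate(labels))
    boundary = sum(w for (p, q), w in pairwise.items() if labels[p] != labels[q])
    return regional + boundary
```

The balance between the two sums is exactly the "best balance of boundary and region properties" that the globally optimal cut achieves.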
Geo-cuts (Boykov and Kolmogorov, 2003; Kolmogorov and Boykov, 2005)
incorporate geometric cues for the segmentation. It combines geodesic active
contours and graph-cuts and produces segmentation by finding global minimum
geodesic active contours.
GrabCut (Rother et al., 2004) uses regional cues based on Gaussian mixture
models (Blake et al., 2004) for improved interactivity. GrabCut provides a con-
venient way to encode color features as segmentation cues to obtain foreground
segmentation from local pixel similarities using modified iterated graph-cuts.
Lazy snapping (Li et al., 2004) separates coarse and fine scale processing,
making object specification and detailed adjustment easy. It provides instant
visual feedback, snapping the cutout contour to the true object boundary efficiently despite the presence of ambiguous or low contrast edges. Instant feedback
is made possible by a novel image segmentation algorithm which combines graph-
cut (Freedman and Zhang, 2005) with pre-computed over-segmentation.
Obj-cut (Kumar et al., 2005) integrates high-level contextual information. It
presents a principled Bayesian method for detecting and segmenting instances of
a particular object category within an image, providing a coherent methodology
for combining top down and bottom up cues.
Other important graph-cut based methods include multi-level and banded
methods (Lombaert et al., 2005; Xu et al., 2003; Juan and Boykov, 2006), bi-
nary segmentation using stereo cues (Kolmogorov et al., 2005), efficient algorithms
for dynamic applications (Kohli and Torr, 2005; Juan and Boykov, 2006) (flow-
and cut-recycling), extraction of moving or foreground objects from video (Li
et al., 2005b; Wang et al., 2005), simultaneous segmentation of multiple objects
(Li et al., 2006), efficient N-D image segmentation (Boykov and Funka-Lea, 2006)
and methods for solving surface evolution PDEs (Boykov et al., 2006).
2.5 Texture Representation and Analysis
In this section, the various feature extraction methods (as depicted in Fig. 2.1) used in the literature for texture representation are reviewed.
Fig. 2.1: Categorization of filtering techniques used for texture representation: multiresolution techniques, spatial/spatial-frequency techniques, and Markov random fields.
2.5.1 Texture Feature Extraction using Multiresolution Methods
Due to variability in the size of texels in a texture image, it is often difficult to define an optimal resolution for feature extraction a priori. One efficient way to represent different image details is to re-organize the image into a number of subsampled approximations at different resolutions (P. P. Raghu, 1995). This is called a multiresolution representation. This scheme analyses the coarse image details first and gradually increases the resolution to analyze the finer details. In analyzing natural textures, feature extraction using a multiresolution scheme is appropriate in order to capture features of variable sizes. In this section, we review a few multiresolution methods which have been used for texture feature extraction.
Discrete Wavelet Transform (DWT)
The discrete wavelet transform analyses a signal based on its content in different frequency ranges at varying scales. Therefore it is very useful in analyzing repetitive patterns such as texture. The wavelet transform is expressed as a decomposition of a signal f(x) ∈ L²(R) into a family of functions which are translations and dilations of a mother wavelet function Ψ(x). Employing the definition

    Ψ_s(x) = √s Ψ(sx − a),        (2.1)

the wavelet transform of f(x) is defined by

    Wf(s, a) = ∫_{−∞}^{+∞} f(x) √s Ψ(sx − a) dx        (2.2)
where s, a ∈ R indicate the scale and translation parameters respectively. Since the continuous wavelet transform is redundant, it is discretized by sampling the parameters s and a. The most common choice is s = 2^i and a = n/2^i; i, n ∈ Z. Inserting these values in equation 2.2 yields the DWT of the signal f(x), as:

    W_d f(i, n) = ⟨f(x), ψ_{2^i}(x − n2^{−i})⟩        (2.3)

where ⟨·, ·⟩ denotes the inner product. Since some existing wavelets ψ(x) ∈ L²(R) constitute an orthonormal basis {ψ_{2^i}(x − n2^{−i}); i, n ∈ Z}, this transform is called an orthogonal wavelet transform.
Introducing the so-called scaling function φ(x), the interscale coefficients g(n) with high-pass (HP) characteristics and h(n) with low-pass (LP) characteristics, it is possible to decompose a signal f(x) using the following L-level decomposition scheme, as

    f(x) = Σ_n c_{0,n} φ(x − n)
         = Σ_{n=−∞}^{+∞} c_{L,n} φ_{2^{−L}}(x − n2^L) + Σ_{i=1}^{L} Σ_{n=−∞}^{+∞} d_{i,n} ψ_{2^{−i}}(x − n2^i)        (2.4)

which is a finite approximation of equation 2.3. The coefficients c_{i,n} and d_{i,n} are obtained by

    c_{i,n} = Σ_k c_{i−1,k} h(k − 2n)        (2.5)

    d_{i,n} = Σ_k c_{i−1,k} g(k − 2n)        (2.6)
which is the same as convolving the signal c_{i−1,n} with the impulse responses

    h̃(n) = h(−n),  g̃(n) = g(−n)        (2.7)

respectively, and subsequently discarding every other sample. The 2-D transform
uses a family of wavelet functions and its associated scaling function to decompose the original image into different channels, namely the low-low, low-high, high-low and high-high (A, V, H, D respectively) channels. The decomposition process can be recursively applied to the low frequency (LL) channel to generate a dyadic decomposition at the next level. Methods of texture segmentation based on the discrete wavelet transform have been widely used. Mallat (Mallat, 1989) treated the wavelet transform as a texel decomposition where each texel corresponds to a particular wavelet basis function. Mallat used a pyramidal algorithm based on convolutions
with quadrature mirror filters for computing the wavelet transforms. This initial proposal has been followed by several studies on texture classification with a
particular attention to the use of wavelet packets, which constitute a multiband
extension of pyramid structured wavelet transform.
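The analysis recursion of Eqs. 2.5-2.7 (filtering with the time-reversed low-pass and high-pass filters, then discarding every other sample) can be sketched as a small filter bank. The sketch below is illustrative rather than the thesis implementation: it uses the orthonormal Haar pair as stand-in filters, and the function names are our own.

```python
import numpy as np

# Haar analysis filters: an orthonormal stand-in pair (an assumption;
# the thesis later uses Daubechies filters).
H = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass h(n)
G = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass g(n)

def analysis_step(c, filt):
    """One filter-bank step of Eqs. 2.5-2.6:
    out[n] = sum_k c[k] * filt[k - 2n], i.e. filtering followed
    by dyadic downsampling."""
    n_out = len(c) // 2
    out = np.zeros(n_out)
    for n in range(n_out):
        for m, fm in enumerate(filt):
            k = 2 * n + m
            if k < len(c):
                out[n] += c[k] * fm
    return out

def dyadic_decompose(signal, levels):
    """L-level dyadic decomposition: recursively split the
    approximation channel, returning [A_L, D_L, ..., D_1]."""
    c = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        details.append(analysis_step(c, G))
        c = analysis_step(c, H)
    return [c] + details[::-1]

x = np.array([4.0, 6.0, 10.0, 12.0])
A1 = analysis_step(x, H)
D1 = analysis_step(x, G)
# orthonormality preserves energy: ||x||^2 == ||A1||^2 + ||D1||^2
```

With the Haar pair, each approximation coefficient is the scaled pairwise average and each detail coefficient the scaled pairwise difference; substituting Daubechies coefficients for H and G recovers the kind of decomposition used later in the thesis.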
Carter (Carter, 1991) used Morlet and Mexican hat wavelets for texture feature extraction. The wavelets used by Carter were not orthogonal, and the Mexican hat wavelets lacked spatial orientation selectivity. Laine et al. (Laine and Fan, 1993) used wavelet packets, which provide orthogonal and compactly supported wavelets with orientation selectivity. Chang et al. (Chang and Kuo, 1993) used a tree-structured wavelet transform for the classification of textured images. Some of the issues involved in using the wavelet transform for texture analysis are the choice of wavelet basis prototypes and the selection of dilations, translations and rotations.
A method based on a hierarchical wavelet decomposition technique was introduced by Salari and Ling (Salari and Ling, 1995), where Daubechies 4-tap filters are used to decompose the original image into three detail and one approximate sub-band images, followed by K-means clustering for segmentation.
Lu et al. (Lu et al., 1997) proposed a method of unsupervised texture description using the wavelet transform. The proposed methodology has four stages. The first stage computes a smoothed local energy of the wavelet coefficients in high-frequency bands, as features for segmentation. The second stage performs a coarse segmentation using a multi-thresholding technique. In the third stage, the features at different orientations and scales are fused by intra-scale and inter-scale fusion respectively. In the last stage, ambiguously labeled pixels are reclassified in a fine segmentation step. Segmentation results at various scales are integrated by inter-scale fusion to determine the number of classes.
Charalampidis and Kasparis (Charalampidis and Kasparis, 2002) used a set
of new roughness features for texture segmentation. Wavelets are used to ex-
tract single-scale and multiple-scale texture roughness features. These are then
transformed to a rotational invariant feature vector, which has the information of
texture direction.
The M-band wavelet transform is a direct generalization of conventional wavelets. M-band wavelets are able to zoom in onto narrowband high frequency components of a signal and are reported to give better energy compaction than 2-band wavelets. M-band wavelets are a set of M−1 basis functions whose scaled and translated versions form a tight frame for the set of square integrable functions defined over the set of real numbers. Chitre et al. (Chitre and Dhawan, 1999) presented an M-band wavelet technique for the discrimination of natural textures. The M-band wavelet filters were designed using a genetic algorithm based search over the Householder set of parameters. Twenty different categories of textures were decomposed using complete and overcomplete representations, and features were computed on the decomposed sub-bands. A KNN classifier was used to discriminate 20 different combinations of test and training sets containing images of arbitrary size. The results indicate that M-band wavelets discriminate well between natural textures of varying sizes. Acharyya et al. (Acharyya and Kundu, 2000) used M-band wavelets for two-texture segmentation.
Wang et al. (Wang and Zhang, 2003) proposed a supervised texture segmentation algorithm based on the wavelet transform for remote sensing applications. The feature extraction step extracts pixel neighborhood properties using a discrete wavelet frame. A quadrant noise filtering method based on contextual/spatial relationships between detail images was used to acquire a more accurate estimate of the feature space for texture segmentation. The estimated feature vector of each pixel is sent to a Bayes classifier to make an initial probabilistic labeling. To obtain a more accurate segmentation, a probabilistic relaxation method is used to introduce spatial constraints into the segmentation algorithm. The results are shown on a set of satellite images.
Wang et al. (Wang and Liu, 1999) proposed a multiresolution MRF (MRMRF) based model to describe textures. The subbands of textures decomposed using the wavelet transform are modeled with MRFs, and the corresponding Gibbs clique potential parameters are used as features for classification. A comparison of the pyramidal and tree-structured wavelets with Gabor filtering approaches was presented by Pichler et al. (Olaf Pichler and Hosticka, 1996). Results show that both wavelet-based methods are sub-optimal for feature extraction, because the center frequency, orientation and bandwidth cannot be selected freely. The paper concludes that Gabor filtering outperforms the wavelet-based approaches in special cases, but is computationally more expensive.
Gaussian and Laplacian Pyramids
Burt and Adelson (Burt and Adelson, 1983) used Gaussian and Laplacian pyramids to decompose the original image into steps of low-pass and high-pass components respectively. Each stage of the Gaussian pyramid is computed by low-pass filtering the output of the previous stage. Subsequently, the filtered output is subsampled by a factor 2^j, where j is an integer. Subtraction of two corresponding adjacent levels of the Gaussian pyramid gives an approximation of the Laplacian of Gaussian. The details are grouped into a pyramidal structure called the Laplacian pyramid.
This pyramid approach has the advantage of fast computation compared to other transforms. However, the information at different levels is correlated, and this correlation is difficult to model. Further, these pyramids do not possess spatial orientation selectivity in the decomposition process, which is a severe limitation in analyzing oriented textures. Textures with a middle range of frequencies are also difficult to characterize with the pyramids.
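The reduce (blur + subsample) and expand steps described above can be sketched as follows. This is a minimal numpy illustration under our own assumptions (a 5-tap binomial kernel, edge-replication padding, and our function names), not the implementation of Burt and Adelson:

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # binomial ~ Gaussian

def _filter2(img, k):
    """Separable 2-D filtering with edge-replication padding."""
    p = np.pad(img, 2, mode='edge')
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, 'valid'), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, rows)

def reduce_(img):
    """One Gaussian-pyramid stage: low-pass filter, then subsample by 2."""
    return _filter2(img, KERNEL)[::2, ::2]

def expand(img, shape):
    """Upsample by 2 (zero interleaving) and interpolate by filtering."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return _filter2(up, 2.0 * KERNEL)

def laplacian_pyramid(img, levels):
    """Gaussian pyramid gp and band-pass Laplacian pyramid lp, where
    lp[j] = gp[j] - expand(gp[j+1]) approximates a Laplacian of Gaussian."""
    gp = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        gp.append(reduce_(gp[-1]))
    lp = [gp[j] - expand(gp[j + 1], gp[j].shape) for j in range(levels)]
    lp.append(gp[-1])  # coarsest approximation closes the pyramid
    return gp, lp
```

Because each Laplacian level stores exactly what the expand step loses, the original image is recovered exactly by expanding the coarsest level back up and adding the stored details.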
2.5.2 Texture Feature Extraction using Spatial/Spatial-frequency Techniques
Discrete Cosine Transform (DCT)
The DCT transforms a signal from a spatial representation into a frequency representation. Because of its fast implementation and good results, the DCT is widely used in image compression algorithms such as JPEG. The DCT can decompose the image into spectral sub-bands having different importance with respect to the visual quality of the image. An N × 1 DCT basis vector u_m is expressed as

    u_m(k) = √(1/N);                                  m = 1
    u_m(k) = √(2/N) cos[(2k − 1)(m − 1)π / 2N];       m = 2, ..., N        (2.8)
These 1-D DCT vectors can then be used to generate 2-D transform filters appropriate for images. To do so, we simply multiply the column basis vectors with the row vectors of identical length to produce a set of N² 2-D filters, where N is the vector length. Ng et al. (Ng et al., 1992) and Alim et al. (Abdel Alim and Sharkas, 2002) presented a comparative study of two approaches: Gabor filters and the DCT. They found that the DCT gives higher quality segmentation results than Gabor filters. Alim et al. achieved a recognition rate of 96% using DCT coefficients and 92% using Gabor coefficients on human iris image data.
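Eq. 2.8 and the outer-product construction of the N² two-dimensional filters can be sketched as follows (an illustrative numpy sketch; the function names are our own):

```python
import numpy as np

def dct_basis(N):
    """Rows are the N x 1 DCT basis vectors u_m of Eq. 2.8,
    rewritten with 0-based indices m = 0..N-1, k = 0..N-1."""
    U = np.zeros((N, N))
    for m in range(N):
        for k in range(N):
            if m == 0:
                U[m, k] = np.sqrt(1.0 / N)
            else:
                U[m, k] = np.sqrt(2.0 / N) * np.cos(
                    (2 * k + 1) * m * np.pi / (2.0 * N))
    return U

def dct_2d_filters(N):
    """Multiply column basis vectors by row basis vectors of identical
    length: a set of N^2 two-dimensional filters of size N x N."""
    U = dct_basis(N)
    return [np.outer(U[i], U[j]) for i in range(N) for j in range(N)]
```

The basis is orthonormal, so only the first (DC) vector responds to a constant signal, while the remaining vectors capture progressively higher spatial frequencies.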
Gabor Filter
A Gabor filter is a Gaussian modulated by a complex sinusoid. The Fourier transform of the Gabor filter is a Gaussian shifted in frequency space. A 2-D Gabor filter is a bandpass filter with tunable center frequency, orientation and bandwidth. A 2-D Gabor filter G can be expressed as

    G(x, y, ω, σ, θ) = (1 / 2πσ²) exp(−(1/2)((x² + y²)/σ²) + jω(x cos θ + y sin θ))        (2.9)

where ω is the frequency of the modulating sinusoid, σ is the spatial extent (width) of the Gaussian function and θ is the orientation (direction) of the spatial sinusoid.

Fig. 2.2: Real part of a 2-D Gabor filter with different combinations of scale (σ), orientation (θ) and frequency (ω): (a) σ = 6, θ = 0, ω = 2; (b) σ = 6, θ = 45, ω = 2.8.

Fig. 2.2 shows 2-D Gabor filters with different parameters. A 2-D Gabor filter can be considered to be equivalent to a 1-D Gabor filter applied in all directions. On the other hand, a 1-D Gabor filter is characterized only by the width of the Gaussian window and the frequency of the modulating sinusoid. Thus, a 1-D Gabor filter g can be expressed as

    g(x, ω, σ) = (1 / 2πσ²) exp(−x²/(2σ²) + jωx)        (2.10)
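The 2-D filter of Eq. 2.9 can be sampled on a discrete grid as in the sketch below (the grid size and function name are our own illustrative choices):

```python
import numpy as np

def gabor_2d(size, omega, sigma, theta):
    """Sample the 2-D Gabor filter of Eq. 2.9 on a size x size grid
    centred at the origin: a Gaussian envelope times a complex
    sinusoid oriented along theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = np.exp(-0.5 * (x**2 + y**2) / sigma**2) / (2 * np.pi * sigma**2)
    carrier = np.exp(1j * omega * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier

# parameters of Fig. 2.2(a): sigma = 6, theta = 0, omega = 2
g = gabor_2d(31, omega=2.0, sigma=6.0, theta=0.0)
```

Taking the real part of `g` reproduces the kind of even-symmetric filter shown in Fig. 2.2; varying θ rotates the stripes of the carrier within the fixed Gaussian envelope.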
Elementary Gabor signals were introduced by Gabor (Gabor, 1946) and were later extended to 2-D by Daugman (Daugman, 1985). Marcelja (Marcelja, 1980) introduced the Gabor filter as a mathematical representation of the receptive field profiles of visual cortical cells. When appropriately tuned, these filters have been found to be extremely useful for performing texture feature extraction and texture edge detection. Dunn et al. (Dunn et al., 1994) presented an algorithm to design specially tuned Gabor filters to segment images with bipartite textures. The parameter tuning of the Gabor filter bank is the key contribution of this approach.
Haley et al. (Haley and Manjunath, 1999) proposed rotation invariant texture classification using a complete space-frequency model. In this approach, they used a polar, truly analytic (frequency channel) form of 2-D Gabor functions and achieved rotation invariance by transforming the Gabor features into rotation invariant features (using autocorrelation and DFT magnitudes).
Rao et al. (Rao et al., 2004) proposed a method for texture segmentation by combining features obtained using the discrete wavelet transform (DWT) and a Gabor filter bank. These features are classified using an unsupervised fuzzy c-means (FCM) classifier. Results show that the combined features give superior classification performance in comparison to features extracted using only the DWT or Gabor filters.
CHAPTER 3
Textured Object Segmentation using Parametric
Active Contours
This chapter presents a novel technique for textured object segmentation using
parametric active contours, which are also known as snakes (Kass et al., 1988).
The proposed technique is based on the extension of parametric active contours
from the use of normal intensity based features to texture features. Parametric
active contours synthesize parametric curves within the image domain and allow
them to move towards the desired image features under the influence of internal
and external forces. The internal force serves to impose piecewise continuity and smoothness constraints, whereas the external force pushes the snake towards salient image features like edges, lines and subjective contours. The external force in the traditional snake is defined in terms of the image gradient. In the presence of such an external force, the snake is attracted towards large image gradients, i.e. towards the edges in the image. So if it is applied to textured images, it often gets stuck on local texel (the micro-units or cells of a texture) edges and converges at a non-object boundary. Fig. 3.1 shows an example of textured object segmentation using a normal intensity based snake. The segmentation result is shown in Fig. 3.1(b) and the desired result is shown in Fig. 3.1(c). In the segmentation result, we can see that the contour has latched to a non-object boundary (Fig. 3.1(b)). To overcome this effect, we here present a new formulation of the external force in active contours
Fig. 3.1: (a) Synthetic texture image with initial contour, (b) result obtained using a normal intensity based snake, and (c) desired segmentation result.
for textured images, which we name "texture force". The snake, in the presence of the texture force, runs over the texture image surface and detects the object boundary of a texture surface on a background texture. The texture force considers the texture properties of the image for modeling, in place of using the image pixel intensity values directly.
The rest of the chapter is organized as follows. Section 3.1 briefly presents the parametric active contour model and the scalogram (Clerc and Mallat, 2002), which are required for the formulation of the proposed technique. Section 3.2 describes a novel texture feature extraction technique based on the scalogram; we use it to extract the texture characteristics of the input texture images. Modeling of the texture force, the essential part of the modified parametric active contour, is presented in section 3.3. Experimental results are presented in section 3.4 and section 3.5 concludes the chapter.
3.1 Background
3.1.1 Parametric Active Contour (Snake) Model
A traditional active contour (Kass et al., 1988) is defined as a parametric curve C(q) : [0, 1] → R², which minimizes the following energy functional

    E = ∫₀¹ [ (1/2)(α|C′(q)|² + β|C′′(q)|²) + E_ext(C(q)) ] dq        (3.1)
where C′(q) and C′′(q) are the first and second order derivatives of C(q) respectively, and α, β are constants. The first two terms in Eqn. 3.1 are called the elastic and bending energies, which control the elasticity and the bending ability of the snake respectively. The relative importance of these two abilities is controlled by the constants α and β. Since the elastic and bending energies are derived from the contour itself, they constitute the internal component of the snake energy. E_ext, the last term in Eqn. 3.1, is derived from the image and is called the external energy component of the snake. It attracts the snake towards salient features in the image such as edges, object boundaries etc. For an image I(x, y), where (x, y) are spatial co-ordinates, a typical external energy is defined as follows to lead the snake towards step edges (Kass et al., 1988)

    E_ext = −|∇I(x, y)|²        (3.2)
where ∇ is the gradient operator. A snake that minimizes E must satisfy the following Euler-Lagrange equation (Elsgolc, 1963)

    αC′′(q) − βC′′′′(q) − ∇E_ext = 0        (3.3)
Eqn. 3.3 can also be viewed as a force balance equation

    F_int + F_ext = 0        (3.4)

where F_int = αC′′(q) − βC′′′′(q) and F_ext = −∇E_ext. F_int, the internal force, is responsible for stretching and bending, and F_ext, the external force, attracts the snake towards the desired features in the image.
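As a concrete illustration of Eqs. 3.2 and 3.4, the external force field can be computed on a pixel grid with finite differences. This is a sketch under our own naming, not code from the thesis:

```python
import numpy as np

def external_force(image):
    """Traditional snake external force: E_ext = -|grad I|^2 (Eq. 3.2)
    and F_ext = -grad E_ext (Eq. 3.4), evaluated with central
    differences at every pixel."""
    gy, gx = np.gradient(image.astype(float))   # image gradient
    e_ext = -(gx**2 + gy**2)                    # external energy
    fy, fx = np.gradient(-e_ext)                # F_ext = -grad E_ext
    return fx, fy

# a vertical step edge: the resulting force field points towards the edge
img = np.zeros((9, 9))
img[:, 5:] = 1.0
fx, fy = external_force(img)
```

On either side of the step, the x-component of the force points towards the edge, which is exactly why a snake placed near an object boundary is pulled onto it.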
3.1.2 Discrete Wavelet Transform and Scalogram
The discrete wavelet transform (DWT) analyses a signal based on its content in different frequency bandwidths. Therefore, it is very useful in analyzing repetitive patterns such as texture. The DWT decomposes a signal into different subbands (approximation and detail) with different resolutions in frequency and spatial extent. Let ξ(x) be the image signal and ψ_{u,s}(x) be a wavelet function at a particular scale. Then the signal filtered at point u is obtained by taking the inner product ⟨ξ(x), ψ_{u,s}(x)⟩. This inner product is called the wavelet coefficient of ξ(x) at position u and scale s (Mallat, 1999). The scalogram (Clerc and Mallat, 2002) of a signal ξ(x) is the variance of this wavelet coefficient:

    w(u, s) = E{|⟨ξ(x), ψ_{u,s}(x)⟩|²}        (3.5)
Fig. 3.2: (a) Synthetic texture image, (b) magnified view of the 21 × 21 window of the texture cropped at point P shown in (a).
In our work, w(u, s) has been approximated by convolving the square modulus of the filtered outputs with a Gaussian envelope of suitable width (Clerc and Mallat, 2002). w(u, s) gives the energy accumulated in a subband whose frequency bandwidth and center frequency are inversely proportional to the scale. We use scalogram based discrete wavelet features to model the texture characteristics of the image. The scalogram will have large energy in a particular subband if the signal has large spectral energy in the bandwidth of that scale/subband.
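Under the approximation above (square modulus of the filtered outputs smoothed by a Gaussian envelope), the scalogram can be sketched as below. The sample-repetition dilation and the scale-proportional smoothing width are crude illustrative stand-ins for the dyadic wavelet filter bank actually used:

```python
import numpy as np

def gaussian_smooth(x, sigma):
    """Convolve with a normalized Gaussian envelope of width sigma."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    g = np.exp(-0.5 * (t / sigma) ** 2)
    return np.convolve(x, g / g.sum(), mode='same')

def scalogram(signal, wavelet, n_scales):
    """Approximate w(u, s) of Eq. 3.5: filter the signal with dyadic
    dilations of `wavelet`, square the modulus, and smooth with a
    Gaussian envelope; one row per scale."""
    rows = []
    for i in range(n_scales):
        # crude dilation by 2**i via sample repetition (illustrative)
        w = np.repeat(wavelet, 2 ** i) / np.sqrt(2.0 ** i)
        coeff = np.convolve(signal, w, mode='same')
        rows.append(gaussian_smooth(np.abs(coeff) ** 2, sigma=2.0 ** i))
    return np.array(rows)
```

A rapidly oscillating signal accumulates its energy in the small-scale rows, while a slow oscillation accumulates energy at coarse scales; this is precisely the property the texture features of the next section rely on.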
3.2 Texture Feature Extraction
In this section, we explain how the wavelet transform is used to extract the texture features necessary for texture force estimation, and discuss the computational framework based on multi-channel processing. We use a DWT-based dyadic decomposition of the signal to obtain texture properties. The simulated texture image shown in Fig. 3.2(a) is used to illustrate the proposed computational framework, with the results of intermediate stages of processing.
Fig. 3.3: (a) 1-D texture profile of the texture window shown in Fig. 3.2(b), (b) scalogram of the signal shown in (a), (c) texture feature image for the image shown in Fig. 3.2(a).

Modeling of texture features at a point in an image involves two steps: scalogram estimation and texture feature estimation. To obtain texture features at a
particular point (pixel) in an image, an n × n window is considered around the point of interest (see Fig. 3.2(b), which shows the neighborhood of a point P in Fig. 3.2(a)). Intensities of the pixels in this window are arranged in the form of a 1-D vector of length n², whose elements are taken column-wise (or row-wise) from the n × n cropped intensity matrix. This intensity vector (signal), which essentially represents the textural pattern around the pixel, is subsequently used in the estimation of the scalogram.
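The windowing step above can be sketched as a one-line flatten (the function name is ours; `order='F'` gives the column-wise arrangement mentioned in the text):

```python
import numpy as np

def texture_profile(image, row, col, n):
    """Arrange the n x n neighbourhood of (row, col) into a
    length-n^2 1-D texture profile, taken column-wise."""
    half = n // 2
    window = image[row - half:row + half + 1, col - half:col + half + 1]
    return window.flatten(order='F')   # column-wise, as in the text

img = np.arange(25).reshape(5, 5)
profile = texture_profile(img, 2, 2, 3)
```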
3.2.1 Scalogram estimation
An input signal, obtained after arranging the pixels of an n × n window as explained above, is used for scalogram estimation. This signal is decomposed using a wavelet filter. We use an orthogonal Daubechies 2-channel wavelet filter (with dyadic decomposition) for this purpose. A Daubechies filter with level-L dyadic decomposition yields the wavelet coefficients {A_L, D_L, D_{L−1}, ..., D_1}, where A_i represents the approximation coefficients and the D_i's are the detail coefficients. The steps of processing to obtain the scalogram from the wavelet coefficients are similar to those described in (Greiner and Das, 2006; Rao et al., 2006). Fig. 3.3(b) presents an example of the scalogram obtained for the signal shown in Fig. 3.3(a) using level-4 DWT decomposition.
3.2.2 Texture feature estimation
Once the scalogram of the texture profile at a particular point is obtained, a post-processing step is carried out to eliminate non-significant subbands of the scalogram; only significant subbands are used for further processing. This is done since the significant subbands contain the information of the major texture features. Removal of non-significant subbands removes redundant information and makes the computation fast. Wavelet level-L decomposition gives the following set of wavelet subbands B = {A_L, D_L, D_{L−1}, ..., D_1}. Let B_i be the i-th wavelet subband in set B. We use Algorithm 1, given below, to determine significant and non-significant subbands.
Algorithm 1 Identification of significant subbands

for each B_i ∈ B do
    if variance of subband B_i ≤ threshold then
        B_i is a non-significant subband
    else
        B_i is a significant subband
    end if
end for
The value of the threshold is decided empirically. The variance of subband B_i is defined as var(B_i) = E[(B_i − µ_i)²], where µ_i is the mean of all wavelet coefficients belonging to subband B_i. The significant subbands detected from the scalogram are used for texture feature estimation. Texture features are estimated from the "energy measure" of the scalogram coefficients of the significant subbands. This texture feature is similar to the "texture energy measure" first proposed by Laws (Laws, 1980).

For a pixel k in image I, let D_k be the set of all significant subbands of the scalogram S. Then the energy measure of pixel k is estimated as the l1-norm of the scalogram coefficients belonging to the significant subbands, given as follows:

    E_k = (1/N) Σ_{i∈D_k} Σ_j S_{(i,j)}        (3.6)
where S_{(i,j)} is the j-th element of the i-th subband of the scalogram S, and N is the sum of the cardinalities of the members of D_k. This energy measure, computed for all pixels in an image, constitutes the "texture feature image", which is further used in texture force modeling. Fig. 3.3(c) shows the texture feature image computed from the image shown in Fig. 3.2(a). We can see that pixels belonging to a particular texture region exhibit similar energy levels. More examples of the same follow in section 3.4.
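Algorithm 1 and the energy measure of Eq. 3.6 can be sketched together as follows (an illustrative sketch; subbands are passed as 1-D coefficient arrays and the names are our own):

```python
import numpy as np

def significant_subbands(subbands, threshold):
    """Algorithm 1: a subband is significant when the variance of its
    coefficients exceeds the empirically chosen threshold."""
    return [b for b in subbands if np.var(b) > threshold]

def energy_measure(subbands, threshold):
    """Eq. 3.6: l1-norm of the scalogram coefficients over the
    significant subbands, normalized by their total count N."""
    sig = significant_subbands(subbands, threshold)
    if not sig:
        return 0.0
    total = sum(np.sum(np.abs(b)) for b in sig)
    count = sum(len(b) for b in sig)
    return total / count
```

Computing this measure at every pixel yields the texture feature image used for the texture force in the next section.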
3.3 Modeling of Texture Force
To make the active contour sensitive to texture boundaries, we exploit the gradient present in the texture feature image. For a given texture image I(x, y), let f(x, y) be the texture feature image obtained by the method explained in the previous section. The external energy of the snake, based on the gradient present in the texture feature image, can be defined as follows (similar to Eqn. 3.2)

    E_ext^tex = −|∇f(x, y)|²        (3.7)
Hence, the modified total energy of the snake, E^tex, in the case of texture, can be given by replacing E_ext by E_ext^tex in Eqn. 3.1 as follows

    E^tex = ∫₀¹ [ (1/2)(α|C′(q)|² + β|C′′(q)|²) + E_ext^tex(C(q)) ] dq        (3.8)
A snake that minimizes E^tex would satisfy the following Euler-Lagrange equation

    αC′′(q) − βC′′′′(q) − ∇E_ext^tex = 0        (3.9)

Viewing Eqn. 3.9 as a force balance equation, we can write

    F_int + F_ext^tex = 0        (3.10)

where F_int = αC′′(q) − βC′′′′(q) is the same internal force as for a snake applied to a normal intensity image. F_ext^tex defines the external force in the texture based model of the snake, and we call it the "texture force". The snake, in the presence of the texture force, is attracted towards the texture object boundary. The texture force is the negative gradient of the external energy E_ext^tex:

    F_ext^tex = −∇E_ext^tex        (3.11)
To find the object boundary, the active contour deforms; it can therefore be represented as a time varying curve C(q, t) = [x(q, t), y(q, t)], where q ∈ [0, 1] is the arc-length and t ∈ R⁺ is time. The dynamics of the contour are governed by the following equation, obtained by setting the partial derivative of C(q, t) w.r.t. t equal to the left hand side of Eqn. 3.9:

    C_t(q, t) = αC′′(q) − βC′′′′(q) − ∇E_ext^tex
              = F_int + F_ext^tex        (3.12)
When the solution C(q, t) stabilizes, the term C_t(q, t) vanishes and we obtain a solution of Eqn. 3.9. This dynamic equation can also be viewed as a gradient descent algorithm (Cohen et al., 1992) designed to minimize Eqn. 3.8. A solution to Eqn. 3.12 can be found by discretizing the equation and solving the discrete system iteratively (Kass et al., 1988). In the following section, results of the proposed technique are presented.
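The discretized evolution can be sketched with an explicit gradient-descent update. The thesis follows the implicit scheme of Kass et al.; the explicit variant below is an illustrative simplification with our own names and step sizes:

```python
import numpy as np

def internal_force(C, alpha, beta):
    """F_int = alpha*C'' - beta*C'''' for a closed contour, using
    circular finite differences (C is an n x 2 array of snake points)."""
    d2 = np.roll(C, -1, axis=0) - 2 * C + np.roll(C, 1, axis=0)
    d4 = np.roll(d2, -1, axis=0) - 2 * d2 + np.roll(d2, 1, axis=0)
    return alpha * d2 - beta * d4

def evolve(C, ext_force, alpha=0.1, beta=0.01, dt=0.1, steps=500):
    """Iterate C <- C + dt * (F_int + F_ext), the discretized
    gradient-descent form of Eq. 3.12."""
    for _ in range(steps):
        C = C + dt * (internal_force(C, alpha, beta) + ext_force(C))
    return C

# with zero external force, the internal force smooths and shrinks
# a closed contour towards its centroid
theta = np.linspace(0, 2 * np.pi, 32, endpoint=False)
C0 = np.stack([5 * np.cos(theta), 5 * np.sin(theta)], axis=1)
C1 = evolve(C0, lambda C: np.zeros_like(C), beta=0.0, steps=200)
```

In use, `ext_force` would sample the texture force field of Eq. 3.11 at the snake points; equilibrium is reached when it balances the internal force at every point on the contour.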
3.4 Experimental Results
To automatically detect the boundary of a particular object using active contours
in presence of texture, a contour is initialized near the desired object. The contour
is then allowed to deform towards the object boundary until it latches around
the object. In case of textured images, object boundary is identified as the point
where texture property changes i.e. where two texture regions meet. On a texture
surface, snake in presence of texture force stops moving as it reaches a different
texture region. For a snake to stop at the texture boundary, net effect of the
47
(a) (b) (c)
Fig. 3.4: Results of our proposed technique: first five texture images are composedof two Brodatz textures (Brodatz, 1966) and the last image is a natural Zebra image,(a) input texture images, (b) corresponding texture feature images, (c) segmentationresults. Contour shown in images depicts the estimated boundary of the foregroundobject with texture.
48
damping, internal and external forces (texture force) should be zero for all snake
points at the object boundary. To demonstrate the performance of the snake in
the presence of texture force, various kinds of synthetic and natural textures are
used. We have used Daubechies 8-tap 2-channel filter for DWT decomposition in
all our experiments.
Experimental results are shown in Figs. 3.4 - 3.6. Fig. 3.4 shows results on a few synthetic texture images and one natural image. The first column in this figure shows the input texture images. The first five input images are composed of two Brodatz (Brodatz, 1966) textures and the last image is a natural image of a Zebra. The second column of Fig. 3.4 shows the texture feature images for the respective texture images. One can note that the energy values exhibited in the texture feature images are distinct for the two different texture regions. The last column of Fig. 3.4 shows the segmentation results using the proposed technique. To generate the texture feature images of these input texture images, we considered an 11 × 11 window at each pixel. DWT decomposition was done up to level-4. The computational time required to perform object segmentation in the images shown in Fig. 3.4 is given in Table 3.1.
Table 3.1: Computational time required for the segmentation of textured objects in images shown in Fig. 3.4. Images are numbered from top to bottom in Fig. 3.4.

Image Name    Image Size (in pixels)    Computational Time (in seconds)
Image-1       197 × 155                 58
Image-2       197 × 155                 60
Image-3       197 × 155                 60
Image-4       102 × 94                  26
Image-5       197 × 155                 59
Image-6       170 × 145                 46
Fig. 3.5 shows comparative results on one synthetic and a few natural images. Each row in the figure gives a comparative study of the results: we first show the results available in the literature, followed by the result obtained using our proposed technique. We can see that the results obtained using our proposed technique are better than the results available in the literature in most of the cases.

Fig. 3.6 explicitly shows the comparative study on the image of a Zebra that occurs quite often in the texture segmentation literature (Sagiv et al., 2002; Rousson et al., 2003; Sagiv et al., 2006; Awate et al., 2006). On careful observation, our segmentation result (Fig. 3.6(f)) is superior to the results obtained using other techniques (Sagiv et al., 2002; Rousson et al., 2003; Sagiv et al., 2006; Awate et al., 2006; Gupta and Das, 2006). The results reported in the literature show errors in one or more of the following regions of the object: the mouth, the back, and the area near the legs.
3.5 Summary and Discussion
In this chapter, a new texture feature extraction technique is described. This technique is based on the scalogram (Clerc and Mallat, 2002), which is computed using the discrete wavelet transform. We also introduce a new external force for snakes, which we call the "texture force". The texture force is modeled using the scalogram based texture features. The snake, in the presence of texture force, is used for textured object segmentation. We show the segmentation results of the parametric active contour in the presence of texture force on various synthetic and natural texture
Fig. 3.5: Comparative results. (I) Results on a synthetic texture image: (a) from Sagiv et al. (2006), (b) our result. (II) Results on the Zebra-1 image: (a) from Paragios and Deriche (2002b), (b) from Rousson et al. (2003), (c) our result. (III) Results on the Zebra-2 image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) our result. (IV) Results on the Cheetah image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) result of the proposed method.

Fig. 3.6: Comparative results on the Zebra image: (a) reproduced from Sagiv et al. (2002), (b) reproduced from Rousson et al. (2003), (c) reproduced from Awate et al. (2006), (d) reproduced from Sagiv et al. (2006), (e) obtained using the method presented in Gupta and Das (2006), (f) result of the proposed method.
images. We also compare a few of our results with the results of other techniques available in the literature. We can see that in most of the cases our proposed method provides better performance than the other texture object segmentation techniques proposed in the literature.
Since the textured object segmentation technique proposed in this chapter is based on the parametric active contour, which does not intrinsically deal with the segmentation of multiple objects, the technique can only handle the segmentation of a single object. The segmentation of multiple textured objects is discussed in the next chapter.
52
CHAPTER 4
Segmentation of Multiple Textured Objects
using Geodesic Active Contours
In the previous chapter, we extended parametric active contours to textured object segmentation. Parametric active contours have many advantages over geodesic active contours, such as straightforward implementation, computational efficiency, better user interactivity and fast convergence. However, their inability to automatically handle topological changes of the contour during the evolution makes them unfit for multiple object segmentation. Several algorithms have been proposed to overcome this limitation of parametric active contours (Leitner and Cinquin, 1991; Szeliski et al., 1993; McInerney and Terzopoulos, 1996; Lachaud and Montanvert, 1999; Delingette and Montagnat, 2001). In most of these techniques, a procedure is set up to monitor the deformation of the contour and to change the topology when required. Heuristics are also often used for detecting possible splitting and merging of the deforming contours.
Figure 4.1 shows an example of multiple object segmentation using parametric active contours: a deforming contour intersecting with itself. In such cases, the parametric active contour requires a special strategy that monitors the deformation of the contour and, in case of self-intersection, handles the splitting of the contour. Handling topological changes of parametric active contours through special procedures and heuristics is a problem when an unknown number of objects must be detected simultaneously. The parametric active contour approach is also non-intrinsic, since the energy of the contour depends on the parametrization of the curve and is not directly related to the object geometry.
The above mentioned problems of parametric active contours are intrinsically solved by the geodesic active contour proposed by Caselles et al. (1997). Since geodesic active contours can intrinsically handle topological changes during the process of segmentation, in this chapter we extend the geodesic active contour algorithm to the segmentation of multiple textured objects in the presence of a background texture. Our algorithm is based on the generalization of the geodesic active contour model from a one-dimensional intensity based feature space to a multi-dimensional feature space (Sapiro, 1997). In our approach, the image is represented in an n-dimensional texture feature space which is derived from the image using the scalogram (Clerc and Mallat, 2002) obtained from the discrete wavelet transform. We use the geodesic active contour mechanism for textured object segmentation by generalizing its edge indication (stopping) function from an intensity based feature space to a texture feature space. Details are presented in section 4.3. Similar approaches, where the geodesic active contour scheme is applied to some feature space of the image, have been studied in the literature (Sagiv et al., 2000; Lorigo et al., 1998; Paragios and Deriche, 1999b; Sagiv et al., 2006).
The rest of the chapter is organized as follows. Section 4.1 briefly discusses the geodesic active contour, which provides essential background for the formulation of the proposed technique. In Section 4.2, we present a novel texture feature extraction technique to extract multi-dimensional texture features of the input image. Section 4.3 presents the modified geodesic active contour for segmentation of multiple textured objects. In section 4.4, we present experimental results, and we conclude the chapter in section 4.5.

Fig. 4.1: Problem of segmentation of multiple objects using a parametric active contour.
4.1 Geodesic Active Contour
In this section, we briefly review the geodesic active contour technique (Caselles
et al., 1997) for non-textured images. Generalization of the technique for segmen-
tation of multiple textured objects is described in section 4.3.
Let C(q) : [0, 1] → R² be a parameterized curve, and let I : [0,m] × [0,n] → R⁺ be the image in which we want to detect the object boundaries. Let g(r) : [0,∞) → R⁺ be an inverse edge detector, such that g → 0 as r → ∞. g represents the edges in the image and has a fundamental role in the success of the geodesic active contour mechanism; if it does not represent the edges well, the mechanism is likely to fail. In the geodesic active contour technique, minimization of the energy functional proposed in the classical snakes (Kass et al., 1988) is generalized to finding a geodesic curve in a Riemannian space (Caselles et al., 1997) with a metric derived from the image, by minimizing the
following functional:

    L_R = ∫ g(|∇I(C(q))|) |C′(q)| dq        (4.1)
L_R is a new definition of length (called the geodesic length) in the Riemannian space. This new length can be considered as a weighted length of a curve, where the Euclidean length element is weighted by a factor g(|∇I(C(q))|), which contains information regarding the boundaries (edges) in the image. To find this geodesic curve, the steepest gradient descent method is used, which gives the following curve evolution equation to obtain a local minimum of L_R. A complete geometric interpretation can be found in (Caselles et al., 1997).
    dC/dt = g(|∇I|) k N − (∇g · N) N        (4.2)

where k denotes the Euclidean curvature and N is the unit inward normal to the curve.
Let us define a function u : [0,m] × [0,n] → R such that the curve C is parameterized as a level set of u, i.e. C = {(x, y) | u(x, y) = 0}. Now, we can use the Osher-Sethian level set approach (Osher and Sethian, 1988) and replace the above evolution equation for the curve C with an evolution equation for the embedded function u as follows:
    du/dt = |∇u| div( g(∇I) ∇u / |∇u| ) = g(∇I) |∇u| k + ∇g · ∇u        (4.3)
where div is the divergence operator. The stopping function g(∇I) is generally given
Fig. 4.2: Geometric interpretation of the attraction force in 1-D: (a) the original edge signal I, (b) the smoothed version of I, and (c) the derived stopping function g. The evolving contour is attracted to the valley created by ∇g · ∇u. (Figure taken from Caselles et al., 1997.)
by

    g(∇I(x, y)) = 1 / (1 + |∇I(x, y)|^p)

where p is an integer, usually equal to 1 or 2. The goal of g(∇I) is to stop the evolving curve when it reaches the object boundary. For an ideal edge, ∇I is very large, hence g = 0 at the edge and the curve stops (u_t(x, y) = 0). The boundary is then given by u(x, y) = 0.
The term ∇g · ∇u in Eqn. 4.3 has a special significance. It attracts the curve towards the boundaries of objects and plays an important role in cases where the intensity gradient varies along the edge (object boundary), as often happens in real images. Fig. 4.2 shows g and its gradient vectors for a 1-D case. We can observe how the gradient vectors are directed towards the middle of the boundary. These vectors direct the propagating curve into the “valley” of the g function and do not allow it to move out once it falls in the valley.
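The evolution of Eqn. 4.3 can be prototyped with simple finite differences. The sketch below is our own minimal illustration, not the thesis implementation: it assumes a precomputed stopping function g on the pixel grid, and the explicit Euler step size dt is an assumption.

```python
import numpy as np

def evolve_level_set(u, g, dt=0.2, n_iter=100):
    """Explicit scheme for Eqn. 4.3: du/dt = g |grad u| k + grad g . grad u,
    where k is the curvature of the level sets of u."""
    gy, gx = np.gradient(g)                      # components of grad g
    for _ in range(n_iter):
        uy, ux = np.gradient(u)                  # components of grad u
        norm = np.sqrt(ux**2 + uy**2) + 1e-8     # |grad u|, regularized
        # curvature k = div(grad u / |grad u|)
        nxy, nxx = np.gradient(ux / norm)
        nyy, nyx = np.gradient(uy / norm)
        k = nxx + nyy
        u = u + dt * (g * norm * k + gx * ux + gy * uy)
    return u
```

With a constant g this reduces to curvature flow, which shrinks a circular contour; with an edge-based g the term ∇g · ∇u pulls the contour into the valleys of g, as described above.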
4.2 Multi-dimensional Texture Feature Extraction
This section explains a novel technique for multi-dimensional texture feature extraction using the scalogram (Clerc and Mallat, 2002). The technique involves two main steps: scalogram estimation and texture feature estimation. The procedure to obtain the scalogram at a particular point (pixel) is similar to that explained in section 3.2.1. The following subsection presents the texture feature estimation technique used here to extract a multi-dimensional texture feature.
4.2.1 Multi-dimensional Texture Feature Estimation
Once the scalogram of the texture profile at a particular point is obtained, it is used
for multi-dimensional texture feature estimation. Texture features are estimated
from the “energy measure” of the coefficients of the scalogram subbands. This
texture feature is similar to the “texture energy measure” first proposed by Laws (1980).
Let E be the texture energy image for the input texture image I. E defines a functional mapping from the 2-D pixel coordinate space onto a multi-dimensional energy space Γ, i.e. E : [0,m] × [0,n] → Γ. For the k-th pixel in I, let D_k be the set of all subbands of a scalogram S and E_k ∈ Γ be the texture energy vector associated with it. In the simplest case, E_k can be a 1-D energy measure (similar to that presented in the previous chapter), which can be estimated as follows:

    E_k = (1/N) Σ_{i∈D_k} Σ_j S(i,j)        (4.4)
where S(i,j) is the j-th element of the i-th subband of the scalogram S, and N is the sum of the cardinalities of all the members of D_k. This energy measure takes the l1 norm of the coefficients belonging to all subbands of the scalogram computed for the pixel k, and has the advantage of being simple to compute. However, it does not always represent the textural information well. A better texture energy space Γ can be created by taking the l1 norm of each subband of the scalogram S separately. In this case, Γ is an n-dimensional energy space where n = L + 1 for a level-L decomposition. Formally, the i-th element of the energy vector E_k ∈ Γ is given as follows:

    E_(k,i) = (1/N_i) Σ_j S(i,j)        (4.5)

where i indexes the i-th scalogram subband of the set D_k and N_i is the cardinality of this subband. The texture energy image computed using Eqn. 4.5 is a multi-dimensional image and provides more discriminative information for estimating the texture boundaries. This texture energy measure constitutes a texture feature image f.
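For illustration, Eqn. 4.5 amounts to a mean absolute coefficient value per subband. A minimal NumPy sketch, assuming the subband coefficients are already available as a list of arrays (the variable and function names are ours):

```python
import numpy as np

def subband_energies(subbands):
    """Eqn. 4.5: the i-th texture feature is the l1 norm of the i-th
    subband's coefficients divided by the subband's cardinality N_i."""
    return np.array([np.abs(np.asarray(s, dtype=float)).sum() / np.asarray(s).size
                     for s in subbands])
```

For a level-L decomposition this yields the (L + 1)-dimensional energy vector E_k at each pixel.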
One common problem in texture segmentation is the precise and efficient detection of the boundary. A pixel near the texture boundary has neighboring pixels belonging to different texture regions. In addition, a textured image may contain non-homogeneous, non-regular texture regions. This causes the obtained energy measure to deviate from its “expected” value. Hence, it is necessary
that the obtained feature image be further processed to remove noise and outliers. To do so, we apply a smoothing operation to the texture energy image in every band separately. In our smoothing method, the energy measure of the k-th pixel for a particular band is replaced by the average of the block of energy measures centered at pixel k in that band. In addition, in order to reduce block effects and to reject outliers, the p percent largest and smallest energy values within the window block are excluded from the calculation. Thus, the smoothed texture feature value of pixel k in the i-th band of the feature image is obtained as:

    f_(k,i) = [1 / (w²(1 − 2p%))] Σ_{j=1}^{w²(1−2p%)} E_(k,j)        (4.6)
where the E_(k,j)'s are the energy measures retained within the w × w window centered at pixel k, for the i-th band of the texture energy image. The window size w × w and the value of p are chosen experimentally to be 11 × 11 and 10, respectively, in our experiments. The texture feature image f, computed by smoothing the texture energy image E as explained above, is used in the computation of texture edges using the inverse edge indicator function described in the next section.
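The smoothing of Eqn. 4.6 is a trimmed mean over a w × w window. A sketch for a single interior pixel, with our own function and argument names:

```python
import numpy as np

def trimmed_window_mean(band, row, col, w=11, p=0.10):
    """Eqn. 4.6: average the w*w energy values centered at (row, col),
    excluding the p fraction largest and p fraction smallest values."""
    h = w // 2
    block = np.sort(band[row - h:row + h + 1, col - h:col + h + 1].ravel())
    n_drop = int(round(block.size * p))   # values trimmed at each end
    return block[n_drop:block.size - n_drop].mean()
```

Applying this at every pixel of every band of E yields the feature image f; an outlier such as a single spuriously large energy inside the window is discarded rather than averaged in.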
4.3 Geodesic Active Contours in Texture Feature Space
We employ the geodesic active contour technique in the scalogram based texture feature space, using the generalized inverse edge detector function g proposed in (Sochen et al., 1998; Sagiv et al., 2000). The geodesic active contour, in the presence of a texture feature based inverse edge detector g, is attracted towards the texture boundary. Sochen et al. (1998) suggest that an image can be described as a 2-D Riemannian manifold, representing the spatial extent of the image, embedded in a higher-dimensional feature space via the Beltrami framework (Kimmel et al., 1998). They show that the determinant of the metric of the 2-D image manifold can be interpreted as a measure of the presence of a gradient on the manifold, so the metric of this surface can be used as an indicator of the edges present in the image. They also show that if the metric of the higher-dimensional embedding space is known, it can be used to derive the metric of the lower-dimensional space using the pullback mechanism (Sochen et al., 1998; Sagiv et al., 2000).
Let X : Σ → M be an embedding of Σ in M, where M is a Riemannian manifold with a known metric and Σ is another Riemannian manifold with an unknown metric. As proposed in (Sochen et al., 1998), the metric on Σ can be constructed from the knowledge of the metric on M using the pullback mechanism. If Σ is a 2-D image manifold embedded in the n-dimensional manifold of the texture feature space f(x, y) = (f¹(x, y), ..., fⁿ(x, y)), the metric h(x, y) of the 2-D image manifold can be obtained from the embedding texture feature space as follows (Sochen et al., 1998):

    h(x, y) = [ 1 + Σi (f^i_x)²     Σi f^i_x f^i_y   ]
              [ Σi f^i_x f^i_y      1 + Σi (f^i_y)²  ]        (4.7)
As discussed above, the determinant of the metric h provides a good indicator of the gradient present in the image manifold. If the embedding space is created using the image texture features, the metric h can provide information about the texture edges present in the image. So, given the texture features, we can derive the metric of the image manifold embedded in that feature space and use it, as described, to create the edge indication function.
It turns out that the inverse of the metric's determinant serves as a good edge indicator. Hence the stopping function g used in the geodesic active contour for texture boundary detection can be given as the inverse of the determinant of the metric h (Sagiv et al., 2000):

    g(x, y) = 1 / det(h(x, y))        (4.8)

where det(h) is the determinant of h.
In our segmentation approach, we describe the input image as a manifold embedded in the scalogram based texture feature space and derive the metric of the image manifold using Eqn. 4.7. Then we use Eqn. 4.3, with g obtained using Eqn. 4.8, for the segmentation of the textured object(s) from a background texture.
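Eqns. 4.7 and 4.8 can be evaluated directly from the feature gradients. The following sketch is our own illustration, assuming the n feature bands are stacked in an array of shape (n, H, W):

```python
import numpy as np

def stopping_function(features):
    """Build the pullback metric h of Eqn. 4.7 at every pixel and return
    the stopping function g = 1/det(h) of Eqn. 4.8."""
    features = np.asarray(features, dtype=float)
    h11 = np.ones(features.shape[1:])   # 1 + sum_i (f^i_x)^2
    h22 = np.ones(features.shape[1:])   # 1 + sum_i (f^i_y)^2
    h12 = np.zeros(features.shape[1:])  # sum_i f^i_x f^i_y
    for f in features:
        fy, fx = np.gradient(f)
        h11 += fx**2
        h22 += fy**2
        h12 += fx * fy
    return 1.0 / (h11 * h22 - h12**2)   # det(h) >= 1, so g lies in (0, 1]
```

In flat texture regions det(h) = 1 and g = 1; across a texture edge the feature gradients grow, det(h) increases, and g falls towards zero, which is exactly the stopping behavior required.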
4.3.1 Segmentation of multiple textured objects
Curve evolution in the modified geodesic active contour technique presented here, for segmentation of multiple textured objects, is performed using the level set approach (Osher and Sethian, 1988). In the level set approach, a 2-D contour is represented implicitly as the zero level set of a 3-D surface (called the level set surface). Deformation of the contour is carried out by deforming the level set surface (see Eqn. 4.2 and Eqn. 4.3). We use scalogram based texture features to
Fig. 4.3: Demonstration of segmentation of multiple objects using the texture feature based geodesic active contour method: (a) evolution of the level set surface (LSS), (b) evolving zero level set contour. The first row shows the initial LSS and zero level set contour; the last row shows the final state of the LSS and zero level set contour after convergence.
Fig. 4.4: A close look at the process of level set surface (LSS) evolution and contour splitting, from a different 3-D viewing angle, for the segmentation of the multiple textured objects shown in Fig. 4.3. Figures (a)-(h) show the evolving LSS with the zero level set contour (shown in red on the surface); (a) shows the initial LSS and (h) shows the final LSS after segmentation.
model the deformation of a level set surface for segmentation of multiple textured objects. Contour evolution using the level set approach makes our approach topologically independent, because different topologies of the zero level set contour do not imply different topologies of the level set surface. Evolving contours therefore naturally split and merge, allowing the simultaneous detection (segmentation) of several textured objects; the number of objects to be segmented in the scene need not be known a priori. At the boundary of a textured object, the value of g (defined in Eqn. 4.8) vanishes and the contour stops.
Fig. 4.3 demonstrates the segmentation of multiple textured objects using the proposed approach. Fig. 4.3(a) shows the evolving level set surface and Fig. 4.3(b) shows the evolving zero level set contour on the texture image (from top to bottom). The zero level set contour is also shown on the level set surface in red. Fig. 4.4 provides a different 3-D viewing angle on the evolving level set surface, for segmentation of the objects in the simulated image shown in Fig. 4.3(b). Since the level set surface need not change its topology to split the evolving contour, it can intrinsically represent multiple split contours. Section 4.4 presents textured object segmentation results for synthetic and natural images.
4.4 Experimental Results
We have applied the proposed technique to both synthetic and natural texture images to show its effectiveness. For an input image, the texture feature space is created using the scalogram, as discussed in section 4.2. In all of our experiments, we use
Eqn. 4.5 for texture energy estimation. We use the orthogonal Daubechies 2-channel wavelet filter (with dyadic decomposition) for signal decomposition.

Fig. 4.5: Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by the inverse edge detector (Eqn. 4.8) using the scalogram based texture features, (c) input image with the initial contour marked around the objects in black, (d)-(j) intermediate positions of the evolving contour during the segmentation process, (k) segmentation result where the segmented object boundaries are shown in black.

Fig. 4.6: Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by the inverse edge detector using the scalogram based texture features, (c) input image with the initial contour marked around the objects in black, (d)-(j) intermediate positions of the evolving contour during the segmentation process, (k) segmentation result where the segmented object boundaries are shown in black.

The
metric of the image manifold is computed by considering the image manifold embedded in the higher-dimensional scalogram based texture feature space. This metric is used to obtain the texture edge detector, which serves as the stopping term in the geodesic active contour mechanism. Initialization of the geodesic active contour is done using a signed distance function. To generate texture features of the images in our experiments, we considered a 12 × 12 window at each pixel; DWT decomposition was done up to level-4.
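The signed distance initialization mentioned above can be sketched for a circular initial contour; the grid, center and radius arguments are illustrative, and the sign convention (negative inside) is a choice:

```python
import numpy as np

def circle_signed_distance(shape, center, radius):
    """Level set surface u0 whose zero level set is the initial circular
    contour: negative inside, zero on the circle, positive outside."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return np.sqrt((xx - center[1])**2 + (yy - center[0])**2) - radius
```

Because u0 is a distance function, |∇u0| ≈ 1 initially, which keeps the early level set updates well behaved.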
To start the segmentation, an initial contour is placed around the object(s) to be segmented. The contour moves towards the object boundary to minimize the objective function L_R (Eqn. 4.1) in the presence of the new g (Eqn. 4.8). Often, to speed up computation and to prevent the contour from getting stuck at spurious edges, the contour is initialized near the object boundary (e.g. Fig. 4.7(a), last row). Segmentation results obtained using the proposed technique are shown in Figs. 4.5-4.10.
Figs. 4.5-4.6 show results on two synthetic images. Figs. 4.5-4.6(a) show the
images to be segmented and Figs. 4.5-4.6(b) show the respective outputs of the
inverse edge detector (Eqn. 4.8) computed using the scalogram based texture
features for the input images. Figs. 4.5-4.6(c) and (k) show the positions of
the initial contour and the final segmentation output respectively. Intermediate
evolving contours are shown in Figs. 4.5-4.6(d) to (j).
Fig. 4.7 shows segmentation results on natural images. The input texture images are shown with the initial contour (in black) marked around the object(s) to be segmented (Fig. 4.7(a)). Texture edge maps of the images, computed by the inverse edge detector (Eqn. 4.8) using scalogram based texture features, are shown in Fig. 4.7(b). Final segmentation results are shown in Fig. 4.7(c), where the identified object boundaries are marked in black. For one natural image (the Zebra image shown in the second row of Fig. 4.7(a)), we show the evolution process of the contour in Fig. 4.8. Results of the proposed approach are encouraging. The proposed approach works well even when the background has more than one texture region (see Fig. 4.7, last row). The computational time required to perform object segmentation in the images shown in Fig. 4.7 is given in Table 4.1.
Table 4.1: Computational time required for the segmentation of textured objects in the images shown in Fig. 4.7. Images are numbered from top to bottom in Fig. 4.7.

    Image Name    Image Size (in pixels)    Computational Time (in seconds)
    Image-1       226 × 392                 300
    Image-2       200 × 300                 250
    Image-3       226 × 372                 330
    Image-4       210 × 300                 260
In Figs. 4.9 and 4.10, we compare the performance of the proposed textured object segmentation method with that of other techniques available in the literature. Fig. 4.9 shows comparative results on two synthetic texture images, while Fig. 4.10 shows comparative results on two natural images. Our results for both types of images are quite encouraging: in most cases, they are comparable to or better than the results obtained using other techniques.
In Fig. 4.11, we compare the results obtained using the geodesic active contour based technique (presented here) with those of the technique described in the previous chapter for the segmentation of a single textured object (using the parametric active contour). We

Fig. 4.7: Segmentation results on natural images: (a) input images with initial contours marked around the object(s) in black, (b) texture edge maps produced by the inverse edge detector (Eqn. 4.8) using the scalogram based texture features for the respective images, (c) segmentation results for the images shown in column 1, where the boundaries of the segmented objects are shown in black.
Fig. 4.8: Contour evolution in the segmentation of the Zebra image: (a) input image with the initial contour marked around the objects in black, (b)-(i) intermediate positions of the evolving contour during the segmentation process, (j) segmentation result where the boundaries of the segmented objects are shown in black.
Fig. 4.9: Comparative results: (I) (a) input image, (b) result reproduced from Kim et al. (2002), (c) our result; (II) (a) input image, (b) result reproduced from Paragios and Deriche (2002a), (c) our result.
see that the results obtained using the technique presented in this chapter are better than the results presented in the previous chapter. This is mainly due to two reasons:

1. The multi-dimensional texture feature provides more discriminative information about the texture regions and helps the inverse edge detector to estimate the texture edges more precisely. In contrast, in the case of the parametric active contour, a scalar texture feature was obtained by taking the average of all significant subbands (see section 3.2).

2. The algorithm presented in this chapter for texture object segmentation uses the geodesic active contour, which provides better boundary localization (Caselles et al., 1997) and produces smooth contours.
The overall computational cost of texture feature extraction and segmentation is in the range of 70 to 90 seconds on a P-IV 3 GHz machine with 2 GB RAM, for images of size 100 × 100.
Fig. 4.10: Comparative results: (I) results on Zebra image: (a) reproduced from Paragios and Deriche (2002b), (b) reproduced from Rousson et al. (2003), (c) produced by our proposed technique; (II) results on Cheetah image: (a) reproduced from Kim et al. (2002), (b) reproduced from Rousson et al. (2003), (c) result produced by our proposed technique.
4.5 Summary and Discussion
In this chapter, we first described a novel multi-dimensional scalogram based texture feature extraction technique, and then used the obtained texture features to derive an inverse edge indication function. This inverse edge indication function is used along with the geodesic active contour to perform segmentation of multiple textured objects in the presence of background texture. In the segmentation approach presented in this chapter, the input image is processed to obtain a multi-dimensional texture feature image, thereby representing the image in a multi-dimensional feature space. The edge indication (stopping) function used in the geodesic active contour is derived from the texture feature space of the image by viewing the feature space as a manifold (Sochen et al., 1998). The geodesic active contour, in the presence of this edge indication function, stops at the texture boundary. Since the geodesic active contour can handle topological changes in
Fig. 4.11: Comparison of single textured object segmentation results obtained using the parametric active contour based technique (presented in section 3.3) and the geodesic active contour based technique (presented in section 4.3): (a) results obtained using the parametric active contour based technique, (b) results obtained using the geodesic active contour based technique.
the contour intrinsically, the segmentation technique presented here can handle the segmentation of multiple textured objects simultaneously. The main contributions of this chapter are: (1) the development of a new scalogram based multi-dimensional texture feature with strong texture discriminating power, and (2) the use of this feature to define a good texture edge indicator function, which is used in the geodesic active contour for segmentation of multiple textured objects.
We validated our technique using various synthetic and natural texture images, and compared a few of our results with those of other techniques available in the literature. The segmentation results obtained are quite encouraging and accurate for both synthetic and natural images. From Fig. 4.11, we observe that the results obtained for single texture object segmentation using the technique presented in this chapter are better than those obtained in chapter 3.
CHAPTER 5
SnakeCut: An Automatic Technique for
Segmentation of a Foreground Object with Holes
This chapter proposes an efficient, semi-interactive method for foreground object segmentation in color images, based on the integration of two popular foreground object segmentation techniques: parametric active contours (Snakes) (Kass et al., 1988) and GrabCut (Rother et al., 2004). As seen in chapter 3, the Snake is a deformable contour which segments an object boundary using boundary discontinuities, by minimizing the energy function associated with the contour. GrabCut is an interactive tool based on iterative graph-cut for foreground object segmentation in still images. GrabCut provides a convenient way to encode color features as segmentation cues, obtaining a foreground segmentation from local pixel similarities using modified iterated graph-cuts (Boykov and Jolly, 2001). GrabCut has been applied in many applications for foreground extraction from an image (Haasch et al., 2005; Moller et al., 2005; Deepti et al., 2007). In this chapter, we first present a comparative study of these two segmentation techniques and illustrate conditions under which either or both of them fail. We then propose a novel formulation for integrating these two complementary techniques to obtain an automatic segmentation of a foreground object with holes. We call the proposed integrated approach “SnakeCut”; it is based on a probabilistic framework.
The rest of this chapter is organized as follows. In section 5.1, we briefly present the active contour (for color images) and GrabCut techniques, which provide the theoretical basis for the chapter. Section 5.2 compares the two techniques and discusses the limitations of both. In section 5.3, we present the SnakeCut algorithm, our proposed technique for segmentation of a foreground object. Section 5.4 presents results on simulated and natural images. We conclude the chapter in section 5.5.
5.1 Preliminaries
5.1.1 Parametric Active Contour (Snake) Model for Color Images¹
Active contours are energy minimizing contours. The energy associated with the contour is defined using internal (derived from the contour) and external (derived from the image) parameters, and the contour minimizes this energy to estimate the object boundary. We extend the traditional active contour algorithm to color images by modifying its external energy function. In a traditional active contour for gray level images, a typical external energy, defined to lead the Snake towards step edges, is (Kass et al., 1988):

    E_ext = −|∇I(x, y)|²        (5.1)

where I(x, y) is an image with (x, y) as spatial coordinates. So the external energy in gray level images depends on the intensity gradient present in the image.

¹ Details of the Snake technique have been presented in section 3.1.1.
Fig. 5.1: Estimation of gradient in color and gray level images: (a) input image, (b) gradient image of (a) estimated using Eqn. 5.2, (c) gray scale version of (a), (d) gradient image of (c).
To define the external energy in color images, we estimate the intensity gradient by taking the maximum of the gradients of the R, G and B bands at every pixel:

    |∇I| = max(|∇R|, |∇G|, |∇B|)        (5.2)

Alternatively, the external energy of the contour in color images can be defined by first converting the color image to gray level and then using Eqn. 5.1, but this method does not accurately represent the edges present in the image. Fig. 5.1(b) shows an example of the intensity gradient estimated using Eqn. 5.2 for the image shown in Fig. 5.1(a). Fig. 5.1(d) shows the intensity gradient for the same input image estimated after converting it to a gray level image (Fig. 5.1(c)). We can see that the gradient obtained using Eqn. 5.2 gives better edge information. In this work, we use the Snake energy (Eqn. 3.1) for the segmentation of colored objects, where the external energy component of the Snake is estimated using Eqn. 5.2. One can also explore the use of various other color edge detectors, such as Cumani (1991); Toivanen et al. (2003); Evans and Liu (2006), to estimate the external energy of the Snake.
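Eqn. 5.2 can be sketched in a few lines of NumPy; the function name and the use of central differences via np.gradient are our choices, not the thesis implementation:

```python
import numpy as np

def color_gradient_magnitude(rgb):
    """Eqn. 5.2: per-pixel gradient magnitude, taken as the maximum of the
    gradient magnitudes of the R, G and B bands."""
    mags = []
    for c in range(rgb.shape[-1]):
        gy, gx = np.gradient(rgb[..., c].astype(float))
        mags.append(np.sqrt(gx**2 + gy**2))
    return np.maximum.reduce(mags)
```

The external energy of Eqn. 5.1 then generalizes to `E_ext = -color_gradient_magnitude(img)**2`, so an edge visible in any single band contributes to the Snake's external force.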
5.1.2 GrabCut
GrabCut (Rother et al., 2004) is an interactive tool based on iterative graph-cut (Boykov and Jolly, 2001) for foreground extraction in still images. To segment a foreground object using GrabCut, the user selects an area of interest (AOI) with a rectangle to obtain the desired result. GrabCut extends the graph-cut based segmentation technique introduced by Boykov and Jolly (2001) using color information. In this section, we briefly discuss the GrabCut process; more details can be obtained from (Rother et al., 2004).
Consider image I as an array z = (z1, ..., zn, ..., zN) of pixels, indexed by the
single index n, where zn is in RGB space. Segmentation of the image is expressed
as an array of “opacity” values α = (α1, ..., αn, ..., αN) at each pixel. Generally
0 ≤ αn ≤ 1, but for hard segmentation αn ∈ {0, 1} with 0 for background and 1 for
foreground. For the purpose of segmentation, GrabCut constructs two separate
Gaussian mixture models (GMMs) to express the color distributions for the back-
ground and foreground. Each GMM, one for foreground and one for background,
is taken to be a full-covariance Gaussian mixture with K components. In order to deal with the GMM tractability in an optimization framework, an additional vector k = (k1, ..., kn, ..., kN) is taken, with kn ∈ {1, ..., K}, assigning to each pixel a unique GMM component from the background or the foreground model, according as αn = 0 or 1.
GrabCut defines an energy function E such that its minimum corresponds to a good segmentation, in the sense that it is guided by the observed foreground and background GMMs and that the opacity is “coherent”. This is captured by a “Gibbs” energy of the following form:
E(α,k, θ, z) = U(α,k, θ, z) + V (α, z) (5.3)
The data term U evaluates the fit of the opacity distribution α to the data z. It
takes into account the color GMM models, defined as
U(α, k, θ, z) = ∑n D(αn, kn, θ, zn) (5.4)
where,
D(αn, kn, θ, zn) = − log p(zn | αn, kn, θ) − log π(αn, kn) (5.5)
Here, p(.) is a Gaussian probability distribution, and π(.) are mixture weighting
coefficients. Therefore, the parameters of the model are now
θ = {π(α, k), µ(α, k), Σ(α, k); α = 0, 1; k = 1, ..., K}
where, π, µ and Σ’s represent the weights, means and covariances respectively of
the 2K Gaussian components for the background and the foreground distributions.
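The data term of Eqn. 5.5 can be sketched in a few lines of numpy for a single pixel and its assigned full-covariance component. This is an illustrative implementation, not the thesis code; the function name `data_term` and its argument layout are hypothetical.

```python
import numpy as np

def data_term(z, pi_k, mu_k, sigma_k):
    """D(alpha_n, k_n, theta, z_n) of Eqn. 5.5 for one pixel z and its
    assigned GMM component with weight pi_k, mean mu_k, covariance sigma_k."""
    d = z - mu_k
    inv = np.linalg.inv(sigma_k)
    _, logdet = np.linalg.slogdet(sigma_k)
    # log of the full-covariance Gaussian density
    log_gauss = (-0.5 * (d @ inv @ d)
                 - 0.5 * logdet
                 - 0.5 * len(z) * np.log(2.0 * np.pi))
    return -log_gauss - np.log(pi_k)
```

A pixel far from the component mean, or assigned to a component with a small mixture weight, receives a larger cost, which is exactly what the minimization needs.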
In Eqn. 5.3, the term V is called the smoothness term and is given as follows:
V (α, z) = γ ∑(m,n)∈R (1/dist(m, n)) [αn ≠ αm] exp(−β‖zm − zn‖²) (5.6)
where, [φ] denotes the indicator function taking values {0, 1} for a predicate φ, γ is
a constant, R is the set of neighboring pixels, and dist(.) is the Euclidean distance
(a) (b) (c) (d)
Fig. 5.2: (a) Input image, elliptical object present in the image contains a rectangular hole at the center, (b) foreground initialization by user, (c) active contour segmentation result, and (d) GrabCut segmentation result.
of neighboring pixels. This energy function in Eqn. 5.6 encourages coherence in
the regions of similar color distribution.
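The smoothness term of Eqn. 5.6 can be sketched for a 4-connected grid as follows. This is an illustrative numpy implementation, not the thesis code; following Rother et al. (2004), β is set to 1/(2⟨‖zm − zn‖²⟩), and dist(m, n) = 1 for 4-connected neighbours.

```python
import numpy as np

def smoothness_term(z, alpha, gamma=50.0):
    """Pairwise smoothness energy V of Eqn. 5.6 over horizontal and
    vertical neighbour pairs of an H x W x 3 image z with labels alpha."""
    z = z.astype(float)
    # Squared colour differences for right and down neighbour pairs.
    dr = np.sum((z[:, 1:] - z[:, :-1]) ** 2, axis=-1)
    dd = np.sum((z[1:, :] - z[:-1, :]) ** 2, axis=-1)
    # beta as suggested by Rother et al. (2004): 1 / (2 <||zm - zn||^2>).
    beta = 1.0 / (2.0 * np.mean(np.concatenate([dr.ravel(), dd.ravel()])) + 1e-12)
    # Indicator [alpha_m != alpha_n] restricts the penalty to label
    # discontinuities; dist(m, n) = 1 for 4-connected neighbours.
    vr = (alpha[:, 1:] != alpha[:, :-1]) * np.exp(-beta * dr)
    vd = (alpha[1:, :] != alpha[:-1, :]) * np.exp(-beta * dd)
    return gamma * (float(vr.sum()) + float(vd.sum()))
```

A uniform labeling pays nothing; label boundaries that cross strong colour edges are penalized less than boundaries placed inside uniform regions, which is what makes the cut prefer object edges.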
Once the energy model is defined, segmentation can be estimated as a global
minimum:
α̂ = arg minα E(α, θ) (5.7)
Energy minimization in GrabCut is done by using standard minimum cut
algorithm (Boykov and Jolly, 2001). Minimization follows an iterative procedure
that alternates between estimation and parameter learning.
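As an illustration of the min-cut step (not the solver used in the thesis, which follows Boykov and Jolly), a toy three-pixel s-t graph can be solved with networkx. The capacities below are made-up numbers standing in for the data term U (t-links) and the smoothness term V (n-links); a pixel ending on the source side of the minimum cut is labeled foreground.

```python
import networkx as nx

# Three pixels in a row; (D_bg, D_fg) are illustrative data costs.
unary = {'p0': (9.0, 1.0), 'p1': (5.0, 4.0), 'p2': (1.0, 9.0)}

G = nx.DiGraph()
for p, (d_bg, d_fg) in unary.items():
    G.add_edge('S', p, capacity=d_bg)  # paid if p lands on the background side
    G.add_edge(p, 'T', capacity=d_fg)  # paid if p lands on the foreground side
for a, b in [('p0', 'p1'), ('p1', 'p2')]:
    G.add_edge(a, b, capacity=2.0)     # n-links in both directions
    G.add_edge(b, a, capacity=2.0)

cut_value, (src_side, _) = nx.minimum_cut(G, 'S', 'T')
labels = {p: int(p in src_side) for p in unary}  # 1 = foreground
```

Here p0 and p1 prefer foreground and p2 prefers background strongly enough to pay the n-link penalty, so the cut separates p2 from the rest.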
5.2 Comparison of Active Contour and GrabCut
Methods
Active contour relies on the presence of intensity gradients (boundary discontinuities) in the image, so it is a good tool for estimating object boundaries. An evolving contour searches for places of high intensity gradient (edges) in the image and latches onto them. But since the contour cannot penetrate inside the object
boundary, it is not able to remove the undesired parts, say holes, present inside
the object boundary. If an object has holes in it, active contour will detect the
holes as part of the object. Fig. 5.2(c) shows one such segmentation example us-
ing active contour for a synthetic image shown in Fig. 5.2(a). Input image (Fig.
5.2(a)) contains a foreground object which has a rectangular hole at the center,
through which the gray color background is visible. The segmentation result for this image using active contour (shown in Fig. 5.2(c)) incorrectly includes the hole as a part of the detected object. Since the Snake could not go
inside the object boundary, it has converted the outer background into white but
retained the hole as gray. Similar erroneous segmentation result of active contour
for a real image (shown in Fig. 5.3(a)) is shown in Fig. 5.3(b). One can see that
segmentation output contains a part of the background region (e.g. black back-
ground visible through teapot’s handle) along with the foreground object. Fig.
5.4(b) shows another erroneous active contour segmentation result for the image
shown in Fig. 5.4(a). Segmentation output contains some pixels in the interior
part of the foreground object (wheel) from the background texture region. One
more incorrect segmentation result of active contour for a real image (shown in
Fig. 5.5(a)) is shown in Fig. 5.5(b). One can see that segmentation output con-
tains a part of the background region (e.g. grass patch between legs) along with
the foreground object.
On the other hand, GrabCut considers global color distribution (with local
pixels similarities) of the background and foreground pixels for segmentation. So
it has the ability to remove interior pixels which are not part of the object. To
segment the object using GrabCut, user draws a rectangle enclosing the foreground
(a) (b) (c)
Fig. 5.3: (a) Teapot image; segmentation results of (b) active contour and (c) GrabCut.
(a) (b) (c)
Fig. 5.4: (a) Image containing wheel; segmentation results of (b) active contour and (c) GrabCut.
(a) (b) (c)
Fig. 5.5: (a) Soldier image; segmentation results of (b) active contour and (c) GrabCut.
object. Pixels outside the rectangle are considered as background pixels and pixels
inside the rectangle are considered as unknown. GrabCut estimates the color
distribution for the background and the unknown region using separate GMMs
(see section 5.1.2). Then, it iteratively removes the pixels from the unknown
region which belong to background.
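This initialization step can be sketched as follows; the label codes and the function name are illustrative, not GrabCut's actual API.

```python
import numpy as np

BGD, UNKNOWN = 0, 1  # illustrative label codes

def init_trimap(shape, rect):
    """Known background outside the user rectangle, unknown inside it.
    rect = (row0, col0, row1, col1), end-exclusive."""
    trimap = np.full(shape, BGD, dtype=np.uint8)
    r0, c0, r1, c1 = rect
    trimap[r0:r1, c0:c1] = UNKNOWN
    return trimap
```

The iterative estimation then only ever reclassifies pixels inside the rectangle; the exterior remains hard background throughout.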
The major problem with GrabCut is as follows. If some parts of the foreground object have a color distribution similar to the image background, then those parts of the foreground object will also be eliminated by GrabCut. GrabCut is thus not able to distinguish between the desired and unnecessary pixels while eliminating pixels from the unknown region. Fig. 5.2(d) shows
one such segmentation result of GrabCut for the image shown in Fig. 5.2(a),
where the objective is to segment the object with a hole present in the image.
Although the hole inside the foreground object has been removed, segmentation
result does not produce the upper part of the object (shown in green color in Fig.
5.2(a)) near the boundary. This happens because in the input image (Fig. 5.2(a)),
a few pixels with green color are present as a part of the background region. Fig.
5.3(c) presents a GrabCut segmentation result for a real world image shown in
Fig. 5.3(a). The objective in this case is to crop the teapot from the input image.
GrabCut segmentation result for this input image does not produce the teapot’s
left side portion, due to its similarity with the background. In another real world
example in Fig. 5.4(a), where the user targets to crop the wheel present in the
image, GrabCut segmentation result (Fig. 5.4(c)) does not produce the wheel’s
grayish green rubber part. This is due to the presence of some other objects
with similar color in the background. Fig. 5.5(c) presents one more GrabCut
segmentation result for a real world image shown in Fig. 5.5(a). The objective
in this case is to crop the soldier from the input image. GrabCut segmentation
result for this input image does not produce the soldier’s hat and the legs, due to
their similarities with background texture.
In all these cases (Figs. 5.3, 5.4 and 5.5), the hole(s) within the foreground
object have been detected and removed by the GrabCut algorithm.
In the GrabCut (Rother et al., 2004) algorithm, missing parts of the foreground object are often recovered by user interaction: the user has to mark the missing object parts as compulsory foreground. In this chapter, we present an automatic foreground object segmentation technique based on the integration of active contour and GrabCut, which can produce accurate segmentation in situations where either or both of these techniques fail. We call our proposed technique "SnakeCut"; it is based on a probabilistic framework for integration, which we present in the next section.
5.3 SnakeCut: Integration of Active Contour and
GrabCut
Active contour works on the principle of intensity gradient, where the user ini-
tializes a contour around or inside the object for it to detect the boundary of
the object easily. GrabCut, on the other hand, works on the basis of the pixel’s
color distribution and considers global cues for segmentation. Hence it can eas-
ily remove the unwanted part (parts from the background) present inside the
object boundary. These two segmentation techniques use complementary infor-
mation (edge and region based) for segmentation. In SnakeCut, we combine these
complementary techniques and present an integrated method for superior object
segmentation. Fig. 5.6 presents the overall flow chart of our proposed segmenta-
tion technique. In SnakeCut, the input image is segmented using active contour and GrabCut separately. These two segmentation results are provided as inputs
to the probabilistic framework of SnakeCut, which integrates the two segmenta-
tion results based on a probabilistic criterion and produces the final segmentation
result.
Main steps of the SnakeCut algorithm are provided in Algorithm 2. The
probabilistic framework used to integrate the two outputs is as follows. Inside the
object boundary C0 (detected by the active contour), every pixel zi is assigned
two probabilities, Pc(zi) and Ps(zi), where
• Pc(zi): provides information about the pixel’s nearness to the boundary, and
• Ps(zi): indicates how similar the pixel is to the background
A large value of Pc(zi) indicates that pixel zi is far from the boundary and a
large value of Ps(zi) specifies that the pixel is more similar to the background.
To take the decision about a pixel belonging to foreground or background, we
evaluate a decision function p as follows:
p(zi) = ρPc(zi) + (1− ρ)Ps(zi) (5.8)
where, ρ is the weight which controls the relative importance of the two tech-
niques, which is learnt empirically. Probability Pc is computed from the distance
transform (DT) (Breu et al., 1995) of the object boundary C0. DT has been used
in many computer vision applications (Paglieroni et al., 1994; Tsang et al., 1994;
Sanjay et al., 1998; Lee et al., 2007). For image I, its DT image Id is given by the
following equation:
Id(zi) = { 0, if zi lies on contour C0;  d, otherwise } (5.9)
where d is the Euclidean distance of pixel zi to the nearest contour point. Fig.
5.7(b) shows an example of DT image for the contour image shown in Fig. 5.7(a).
Distance transform values are first normalized in the range [0, 1], before they are
used for the estimation of Pc.
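A sketch of this computation, assuming scipy is available: `distance_transform_edt` measures the distance to the nearest zero element, so the contour mask is inverted before the transform.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def normalized_dt(contour_mask):
    """Euclidean distance transform of Eqn. 5.9, normalized to [0, 1].
    contour_mask is a boolean image that is True on the contour C0."""
    # distance_transform_edt measures distance to the nearest zero
    # element, so the contour must be the zero set: invert the mask.
    dt = distance_transform_edt(~contour_mask)
    return dt / (dt.max() + 1e-12)
```

Pixels on C0 map to 0 and the farthest interior pixels map to 1, which is the range the fuzzy distribution below expects.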
Let In be the normalized distance transform image of Id and dn be the DT
value of a pixel zi in In (i.e. dn = In(zi)). Probability Pc of zi is estimated using
the following fuzzy distribution function:
Pc(zi) = { 0,                          0 ≤ dn < a;
           2((dn − a)/(b − a))²,        a ≤ dn < (a + b)/2;
           1 − 2((b − dn)/(b − a))²,    (a + b)/2 ≤ dn < b;
           1,                          b ≤ dn ≤ 1. }     (5.10)
where a and b are constants with a < b; when a ≥ b, the function degenerates to a step function with a transition from 0 to 1 at (a + b)/2. The probability distribution function (Eqn. 5.10) has been chosen in such a way that the probability value Pc is small
near the contour C0 and large for points farther away. In this fuzzy function, a
Fig. 5.6: Flow chart of the proposed SnakeCut technique.
(a) (b) (c)
Fig. 5.7: Segmentation of image shown in Fig. 5.2(a) using SnakeCut: (a) object boundary produced by active contour, (b) distance transform for the boundary contour shown in (a); (c) SnakeCut segmentation result.
and b dictate the non-linear behavior of the function. The parameters a and b
control the extents (distance from the boundary) to which the output response
is considered from Snake and then onwards from that of GrabCut respectively.
The extent to which points are considered to be near the contour, can be suitably
controlled by choosing appropriate values of a and b. The value of Pc is zero
(0) when the distance of the pixel from the boundary is in the range [0..a], and
one (1) in the range [b..1] (all values normalized). For values between [a..b], we
empirically found the smooth, non-linear S-shaped function to provide the best
result. Fig. 5.8 shows the effect of the interval [a, b] on the distribution function.
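Eqn. 5.10 can be sketched directly in numpy; the default values of a and b below are taken from the parameter ranges reported later in this section and are otherwise illustrative.

```python
import numpy as np

def p_contour(dn, a=0.12, b=0.18):
    """S-shaped fuzzy membership of Eqn. 5.10: Pc is 0 for normalized
    distances dn < a, 1 for dn >= b, and rises smoothly in between."""
    dn = np.asarray(dn, dtype=float)
    mid = (a + b) / 2.0
    pc = np.empty_like(dn)
    pc[dn < a] = 0.0
    lo = (dn >= a) & (dn < mid)
    pc[lo] = 2.0 * ((dn[lo] - a) / (b - a)) ** 2
    hi = (dn >= mid) & (dn < b)
    pc[hi] = 1.0 - 2.0 * ((b - dn[hi]) / (b - a)) ** 2
    pc[dn >= b] = 1.0
    return pc
```

The two quadratic pieces meet at (a + b)/2 with value 0.5, so the membership is continuous and monotone, giving the smooth Snake-to-GrabCut handover described above.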
Probability value Ps is obtained from the GrabCut segmentation process.
GrabCut assigns two likelihood values to each pixel in the image using the GMMs
Fig. 5.8: Effect of interval [a, b] on the non-linearity of the fuzzy distribution function (Eqn. 5.10). When a < b, the transition from 0 (at a) to 1 (at b) is smooth. When a ≥ b, we have a step function with the transition at (a + b)/2.
constructed for the foreground and background, which represent how likely a pixel is to belong to the foreground and background respectively. In our approach, after the
segmentation of the object using GrabCut, final background GMMs are used to
estimate Ps. For each pixel zi inside C0, D(zi) is computed using Eqn. 5.5 con-
sidering background GMMs. Normalized values of D between 0 and 1, for all the
pixels inside C0, define the probability Ps.
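A sketch of this estimation on synthetic data, using scikit-learn's GaussianMixture in place of the thesis' own GMM code. The thesis leaves the direction of the normalization implicit; here the normalized −log-likelihood is inverted so that, as stated earlier, a large Ps means the pixel is background-like — that inversion is our assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-ins: dark "background" colours, bright "object" colours.
bg_pixels = rng.normal(60.0, 10.0, size=(500, 3))
obj_pixels = rng.normal(200.0, 10.0, size=(500, 3))

# Full-covariance background GMM, as in the GrabCut formulation.
bg_gmm = GaussianMixture(n_components=2, covariance_type='full',
                         random_state=0).fit(bg_pixels)

def ps_from_background_gmm(pixels):
    """Ps from the background GMM: D(z) = -log p(z | background), then
    normalize to [0, 1] and invert so that a large Ps means the pixel
    is similar to the background (the inversion is our assumption)."""
    d = -bg_gmm.score_samples(pixels)            # per-pixel -log-likelihood
    d = (d - d.min()) / (d.max() - d.min() + 1e-12)
    return 1.0 - d

ps = ps_from_background_gmm(np.vstack([bg_pixels, obj_pixels]))
```

On this toy data, the background-like pixels receive Ps close to 1 and the object-like pixels Ps close to 0, which is the behaviour the decision function of Eqn. 5.8 relies on.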
Using the decision function p(zi) estimated in Eqn. 5.8 and an empirically
estimated threshold T , GrabCut and active contour results are integrated using
the SnakeCut algorithm (Algorithm 2). In the integration process of the SnakeCut
algorithm, segmentation output for a pixel is taken from the GrabCut result if
p > T , otherwise it is taken from the active contour result. In our experiments,
we have used ρ = 0.5. The values of T, a and b range from 0.60−0.80, 0.10−0.15 and 0.15−0.20 respectively to give the best results.
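The decision rule of Eqn. 5.8, with the threshold T and the parameter values above, can be sketched as a vectorized numpy function. The array names and calling convention are illustrative: binary label images from Snake and GrabCut, the two probability maps, and a boolean mask of the pixels inside C0.

```python
import numpy as np

def snakecut(i_ac, i_gc, pc, ps, inside, rho=0.5, T=0.6):
    """Inside the Snake contour C0, take the GrabCut label where
    p = rho*Pc + (1-rho)*Ps > T (deep interior or background-like),
    otherwise keep the Snake label; pixels outside C0 stay 0."""
    p = rho * pc + (1.0 - rho) * ps
    i_sc = np.zeros_like(i_ac)
    use_gc = inside & (p > T)
    use_ac = inside & ~(p > T)
    i_sc[use_gc] = i_gc[use_gc]
    i_sc[use_ac] = i_ac[use_ac]
    return i_sc
```

Near the contour p is dominated by the small Pc, so the Snake boundary wins; deep inside the object the GrabCut label wins wherever the pixel also looks background-like.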
We demonstrate the integrated approach to foreground segmentation with the help of a simulated example in Fig. 5.7, using the image
Algorithm 2 Steps of SnakeCut

• Input: image I; Output: segmentation Isc.

• All pixels of Isc are initialized to zero.

A. Initial Segmentation

1. Segment the desired object in I using active contour. Let C0 be the object boundary identified by the active contour and Iac the segmentation output of active contour.

2. Segment the desired object in I using GrabCut. Let Igc be the segmentation output.

B. Integration using SnakeCut

1. Find the set of pixels Z in image I which lie inside contour C0.

2. For each pixel zi ∈ Z,

(a) Compute p(zi) using Eqn. 5.8.

(b) if p(zi) ≤ T then
        Isc(zi) = Iac(zi)
    else
        Isc(zi) = Igc(zi)
    end if
shown in Fig. 5.2(a). Intermediate segmentation outputs produced by active con-
tour and GrabCut for this image have been shown in Fig. 5.2(c) & Fig. 5.2(d).
These outputs are integrated by the SnakeCut algorithm. Fig. 5.7(a) shows the
object boundary obtained by active contour for the object shown in Fig. 5.2(a).
Active contour boundary is used to estimate the distance transform, shown in
Fig. 5.7(b), using Eqn. 5.9. Probability values Pc and Ps are estimated for all
the pixels inside the object boundary obtained by active contour as described
earlier. SnakeCut algorithm (Algorithm 2) is then used to integrate the results of
active contour and GrabCut. Fig. 5.7(c) shows the final SnakeCut output after
integration of the intermediate outputs (Fig. 5.2(c) & Fig. 5.2(d)) obtained using
active contour and GrabCut algorithms. Our proposed method is able to retain a
part of the object which appears similar to background color and simultaneously
eliminate the hole within the object.
To demonstrate the influence of the probability values Pc and Ps on the decision making in the SnakeCut algorithm, we use the teapot image (Fig.
5.3(a)). The values of Pc, Ps and p for a few points marked on the teapot image
(Fig. 5.9(a)) are shown in Fig. 5.9(b). The SnakeCut algorithm is then used to
obtain the final segmentation decision. Last column of the table in Fig. 5.9(b)
shows the final decision taken by SnakeCut based on the estimated value of p.
5.4 SnakeCut Segmentation Results
To extract a foreground object using SnakeCut, user needs to draw a rectangle (or
polygon) surrounding the object as the Region of Interest (ROI). This rectangle
(a)

(b)

Point   Pc       Ps       p        Output taken from
A       0.1684   0.6551   0.4118   Snake
B       1.0000   0.5000   0.7500   GrabCut
C       1.0000   0.7031   0.8516   GrabCut
D       0.0000   0.5000   0.2500   Snake
E       1.0000   0.5000   0.7500   GrabCut
F       0.0000   0.6769   0.3384   Snake
G       0.0000   0.5000   0.2500   Snake
Fig. 5.9: Demonstration of the impact of Pc and Ps values on the decision making in Algorithm 2: (a) teapot image with a few points marked on it, (b) values of Pc, Ps, p, and the decision obtained using Algorithm 2, for the points labeled in (a). Values used for ρ and T are 0.5 and 0.6 respectively.
(a) (b) (c) (d)
Fig. 5.10: Demonstration of a SnakeCut result on a synthetic image, where Snake fails and GrabCut works: (a) input image with foreground initialized by the user (object contains a rectangular hole at the center), (b) Snake segmentation result (incorrect, output contains the hole as a part of the object), (c) GrabCut segmentation result (correct, hole is removed), and (d) SnakeCut segmentation result (correct, hole is removed).
(a) (b) (c) (d)
Fig. 5.11: Demonstration of a SnakeCut result on a synthetic image, where Snake works and GrabCut fails: (a) input image with foreground initialized by the user, (b) Snake segmentation result (correct), (c) GrabCut segmentation result (incorrect, upper green part of the object is removed), and (d) correct segmentation result produced by SnakeCut.
(a) (b) (c) (d)
Fig. 5.12: Segmentation of real pot image: (a) input real image, (b) active contour segmentation result (incorrect), (c) GrabCut segmentation result (correct), and (d) SnakeCut segmentation result (correct, background pixels visible through the handles of the pot are removed).
is used in the segmentation process of active contour as well as GrabCut. Active contour considers the rectangle as an initial contour and deforms it to converge on
the object boundary. GrabCut uses the rectangle to define the background and
unknown regions. Pixels outside the rectangle are taken as known background
and those inside as unknown. GrabCut algorithm (using GMM based modeling
and minimal cost graph-cut) iterates and converges to a minimum energy level
producing the final segmentation output. Segmentation outputs of active contour
and GrabCut are integrated using the SnakeCut algorithm to obtain the final seg-
mentation result. First, we present a few results of segmentation using SnakeCut
on synthetic and natural images, where either Snake or GrabCut fails to work.
This will be followed by a few examples where both Snake and GrabCut tech-
niques fail to produce correct segmentation, whereas integration of the outputs of
both these techniques using SnakeCut algorithm gives the correct segmentation
results.
Fig. 5.10 shows a result on a synthetic image where active contour fails but
GrabCut works, and their integration (i.e. SnakeCut) also produces the correct
segmentation. Fig. 5.10(a) shows an image where the object to be segmented
has a rectangular hole (at the center) in it through which the gray colored back-
ground is visible. Segmentation result produced by active contour (Fig. 5.10(b))
shows the hole as a part of the segmented object which is incorrect. In this case,
GrabCut produces the correct segmentation (Fig. 5.10(c)) of the object. Fig.
5.10(d) also shows the correct segmentation result, produced by SnakeCut for
this image. Fig. 5.11 shows a result on another synthetic image where active
contour works but GrabCut fails, and their integration (i.e. SnakeCut) produces
the correct segmentation. Fig. 5.11(a) shows an image where the object (with no
hole) to be segmented has a part (upper green region) similar to the background
(green flowers). Active contour, in this example, produces correct segmentation
(Fig. 5.11(b)) while GrabCut fails (Fig. 5.11(c)). Fig. 5.11(d) shows the correct
segmentation result produced by SnakeCut for this image. Fig. 5.12 presents the
result of SnakeCut segmentation for a real world image. In this example, active
contour fails but GrabCut performs correct segmentation. We see in Fig. 5.12(b)
that the active contour segmentation result contains the part of the background
(visible through the handles) which is incorrect. SnakeCut algorithm produces
correct segmentation result which is shown in Fig. 5.12(d).
In the examples presented so far, we have seen that only one among the two (Snake and GrabCut) techniques fails to perform correct segmentation. In these
cases, either the Snake is unable to remove holes from the foreground object or
GrabCut is unable to retain the parts of the object which are similar to the
background. SnakeCut performs well in all such situations. We now present a few
(a) (b) (c)
Fig. 5.13: SnakeCut segmentation results of (a) teapot (for the image in Fig. 5.3(a)), (b) wheel (for the image in Fig. 5.4(a)) and (c) soldier (for the image in Fig. 5.5(a)).
(a) (b) (c) (d)
Fig. 5.14: Segmentation of cup image: (a) input real image, (b) segmentation result produced by Snake (incorrect, as background pixels visible through the cup's handle are detected as a part of the object), (c) GrabCut segmentation result (incorrect, as the spots present on the cup's handle are removed), and (d) correct segmentation result produced by SnakeCut.
(a) (b) (c) (d)
Fig. 5.15: Segmentation of webcam bracket image: (a) input real image where the objective is to segment the lower bracket present in the image, (b) Snake segmentation result (incorrect, as background pixels visible through the holes present in the object are detected as part of the foreground object), (c) GrabCut segmentation result (incorrect, as large portions of the bracket are removed in the result), and (d) correct segmentation result produced by SnakeCut.
results on synthetic and real images, where SnakeCut performs well even when
both the Snake and GrabCut techniques fail to perform correct segmentation.
Fig. 5.13 presents three such SnakeCut results on real world images. Fig. 5.13(a)
shows the segmentation result produced by SnakeCut for the teapot image shown
in Fig. 5.3(a). This result is obtained without user interaction, by integrating
the active contour and GrabCut outputs shown in Figs. 5.3(b) and 5.3(c). Fig.
5.13(b) shows the segmentation result produced by SnakeCut, for the wheel image
shown in Fig. 5.4(a). Intermediate active contour and GrabCut segmentation
results for the wheel are shown in Figs. 5.4(b) and 5.4(c). Fig. 5.13(c) shows
the segmentation result produced by SnakeCut, for the soldier image shown in
Fig. 5.5(a). Intermediate active contour and GrabCut segmentation results for
the soldier image are shown in Figs. 5.5(b) and 5.5(c).
Two more SnakeCut segmentation results are presented in Figs. 5.14 and
5.15 for a cup and a part of webcam bracket images, where both Snake and
GrabCut techniques fail to perform correct segmentation. The objective in the
cup example (Fig. 5.14(a)) is to segment the cup in the image. The cup's handle has some blue color spots similar to the background color. Snake and GrabCut
results for this image are shown in Fig. 5.14(b) and Fig. 5.14(c) respectively.
One can observe that both these results are erroneous. Result obtained using
Snake contains some part of the background which is visible through the handle.
GrabCut has removed the spots in the handle since their color is similar to the
background. Correct segmentation result produced by SnakeCut is shown in Fig.
5.14(d). The objective in the webcam bracket example (Fig. 5.15(a)) is to segment
the lower bracket (inside the red contour initialized by the user) present in the
image. Snake and GrabCut results for this image are shown in Fig. 5.15(b) and
Fig. 5.15(c) respectively. We can see that both these results are erroneous. The
result obtained using Snake contains some part of the background which is visible
through the holes. GrabCut has removed large portions of the bracket. This is
due to the similarity of the distribution of the metallic color of a part of another
webcam bracket present in the background (it should be noted that the color
distribution of the two webcam brackets is not exactly the same due to different
lighting effects). Correct segmentation result produced by SnakeCut is shown in
Fig. 5.15(d). We also observed a similar performance when the initialization was
done around the bracket on the top, for the image in Fig. 5.15(a).
In Fig. 5.16, we compare the automatic SnakeCut segmentation results of
teapot (Fig. 5.3(a)), wheel (Fig. 5.4(a)), soldier (Fig. 5.5(a)) and webcam bracket
(a) (b)
(c) (d)
(e) (f)
(g) (h)
Fig. 5.16: Comparison of the results: (a) SnakeCut result for teapot, (b) GrabCut result for teapot with user interaction, (c) SnakeCut result for wheel, (d) GrabCut result for wheel with user interaction, (e) SnakeCut result for soldier, (f) GrabCut result for soldier with user interaction (reproduced from (Rother et al., 2004)), (g) SnakeCut result for webcam bracket, (h) GrabCut result for webcam bracket with user interaction.
(Fig. 5.15(a)) images with the interactive GrabCut outputs. A large amount of user interaction was necessary to obtain the correct GrabCut results shown in Figs. 5.16(b), 5.16(d), 5.16(f) &
5.16(h). In case of teapot (Fig. 5.16(b)), user marked the teapot’s left side
portion as parts of the compulsory foreground. In case of wheel (Fig. 5.16(d))
user marked the outer grayish green region of the wheel as a compulsory part
of the foreground object and in case of soldier (Fig. 5.16(f)), user marked the
soldier’s hat and legs as a part of the compulsory foreground. In case of webcam
bracket (Fig. 5.16(h)) user marked missing regions as compulsory parts of the
foreground object. Segmentation results using SnakeCut were obtained without
any user interaction (except initialization of the ROI and, in a few cases, some hard
constraint points at the object boundary) and are mostly better than the results
obtained by GrabCut with user’s corrective editing. One can easily observe the
smooth edges obtained at the border of teapot in Fig. 5.16(a), unlike that in Fig.
5.16(b). The same is true for Fig. 5.16(e) (w.r.t Fig. 5.16(f)) and Fig. 5.16(g)
(w.r.t Fig. 5.16(h)), which may be noticed after careful observation.
The presented approach combines the complementary strengths of active con-
tour and GrabCut processes, to produce correct segmentation in cases where one
or both of these techniques fail. However, the proposed technique (SnakeCut) was
observed to have the following limitations:
1. Since SnakeCut relies on active contour for regions near the object boundary, it fails when holes of the object (through which the background is visible) lie very close to the boundary.
2. Since the Snake cannot penetrate inside the object boundary and detect holes, the proposed SnakeCut method has to rely on the response of the GrabCut algorithm in such cases. This becomes problematic when GrabCut detects an interior part that actually belongs to the object as a hole, due to its high degree of similarity with the background. Since the decision logic of SnakeCut relies on the GrabCut response for interior parts of the object, it may fail in cases where GrabCut does not detect those parts of the object as foreground.

(a) (b) (c) (d)

Fig. 5.17: Example where SnakeCut fails: (a) input image with foreground initialized by user, (b) active contour segmentation result (correct), (c) GrabCut segmentation result (incorrect), and (d) SnakeCut segmentation result (incorrect).
Fig. 5.17 presents one such situation (using a simulated image) where Snake-
Cut fails to perform correct segmentation. Fig. 5.17(a) shows a synthetic image
where active contour works correctly (see Fig. 5.17(b)) but GrabCut fails (see Fig.
5.17(c)). GrabCut removes the central rectangular green part of the object in the segmented output, which should actually be perceived as a part of the object.
We see in this case that SnakeCut also does not perform correct segmentation and
removes the object’s central rectangular green part from the segmentation result.
SnakeCut thus fails when certain parts of the foreground object are far away (and
interior) from its boundary and very similar to the background.
The heuristic values of some of the parameters used in our algorithm, which
were obtained empirically, were not so critical (within a certain range) for accurate
foreground object segmentation. The overall computational times required by
SnakeCut on a P-IV, 3 GHz machine with 2 GB RAM, are given in Table 5.1 for
some of the images.
Table 5.1: Computational times for foreground object segmentation, required by Snake, GrabCut and SnakeCut for various images.

                                             Time required (in seconds) for
Image Name                     Size (pixels)  Snake (A)  GrabCut (B)  Integration* (C)  SnakeCut (A+B+C)
Synthetic image (Fig. 5.2(a))    250 × 250       4           5             2                 11
Teapot image (Fig. 5.3(a))       519 × 375      10           9             4                 23
Wheel image (Fig. 5.4(a))        640 × 480       6          14             5                 25
Soldier image (Fig. 5.5(a))      321 × 481      13          12             7                 32
Synthetic image (Fig. 5.10(a))   250 × 250       4           5             2                 11
Pot image (Fig. 5.12(a))         296 × 478       6           7             4                 17
Cup image (Fig. 5.14(a))         285 × 274       5           7             3                 15
Webcam bracket (Fig. 5.15(a))    321 × 481       7           8             3                 18

*Time required to integrate the Snake and GrabCut outputs using the probabilistic integrator.
5.5 Summary and Discussion
In this chapter, we have presented a novel object segmentation technique based on
the integration of two complementary object segmentation techniques, namely ac-
tive contour and GrabCut. Active contour cannot remove the holes in the interior
part of the object. GrabCut produces poor segmentation results in cases when the
color distribution of some part of the foreground object is similar to the background.
The proposed segmentation technique, termed SnakeCut, based on a probabilistic
framework provides an automatic way of object segmentation, where the user has
to only specify the rectangular boundary (ROI) around the desired foreground ob-
ject. Our proposed method is able to retain parts of the foreground object which
appear similar to background color and simultaneously eliminate holes present
within the object. SnakeCut is more suitable for object segmentation when the
object’s boundary localization by the GrabCut is not good and object contains
holes in it. In this situation, Snake helps in the localization of the boundary and
GrabCut removes the holes or unwanted background parts from the object’s in-
terior. Hence SnakeCut produces correct segmentation result by combining the
advantages and power of these two complementary techniques. We have validated
our technique with a few synthetic and natural images. Results obtained using
SnakeCut are quite encouraging and promising.
CHAPTER 6
Conclusion
In this thesis, we have presented novel methods for foreground object segmentation
in textured and non-textured images. In the first part of the work, we developed
techniques for the segmentation of single or multiple object(s) from a given image,
in the presence of foreground and background textures. In the second part of our work,
we developed a technique for efficient segmentation of an object which contains holes and in which the color distribution of a part of the object is similar to the background. We conducted experiments on both synthetic and natural images to
validate our techniques. Following are the key contributions of the work presented
in this thesis.
6.1 Contribution
The key contributions of the work presented in the thesis are as follows:
• It proposes a scalogram based texture feature extraction technique for snakes.
• A new external force for snakes has been introduced, which we call "texture force". Texture force is modeled using the scalogram based texture features. Snake, in the presence of texture force, is able to segment textured objects. In most of the cases, our proposed method provides better performance compared to other existing techniques.
• We describe a novel multidimensional scalogram based texture feature extraction technique and use it to develop an efficient geodesic active contour based segmentation technique for multiple textured objects. The proposed segmentation technique is based on the generalization of geodesic active contour from a one dimensional intensity based feature space to a multidimensional texture feature space. Experimental results show the effectiveness of the proposed technique. Results are also compared with existing techniques in the literature.
• We propose a novel, efficient and semi-interactive method for foreground object segmentation in color images using the integration of two popular foreground object segmentation techniques, namely parametric active contours (Snakes) and GrabCut. The proposed technique, termed "SnakeCut", segments a foreground object with holes. It is based on a probabilistic framework and provides an automatic way of segmenting an object containing holes. Our results are quite satisfactory and comparable with those obtained using GrabCut with the user's post-corrective editing.
6.2 Future Scope of Work
The object segmentation methods presented in this thesis open up several directions
for further investigation. In the following, we list a few possible extensions of
our proposed methods:
• The texture features developed in our work can be used in many segmentation and clustering algorithms.
• The textured object segmentation methods presented in this thesis rely on boundary-based information. As a possible extension, one can integrate them with region-based information to improve performance.
• The SnakeCut technique handles the segmentation of a single object. As an extension of this work, one can use the geodesic active contour (which can intrinsically segment multiple objects) to make the technique suitable for the segmentation of multiple objects.
• The SnakeCut algorithm can be extended to textured objects by incorporating texture features of the image.
• The segmentation methods presented in this work can be extended to object segmentation and tracking in videos.
• Auto-initialization can further reduce the amount of user interaction required. This would make the process more automatic for particular applications, such as focus of attention (as in human vision) for intelligent robotic vision systems.
• The performance of the proposed methods can be studied on noisy and blurred images.
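The kind of probabilistic integration SnakeCut performs, and that several of the extensions above would build on, can be illustrated with a toy fusion rule. This is a deliberately simplified sketch, not the thesis's exact formulation: pixels near the snake contour trust the snake's label (which has a reliable outer boundary), while pixels deep inside defer to GrabCut's label, which lets interior holes survive.

```python
import numpy as np

def erode(m):
    """4-neighbour binary erosion: keep pixels whose N/S/E/W neighbours are all set."""
    p = np.pad(m, 1)
    return m & p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]

def fuse(snake_mask, grabcut_mask, tau):
    """Combine a snake result (clean outer boundary, but no holes) with a
    GrabCut result (recovers holes, but may drop object parts that resemble
    the background).  Pixels within distance `tau` of the snake contour
    trust the snake; deeper pixels defer to GrabCut, so interior holes
    survive.  `tau` is a hand-picked confidence radius (an assumption of
    this sketch, not a thesis parameter)."""
    boundary = np.argwhere(snake_mask & ~erode(snake_mask))
    out = snake_mask.copy()
    for y, x in np.argwhere(snake_mask):
        # Distance from this object pixel to the nearest snake-contour pixel.
        d = np.hypot(boundary[:, 0] - y, boundary[:, 1] - x).min()
        if d > tau:                      # deep interior: defer to GrabCut
            out[y, x] = grabcut_mask[y, x]
    return out
```

On a toy mask pair, the fused result keeps the snake's outer boundary while inheriting GrabCut's interior holes; a real implementation would use a proper distance transform instead of the brute-force loop.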
LIST OF PAPERS BASED ON THESIS
1. Surya Prakash and Sukhendu Das, “External Force Modeling of Snake using DWT for Texture Object Segmentation”, In Proceedings of International Conference on Advances in Pattern Recognition, ICAPR '07, January 2-4, 2007, World Scientific, Singapore, pp. 215-219.
2. Surya Prakash, R. Abhilash and Sukhendu Das, “SnakeCut: Integrating Active Contour and GrabCut for Superior Object Segmentation”, Electronic Letters on Computer Vision and Image Analysis (ELCVIA), Vol. 6, No. 3, pp. 13-29, December 2007.
3. Surya Prakash and Sukhendu Das, “Segmenting Multiple Textured Objects using Geodesic Active Contour and DWT”, Proceedings of International Conference on Pattern Recognition and Machine Intelligence (PReMI '07), LNCS 4815, pp. 111-118, December 2007.
4. Surya Prakash, “Multiple Textured Objects Segmentation using DWT based Texture Features in Geodesic Active Contour”, Proceedings of International Conference on Computational Intelligence and Multimedia Applications, ICCIMA '07, Vol. 2, IEEE Computer Society, pp. 532-536, December 2007.
Curriculum Vitae
1. Name: Surya Prakash
2. Date of Birth: June 25, 1980
3. Educational Qualification:
Bachelor of Technology (B.Tech.)
Year: 2001
Institute: Institute of Engineering and Technology, CSJM University, Kanpur
Specialization: Computer Science & Engineering
Master of Science (M.S.)
Year: 2008
Institute: Indian Institute of Technology Madras, Chennai - 600 036
Specialization: Computer Science & Engineering
Registration Date: January 3, 2005
General Test Committee
Chairperson:
Dr. Kamala Krithivasan
Professor,
Department of Computer Sc. & Engineering,
IIT Madras, Chennai - 600036.
Guide:
Dr. Sukhendu Das
Associate Professor,
Department of Computer Sc. & Engineering,
IIT Madras, Chennai - 600036.
Members:
Dr. Hema A Murthy
Professor,
Department of Computer Sc. & Engineering,
IIT Madras, Chennai - 600036.
Dr. V. Srinivasa Chakravarthy
Associate Professor,
Department of Biotechnology,
IIT Madras, Chennai - 600036.