Active Contour Based Foreground Object
Segmentation
A THESIS
submitted by
SURYA PRAKASH
for the award of the degree
of
MASTER OF SCIENCE
(by Research)
DEPARTMENT OF COMPUTER SCIENCE &
ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY MADRAS
June 2008
Dedicated to my
beloved parents
&
respected teachers
THESIS CERTIFICATE
This is to certify that the thesis titled “Active Contour Based Foreground
Object Segmentation”, submitted by Surya Prakash, to the Indian Institute
of Technology, Madras, for the award of the degree of Master of Science, is a
bona fide record of the research work done by him under my supervision. The
contents of this thesis, in full or in parts, have not been submitted to any other
Institute or University for the award of any degree or diploma.
Dr. Sukhendu Das
Research Guide
Associate Professor
Department of Computer Science and Engineering,
IIT Madras, CHENNAI-600036.
Date: June 17, 2008
ACKNOWLEDGEMENTS
First and foremost, I would like to express my deep and sincere gratitude to my
research supervisor Dr. Sukhendu Das whose patient guidance, constant encour-
agement and excellent advice throughout the course of my MS program has made
this thesis possible. His expertise in the area and enthusiasm in the research have
been of great value for me. I am also thankful to him for placing the laboratory
facilities at my disposal.
I also take this opportunity to express my sincere thanks to my General Test
Committee members Prof. P. Sreenivasa Kumar, Prof. Hema A. Murthy and Dr.
V. Srinivasa Chakravarthy for their interest, encouragement, valuable suggestions
and thoughtful reviews.
I am also grateful to Prof. S. Raman (former HOD) and Prof. Timothy A.
Gonsalves (current HOD) for providing the best possible facilities to carry out
the research work. I am also grateful to Dr. C. Chandrashekhar, Prof. Hema A.
Murthy and Prof. B. Yegnanarayana for their role in building up the foundation
in subjects of Pattern Recognition and Artificial Neural Networks, useful for my
research.
My sincere thanks to Computer Science office and laboratory staff Mr. Na-
trajan, Mrs. Sharada, Ms. Poongodi, Mrs. Prema, Mr. Balu (at the department
library) for their valuable cooperation and assistance.
My special thanks to my lab-mates Sunando, Abhilash, Sundar, Deepti, Lalit,
Sreyasee, Manisha, Mirnalinee, Dyana, Aakanksha, Manika, Shivani, Poongodi,
Arpita, Vinod, Naresh, Uttara, Gyathri, Vidya for being tolerant and cooperative.
My special thanks to Sunando, Vinod, Aakanksha, Mirnalinee and Lalit for having
had long hours of research discussions with me and their advice during thesis
writing.
My stay at IITM has been made a memorable one by my friends Apoorve,
Sunando, Lalit, Abhilash, Sundar, Sai, Rakesh, Saurabh, Srini, Aditya, Shravan,
Rohit, Raj, Harendra.
Last but not the least, I would like to thank my parents, brother and sister
for being a source of encouragement and strength all throughout.
ABSTRACT
Image segmentation is an important component in many image analysis and com-
puter vision tasks. Particularly, the problem of efficient interactive foreground ob-
ject segmentation in still images is of great practical importance in image editing
and has been a subject of research for a long time. Classical image segmentation
tools use either texture, color or edge (contrast) information for the purpose of
segmentation. Deformable models, Graph-cut, GrabCut etc. are some prominent
methods used for the segmentation of a foreground object. Object segmentation
methods have helped in many computer vision areas, such as scene representation
& interpretation, content based image retrieval, object tracking in videos, medical
applications etc.
Most object segmentation techniques in computer vision are based on the prin-
ciple of boundary detection. These segmentation techniques assume a significant
and constant gray level change between the object(s) of interest and the back-
ground. However, this is not true in the case of textured images. In textured
images, there exist many local edges of the texture micro units (texels), due to
the basic nature of a texture image. In textured images, the object boundary
is defined as the place where the texture property changes. So, to perform
correct segmentation of such images, there is a need to incorporate
textural information in the segmentation process.
The objective of this work is to develop efficient methods for foreground ob-
ject(s) segmentation in a given image. In the first part of the work, we develop
techniques for the segmentation of single or multiple object(s) from an image in
presence of foreground and background textures. We use active contour models
for the task of texture object segmentation by incorporating texture features. We
model texture characteristics of the image by building the scalogram obtained
using the discrete wavelet transform.
In the second part of our work, we develop a technique for efficient segmenta-
tion of an object in a color image. This technique deals with the complex problem
of segmentation of an object which contains holes in it, and in addition, the color
distribution of a part of the object is similar to that of the background. Color
object segmentation techniques available in the literature, such as GrabCut,
often require post-corrective editing by the user to perform correct segmentation. Our
proposed technique is semi-automatic and only requires the user to define a rect-
angle (or polygon) around the object to be segmented, and does not require post-
corrective editing. The proposed method is based on a probabilistic framework
to integrate the outputs of Snake and GrabCut. We have demonstrated the effi-
ciency and correctness of our proposed methods using a set of sufficiently difficult
simulated and real world images.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS i
ABSTRACT iii
LIST OF TABLES viii
LIST OF FIGURES xiii
LIST OF ALGORITHMS xiv
ABBREVIATIONS xv
1 Introduction 1
1.1 Problem Definition . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Motivation and Scope . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 Purpose of Using Active Contour Based Methods for Segmentation of Textured Object(s) . . . . . . . . . . . . . . . . 6
1.4 Brief Description of Work Done . . . . . . . . . . . . . . . . . . 7
1.4.1 Texture Object Segmentation using Parametric Active Contours . . . . . . . . . . . . . . . . 8
1.4.2 Segmentation of Multiple Textured Objects using Geodesic Active Contours . . . . . . . . . . . . . . . . 9
1.4.3 SnakeCut: An Automatic Technique for Segmentation of a Foreground Object with Holes . . . . . . . . . . . . . . . . 10
1.5 Organization of the Thesis . . . . . . . . . . . . . . . . . . . . . 11
2 Literature Review 14
2.1 Object Segmentation Methods . . . . . . . . . . . . . . . . . . . 14
2.1.1 Energy Based Object Segmentation Methods . . . . . . . 15
2.2 Active Contour Models . . . . . . . . . . . . . . . . . . . . . . . 17
2.2.1 Parametric Active Contours (Snakes) . . . . . . . . . . . 18
2.2.2 Geometric (geodesic) Active Contours . . . . . . . . . . 19
2.3 Active Contour based Textured Object Segmentation . . . . . . 22
2.4 Graph-cut Based Methods . . . . . . . . . . . . . . . . . . . . . 25
2.5 Texture Representation and Analysis . . . . . . . . . . . . . . . 27
2.5.1 Texture Feature Extraction using Multiresolution Methods 28
2.5.2 Texture Feature Extraction using Spatial/Spatial-frequency Techniques . . . . . . . . . . . . . . . . 34
3 Textured Object Segmentation using Parametric Active Contours 38
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.1 Parametric Active Contour (Snake) Model . . . . . . . . 40
3.1.2 Discrete Wavelet Transform and Scalogram . . . . . . . . 41
3.2 Texture Feature Extraction . . . . . . . . . . . . . . . . . . . . 42
3.2.1 Scalogram estimation . . . . . . . . . . . . . . . . . . . . 43
3.2.2 Texture feature estimation . . . . . . . . . . . . . . . . . 44
3.3 Modeling of Texture Force . . . . . . . . . . . . . . . . . . . . . 45
3.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . 50
4 Segmentation of Multiple Textured Objects using Geodesic Active Contours 53
4.1 Geodesic Active Contour . . . . . . . . . . . . . . . . . . . . . . 55
4.2 Multi-dimensional Texture Feature Extraction . . . . . . . . . . 58
4.2.1 Multi-dimensional Texture Feature Estimation . . . . . . 58
4.3 Geodesic Active Contours in Texture Feature Space . . . . . . . 60
4.3.1 Segmentation of multiple textured objects . . . . . . . . 62
4.4 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . 65
4.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . 73
5 SnakeCut: An Automatic Technique for Segmentation of a Foreground Object with Holes 76
5.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.1.1 Parametric Active Contour (Snake) Model for Color Images 77
5.1.2 GrabCut . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2 Comparison of Active Contour and GrabCut Methods . . . . . . 81
5.3 SnakeCut: Integration of Active Contour and GrabCut . . . . . 85
5.4 SnakeCut Segmentation Results . . . . . . . . . . . . . . . . . . 91
5.5 Summary and Discussion . . . . . . . . . . . . . . . . . . . . . . 101
6 Conclusion 103
6.1 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.2 Future Scope of Work . . . . . . . . . . . . . . . . . . . . . . . . 104
LIST OF TABLES
2.1 Energy based approaches for the object segmentation (taken from Boykov and Funka-Lea, 2006) . . . . . . . . . . . . . . . . 15
2.2 Properties of parametric and geometric (geodesic) active contours (taken from Delingette and Montagnat, 2001) . . . . . . . . . . . . . . . . 22
3.1 Computational time required for the segmentation of textured objects in images shown in Fig. 3.4. Images are numbered from top to bottom in Fig. 3.4. . . . . . . . . . . . . . . . . 49
4.1 Computational time required for the segmentation of textured objects in images shown in Fig. 4.7. Images are numbered from top to bottom in Fig. 4.7. . . . . . . . . . . . . . . . . 69
5.1 Computational times for foreground object segmentation, required by Snake, GrabCut and SnakeCut for various images. . . . . . . . . . . . . . . . . 101
LIST OF FIGURES
1.1 Segmentation of foreground object: (a) input image, (b) segmented foreground object. . . . . . . . . . . . . . . . 2
1.2 Problem of segmentation of multiple objects in presence of foreground and background textures: (a) input image, (b) desired segmentation result of foreground objects, where object boundaries are shown using black contours. . . . . . . . . . . . . . . . 3
1.3 Problem of segmenting an object with holes: (a) input image. User wants to segment the central elliptical object present in the scene, which contains a rectangular hole in it, (b) desired segmentation result. In the result, background is visible through the hole of the object. . . . . . . . . . . . . . . . 4
1.4 Active Contour operating at a certain region of interest (ROI): (a) image with initial contour surrounding the ROI, (b) image showing the deforming contour, and (c) image with the converged contour. . . . . . . . . . . . . . . . 7
2.1 Categorization of filtering techniques used for texture representation. . . . . . . . . . . . . . . . 28
2.2 Real part of a 2-D Gabor filter with different combinations of scale (σ), orientation (θ) and frequency (ω): (a) σ = 6, θ = 0, ω = 2; (b) σ = 6, θ = 45, ω = 2.8. . . . . . . . . . . . . . . . 36
3.1 (a) Synthetic texture image with initial contour, (b) result obtained using normal intensity based snake, and (c) desired segmentation result. . . . . . . . . . . . . . . . 39
3.2 (a) Synthetic texture image, (b) magnified view of the 21 × 21 window of the texture cropped at point P shown in (a). . . . . . . . . . . . . . . . 42
3.3 (a) 1-D texture profile of the texture window shown in Fig. 3.2(b), (b) scalogram of the signal shown in (a), (c) texture feature image for the image shown in Fig. 3.2(a). . . . . . . . . . . . . . . . 43
3.4 Results of our proposed technique: first five texture images are composed of two Brodatz textures (Brodatz, 1966) and the last image is a natural Zebra image, (a) input texture images, (b) corresponding texture feature images, (c) segmentation results. Contour shown in images depicts the estimated boundary of the foreground object with texture. . . . . . . . . . . . . . . . 48
3.5 Comparative results: (I): results on a synthetic texture image: (a) from Sagiv et al. (2006), (b) our result, (II): results on Zebra-1 image: (a) from Paragios and Deriche (2002b), (b) from Rousson et al. (2003), (c) our result, (III): results on Zebra-2 image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) our result, (IV): results on Cheetah image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) result of the proposed method. . . . . . . . . . . . . . . . 51
3.6 Comparative results on Zebra image: (a) reproduced from Sagiv et al. (2002), (b) reproduced from Rousson et al. (2003), (c) reproduced from Awate et al. (2006), (d) reproduced from Sagiv et al. (2006), (e) obtained using the method presented in Gupta and Das (2006), (f) result of the proposed method. . . . . . . . . . . . . . . . 52
4.1 Problem of segmentation of multiple objects using a parametric active contour. . . . . . . . . . . . . . . . 55
4.2 Geometric interpretation of the attraction force in 1-D: (a) the original edge signal I, (b) the smoothed version of I, and (c) the derived stopping function g. The evolving contour is attracted to the valley created by ∇g · ∇u. (figure taken from Caselles et al., 1997) . . . . . . . . . . . . . . . 57
4.3 Demonstration of segmentation of multiple objects using the texture feature based geodesic active contour method: (a) evolution of level set surface (LSS), (b) evolving zero level set contour. First row shows the initial LSS and zero level set contour. Last row shows the final state of LSS and zero level set contour after convergence. . . . . . . . . . . . . . . . 63
4.4 A close look at the process of level set surface (LSS) evolution and contour splitting, from a different 3-D look angle, for the segmentation of multiple textured objects shown in Fig. 4.3. From (a)-(h), figures show the evolving LSS with zero level set contour (shown in red color on the surface). Fig. (a) shows the initial LSS and Fig. (h) shows the final LSS after segmentation. . . . . . . . . . . . . . . . 64
4.5 Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by inverse edge detector (Eqn. 4.8) using the scalogram based texture features, (c) input image with initial contour marked around the objects in black color, (d)-(i) intermediate positions of the evolving contour during segmentation process, (k) segmentation result where segmented objects' boundaries are shown in black color. . . . . . . . . . . . . . . . 66
4.6 Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by inverse edge detector using the scalogram based texture features, (c) input image with initial contour marked around the objects in black color, (d)-(i) intermediate positions of the evolving contour during segmentation process, (k) segmentation result where segmented objects' boundaries are shown in black color. . . . . . . . . . . . . . . . 67
4.7 Segmentation results of natural images: (a) input images where initial contours are marked around the object(s) in black color, (b) texture edge maps produced by inverse edge detector (Eqn. 4.8) using the scalogram based texture features for the respective images, (c) segmentation results of the image shown in column 1, where boundaries of the segmented objects are shown in black color. . . . . . . . . . . . . . . . 70
4.8 Contour evolution in the segmentation of Zebra image: (a) input image with initial contour marked around the objects in black color, (b)-(i) intermediate positions of the evolving contour during the segmentation process, (j) segmentation result where boundaries of the segmented objects are shown in black color. . . . . . . . . . . . . . . . 71
4.9 Comparative results (I): (a) input image, (b) result reproduced from Kim et al. (2002), (c) our result, (II): (a) input image, (b) result reproduced from Paragios and Deriche (2002a), (c) our result. . . . . . . . . . . . . . . . 72
4.10 Comparative results: (I) result on Zebra image: (a) reproduced from Paragios and Deriche (2002b), (b) reproduced from Rousson et al. (2003), (c) produced by our proposed technique; (II) result on Cheetah image: (a) reproduced from Kim et al. (2002), (b) reproduced from Rousson et al. (2003), (c) result produced by our proposed technique. . . . . . . . . . . . . . . . 73
4.11 Comparison of single textured object segmentation results obtained using the parametric active contour based technique (presented in section 3.3) and the geodesic active contour based technique (presented in section 4.3): (a) results obtained using parametric active contour based technique, (b) results obtained using geodesic active contour based technique. . . . . . . . . . . . . . . . 74
5.1 Estimation of gradient in color and gray level images: (a) input image, (b) gradient image of (a) estimated using Eqn. 5.2, (c) gray scale image of (a); (d) gradient image of (c). . . . . . . . . . . . . . . . 78
5.2 (a) Input image, elliptical object present in the image contains a rectangular hole at the center, (b) foreground initialization by user, (c) active contour segmentation result, and (d) GrabCut segmentation result. . . . . . . . . . . . . . . . 81
5.3 (a) Teapot image; segmentation results of (b) active contour and (c) GrabCut. . . . . . . . . . . . . . . . 83
5.4 (a) Image containing wheel; segmentation results of (b) active contour and (c) GrabCut. . . . . . . . . . . . . . . . 83
5.5 (a) Soldier image, segmentation results of (b) active contour and (c) GrabCut. . . . . . . . . . . . . . . . 83
5.6 Flow chart of the proposed SnakeCut technique. . . . . . . . . . . . . . . . 88
5.7 Segmentation of image shown in Fig. 5.2(a) using SnakeCut: (a) object boundary produced by active contour, (b) distance transform for the boundary contour shown in (a); (c) SnakeCut segmentation result. . . . . . . . . . . . . . . . 88
5.8 Effect of interval [a, b] on the non-linearity of the fuzzy distribution function (Eqn. 5.10). When a < b, transition from 0 (at a) to 1 (at b) is smooth. When a ≥ b, we have a step function with the transition at (a + b)/2. . . . . . . . . . . . . . . . 89
5.9 Demonstration of the impact of Pc and Ps values on the decision making in Algorithm 2: (a) teapot image with a few points marked on it, (b) values of Pc, Ps, p, and the decision obtained using Algorithm 2, for the points labeled in (a). Values used for ρ and T are 0.5 and 0.6 respectively. . . . . . . . . . . . . . . . 92
5.10 Demonstration of a SnakeCut result on a synthetic image, where Snake fails and GrabCut works: (a) input image with foreground initialized by the user (object contains a rectangular hole at the center), (b) Snake segmentation result (incorrect, output contains the hole as a part of the object), (c) GrabCut segmentation result (correct, hole is removed), and (d) SnakeCut segmentation result (correct, hole is removed). . . . . . . . . . . . . . . . 92
5.11 Demonstration of a SnakeCut result on a synthetic image, where Snake works and GrabCut fails: (a) input image with foreground initialized by the user, (b) Snake segmentation result (correct), (c) GrabCut segmentation result (incorrect, upper green part of the object is removed), and (d) correct segmentation result produced by SnakeCut. . . . . . . . . . . . . . . . 92
5.12 Segmentation of real pot image: (a) input real image, (b) active contour segmentation result (incorrect), (c) GrabCut segmentation result (correct), and (d) SnakeCut segmentation result (correct, background pixels visible through the handles of the pot are removed). . . . . . . . . . . . . . . . 93
5.13 SnakeCut segmentation results of (a) teapot (for the image in Fig. 5.3(a)), (b) wheel (for the image in Fig. 5.4(a)) and (c) soldier (for the image in Fig. 5.5(a)). . . . . . . . . . . . . . . . 95
5.14 Segmentation of cup image: (a) input real image, (b) segmentation result produced by Snake (incorrect, as background pixels visible through the cup's handle are detected as a part of the object), (c) GrabCut segmentation result (incorrect, as the spots present on the cup's handle are removed), and (d) correct segmentation result produced by SnakeCut. . . . . . . . . . . . . . . . 95
5.15 Segmentation of webcam bracket image: (a) input real image where the objective is to segment the lower bracket present in the image, (b) Snake segmentation result (incorrect, as background pixels visible through the holes present in the object are detected as part of the foreground object), (c) GrabCut segmentation result (incorrect, as large portions of the bracket are removed in the result), and (d) correct segmentation result produced by SnakeCut. . . . . . . . . . . . . . . . 96
5.16 Comparison of the results: (a) SnakeCut result for teapot, (b) GrabCut result for teapot with user interaction, (c) SnakeCut result for wheel, (d) GrabCut result for wheel with user interaction, (e) SnakeCut result for soldier, (f) GrabCut result for soldier with user interaction (reproduced from (Rother et al., 2004)), (g) SnakeCut result for webcam bracket, (h) GrabCut result for webcam bracket with user interaction. . . . . . . . . . . . . . . . 98
5.17 Example where SnakeCut fails: (a) input image with foreground initialized by user, (b) active contour segmentation result (correct), (c) GrabCut segmentation result (incorrect), and (d) SnakeCut segmentation result (incorrect). . . . . . . . . . . . . . . 100
LIST OF ALGORITHMS
1 Identification of significant subbands . . . . . . . . . . . . . . . . 44
2 Steps of SnakeCut . . . . . . . . . . . . . . . . 90
ABBREVIATIONS
WT Wavelet Transform
DWT Discrete Wavelet Transform
DCT Discrete Cosine Transform
DFT Discrete Fourier Transform
ACM Active Contour Model
RGB Red Green Blue
PDE Partial Differential Equation
DP Dynamic Programming
MRF Markov Random Field
GVF Gradient Vector Flow
FCM Fuzzy C-Means
GMM Gaussian Mixture Model
DT Distance Transform
CHAPTER 1
Introduction
An important component in many image analysis and computer vision tasks is
image segmentation, a process in which an image is partitioned into meaningful
constituent parts. Interactive segmentation methods have become increasingly
popular as a way to alleviate the problems inherent in fully automatic segmentation,
which is rarely perfect in practice. In particular, the problem of efficient
interactive foreground object segmentation in still images is of great practical im-
portance in image editing and has been a subject of research for a long time. It
can be associated with the problem of boundary detection and integration, where
a boundary is roughly defined as a curve or surface separating “homogeneous”
regions. Fig. 1.1 shows an example of foreground object segmentation in an image.
Deformable models, Graph-cut, GrabCut etc. are some prominent methods
used for segmentation of foreground object, which use either color or edge (con-
trast) information for the segmentation purpose. In the recent past object seg-
mentation methods have helped in many computer vision areas, such as scene
representation & interpretation, content based image retrieval (CBIR), object
tracking in videos, medical applications etc. Common challenges in the object
segmentation process are effects of uneven sample illumination, shadowing, par-
tial occlusion, clutter, noise, subtle object to background differences and changes
etc. In fully automatic object segmentation methods, correct detection and
segmentation of a semantic object in an image is a challenging problem.

Fig. 1.1: Segmentation of a foreground object: (a) input image, (b) segmented
foreground object.

Interactive
(semi-automatic) techniques that take some input from the user naturally have
an advantage over automatic ones, as the input significantly improves object
extraction.
1.1 Problem Definition
The main goal of this thesis is to develop efficient methods for segmentation of
foreground object(s) in a given image. In the first part of the work, we develop
techniques for the segmentation of single or multiple object(s) from an image, in
the presence of foreground and background textures. In the second part of our
work, we develop a technique for efficient segmentation of an object which contains
holes in it. This problem is further complicated by the fact that color distribution
of some parts of the object may also be similar to a part of the background. Figs.
1.2 and 1.3 show the inputs and the desired outputs in these segmentation tasks.
The two problems addressed in this thesis are as follows:
• Segmentation of single or multiple textured objects, in the presence of a back-ground texture, using a representation based on the scalogram (Clerc and Mallat, 2002) of the DWT as a texture feature.

Fig. 1.2: Problem of segmentation of multiple objects in presence of foreground and
background textures: (a) input image, (b) desired segmentation result of foreground
objects, where object boundaries are shown using black contours.
• Integration of a Snake (parametric active contour) (Kass et al., 1988) and GrabCut (Rother et al., 2004) using a probabilistic framework for the segmentation of a foreground object with holes in color images. We call the integrated technique “SnakeCut”.
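To illustrate why a scalogram-style feature discriminates textures, the sketch below (our own minimal construction, not the thesis implementation; the function name and test signals are hypothetical) decomposes a 1-D texture profile with a Haar DWT and records the detail-coefficient energy at each scale. Textures of different coarseness concentrate their energy at different scales, which is what makes the feature useful for separating foreground from background texture.

```python
import numpy as np

def haar_dwt_energies(signal, levels=3):
    """Decompose a 1-D texture profile with a Haar DWT and return the
    energy of the detail coefficients at each scale -- a crude stand-in
    for one column of a scalogram."""
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:                       # pad odd-length signals
            approx = np.append(approx, approx[-1])
        pairs = approx.reshape(-1, 2)
        detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0)
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
        energies.append(float(np.sum(detail ** 2)))
    return energies

# A fine (period-2) texture concentrates its energy at the first scale,
# a coarse (period-8) texture at the third scale:
fine = np.tile([1.0, -1.0], 16)
coarse = np.tile(np.repeat([1.0, -1.0], 4), 4)
print(haar_dwt_energies(fine))     # energy only at scale 1
print(haar_dwt_energies(coarse))   # energy only at scale 3
```

Stacking such per-scale energies for a sliding window over the image gives a multi-dimensional texture descriptor in the spirit of the scalogram feature used in this thesis.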
1.1.1 Assumptions
We assume the following conditions for our proposed methods of foreground tex-
tured object segmentation:
1. Initialization provided by the user does not include any background object.
2. We assume that there are no occlusions in case of multiple objects.
3. In the SnakeCut segmentation technique, we assume that the initialization for the foreground includes only a single object.
4. No part of the foreground object has similarity with part of the background.
5. Input images are noise free.
Fig. 1.3: Problem of segmenting an object with holes: (a) input image. The user
wants to segment the central elliptical object present in the scene, which contains a
rectangular hole in it, (b) desired segmentation result. In the result, the background
is visible through the hole of the object.
1.2 Motivation and Scope
Most object segmentation techniques in computer vision are based on the prin-
ciple of boundary detection. These segmentation techniques assume a significant
gray level change between the object(s) of interest and the background. However,
this is not true in the case of textured images. In textured images, many edges
of texture micro units (texels) exist due to the nature of the texture. So the
object segmentation techniques relying on intensity gradient are likely to fail in
such situations. In case of textured images, the object boundary is defined as the
place where texture property changes. So to perform the correct segmentation in
case of textured images, there is a need to incorporate the textural information
in the segmentation process. In the literature, only a few contour-based techniques
perform object segmentation in the presence of texture (Paragios and
Deriche, 1999b,a; Sagiv et al., 2000, 2006). This motivates us to develop efficient
object segmentation techniques that can operate in the
presence of texture. The segmentation methods developed in this work are quite
robust and efficient, and are able to perform object segmentation task in the case
of both synthetic and natural texture images.
We also developed an object segmentation technique for non-textured color
objects with holes. The developed technique is also very useful in the case where
the color distribution of some parts of the foreground object is similar to the image
background and the object contains holes within it. Object segmentation techniques
(such as Rother et al. (2004)) available in the literature for colored images, often
require post-corrective editing by the user to achieve correct segmentation. Hence
there is a need for segmentation techniques that can perform
the task with minimal user interaction in such cases. Our proposed
technique is semi-automatic and only requires the user to define a rectangle or
polygon around the object to be segmented, and no post-corrective editing is re-
quired. We have demonstrated the efficiency and correctness of our method using
sufficiently difficult images.
1.2.1 Applications
In the recent past, object segmentation methods have helped in many computer
vision tasks. A few of them are listed below:
1. Object recognition,
2. Object segmentation and tracking in videos,
3. Object oriented video coding,
4. Surveillance and tracking,
5. Medical imaging applications, diagnosis and surgery,
6. Modeling focus of attention in visual perception,
7. Bin-picking problem in robotics,
8. Target detection and identification.
1.3 Purpose of Using Active Contour Based Meth-
ods for Segmentation of Textured Object(s)
The task of segmentation of textured object(s) is to divide an image into two
parts: foreground object(s) and background. There are many texture segmenta-
tion algorithms (based on active contour, clustering etc.) which can perform such
segmentation task. The following points favor the use of active contours for
segmentation of textured objects over the alternatives.
• An active contour has the ability to operate on a certain part of the image (Fig. 1.4) and does not need to consider the entire image, so the segmentation process is fast.

• An active contour solves the segmentation problem by considering an object boundary as a single, connected structure. It exploits a priori knowledge of object shape and inherent smoothness, usually formulated as internal deformation energies, to compensate for noise, gaps and other irregularities in the object boundaries. Its underlying geometric representation provides a compact analytical description of an object.

• In an object segmentation problem, the objective is to find the closed boundary of the object, using which the object can be cropped. If a clustering algorithm is used for object segmentation, it will suffer from the following problems:

– First, the boundary produced by the algorithm may not be a closed contour, so some algorithm has to be applied for edge linking.

– Second, boundaries detected by clustering algorithms are inaccurate most of the time, so linking of edges also does not help.

• Boundary information produced by an active contour can be used as a shape feature for the representation of the object in many applications, such as content based image retrieval.

• An active contour can interpret sparse, incomplete and redundant information and is selective with respect to false image features.
Fig. 1.4: Active Contour operating at a certain region of interest (ROI): (a) image
with initial contour surrounding the ROI, (b) image showing the deforming contour,
and (c) image with the converged contour.
Thus the purpose behind the use of active contour for textured object segmentation is very apparent. Other segmentation algorithms can also be used to produce the same results, but active contour supersedes them in terms of simplicity and ease of use, and in many cases provides an efficient result for object segmentation.
1.4 Brief Description of Work Done
The work presented in this thesis is explained below in three sections: (1) textured
object segmentation using parametric active contours, (2) multiple textured ob-
jects segmentation using geodesic active contour and (3) SnakeCut: an automatic
technique for segmentation of a foreground object with holes.
1.4.1 Textured Object Segmentation using Parametric Active Contours
In this part of the work, we focus on parametric active contours, which synthe-
size parametric curves within image domain and allow them to move towards
the desired image features under the influence of internal and external forces.
The internal force serves to impose a piecewise continuity and smoothness con-
straint whereas external force pushes the contour towards salient image features
like edges, lines and subjective contours. External force in the traditional active
contour is defined in terms of the image gradient. In the presence of such external
force, the snake is attracted towards large image gradients, i.e. towards the edges in the image. So if it is applied to textured images, it will often get stuck on local texel (micro-units or cells of a texture) edges and converge at a non-object boundary.
We extend the energy based parametric active contour model for the segmentation of textured objects. To overcome the above-mentioned effect, we use a new formulation of the external force in the active contour for textured images, which we name "texture force". Texture force does not use the image pixel intensity values
directly for modeling but incorporates texture information present in the image.
We use scalogram (Clerc and Mallat, 2002) obtained from discrete wavelet trans-
form (DWT) for the extraction of texture information present in the image. The
active contour in presence of texture force runs over the texture image surface and
detects the object boundary of a texture surface, on a background texture. The
following section briefly presents our proposed texture feature extraction method.
Texture Feature Extraction
Extraction of texture features at a point in an image involves two steps: scalogram
estimation and texture feature estimation. To obtain scalogram (Clerc and Mallat,
2002) at a particular point (pixel) in the image, an n × n window is considered
around the point of interest. Intensities of the pixels in this window are arranged
in the form of a vector of length n2 whose elements are taken column-wise (or
row-wise) from the n×n cropped intensity matrix. This intensity vector (signal),
which basically represents the textural pattern around the pixel, is subsequently
used in the estimation of scalogram. Once the scalogram of the texture profile at a
particular point is obtained, a post-processing operation is carried out to eliminate
the non-significant subbands of the scalogram, and only the significant subbands
are used for further processing. This is done since significant subbands contain
major texture features information. Removal of non-significant subbands helps in
removing the redundant information and makes the computation fast. For texture
feature estimation at a particular point, an energy measure is calculated by taking
the mean of the scalogram coefficients from the significant subbands. This energy
measure, computed for all pixels in an image, constitutes the "texture feature image",
which is further used in texture force modeling.
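The two steps above (scalogram estimation over an n × n window, then an energy measure over the significant subbands) can be sketched as follows. This is an illustrative sketch only: it uses a plain 1-D Haar decomposition in place of the scalogram, and the `keep_ratio` threshold for selecting significant subbands is a hypothetical choice, not the selection rule used in this work.

```python
def haar_dwt(signal):
    # Full 1-D Haar decomposition: one detail subband per level,
    # plus the final approximation (length must be a power of 2).
    subbands, approx = [], list(signal)
    while len(approx) > 1:
        avg = [(approx[k] + approx[k + 1]) / 2 for k in range(0, len(approx), 2)]
        det = [(approx[k] - approx[k + 1]) / 2 for k in range(0, len(approx), 2)]
        subbands.append(det)
        approx = avg
    subbands.append(approx)
    return subbands

def texture_feature(window_vector, keep_ratio=0.5):
    # Energy per subband; subbands whose energy falls below a fraction
    # of the maximum are treated as non-significant and discarded.
    subbands = haar_dwt(window_vector)
    energies = [sum(c * c for c in sb) / len(sb) for sb in subbands]
    threshold = keep_ratio * max(energies)
    significant = [sb for sb, e in zip(subbands, energies) if e >= threshold]
    # Feature = mean of the (absolute) coefficients from significant subbands.
    coeffs = [abs(c) for sb in significant for c in sb]
    return sum(coeffs) / len(coeffs)
```

Computing `texture_feature` on the n²-length intensity vector around every pixel yields the texture feature image used in texture force modeling.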
1.4.2 Segmentation of Multiple Textured Objects using
Geodesic Active Contours
Though parametric active contours are fast, efficient and easy to implement, they
are not designed to handle topological changes during the process of segmentation. Hence, they are not intrinsically suitable for segmenting multiple objects: special topology-handling procedures must be added to handle topological changes. Therefore, to segment multiple objects in the presence of foreground and background textures, we extend the concept of the geodesic active contour, which can handle topological changes naturally.
We have developed an efficient algorithm based on the geodesic active contour
for segmentation of multiple objects in the presence of foreground/background
textures. The proposed technique is based on the generalization of the geodesic active contour from a one-dimensional intensity based feature space to a multidimensional texture feature space. In our approach, the image is represented in an n-dimensional
texture feature space, which is derived from the image using DWT and scalogram
(Clerc and Mallat, 2002). We formulate an edge indication function, used in the
geodesic active contour, from the texture feature space of the image, by viewing
texture feature space as a Riemannian manifold (Sagiv et al., 2000, 2006).
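The role of the edge indication function can be illustrated with a small sketch. Treating the texture feature space as a map F : (x, y) → Rⁿ, the Beltrami-style induced metric is h = I + JᵀJ with J the Jacobian of F; det(h) grows across texture edges. The reciprocal 1/det(h) used below is one simple choice of stopping function for illustration, not necessarily the exact form derived in this work.

```python
def edge_indicator(features):
    # features[y][x] is an n-dimensional texture feature vector.
    # Induced metric h = I + J^T J (J: Jacobian of the feature map);
    # det(h) is large across texture edges, so 1/det(h) -> 0 there,
    # which stops the evolving geodesic active contour.
    H, W = len(features), len(features[0])
    g = [[1.0] * W for _ in range(H)]
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            h11 = h22 = 1.0
            h12 = 0.0
            for i in range(len(features[y][x])):
                fx = (features[y][x + 1][i] - features[y][x - 1][i]) / 2.0
                fy = (features[y + 1][x][i] - features[y - 1][x][i]) / 2.0
                h11 += fx * fx
                h22 += fy * fy
                h12 += fx * fy
            g[y][x] = 1.0 / (h11 * h22 - h12 * h12)
    return g
```

On a flat feature map the indicator is 1 everywhere; wherever any feature channel changes rapidly, det(h) exceeds 1 and the indicator drops below 1.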
1.4.3 SnakeCut: An Automatic Technique for Segmenta-
tion of a Foreground Object with Holes
We develop a technique for the segmentation of objects with holes in color images. We call our integrated technique "SnakeCut". This technique mainly deals with the segmentation of an object which contains holes, where the color distribution of some parts of the object is also similar to the background. The
proposed technique is based on Snake (parametric active contour) (Kass et al.,
1988) and GrabCut (Rother et al., 2004). Snake is a deformable contour which segments an object boundary using boundary discontinuities, by minimizing the
energy function associated with the contour. GrabCut is an interactive tool based
on iterative graph-cut for foreground object segmentation in still images. GrabCut
provides a convenient way to encode color features as segmentation cues to ob-
tain foreground segmentation from local pixel similarities using modified iterated
graph-cuts. Since active contour uses gradient information (boundary discontinu-
ities) present in the image to estimate the object boundary, it cannot penetrate
inside the object to detect a hole. It thus cannot remove any pixels inside the
object boundary which do not belong to the object. GrabCut, on the other hand,
works on the basis of the pixel color (intensity) distribution, and hence has the ability to remove interior pixels which are not part of the object. The major problem with GrabCut is as follows. If some parts of the foreground object have
color distribution similar to the image background, then those parts will also be
eliminated by GrabCut. In GrabCut algorithm (Rother et al., 2004), missing
foreground data is recovered by user interaction. In SnakeCut, we present a novel
formulation, based on a probabilistic framework, to integrate these two complementary techniques to obtain an automatic segmentation of a foreground object
with holes.
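The intuition behind the integration can be sketched with a toy decision rule. This is only an illustration of the idea, not the actual probabilistic formulation of SnakeCut: the band width `band` and the linear weighting are hypothetical choices.

```python
def snakecut_decision(inside_snake, grabcut_fg, dist_to_contour, band=5.0):
    # Near the snake boundary, trust the snake (boundary cue); deep inside
    # the contour, trust GrabCut (color cue), which can carve out holes.
    # dist_to_contour: distance of the pixel from the snake boundary.
    if not inside_snake:
        return False                                   # outside the snake: background
    w = min(max(dist_to_contour / band, 0.0), 1.0)     # 0 near boundary, 1 deep inside
    p_fg = (1.0 - w) * 1.0 + w * (1.0 if grabcut_fg else 0.0)
    return p_fg >= 0.5
```

Near the boundary, where GrabCut may wrongly drop object parts colored like the background, the snake decision wins; deep inside the contour, where the snake cannot reach, GrabCut removes hole pixels.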
1.5 Organization of the Thesis
In this thesis, the problem of textured object segmentation has been attempted
by using the Snake (parametric active contour) (Kass et al., 1988) and geodesic
active contours (Caselles et al., 1997). We have also addressed the problem of
segmentation of an object with holes in color images using the integration of
Snake and GrabCut (Rother et al., 2004). The rest of the thesis is organized as
follows:
Chapter 2: Literature Review - This chapter discusses the various techniques
present in the literature for the segmentation of textured and non-textured objects.
Chapter 3: Textured Object Segmentation using Parametric Active
Contours - This chapter presents a method for segmentation of a textured ob-
ject using the parametric active contour. It also discusses the texture feature
extraction technique based on scalogram obtained using DWT.
Chapter 4: Segmentation of Multiple Textured Objects using Geodesic
Active Contours - This chapter presents a technique for segmentation of mul-
tiple textured objects using the geodesic active contour. The presented approach can handle segmentation of multiple objects simultaneously in the presence of foreground
and background textures. This approach is based on the generalization of geodesic
active contour model from one dimensional intensity based feature space to mul-
tidimensional texture feature space.
Chapter 5: SnakeCut: An Automatic Technique for Segmentation of
a Foreground Object with Holes - This chapter discusses the limitations of
active contour (Snake) and GrabCut segmentation techniques, and proposes an
efficient, semi-interactive method based on the integration of these two popular
techniques for the segmentation of an object in color images. The proposed technique, termed "SnakeCut", is particularly useful for the segmentation of an object which contains holes within it, where the color distribution of some parts of the object is the same as the background.
Chapter 6: Conclusion - This chapter concludes the thesis with discussion on
contribution and future scope of work.
CHAPTER 2
Literature Review
In this chapter, a review of techniques proposed for the segmentation of textured
and non-textured objects is presented. It also presents a short summary on texture
feature extraction and representation techniques.
2.1 Object Segmentation Methods
In the last two decades, the computer vision community has produced a num-
ber of useful algorithms for localizing object boundaries in images. Parametric
active contours (snakes) (Kass et al., 1988; Cohen, 1991), geodesic active con-
tours (Caselles et al., 1997; Yezzi et al., 1997), graph-cuts (Freedman and Zhang,
2005), GrabCut (Rother et al., 2004), intelligent scissors (Mortensen and Barrett,
1998), “shortest path” techniques (Mortensen and Barrett, 1998; Falcao et al.,
1998) and many other methods exist for partitioning an image into two segments:
“foreground object” and “background”. These methods integrate model-specific
visual cues and contextual information in order to accurately describe a particular
object. Each method comes with its own set of features.
Table 2.1: Energy based approaches for object segmentation (taken from Boykov and Funka-Lea, 2006)

Explicit boundary representation:
– Variational methods (optimization in R∞): snakes and active contours (variational formulations) (e.g. Kass et al., 1988; Cohen, 1991)
– Combinatorial methods (optimization in Zn): dynamic programming and "path-based" graph methods (2-D only), intelligent scissors (e.g. Amini et al., 1990; Geiger et al., 1995; Mortensen and Barrett, 1998; Jermyn and Ishikawa, 1999)

Implicit boundary representation:
– Variational methods (optimization in R∞): level-sets, geodesic active contours (e.g. Caselles et al., 1997; Sethian, 1999; Sapiro, 2001; Osher and Paragios, 2003)
– Combinatorial methods (optimization in Zn): combinatorial graph-cuts (Boykov and Jolly, 2001)
2.1.1 Energy Based Object Segmentation Methods
Most of the object segmentation techniques available in the literature are based
on some kind of energy formulation. A brief overview of these segmentation
techniques is given in Table 2.1. Energy-based segmentation methods can be
distinguished by the type of energy function they use and by the optimization
technique for minimizing it. The majority of standard algorithms can be divided
into two large groups:
A. Energy functional defined on a continuous contour or surface
B. Energy functional or cost function defined on a discrete set of variables
The standard methods in group A formulate segmentation problem in the do-
main of continuous functions R∞. For optimization, most of them rely on a vari-
ational approach and gradient descent. Numerical techniques for such variational
approaches are based on finite differences or on finite elements. A few examples of the methods of group A include snakes (Kass et al., 1988; Cohen, 1991), region
competition (Zhu and Yuille, 1996), geodesic active contours (Caselles et al., 1997),
and other methods based on level-sets (Sethian, 1999; Osher and Fedkiw, 2002;
Sapiro, 2001; Osher and Paragios, 2003). Typically, continuous surface function-
als incorporate various “regional” and “boundary” properties of segments, some
of which can be geometrically motivated (Caselles et al., 1997; Vasilevskiy and
Siddiqi, 2002). In most cases, methods in group A use variational optimization
techniques that can guarantee finding only a local minimum of the corresponding energy functional.
The segmentation methods in group B either directly formulate the prob-
lem as a combinatorial optimization in finite dimensional space Zn or optimize
some discrete energy function whose minimum approximates the solution of some
continuous problem. Most of the discrete optimization methods for object seg-
mentation minimize an energy defined over a finite set of integer-valued variables.
Such variables are usually associated with graph nodes corresponding to image
pixels or control points. All previous combinatorial methods for object segmen-
tation use discrete variables whose values encode “direction” of a path along the
graph. Many path-based methods use Dynamic Programming (DP) to compute
the optimal paths. For example, intelligent scissors (Mortensen and Barrett, 1998)
and live-wire (Falcao et al., 1998) use Dijkstra's algorithm, while DP-snakes (Amini et al., 1990) use the Viterbi algorithm. Note that all path-based methods can naturally encode boundary-based segmentation cues, while the incorporation of region
properties in segments is less obvious (Jermyn and Ishikawa, 1999). In any case,
all path-based methods are limited to 2-D applications because object boundary
in 3-D volumes cannot be represented by a path.
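The path-based idea can be made concrete with a minimal Dijkstra sketch in the spirit of intelligent scissors / live-wire. The 4-connectivity and the simple per-pixel cost map are simplifying assumptions; actual implementations use richer local costs and 8-connectivity.

```python
import heapq

def livewire_path(cost, start, goal):
    # Dijkstra on the pixel grid: cost[i][j] is the local cost of entering
    # pixel (i, j), chosen to be low along strong object edges.  Returns the
    # minimum-cost 4-connected path from start to goal.
    H, W = len(cost), len(cost[0])
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue                    # stale queue entry
        i, j = u
        for v in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            if 0 <= v[0] < H and 0 <= v[1] < W:
                nd = d + cost[v[0]][v[1]]
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(pq, (nd, v))
    # Walk predecessors back from the goal to recover the boundary path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

The returned path hugs the low-cost corridor, which is exactly how a live-wire tool snaps the user's seed points onto the object boundary.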
Global vs. Local Optimization
In general, global solutions are attractive because of their potentially better sta-
bility. For example, imperfections in a globally optimal solution can be directly
related to the cost function rather than to the numerical problem during mini-
mization. Thus, global methods can be more reliable and robust. Some versions of
active contours (Cohen and Kimmel, 1997), shortest path algorithms (Mortensen
and Barrett, 1998; Falcao et al., 1998), ratio regions (Cox et al., 1996), and some
other segmentation methods (Jermyn and Ishikawa, 1999) compute a globally op-
timal solution (in 2-D applications), in the case when the segmentation boundary is
a 1-D curve.
Since our study is based on active contours and graph-cut based segmentation methods, we review the segmentation methods based on these two techniques in detail.
2.2 Active Contour Models
Active contours (Kass et al., 1988) are extensively used in computer vision and
image understanding applications, particularly to locate object boundaries. They
are energy minimizing deformable contours that converge at the boundary of an
object in an image. Deformation of the contour is caused by internal and external forces acting on it. The internal force is derived from the contour itself and the external force is derived from the image. The internal and external forces are defined so
that the snake will conform to object boundary or other desired features within
the image. Snakes are widely used in many applications such as segmentation
(Leymarie and Levine, 1993), shape modeling (Terzopoulos and Fleischer, 1988),
edge detection (Kass et al., 1988), motion tracking (Terzopoulos and Szeliski,
1992) etc. Active contours can be classified as either parametric active contours
(snake) (Kass et al., 1988; Cohen, 1991) or geodesic (geometric) active contours
(Caselles et al., 1993, 1997) according to their representation and implementa-
tion. In particular, parametric active contours explicitly represent curves in their
parametric form in a Lagrangian framework whereas the geodesic active contours
implicitly represent model shape as the zero level set of a two-dimensional scalar
function. Geodesic active contours evolve in an Eulerian framework based on front propagation (Malladi et al., 1995), using the theory of curve evolution.
2.2.1 Parametric Active Contours (Snakes)
Parametric active contours synthesize parametric curves within image domain
and allow them to move towards the desired image features under the influence
of internal and external forces. The internal force serves to impose a piecewise
continuity and smoothness constraint whereas external force pushes the snake
towards salient image features like edges, lines and subjective contours. External
force in the traditional snake is defined as the negative of the image gradient.
The traditional active contour model (Kass et al., 1988) suffers from the problems of contour initialization and poor convergence to cavities. The initial contour is required to be placed near the object boundary by the user to achieve correct convergence.
Many attempts have been made to improve this model. Cohen (Cohen, 1991)
proposes a new external force, the balloon (inflation) force, to solve the contour
initialization problem. In the presence of this pressure force, the snake behaves like a balloon and gets inflated; the initial curve no longer needs to be close to the object boundary. Xu et al. (Xu and Prince, 1997, 1998a,b) introduce another new external force, the gradient vector flow (GVF) field, to allow flexible initialization of the snake. GVF also encourages snake convergence to boundary cavities. Paragios et al. (Paragios et al., 2004) present an improved version of the GVF snake which gives better convergence when vector flows are tangent to the contour or diverge within a neighborhood. Many researchers have suggested neural network based minimization of the active contour energy. Tsai et al. (Tsai et al., 1993) use a Hopfield network to minimize the snake energy. Venkatesh and Rishikesh (Venkatesh and Rishikesh, 1997, 2000) present a new implementation of the snake (Kass et al., 1988) using self-organizing neural networks. Chiou et al. (Chiou and Hwang, 1995) present a stochastic active contour based on neural
networks. Many techniques have been developed in recent years using multiple
snakes for object segmentation (Srinark and Kambhamettu, 2001, 2006).
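A single gradient-descent iteration of the snake energy illustrates how the internal and external forces interact. This is a bare explicit-Euler sketch; the original formulation uses a semi-implicit pentadiagonal solve, and the coefficients here are illustrative.

```python
def snake_step(pts, force, alpha=0.1, beta=0.01, gamma=1.0):
    # pts: list of (x, y) points on a closed contour; force[i]: external
    # force (fx, fy) sampled at point i.  The internal force is the descent
    # direction of the membrane/thin-plate energy: alpha*v'' - beta*v''''.
    n = len(pts)
    out = []
    for i in range(n):
        p = []
        for c in (0, 1):  # x, then y
            d2 = pts[(i - 1) % n][c] - 2 * pts[i][c] + pts[(i + 1) % n][c]
            d4 = (pts[(i - 2) % n][c] - 4 * pts[(i - 1) % n][c] + 6 * pts[i][c]
                  - 4 * pts[(i + 1) % n][c] + pts[(i + 2) % n][c])
            p.append(pts[i][c] + gamma * (alpha * d2 - beta * d4 + force[i][c]))
        out.append(tuple(p))
    return out
```

With zero external force the internal terms dominate, so a contour shrinks and smooths; in practice the external (image) force halts this shrinkage at object edges.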
2.2.2 Geometric (geodesic) Active Contours
Geometric active contours were independently introduced by Caselles et al. (Caselles
et al., 1993) and Malladi et al. (Malladi et al., 1995). These models were based
on curve evolution theory (Kimia et al., 1995) and level set method (Osher and
Fedkiw, 2002). The basic idea is to represent contours as the zero level set of
an "implicit function" defined in a higher dimensional space, usually referred to as the level set function, and then evolve the level set function according to a partial
differential equation (PDE). This approach presents the following advantages over the
traditional parametric active contours:
1. The contours represented by the level set function may break or merge naturally during the evolution, and topological changes are thus automatically handled. For parametric contours, it is in general not possible to achieve any automatic topological changes. However, several algorithms have been proposed to overcome this limitation (Leitner and Cinquin, 1991; McInerney and Terzopoulos, 1996; Lachaud and Montanvert, 1999).

2. The level set function always operates on a fixed grid, which allows efficient numerical schemes.
Early geometric (geodesic) active contour models (Caselles et al., 1993; Malladi
et al., 1995; Caselles et al., 1997) are typically derived using a Lagrangian formu-
lation that yields a certain evolution PDE of a parameterized curve. This param-
eterized PDE is then converted to an evolution PDE for a level set function, using
the related Eulerian formulation from level set methods. As an alternative, the
evolution PDE of the level set function can be directly derived from the problem
of minimizing a certain energy functional defined on the level set function. Such variational methods are known as variational level set methods (Chan and
Vese, 2001; Vemuri and Chen, 2003; Zhao et al., 1996). Compared with pure PDE
driven level set methods, the variational level set methods are more convenient
and natural for incorporating additional information, such as region-based infor-
mation (Chan and Vese, 2001) and shape-prior information (Vemuri and Chen,
2003), into energy functionals that are directly formulated in the level set do-
main, which produces more robust results. For example, Chan and Vese (Chan
and Vese, 2001) proposed an active contour model using a variational level set
formulation. By incorporating region-based information into energy functional as
20
an additional constraint, their model has a much larger convergence range and flex-
ible initialization. Vemuri and Chen (Vemuri and Chen, 2003) proposed another
variational level set formulation. By incorporating shape-prior information, their
model is able to perform joint image registration and segmentation.
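The region-based information referred to above can be illustrated by the data-fitting term of the Chan-Vese functional: the contour is driven to a partition in which each side is well approximated by its mean intensity. Below is a 1-D toy version over a flattened pixel list with a binary inside/outside mask; the λ weights are the usual user-chosen constants.

```python
def chan_vese_fit(img, inside_mask, lam1=1.0, lam2=1.0):
    # Sum of squared deviations from the mean intensity inside and outside
    # the contour; minimal when the mask separates two homogeneous regions.
    inside = [v for v, m in zip(img, inside_mask) if m]
    outside = [v for v, m in zip(img, inside_mask) if not m]
    c1 = sum(inside) / len(inside)    # mean intensity inside the contour
    c2 = sum(outside) / len(outside)  # mean intensity outside the contour
    return (lam1 * sum((v - c1) ** 2 for v in inside)
            + lam2 * sum((v - c2) ** 2 for v in outside))
```

Because the term depends on region statistics rather than gradients, it remains informative even without sharp edges, which is what enlarges the convergence range.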
Implementation of geodesic active contours is easier compared to parametric
active contours but they suffer from speed limitations. This is because the up-
date of an implicit contour requires the update of at least a narrow band around
each contour. Furthermore, on parametric contours, vertex sampling may not be
constant or uniform, whereas on implicit contours the resolution is constrained by
the resolution of the regular grid. In the recent past, many attempts have been made to improve the speed of geodesic active contours. Speed-up algorithms for
the level set methods were proposed either by constraining the contour evolution
through the Fast-Marching method (Sethian, 1999) or by the asynchronous up-
date of the narrow-band (Paragios and Deriche, 1998). In (Paragios and Deriche,
1998), Paragios et al. apply some heuristic steps to perform asynchronous update
of the narrow-band. Weickert (Weickert, 1998) uses a multiresolution approach
to overcome the problem of speed in level set methods. Goldenberg et al. in
(Goldenberg et al., 2001) introduce a new method to maintain the numerical con-
sistency and make the geodesic active contour model computationally efficient.
They achieve this by: (1) canceling the limitation on the time step in the nu-
merical scheme, (2) limiting the computation to a narrow band around the active
contour, and (3) applying an efficient re-initialization technique. Their method
combines the narrow band level set method, with adaptive operator splitting and
the fast marching. Li et al. (Li et al., 2005a) present a variational formulation
Table 2.2: Properties of parametric and geometric (geodesic) active contours (taken from Delingette and Montagnat, 2001)

                         Parametric active contours   Geodesic active contours
Efficiency               Good                         Poor
Ease of implementation   Easy                         Moderately difficult
Topology change          No                           Yes
Open contours            Yes                          No
Interactivity            Good                         Poor
for geodesic active contours without re-initialization. Their approach can be eas-
ily implemented using a simple finite difference scheme and is computationally
very efficient compared to traditional level set methods. Table 2.2 summarizes the
properties of the parametric and geometric (geodesic) active contours.
2.3 Active Contour based Textured Object Seg-
mentation
The problem of textured object(s) segmentation (sometimes referred to simply as texture segmentation in this section) has been studied by many researchers (Paragios and Deriche, 1999b,a; Sagiv et al., 2000; Paragios and Deriche, 2002b; Sagiv et al., 2006; Lu and Pavlidis, 2007). In general, texture segmentation algorithms combine the following four major components:
1. First, a texture representation space is selected. Common choices are windowed Fourier transforms, the Gabor representation (Jain and Farrokhnia, 1991), discrete wavelet transforms (Chang and Kuo, 1993; Laine and Fan, 1993; Arivazhagan and Ganesan, 2003; Bashar et al., 2003), local histograms (Hofmann et al., 1998), the local structure tensor (Rousson et al., 2003), etc.

2. In the second step, texture features are extracted, e.g. the magnitude of the response of the Gabor filters, wavelet coefficients, or particular moments which are calculated from local histograms (Jain and Farrokhnia, 1991; Lu et al., 1997; Sagiv et al., 2001).

3. In the third step, a measure of the characteristic texture features is defined. The measure indicates how much variability is characteristic of the texture. Kullback-Leibler divergence, mutual information, gradients and other distance measures are typical examples for this stage.

4. Finally, some objective function is defined using the texture features, and the segmentation is formulated as an optimization, minimization or clustering problem. In the case of region-based algorithms, the third and fourth stages become inseparable.
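As an example of the third component, a discrete Kullback-Leibler divergence between local texture histograms can serve as the texture-change measure; the smoothing constant `eps` is an ad hoc guard against empty histogram bins.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    # D(p || q) between two normalized local histograms: near zero when the
    # two windows contain the same texture, large across a texture edge.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
```

Evaluating this between the histograms of two windows on either side of a candidate boundary gives a per-pixel texture-discontinuity score that the fourth component can then optimize over.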
In this section, we review some of the textured object segmentation methods
proposed in the literature. We focus on those methods which are based on active
contours. In section 2.5, we present a short summary on the texture feature
extraction and representation methods.
As we have seen in section 2.2, in the formulations for both types of active
contour (parametric and geometric), the external image force (traditionally) is
derived from edge or image gradient information. In the presence of such an
external force, the snake is attracted towards large image gradients, i.e. towards the edges in the image. However, this makes the models unfit for finding boundaries of objects with complex large-scale texture patterns, due to the presence of large variations in image gradient and many local edges inside the object. So if it is applied to textured images, it will often get stuck on local texel (micro-units or cells of a texture) edges and converge at a non-object boundary. In the case of
textured object segmentation, the snake should be able to find the points where the texture characteristics change. There have been efforts to address this problem
and many active contour based methods have been suggested in the literature in
the recent past to solve this problem (Zhu and Yuille, 1996; Paragios and Deriche,
1999b,a; Sagiv et al., 2000; Paragios and Deriche, 2002b; Pujol and Radeva, 2004;
Sagiv et al., 2006). These methods incorporate the texture features of the image
to determine the texture boundary.
Zhu et al. (Zhu and Yuille, 1996) proposed an approach called region com-
petition which performs texture segmentation by combining region growing and
active contours using multi-band input after applying a set of texture filters. The
method assumes multivariate Gaussian distributions on the filter response vector
inputs.
Geodesic Active Regions (Paragios and Deriche, 1999b) deal with supervised
texture segmentation in a frame partition framework using a level-set implementation of a deformable model. There are, however, several assumptions in this supervised method: the number of regions in an image must be known beforehand, and the statistics for each region are learned off-line using a mixture of Gaussian approximations. These assumptions limit the applicability of the
method to a large variety of natural images.
Paragios and Deriche in (Paragios and Deriche, 1999a) proposed an approach
based on geodesic active contour using Gabor filters, analyzing their responses as
multi-component conditional probability density functions. The texture segmen-
tation is obtained here by minimizing a geodesic active contour model objective
function, where boundary based information is expressed via discontinuities on
the statistical space associated with the multi-modal texture feature space.
Another texture segmentation approach in a deformable model framework (Pu-
jol and Radeva, 2004) is based on deforming an improved active contour model (Xu
and Prince, 1998a,b) on a likelihood map instead of heuristically constructed edge
map. However, because of the artificial neighborhood operations, the results from
this approach suffer from blurring of the likelihood map, which causes the detected object boundary to be a "dilated" version of the true boundary. The dilated zone could be small or large depending on the neighborhood size parameter.
Lorigo et al. (Lorigo et al., 1998) extended vector-valued geodesic active con-
tour (Sapiro, 1996, 1997) for texture segmentation. Texture information in their
approach is incorporated into active contour framework through the use of vector-
valued geodesic active contour with local variance as a second value at each pixel,
in addition to intensity.
Sandberg et al. (Sandberg et al., 2002) applied a “vector-valued active contour
without edges” (Chan et al., 2000) mechanism to the Gabor filtered images. Sagiv
et al. in (Sagiv et al., 2000, 2006) use Beltrami framework (Sochen et al., 1998)
based multi-valued geodesic active contour algorithm in the Gabor feature space.
The presence of a texture edge in their approach is estimated by viewing the Gabor feature space as a manifold. The determinant of this manifold's metric is interpreted as a measure of the presence of a gradient on the manifold.
2.4 Graph-cut Based Methods
Boykov et al. (Boykov and Jolly, 2001) proposed a new technique for general pur-
pose interactive segmentation of N-dimensional images. In their technique, the user marks certain pixels as "object" or "background" to provide hard constraints for
segmentation. Additional soft constraints incorporate both boundary and region
information. The technique then uses graph-cuts to find the globally optimal segmen-
tation of an N-dimensional image. The obtained solution gives the best balance of
boundary and region properties among all possible segmentations satisfying the
constraints. The topology in this segmentation method is unrestricted and both
“object” and “background” segments may consist of several isolated parts. The
effectiveness of formulating the object segmentation problem via binary graph-
cuts is demonstrated by a large number of recent publications in computer vision
and graphics that directly build upon the basic concept outlined by Boykov et
al. (Boykov and Jolly, 2001). Researchers have extended the graph-cut based object segmentation technique (Boykov and Jolly, 2001) in a number of interesting directions. Here we briefly review a few of them.
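The energy these methods minimize has a common two-term form: regional (unary) costs derived from the user's hard/soft constraints, plus boundary penalties between neighboring pixels with different labels. A sketch of evaluating such an energy for a candidate labeling follows; the actual methods find the minimizing labeling via max-flow/min-cut rather than by enumeration.

```python
def segmentation_energy(labels, unary, pairwise):
    # labels[p] in {0, 1} (background / object); unary[p][l] is the regional
    # cost of giving pixel p label l; pairwise maps neighbor pairs (p, q)
    # to the penalty paid when labels[p] != labels[q].
    regional = sum(unary[p][l] for p, l in enumerate(labels))
    boundary = sum(w for (p, q), w in pairwise.items() if labels[p] != labels[q])
    return regional + boundary
```

The balance between the two sums is exactly the "best balance of boundary and region properties" that the globally optimal cut achieves.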
Geo-cuts (Boykov and Kolmogorov, 2003; Kolmogorov and Boykov, 2005)
incorporate geometric cues for the segmentation. It combines geodesic active
contours and graph-cuts and produces segmentation by finding global minimum
geodesic active contours.
GrabCut (Rother et al., 2004) uses regional cues based on Gaussian mixture
models (Blake et al., 2004) for improved interactivity. GrabCut provides a con-
venient way to encode color features as segmentation cues to obtain foreground
segmentation from local pixel similarities using modified iterated graph-cuts.
Lazy snapping (Li et al., 2004) separates coarse and fine scale processing,
making object specification and detailed adjustment easy. It provides instant
visual feedback, snapping the cutout contour to the true object boundary efficiently despite the presence of ambiguous or low contrast edges. Instant feedback
is made possible by a novel image segmentation algorithm which combines graph-
cut (Freedman and Zhang, 2005) with pre-computed over-segmentation.
Obj-cut (Kumar et al., 2005) integrates high-level contextual information. It
presents a principled Bayesian method for detecting and segmenting instances of
a particular object category within an image, providing a coherent methodology
for combining top down and bottom up cues.
Other important graph-cut based methods include multi-level and banded
methods (Lombaert et al., 2005; Xu et al., 2003; Juan and Boykov, 2006), bi-
nary segmentation using stereo cues (Kolmogorov et al., 2005), efficient algorithms
for dynamic applications (Kohli and Torr, 2005; Juan and Boykov, 2006) (flow-
and cut-recycling), extraction of moving or foreground objects from video (Li
et al., 2005b; Wang et al., 2005), simultaneous segmentation of multiple objects
(Li et al., 2006), efficient N-D image segmentation (Boykov and Funka-Lea, 2006)
and methods for solving surface evolution PDEs (Boykov et al., 2006).
2.5 Texture Representation and Analysis
In this section, the various feature extraction methods (as depicted in Fig. 2.1) used in the literature for texture representation are reviewed.
Fig. 2.1: Categorization of filtering techniques used for texture representation: multiresolution techniques, spatial/spatial-frequency techniques, and Markov random fields.
2.5.1 Texture Feature Extraction using Multiresolution Methods
Due to variability in the size of texels in a texture image, it is often difficult to define an optimal resolution for feature extraction a priori. One efficient way to represent different image details is to re-organize the image into a number of subsampled approximations at different resolutions (P. P. Raghu, 1995). This is called a multiresolution representation. This scheme analyses the coarse image details first and gradually increases the resolution to analyze the finer details. In analyzing natural textures, feature extraction using a multiresolution scheme is appropriate in order to capture features of variable sizes. In this section, we review a few multiresolution methods which have been used for texture feature extraction.
Discrete Wavelet Transform (DWT)
The discrete wavelet transform analyses a signal based on its content in different frequency ranges at varying scales. Therefore it is very useful in analyzing repetitive patterns such as texture. The wavelet transform is expressed as a decomposition of a signal f(x) ∈ L²(R) into a family of functions which are translations and dilations of a mother wavelet function Ψ(x). Employing the definition

    Ψ_s(x) = √s Ψ(sx − a),        (2.1)

the wavelet transform of f(x) is defined by

    Wf(s, a) = ∫_{−∞}^{+∞} f(x) √s Ψ(sx − a) dx        (2.2)
where s, a ∈ R indicate the scale and translation parameters respectively. Since the continuous wavelet transform is redundant, it is discretized by sampling the parameters s and a. The most common choice is s = 2^i and a = n/2^i; i, n ∈ Z. Inserting these values in equation 2.2 yields the DWT of the signal f(x), as:

    W_d f(i, n) = ⟨f(x), ψ_{2^i}(x − n2^{−i})⟩        (2.3)

where ⟨·, ·⟩ denotes the inner product. Since some existing wavelets ψ(x) ∈ L²(R) constitute an orthonormal basis {ψ_{2^i}(x − n2^{−i}); i, n ∈ Z}, this transform is called an orthogonal wavelet transform.
Introducing the so-called scaling function φ(x), the interscale coefficients g(n) with high-pass (HP) characteristics and h(n) with low-pass (LP) characteristics, it is possible to decompose a signal f(x) using the following L-level decomposition scheme, as

    f(x) = Σ_n c_{0,n} φ(x − n)
         = Σ_{n=−∞}^{+∞} c_{L,n} φ_{2^{−L}}(x − n2^L) + Σ_{i=1}^{L} Σ_{n=−∞}^{+∞} d_{i,n} ψ_{2^{−i}}(x − n2^i)        (2.4)

which is a finite approximation of equation 2.3. The coefficients c_{i,n} and d_{i,n} are obtained by

    c_{i,n} = Σ_k c_{i−1,k} h(k − 2n)        (2.5)

    d_{i,n} = Σ_k c_{i−1,k} g(k − 2n)        (2.6)
which is the same as convolving the signal c_{i−1,n} with the impulse responses

    h̃(n) = h(−n),  g̃(n) = g(−n)        (2.7)

respectively, and subsequently discarding every other sample. The 2-D transform
uses a family of wavelet functions and its associated scaling function to decompose the original image into different channels, namely the low-low, low-high, high-low and high-high (A, V, H, D respectively) channels. The decomposition process can be recursively applied to the low frequency (LL) channel to generate a dyadic decomposition at the next level. Methods of texture segmentation based on the discrete wavelet transform have been widely used. Mallat (Mallat, 1989) treated the wavelet transform as a texel decomposition where each texel corresponds to a particular wavelet basis function. Mallat used a pyramidal algorithm based on convolutions
with quadrature mirror filters for computing the wavelet transforms. This initial proposal has been followed by several studies on texture classification with a
particular attention to the use of wavelet packets, which constitute a multiband
extension of pyramid structured wavelet transform.
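The analysis recursion of Eqs. 2.5-2.7 (filtering with the time-reversed low-pass and high-pass filters, then discarding every other sample) can be sketched as a small filter bank. The sketch below is illustrative rather than the thesis implementation: it uses the orthonormal Haar pair as stand-in filters, and the function names are our own.

```python
import numpy as np

# Haar analysis filters: an orthonormal stand-in pair (an assumption;
# the thesis later uses Daubechies filters).
H = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass h(n)
G = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass g(n)

def analysis_step(c, filt):
    """One filter-bank step of Eqs. 2.5-2.6:
    out[n] = sum_k c[k] * filt[k - 2n], i.e. filtering followed
    by dyadic downsampling."""
    n_out = len(c) // 2
    out = np.zeros(n_out)
    for n in range(n_out):
        for m, fm in enumerate(filt):
            k = 2 * n + m
            if k < len(c):
                out[n] += c[k] * fm
    return out

def dyadic_decompose(signal, levels):
    """L-level dyadic decomposition: recursively split the
    approximation channel, returning [A_L, D_L, ..., D_1]."""
    c = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        details.append(analysis_step(c, G))
        c = analysis_step(c, H)
    return [c] + details[::-1]

x = np.array([4.0, 6.0, 10.0, 12.0])
A1 = analysis_step(x, H)
D1 = analysis_step(x, G)
# orthonormality preserves energy: ||x||^2 == ||A1||^2 + ||D1||^2
```

With the Haar pair, each approximation coefficient is the scaled pairwise average and each detail coefficient the scaled pairwise difference; substituting Daubechies coefficients for H and G recovers the kind of decomposition used later in the thesis.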
Carter (Carter, 1991) used Morlet and Mexican hat wavelets for texture feature extraction. The wavelets used by Carter were not orthogonal, and the Mexican hat wavelets lacked spatial orientation selectivity. Laine et al. (Laine and Fan, 1993) used wavelet packets, which provide orthogonal and compactly supported wavelets with orientation selectivity. Chang et al. (Chang and Kuo, 1993) used a tree-structured wavelet transform for the classification of textured images. Some of the issues involved in using the wavelet transform for texture analysis are the choice of wavelet basis prototypes and the selection of dilations, translations and rotations.
A method based on a hierarchical wavelet decomposition technique was introduced by Salari and Ling (Salari and Ling, 1995), where Daubechies 4-tap filters are used to decompose the original image into three detail and one approximate sub-band images, followed by K-means clustering for segmentation.
Lu et al. (Lu et al., 1997) proposed a method of unsupervised texture description using the wavelet transform. The proposed methodology has four stages. The first stage computes a smoothed local energy of the wavelet coefficients in high-frequency bands, as features for segmentation. The second stage performs a coarse segmentation using a multi-thresholding technique. In the third stage, the features at different orientations and scales are fused by intra-scale and inter-scale fusion respectively. In the last stage, ambiguously labeled pixels are reclassified in a fine segmentation step. Segmentation results at various scales are integrated by inter-scale fusion to determine the number of classes.
Charalampidis and Kasparis (Charalampidis and Kasparis, 2002) used a set
of new roughness features for texture segmentation. Wavelets are used to ex-
tract single-scale and multiple-scale texture roughness features. These are then
transformed to a rotational invariant feature vector, which has the information of
texture direction.
The M-band wavelet transform is a direct generalization of conventional wavelets. M-band wavelets are able to zoom in onto narrowband high frequency components of a signal and are reported to give better energy compaction than 2-band wavelets. M-band wavelets are a set of M−1 basis functions whose scaled and translated versions form a tight frame for the set of square integrable functions defined over the set of real numbers. Chitre et al. (Chitre and Dhawan, 1999) presented an M-band wavelet technique for the discrimination of natural textures. The M-band wavelet filters were designed using a genetic algorithm based search over the Householder set of parameters. Twenty different categories of textures were decomposed using complete and overcomplete representations, and features were computed on the decomposed sub-bands. A KNN classifier was used to discriminate 20 different combinations of test and training sets containing images of arbitrary size. The results indicate that M-band wavelets discriminate well between natural textures of varying sizes. Acharyya et al. (Acharyya and Kundu, 2000) used M-band wavelets for two-texture segmentation.
Wang et al. (Wang and Zhang, 2003) proposed a supervised texture segmentation algorithm based on the wavelet transform for remote sensing applications. The feature extraction step extracts pixel neighborhood properties using a discrete wavelet frame. A quadrant noise filtering method based on contextual/spatial relationships between detail images was used to acquire a more accurate estimate of the feature space for texture segmentation. The estimated feature vector of each pixel is sent to a Bayes classifier to make an initial probabilistic labeling. To obtain a more accurate segmentation, a probabilistic relaxation method is used to introduce spatial constraints into the segmentation algorithm. The results are shown on a set of satellite images.
Wang et al. (Wang and Liu, 1999) proposed a multiresolution MRF (MRMRF) based model to describe textures. The subbands of textures decomposed using the wavelet transform are modeled with MRFs, and the corresponding Gibbs clique potential parameters are used as features for classification. A comparison of the pyramidal and tree-structured wavelets with Gabor filtering approaches was presented by Pichler et al. (Olaf Pichler and Hosticka, 1996). Results show that both wavelet-based methods are sub-optimal for feature extraction, because the center frequency, orientation and bandwidth cannot be selected freely. The paper concludes that Gabor filtering outperforms the wavelet-based approaches in special cases, but is computationally more expensive.
Gaussian and Laplacian Pyramids
Burt and Adelson (Burt and Adelson, 1983) used Gaussian and Laplacian pyramids to decompose the original image into steps of low-pass and high-pass components respectively. Each stage of the Gaussian pyramid is computed by low-pass filtering the output of the previous stage. Subsequently, the filtered output is subsampled by a factor 2^j, where j is an integer. Subtraction of two corresponding adjacent levels of the Gaussian pyramid gives an approximation of the Laplacian of Gaussian. The details are grouped into a pyramidal structure called the Laplacian pyramid.
This pyramid approach has the advantage of fast computation compared to other transforms. However, the information at different levels is correlated, and this correlation is difficult to model. Further, these pyramids do not possess spatial orientation selectivity in the decomposition process, which is a severe limitation in analyzing oriented textures. Textures with a middle range of frequencies are also difficult to characterize with the pyramids.
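The reduce (blur + subsample) and expand steps described above can be sketched as follows. This is a minimal numpy illustration under our own assumptions (a 5-tap binomial kernel, edge-replication padding, and our function names), not the implementation of Burt and Adelson:

```python
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # binomial ~ Gaussian

def _filter2(img, k):
    """Separable 2-D filtering with edge-replication padding."""
    p = np.pad(img, 2, mode='edge')
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, 'valid'), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, 'valid'), 0, rows)

def reduce_(img):
    """One Gaussian-pyramid stage: low-pass filter, then subsample by 2."""
    return _filter2(img, KERNEL)[::2, ::2]

def expand(img, shape):
    """Upsample by 2 (zero interleaving) and interpolate by filtering."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return _filter2(up, 2.0 * KERNEL)

def laplacian_pyramid(img, levels):
    """Gaussian pyramid gp and band-pass Laplacian pyramid lp, where
    lp[j] = gp[j] - expand(gp[j+1]) approximates a Laplacian of Gaussian."""
    gp = [np.asarray(img, dtype=float)]
    for _ in range(levels):
        gp.append(reduce_(gp[-1]))
    lp = [gp[j] - expand(gp[j + 1], gp[j].shape) for j in range(levels)]
    lp.append(gp[-1])  # coarsest approximation closes the pyramid
    return gp, lp
```

Because each Laplacian level stores exactly what the expand step loses, the original image is recovered exactly by expanding the coarsest level back up and adding the stored details.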
2.5.2 Texture Feature Extraction using Spatial/Spatial-frequency Techniques
Discrete Cosine Transform (DCT)
The DCT transforms a signal from a spatial representation into a frequency representation. Because of its fast implementation and good results, the DCT is widely used in image compression algorithms such as JPEG. The DCT can decompose the image into spectral sub-bands having different importance with respect to the visual quality of the image. An N × 1 DCT basis vector u_m is expressed as

    u_m(k) = √(1/N);                                  m = 1
    u_m(k) = √(2/N) cos[(2k − 1)(m − 1)π / 2N];       m = 2, ..., N        (2.8)
These 1-D DCT vectors can then be used to generate 2-D transform filters appropriate for images. To do so, we simply multiply the column basis vectors with the row vectors of identical length to produce a set of N² 2-D filters, where N is the vector length. Ng et al. (Ng et al., 1992) and Alim et al. (Abdel Alim and Sharkas, 2002) presented a comparative study of two approaches: Gabor filters and the DCT. They found that the DCT gives higher quality segmentation results than Gabor filters. Alim et al. achieved a recognition rate of 96% using DCT coefficients and 92% using Gabor coefficients on human iris image data.
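Eq. 2.8 and the outer-product construction of the N² two-dimensional filters can be sketched as follows (an illustrative numpy sketch; the function names are our own):

```python
import numpy as np

def dct_basis(N):
    """Rows are the N x 1 DCT basis vectors u_m of Eq. 2.8,
    rewritten with 0-based indices m = 0..N-1, k = 0..N-1."""
    U = np.zeros((N, N))
    for m in range(N):
        for k in range(N):
            if m == 0:
                U[m, k] = np.sqrt(1.0 / N)
            else:
                U[m, k] = np.sqrt(2.0 / N) * np.cos(
                    (2 * k + 1) * m * np.pi / (2.0 * N))
    return U

def dct_2d_filters(N):
    """Multiply column basis vectors by row basis vectors of identical
    length: a set of N^2 two-dimensional filters of size N x N."""
    U = dct_basis(N)
    return [np.outer(U[i], U[j]) for i in range(N) for j in range(N)]
```

The basis is orthonormal, so only the first (DC) vector responds to a constant signal, while the remaining vectors capture progressively higher spatial frequencies.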
Gabor Filter
A Gabor filter is a Gaussian modulated by a complex sinusoid. The Fourier transform of the Gabor filter is a Gaussian shifted in frequency space. A 2-D Gabor filter is a bandpass filter with tunable center frequency, orientation and bandwidth. A 2-D Gabor filter G can be expressed as

    G(x, y, ω, σ, θ) = (1 / 2πσ²) exp(−(1/2)((x² + y²)/σ²) + jω(x cos θ + y sin θ))        (2.9)

where ω is the frequency of the modulating sinusoid, σ is the spatial extent (width) of the Gaussian function and θ is the orientation (direction) of the spatial sinusoid.

Fig. 2.2: Real part of a 2-D Gabor filter with different combinations of scale (σ), orientation (θ) and frequency (ω): (a) σ = 6, θ = 0, ω = 2; (b) σ = 6, θ = 45, ω = 2.8.

Fig. 2.2 shows 2-D Gabor filters with different parameters. A 2-D Gabor filter can be considered to be equivalent to a 1-D Gabor filter applied in all directions. On the other hand, a 1-D Gabor filter is characterized only by the width of the Gaussian window and the frequency of the modulating sinusoid. Thus, a 1-D Gabor filter g can be expressed as

    g(x, ω, σ) = (1 / 2πσ²) exp(−x²/(2σ²) + jωx)        (2.10)
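The 2-D filter of Eq. 2.9 can be sampled on a discrete grid as in the sketch below (the grid size and function name are our own illustrative choices):

```python
import numpy as np

def gabor_2d(size, omega, sigma, theta):
    """Sample the 2-D Gabor filter of Eq. 2.9 on a size x size grid
    centred at the origin: a Gaussian envelope times a complex
    sinusoid oriented along theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = np.exp(-0.5 * (x**2 + y**2) / sigma**2) / (2 * np.pi * sigma**2)
    carrier = np.exp(1j * omega * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier

# parameters of Fig. 2.2(a): sigma = 6, theta = 0, omega = 2
g = gabor_2d(31, omega=2.0, sigma=6.0, theta=0.0)
```

Taking the real part of `g` reproduces the kind of even-symmetric filter shown in Fig. 2.2; varying θ rotates the stripes of the carrier within the fixed Gaussian envelope.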
Elementary Gabor signals were introduced by Gabor (Gabor, 1946) and were later extended to 2-D by Daugman (Daugman, 1985). Marcelja (Marcelja, 1980) introduced the Gabor filter as a mathematical representation of the receptive field profiles of visual cortical cells. When appropriately tuned, these filters have been found to be extremely useful for performing texture feature extraction and texture edge detection. Dunn et al. (Dunn et al., 1994) presented an algorithm to design specially tuned Gabor filters to segment images with bipartite textures. The parameter tuning of the Gabor filter bank is the key contribution of this approach.
Haley et al. (Haley and Manjunath, 1999) proposed rotation invariant texture classification using a complete space-frequency model. In this approach, they used a polar, truly analytic (frequency channel) form of 2-D Gabor functions and achieved rotation invariance by transforming the Gabor features into rotation invariant features (using autocorrelation and DFT magnitudes).
Rao et al. (Rao et al., 2004) proposed a method for texture segmentation by combining features obtained using the discrete wavelet transform (DWT) and a Gabor filter bank. These features are classified using an unsupervised fuzzy c-means (FCM) classifier. Results show that the combined features give superior classification performance in comparison to features extracted using only the DWT or Gabor filters.
CHAPTER 3
Textured Object Segmentation using Parametric
Active Contours
This chapter presents a novel technique for textured object segmentation using
parametric active contours, which are also known as snakes (Kass et al., 1988).
The proposed technique is based on the extension of parametric active contours
from the use of normal intensity based features to texture features. Parametric
active contours synthesize parametric curves within the image domain and allow
them to move towards the desired image features under the influence of internal
and external forces. The internal force serves to impose piecewise continuity and smoothness constraints, whereas the external force pushes the snake towards salient image features like edges, lines and subjective contours. The external force in the traditional snake is defined in terms of the image gradient. In the presence of such an external force, the snake is attracted towards large image gradients, i.e. towards the edges in the image. So if it is applied to textured images, it often gets stuck on local texel (the micro-units or cells of a texture) edges and converges at a non-object boundary. Fig. 3.1 shows an example of textured object segmentation using a normal intensity based snake. The segmentation result is shown in Fig. 3.1(b) and the desired result is shown in Fig. 3.1(c). In the segmentation result, we can see that the contour has latched to a non-object boundary (Fig. 3.1(b)). To overcome this effect, we here present a new formulation of the external force in active contours
Fig. 3.1: (a) Synthetic texture image with initial contour, (b) result obtained using a normal intensity based snake, and (c) desired segmentation result.
for textured images, which we name "texture force". The snake, in the presence of the texture force, runs over the texture image surface and detects the object boundary of a texture surface on a background texture. The texture force considers the texture properties of the image for modeling, in place of using the image pixel intensity values directly.
The rest of the chapter is organized as follows. Section 3.1 briefly presents the parametric active contour model and the scalogram (Clerc and Mallat, 2002), which are required for the formulation of the proposed technique. Section 3.2 describes a novel texture feature extraction technique based on the scalogram; we use it to extract the texture characteristics of the input texture images. Modeling of the texture force, the essential part of the modified parametric active contour, is presented in section 3.3. Experimental results are presented in section 3.4 and section 3.5 concludes the chapter.
3.1 Background
3.1.1 Parametric Active Contour (Snake) Model
A traditional active contour (Kass et al., 1988) is defined as a parametric curve C(q) : [0, 1] → R², which minimizes the following energy functional

    E = ∫₀¹ [ (1/2)(α|C′(q)|² + β|C′′(q)|²) + E_ext(C(q)) ] dq        (3.1)
where C′(q) and C′′(q) are the first and second order derivatives of C(q) respectively, and α, β are constants. The first two terms in Eqn. 3.1 are called the elastic and bending energies, which control the elasticity and the bending ability of the snake respectively. The relative importance of these two abilities is controlled by the constants α and β. Since the elastic and bending energies are derived from the contour itself, they constitute the internal component of the snake energy. E_ext, the last term in Eqn. 3.1, is derived from the image and is called the external energy component of the snake. It attracts the snake towards salient features in the image such as edges, object boundaries etc. For an image I(x, y), where (x, y) are spatial co-ordinates, a typical external energy is defined as follows to lead the snake towards step edges (Kass et al., 1988)

    E_ext = −|∇I(x, y)|²        (3.2)
where ∇ is the gradient operator. A snake that minimizes E must satisfy the following Euler-Lagrange equation (Elsgolc, 1963)

    αC′′(q) − βC′′′′(q) − ∇E_ext = 0        (3.3)
Eqn. 3.3 can also be viewed as a force balance equation

    F_int + F_ext = 0        (3.4)

where F_int = αC′′(q) − βC′′′′(q) and F_ext = −∇E_ext. F_int, the internal force, is responsible for stretching and bending, and F_ext, the external force, attracts the snake towards the desired features in the image.
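As a concrete illustration of Eqs. 3.2 and 3.4, the external force field can be computed on a pixel grid with finite differences. This is a sketch under our own naming, not code from the thesis:

```python
import numpy as np

def external_force(image):
    """Traditional snake external force: E_ext = -|grad I|^2 (Eq. 3.2)
    and F_ext = -grad E_ext (Eq. 3.4), evaluated with central
    differences at every pixel."""
    gy, gx = np.gradient(image.astype(float))   # image gradient
    e_ext = -(gx**2 + gy**2)                    # external energy
    fy, fx = np.gradient(-e_ext)                # F_ext = -grad E_ext
    return fx, fy

# a vertical step edge: the resulting force field points towards the edge
img = np.zeros((9, 9))
img[:, 5:] = 1.0
fx, fy = external_force(img)
```

On either side of the step, the x-component of the force points towards the edge, which is exactly why a snake placed near an object boundary is pulled onto it.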
3.1.2 Discrete Wavelet Transform and Scalogram
The discrete wavelet transform (DWT) analyses a signal based on its content in different frequency bandwidths. Therefore, it is very useful in analyzing repetitive patterns such as texture. The DWT decomposes a signal into different subbands (approximation and detail) with different resolutions in frequency and spatial extent. Let ξ(x) be the image signal and ψ_{u,s}(x) be a wavelet function at a particular scale. Then the signal filtered at point u is obtained by taking the inner product ⟨ξ(x), ψ_{u,s}(x)⟩. This inner product is called the wavelet coefficient of ξ(x) at position u and scale s (Mallat, 1999). The scalogram (Clerc and Mallat, 2002) of a signal ξ(x) is the variance of this wavelet coefficient:

    w(u, s) = E{|⟨ξ(x), ψ_{u,s}(x)⟩|²}        (3.5)
Fig. 3.2: (a) Synthetic texture image, (b) magnified view of the 21 × 21 window of the texture cropped at point P shown in (a).
In our work, w(u, s) has been approximated by convolving the square modulus of the filtered outputs with a Gaussian envelope of suitable width (Clerc and Mallat, 2002). w(u, s) gives the energy accumulated in a subband whose frequency bandwidth and center frequency are inversely proportional to the scale. We use scalogram based discrete wavelet features to model the texture characteristics of the image. The scalogram will have large energy in a particular subband if the signal has large spectral energy in the bandwidth of that scale/subband.
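Under the approximation above (square modulus of the filtered outputs smoothed by a Gaussian envelope), the scalogram can be sketched as below. The sample-repetition dilation and the scale-proportional smoothing width are crude illustrative stand-ins for the dyadic wavelet filter bank actually used:

```python
import numpy as np

def gaussian_smooth(x, sigma):
    """Convolve with a normalized Gaussian envelope of width sigma."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    g = np.exp(-0.5 * (t / sigma) ** 2)
    return np.convolve(x, g / g.sum(), mode='same')

def scalogram(signal, wavelet, n_scales):
    """Approximate w(u, s) of Eq. 3.5: filter the signal with dyadic
    dilations of `wavelet`, square the modulus, and smooth with a
    Gaussian envelope; one row per scale."""
    rows = []
    for i in range(n_scales):
        # crude dilation by 2**i via sample repetition (illustrative)
        w = np.repeat(wavelet, 2 ** i) / np.sqrt(2.0 ** i)
        coeff = np.convolve(signal, w, mode='same')
        rows.append(gaussian_smooth(np.abs(coeff) ** 2, sigma=2.0 ** i))
    return np.array(rows)
```

A rapidly oscillating signal accumulates its energy in the small-scale rows, while a slow oscillation accumulates energy at coarse scales; this is precisely the property the texture features of the next section rely on.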
3.2 Texture Feature Extraction
In this section, we explain how the wavelet transform is used to extract the texture features necessary for texture force estimation, and discuss the computational framework based on multi-channel processing. We use a DWT-based dyadic decomposition of the signal to obtain texture properties. The simulated texture image shown in Fig. 3.2(a) is used to illustrate the proposed computational framework, with the results of intermediate stages of processing.
Fig. 3.3: (a) 1-D texture profile of the texture window shown in Fig. 3.2(b), (b) scalogram of the signal shown in (a), (c) texture feature image for the image shown in Fig. 3.2(a).

Modeling of texture features at a point in an image involves two steps: scalogram estimation and texture feature estimation. To obtain texture features at a
particular point (pixel) in an image, an n × n window is considered around the point of interest (see Fig. 3.2(b), which shows the neighborhood of a point P in Fig. 3.2(a)). Intensities of the pixels in this window are arranged in the form of a 1-D vector of length n², whose elements are taken column-wise (or row-wise) from the n × n cropped intensity matrix. This intensity vector (signal), which essentially represents the textural pattern around the pixel, is subsequently used in the estimation of the scalogram.
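The windowing step above can be sketched as a one-line flatten (the function name is ours; `order='F'` gives the column-wise arrangement mentioned in the text):

```python
import numpy as np

def texture_profile(image, row, col, n):
    """Arrange the n x n neighbourhood of (row, col) into a
    length-n^2 1-D texture profile, taken column-wise."""
    half = n // 2
    window = image[row - half:row + half + 1, col - half:col + half + 1]
    return window.flatten(order='F')   # column-wise, as in the text

img = np.arange(25).reshape(5, 5)
profile = texture_profile(img, 2, 2, 3)
```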
3.2.1 Scalogram estimation
An input signal, obtained after arranging the pixels of an n × n window as explained above, is used for scalogram estimation. This signal is decomposed using a wavelet filter. We use an orthogonal Daubechies 2-channel wavelet filter (with dyadic decomposition) for this purpose. A Daubechies filter with level-L dyadic decomposition yields the wavelet coefficients {A_L, D_L, D_{L−1}, ..., D_1}, where A_i represents the approximation coefficients and the D_i's are the detail coefficients. The steps of processing to obtain the scalogram from the wavelet coefficients are similar to those described in (Greiner and Das, 2006; Rao et al., 2006). Fig. 3.3(b) presents an example of the scalogram obtained for the signal shown in Fig. 3.3(a) using level-4 DWT decomposition.
3.2.2 Texture feature estimation
Once the scalogram of the texture profile at a particular point is obtained, a post-processing step is carried out to eliminate non-significant subbands of the scalogram; only significant subbands are used for further processing. This is done since the significant subbands contain the information of the major texture features. Removal of non-significant subbands removes redundant information and makes the computation fast. Wavelet level-L decomposition gives the following set of wavelet subbands B = {A_L, D_L, D_{L−1}, ..., D_1}. Let B_i be the i-th wavelet subband in set B. We use Algorithm 1, given below, to determine significant and non-significant subbands.
Algorithm 1 Identification of significant subbands

for each B_i ∈ B do
    if variance of subband B_i ≤ threshold then
        B_i is a non-significant subband
    else
        B_i is a significant subband
    end if
end for
The value of the threshold is decided empirically. The variance of subband B_i is defined as var(B_i) = E[(B_i − µ_i)²], where µ_i is the mean of all wavelet coefficients belonging to subband B_i. The significant subbands detected from the scalogram are used for texture feature estimation. Texture features are estimated from the "energy measure" of the scalogram coefficients of the significant subbands. This texture feature is similar to the "texture energy measure" first proposed by Laws (Laws, 1980).

For a pixel k in image I, let D_k be the set of all significant subbands of the scalogram S. Then the energy measure of pixel k is estimated as the l1-norm of the scalogram coefficients belonging to the significant subbands, given as follows:

    E_k = (1/N) Σ_{i∈D_k} Σ_j S_{(i,j)}        (3.6)
where S_{(i,j)} is the j-th element of the i-th subband of the scalogram S, and N is the sum of the cardinalities of the members of D_k. This energy measure, computed for all pixels in an image, constitutes the "texture feature image", which is further used in texture force modeling. Fig. 3.3(c) shows the texture feature image computed from the image shown in Fig. 3.2(a). We can see that pixels belonging to a particular texture region exhibit similar energy levels. More examples of the same follow in section 3.4.
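Algorithm 1 and the energy measure of Eq. 3.6 can be sketched together as follows (an illustrative sketch; subbands are passed as 1-D coefficient arrays and the names are our own):

```python
import numpy as np

def significant_subbands(subbands, threshold):
    """Algorithm 1: a subband is significant when the variance of its
    coefficients exceeds the empirically chosen threshold."""
    return [b for b in subbands if np.var(b) > threshold]

def energy_measure(subbands, threshold):
    """Eq. 3.6: l1-norm of the scalogram coefficients over the
    significant subbands, normalized by their total count N."""
    sig = significant_subbands(subbands, threshold)
    if not sig:
        return 0.0
    total = sum(np.sum(np.abs(b)) for b in sig)
    count = sum(len(b) for b in sig)
    return total / count
```

Computing this measure at every pixel yields the texture feature image used for the texture force in the next section.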
3.3 Modeling of Texture Force
To make the active contour sensitive to texture boundaries, we exploit the gradient present in the texture feature image. For a given texture image I(x, y), let f(x, y) be the texture feature image obtained by the method explained in the previous section. The external energy of the snake, based on the gradient present in the texture feature image, can be defined as follows (similar to Eqn. 3.2)

    E_ext^tex = −|∇f(x, y)|²        (3.7)
Hence, the modified total energy of the snake, E^tex, in the case of texture, can be given by replacing E_ext by E_ext^tex in Eqn. 3.1 as follows

    E^tex = ∫₀¹ [ (1/2)(α|C′(q)|² + β|C′′(q)|²) + E_ext^tex(C(q)) ] dq        (3.8)
A snake that minimizes E^tex would satisfy the following Euler-Lagrange equation

    αC′′(q) − βC′′′′(q) − ∇E_ext^tex = 0        (3.9)

Viewing Eqn. 3.9 as a force balance equation, we can write

    F_int + F_ext^tex = 0        (3.10)

where F_int = αC′′(q) − βC′′′′(q) is the same internal force as for a snake applied to a normal intensity image. F_ext^tex defines the external force in the texture based model of the snake, and we call it the "texture force". The snake, in the presence of the texture force, is attracted towards the texture object boundary. The texture force is the negative gradient of the external energy E_ext^tex:

    F_ext^tex = −∇E_ext^tex        (3.11)
To find the object boundary, the active contour deforms; it can therefore be represented as a time varying curve C(q, t) = [x(q, t), y(q, t)], where q ∈ [0, 1] is the arc-length and t ∈ R⁺ is time. The dynamics of the contour are governed by the following equation, obtained by setting the partial derivative of C(q, t) w.r.t. t equal to the left hand side of Eqn. 3.9:

    C_t(q, t) = αC′′(q) − βC′′′′(q) − ∇E_ext^tex
              = F_int + F_ext^tex        (3.12)
When the solution C(q, t) stabilizes, the term C_t(q, t) vanishes and we obtain a solution of Eqn. 3.9. This dynamic equation can also be viewed as a gradient descent algorithm (Cohen et al., 1992) designed to minimize Eqn. 3.8. A solution to Eqn. 3.12 can be found by discretizing the equation and solving the discrete system iteratively (Kass et al., 1988). In the following section, results of the proposed technique are presented.
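The discretized evolution can be sketched with an explicit gradient-descent update. The thesis follows the implicit scheme of Kass et al.; the explicit variant below is an illustrative simplification with our own names and step sizes:

```python
import numpy as np

def internal_force(C, alpha, beta):
    """F_int = alpha*C'' - beta*C'''' for a closed contour, using
    circular finite differences (C is an n x 2 array of snake points)."""
    d2 = np.roll(C, -1, axis=0) - 2 * C + np.roll(C, 1, axis=0)
    d4 = np.roll(d2, -1, axis=0) - 2 * d2 + np.roll(d2, 1, axis=0)
    return alpha * d2 - beta * d4

def evolve(C, ext_force, alpha=0.1, beta=0.01, dt=0.1, steps=500):
    """Iterate C <- C + dt * (F_int + F_ext), the discretized
    gradient-descent form of Eq. 3.12."""
    for _ in range(steps):
        C = C + dt * (internal_force(C, alpha, beta) + ext_force(C))
    return C

# with zero external force, the internal force smooths and shrinks
# a closed contour towards its centroid
theta = np.linspace(0, 2 * np.pi, 32, endpoint=False)
C0 = np.stack([5 * np.cos(theta), 5 * np.sin(theta)], axis=1)
C1 = evolve(C0, lambda C: np.zeros_like(C), beta=0.0, steps=200)
```

In use, `ext_force` would sample the texture force field of Eq. 3.11 at the snake points; equilibrium is reached when it balances the internal force at every point on the contour.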
3.4 Experimental Results
To automatically detect the boundary of a particular object using active contours
in presence of texture, a contour is initialized near the desired object. The contour
is then allowed to deform towards the object boundary until it latches around
the object. In case of textured images, object boundary is identified as the point
where texture property changes i.e. where two texture regions meet. On a texture
surface, snake in presence of texture force stops moving as it reaches a different
texture region. For a snake to stop at the texture boundary, net effect of the
47
(a) (b) (c)
Fig. 3.4: Results of our proposed technique: first five texture images are composedof two Brodatz textures (Brodatz, 1966) and the last image is a natural Zebra image,(a) input texture images, (b) corresponding texture feature images, (c) segmentationresults. Contour shown in images depicts the estimated boundary of the foregroundobject with texture.
48
damping, internal and external forces (texture force) should be zero for all snake
points at the object boundary. To demonstrate the performance of the snake in
the presence of texture force, various kinds of synthetic and natural textures are
used. We have used Daubechies 8-tap 2-channel filter for DWT decomposition in
all our experiments.
Experimental results are shown in Figs. 3.4 - 3.6. Fig. 3.4 shows results on a few synthetic texture images and one natural image. The first column in this figure shows the input texture images. The first five input images are composed of two Brodatz (Brodatz, 1966) textures and the last image is a natural image of a Zebra. The second column of Fig. 3.4 shows the texture feature images for the respective texture images. One can note that the energy values exhibited in the texture feature images are distinct for the two different texture regions. The last column of Fig. 3.4 shows the segmentation results using the proposed technique. To generate the texture feature images of these input texture images, we considered an 11 × 11 window at each pixel. DWT decomposition was done up to level-4. The computational time required to perform object segmentation in the images shown in Fig. 3.4 is given in Table 3.1.
Table 3.1: Computational time required for the segmentation of textured objects in images shown in Fig. 3.4. Images are numbered from top to bottom in Fig. 3.4.

Image Name    Image Size (in pixels)    Computational Time (in seconds)
Image-1       197 × 155                 58
Image-2       197 × 155                 60
Image-3       197 × 155                 60
Image-4       102 × 94                  26
Image-5       197 × 155                 59
Image-6       170 × 145                 46
Fig. 3.5 shows comparative results on one synthetic and a few natural images. Each row in the figure gives a comparative study of the results: we first show the results available in the literature, followed by the result obtained using our proposed technique. We can see that the results obtained using our proposed technique are better than the results available in the literature in most of the cases.

Fig. 3.6 explicitly shows the comparative study on the image of a Zebra that occurs quite often in the texture segmentation literature (Sagiv et al., 2002; Rousson et al., 2003; Sagiv et al., 2006; Awate et al., 2006). On careful observation, our segmentation result (Fig. 3.6(f)) is superior to the results obtained using other techniques (Sagiv et al., 2002; Rousson et al., 2003; Sagiv et al., 2006; Awate et al., 2006; Gupta and Das, 2006). The results reported in the literature show errors in one or more of the following regions of the object: the mouth, the back, and the area near the legs.
3.5 Summary and Discussion
In this chapter, a new texture feature extraction technique is described. This technique is based on the scalogram (Clerc and Mallat, 2002), which is computed using the discrete wavelet transform. We also introduce a new external force for snakes, which we call the "texture force". The texture force is modeled using the scalogram based texture features. The snake, in the presence of texture force, is used for textured object segmentation. We show the segmentation results of the parametric active contour in the presence of texture force on various synthetic and natural texture
Fig. 3.5: Comparative results. (I) Results on a synthetic texture image: (a) from Sagiv et al. (2006), (b) our result. (II) Results on the Zebra-1 image: (a) from Paragios and Deriche (2002b), (b) from Rousson et al. (2003), (c) our result. (III) Results on the Zebra-2 image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) our result. (IV) Results on the Cheetah image: (a) from Kim et al. (2002), (b) from Rousson et al. (2003), (c) result of the proposed method.

Fig. 3.6: Comparative results on the Zebra image: (a) reproduced from Sagiv et al. (2002), (b) reproduced from Rousson et al. (2003), (c) reproduced from Awate et al. (2006), (d) reproduced from Sagiv et al. (2006), (e) obtained using the method presented in Gupta and Das (2006), (f) result of the proposed method.
images. We also compare a few of our results with the results of other techniques available in the literature. We can see that in most of the cases our proposed method provides better performance than the other texture object segmentation techniques proposed in the literature.
Since the textured object segmentation technique proposed in this chapter is based on the parametric active contour, which does not intrinsically deal with the segmentation of multiple objects, the technique can only handle the segmentation of a single object. The segmentation of multiple textured objects is discussed in the next chapter.
52
CHAPTER 4
Segmentation of Multiple Textured Objects
using Geodesic Active Contours
In the previous chapter, we extended parametric active contours to textured object segmentation. Parametric active contours have many advantages over geodesic active contours, such as straightforward implementation, computational efficiency, better user interactivity and fast convergence. However, their inability to automatically handle topological changes of the contour during the evolution makes them unfit for multiple object segmentation. Several algorithms have been proposed to overcome this limitation of parametric active contours (Leitner and Cinquin, 1991; Szeliski et al., 1993; McInerney and Terzopoulos, 1996; Lachaud and Montanvert, 1999; Delingette and Montagnat, 2001). In most of these techniques, a procedure is set up to monitor the deformation of the contour and to change the topology when required. Heuristics are also often used for detecting possible splitting and merging of the deforming contours.
Figure 4.1 shows an example of multiple object segmentation using parametric active contours: a deforming contour intersecting with itself. In such cases, the parametric active contour requires a special strategy that monitors the deformation of the contour and, in case of self-intersection, handles the splitting of the contour. Handling topological changes of parametric active contours through special procedures and heuristics is a problem when an unknown number of objects must be detected simultaneously. The parametric active contour approach is also non-intrinsic, since the energy of the contour depends on the parametrization of the curve and is not directly related to the object geometry.
The above mentioned problems of parametric active contours are intrinsically solved by the geodesic active contour proposed by Caselles et al. (1997). Since geodesic active contours can intrinsically handle topological changes during the process of segmentation, in this chapter we extend the geodesic active contour algorithm to the segmentation of multiple textured objects in the presence of a background texture. Our algorithm is based on the generalization of the geodesic active contour model from a one-dimensional intensity based feature space to a multi-dimensional feature space (Sapiro, 1997). In our approach, the image is represented in an n-dimensional texture feature space which is derived from the image using the scalogram (Clerc and Mallat, 2002) obtained from the discrete wavelet transform. We use the geodesic active contour mechanism for textured object segmentation by generalizing its edge indication (stopping) function from an intensity based feature space to a texture feature space. Details are presented in section 4.3. Similar approaches, where the geodesic active contour scheme is applied to some feature space of the image, have been studied in the literature (Sagiv et al., 2000; Lorigo et al., 1998; Paragios and Deriche, 1999b; Sagiv et al., 2006).
The rest of the chapter is organized as follows. Section 4.1 briefly discusses the geodesic active contour, which provides essential background for the formulation of the proposed technique. In Section 4.2, we present a novel texture feature extraction technique to extract multi-dimensional texture features of the input image. Section 4.3 presents the modified geodesic active contour for segmentation of multiple textured objects. In section 4.4, we present experimental results, and we conclude the chapter in section 4.5.

Fig. 4.1: Problem of segmentation of multiple objects using a parametric active contour.
4.1 Geodesic Active Contour
In this section, we briefly review the geodesic active contour technique (Caselles
et al., 1997) for non-textured images. Generalization of the technique for segmen-
tation of multiple textured objects is described in section 4.3.
Let C(q) : [0, 1] → R² be a parameterized curve, and let I : [0,m] × [0,n] → R⁺ be the image in which we want to detect the object boundaries. Let g(r) : [0,∞) → R⁺ be an inverse edge detector, such that g → 0 as r → ∞. g represents the edges in the image and has a fundamental role in the success of the geodesic active contour mechanism; if it does not represent the edges well, the mechanism is likely to fail. In the geodesic active contour technique, minimization of the energy functional proposed in the classical snakes (Kass et al., 1988) is generalized to finding a geodesic curve in a Riemannian space (Caselles et al., 1997) with a metric derived from the image, by minimizing the
following functional:

    L_R = ∫ g(|∇I(C(q))|) |C′(q)| dq        (4.1)
L_R is a new definition of length (called the geodesic length) in the Riemannian space. This new length can be considered as a weighted length of a curve, where the Euclidean length element is weighted by a factor g(|∇I(C(q))|), which contains information regarding the boundaries (edges) in the image. To find this geodesic curve, the steepest gradient descent method is used, which gives the following curve evolution equation to obtain a local minimum of L_R. A complete geometric interpretation can be found in (Caselles et al., 1997).
    dC/dt = g(|∇I|) k N − (∇g · N) N        (4.2)

where k denotes the Euclidean curvature and N is the unit inward normal to the curve.
Let us define a function u : [0,m] × [0,n] → R such that the curve C is parameterized as a level set of u, i.e. C = {(x, y) | u(x, y) = 0}. Now, we can use the Osher-Sethian level set approach (Osher and Sethian, 1988) and replace the above evolution equation for the curve C with an evolution equation for the embedded function u as follows:
    du/dt = |∇u| div( g(∇I) ∇u / |∇u| ) = g(∇I) |∇u| k + ∇g · ∇u        (4.3)
where div is the divergence operator. The stopping function g(∇I) is generally given
Fig. 4.2: Geometric interpretation of the attraction force in 1-D: (a) the original edge signal I, (b) the smoothed version of I, and (c) the derived stopping function g. The evolving contour is attracted to the valley created by ∇g · ∇u. (Figure taken from Caselles et al., 1997.)
by

    g(∇I(x, y)) = 1 / (1 + |∇I(x, y)|^p)

where p is an integer, usually equal to 1 or 2. The goal of g(∇I) is to stop the evolving curve when it reaches the object boundary. For an ideal edge, ∇I is very large, hence g = 0 at the edge and the curve stops (u_t(x, y) = 0). The boundary is then given by u(x, y) = 0.
The term ∇g · ∇u in Eqn. 4.3 has a special significance. It attracts the curve towards the boundaries of objects and plays an important role in cases where the intensity gradient varies along the edge (object boundary), as often happens in real images. Fig. 4.2 shows g and its gradient vectors for a 1-D case. We can observe how the gradient vectors are directed towards the middle of the boundary. These vectors direct the propagating curve into the “valley” of the g function and do not allow it to move out once it falls in the valley.
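The evolution of Eqn. 4.3 can be prototyped with simple finite differences. The sketch below is our own minimal illustration, not the thesis implementation: it assumes a precomputed stopping function g on the pixel grid, and the explicit Euler step size dt is an assumption.

```python
import numpy as np

def evolve_level_set(u, g, dt=0.2, n_iter=100):
    """Explicit scheme for Eqn. 4.3: du/dt = g |grad u| k + grad g . grad u,
    where k is the curvature of the level sets of u."""
    gy, gx = np.gradient(g)                      # components of grad g
    for _ in range(n_iter):
        uy, ux = np.gradient(u)                  # components of grad u
        norm = np.sqrt(ux**2 + uy**2) + 1e-8     # |grad u|, regularized
        # curvature k = div(grad u / |grad u|)
        nxy, nxx = np.gradient(ux / norm)
        nyy, nyx = np.gradient(uy / norm)
        k = nxx + nyy
        u = u + dt * (g * norm * k + gx * ux + gy * uy)
    return u
```

With a constant g this reduces to curvature flow, which shrinks a circular contour; with an edge-based g the term ∇g · ∇u pulls the contour into the valleys of g, as described above.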
4.2 Multi-dimensional Texture Feature Extraction
This section explains a novel technique for multi-dimensional texture feature extraction using the scalogram (Clerc and Mallat, 2002). The technique involves two main steps: scalogram estimation and texture feature estimation. The procedure to obtain the scalogram at a particular point (pixel) is similar to that explained in section 3.2.1. The following subsection presents the texture feature estimation technique used here to extract a multi-dimensional texture feature.
4.2.1 Multi-dimensional Texture Feature Estimation
Once the scalogram of the texture profile at a particular point is obtained, it is used
for multi-dimensional texture feature estimation. Texture features are estimated
from the “energy measure” of the coefficients of the scalogram subbands. This
texture feature is similar to the “texture energy measure” first proposed by Laws (1980).
Let E be the texture energy image for the input texture image I. E defines a functional mapping from the 2-D pixel coordinate space onto a multi-dimensional energy space Γ, i.e. E : [0,m] × [0,n] → Γ. For the k-th pixel in I, let D_k be the set of all subbands of a scalogram S and E_k ∈ Γ be the texture energy vector associated with it. In the simplest case, E_k can be a 1-D energy measure (similar to that presented in the previous chapter), which can be estimated as follows:

    E_k = (1/N) Σ_{i∈D_k} Σ_j S(i,j)        (4.4)
where S(i,j) is the j-th element of the i-th subband of the scalogram S, and N is the sum of the cardinalities of all the members of D_k. This energy measure takes the l1 norm of the coefficients belonging to all subbands of the scalogram computed for the pixel k, and has the advantage of being simple to compute. However, it does not always represent the textural information well. A better texture energy space Γ can be created by taking the l1 norm of each subband of the scalogram S separately. In this case, Γ is an n-dimensional energy space where n = L + 1 for a level-L decomposition. Formally, the i-th element of the energy vector E_k ∈ Γ is given as follows:

    E_(k,i) = (1/N_i) Σ_j S(i,j)        (4.5)

where i indexes the i-th scalogram subband of the set D_k and N_i is the cardinality of this subband. The texture energy image computed using Eqn. 4.5 is a multi-dimensional image and provides more discriminative information for estimating the texture boundaries. This texture energy measure constitutes a texture feature image f.
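For illustration, Eqn. 4.5 amounts to a mean absolute coefficient value per subband. A minimal NumPy sketch, assuming the subband coefficients are already available as a list of arrays (the variable and function names are ours):

```python
import numpy as np

def subband_energies(subbands):
    """Eqn. 4.5: the i-th texture feature is the l1 norm of the i-th
    subband's coefficients divided by the subband's cardinality N_i."""
    return np.array([np.abs(np.asarray(s, dtype=float)).sum() / np.asarray(s).size
                     for s in subbands])
```

For a level-L decomposition this yields the (L + 1)-dimensional energy vector E_k at each pixel.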
One common problem in texture segmentation is the precise and efficient detection of the boundary. A pixel near the texture boundary has neighboring pixels belonging to different texture regions. In addition, a textured image may contain non-homogeneous, non-regular texture regions. This causes the obtained energy measure to deviate from its “expected” value. Hence, it is necessary
that the obtained feature image be further processed to remove noise and outliers. To do so, we apply a smoothing operation to the texture energy image in every band separately. In our smoothing method, the energy measure of the k-th pixel for a particular band is replaced by the average of the block of energy measures centered at pixel k in that band. In addition, in order to reduce block effects and to reject outliers, the p percent largest and smallest energy values within the window block are excluded from the calculation. Thus, the smoothed texture feature value of pixel k in the i-th band of the feature image is obtained as:

    f_(k,i) = [1 / (w²(1 − 2p%))] Σ_{j=1}^{w²(1−2p%)} E_(k,j)        (4.6)
where the E_(k,j)'s are the energy measures retained within the w × w window centered at pixel k, for the i-th band of the texture energy image. The window size w × w and the value of p are chosen experimentally to be 11 × 11 and 10, respectively, in our experiments. The texture feature image f, computed by smoothing the texture energy image E as explained above, is used in the computation of texture edges using the inverse edge indicator function described in the next section.
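The smoothing of Eqn. 4.6 is a trimmed mean over a w × w window. A sketch for a single interior pixel, with our own function and argument names:

```python
import numpy as np

def trimmed_window_mean(band, row, col, w=11, p=0.10):
    """Eqn. 4.6: average the w*w energy values centered at (row, col),
    excluding the p fraction largest and p fraction smallest values."""
    h = w // 2
    block = np.sort(band[row - h:row + h + 1, col - h:col + h + 1].ravel())
    n_drop = int(round(block.size * p))   # values trimmed at each end
    return block[n_drop:block.size - n_drop].mean()
```

Applying this at every pixel of every band of E yields the feature image f; an outlier such as a single spuriously large energy inside the window is discarded rather than averaged in.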
4.3 Geodesic Active Contours in Texture Feature Space
We employ the geodesic active contour technique in the scalogram based texture feature space, using the generalized inverse edge detector function g proposed in (Sochen et al., 1998; Sagiv et al., 2000). The geodesic active contour, in the presence of a texture feature based inverse edge detector g, is attracted towards the texture boundary. Sochen et al. (1998) suggest that an image can be described as a 2-D Riemannian manifold, representing the spatial extent of the image, embedded in a higher-dimensional feature space via the Beltrami framework (Kimmel et al., 1998). They show that the determinant of the metric of the 2-D image manifold can be interpreted as a measure of the presence of a gradient on the manifold, so the metric of this surface can be used as an indicator of the edges present in the image. They also show that if the metric of the higher-dimensional embedding space is known, it can be used to derive the metric of the lower-dimensional space using the pullback mechanism (Sochen et al., 1998; Sagiv et al., 2000).
Let X : Σ → M be an embedding of Σ in M, where M is a Riemannian manifold with a known metric and Σ is another Riemannian manifold with an unknown metric. As proposed in (Sochen et al., 1998), the metric on Σ can be constructed from the knowledge of the metric on M using the pullback mechanism. If Σ is a 2-D image manifold embedded in the n-dimensional manifold of the texture feature space f(x, y) = (f¹(x, y), ..., fⁿ(x, y)), the metric h(x, y) of the 2-D image manifold can be obtained from the embedding texture feature space as follows (Sochen et al., 1998):

    h(x, y) = [ 1 + Σi (f^i_x)²     Σi f^i_x f^i_y   ]
              [ Σi f^i_x f^i_y      1 + Σi (f^i_y)²  ]        (4.7)
As discussed above, the determinant of the metric h provides a good indicator of the gradient present in the image manifold. If the embedding space is created using the image texture features, the metric h can provide information about the texture edges present in the image. So, given the texture features, we can derive the metric of the image manifold embedded in that feature space and use it, as described, to create the edge indication function.
It turns out that the inverse of the metric's determinant serves as a good edge indicator. Hence the stopping function g used in the geodesic active contour for texture boundary detection can be given as the inverse of the determinant of the metric h (Sagiv et al., 2000):

    g(x, y) = 1 / det(h(x, y))        (4.8)

where det(h) is the determinant of h.
In our segmentation approach, we describe the input image as a manifold embedded in the scalogram based texture feature space and derive the metric of the image manifold using Eqn. 4.7. Then we use Eqn. 4.3, with g obtained using Eqn. 4.8, for the segmentation of the textured object(s) from a background texture.
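Eqns. 4.7 and 4.8 can be evaluated directly from the feature gradients. The following sketch is our own illustration, assuming the n feature bands are stacked in an array of shape (n, H, W):

```python
import numpy as np

def stopping_function(features):
    """Build the pullback metric h of Eqn. 4.7 at every pixel and return
    the stopping function g = 1/det(h) of Eqn. 4.8."""
    features = np.asarray(features, dtype=float)
    h11 = np.ones(features.shape[1:])   # 1 + sum_i (f^i_x)^2
    h22 = np.ones(features.shape[1:])   # 1 + sum_i (f^i_y)^2
    h12 = np.zeros(features.shape[1:])  # sum_i f^i_x f^i_y
    for f in features:
        fy, fx = np.gradient(f)
        h11 += fx**2
        h22 += fy**2
        h12 += fx * fy
    return 1.0 / (h11 * h22 - h12**2)   # det(h) >= 1, so g lies in (0, 1]
```

In flat texture regions det(h) = 1 and g = 1; across a texture edge the feature gradients grow, det(h) increases, and g falls towards zero, which is exactly the stopping behavior required.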
4.3.1 Segmentation of multiple textured objects
Curve evolution in the modified geodesic active contour technique presented here, for segmentation of multiple textured objects, is performed using the level set approach (Osher and Sethian, 1988). In the level set approach, a 2-D contour is represented implicitly as the zero level set of a 3-D surface (called the level set surface). Deformation of the contour is carried out by deforming the level set surface (see Eqn. 4.2 and Eqn. 4.3). We use scalogram based texture features to
Fig. 4.3: Demonstration of segmentation of multiple objects using the texture feature based geodesic active contour method: (a) evolution of the level set surface (LSS), (b) evolving zero level set contour. The first row shows the initial LSS and zero level set contour; the last row shows the final state of the LSS and zero level set contour after convergence.
Fig. 4.4: A close look at the process of level set surface (LSS) evolution and contour splitting, from a different 3-D viewing angle, for the segmentation of the multiple textured objects shown in Fig. 4.3. Figures (a)-(h) show the evolving LSS with the zero level set contour (shown in red on the surface); (a) shows the initial LSS and (h) shows the final LSS after segmentation.
model the deformation of a level set surface for segmentation of multiple textured objects. Contour evolution using the level set approach makes our approach topologically independent, because different topologies of the zero level set contour do not imply different topologies of the level set surface. Evolving contours therefore naturally split and merge, allowing the simultaneous detection (segmentation) of several textured objects; the number of objects to be segmented in the scene need not be known a priori. At the boundary of a textured object, the value of g (defined in Eqn. 4.8) vanishes and the contour stops.
Fig. 4.3 demonstrates the segmentation of multiple textured objects using the proposed approach. Fig. 4.3(a) shows the evolving level set surface and Fig. 4.3(b) shows the evolving zero level set contour on the texture image (from top to bottom). The zero level set contour is also shown on the level set surface in red. Fig. 4.4 provides a different 3-D viewing angle on the evolving level set surface, for segmentation of the objects in the simulated image shown in Fig. 4.3(b). Since the level set surface need not change its topology to split the evolving contour, it can intrinsically represent multiple split contours. Section 4.4 presents textured object segmentation results for synthetic and natural images.
4.4 Experimental Results
We have applied the proposed technique to both synthetic and natural texture images to show its effectiveness. For an input image, the texture feature space is created using the scalogram, as discussed in section 4.2. In all of our experiments, we use
Eqn. 4.5 for texture energy estimation. We use the orthogonal Daubechies 2-channel wavelet filter (with dyadic decomposition) for signal decomposition.

Fig. 4.5: Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by the inverse edge detector (Eqn. 4.8) using the scalogram based texture features, (c) input image with the initial contour marked around the objects in black, (d)-(j) intermediate positions of the evolving contour during the segmentation process, (k) segmentation result where the segmented object boundaries are shown in black.

Fig. 4.6: Segmentation result of a synthetic image: (a) input image, (b) texture edge map produced by the inverse edge detector using the scalogram based texture features, (c) input image with the initial contour marked around the objects in black, (d)-(j) intermediate positions of the evolving contour during the segmentation process, (k) segmentation result where the segmented object boundaries are shown in black.

The
metric of the image manifold is computed by considering the image manifold embedded in the higher-dimensional scalogram based texture feature space. This metric is used to obtain the texture edge detector, which serves as the stopping term in the geodesic active contour mechanism. Initialization of the geodesic active contour is done using a signed distance function. To generate texture features of the images in our experiments, we considered a 12 × 12 window at each pixel; DWT decomposition was done up to level-4.
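The signed distance initialization mentioned above can be sketched for a circular initial contour; the grid, center and radius arguments are illustrative, and the sign convention (negative inside) is a choice:

```python
import numpy as np

def circle_signed_distance(shape, center, radius):
    """Level set surface u0 whose zero level set is the initial circular
    contour: negative inside, zero on the circle, positive outside."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return np.sqrt((xx - center[1])**2 + (yy - center[0])**2) - radius
```

Because u0 is a distance function, |∇u0| ≈ 1 initially, which keeps the early level set updates well behaved.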
To start the segmentation, an initial contour is placed around the object(s) to be segmented. The contour moves towards the object boundary to minimize the objective function L_R (Eqn. 4.1) in the presence of the new g (Eqn. 4.8). Often, to speed up computation and to prevent the contour from getting stuck at spurious edges, the contour is initialized near the object boundary (e.g. Fig. 4.7(a), last row). Segmentation results obtained using the proposed technique are shown in Figs. 4.5-4.10.
Figs. 4.5-4.6 show results on two synthetic images. Figs. 4.5-4.6(a) show the
images to be segmented and Figs. 4.5-4.6(b) show the respective outputs of the
inverse edge detector (Eqn. 4.8) computed using the scalogram based texture
features for the input images. Figs. 4.5-4.6(c) and (k) show the positions of
the initial contour and the final segmentation output respectively. Intermediate
evolving contours are shown in Figs. 4.5-4.6(d) to (j).
Fig. 4.7 shows segmentation results on natural images. The input texture images are shown with the initial contour (in black) marked around the object(s) to be segmented (Fig. 4.7(a)). Texture edge maps of the images, computed by the inverse edge detector (Eqn. 4.8) using scalogram based texture features, are shown in Fig. 4.7(b). Final segmentation results are shown in Fig. 4.7(c), where the identified object boundaries are marked in black. For one natural image (the Zebra image shown in the second row of Fig. 4.7(a)), we show the evolution process of the contour in Fig. 4.8. Results of the proposed approach are encouraging. The proposed approach works well even when the background has more than one texture region (see Fig. 4.7, last row). The computational time required to perform object segmentation in the images shown in Fig. 4.7 is given in Table 4.1.
Table 4.1: Computational time required for the segmentation of textured objects in the images shown in Fig. 4.7. Images are numbered from top to bottom in Fig. 4.7.

    Image Name    Image Size (in pixels)    Computational Time (in seconds)
    Image-1       226 × 392                 300
    Image-2       200 × 300                 250
    Image-3       226 × 372                 330
    Image-4       210 × 300                 260
In Figs. 4.9 and 4.10, we compare the performance of the proposed textured object segmentation method with that of other techniques available in the literature. Fig. 4.9 shows comparative results on two synthetic texture images, while Fig. 4.10 shows comparative results on two natural images. Our results for both types of images are quite encouraging: in most cases, they are comparable to or better than the results obtained using other techniques.
In Fig. 4.11, we compare the results obtained using the geodesic active contour based technique (presented here) with those of the technique described in the previous chapter for the segmentation of a single textured object (using the parametric active contour). We

Fig. 4.7: Segmentation results on natural images: (a) input images with initial contours marked around the object(s) in black, (b) texture edge maps produced by the inverse edge detector (Eqn. 4.8) using the scalogram based texture features for the respective images, (c) segmentation results for the images shown in column 1, where the boundaries of the segmented objects are shown in black.
Fig. 4.8: Contour evolution in the segmentation of the Zebra image: (a) input image with the initial contour marked around the objects in black, (b)-(i) intermediate positions of the evolving contour during the segmentation process, (j) segmentation result where the boundaries of the segmented objects are shown in black.
Fig. 4.9: Comparative results: (I) (a) input image, (b) result reproduced from Kim et al. (2002), (c) our result; (II) (a) input image, (b) result reproduced from Paragios and Deriche (2002a), (c) our result.
see that the results obtained using the technique presented in this chapter are better than the results presented in the previous chapter. This is mainly due to two reasons:

1. The multi-dimensional texture feature provides more discriminative information about the texture regions and helps the inverse edge detector to estimate the texture edges more precisely. In contrast, in the case of the parametric active contour, a scalar texture feature was obtained by taking the average of all significant subbands (see section 3.2).

2. The algorithm presented in this chapter for texture object segmentation uses the geodesic active contour, which provides better boundary localization (Caselles et al., 1997) and produces smooth contours.
The overall computational cost of texture feature extraction and segmentation is in the range of 70 to 90 seconds on a P-IV 3 GHz machine with 2 GB RAM, for images of size 100 × 100.
Fig. 4.10: Comparative results: (I) results on Zebra image: (a) reproduced from Paragios and Deriche (2002b), (b) reproduced from Rousson et al. (2003), (c) produced by our proposed technique; (II) results on Cheetah image: (a) reproduced from Kim et al. (2002), (b) reproduced from Rousson et al. (2003), (c) result produced by our proposed technique.
4.5 Summary and Discussion
In this chapter, we first described a novel multi-dimensional scalogram based texture feature extraction technique, and then used the obtained texture features to derive an inverse edge indication function. This inverse edge indication function is used along with the geodesic active contour to perform segmentation of multiple textured objects in the presence of background texture. In the segmentation approach presented in this chapter, the input image is processed to obtain a multi-dimensional texture feature image, thereby representing the image in a multi-dimensional feature space. The edge indication (stopping) function used in the geodesic active contour is derived from the texture feature space of the image by viewing the feature space as a manifold (Sochen et al., 1998). The geodesic active contour, in the presence of this edge indication function, stops at the texture boundary. Since the geodesic active contour can handle topological changes in
Fig. 4.11: Comparison of single textured object segmentation results obtained using the parametric active contour based technique (presented in section 3.3) and the geodesic active contour based technique (presented in section 4.3): (a) results obtained using the parametric active contour based technique, (b) results obtained using the geodesic active contour based technique.
the contour intrinsically, the segmentation technique presented here can handle the segmentation of multiple textured objects simultaneously. The main contributions of this chapter are: (1) the development of a new scalogram based multi-dimensional texture feature with strong texture discriminating power, and (2) the use of this feature to define a good texture edge indicator function, which is used in the geodesic active contour for segmentation of multiple textured objects.
We validated our technique using various synthetic and natural texture images, and compared a few of our results with those of other techniques available in the literature. The segmentation results obtained are quite encouraging and accurate for both synthetic and natural images. From Fig. 4.11, we observe that the results obtained for single texture object segmentation using the technique presented in this chapter are better than those obtained in chapter 3.
CHAPTER 5
SnakeCut: An Automatic Technique for
Segmentation of a Foreground Object with Holes
This chapter proposes an efficient, semi-interactive method for foreground object segmentation in color images, based on the integration of two popular foreground object segmentation techniques: parametric active contours (Snakes) (Kass et al., 1988) and GrabCut (Rother et al., 2004). As seen in chapter 3, the Snake is a deformable contour which segments an object boundary using boundary discontinuities, by minimizing the energy function associated with the contour. GrabCut is an interactive tool based on iterative graph-cut for foreground object segmentation in still images. GrabCut provides a convenient way to encode color features as segmentation cues, obtaining a foreground segmentation from local pixel similarities using modified iterated graph-cuts (Boykov and Jolly, 2001). GrabCut has been applied in many applications for foreground extraction from an image (Haasch et al., 2005; Moller et al., 2005; Deepti et al., 2007). In this chapter, we first present a comparative study of these two segmentation techniques and illustrate conditions under which either or both of them fail. We then propose a novel formulation for integrating these two complementary techniques to obtain an automatic segmentation of a foreground object with holes. We call the proposed integrated approach “SnakeCut”; it is based on a probabilistic framework.
The rest of this chapter is organized as follows. In section 5.1, we briefly present the active contour (for color images) and GrabCut techniques, which provide the theoretical basis for the chapter. Section 5.2 compares the two techniques and discusses the limitations of both. In section 5.3, we present the SnakeCut algorithm, our proposed technique for segmentation of a foreground object. Section 5.4 presents results on simulated and natural images. We conclude the chapter in section 5.5.
5.1 Preliminaries
5.1.1 Parametric Active Contour (Snake) Model for Color Images¹
Active contours are energy minimizing contours. The energy associated with the contour is defined using internal (derived from the contour) and external (derived from the image) parameters, and the contour minimizes this energy to estimate the object boundary. We extend the traditional active contour algorithm to color images by modifying its external energy function. In a traditional active contour for gray level images, a typical external energy, defined to lead the Snake towards step edges, is (Kass et al., 1988):

    E_ext = −|∇I(x, y)|²        (5.1)

where I(x, y) is an image with (x, y) as spatial coordinates. So the external energy in gray level images depends on the intensity gradient present in the image.

¹ Details of the Snake technique have been presented in section 3.1.1.
Fig. 5.1: Estimation of gradient in color and gray level images: (a) input image, (b) gradient image of (a) estimated using Eqn. 5.2, (c) gray scale version of (a), (d) gradient image of (c).
To define the external energy in color images, we estimate the intensity gradient by taking the maximum of the gradients of the R, G and B bands at every pixel:

    |∇I| = max(|∇R|, |∇G|, |∇B|)        (5.2)

Alternatively, the external energy of the contour in color images can be defined by first converting the color image to gray level and then using Eqn. 5.1, but this method does not accurately represent the edges present in the image. Fig. 5.1(b) shows an example of the intensity gradient estimated using Eqn. 5.2 for the image shown in Fig. 5.1(a). Fig. 5.1(d) shows the intensity gradient for the same input image estimated after converting it to a gray level image (Fig. 5.1(c)). We can see that the gradient obtained using Eqn. 5.2 gives better edge information. In this work, we use the Snake energy (Eqn. 3.1) for the segmentation of colored objects, where the external energy component of the Snake is estimated using Eqn. 5.2. One can also explore the use of various other color edge detectors, such as Cumani (1991); Toivanen et al. (2003); Evans and Liu (2006), to estimate the external energy of the Snake.
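Eqn. 5.2 can be sketched in a few lines of NumPy; the function name and the use of central differences via np.gradient are our choices, not the thesis implementation:

```python
import numpy as np

def color_gradient_magnitude(rgb):
    """Eqn. 5.2: per-pixel gradient magnitude, taken as the maximum of the
    gradient magnitudes of the R, G and B bands."""
    mags = []
    for c in range(rgb.shape[-1]):
        gy, gx = np.gradient(rgb[..., c].astype(float))
        mags.append(np.sqrt(gx**2 + gy**2))
    return np.maximum.reduce(mags)
```

The external energy of Eqn. 5.1 then generalizes to `E_ext = -color_gradient_magnitude(img)**2`, so an edge visible in any single band contributes to the Snake's external force.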
5.1.2 GrabCut
GrabCut (Rother et al., 2004) is an interactive tool based on iterative graph-cut (Boykov and Jolly, 2001) for foreground extraction in still images. To segment a foreground object using GrabCut, the user selects an area of interest (AOI) with a rectangle to obtain the desired result. GrabCut extends the graph-cut based segmentation technique introduced by Boykov and Jolly (2001) using color information. In this section, we briefly discuss the GrabCut process; more details can be obtained from (Rother et al., 2004).
Consider image I as an array z = (z1, ..., zn, ..., zN) of pixels, indexed by the
single index n, where zn is in RGB space. Segmentation of the image is expressed
as an array of “opacity” values α = (α1, ..., αn, ..., αN) at each pixel. Generally
0 ≤ αn ≤ 1, but for hard segmentation αn ∈ {0, 1} with 0 for background and 1 for
foreground. For the purpose of segmentation, GrabCut constructs two separate
Gaussian mixture models (GMMs) to express the color distributions for the back-
ground and foreground. Each GMM, one for foreground and one for background,
is taken to be a full-covariance Gaussian mixture with K components. In order to deal with the GMM tractability in an optimization framework, an additional vector k = (k1, ..., kn, ..., kN) is taken, with kn ∈ {1, ..., K}, assigning to each pixel a unique GMM component from the background or the foreground model, according as αn = 0 or 1.
GrabCut defines an energy function E such that its minimum corresponds to a good segmentation, in the sense that it is guided by the observed foreground and background GMMs and that the opacity is “coherent”. This is captured by a “Gibbs” energy of the following form:
E(α,k, θ, z) = U(α,k, θ, z) + V (α, z) (5.3)
The data term U evaluates the fit of the opacity distribution α to the data z. It
takes into account the color GMM models, defined as
U(α, k, θ, z) = ∑n D(αn, kn, θ, zn) (5.4)
where,
D(αn, kn, θ, zn) = − log p(zn | αn, kn, θ) − log π(αn, kn) (5.5)
Here, p(.) is a Gaussian probability distribution, and π(.) are mixture weighting
coefficients. Therefore, the parameters of the model are now
θ = {π(α, k), µ(α, k), Σ(α, k); α = 0, 1; k = 1, ..., K}
where, π, µ and Σ’s represent the weights, means and covariances respectively of
the 2K Gaussian components for the background and the foreground distributions.
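The data term of Eqn. 5.5 can be sketched in a few lines of numpy for a single pixel and its assigned full-covariance component. This is an illustrative implementation, not the thesis code; the function name `data_term` and its argument layout are hypothetical.

```python
import numpy as np

def data_term(z, pi_k, mu_k, sigma_k):
    """D(alpha_n, k_n, theta, z_n) of Eqn. 5.5 for one pixel z and its
    assigned GMM component with weight pi_k, mean mu_k, covariance sigma_k."""
    d = z - mu_k
    inv = np.linalg.inv(sigma_k)
    _, logdet = np.linalg.slogdet(sigma_k)
    # log of the full-covariance Gaussian density
    log_gauss = (-0.5 * (d @ inv @ d)
                 - 0.5 * logdet
                 - 0.5 * len(z) * np.log(2.0 * np.pi))
    return -log_gauss - np.log(pi_k)
```

A pixel far from the component mean, or assigned to a component with a small mixture weight, receives a larger cost, which is exactly what the minimization needs.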
In Eqn. 5.3, the term V is called the smoothness term and is given as follows:
V (α, z) = γ ∑(m,n)∈R (1/dist(m, n)) [αn ≠ αm] exp(−β‖zm − zn‖²) (5.6)
where, [φ] denotes the indicator function taking values {0, 1} for a predicate φ, γ is
a constant, R is the set of neighboring pixels, and dist(.) is the Euclidean distance
(a) (b) (c) (d)
Fig. 5.2: (a) Input image, elliptical object present in the image contains a rectangular hole at the center, (b) foreground initialization by user, (c) active contour segmentation result, and (d) GrabCut segmentation result.
of neighboring pixels. This energy function in Eqn. 5.6 encourages coherence in
the regions of similar color distribution.
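The smoothness term of Eqn. 5.6 can be sketched for a 4-connected grid as follows. This is an illustrative numpy implementation, not the thesis code; following Rother et al. (2004), β is set to 1/(2⟨‖zm − zn‖²⟩), and dist(m, n) = 1 for 4-connected neighbours.

```python
import numpy as np

def smoothness_term(z, alpha, gamma=50.0):
    """Pairwise smoothness energy V of Eqn. 5.6 over horizontal and
    vertical neighbour pairs of an H x W x 3 image z with labels alpha."""
    z = z.astype(float)
    # Squared colour differences for right and down neighbour pairs.
    dr = np.sum((z[:, 1:] - z[:, :-1]) ** 2, axis=-1)
    dd = np.sum((z[1:, :] - z[:-1, :]) ** 2, axis=-1)
    # beta as suggested by Rother et al. (2004): 1 / (2 <||zm - zn||^2>).
    beta = 1.0 / (2.0 * np.mean(np.concatenate([dr.ravel(), dd.ravel()])) + 1e-12)
    # Indicator [alpha_m != alpha_n] restricts the penalty to label
    # discontinuities; dist(m, n) = 1 for 4-connected neighbours.
    vr = (alpha[:, 1:] != alpha[:, :-1]) * np.exp(-beta * dr)
    vd = (alpha[1:, :] != alpha[:-1, :]) * np.exp(-beta * dd)
    return gamma * (float(vr.sum()) + float(vd.sum()))
```

A uniform labeling pays nothing; label boundaries that cross strong colour edges are penalized less than boundaries placed inside uniform regions, which is what makes the cut prefer object edges.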
Once the energy model is defined, segmentation can be estimated as a global
minimum:
α̂ = arg minα E(α, θ) (5.7)
Energy minimization in GrabCut is done by using standard minimum cut
algorithm (Boykov and Jolly, 2001). Minimization follows an iterative procedure
that alternates between estimation and parameter learning.
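As an illustration of the min-cut step (not the solver used in the thesis, which follows Boykov and Jolly), a toy three-pixel s-t graph can be solved with networkx. The capacities below are made-up numbers standing in for the data term U (t-links) and the smoothness term V (n-links); a pixel ending on the source side of the minimum cut is labeled foreground.

```python
import networkx as nx

# Three pixels in a row; (D_bg, D_fg) are illustrative data costs.
unary = {'p0': (9.0, 1.0), 'p1': (5.0, 4.0), 'p2': (1.0, 9.0)}

G = nx.DiGraph()
for p, (d_bg, d_fg) in unary.items():
    G.add_edge('S', p, capacity=d_bg)  # paid if p lands on the background side
    G.add_edge(p, 'T', capacity=d_fg)  # paid if p lands on the foreground side
for a, b in [('p0', 'p1'), ('p1', 'p2')]:
    G.add_edge(a, b, capacity=2.0)     # n-links in both directions
    G.add_edge(b, a, capacity=2.0)

cut_value, (src_side, _) = nx.minimum_cut(G, 'S', 'T')
labels = {p: int(p in src_side) for p in unary}  # 1 = foreground
```

Here p0 and p1 prefer foreground and p2 prefers background strongly enough to pay the n-link penalty, so the cut separates p2 from the rest.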
5.2 Comparison of Active Contour and GrabCut
Methods
Active contour relies on the presence of intensity gradients (boundary discontinuities) in the image, so it is a good tool for estimating object boundaries. An evolving contour searches for places of high intensity gradient (edges) in the image and latches onto them. But since the contour cannot penetrate inside the object
boundary, it is not able to remove the undesired parts, say holes, present inside
the object boundary. If an object has holes in it, active contour will detect the
holes as part of the object. Fig. 5.2(c) shows one such segmentation example us-
ing active contour for a synthetic image shown in Fig. 5.2(a). Input image (Fig.
5.2(a)) contains a foreground object which has a rectangular hole at the center,
through which the gray color background is visible. The segmentation result for this image using active contour (shown in Fig. 5.2(c)) incorrectly includes the hole as a part of the detected object. Since the Snake could not go
inside the object boundary, it has converted the outer background into white but
retained the hole as gray. Similar erroneous segmentation result of active contour
for a real image (shown in Fig. 5.3(a)) is shown in Fig. 5.3(b). One can see that
segmentation output contains a part of the background region (e.g. black back-
ground visible through teapot’s handle) along with the foreground object. Fig.
5.4(b) shows another erroneous active contour segmentation result for the image
shown in Fig. 5.4(a). Segmentation output contains some pixels in the interior
part of the foreground object (wheel) from the background texture region. One
more incorrect segmentation result of active contour for a real image (shown in
Fig. 5.5(a)) is shown in Fig. 5.5(b). One can see that segmentation output con-
tains a part of the background region (e.g. grass patch between legs) along with
the foreground object.
On the other hand, GrabCut considers global color distribution (with local
pixels similarities) of the background and foreground pixels for segmentation. So
it has the ability to remove interior pixels which are not part of the object. To
segment the object using GrabCut, user draws a rectangle enclosing the foreground
(a) (b) (c)
Fig. 5.3: (a) Teapot image; segmentation results of (b) active contour and (c) GrabCut.
(a) (b) (c)
Fig. 5.4: (a) Image containing wheel; segmentation results of (b) active contour and (c) GrabCut.
(a) (b) (c)
Fig. 5.5: (a) Soldier image; segmentation results of (b) active contour and (c) GrabCut.
object. Pixels outside the rectangle are considered as background pixels and pixels
inside the rectangle are considered as unknown. GrabCut estimates the color
distribution for the background and the unknown region using separate GMMs
(see section 5.1.2). Then, it iteratively removes the pixels from the unknown
region which belong to background.
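This initialization step can be sketched as follows; the label codes and the function name are illustrative, not GrabCut's actual API.

```python
import numpy as np

BGD, UNKNOWN = 0, 1  # illustrative label codes

def init_trimap(shape, rect):
    """Known background outside the user rectangle, unknown inside it.
    rect = (row0, col0, row1, col1), end-exclusive."""
    trimap = np.full(shape, BGD, dtype=np.uint8)
    r0, c0, r1, c1 = rect
    trimap[r0:r1, c0:c1] = UNKNOWN
    return trimap
```

The iterative estimation then only ever reclassifies pixels inside the rectangle; the exterior remains hard background throughout.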
The major problem with GrabCut is as follows. If some parts of the foreground object have a color distribution similar to the image background, then those parts of the foreground object will also be eliminated by GrabCut. GrabCut is thus not able to distinguish between the desired and unnecessary pixels while eliminating pixels from the unknown region. Fig. 5.2(d) shows
one such segmentation result of GrabCut for the image shown in Fig. 5.2(a),
where the objective is to segment the object with a hole present in the image.
Although the hole inside the foreground object has been removed, segmentation
result does not produce the upper part of the object (shown in green color in Fig.
5.2(a)) near the boundary. This happens because in the input image (Fig. 5.2(a)),
a few pixels with green color are present as a part of the background region. Fig.
5.3(c) presents a GrabCut segmentation result for a real world image shown in
Fig. 5.3(a). The objective in this case is to crop the teapot from the input image.
GrabCut segmentation result for this input image does not produce the teapot’s
left side portion, due to its similarity with the background. In another real world
example in Fig. 5.4(a), where the user targets to crop the wheel present in the
image, GrabCut segmentation result (Fig. 5.4(c)) does not produce the wheel’s
grayish green rubber part. This is due to the presence of some other objects
with similar color in the background. Fig. 5.5(c) presents one more GrabCut
segmentation result for a real world image shown in Fig. 5.5(a). The objective
in this case is to crop the soldier from the input image. GrabCut segmentation
result for this input image does not produce the soldier’s hat and the legs, due to
their similarities with background texture.
In all these cases (Figs. 5.3, 5.4 and 5.5), the hole(s) within the foreground
object have been detected and removed by the GrabCut algorithm.
In the GrabCut (Rother et al., 2004) algorithm, missing parts of the foreground object are often recovered by user interaction: the user has to mark the missing object parts as compulsory foreground. In this chapter, we present an automatic foreground object segmentation technique based on the integration of active contour and GrabCut, which can produce accurate segmentation in situations where either or both of these techniques fail. We call our proposed technique "SnakeCut"; it is based on a probabilistic framework for integration, which we present in the next section.
5.3 SnakeCut: Integration of Active Contour and
GrabCut
Active contour works on the principle of intensity gradient, where the user ini-
tializes a contour around or inside the object for it to detect the boundary of
the object easily. GrabCut, on the other hand, works on the basis of the pixel’s
color distribution and considers global cues for segmentation. Hence it can eas-
ily remove the unwanted part (parts from the background) present inside the
object boundary. These two segmentation techniques use complementary infor-
mation (edge and region based) for segmentation. In SnakeCut, we combine these
complementary techniques and present an integrated method for superior object
segmentation. Fig. 5.6 presents the overall flow chart of our proposed segmenta-
tion technique. In SnakeCut, the input image is segmented using active contour and GrabCut separately. These two segmentation results are provided as inputs
to the probabilistic framework of SnakeCut, which integrates the two segmenta-
tion results based on a probabilistic criterion and produces the final segmentation
result.
Main steps of the SnakeCut algorithm are provided in Algorithm 2. The
probabilistic framework used to integrate the two outputs is as follows. Inside the
object boundary C0 (detected by the active contour), every pixel zi is assigned
two probabilities, Pc(zi) and Ps(zi), where
• Pc(zi): provides information about the pixel’s nearness to the boundary, and
• Ps(zi): indicates how similar the pixel is to the background
A large value of Pc(zi) indicates that pixel zi is far from the boundary and a
large value of Ps(zi) specifies that the pixel is more similar to the background.
To take the decision about a pixel belonging to foreground or background, we
evaluate a decision function p as follows:
p(zi) = ρPc(zi) + (1− ρ)Ps(zi) (5.8)
where, ρ is the weight which controls the relative importance of the two tech-
niques, which is learnt empirically. Probability Pc is computed from the distance
transform (DT) (Breu et al., 1995) of the object boundary C0. DT has been used
in many computer vision applications (Paglieroni et al., 1994; Tsang et al., 1994;
Sanjay et al., 1998; Lee et al., 2007). For image I, its DT image Id is given by the
following equation:
Id(zi) = { 0, if zi lies on contour C0;  d, otherwise } (5.9)
where d is the Euclidean distance of pixel zi to the nearest contour point. Fig.
5.7(b) shows an example of DT image for the contour image shown in Fig. 5.7(a).
Distance transform values are first normalized in the range [0, 1], before they are
used for the estimation of Pc.
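A sketch of this computation, assuming scipy is available: `distance_transform_edt` measures the distance to the nearest zero element, so the contour mask is inverted before the transform.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def normalized_dt(contour_mask):
    """Euclidean distance transform of Eqn. 5.9, normalized to [0, 1].
    contour_mask is a boolean image that is True on the contour C0."""
    # distance_transform_edt measures distance to the nearest zero
    # element, so the contour must be the zero set: invert the mask.
    dt = distance_transform_edt(~contour_mask)
    return dt / (dt.max() + 1e-12)
```

Pixels on C0 map to 0 and the farthest interior pixels map to 1, which is the range the fuzzy distribution below expects.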
Let In be the normalized distance transform image of Id and dn be the DT
value of a pixel zi in In (i.e. dn = In(zi)). Probability Pc of zi is estimated using
the following fuzzy distribution function:
Pc(zi) = { 0,                          0 ≤ dn < a;
           2((dn − a)/(b − a))²,        a ≤ dn < (a + b)/2;
           1 − 2((b − dn)/(b − a))²,    (a + b)/2 ≤ dn < b;
           1,                          b ≤ dn ≤ 1. }     (5.10)
where a and b are constants with a < b; when a ≥ b, the function degenerates to a step function with a transition from 0 to 1 at (a + b)/2. The probability distribution function (Eqn. 5.10) has been chosen in such a way that the probability value Pc is small
near the contour C0 and large for points farther away. In this fuzzy function, a
Fig. 5.6: Flow chart of the proposed SnakeCut technique.
(a) (b) (c)
Fig. 5.7: Segmentation of image shown in Fig. 5.2(a) using SnakeCut: (a) object boundary produced by active contour, (b) distance transform for the boundary contour shown in (a); (c) SnakeCut segmentation result.
and b dictate the non-linear behavior of the function. The parameters a and b
control the extents (distance from the boundary) to which the output response
is considered from Snake and then onwards from that of GrabCut respectively.
The extent to which points are considered to be near the contour, can be suitably
controlled by choosing appropriate values of a and b. The value of Pc is zero
(0) when the distance of the pixel from the boundary is in the range [0..a], and
one (1) in the range [b..1] (all values normalized). For values between [a..b], we
empirically found the smooth, non-linear S-shaped function to provide the best
result. Fig. 5.8 shows the effect of the interval [a, b] on the distribution function.
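Eqn. 5.10 can be sketched directly in numpy; the default values of a and b below are taken from the parameter ranges reported later in this section and are otherwise illustrative.

```python
import numpy as np

def p_contour(dn, a=0.12, b=0.18):
    """S-shaped fuzzy membership of Eqn. 5.10: Pc is 0 for normalized
    distances dn < a, 1 for dn >= b, and rises smoothly in between."""
    dn = np.asarray(dn, dtype=float)
    mid = (a + b) / 2.0
    pc = np.empty_like(dn)
    pc[dn < a] = 0.0
    lo = (dn >= a) & (dn < mid)
    pc[lo] = 2.0 * ((dn[lo] - a) / (b - a)) ** 2
    hi = (dn >= mid) & (dn < b)
    pc[hi] = 1.0 - 2.0 * ((b - dn[hi]) / (b - a)) ** 2
    pc[dn >= b] = 1.0
    return pc
```

The two quadratic pieces meet at (a + b)/2 with value 0.5, so the membership is continuous and monotone, giving the smooth Snake-to-GrabCut handover described above.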
Probability value Ps is obtained from the GrabCut segmentation process.
GrabCut assigns two likelihood values to each pixel in the image using the GMMs
Fig. 5.8: Effect of interval [a, b] on the non-linearity of the fuzzy distribution function (Eqn. 5.10). When a < b, the transition from 0 (at a) to 1 (at b) is smooth. When a ≥ b, we have a step function with the transition at (a + b)/2.
constructed for the foreground and background, which represent how likely a pixel is to belong to the foreground and background respectively. In our approach, after the
segmentation of the object using GrabCut, final background GMMs are used to
estimate Ps. For each pixel zi inside C0, D(zi) is computed using Eqn. 5.5 con-
sidering background GMMs. Normalized values of D between 0 and 1, for all the
pixels inside C0, define the probability Ps.
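A sketch of this estimation on synthetic data, using scikit-learn's GaussianMixture in place of the thesis' own GMM code. The thesis leaves the direction of the normalization implicit; here the normalized −log-likelihood is inverted so that, as stated earlier, a large Ps means the pixel is background-like — that inversion is our assumption.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-ins: dark "background" colours, bright "object" colours.
bg_pixels = rng.normal(60.0, 10.0, size=(500, 3))
obj_pixels = rng.normal(200.0, 10.0, size=(500, 3))

# Full-covariance background GMM, as in the GrabCut formulation.
bg_gmm = GaussianMixture(n_components=2, covariance_type='full',
                         random_state=0).fit(bg_pixels)

def ps_from_background_gmm(pixels):
    """Ps from the background GMM: D(z) = -log p(z | background), then
    normalize to [0, 1] and invert so that a large Ps means the pixel
    is similar to the background (the inversion is our assumption)."""
    d = -bg_gmm.score_samples(pixels)            # per-pixel -log-likelihood
    d = (d - d.min()) / (d.max() - d.min() + 1e-12)
    return 1.0 - d

ps = ps_from_background_gmm(np.vstack([bg_pixels, obj_pixels]))
```

On this toy data, the background-like pixels receive Ps close to 1 and the object-like pixels Ps close to 0, which is the behaviour the decision function of Eqn. 5.8 relies on.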
Using the decision function p(zi) estimated in Eqn. 5.8 and an empirically
estimated threshold T , GrabCut and active contour results are integrated using
the SnakeCut algorithm (Algorithm 2). In the integration process of the SnakeCut
algorithm, segmentation output for a pixel is taken from the GrabCut result if
p > T , otherwise it is taken from the active contour result. In our experiments,
we have used ρ = 0.5. The values of T, a and b range from 0.60−0.80, 0.10−0.15 and 0.15−0.20 respectively to give the best results.
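The decision rule of Eqn. 5.8, with the threshold T and the parameter values above, can be sketched as a vectorized numpy function. The array names and calling convention are illustrative: binary label images from Snake and GrabCut, the two probability maps, and a boolean mask of the pixels inside C0.

```python
import numpy as np

def snakecut(i_ac, i_gc, pc, ps, inside, rho=0.5, T=0.6):
    """Inside the Snake contour C0, take the GrabCut label where
    p = rho*Pc + (1-rho)*Ps > T (deep interior or background-like),
    otherwise keep the Snake label; pixels outside C0 stay 0."""
    p = rho * pc + (1.0 - rho) * ps
    i_sc = np.zeros_like(i_ac)
    use_gc = inside & (p > T)
    use_ac = inside & ~(p > T)
    i_sc[use_gc] = i_gc[use_gc]
    i_sc[use_ac] = i_ac[use_ac]
    return i_sc
```

Near the contour p is dominated by the small Pc, so the Snake boundary wins; deep inside the object the GrabCut label wins wherever the pixel also looks background-like.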
We demonstrate the integrated approach to foreground segmentation with the help of a simulated example in Fig. 5.7, using the image
Algorithm 2 Steps of SnakeCut

• Input: image I; Output: segmentation Isc.

• All pixels of Isc are initialized to zero.

A. Initial Segmentation

1. Segment the desired object in I using active contour. Let C0 be the object boundary identified by the active contour and Iac the segmentation output of active contour.

2. Segment the desired object in I using GrabCut. Let Igc be the segmentation output.

B. Integration using SnakeCut

1. Find the set of pixels Z in image I which lie inside contour C0.

2. For each pixel zi ∈ Z,

(a) Compute p(zi) using Eqn. 5.8.

(b) if p(zi) ≤ T then
        Isc(zi) = Iac(zi)
    else
        Isc(zi) = Igc(zi)
    end if
shown in Fig. 5.2(a). Intermediate segmentation outputs produced by active con-
tour and GrabCut for this image have been shown in Fig. 5.2(c) & Fig. 5.2(d).
These outputs are integrated by the SnakeCut algorithm. Fig. 5.7(a) shows the
object boundary obtained by active contour for the object shown in Fig. 5.2(a).
Active contour boundary is used to estimate the distance transform, shown in
Fig. 5.7(b), using Eqn. 5.9. Probability values Pc and Ps are estimated for all
the pixels inside the object boundary obtained by active contour as described
earlier. SnakeCut algorithm (Algorithm 2) is then used to integrate the results of
active contour and GrabCut. Fig. 5.7(c) shows the final SnakeCut output after
integration of the intermediate outputs (Fig. 5.2(c) & Fig. 5.2(d)) obtained using
active contour and GrabCut algorithms. Our proposed method is able to retain a
part of the object which appears similar to background color and simultaneously
eliminate the hole within the object.
To demonstrate the influence of the probability values Pc and Ps on the decision making in the SnakeCut algorithm, we use the teapot image (Fig.
5.3(a)). The values of Pc, Ps and p for a few points marked on the teapot image
(Fig. 5.9(a)) are shown in Fig. 5.9(b). The SnakeCut algorithm is then used to
obtain the final segmentation decision. Last column of the table in Fig. 5.9(b)
shows the final decision taken by SnakeCut based on the estimated value of p.
5.4 SnakeCut Segmentation Results
To extract a foreground object using SnakeCut, user needs to draw a rectangle (or
polygon) surrounding the object as the Region of Interest (ROI). This rectangle
(a)

(b)

Point   Pc       Ps       p        Output taken from
A       0.1684   0.6551   0.4118   Snake
B       1.0000   0.5000   0.7500   GrabCut
C       1.0000   0.7031   0.8516   GrabCut
D       0.0000   0.5000   0.2500   Snake
E       1.0000   0.5000   0.7500   GrabCut
F       0.0000   0.6769   0.3384   Snake
G       0.0000   0.5000   0.2500   Snake
Fig. 5.9: Demonstration of the impact of Pc and Ps values on the decision making in Algorithm 2: (a) teapot image with a few points marked on it, (b) values of Pc, Ps, p, and the decision obtained using Algorithm 2, for the points labeled in (a). Values used for ρ and T are 0.5 and 0.6 respectively.
(a) (b) (c) (d)
Fig. 5.10: Demonstration of a SnakeCut result on a synthetic image, where Snake fails and GrabCut works: (a) input image with foreground initialized by the user (object contains a rectangular hole at the center), (b) Snake segmentation result (incorrect, output contains the hole as a part of the object), (c) GrabCut segmentation result (correct, hole is removed), and (d) SnakeCut segmentation result (correct, hole is removed).
(a) (b) (c) (d)
Fig. 5.11: Demonstration of a SnakeCut result on a synthetic image, where Snake works and GrabCut fails: (a) input image with foreground initialized by the user, (b) Snake segmentation result (correct), (c) GrabCut segmentation result (incorrect, upper green part of the object is removed), and (d) correct segmentation result produced by SnakeCut.
(a) (b) (c) (d)
Fig. 5.12: Segmentation of real pot image: (a) input real image, (b) active contour segmentation result (incorrect), (c) GrabCut segmentation result (correct), and (d) SnakeCut segmentation result (correct, background pixels visible through the handles of the pot are removed).
is used in the segmentation process of active contour as well as GrabCut. Active contour considers the rectangle as an initial contour and deforms it to converge on
the object boundary. GrabCut uses the rectangle to define the background and
unknown regions. Pixels outside the rectangle are taken as known background
and those inside as unknown. GrabCut algorithm (using GMM based modeling
and minimal cost graph-cut) iterates and converges to a minimum energy level
producing the final segmentation output. Segmentation outputs of active contour
and GrabCut are integrated using the SnakeCut algorithm to obtain the final seg-
mentation result. First, we present a few results of segmentation using SnakeCut
on synthetic and natural images, where either Snake or GrabCut fails to work.
This will be followed by a few examples where both Snake and GrabCut tech-
niques fail to produce correct segmentation, whereas integration of the outputs of
both these techniques using SnakeCut algorithm gives the correct segmentation
results.
Fig. 5.10 shows a result on a synthetic image where active contour fails but
GrabCut works, and their integration (i.e. SnakeCut) also produces the correct
segmentation. Fig. 5.10(a) shows an image where the object to be segmented
has a rectangular hole (at the center) in it through which the gray colored back-
ground is visible. Segmentation result produced by active contour (Fig. 5.10(b))
shows the hole as a part of the segmented object which is incorrect. In this case,
GrabCut produces the correct segmentation (Fig. 5.10(c)) of the object. Fig.
5.10(d) also shows the correct segmentation result, produced by SnakeCut for
this image. Fig. 5.11 shows a result on another synthetic image where active
contour works but GrabCut fails, and their integration (i.e. SnakeCut) produces
the correct segmentation. Fig. 5.11(a) shows an image where the object (with no
hole) to be segmented has a part (upper green region) similar to the background
(green flowers). Active contour, in this example, produces correct segmentation
(Fig. 5.11(b)) while GrabCut fails (Fig. 5.11(c)). Fig. 5.11(d) shows the correct
segmentation result produced by SnakeCut for this image. Fig. 5.12 presents the
result of SnakeCut segmentation for a real world image. In this example, active
contour fails but GrabCut performs correct segmentation. We see in Fig. 5.12(b)
that the active contour segmentation result contains the part of the background
(visible through the handles) which is incorrect. SnakeCut algorithm produces
correct segmentation result which is shown in Fig. 5.12(d).
In the examples presented so far, we have seen that only one among the two (Snake and GrabCut) techniques fails to perform correct segmentation. In these
cases, either the Snake is unable to remove holes from the foreground object or
GrabCut is unable to retain the parts of the object which are similar to the
background. SnakeCut performs well in all such situations. We now present a few
(a) (b) (c)
Fig. 5.13: SnakeCut segmentation results of (a) teapot (for the image in Fig. 5.3(a)), (b) wheel (for the image in Fig. 5.4(a)) and (c) soldier (for the image in Fig. 5.5(a)).
(a) (b) (c) (d)
Fig. 5.14: Segmentation of cup image: (a) input real image, (b) segmentation result produced by Snake (incorrect, as background pixels visible through the cup's handle are detected as a part of the object), (c) GrabCut segmentation result (incorrect, as the spots present on the cup's handle are removed), and (d) correct segmentation result produced by SnakeCut.
(a) (b) (c) (d)
Fig. 5.15: Segmentation of webcam bracket image: (a) input real image where the objective is to segment the lower bracket present in the image, (b) Snake segmentation result (incorrect, as background pixels visible through the holes present in the object are detected as part of the foreground object), (c) GrabCut segmentation result (incorrect, as large portions of the bracket are removed in the result), and (d) correct segmentation result produced by SnakeCut.
results on synthetic and real images, where SnakeCut performs well even when
both the Snake and GrabCut techniques fail to perform correct segmentation.
Fig. 5.13 presents three such SnakeCut results on real world images. Fig. 5.13(a)
shows the segmentation result produced by SnakeCut for the teapot image shown
in Fig. 5.3(a). This result is obtained without user interaction, by integrating
the active contour and GrabCut outputs shown in Figs. 5.3(b) and 5.3(c). Fig.
5.13(b) shows the segmentation result produced by SnakeCut, for the wheel image
shown in Fig. 5.4(a). Intermediate active contour and GrabCut segmentation
results for the wheel are shown in Figs. 5.4(b) and 5.4(c). Fig. 5.13(c) shows
the segmentation result produced by SnakeCut, for the soldier image shown in
Fig. 5.5(a). Intermediate active contour and GrabCut segmentation results for
the soldier image are shown in Figs. 5.5(b) and 5.5(c).
Two more SnakeCut segmentation results are presented in Figs. 5.14 and
5.15 for a cup and a part of webcam bracket images, where both Snake and
GrabCut techniques fail to perform correct segmentation. The objective in the
cup example (Fig. 5.14(a)) is to segment the cup in the image. The cup's handle has some blue color spots similar to the background color. Snake and GrabCut
results for this image are shown in Fig. 5.14(b) and Fig. 5.14(c) respectively.
One can observe that both these results are erroneous. Result obtained using
Snake contains some part of the background which is visible through the handle.
GrabCut has removed the spots in the handle since their color is similar to the
background. Correct segmentation result produced by SnakeCut is shown in Fig.
5.14(d). The objective in the webcam bracket example (Fig. 5.15(a)) is to segment
the lower bracket (inside the red contour initialized by the user) present in the
image. Snake and GrabCut results for this image are shown in Fig. 5.15(b) and
Fig. 5.15(c) respectively. We can see that both these results are erroneous. The
result obtained using Snake contains some part of the background which is visible
through the holes. GrabCut has removed large portions of the bracket. This is
due to the similarity of the distribution of the metallic color of a part of another
webcam bracket present in the background (it should be noted that the color
distribution of the two webcam brackets is not exactly the same due to different
lighting effects). Correct segmentation result produced by SnakeCut is shown in
Fig. 5.15(d). We also observed a similar performance when the initialization was
done around the bracket on the top, for the image in Fig. 5.15(a).
In Fig. 5.16, we compare the automatic SnakeCut segmentation results of
teapot (Fig. 5.3(a)), wheel (Fig. 5.4(a)), soldier (Fig. 5.5(a)) and webcam bracket
(a) (b)
(c) (d)
(e) (f)
(g) (h)
Fig. 5.16: Comparison of the results: (a) SnakeCut result for teapot, (b) GrabCut result for teapot with user interaction, (c) SnakeCut result for wheel, (d) GrabCut result for wheel with user interaction, (e) SnakeCut result for soldier, (f) GrabCut result for soldier with user interaction (reproduced from (Rother et al., 2004)), (g) SnakeCut result for webcam bracket, (h) GrabCut result for webcam bracket with user interaction.
(Fig. 5.15(a)) images with the interactive GrabCut outputs. A large amount of user interaction was necessary to obtain the correct GrabCut results shown in Figs. 5.16(b), 5.16(d), 5.16(f) &
5.16(h). In case of teapot (Fig. 5.16(b)), user marked the teapot’s left side
portion as parts of the compulsory foreground. In case of wheel (Fig. 5.16(d))
user marked the outer grayish green region of the wheel as a compulsory part
of the foreground object and in case of soldier (Fig. 5.16(f)), user marked the
soldier’s hat and legs as a part of the compulsory foreground. In case of webcam
bracket (Fig. 5.16(h)) user marked missing regions as compulsory parts of the
foreground object. Segmentation results using SnakeCut were obtained without
any user interaction (except initialization of the ROI and, in a few cases, some hard
constraint points at the object boundary) and are mostly better than the results
obtained by GrabCut with user’s corrective editing. One can easily observe the
smooth edges obtained at the border of teapot in Fig. 5.16(a), unlike that in Fig.
5.16(b). The same is true for Fig. 5.16(e) (w.r.t Fig. 5.16(f)) and Fig. 5.16(g)
(w.r.t Fig. 5.16(h)), which may be noticed after careful observation.
The presented approach combines the complementary strengths of active con-
tour and GrabCut processes, to produce correct segmentation in cases where one
or both of these techniques fail. However, the proposed technique (SnakeCut) was
observed to have the following limitations:
1. Since SnakeCut relies on active contour for regions near the object boundary, it fails when holes of the object (through which the background is visible) lie very close to the boundary.
2. Since the Snake cannot penetrate inside the object boundary and detect holes, the proposed SnakeCut method has to rely on the response of the GrabCut algorithm in such cases. This becomes problematic when GrabCut detects an interior part that actually belongs to the object as a hole, due to its high degree of similarity with the background. Since the decision logic of SnakeCut relies on the GrabCut response for interior parts of the object, it may fail in cases where GrabCut does not detect those parts of the object as foreground.

(a) (b) (c) (d)

Fig. 5.17: Example where SnakeCut fails: (a) input image with foreground initialized by user, (b) active contour segmentation result (correct), (c) GrabCut segmentation result (incorrect), and (d) SnakeCut segmentation result (incorrect).
Fig. 5.17 presents one such situation (using a simulated image) where Snake-
Cut fails to perform correct segmentation. Fig. 5.17(a) shows a synthetic image
where active contour works correctly (see Fig. 5.17(b)) but GrabCut fails (see Fig.
5.17(c)). GrabCut removes the central rectangular green part of the object in the segmented output, which should actually be perceived as a part of the object.
We see in this case that SnakeCut also does not perform correct segmentation and
removes the object’s central rectangular green part from the segmentation result.
SnakeCut thus fails when certain parts of the foreground object are far away (and
interior) from its boundary and very similar to the background.
The heuristic values of some of the parameters used in our algorithm, which
were obtained empirically, were not so critical (within a certain range) for accurate
foreground object segmentation. The overall computational times required by
SnakeCut on a P-IV, 3 GHz machine with 2 GB RAM, are given in Table 5.1 for
some of the images.
Table 5.1: Computational times for foreground object segmentation, required by Snake, GrabCut and SnakeCut for various images.

                                             Time required (in seconds) for
Image Name                     Size (pixels)  Snake (A)  GrabCut (B)  Integration* (C)  SnakeCut (A+B+C)
Synthetic image (Fig. 5.2(a))    250 × 250       4           5             2                 11
Teapot image (Fig. 5.3(a))       519 × 375      10           9             4                 23
Wheel image (Fig. 5.4(a))        640 × 480       6          14             5                 25
Soldier image (Fig. 5.5(a))      321 × 481      13          12             7                 32
Synthetic image (Fig. 5.10(a))   250 × 250       4           5             2                 11
Pot image (Fig. 5.12(a))         296 × 478       6           7             4                 17
Cup image (Fig. 5.14(a))         285 × 274       5           7             3                 15
Webcam bracket (Fig. 5.15(a))    321 × 481       7           8             3                 18

*Time required to integrate the Snake and GrabCut outputs using the probabilistic integrator.
5.5 Summary and Discussion
In this chapter, we have presented a novel object segmentation technique based on
the integration of two complementary object segmentation techniques, namely ac-
tive contour and GrabCut. Active contour cannot remove the holes in the interior
part of the object. GrabCut produces poor segmentation results in cases when the
color distribution of some part of the foreground object is similar to the background.
The proposed segmentation technique, termed SnakeCut, based on a probabilistic
framework provides an automatic way of object segmentation, where the user has
to only specify the rectangular boundary (ROI) around the desired foreground ob-
ject. Our proposed method is able to retain parts of the foreground object which
appear similar to background color and simultaneously eliminate holes present
within the object. SnakeCut is more suitable for object segmentation when the
object’s boundary localization by the GrabCut is not good and object contains
holes in it. In this situation, Snake helps in the localization of the boundary and
GrabCut removes the holes or unwanted background parts from the object’s in-
terior. Hence SnakeCut produces correct segmentation result by combining the
advantages and power of these two complementary techniques. We have validated
our technique with a few synthetic and natural images. Results obtained using
SnakeCut are quite encouraging and promising.
CHAPTER 6
Conclusion
In this thesis, we have presented novel methods for foreground object segmentation
in textured and non-textured images. In the first part of the work, we developed
techniques for the segmentation of single or multiple object(s) from a given image,
in the presence of foreground and background textures. In the second part of our work,
we developed a technique for efficient segmentation of an object which contains holes and in which the color distribution of a part of the object is similar to the background. We conducted experiments on both synthetic and natural images to
validate our techniques. Following are the key contributions of the work presented
in this thesis.
6.1 Contribution
The key contributions of the work presented in the thesis are as follows:
• It proposes a scalogram based texture feature extraction technique for snakes.
• A new external force for snakes has been introduced, which we call "texture force". Texture force is modeled using the scalogram based texture features. Snake, in the presence of texture force, is able to segment textured objects. In most of the cases, our proposed method provides better performance compared to other existing techniques.
• We describe a novel multidimensional scalogram based texture feature extraction technique and use it to develop an efficient geodesic active contour based segmentation technique for multiple textured objects. The proposed segmentation technique is based on the generalization of geodesic active contour from a one dimensional intensity based feature space to a multidimensional texture feature space. Experimental results show the effectiveness of the proposed technique. Results are also compared with existing techniques in the literature.
• We propose a novel, efficient and semi-interactive method for foreground object segmentation in color images using the integration of two popular foreground object segmentation techniques, namely parametric active contours (Snakes) and GrabCut. The proposed technique, termed "SnakeCut", segments a foreground object with holes. It is based on a probabilistic framework and provides an automatic way of segmenting an object containing holes. Our results are quite satisfactory and comparable with those obtained using GrabCut with the user's post-corrective editing.
6.2 Future Scope of Work
The object segmentation methods presented in this thesis open up several directions
for further investigation. In the following, we list a few possible extensions of
our proposed methods:
• The texture features developed in our work can be used in many segmentation and clustering algorithms.
• The textured object segmentation methods presented in this thesis rely on boundary-based information. As a possible extension, one can integrate them with region-based information to improve performance.
• The SnakeCut technique handles the segmentation of a single object. As an extension of this work, one can use the geodesic active contour (which can intrinsically segment multiple objects) to make the technique suitable for the segmentation of multiple objects.
• The SnakeCut algorithm can be extended to textured objects by incorporating texture features of the image.
• The segmentation methods presented in this work can be extended to object segmentation and tracking in videos.
• Auto-initialization can further reduce the amount of user interaction required. This would make the process more automatic for particular applications, such as focus of attention (as in human vision) for intelligent robotic vision systems.
• The performance of the proposed methods can be studied on noisy and blurred images.
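The kind of probabilistic integration SnakeCut performs, and that several of the extensions above would build on, can be illustrated with a toy fusion rule. This is a deliberately simplified sketch, not the thesis's exact formulation: pixels near the snake contour trust the snake's label (which has a reliable outer boundary), while pixels deep inside defer to GrabCut's label, which lets interior holes survive.

```python
import numpy as np

def erode(m):
    """4-neighbour binary erosion: keep pixels whose N/S/E/W neighbours are all set."""
    p = np.pad(m, 1)
    return m & p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]

def fuse(snake_mask, grabcut_mask, tau):
    """Combine a snake result (clean outer boundary, but no holes) with a
    GrabCut result (recovers holes, but may drop object parts that resemble
    the background).  Pixels within distance `tau` of the snake contour
    trust the snake; deeper pixels defer to GrabCut, so interior holes
    survive.  `tau` is a hand-picked confidence radius (an assumption of
    this sketch, not a thesis parameter)."""
    boundary = np.argwhere(snake_mask & ~erode(snake_mask))
    out = snake_mask.copy()
    for y, x in np.argwhere(snake_mask):
        # Distance from this object pixel to the nearest snake-contour pixel.
        d = np.hypot(boundary[:, 0] - y, boundary[:, 1] - x).min()
        if d > tau:                      # deep interior: defer to GrabCut
            out[y, x] = grabcut_mask[y, x]
    return out
```

On a toy mask pair, the fused result keeps the snake's outer boundary while inheriting GrabCut's interior holes; a real implementation would use a proper distance transform instead of the brute-force loop.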
LIST OF PAPERS BASED ON THESIS
1. Surya Prakash and Sukhendu Das, “External Force Modeling of Snake using DWT for Texture Object Segmentation”, In Proceedings of International Conference on Advances in Pattern Recognition, ICAPR '07, January 2-4, 2007, World Scientific, Singapore, pp. 215-219.
2. Surya Prakash, R. Abhilash and Sukhendu Das, “SnakeCut: Integrating Active Contour and GrabCut for Superior Object Segmentation”, Electronic Letters on Computer Vision and Image Analysis (ELCVIA), Vol. 6, No. 3, pp. 13-29, December 2007.
3. Surya Prakash and Sukhendu Das, “Segmenting Multiple Textured Objects using Geodesic Active Contour and DWT”, Proceedings of International Conference on Pattern Recognition and Machine Intelligence (PReMI '07), LNCS 4815, pp. 111-118, December 2007.
4. Surya Prakash, “Multiple Textured Objects Segmentation using DWT based Texture Features in Geodesic Active Contour”, Proceedings of International Conference on Computational Intelligence and Multimedia Applications, ICCIMA '07, Vol. 2, IEEE Computer Society, pp. 532-536, December 2007.
Curriculum Vitae
1. Name: Surya Prakash
2. Date of Birth: June 25, 1980
3. Educational Qualification:
Bachelor of Technology (B.Tech.)
Year: 2001
Institute: Institute of Engineering and Technology, CSJM University, Kanpur
Specialization: Computer Science & Engineering
Master of Science (M.S.)
Year: 2008
Institute: Indian Institute of Technology Madras, Chennai - 600 036
Specialization: Computer Science & Engineering
Registration Date: January 3, 2005
General Test Committee
Chairperson:
Dr. Kamala Krithivasan
Professor,
Department of Computer Sc. & Engineering,
IIT Madras, Chennai - 600036.
Guide:
Dr. Sukhendu Das
Associate Professor,
Department of Computer Sc. & Engineering,
IIT Madras, Chennai - 600036.
Members:
Dr. Hema A Murthy
Professor,
Department of Computer Sc. & Engineering,
IIT Madras, Chennai - 600036.
Dr. V. Srinivasa Chakravarthy
Associate Professor,
Department of Biotechnology,
IIT Madras, Chennai - 600036.