Segmentation
Terminology
• Segmentation, grouping, perceptual organization: gathering features that belong together
• Fitting: associating a model with observed features
• Top-down segmentation: pixels belong together because they come from the same object
• Bottom-up segmentation: pixels belong together because they look similar
The goals of segmentation
• Separate image into coherent “objects”
  • Top-down or bottom-up process?
  • Supervised or unsupervised?
Figure: image and human segmentation
Berkeley segmentation database: http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/
• Group together similar-looking pixels (“superpixels”) for efficiency of further processing
  • Related to image compression
• Measure of success is often application-dependent
X. Ren and J. Malik. Learning a classification model for segmentation. ICCV 2003.
Segmentation: Outline
• Inspiration from psychology
• Segmentation as clustering
  • K-means
  • Mean shift
• Segmentation as partitioning
  • Graph-based segmentation, normalized cuts
• Integrating top-down and bottom-up segmentation for recognition
The Gestalt school
• Grouping is key to visual perception
• Elements in a collection can have properties that result from relationships
  • “The whole is greater than the sum of its parts”
Figures: the Müller-Lyer illusion; subjective contours; occlusion; familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology
Figure-ground discrimination
The ultimate Gestalt?
Gestalt factors
• These factors make intuitive sense, but are very difficult to translate into algorithms
• They may be hard to put into algorithms, but understanding them can be useful for interface design
Segmentation as clustering
Source: K. Grauman
Different clustering strategies
• Agglomerative clustering
  • Start with each point in a separate cluster
  • At each iteration, merge two of the “closest” clusters
• Divisive clustering
  • Start with all points grouped into a single cluster
  • At each iteration, split the “largest” cluster
• K-means clustering
  • Iterate: assign points to clusters, compute means
• K-medoids
  • Same as k-means, only cluster center cannot be computed by averaging
  • The “medoid” of each cluster is the most centrally located point in that cluster (i.e., point with lowest average distance to the other points)
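The medoid definition above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming Euclidean distance; the function name `medoid` and the toy data are hypothetical and not part of the original slides.

```python
import numpy as np

def medoid(points):
    """Return the index of the point with the lowest average distance to the others."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # The diagonal is zero, so row sums rank points by average distance to the rest.
    return int(np.argmin(dists.sum(axis=1)))

# Toy cluster with one outlier (assumed data): the mean is pulled toward the
# outlier and is not a data point, while the medoid is always a data point.
cluster = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [10.0, 0.0]])
medoid_index = medoid(cluster)
mean_point = cluster.mean(axis=0)
```

Because the medoid only needs pairwise distances, the same idea applies when the features cannot be averaged meaningfully.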
K-Means clustering
• K-means clustering based on intensity or color is essentially vector quantization of the image attributes
  • Clusters don’t have to be spatially coherent
• Clustering based on (r,g,b,x,y) values enforces more spatial coherence
Figure panels: image, intensity-based clusters, color-based clusters
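A minimal sketch of k-means on per-pixel features, assuming an H x W x 3 RGB array as input. The helper names `pixel_features` and `kmeans`, the `spatial_weight` scaling of the (x, y) coordinates, and the explicit `init_centers` argument are illustrative assumptions, not something the slides specify.

```python
import numpy as np

def pixel_features(image, spatial_weight=0.0):
    """Per-pixel (r, g, b) features, optionally extended with weighted (x, y)."""
    h, w, _ = image.shape
    color = image.reshape(-1, 3).astype(float)
    if spatial_weight == 0.0:
        return color
    ys, xs = np.mgrid[0:h, 0:w]
    xy = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    return np.hstack([color, spatial_weight * xy])

def kmeans(features, init_centers, n_iter=20):
    """Plain k-means: alternate between assigning points and recomputing means."""
    centers = np.array(init_centers, dtype=float)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = features[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels, centers

# Toy image (assumed data): left half red, right half blue.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, :2] = (255, 0, 0)
image[:, 2:] = (0, 0, 255)

features = pixel_features(image)
labels, centers = kmeans(features, init_centers=features[[0, -1]])
segmentation = labels.reshape(4, 4)

# Adding weighted (x, y) gives 5-D features that favor spatially compact clusters.
features_xy = pixel_features(image, spatial_weight=10.0)
```

The initial centers are passed in explicitly here so that the example is deterministic; in practice they are usually chosen at random, and the result depends on that choice.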
K-Means pros and cons
• Pros
  • Simple and fast
  • Converges to a local minimum of the error function
• Cons
  • Need to pick K
  • Sensitive to initialization
  • Sensitive to outliers
  • Only finds “spherical” clusters
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean shift segmentation
• An advanced and versatile technique for clustering-based segmentation
D. Comaniciu and P. Meer, Mean Shift: A Robust Approach toward Feature Space Analysis, PAMI 2002.
Mean shift algorithm
• The mean shift algorithm seeks a mode or local maximum of density of a given distribution
  • Choose a search window (width and location)
  • Compute the mean of the data in the search window
  • Center the search window at the new mean location
  • Repeat until convergence
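The mean shift iteration can be sketched as follows. This is a minimal version assuming a flat (uniform) circular window of fixed radius; the function name `mean_shift_mode`, the tolerance, and the toy data are illustrative assumptions.

```python
import numpy as np

def mean_shift_mode(data, start, radius, tol=1e-6, max_iter=500):
    """Move a search window to the local mean until it stops moving."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        # Points that fall inside the current search window.
        in_window = np.linalg.norm(data - center, axis=1) <= radius
        if not in_window.any():
            break  # empty window: nothing to average
        new_center = data[in_window].mean(axis=0)
        shift = np.linalg.norm(new_center - center)  # length of the mean shift vector
        center = new_center
        if shift < tol:
            break
    return center

# Two well-separated groups of points (assumed data).
data = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
                 [10.0, 0.0], [11.0, 0.0], [12.0, 0.0]])
mode_a = mean_shift_mode(data, start=[0.0, 0.0], radius=3.0)
mode_b = mean_shift_mode(data, start=[12.0, 0.0], radius=3.0)
```

Windows started in different places can settle on different modes, which is what the clustering step later relies on.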
Mean shift
Figure labels: region of interest, center of mass, mean shift vector
Slide by Y. Ukrainitz & B. Sarel
Mean shift clustering
• Cluster: all data points in the attraction basin of a mode
• Attraction basin: the region for which all trajectories lead to the same mode
Slide by Y. Ukrainitz & B. Sarel
Mean shift clustering/segmentation
• Find features (color, gradients, texture, etc.)
• Initialize windows at individual pixel locations
• Perform mean shift for each window until convergence
• Merge windows that end up near the same “peak” or mode
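The four steps above can be sketched as one function. This is a minimal version assuming a flat window and a simple distance threshold for merging modes; `mean_shift_cluster` and `merge_tol` are hypothetical names, and the one-dimensional intensity features are toy data.

```python
import numpy as np

def mean_shift_cluster(features, radius, merge_tol=None, max_iter=200):
    """Run mean shift from every feature point and merge nearby end points."""
    features = np.asarray(features, dtype=float)
    merge_tol = radius / 2 if merge_tol is None else merge_tol
    modes = []
    labels = np.empty(len(features), dtype=int)
    for i, start in enumerate(features):
        # One window per point, initialized at that point.
        center = start.copy()
        for _ in range(max_iter):
            near = np.linalg.norm(features - center, axis=1) <= radius
            new_center = features[near].mean(axis=0)
            moved = np.linalg.norm(new_center - center)
            center = new_center
            if moved < 1e-6:
                break
        # Merge with an existing mode if the window ended up close to it.
        for j, mode in enumerate(modes):
            if np.linalg.norm(center - mode) < merge_tol:
                labels[i] = j
                break
        else:
            modes.append(center)
            labels[i] = len(modes) - 1
    return labels, np.array(modes)

# Toy 1-D "intensity" features (assumed data): a dark and a bright group.
intensities = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
labels, modes = mean_shift_cluster(intensities, radius=3.0)
```

The number of clusters is not given in advance; it falls out of how many distinct modes the windows reach for the chosen radius.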
http://www.caip.rutgers.edu/~comanici/MSPAMI/msPamiResults.html
Mean shift segmentation results
More results
Mean shift pros and cons
• Pros
  • Does not assume spherical clusters
  • Just a single parameter (window size)
  • Finds variable number of modes
  • Robust to outliers
• Cons
  • Output depends on window size
  • Computationally expensive
  • Does not scale well with dimension of feature space
Graph-based segmentation
• Represent features and their relationships using a graph
• Cut the graph to get subgraphs with strong interior links and weaker exterior links
Images as graphs
• Node for every pixel
• Edge between every pair of pixels (or every pair of “sufficiently close” pixels)
• Each edge is weighted by the affinity or similarity of the two nodes
Figure: edge between nodes i and j with weight w_ij
Source: S. Seitz
Segmentation by graph partitioning
• Break graph into segments
  • Delete links that cross between segments
  • Easiest to break links that have low affinity
    • similar pixels should be in the same segments
    • dissimilar pixels should be in different segments
Figure labels: segments A, B, C
Source: S. Seitz
Measuring Affinity

Intensity
$$\mathrm{aff}(x, y) = \exp\left\{ -\frac{1}{2\sigma_i^2} \, \| I(x) - I(y) \|^2 \right\}$$

Distance
$$\mathrm{aff}(x, y) = \exp\left\{ -\frac{1}{2\sigma_d^2} \, \| x - y \|^2 \right\}$$

Color
$$\mathrm{aff}(x, y) = \exp\left\{ -\frac{1}{2\sigma_t^2} \, \| c(x) - c(y) \|^2 \right\}$$
Scale affects affinity
• Small σ: group only nearby points
• Large σ: group far-away points
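The effect of σ can be checked numerically with a small sketch of the Gaussian affinity. The function name `gaussian_affinity` and the three sample positions are illustrative assumptions; the same function applies to intensity or color vectors in place of positions.

```python
import numpy as np

def gaussian_affinity(values, sigma):
    """Affinity matrix with entries exp(-||v_i - v_j||^2 / (2 * sigma^2))."""
    values = np.asarray(values, dtype=float)
    if values.ndim == 1:
        values = values[:, None]  # treat scalars (e.g. intensities) as 1-D vectors
    d2 = ((values[:, None, :] - values[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Three points on a line (assumed data): two close together, one far away.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
W_small = gaussian_affinity(positions, sigma=0.5)
W_large = gaussian_affinity(positions, sigma=5.0)

near_small, far_small = W_small[0, 1], W_small[0, 2]
near_large, far_large = W_large[0, 1], W_large[0, 2]
```

With the small σ the far pair has an affinity that is effectively zero, so only the nearby pair is linked; with the large σ both pairs keep a substantial affinity.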
Graph cut
• Set of edges whose removal makes a graph disconnected
• Cost of a cut: sum of weights of cut edges
• A graph cut gives us a segmentation
• What is a “good” graph cut and how do we find one?
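The cost of a cut defined above can be computed directly from a symmetric affinity matrix. This is a minimal sketch; the function name `cut_cost` and the four-node example graph are illustrative assumptions.

```python
import numpy as np

def cut_cost(W, in_A):
    """Sum of weights of edges with one end in A and the other end outside A."""
    in_A = np.asarray(in_A, dtype=bool)
    return float(W[np.ix_(in_A, ~in_A)].sum())

# Four nodes (assumed data): 0-1 and 2-3 are strongly linked, 0-2 and 1-3 weakly.
W = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
])

# Separating {0, 1} from {2, 3} removes only the two weak edges.
cost_weak_links = cut_cost(W, [True, True, False, False])
# Separating {0, 2} from {1, 3} removes the two strong edges.
cost_strong_links = cut_cost(W, [True, False, True, False])
```

Comparing the two costs gives one simple notion of a better cut: the one that removes less total affinity.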
Figure labels: segments A and B
Source: S. Seitz
Graph cut
Figure panels: affinity matrix, block detection
* Slides from Dan Klein, Sep Kamvar, Chris Manning, Natural Language Group Stanford University
Multi-way graph cut
Minimum cut
• We can do segmentation by finding the minimum cut in a graph
  • Efficient algorithms exist for doing this
• Drawback: minimum cut tends to cut off very small, isolated components
Figure labels: ideal cut; cuts with lesser weight than the ideal cut
* Slide from Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Normalized cut
• A minimum cut penalizes large segments
• This can be fixed by normalizing the cut by component size
• The normalized cut cost is:

$$\mathrm{ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$$

assoc(A, V) = sum of weights of all edges in V that touch A
• The exact solution is NP-hard but an approximation can be computed by solving a generalized eigenvalue problem
J. Shi and J. Malik. Normalized cuts and image segmentation. PAMI 2000
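A minimal sketch of the normalized cut cost and of the eigenvector-based approximation, assuming a small dense affinity matrix with no isolated nodes. The generalized problem (D - W) y = λ D y is solved here through its equivalent symmetric form; thresholding the second-smallest eigenvector at zero is one simple choice among several, and the function names and example graph are illustrative assumptions.

```python
import numpy as np

def ncut_cost(W, in_A):
    """cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V) for B = V minus A."""
    in_A = np.asarray(in_A, dtype=bool)
    cut = W[np.ix_(in_A, ~in_A)].sum()
    assoc_A = W[in_A, :].sum()   # weights from nodes in A to all nodes
    assoc_B = W[~in_A, :].sum()  # weights from nodes in B to all nodes
    return float(cut / assoc_A + cut / assoc_B)

def ncut_bipartition(W):
    """Approximate two-way normalized cut from the second-smallest eigenvector."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric form of (D - W) y = lambda * D y, using z = D^(1/2) y.
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    y = d_inv_sqrt * eigvecs[:, 1]
    return y > 0  # simple threshold at zero (an assumption of this sketch)

# Four nodes (assumed data): 0-1 and 2-3 strongly linked, 0-2 and 1-3 weakly.
W = np.array([
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.9],
    [0.0, 0.1, 0.9, 0.0],
])
in_A = ncut_bipartition(W)
cost = ncut_cost(W, in_A)
```

For an image, W has one row per pixel, so practical implementations rely on sparse matrices and sparse eigensolvers rather than the dense decomposition used here.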
Example results
Using texture features for segmentation
• Texture descriptor is vector of filter bank outputs
• Textons are found by clustering
• Affinities are given by similarities of texton histograms over windows given by the “local scale” of the texture
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1),7-27,2001.
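A minimal sketch of a texton-histogram affinity, assuming each pixel has already been assigned a texton index. The chi-squared distance and the exponential mapping are assumptions chosen for illustration, as are the function names, the `sigma` value, and the tiny windows; the slides only state that affinities come from similarities of texton histograms.

```python
import numpy as np

def texton_histogram(texton_ids, n_textons):
    """Normalized histogram of texton indices inside one window."""
    counts = np.bincount(np.ravel(texton_ids), minlength=n_textons).astype(float)
    return counts / counts.sum()

def chi2_distance(h1, h2, eps=1e-12):
    """Chi-squared distance between two normalized histograms."""
    return 0.5 * float(np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def texton_affinity(h1, h2, sigma=0.25):
    """Map histogram distance to an affinity in (0, 1] (assumed form)."""
    return float(np.exp(-chi2_distance(h1, h2) / sigma))

# Windows of texton indices (assumed data). a and b contain the same textons in
# a different arrangement; c contains a different texton entirely.
window_a = np.array([[0, 0], [1, 1]])
window_b = np.array([[1, 1], [0, 0]])
window_c = np.array([[2, 2], [2, 2]])

h_a, h_b, h_c = (texton_histogram(w, n_textons=3) for w in (window_a, window_b, window_c))
aff_same_texture = texton_affinity(h_a, h_b)
aff_diff_texture = texton_affinity(h_a, h_c)
```

Because a histogram discards where each texton occurs inside the window, two windows over the same texture get a high affinity even when their pixel patterns do not line up.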
The importance of scale
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1),7-27,2001.
Example results
J. Malik, S. Belongie, T. Leung and J. Shi. "Contour and Texture Analysis for Image Segmentation". IJCV 43(1),7-27,2001.
Normalized cuts: Pro and con
• Pros
  • Generic framework, can be used with many different features and affinity formulations
• Cons
  • High storage requirement and time complexity
  • Bias towards partitioning into equal segments
* Slide from Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Integrating top-down and bottom-up segmentation
Z.W. Tu, X.R. Chen, A.L. Yuille, and S.C. Zhu. Image parsing: unifying segmentation, detection and recognition. IJCV 63(2), 113-140, 2005.
Image parsing
• Define generative models for text and faces
  • Deformable spline-based templates for characters
  • PCA model for faces
• Top-down: propose a model for a given region
• Bottom-up: verify the consistency of the model with image features
Image parsing
Data-driven Markov Chain Monte Carlo
Example results