Digital Images Levels of Abstractionjacky/Teaching/Courses/COMP_4060... · 2008-11-03 · – Watershed algorithm – Snakes Hough transform – detect lines – general Hough transform

COMP_4190 Artificial Intelligence

Computer Vision

Jacky Baltes

Department of Computer Science

University of Manitoba

Winnipeg, Manitoba

Canada, R3T 2N2

[email protected]

http://www.cs.umanitoba.ca/~jacky

Computer Vision

• Introduction

• Digital images

• Edge detection

• Segmentation

  – Watershed algorithm

  – Snakes

• Hough transform

  – detect lines

  – general Hough transform

• Scene interpretation

  – Waltz Algorithm

Digital Images

• Computer graphics: model -> image

• Computer vision: image -> model

• Two-dimensional array of picture elements (pixels)

• Captured through a sensor

  – camera, scanner

  – structured lighting

Levels of Abstraction

• Lowest level:

  – Pre-processing, noise removal, edge detection

• Middle level:

  – Segmentation into regions

• High level:

  – Object recognition, motion analysis

  – World model

Optical Illusions

PPM File Format

• We will use the PPM file format in this course (easy, lossless)

• Colour format with three channels (red, green, blue)

• ASCII header

  – P6: magic number

  – 312 235: width and height

  – 255: maximum colour value per channel (depth)

• Followed by width * height * 3 bytes of pixel data (one byte per channel when the maximum value is 255)
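The header layout above can be read with a few lines of Python. This is a minimal sketch, not a full PPM reader: the function name `parse_ppm` is hypothetical, the file is assumed to be already loaded into a `bytes` object, and only maximum values up to 255 (one byte per sample) are handled. For the 312 x 235 example the raster would be 312 * 235 * 3 = 219960 bytes.

```python
def parse_ppm(data: bytes):
    """Parse a binary (P6) PPM held in memory.

    Returns (width, height, maxval, pixels), where pixels[y][x] is an
    (r, g, b) tuple. Sketch only: assumes maxval <= 255.
    """
    tokens = []
    pos = 0
    # The header is four whitespace-separated ASCII tokens:
    # magic number, width, height, maximum colour value.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":  # a comment runs to the end of the line
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # a single whitespace byte separates header and raster

    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P6":
        raise ValueError("not a binary PPM (P6) file")
    if maxval > 255:
        raise ValueError("two-byte samples are not handled in this sketch")

    size = width * height * 3  # three one-byte channels per pixel
    raster = data[pos:pos + size]
    if len(raster) != size:
        raise ValueError("truncated pixel data")

    pixels = [
        [tuple(raster[(y * width + x) * 3:(y * width + x) * 3 + 3])
         for x in range(width)]
        for y in range(height)
    ]
    return width, height, maxval, pixels
```

A typical call would be `parse_ppm(open("image.ppm", "rb").read())`, where the file name is only an example.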

PPM File Format

P6

312 235

255

<00><D9><D6><D7><D4><D4><D6><D7><DA><DA><E0><DB><E6><D8><E6><D6><E5><D6>

Edge Detection

• Colours are very susceptible to lighting

• An important pre-processing step is edge detection

• How do we find an edge?

  – Sharp contrast in the image

  – Derivative of the image function I(i,j)

  – Approximate derivative with I(i+1,j) - I(i-1,j)

Convolution

• Edge detection and many other image pre-processing steps can be implemented as a convolution

• A convolution mask is a matrix that is applied to each pixel in the image

• Specifies weights of the neighbors

Derivative mask:

 0  0  0
-1  0 +1
 0  0  0

Convolution

• What is the output for the pixel row [ 0, 255, 0, 0, ... ]?

  – -255? (a negative value does not fit into a 0 .. 255 pixel)

• Use divisor and offset to normalize result of convolution to 0 .. 255

• How to deal with colour images?

  – Convert to grey scale

  – Handle each channel separately
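The divisor and offset idea can be sketched in Python as follows. The function name `convolve3x3` is hypothetical, the image is assumed to be a grey-scale list of rows, and border pixels are simply copied unchanged, which is only one of several possible border policies. The mask is applied without flipping, i.e. as a weighted sum of the neighbours as described on the slide.

```python
def convolve3x3(image, mask, divisor=1, offset=0):
    """Apply a 3x3 mask to a grey-scale image given as a list of rows.

    Each output pixel is (weighted sum of neighbours) // divisor + offset,
    clamped to 0 .. 255. Border pixels are copied unchanged (assumption).
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            total = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    total += mask[di + 1][dj + 1] * image[i + di][j + dj]
            value = total // divisor + offset
            out[i][j] = max(0, min(255, value))
    return out


# Derivative mask from the slide: right neighbour minus left neighbour.
DERIVATIVE = [[0, 0, 0],
              [-1, 0, 1],
              [0, 0, 0]]
```

For the derivative mask the raw result lies in -255 .. 255, so a divisor of 2 and an offset of 128 map it into the displayable range; these two values are an illustrative choice, not taken from the slides.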

Sobel Edge Detection

• To reduce noise, average over several rows

• Weight rows differently

• Divisor = 8, Offset = 128

-1  0 +1
-2  0 +2
-1  0 +1

Sobel Edge Detection

• Use separate convolution matrices for horizontal and vertical

Horizontal        Vertical

-1  0 +1          -1 -2 -1
-2  0 +2           0  0  0
-1  0 +1          +1 +2 +1
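The two Sobel masks can be combined into a single edge-strength image. The sketch below is one common way to do that: it takes the Euclidean length of the two responses and clamps it to 255. The combination rule, the function name `sobel_magnitude`, and the choice to leave border pixels at 0 are assumptions, not something the slides prescribe.

```python
import math

SOBEL_H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # "Horizontal" mask
SOBEL_V = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # "Vertical" mask


def sobel_magnitude(image):
    """Return the clamped gradient magnitude of a grey-scale image.

    image is a list of rows; border pixels are left at 0 (assumption).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            gh = gv = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    p = image[i + di][j + dj]
                    gh += SOBEL_H[di + 1][dj + 1] * p
                    gv += SOBEL_V[di + 1][dj + 1] * p
            # Combine both responses into one edge strength.
            out[i][j] = min(255, int(math.hypot(gh, gv)))
    return out
```

On a vertical step edge only the horizontal mask responds, so the magnitude is large at the step and zero in the flat regions on either side.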

Convolution

• Many other filters can be implemented efficiently as convolution

• One problem: what to do at the borders

• What does the following filter do?

+1 +1 +1
+1 +1 +1
+1 +1 +1

Blurring (Simple)

• Blurring is used to reduce noise in the image

+1 +1 +1
+1 +1 +1
+1 +1 +1

Template Matching Convolution

• Find specific features in the image

 0 +1  0
+1 +1 +1
 0  0  0

Divisor = 4

Segmentation

[Figure: image analysis pipeline from image to data: pre-processing and image enhancement; segmentation ("What are the objects to be analyzed?"); binary operations; morphological operations and feature extraction; classification and matching]

Segmentation

• Full segmentation: Individual objects are separated from the background and given individual ID numbers (labels).

• Partial segmentation: The amount of data is reduced (usually by separating objects from background) to speed up the further processing.

• Segmentation is often the most difficult problem to solve in the process; there is no universal solution!

• The problem can be made much easier if solved in cooperation with the constructor of the imaging system (choice of sensors, illumination, background etc.).

Three Types of Segmentation

• Classification – Based on some similarity measure between pixel values. The simplest form is thresholding.

• Edge-based – Search for edges in the image. They are then used as borders between regions.

• Region-based – Region growing, merge & split

Common idea: search for discontinuities and/or similarities in the image

Thresholding

(Global and Local)

• Global: based on some kind of histogram: grey-level, edge, feature etc.

  – Lighting conditions are extremely important, and it will only work under very controlled circumstances.

  – Fixed thresholds: the same value is used in the whole image

• Local (or dynamic thresholding): depends on the position in the image. The image is divided into overlapping sections which are thresholded one by one.

Classical Automatic Thresholding Algorithm

1. Select an initial estimate for T

2. Segment the image using T. This produces 2 groups: G1, pixels with value > T, and G2, pixels with value ≤ T

3. Compute µ1 and µ2, the average pixel values of G1 and G2

4. New threshold: T = 1/2 (µ1 + µ2)

5. Repeat steps 2 to 4 until T stabilizes.

• Very easy + very fast

• Assumptions: normal dist. + low noise
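The five steps translate almost directly into code. In this sketch the function name `iterative_threshold`, the use of the global mean as the initial estimate, and the stopping tolerance are assumptions made for illustration.

```python
def iterative_threshold(pixels, tolerance=0.5):
    """Classical automatic thresholding on a flat list of grey values."""
    t = sum(pixels) / len(pixels)           # step 1: initial estimate (global mean)
    while True:
        g1 = [p for p in pixels if p > t]   # step 2: split into two groups
        g2 = [p for p in pixels if p <= t]
        if not g1 or not g2:                # degenerate image: one group is empty
            return t
        mu1 = sum(g1) / len(g1)             # step 3: group means
        mu2 = sum(g2) / len(g2)
        new_t = 0.5 * (mu1 + mu2)           # step 4: new threshold
        if abs(new_t - t) < tolerance:      # step 5: stop when T stabilizes
            return new_t
        t = new_t
```

For a clearly bimodal input such as `[10, 12, 11, 200, 205, 210]` the threshold settles between the two clusters after one iteration.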

Optimal Thresholding

• Based on the shape of the current image histogram. Search for valleys, Gaussian distributions etc.

[Figure: real histogram with overlapping background and foreground distributions; where is the optimal threshold?]

Histograms

To love ... and to hate

Thresholding and illumination

• Solutions:

  – Calibration of the imaging system

  – Percentile filter with very large mask

  – Morphological operators

[Figure: MR non-uniformity example; panels: median filtering, thresholding]

More thresholding

• Can also be used on other kinds of histogram: grey-level, edge, feature etc.

Multivariate data (→ see next lectures)

• Problems:

  – Only considers the graylevel pixel value, so it can leave "holes" in segmented objects.

  – Solution: post-processing with morphological operators

  – Requires strong assumptions to be efficient

  – Local thresholding is better → see region growing techniques

Edge-based Segmentation

Based on finding discontinuities (local variations of image intensity)

1. Apply an edge detector, e.g. a gradient operator (Sobel) or a second derivative (Laplace)

2. Threshold the edge image to get a binary image

3. Depending on the type of edge detector:

  – Link edges together to close shapes (using edge direction for example)

  – Remove spurious edges

Gradient based procedure

[Figure: Sobel edge images]

Zero-crossing based procedure

[Figure: LoG (Laplacian of Gaussian) edge image]

Edge-based Segmentation: examples

[Figure: Prewitt: needs edge linking. Canny: needs "cleaning".]

Region based segmentation

• Work by extending some region based on local similarities between pixels

  – region growing (bottom-up method)

  – region splitting and merging (top-down method)

• Bottom-up: from data to representation

• Top-down: from model to data

Region growing: Bottom Up Method

1. Find starting points

2. Include neighbouring pixels with similar features (grey-level, texture, color).

3. Continue until all pixels have been included with one of the starting points.

• Problems:

  – Not trivial to find good starting points, difficult to automate

  – Need good criteria for similarity.
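A minimal sketch of region growing from a single starting point is shown below. The similarity criterion (absolute grey-level difference to the seed pixel of at most `max_diff`), the 4-neighbourhood, and the name `region_grow` are assumptions; the slides leave the choice of criterion open.

```python
from collections import deque


def region_grow(image, seed, max_diff=10):
    """Grow a region from seed = (row, col) on a grey-scale image.

    A 4-connected neighbour joins the region when its grey level differs
    from the seed pixel by at most max_diff. Returns a set of (row, col).
    """
    height, width = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        i, j = queue.popleft()
        for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
            inside = 0 <= ni < height and 0 <= nj < width
            if inside and (ni, nj) not in region:
                if abs(image[ni][nj] - seed_value) <= max_diff:
                    region.add((ni, nj))
                    queue.append((ni, nj))
    return region
```

Running it once per starting point, and assigning each result its own label, gives the segmentation described in the steps above.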

Region Growing: Watershed Algorithm

• Think of the grey-level image as a landscape. Let water rise from the bottom of each valley (the water from each valley is given its own label). As soon as the water from two valleys meets, build a dam, or watershed. These watersheds will then define the borders between different regions.

Example of watershed directly on a gray-level image

Example of watershed on a binary image

Watershed: Problems and Solutions

• Oversegmentation

  – Watershed from markers

• Computation

  – new algorithm for fast watershed

• Graylevel might not be the optimal choice as the local similarity measure

  – bigger neighborhood when growing

  – other local features (statistical, edge enhanced image, distance transformed image…)

Region Growing: Split & Merge

1. Set up some criteria for what is a uniform area (e.g. mean, variance, bimodality of histogram, texture, etc.)

2. Start with the full image and split it into 4 sub-images

3. Check each sub-image. If not uniform, divide into 4 new sub-images

4. After each iteration, compare touching regions with neighboring regions and merge if uniform

The method is also called "quadtree" decomposition/division (and is also used for compression)

Split & Merge

The Hough transform

• A method for finding global relationships between pixels.

Example: We want to find straight lines in an image

1. Apply edge enhancing filter (e.g. Laplace)

2. Set a threshold for what filter response is considered a true "edge pixel"

3. Extract the pixels that are on a straight line using the Hough transform

[Figure: original image; edge enhanced image; thresholded edge image]

The Hough transform

Finding straight lines:

1. consider a pixel in position (xi,yi)

2. equation of a straight line: yi = a xi + b

3. set b = -a xi + yi and draw this (single) line in "ab-space"

4. consider the next pixel with position (xj,yj) and draw the line b = -a xj + yj in "ab-space" (also called parameter space). The point (a',b') where the two lines intersect represents the line y = a'x + b' in "xy-space", which goes through both (xi,yi) and (xj,yj).

5. draw the line in ab-space corresponding to each pixel in xy-space.

6. divide ab-space into accumulator cells and find the most common (a',b'), which gives the line connecting the largest number of pixels

The Hough transform

[Figure: points in xy-space (axes x, y) and the corresponding lines in ab- or parameter space (axes a, b)]

The Hough transform

• In reality we have a problem with y=ax+b because a reaches infinity for vertical lines. Use the normal form $\rho = x\cos\theta + y\sin\theta$ instead.

• It is common to use "filters" for finding the intersection: "butterfly filters"

• Different variations of the Hough transform can also be used for finding other shapes of the form g(v,c)=0, where v is a vector of coordinates and c is a vector of coefficients.

• Possible to find any kind of simple shape, e.g. a circle (3D parameter space): $(x - c_1)^2 + (y - c_2)^2 = c_3^2$
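The accumulator idea can be sketched in Python. To avoid the infinite slope of vertical lines, this sketch votes in the normal form rho = x cos(theta) + y sin(theta), which is the usual replacement for y = ax + b. The function names, the one-degree angle step, and the rounding of rho to whole pixels are assumptions chosen for illustration.

```python
import math
from collections import Counter


def hough_lines(points, n_theta=180):
    """Vote for lines through a list of (x, y) edge pixels.

    Returns a Counter mapping accumulator cells (theta index, rho) to the
    number of pixels voting for them; theta = pi * index / n_theta.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] += 1
    return votes


def strongest_line(points, n_theta=180):
    """Return (theta in degrees, rho, votes) for a fullest accumulator cell."""
    (t, rho), count = hough_lines(points, n_theta).most_common(1)[0]
    return 180.0 * t / n_theta, rho, count
```

Because rho is rounded to whole pixels, neighbouring angle cells can collect the same number of votes for short lines; a real implementation would look for local maxima in the accumulator rather than a single cell.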

The Hough transform

Conclusions

• The segmentation procedure

  – Pre-processing

  – Segmentation

  – Post-processing

  ⇒ Like any image processing (IP) procedure

• There exists NO universal segmentation method

• Evaluation of segmentation performance is important


Snakes

• Example: segmentation of the brain in MRI

[Figure: snake after initialization; snake at equilibrium; user interaction]

Snakes (active contours)

• A snake is an active contour parametrically represented by its position v(s)=(x(s), y(s)), where s ranges from 0 to 1

• Each position is associated to an energy:

$$E_{snake} = \int_0^1 E_{int}[v(s)]\,ds + \int_0^1 E_{ext}[v(s)]\,ds$$

• The final position corresponds to the minimum of the energy

Internal Energy

The internal energy of the snake is due to bending and it is associated with a priori constraints:

• α(s) controls the tension of the contour

• β(s) controls its rigidity
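The formula itself did not survive on this slide. A commonly used form of the internal energy (the one from the original active contour formulation by Kass, Witkin and Terzopoulos, given here as an assumption about what the slide showed) weights the first derivative of the contour by a tension term α(s) and the second derivative by a rigidity term β(s):

```latex
E_{int}[v(s)] = \frac{1}{2}\left( \alpha(s)\left|\frac{dv}{ds}\right|^{2}
              + \beta(s)\left|\frac{d^{2}v}{ds^{2}}\right|^{2} \right)
```

A large α(s) penalizes stretching, so the contour shrinks like an elastic band; a large β(s) penalizes bending, so the contour stays smooth.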

External Energy

• The external energy depends on the image and accounts for a posteriori information

• Several energy forms have been proposed based on features of interest in the image

• An energy commonly used to attract snakes towards edges is:
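The expression is missing from the extracted slide. A frequently used edge-attracting energy, stated here as an assumption rather than as the slide's exact formula, is the negative squared gradient magnitude of the image I at the contour position:

```latex
E_{ext}[v(s)] = -\left| \nabla I\big(x(s), y(s)\big) \right|^{2}
```

The energy is lowest where the gradient is strongest, so minimizing it pulls the contour towards edges. In practice the image is often smoothed with a Gaussian first, which widens the range from which edges can attract the snake.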

Applications

(by Terzopoulos)

Considerations

• The number of nodes is an important factor for the behavior of the snake. Ability to resample the contour may be necessary.

• If we want a closed contour, we set the first and the last point equal.

• Anchor points are necessary to keep the snake in position if the image forces are not enough.

• It may be necessary to allow a snake contour to divide into two contours, or two contours to merge into one contour.

• Different applications may need different potential functions and different settings of the control parameters (damping, tension and rigidity).

Applications

• Tracking of a moving object

  – An initial estimate for the contour (e.g. interactively defined) is used in the first frame.

  – The contour at equilibrium is used as the starting contour for the next frame. The snake locks on to the object.

• Reconstruction from serial sections

  – The user draws an approximate contour in the first slice.

  – The contour at equilibrium is used as the starting contour in the next slice.

  – The 3D object is reconstructed from the contours using triangulation.

• …

More segmentation

Important in Image Processing in general:

"If you can use expert knowledge (user interaction, modelling, …) at relatively low cost (development, computational, …)"

JUST DO IT!!

Shape descriptions

Goal: description of image information suitable to use for classification, recognition, interpretation

• Based on contours:

  – chain codes

  – polygonal approximation

  – signature

  – Fourier descriptors, …

• Based on regions:

  – area, P2A, …

  – topology

  – moments, …

Representation schemes

• based on boundary or region?

• reconstruction?

• stable under scaling, rotation, and translation?

• recognition by incomplete representation?

Chain coding

• boundary as a sequence of straight lines

• 4- or 8-connectedness

Example of chain coding

4-connectedness 8-connectedness

4-code: 0 0 0 3 0 3 0 3 0 3 2 3 2 1 1 2 2 2 2 3 2 1 1 0 1 1

8-code: 0 0 0 7 7 7 6 5 2 3 4 4 5 4 2 2 1 2

Problems with chain coding

• Long code

• Dependent on

  – Scale

  – Starting point

  – Rotation

• Small change in the boundary (noise or bad segmentation) can generate a chain code that does not reflect the shape of the object

More stable

• Rescale

• Starting point

  – treat as a circular sequence

  – define as starting point the point giving the least magnitude

• Rotation

  – use difference code

Example: chain code 10103322 has the difference code 3133030

Codes from different starting points

AB

1212232330301010Bx

2323303010101212Bo

0101012122323303B•

1133345775557110Ax

3457755571101133Ao

1101133345775557A•

Shape number

• Object descriptor by chain coded edge

  – Difference code

  – Starting point giving least magnitude

  – Number of digits gives the order

rotation independent & unique ... but dependent on orientation of grid
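The difference code and the shape number can be computed as sketched below. The function names are hypothetical, and the difference between two consecutive codes is assumed to be taken modulo the number of directions (counting direction changes counter-clockwise), which reproduces the example 10103322 → 3133030 from the slides.

```python
def first_difference(code, directions=4, circular=True):
    """Difference code of a chain code given as a string of digits.

    Each output digit is (next - previous) modulo the number of directions.
    With circular=True the transition from the last element back to the
    first is included as the first digit.
    """
    digits = [int(c) for c in code]
    pairs = list(zip(digits, digits[1:]))
    if circular:
        pairs.insert(0, (digits[-1], digits[0]))
    return "".join(str((b - a) % directions) for a, b in pairs)


def shape_number(code, directions=4):
    """Rotation of the circular difference code with the least magnitude."""
    diff = first_difference(code, directions, circular=True)
    # For equal-length digit strings, string order equals numeric order.
    return min(diff[i:] + diff[:i] for i in range(len(diff)))
```

For example, `shape_number("0321")` gives `"3333"`, the shape number of the unit square, and the number of digits (4) is its order.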

Shape numbers of order ≤ 8

order   chain code   difference   shape no.
4       0321         3333         3333
6       003221       303303       033033
8       00332211     30303030     03030303
8       00032221     30033003     00330033
8       03032211     33133030     03033133

Basic rectangle

• diameter for a boundary B is defined as diam(B) = max(D(pi,pj)), pi,pj ∈ B

  – length

  – orientation

• diameter defines major axis

• minor axis perpendicular to major axis

• basic rectangle – minimal enclosing rectangle using minor & major axes

Shape number –

using basic rectangle

• Shape order n

• Choose rectangle of order n best approximating the basic rectangle

• Rectangle gives gridsize

  – n=12: 2×4, 3×3, 1×5

• ⇒ a unique shape number

Polygonal approximation

• represent the boundary with a polygon

• reduction of discretization effects

• idea: find a simple polygon that reflects the shape of the boundary

• fit straight lines to boundary

• often not trivial and time consuming

Minimum perimeter polygon

[Figure: original boundary; minimum perimeter polygon]

Polygon: merge, split

Signature

• 1D functional representation of the boundary

• simplest approach: distance from centroid as function of angle

• 2D boundary → 1D function

• problems with rotation and scaling

  – select centroid

  – select start point, e.g., as furthest from centroid

  – rescale function, e.g., so values ∈ [0,1]

Signatures for two boundaries

Boundary Segments

• decomposition of boundary into segments

• reduce complexity ⇒ simplify description

• often of interest in case of concavities

• use decomposition based on convex hull

Curvature

• rate of change of slope: $\partial^2 s(t) / \partial t^2$

• difficult in the discrete case (= images) due to uneven boundary

• use boundary segments

  – fit straight line

  – define as change of slope between adjacent segments (n1 and n2 in the figure)

Using Curvature Information

• Bending energy for boundary of length L

  – energy necessary to bend a rod to desired shape...

  – sum of squared curvature c(k):

$$BE = \frac{1}{L}\sum_{k=1}^{L} c^2(k)$$

• Decomposition based on points of extreme curvature

Fourier descriptors

• represent boundary (of K pixels) by a sequence of coordinates

  – s(k)=(x(k),y(k)), k=0,1,2,...,K-1

  – complex number s(k)=x(k)+jy(k)

  – 2D → 1D

• discrete Fourier Transform (DFT)

a(u) – Fourier descriptors of the boundary

Fourier descriptors – reconstruction

• reconstruction of boundary using inverse DFT

• approximation of boundary using only P first Fourier coefficients

  – gives the same number of points in boundary

• remember:

  – high-frequency for fine details

  – low-frequency for global shape
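Descriptor computation and reconstruction can be sketched in plain Python. The normalization is an assumption: the forward transform is divided by K, so that a(0) is the centroid of the boundary, and the inverse transform carries no factor. The function names are hypothetical, and the approximation keeps the coefficients u = 0 .. P-1, following the "P first coefficients" wording of the slide.

```python
import cmath


def fourier_descriptors(boundary):
    """DFT of a boundary given as a list of (x, y) points.

    Each point becomes the complex number x + jy; the result is the list
    a(u), u = 0 .. K-1, with a 1/K factor in the forward transform.
    """
    K = len(boundary)
    s = [complex(x, y) for x, y in boundary]
    return [
        sum(s[k] * cmath.exp(-2j * cmath.pi * u * k / K) for k in range(K)) / K
        for u in range(K)
    ]


def reconstruct(a, P):
    """Inverse DFT using only the first P descriptors.

    Always returns K points, however small P is.
    """
    K = len(a)
    points = []
    for k in range(K):
        z = sum(a[u] * cmath.exp(2j * cmath.pi * u * k / K) for u in range(P))
        points.append((z.real, z.imag))
    return points
```

With P = K the boundary is recovered exactly (up to rounding); with P = 1 every point collapses onto the centroid, which illustrates that the low-frequency terms carry the global shape.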

Properties of Fourier descriptors

transformation    boundary                   Fourier descriptor
identity          s(k)                       a(u)
rotation          s_r(k) = s(k) e^{jθ}       a_r(u) = a(u) e^{jθ}
translation       s_t(k) = s(k) + Δxy        a_t(u) = a(u) + Δxy δ(u)
scaling           s_s(k) = α s(k)            a_s(u) = α a(u)
starting point    s_p(k) = s(k - k0)         a_p(u) = a(u) e^{-j2πk0u/K}

Δxy = Δx + jΔy

Measures

• length of boundary – perimeter

  – count pixels: ROUGH approximation

  – chain code: simple representation

  – a x # edge steps + b x # point steps

• diameter: largest distance between two pixels on the boundary

  – length

  – direction

[Figure: step lengths a and b]

Area

Number of pixels

[Figure: blue = 10, green = 4]

Compactness

• P2/A: perimeter x perimeter / area

  – dimensionless

  – minimal for disc

  – rotation invariant

[Figure: a compact and a not compact shape]

P2/A

perimeter x perimeter / area; normalization: divide by 4π, so that a disc gives 1

area = 3591, perimeter = 221: P2/A = 13.60, P2/A norm = 1.08

area = 10538, perimeter = 798: P2/A = 60.43, P2/A norm = 4.81
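The two example measurements can be checked with a few lines of Python. The function names are hypothetical; the normalization divides by 4π because a disc of radius r has P2/A = (2πr)² / (πr²) = 4π.

```python
import math


def compactness(perimeter, area):
    """P2/A: squared perimeter divided by area (dimensionless)."""
    return perimeter ** 2 / area


def normalized_compactness(perimeter, area):
    """P2/A divided by 4*pi, so that a perfect disc scores 1."""
    return perimeter ** 2 / (4 * math.pi * area)
```

With perimeter 221 and area 3591 this gives about 13.60 and 1.08; with perimeter 798 and area 10538 it gives about 60.43 and 4.81, matching the values on the slide.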

Examples of P2/A measurements

Eccentricity

• longest chord / max perpendicular chord

• there are different definitions!

Elongatedness

1. height/width of the minimal bounding rectangle

2. area/(2d^2)

  – d is maximum width

3. longest possible path

[Figure: OK / not OK]

Rectangularity

• area of region / area of bounding rectangle

Topology

• Euler number E = C - H

  – C – number of connected components

  – H – number of holes

Invariant under "rubber sheet" transformation

Convex hull

• convex region R

  – for any x1, x2 ∈ R, the straight line between x1 and x2 is in R

• convex hull H of a region R

  – smallest convex set containing R

• convex deficiency D

[Figure: R = green, H = green + brown, D = brown]

Projections

$$p_h(i) = \sum_j f(i,j)$$

$$p_v(j) = \sum_i f(i,j)$$

Principal component analysis (PCA)

• points as 2D vectors (a,b)

  – a – x-coordinate

  – b – y-coordinate

• mean vector and covariance matrix

• find eigenvectors e1 and e2

• rotate

Hotelling transform

Moments

Given a 2D discrete function f(x,y)

moment of order p+q

central moment of order p+q

centre of mass
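The formulas did not survive extraction. The sketch below uses the standard definitions, stated here as an assumption about what the slide showed: the moment of order p+q is the sum of x^p y^q f(x,y) over all pixels, the centre of mass is (m10/m00, m01/m00), and the central moment uses coordinates relative to the centre of mass. The function names are hypothetical, and f is stored as a list of rows so that f[y][x] is the value at (x, y).

```python
def raw_moment(f, p, q):
    """Moment of order p+q: sum of x**p * y**q * f(x, y)."""
    return sum(
        (x ** p) * (y ** q) * f[y][x]
        for y in range(len(f))
        for x in range(len(f[0]))
    )


def centroid(f):
    """Centre of mass (m10 / m00, m01 / m00)."""
    m00 = raw_moment(f, 0, 0)
    return raw_moment(f, 1, 0) / m00, raw_moment(f, 0, 1) / m00


def central_moment(f, p, q):
    """Central moment of order p+q, taken about the centre of mass."""
    xc, yc = centroid(f)
    return sum(
        ((x - xc) ** p) * ((y - yc) ** q) * f[y][x]
        for y in range(len(f))
        for x in range(len(f[0]))
    )
```

For a binary region, `raw_moment(f, 0, 0)` is simply the area in pixels, and the first-order central moments are always zero.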

Measurements with moments

• µ20 horizontal centralness

• µ02 vertical centralness

• µ11 diagonality: indicates with respect to centroid where R has more "mass"

• µ12 horizontal divergence: indicates the relative extent of the left of R compared to the right

• µ21 vertical divergence: indicates the relative extent of the bottom of R compared to the top

• µ30 horizontal imbalance: location of center of gravity with respect to half horizontal extent

• µ03 vertical imbalance: location of center of gravity with respect to half vertical extent

Example: Cell Classification

n1,…,n6 = normal

d1,…,d6 = diabetic

Shapes differing

in size, orientation,

border irregularity…

Waltz Algorithm

• One of the earliest examples of a constraint satisfaction problem

• Interpret line drawings from solid polyhedra

Waltz's Algorithm

• Look at all intersections

• What type of intersection is this?

  – Concave intersection of three planes

  – External convex intersection

• Adjacent intersections impose constraints on each other

• CSP to find a unique set of labels

• First step in image understanding

Waltz Algorithm

• Assumptions

  – No shadows, no cracks

  – Three-faced vertices

  – General position: no junctions that move with small movement of the eye

• Then each line in the image is one of:

  – Boundary line of the object (right hand = solid, left hand = outside)

  – Convex edge (+)

  – Concave edge (-)

Waltz Algorithm

• Labeling of the edges

Waltz Algorithm

Types of Junctions

• 18 legal types of junctions

• Label each junction as one of those 18

• Both ends of a line must be consistent (same label)

• Reformulate as a CSP. Constraint propagation always works perfectly

Waltz Examples

References

• Les Kitchen (lecture slides)

  – http://www.cs.mu.oz.au/480/lec_intro_part.pdf

• Lucia Ballerini (lecture slides)

  – http://www.cb.uu.se/~lucia/

• Andrew Moore (lecture slides, CMU)

  – http://www2.cs.cmu.edu/~awm/tutorials/constraint05.pdf
