
Digital Image Processing

SHADAB KHAN

What is Image Processing?

• Essentially, it is a tool for analyzing image data.

• It is concerned with extracting meaningful information from real-world images.

• Digital image processing has evolved enormously in recent times, is still evolving, and remains one of the hottest research topics across the globe.

Typical applications of IP

• Automated visual inspection systems: checking objects visually for defects.

• Satellite image processing.

• Classification (OCR) and identification (handwriting, fingerprints), etc.

• Automated inspection can satisfy tight tolerances, and since it has no human subjectivity, the inspection parameters can be tuned freely.

Typical application of IP

• Biomedical field: extensive use of IP techniques for improving the dark images that are typical in this field, e.g., the MEMICA arm system.

• Robotics: UGV, UAV, AUV, ROV.

• Miscellaneous: image forensics, movies, industry, defense, etc.

Traffic Monitoring

Face Detection

Medical IP

Stanley: Beginning of a new era

Morphing

Image representation:

• Computers cannot handle continuous images, only arrays of digital numbers.

• Images are therefore represented as 2-D arrays of points (a 2-D matrix).

• A point on this 2-D grid (an element of the image matrix) is called a pixel (picture element).

• A pixel value represents the average irradiance over the area of that pixel.
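To make this concrete, here is a minimal NumPy sketch (the pixel values are made-up sample data, not from the slides):

```python
import numpy as np

# An 8-bit grayscale image is just a 2-D array of integers, one per pixel.
img = np.array([[ 62,  79,  23, 119],
                [ 10,  10,   9,  62],
                [ 10,  58, 197,  46],
                [176, 135,   5, 188]], dtype=np.uint8)

print(img.shape)   # (rows, columns) -> (4, 4)
print(img[1, 2])   # pixel value at row 1, column 2 -> 9
```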

Image Basics:

• Image: a distribution f(x, y) in 2-D space. One with finite extent and discrete values of x, y, and f is a digital image.

• The tristimulus theory of color perception implies that any color can be obtained from a mix of the three primaries: red, green, and blue.

• Radiance is the total amount of energy that flows from the light source; it is usually measured in watts (W).

• Luminance, measured in lumens (lm), gives a measure of the amount of energy an observer perceives from a light source. For example, light emitted from a source operating in the far infrared region of the spectrum could have significant energy (radiance), but an observer would hardly perceive it; its luminance would be almost zero.

• Brightness is a subjective descriptor of light perception that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors in describing color sensation.

Overlapping of primaries

Image sensing:

Image acquisition:

f(x,y) = reflectance(x,y) * illumination(x,y)

where reflectance(x,y) lies in [0, 1] and illumination(x,y) in [0, ∞).

Sampling and Quantization

A digital image a[m, n] described in a 2-D discrete space is derived from an analog image a(x, y) in a 2-D continuous space through a sampling process frequently referred to as digitization. If our samples are Δ apart, we can write this as:

f[i, j] = Quantize{ f(iΔ, jΔ) }

The image can now be represented as a matrix of integer values:

62 79 23 119 120 105 4 0

10 10 9 62 12 78 34 0

10 58 197 46 46 0 0 48

176 135 5 188 191 68 0 49

2 1 1 29 26 37 0 77

0 89 144 147 187 102 62 208

255 252 0 166 123 62 0 31

166 63 127 17 1 0 99 30
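A minimal NumPy sketch of this digitization step, assuming an arbitrary illustrative continuous function f (not the image above):

```python
import numpy as np

# Sample a continuous function f(x, y) on a grid with spacing d,
# then quantize the samples to 8-bit integers (256 gray levels).
def f(x, y):
    # Arbitrary smooth function with values in [0, 1], for illustration only.
    return 0.5 * (1 + np.sin(0.5 * x) * np.cos(0.3 * y))

d = 1.0                                             # sample spacing
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
samples = f(i * d, j * d)                           # sampling: f(i*d, j*d)
digital = np.round(samples * 255).astype(np.uint8)  # quantization
print(digital)
```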

S & Q continued…

Gray level resolution

Spatial Resolution

Image as a function:

An image can be viewed as a function f(x, y): the value of f at spatial coordinates (x, y) gives the intensity at that point.

Types of images:

• Color images: a color image is just three functions pasted together. We can write this as a "vector-valued" function:

f(x, y) = [ r(x, y), g(x, y), b(x, y) ]

• Each of the three channels (e.g., stored in BGR order) takes values in [0, 255].

Gray scale images

• Each pixel has only one component, its grayscale value, which represents brightness on a 0-255 scale. It can also be stored in memory as a color image with R = G = B.

• Conversion back into a color image is possible with techniques that map the 2^8 gray levels to 2^24 colors, e.g., using artificial neural networks (ANNs).

Binary images

• Each pixel can take only two values: 0 and 1.

• Easiest to work with.

• Fast processing time.

• For rendering on a gray scale: 1 → 255, 0 → 0 (see the sketch below).
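A minimal NumPy sketch of producing a binary image by thresholding and rendering it on a gray scale (the sample values and the threshold 128 are illustrative):

```python
import numpy as np

gray = np.array([[ 12, 200,  90],
                 [240,  15, 130],
                 [  7, 180,  60]], dtype=np.uint8)   # sample data

binary = (gray > 128).astype(np.uint8)   # pixels are now 0 or 1
display = binary * 255                   # render: 1 -> 255, 0 -> 0
print(binary)
print(display)
```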

Image zooming and shrinking

• Zooming can be viewed as oversampling the image, and shrinking as undersampling it; the simplest scheme is nearest-neighbor interpolation (NNI), sketched below.

• Other techniques such as bilinear interpolation and cubic interpolation are used to produce better results and to reduce false contours in the image.
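A minimal sketch of nearest-neighbor zooming in NumPy (the helper nn_zoom is hypothetical, written here for illustration):

```python
import numpy as np

# Zoom by factor z: map each output pixel back to the nearest input pixel.
def nn_zoom(img, z):
    rows, cols = img.shape
    out_r, out_c = int(rows * z), int(cols * z)
    r_idx = np.minimum((np.arange(out_r) / z).astype(int), rows - 1)
    c_idx = np.minimum((np.arange(out_c) / z).astype(int), cols - 1)
    return img[np.ix_(r_idx, c_idx)]

img = np.array([[10, 20], [30, 40]], dtype=np.uint8)
print(nn_zoom(img, 2))   # each pixel replicated into a 2x2 block
```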

Zooming

Neighborhood

• N4, the 4-connectivity neighborhood (the von Neumann neighborhood, VN) of a pixel at position (x, y), is given by the following locations:

(x+1, y), (x-1, y), (x, y+1), (x, y-1)

• N8, the 8-connectivity neighborhood of a pixel at (x, y), adds the four diagonal neighbors:

(x+1, y), (x-1, y), (x, y+1), (x, y-1), (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

Equivalently, in terms of the distance measures defined next:

N4(a) = { b : D4(b, a) = 1 }   (city-block distance)
N8(a) = { b : D8(b, a) = 1 }   (chessboard distance)

Distance measurement

• Types of distances between pixels p = (x, y) and q = (s, t):

1. D4, or city-block distance:

D4(p, q) = |x - s| + |y - t|

2. D8, or chessboard distance:

D8(p, q) = max(|x - s|, |y - t|)

3. Euclidean distance:

De(p, q) = [ (x - s)^2 + (y - t)^2 ]^(1/2)
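The three measures, as a minimal Python sketch:

```python
import numpy as np

def d4(p, q):   # city-block distance
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):   # chessboard distance
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def de(p, q):   # Euclidean distance
    return np.hypot(p[0] - q[0], p[1] - q[1])

p, q = (2, 3), (5, 7)
print(d4(p, q), d8(p, q), de(p, q))   # 7 4 5.0
```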

Connected set, component, region, boundary

• Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of pixels that are connected to it in S is called a connected component of S. If it has only one connected component, then the set S is called a connected set.

• Let R be a subset of pixels in an image. We call R a region of the image if R is a connected set.

• The boundary (also called border or contour) of a region R is the set of pixels in the region that have one or more neighbors that are not in R. Edges are formed from pixels with derivative values that exceed a preset threshold. Depending on the type of connectivity and edge operators used, the edge extracted from a binary region may coincide with the region boundary.

Image operations:

• Image processing: image in → image out

• Image analysis: image in → measurements out

• Image understanding: image in → high-level description out

As we shall see in later lectures, we apply some kind of transformation function T to the image such that:

g(x, y) = T{f(x, y)}

This transformation function is applied in the following ways:

Operation over an image

Operation   Characterization                                               Generic complexity/pixel
Point       The output value at a specific coordinate depends only on
            the input value at that same coordinate.                       constant
Local       The output value at a specific coordinate depends on the
            input values in the neighborhood of that same coordinate.      P^2
Global      The output value at a specific coordinate depends on all
            the values in the input image.                                 N^2

Image size = N x N; neighborhood size = P x P. The complexity is specified in operations per pixel.

Image Enhancement in Spatial Domain

SHADAB KHAN

Image enhancement in spatial domain

• The principal objective of enhancement is to process an image so that the result is more suitable than the original image for a specific application. There are two broad categories:

• Spatial domain: approaches based on direct manipulation of the pixels in an image.

• Frequency domain: techniques based on modifying the Fourier transform of an image.

Spatial domain basics

• The spatial domain refers to the aggregate of pixels composing an image. Spatial domain methods are procedures that operate directly on these pixels:

g(x, y) = T[f(x, y)]

where f(x, y) is the input image, g(x, y) is the processed image, and T is an operator on f defined over some neighborhood of (x, y). In addition, T can operate on a set of input images, such as performing the pixel-by-pixel sum of K images for noise reduction.

Spatial domain basics

• The simplest form of T is when the neighborhood is of size 1x1 (a single pixel). In this case, g depends only on the value of f at (x, y), and T becomes a gray-level (also called an intensity or mapping) transformation function of the form

s = T(r)

where, for simplicity of notation, r and s are variables denoting the gray levels of f(x, y) and g(x, y) at any point (x, y).

Thresholding function

Point processing and Masks

• When enhancement at any point in an image depends only on the gray level at that point, techniques in this category are referred to as point processing.

• The more general approach is to use a function of the values of f in a predefined neighborhood of (x, y) to determine the value of g at (x, y). One of the principal approaches in this formulation is based on the use of so-called masks. Basically, a mask is a small 2-D array, such as the one shown below, in which the values of the mask coefficients determine the nature of the process, such as image sharpening. Enhancement techniques based on this type of approach are often referred to as mask processing or filtering.

w(1,1)  w(1,2)  w(1,3)
w(2,1)  w(2,2)  w(2,3)
w(3,1)  w(3,2)  w(3,3)

Spatial domain techniques

What is Contrast stretching?

How to enhance the contrast?

• Low contrast → image values concentrated near a narrow range (mostly dark, or mostly bright, or mostly medium values).

• Contrast enhancement → change the image value distribution to cover a wide range.

• The contrast of an image can be revealed by its histogram.

Some basic transformation functions

Image negatives: s = L - 1 - r

Log transformation: s = c log(1 + r)

Power-law transformation: s = c r^γ
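A minimal NumPy sketch of the three transforms applied to the full 8-bit gray-level range (the scaling constants c are chosen so the outputs stay in [0, 255]; γ = 0.4 is illustrative):

```python
import numpy as np

L = 256
r = np.arange(L, dtype=np.float64)          # input gray levels 0..255

negative = (L - 1) - r                      # s = L - 1 - r
c_log = (L - 1) / np.log(L)                 # scale so log output fits [0, 255]
log_t = c_log * np.log(1 + r)               # s = c log(1 + r)
gamma = 0.4
power = (L - 1) * (r / (L - 1)) ** gamma    # s = c r^gamma
```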

Image negative

s = L-1-r

Log transformation

s = c log(1+r)

Power law transformation

• The exponent in the power-law transform is referred to as "gamma":

s = c r^γ

Gamma Correction

Since image rendering devices vary in their gamma error, images that are telecast or uploaded to the internet are usually preprocessed to an averaged gamma value so that the image looks fine on most of them.

Power law transformation

(a) Magnetic resonance (MR) image of a fractured human spine. (b)-(d) Results of applying the power-law transformation with c = 1 and γ = 0.6, 0.4, and 0.3, respectively.

Power law transformation

(a) Aerial image. (b)-(d) Results of applying the power-law transformation with c = 1 and γ = 3.0, 4.0, and 5.0, respectively. (Original image for this example courtesy of NASA.)

Log/power law transformation


Piecewise linear stretching

Contrast stretching

Contrast stretching: (a) form of the transformation function; (b) a low-contrast image; (c) result of contrast stretching; (d) result of thresholding.

Contrast stretching

• If r1 = s1 and r2 = s2, the transformation is a linear function that produces no changes in gray levels. If r1 = r2, s1 = 0 and s2 = L - 1, the transformation becomes a thresholding function that creates a binary image. Intermediate values of (r1, s1) and (r2, s2) produce various degrees of spread in the gray levels of the output image, thus affecting its contrast. In general, r1 ≤ r2 and s1 ≤ s2 is assumed so that the function is single valued and monotonically increasing. This condition preserves the order of gray levels.
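A minimal NumPy sketch of this piecewise-linear stretching, assuming illustrative control points (r1, s1) = (70, 10) and (r2, s2) = (180, 240); the degenerate cases r1 = 0 or r2 = L-1 would need guarding against division by zero:

```python
import numpy as np

def stretch(img, r1, s1, r2, s2, L=256):
    r = img.astype(np.float64)
    out = np.piecewise(
        r,
        [r < r1, (r >= r1) & (r <= r2), r > r2],
        [lambda r: s1 / r1 * r,                                  # first segment
         lambda r: s1 + (s2 - s1) / (r2 - r1) * (r - r1),        # middle segment
         lambda r: s2 + (L - 1 - s2) / (L - 1 - r2) * (r - r2)]) # last segment
    return out.astype(np.uint8)

img = np.array([[60, 90], [150, 200]], dtype=np.uint8)
print(stretch(img, 70, 10, 180, 240))
```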

Gray level slicing

(a) This transformation highlights range [A, B] of gray levels and reduces all others to a constant level. (b) This transformation highlights range [A, B] but preserves all other levels. (c) An image. (d) Result of using the transformation in (a).

Gray level slicing

Bit plane slicing

• Usually the visually significant data is stored in the higher four bits; the rest of the data accounts for the subtle detail of an image.

• Used when we need to modify the contribution made by a particular bit plane (see the sketch below).
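A minimal NumPy sketch of extracting bit planes (the sample values are illustrative):

```python
import numpy as np

# Extract bit plane k (0 = least significant, 7 = most significant).
def bit_plane(img, k):
    return (img >> k) & 1          # 0/1 image for plane k

img = np.array([[200, 13], [255, 64]], dtype=np.uint8)
for k in range(8):
    print(k, bit_plane(img, k).ravel())

# Keeping only the top four planes retains most visually significant data:
approx = img & 0b11110000
```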

An 8-bit fractal image

A fractal is an image generated by mathematical expressions

Contribution of each bit plane

• The eight bit planes of the image in the earlier slide. The number at the bottom right of each image identifies the bit plane.

Histogram

• The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function h(rk) = nk, where rk is the k-th gray level and nk is the number of pixels in the image having gray level rk.

• Loosely speaking, p(rk) gives an estimate of the probability of occurrence of gray level rk.

Histogram

The graph on the right shows the histogram of the image on the left; notice the distribution of discrete lines (one per gray level rk, with height nk) along the x axis.

Histogram

• High-contrast image: its pixels have a large variety of gray tones, and the histogram is not far from uniform.

• Low-contrast image: the histogram will be narrow and concentrated toward the middle of the gray scale.

• Can two images have the same histogram? Yes: different images can have the same histogram.

More examples of histogram

Histogram Normalization

• It is very common to normalize the histogram.

• This is done by dividing the values by the total number of pixels n in the image; hence a normalized histogram is given by:

p(rk) = nk / n

• Note that the sum of all components of a normalized histogram is equal to 1.
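A minimal NumPy sketch of h(rk) = nk and the normalized histogram p(rk) = nk / n (tiny sample image for illustration):

```python
import numpy as np

img = np.array([[0, 1, 1], [2, 2, 2]], dtype=np.uint8)

h = np.bincount(img.ravel(), minlength=256)   # n_k for k = 0..255
p = h / img.size                              # normalized histogram
print(h[:3])        # [1 2 3]
print(p.sum())      # 1.0
```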

Histogram equalization

• Histogram equalization is done in order to obtain a uniform spread of the histogram over the gray-level range.

• Let us consider a transformation function of the form

s = T(r),   0 ≤ r ≤ 1

that produces a level s for every pixel having value r in the original image.

Histogram equalization

• Assumptions for T(r):

(a) T(r) is a single-valued and monotonically increasing function in the interval 0 ≤ r ≤ 1; and

(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.

• Validity of the assumptions: single-valuedness guarantees that the inverse transformation exists, monotonicity prevents inversion of gray levels, and condition (b) keeps the output gray levels in the allowed range.

• Consider the pdf pr(r) of r and the pdf ps(s) of s. Then we have:

ps(s) = pr(r) |dr/ds|

• Consider a transformation function of the form:

s = T(r) = ∫0^r pr(w) dw

i.e., the cumulative distribution function of r.

• According to Leibniz's rule, the derivative of a definite integral with respect to its upper limit is simply the integrand evaluated at that limit.

• Applying this rule to the equation stated previously, we get:

ds/dr = dT(r)/dr = d/dr [ ∫0^r pr(w) dw ] = pr(r)

• Therefore we have:

ps(s) = pr(r) |dr/ds| = pr(r) · (1 / pr(r)) = 1,   0 ≤ s ≤ 1

which shows that applying a transformation function of this form produces a uniform histogram.

• The discrete formulation is:

sk = T(rk) = Σ (j = 0 to k) pr(rj) = Σ (j = 0 to k) nj / n

• Thus a processed image is obtained by mapping each pixel with value rk in the input image into a corresponding pixel with level sk in the output image via the transformation function above.
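A minimal NumPy sketch of the discrete mapping; here the cumulative sum is scaled by L-1 so that sk lands on the 8-bit range rather than [0, 1] (the sample pixel values are illustrative):

```python
import numpy as np

def equalize(img, L=256):
    h = np.bincount(img.ravel(), minlength=L)
    cdf = np.cumsum(h) / img.size            # running sum of p_r(r_j)
    lut = np.round((L - 1) * cdf).astype(np.uint8)
    return lut[img]                          # map r_k -> s_k at every pixel

img = np.array([[52, 55, 61], [59, 79, 61], [85, 61, 59]], dtype=np.uint8)
print(equalize(img))
```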

Histogram Equalization

Notice how the vertical bars get separated and cover the entire range in the transformed result.

Spatial filtering and Masks

• Sometimes we need to manipulate values obtained from neighboring pixels to form the output image.

Example: how can we compute the average value of the pixels in a 3x3 region centered at a pixel z?

[Figure: a sample image with the 3x3 neighborhood of pixel z highlighted]

• How masking is done:

Masking continued….

Step 1. Select only the needed pixels: the 3x3 neighborhood centered at pixel z.

[Figure: the 3x3 subimage extracted from the image]

Masking continued…

Step 2. Multiply every selected pixel by 1/9 and then sum up the values:

y = (1/9)·p(1,1) + (1/9)·p(1,2) + … + (1/9)·p(3,3)

Equivalently, the nine coefficients 1/9 form a mask (also called a window or template) that multiplies the subimage.

Averaging an image

Question: how do we compute the 3x3 average value at every pixel?

[Figure: a sample image]

Solution: imagine that we have a 3x3 masking window that can be placed anywhere on the image.

Step 1: Move the window to the first location where we want to compute the average value, and then select only the pixels inside the window.

Step 2: Compute the average value of the subimage p under the window:

y = (1/9) Σ (i = 1 to 3) Σ (j = 1 to 3) p(i, j)

Step 3: Place the result at the corresponding pixel in the output image.

Step 4: Move the window to the next location and go to Step 2.
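A minimal Python sketch of these four steps for the 3x3 average (border pixels are handled here by edge replication, a choice the slides do not specify):

```python
import numpy as np

def average_3x3(img):
    rows, cols = img.shape
    out = np.zeros_like(img, dtype=np.float64)
    padded = np.pad(img.astype(np.float64), 1, mode="edge")  # handle borders
    for i in range(rows):
        for j in range(cols):
            # Window centered at (i, j); mean = (1/9) * sum of the 9 pixels.
            out[i, j] = padded[i:i + 3, j:j + 3].mean()
    return out

img = np.random.randint(0, 256, (5, 5))
print(average_3x3(img))
```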

The 3x3 averaging method is one example of a mask operation, or spatial filtering.

• The mask operation has a corresponding mask (sometimes called a window or template).

• The mask contains the coefficients to be multiplied with the pixel values.

Mask coefficients:

w(1,1)  w(1,2)  w(1,3)
w(2,1)  w(2,2)  w(2,3)
w(3,1)  w(3,2)  w(3,3)

Example: moving average. The mask of the 3x3 moving-average filter has all coefficients equal to 1/9:

1/9  1/9  1/9
1/9  1/9  1/9
1/9  1/9  1/9

The mask operation at each point is performed by:

1. Moving the reference point (center) of the mask to the location to be computed.

2. Computing the sum of products between the mask coefficients and the pixels of the subimage under the mask frame:

y = Σ (i = 1 to N) Σ (j = 1 to M) w(i, j) · p(i, j)

for an N x M mask, where p(i, j) are the subimage pixels under the mask.

The spatial filtering of the whole image is given by:

1. Move the mask over the image to each location.

2. Compute the sum of products between the mask coefficients and the pixels of the subimage under the mask.

3. Store the result at the corresponding pixel of the output image.

4. Move the mask to the next location and go to step 2 until all pixel locations have been used.

(a) Original image, of size 500x500 pixels. (b)-(f) Results of smoothing with square averaging filter masks of sizes n = 3, 5, 9, 15, and 35, respectively. The black squares at the top are of sizes 3, 5, 9, 15, 25, 35, 45, and 55 pixels, respectively; their borders are 25 pixels apart. The letters at the bottom range in size from 10 to 24 points, in increments of 2 points; the large letter at the top is 60 points. The vertical bars are 5 pixels wide and 100 pixels high; their separation is 20 pixels. The diameter of the circles is 25 pixels, and their borders are 15 pixels apart; their gray levels range from 0% to 100% black in increments of 20%. The background of the image is 10% black. The noisy rectangles are of size 50x120 pixels.

Image smoothing and thresholding

(a) Image from the Hubble Space Telescope. (b) Image processed by a 15x15 averaging mask. (c) Result of thresholding (b).

Order statistics filter

An order-statistics filter slides a moving window over the original image, computes a statistic of the subimage under it (mean, median, mode, min, max, etc.), and stores the result at the corresponding pixel of the output image.

Example of filtering

(a) X-ray image of a circuit board corrupted by salt-and-pepper noise. (b) Noise reduction with a 3x3 averaging mask. (c) Noise reduction with a 3x3 median filter.
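A minimal Python sketch of a 3x3 median filter, the order-statistics filter used in (c); edge replication at the borders is an assumption:

```python
import numpy as np

def median_3x3(img):
    rows, cols = img.shape
    out = np.empty_like(img)
    padded = np.pad(img, 1, mode="edge")
    for i in range(rows):
        for j in range(cols):
            # Median of the 9 pixels under the window centered at (i, j).
            out[i, j] = np.median(padded[i:i + 3, j:j + 3])
    return out

noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255                      # an isolated "salt" pixel
print(median_3x3(noisy)[2, 2])         # 100: the outlier is removed
```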

Derivatives of an image

• Any definition we use for a first derivative:

(1) must be zero in flat segments (areas of constant gray-level values);
(2) must be nonzero at the onset of a gray-level step or ramp; and
(3) must be nonzero along ramps.

Similarly, any definition of a second derivative:

(1) must be zero in flat areas;
(2) must be nonzero at the onset and end of a gray-level step or ramp; and
(3) must be zero along ramps of constant slope.

Since we are dealing with digital quantities whose values are finite, the maximum possible gray-level change also is finite, and the shortest distance over which that change can occur is between adjacent pixels. A first-order derivative (difference) is given as:

∂f/∂x = f(x + 1) - f(x)

• A second-order derivative is:

∂²f/∂x² = f(x + 1) + f(x - 1) - 2 f(x)

• We consider the image shown next and calculate the first- and second-order derivatives along a line that passes through an isolated point (see the sketch below).
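A minimal NumPy sketch of both differences along a 1-D scan line (the sample values, a ramp, an isolated point, and a step, are illustrative):

```python
import numpy as np

f = np.array([6, 6, 5, 4, 3, 2, 1, 1, 1, 6, 1, 1, 1, 7, 7], dtype=np.float64)

first  = f[1:] - f[:-1]                 # f(x+1) - f(x)
second = f[2:] + f[:-2] - 2 * f[1:-1]   # f(x+1) + f(x-1) - 2 f(x)
print(first)    # nonzero along the whole ramp and at the point/step
print(second)   # nonzero only at ramp onset/end; double response at steps
```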

First, we note that the first-order derivative is nonzero along the entire ramp, while the second-order derivative is nonzero only at the onset and end of the ramp. Because edges in an image resemble this type of transition, we conclude that first-order derivatives produce "thick" edges and second-order derivatives much finer ones. Next we encounter the isolated noise point. Here, the response at and around the point is much stronger for the second-order than for the first-order derivative.

Finally, in this case, the response of the two derivatives is the same at the gray-level step (in most cases, when the transition into a step is not from zero, the second derivative will be weaker).

(1) First-order derivatives generally produce thicker edges in an image.

(2) Second-order derivatives have a stronger response to fine detail, such as thin lines and isolated points.

(3) First-order derivatives generally have a stronger response to a gray-level step.

(4) Second-order derivatives produce a double response at step changes in gray level. We also note of second-order derivatives that, for similar changes in gray-level values in an image, their response is stronger to a line than to a step, and to a point than to a line.

Use of second derivatives for enhancement: the Laplacian

It can be shown (Rosenfeld and Kak [1982]) that the simplest isotropic derivative operator is the Laplacian, which, for a function (image) f(x, y) of two variables, is defined as:

∇²f = ∂²f/∂x² + ∂²f/∂y²

Since a derivative of any order is a linear operator, the Laplacian is also a linear operator.

The partial second-order derivative along the x axis is:

∂²f/∂x² = f(x + 1, y) + f(x - 1, y) - 2 f(x, y)

and the partial second-order derivative along the y axis is:

∂²f/∂y² = f(x, y + 1) + f(x, y - 1) - 2 f(x, y)

Hence the discrete formulation of our linear Laplacian operator is:

∇²f = f(x + 1, y) + f(x - 1, y) + f(x, y + 1) + f(x, y - 1) - 4 f(x, y)

• The mask that implements the above formulation is shown below.

(a) Filter mask used to implement the digital Laplacian, as defined above. (b) Mask used to implement an extension of this equation that includes the diagonal neighbors. (c) and (d) Two other implementations of the Laplacian.
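A minimal NumPy sketch of the discrete Laplacian with the 4-neighbor mask of (a) and the corresponding sharpening step (edge-replicated borders are an assumption):

```python
import numpy as np

def laplacian(img):
    f = img.astype(np.float64)
    p = np.pad(f, 1, mode="edge")
    # f(x+1,y) + f(x-1,y) + f(x,y+1) + f(x,y-1) - 4 f(x,y)
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4 * f)

img = np.random.randint(0, 256, (5, 5)).astype(np.float64)
# Negative center coefficient, so sharpening subtracts the Laplacian:
sharpened = np.clip(img - laplacian(img), 0, 255)
```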

Because the Laplacian is a derivative operator, its use highlights gray-level discontinuities in an image and deemphasizes regions with slowly varying gray levels. This will tend to produce images that have grayish edge lines and other discontinuities, all superimposed on a dark, featureless background. Background features can be "recovered" while still preserving the sharpening effect of the Laplacian operation simply by adding the original and Laplacian images. As noted in the previous paragraph, it is important to keep in mind which definition of the Laplacian is used. If the definition used has a negative center coefficient, then we subtract, rather than add, the Laplacian image to obtain a sharpened result. Thus, the basic way in which we use the Laplacian for image enhancement is:

g(x, y) = f(x, y) - ∇²f(x, y)   if the center coefficient of the Laplacian mask is negative
g(x, y) = f(x, y) + ∇²f(x, y)   if the center coefficient of the Laplacian mask is positive

(a) Image of the North Pole of the Moon. (b) Laplacian-filtered image. (c) Laplacian image scaled for display purposes. (d) Image enhanced using the equation stated earlier.

Unsharp Masking and High Boost Filtering

High-boost mask including the diagonal neighbors (center coefficient k+8):

-1   -1   -1
-1  k+8   -1
-1   -1   -1

High-boost mask without the diagonal neighbors (center coefficient k+4):

 0   -1    0
-1  k+4   -1
 0   -1    0

Equation:

f_hb(x, y) = k f(x, y) - ∇²f(x, y)   when the Laplacian mask has a negative center coefficient
f_hb(x, y) = k f(x, y) + ∇²f(x, y)   when the Laplacian mask has a positive center coefficient
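A minimal NumPy sketch of high-boost filtering via the negative-center form f_hb = k·f - ∇²f (k = 1.7 matches the example on the next slide; edge-replicated borders are an assumption):

```python
import numpy as np

def high_boost(img, k=1.7):
    f = img.astype(np.float64)
    p = np.pad(f, 1, mode="edge")
    # 4-neighbor Laplacian (negative center coefficient).
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
           - 4 * f)
    return np.clip(k * f - lap, 0, 255).astype(np.uint8)
```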

Result of HB Filtering

(b) Laplacian of (a), computed with the mask above using k = 0. (c) Laplacian-enhanced image using the same mask with k = 1. (d) Same as (c), but using k = 1.7.

References

• Digital Image Processing, 2nd ed., Rafael C. Gonzalez and Richard E. Woods, Prentice Hall India.

End of lecture 1

I look forward to receiving any constructive criticism or suggestions that will help me improve my future presentations.

SHADAB KHAN

shadab@arro.in

+91 99004 04678

Post your doubts via:

• E-mail.

• The Image Processing community at Orkut.
