ME5286 – Lecture 3 (Theory)
Lecture 3: Digital Image Representation
and Color Fundamentals
Saad J [email protected]
Last Lecture
• Image Formation
– Pinhole Camera
– Lenses
• Human Visual System
• Digital Cameras Capture Components
Outline for this Lecture
• Digital Image Representation
– Sampling, Quantization
• Color Fundamentals
– Color Transformation
Image Sensing and Acquisition and Digital Image Representation: Sampling and Quantization
Digital Image Quality factors
Image digitization
• Sampling: measure the value of an image at a finite number of points.
• Quantization: represent the measured value (e.g., a voltage) at each sampled point by an integer.
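The two steps can be sketched in a few lines of Python. This is only an illustration: the function name `sample_and_quantize`, the unit interval, and the stand-in signal `ramp` are assumptions, not part of the lecture.

```python
def sample_and_quantize(f, num_samples, levels=256):
    """Sample f on [0, 1) at evenly spaced points, then quantize to integers.

    Assumption: f returns values in [0.0, 1.0], standing in for a sensor voltage.
    """
    # Sampling: keep the signal only at a finite number of points.
    samples = [f(k / num_samples) for k in range(num_samples)]
    # Quantization: map each measured value to one of `levels` integer codes.
    return [min(levels - 1, int(v * levels)) for v in samples]


def ramp(t):
    """Hypothetical continuous signal: brightness grows linearly with position."""
    return t


digital = sample_and_quantize(ramp, 4)
print(digital)
```

With four samples and 256 levels, the continuous ramp becomes four 8-bit integers.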
World → Camera → Digitizer → Digital Image
0 10 10 15 50 70 80
0 0 100 120 125 130 130
0 35 100 150 150 80 50
0 15 70 100 10 20 20
0 15 70 0 0 0 15
5 15 50 120 110 130 110
5 10 20 50 50 20 250
PIXEL (picture element)
Typically: 0 = black, 255 = white
Digital Images
Image Digitization
Digital Image
Grayscale image:
– 2D matrix
– 8 bits/pixel (values from 0 to 255)
Digital Image
An image is a 2D rectilinear array of pixels (picture elements) with a FIXED number of samples: N x M
[Figure: the same image sampled with N=M=30 and with N=M=256]
Digital Image
No continuous values: quantization is represented by the number of bits per pixel
[Figure: the same image quantized with L=255 (8 bits), L=15 (4 bits), L=3 (2 bits), and L=1 (1 bit)]
Sampling and Quantization
Uniform sampling
• Digitized in the spatial domain (image I of size M x N)
• M and N are usually integer powers of two
• Nyquist theorem and aliasing
• Non-uniform sampling
– communication
Original grid:
(0,0) (0,1) (0,2) (0,3)
(1,0) (1,1) (1,2) (1,3)
(2,0) (2,1) (2,2) (2,3)
(3,0) (3,1) (3,2) (3,3)
Sampled by 2:
(0,0) (0,0) (0,2) (0,2)
(0,0) (0,0) (0,2) (0,2)
(2,0) (2,0) (2,2) (2,2)
(2,0) (2,0) (2,2) (2,2)
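The "sampled by 2" grid follows a simple rule: each pixel takes the coordinates of the top-left pixel of its 2 x 2 block. A minimal Python sketch of that rule (the helper name `sampled_by` is hypothetical):

```python
def sampled_by(factor, rows, cols):
    """Return, for each grid position, the coordinates of the pixel it shows
    after sampling by `factor` and replicating the kept pixels."""
    return [[(i - i % factor, j - j % factor) for j in range(cols)]
            for i in range(rows)]


grid = sampled_by(2, 4, 4)
for row in grid:
    print(row)
```

Only the pixels at even row and column indices survive; every other position shows a copy of its nearest surviving neighbor above and to the left.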
Image Sampling
[Figure: original image; sampled by a factor of 2; sampled by a factor of 4; sampled by a factor of 8]
Image Decimation and Interpolation
• Decimation is the reduction in dimension or resolution of the image (subsampling)
– Decimation by 2 results in an image half the size
– Simplest method is the skipping of pixels
• Interpolation is the increase in dimension or resolution of the image
– Interpolation by 2 results in an image double the size
– Simplest method is duplication of pixels
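The two "simplest methods" can be sketched with list slicing in Python. The function names are illustrative, and the 4 x 4 test image reuses the top-left corner of the pixel matrix shown earlier in the lecture.

```python
def decimate(image, factor=2):
    """Keep every `factor`-th pixel in each direction (pixel skipping)."""
    return [row[::factor] for row in image[::factor]]


def interpolate(image, factor=2):
    """Repeat every pixel `factor` times in each direction (pixel duplication)."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


image = [
    [0, 10, 10, 15],
    [0, 0, 100, 120],
    [0, 35, 100, 150],
    [0, 15, 70, 100],
]
small = decimate(image)    # 2 x 2
big = interpolate(small)   # back to 4 x 4, but blocky
```

Interpolating after decimating restores the size, not the discarded pixel values.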
Image Pyramids
• Known as a Gaussian pyramid [Burt and Adelson, 1983]
• In computer graphics, a mip map [Williams, 1983]
Slide by Steve Seitz
Effect of Sampling
• Simple example: a sine wave
Undersampling
• What if we “missed” things between the samples?
• Simple example: undersampling a sine wave
– unsurprising result: information is lost
– surprising result: indistinguishable from lower frequency
– also was always indistinguishable from higher frequencies
– aliasing: signals “traveling in disguise” as other frequencies
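The "traveling in disguise" effect can be checked numerically. In the Python sketch below, the frequencies (9 Hz and 1 Hz) and the 10 Hz sampling rate are arbitrary choices made for illustration. Since 10 Hz is below the 18 Hz that a 9 Hz sine would need, its samples coincide with those of a sign-flipped 1 Hz sine.

```python
import math


def sample_sine(freq_hz, rate_hz, count):
    """Return `count` samples of sin(2*pi*freq*t) taken at `rate_hz` samples per second."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


rate = 10.0
high = sample_sine(9.0, rate, 20)   # undersampled 9 Hz sine
low = sample_sine(1.0, rate, 20)    # properly sampled 1 Hz sine

# The two sample sequences differ only in sign, so high + low is ~0 everywhere.
max_diff = max(abs(h + l) for h, l in zip(high, low))
print(max_diff)
```

From the samples alone there is no way to tell which of the two sines was measured.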
What’s happening?
Input signal:
x = 0:.05:5; imagesc(sin((2.^x).*x))
Plot as image:
Alias! Not enough samples
Antialiasing
• What can we do about aliasing?
• Sample more often
– Join the Mega-Pixel enhancement of the photo industry
– But this can’t go on forever
• Make the signal less “wiggly”
– Get rid of some high frequencies
– Will lose information
– But it’s better than aliasing
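A small Python sketch of the second option. The box (averaging) filter is an assumption, chosen only because it is the simplest low-pass filter. The input is a one-pixel stripe pattern, the most "wiggly" signal a row of pixels can hold.

```python
def decimate_naive(signal, factor=2):
    """Subsample by skipping samples, with no smoothing first."""
    return signal[::factor]


def decimate_smoothed(signal, factor=2):
    """Average each group of `factor` samples (a box low-pass filter), then keep one value per group."""
    return [sum(signal[i:i + factor]) / factor
            for i in range(0, len(signal) - factor + 1, factor)]


stripes = [0, 255] * 4                # alternating black/white pixels
naive = decimate_naive(stripes)       # aliases to solid black
smooth = decimate_smoothed(stripes)   # mid-gray: detail lost, brightness kept
print(naive, smooth)
```

Both results lose the stripes, but the smoothed one keeps the correct average brightness, while the naive one reports a solid black row that was never in the scene.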
Aliasing effect
Aliasing artifacts (the Moiré effect)
http://www.wfu.edu/~matthews/misc/DigPhotog/alias/
Uniform quantization
• Digitized in amplitude (or pixel value)
• PGM: 256 levels → 4 levels
• Compute the uniform step that represents 1 level: step = 64 in this case
[Figure: input values 0, 64, 128, 192, 255 mapped to output levels 0, 1, 2, 3]
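The step computation and the mapping from 256 levels to 4 can be written as integer division. This is a minimal Python sketch, and the function name is illustrative.

```python
def quantize_uniform(value, in_levels=256, out_levels=4):
    """Map a value in [0, in_levels) to one of `out_levels` uniform bins."""
    step = in_levels // out_levels   # 256 // 4 = 64
    return value // step


step = 256 // 4
codes = [quantize_uniform(v) for v in (0, 63, 64, 128, 192, 255)]
print(step, codes)
```

Values 0 to 63 map to level 0, 64 to 127 to level 1, 128 to 191 to level 2, and 192 to 255 to level 3.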
Image Quantization
[Figure: the same image at 256 gray levels (8 bits/pixel), 32 gray levels (5 bits/pixel), 16 gray levels (4 bits/pixel), 8 gray levels (3 bits/pixel), 4 gray levels (2 bits/pixel), and 2 gray levels (1 bit/pixel)]
Issues with Dynamic Range
[Figure: example scenes labeled 1; 1,500; 25,000; 400,000; 2,000,000,000]
- The real world has high dynamic range
- Uniform sampling is not optimal
- Wide Dynamic Range combines multiple captures
Color Image Processing
• Color
– simplifies object extraction and identification
– human vision: thousands of colors vs. at most 24 gray levels
• Color Spectrum
– white light with a prism (1666, Newton)
Grayscale Image
Color Image
What Is Light?
• The visible portion of the electromagnetic (EM) spectrum.
• It occurs between wavelengths of approximately 400 and 700 nanometers.
H.R. Pourreza
Color Spectrum
The experiment of Sir Isaac Newton, in 1666.
Color spaces
• How can we represent color?
http://en.wikipedia.org/wiki/File:RGB_illumination.jpg
Human Eye
• Three different types of cones; each type has a special pigment that is sensitive to wavelengths of light in a certain range:
– Short (S) corresponds to blue
– Medium (M) corresponds to green
– Long (L) corresponds to red
• Ratio of L to M to S cones:
– approx. 10:5:1
• Almost no S cones in the center of the fovea
[Figure: relative absorbance (%) vs. wavelength (nm) for the S, M, and L cones, with peaks at 440, 530, and 560 nm]
Color images
• Color representation is based on the theory of T. Young (1802), which states that any color can be produced by mixing three primary colors C1, C2, C3:
C = aC1 + bC2 + cC3
• It is therefore possible to characterise a psycho-visual colour by specifying the amounts of three primary colours: red, green and blue, mixed together.
• This led to the standard RGB space used in television, computer monitors, smart phones, etc.
Color Fundamentals
Standard wavelength values for the primary colors
Color Fundamentals
Tri-stimulus values: the amounts of red, green, and blue needed to form any particular color
Denoted by: X, Y and Z
Tri-chromatic coefficients:
$$x = \frac{X}{X+Y+Z}, \qquad y = \frac{Y}{X+Y+Z}, \qquad z = \frac{Z}{X+Y+Z}$$
$$x + y + z = 1$$
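As a quick numerical check of the tri-chromatic coefficients, here is a Python sketch. The tristimulus values (2, 5, 3) are made up for illustration and do not correspond to any particular color.

```python
def trichromatic_coefficients(X, Y, Z):
    """Normalize tristimulus values so that the coefficients sum to 1."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("X + Y + Z must be non-zero")
    return X / total, Y / total, Z / total


x, y, z = trichromatic_coefficients(2.0, 5.0, 3.0)
print(x, y, z, x + y + z)
```

Because x + y + z = 1, any two coefficients determine the third.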
Color Fundamentals
Any patch of light can be completely described physically by its spectrum: the number of photons (per time unit) at each wavelength from 400 to 700 nm.
[Figure: number of photons (per ms) vs. wavelength (nm), 400 to 700 nm]
© Stephen E. Palmer, 2002
Color Examples
Some examples of the reflectance spectra of surfaces
[Figure: % photons reflected vs. wavelength (nm), 400 to 700 nm, for red, yellow, blue, and purple surfaces]
© Stephen E. Palmer, 2002
Tetrachromatism
• Most birds, and many other animals, have cones for ultraviolet light.
• Some humans, mostly female, seem to have slight tetrachromatism.
Bird cone responses
More Spectra
Color Image Representation
• RGB Model
Color Image Representation
• Usually, we specify the levels of R, G and B in the range [0, 255] (8-bit integers).
• The RGB cube runs from (0,0,0) to (255,255,255).
• Number of colors: 2^24 = 16,777,216
RGB Color Representation
Default color space
[Figure: RGB color cube with corners 1,0,0; 0,1,0; 0,0,1]
Image from: http://en.wikipedia.org/wiki/File:RGB_color_solid_cube.png
Some drawbacks
• Strongly correlated channels
• Non-perceptual
[Figure: channel images R (G=0, B=0), G (R=0, B=0), B (R=0, G=0)]
Color Image
[Figure: a color image and its R, G, and B channels]
Alternate Color Spaces
• Various other color representations can be computed from RGB.
• This can be done for:
– Decorrelating the color channels:
  • principal components.
– Bringing color information to the fore:
  • Hue, saturation and brightness.
Most Common Color Spaces
The purpose of a color model (also called a color space) is to facilitate the specification of colors in some standard, generally accepted way.
• RGB (red, green, blue): monitors, video cameras.
• HSI (HSL, HSV, YUV) model, which corresponds closely with the way humans describe and interpret color.
• CMY (cyan, magenta, yellow) and CMYK (CMY, black) models for color printing.
Black (K) = minimum of C, M, Y
Cyan_CMYK = (C - K)/(1 - K)
Magenta_CMYK = (M - K)/(1 - K)
Yellow_CMYK = (Y - K)/(1 - K)
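The CMYK formulas can be sketched in Python as follows. Two things here are assumptions not stated on the slide: the usual complement relation C = 1 - R, M = 1 - G, Y = 1 - B for values in [0, 1], and the special case for pure black (K = 1), where the slide's formulas would divide by zero.

```python
def rgb_to_cmyk(r, g, b):
    """Convert RGB in [0, 1] to CMYK using K = min(C, M, Y)."""
    # Assumption: CMY is the complement of RGB.
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b
    k = min(c, m, y)
    if k == 1.0:
        # Pure black: the ratios below would divide by zero, so use ink K only.
        return 0.0, 0.0, 0.0, 1.0
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k


red = rgb_to_cmyk(1.0, 0.0, 0.0)
black = rgb_to_cmyk(0.0, 0.0, 0.0)
print(red, black)
```

Pulling the common part of C, M, and Y into K lets a printer use black ink instead of mixing all three colored inks.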
Additive vs Subtractive Color Models
Alternate Color Space
The characteristics generally used to distinguish one color from another are Brightness, Hue, and Saturation.
• Hue: represents the dominant color as perceived by an observer.
• Saturation: relative purity, or the amount of white light mixed with a hue.
Hue and saturation taken together are called Chromaticity; therefore, a color may be characterized by its Brightness and Chromaticity.
HSI model: hue and saturation
HSI Color Space
• Hue corresponds to color, saturation corresponds to the amount of white in the color, and intensity is related to brightness
• For example: a deep, bright orange color would have a large intensity (bright), a hue of “orange”, and a high value of saturation (“deep”)
• But in terms of RGB components, this color would have the values R = 245, G = 110, and B = 20
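To connect the two descriptions of the orange example, the Python sketch below converts R = 245, G = 110, B = 20 to HSI. The slides do not give a conversion formula; the geometric formulation used here (intensity as the channel mean, saturation from the minimum channel, hue from an arccosine) is one commonly used definition and should be read as an assumption.

```python
import math


def rgb_to_hsi(r, g, b):
    """Return (hue in degrees, saturation in [0, 1], intensity) for an RGB triple."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0            # black: hue and saturation set to 0 by convention
    saturation = 1.0 - min(r, g, b) / intensity
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if den == 0:
        hue = 0.0                       # gray: hue is undefined, 0 used as a placeholder
    else:
        theta = math.degrees(math.acos(max(-1.0, min(1.0, num / den))))
        hue = theta if b <= g else 360.0 - theta
    return hue, saturation, intensity


h, s, i = rgb_to_hsi(245, 110, 20)
print(h, s, i)
```

Under this definition the orange has a hue of roughly 23 degrees (between red at 0 and yellow at 60), a saturation of 0.84 ("deep"), and an intensity of 125 on the 0 to 255 scale.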
Color spaces: HSV
Intuitive color space
[Figure: channel images H (S=1, V=1), S (H=1, V=1), V (H=1, S=0)]
rg Chromaticity Coordinates
• A two-dimensional color space in which there is no intensity information
• Normalizes RGB values to the sum of all three
• Chromaticity coordinates are: r = R/(R+G+B), g = G/(R+G+B)
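A short Python sketch of the normalization, r = R/(R+G+B) and g = G/(R+G+B). The handling of a black pixel, where the sum is zero, is an arbitrary convention chosen here, and the two test colors are made up.

```python
def rg_chromaticity(R, G, B):
    """Return (r, g): RGB normalized by the channel sum, dropping intensity."""
    total = R + G + B
    if total == 0:
        # Black pixel: chromaticity is undefined; return neutral by convention (assumption).
        return 1 / 3, 1 / 3
    return R / total, G / total


bright = rg_chromaticity(200, 100, 100)
dim = rg_chromaticity(100, 50, 50)      # same color at half the intensity
print(bright, dim)
```

The bright and dim versions of the color land on the same (r, g) point, which is why the space carries no intensity information.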
Color Transformation - Examples
Color spaces: YCbCr
[Figure: channel images Y (Cb=0.5, Cr=0.5), Cb (Y=0.5, Cr=0.5), Cr (Y=0.5, Cb=0.5); Cb-Cr planes at Y=0, Y=0.5, and Y=1]
Fast to compute, good for compression, used by TV
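"Fast to compute" is visible in code: the conversion is a handful of multiply-adds per pixel. The slide gives no coefficients, so the Python sketch below assumes the ITU-R BT.601 luma weights and a full-range [0, 1] scaling, with Cb and Cr centered at 0.5 as in the slide's figure labels.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert RGB in [0, 1] to (Y, Cb, Cr) in [0, 1], assuming BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted sum of R, G, B
    cb = 0.5 + (b - y) / 1.772              # blue-difference chroma
    cr = 0.5 + (r - y) / 1.402              # red-difference chroma
    return y, cb, cr


gray = rgb_to_ycbcr(0.5, 0.5, 0.5)
red = rgb_to_ycbcr(1.0, 0.0, 0.0)
print(gray, red)
```

A gray pixel maps to Cb = Cr = 0.5, so all of its information sits in Y; this separation of brightness from color is what makes the space convenient for compression.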
Other Color spaces
• RGB (CIE), RnGnBn (TV - National Television System Committee)
• XYZ (CIE)
• UVW (UCS of the CIE), U*V*W* (UCS modified by the CIE)
• YUV, YIQ, YCbCr
• YDbDr
• DSH, HSV, HLS, IHS
• Munsell color space (cylindrical representation)
• CIELuv
• CIELab
• SMPTE-C RGB
• YES (Xerox)
• Kodak Photo CD, YCC, YPbPr, ...
Color Image: Full Description
Original image
Intensity Image: Most information
Only intensity shown – constant color
Most information in intensity
Only color shown – constant intensity
Skin color
[Figure: skin-color pixels shown in RGB space and in rg chromaticity space (axes r and g)]
Skin detection
M. Jones and J. Rehg, Statistical Color Models with Application to Skin Detection, International Journal of Computer Vision, 2002.
Common image file formats
• GIF (Graphics Interchange Format)
• PNG (Portable Network Graphics)
• JPEG (Joint Photographic Experts Group)
• TIFF (Tagged Image File Format)
• PGM (Portable Gray Map)
• FITS (Flexible Image Transport System)
PBM/PGM/PPM format
• PGM: a popular format for grayscale images (8 bits/pixel)
• Closely related formats are:
– PBM (Portable Bitmap), for binary images (1 bit/pixel)
– PPM (Portable Pixmap), for color images (24 bits/pixel)
» ASCII or binary (raw) storage
[Figure: the same image stored in ASCII form and in binary form]
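The ASCII variant of PGM is simple enough to write by hand: the magic number `P2`, the width and height, the maximum gray value, then the pixel values as text. A minimal Python sketch, where the function name and the tiny 3 x 2 image are illustrative:

```python
def to_ascii_pgm(image, max_value=255):
    """Serialize a 2D list of gray values as an ASCII (P2) PGM string."""
    height, width = len(image), len(image[0])
    lines = ["P2", f"{width} {height}", str(max_value)]
    lines += [" ".join(str(p) for p in row) for row in image]
    return "\n".join(lines) + "\n"


pgm_text = to_ascii_pgm([[0, 10, 10], [0, 0, 100]])
print(pgm_text)
```

The binary (raw) variant uses the magic number `P5` and stores the same pixel values as bytes after the header, which is more compact but not human-readable.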
Images in Matlab
• Images represented as a matrix
• Suppose we have an NxM RGB image called “im”
– im(1,1,1) = top-left pixel value in R-channel
– im(y, x, b) = y pixels down, x pixels to right in the bth channel
– im(N, M, 3) = bottom-right pixel in B-channel
• imread(filename) returns a uint8 image (values 0 to 255)
– Convert to double format (values 0 to 1) if you need to scale
[Figure: an image shown as three stacked matrices (R, G, B) of values between 0 and 1, indexed by row and column]
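For readers working outside Matlab, the same row, column, channel layout can be mimicked with nested lists in Python. Note that Python indexing is 0-based, whereas Matlab's `im(1,1,1)` is 1-based. The tiny 2 x 3 image and the helper `to_unit_range` are made up for illustration.

```python
def to_unit_range(image_u8):
    """Scale 8-bit values (0 to 255) to floats in [0, 1]."""
    return [[[v / 255.0 for v in pixel] for pixel in row] for row in image_u8]


# Hypothetical 2 x 3 RGB image, indexed as im[row][column][channel].
im = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[0, 0, 0], [128, 128, 128], [255, 255, 51]],
]
n_rows, n_cols = len(im), len(im[0])

top_left_red = im[0][0][0]                          # Matlab: im(1,1,1)
bottom_right_blue = im[n_rows - 1][n_cols - 1][2]   # Matlab: im(N,M,3)
scaled = to_unit_range(im)
print(top_left_red, bottom_right_blue, scaled[1][2])
```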
Image Representation
• Mathematically, an image can be represented by a 2-D matrix
– Each entry (i,j) represents the value at the corresponding location, which is called a pixel
– The value of a pixel can have different types, depending on the image type
  • Unsigned char (8 bits per pixel, or 256 levels)
  • Int
  • Float
  • A vector (color image, for example)
Image File Formats
• Image file header: a set of parameters found at the start of the image file, containing information regarding:
– Number of rows (height)
– Number of columns (width)
– Number of bands
– Number of bits per pixel (bpp)
– File type
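As a concrete case, the header of an ASCII PGM file carries most of these parameters directly. The Python sketch below reads them from a string; the function name and the returned dictionary keys are illustrative, and only the `P2` (ASCII grayscale) variant is handled.

```python
def parse_pgm_header(text):
    """Extract header fields from the text of an ASCII (P2) PGM file."""
    tokens = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]   # drop comments
        tokens.extend(line.split())
        if len(tokens) >= 4:           # magic, width, height, max value
            break
    magic, width, height, max_value = tokens[0], int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != "P2":
        raise ValueError("only ASCII PGM (P2) is handled in this sketch")
    return {
        "file_type": magic,
        "rows": height,
        "columns": width,
        "bands": 1,                                   # grayscale: a single band
        "bits_per_pixel": max_value.bit_length(),     # 255 -> 8 bits
    }


header = parse_pgm_header("P2\n# example\n3 2\n255\n0 10 10\n0 0 100\n")
print(header)
```

PGM does not store the number of bands or bits per pixel explicitly; here they are derived from the file type and the maximum gray value.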