Spring 2020: Venue: Haag 313, Time: M/W 4-5:15pm
ECE 5582 Computer Vision
Lec 02: Image Formation - Color Space
Zhu Li, Dept of CSEE, UMKC
Office: FH560E, Email: [email protected], Ph: x 2346. http://l.web.umkc.edu/lizhu
Z. Li, ECE/CS 5582 Computer Vision, 2020 p.1
slides created with WPS Office Linux and the EqualX LaTeX equation editor
Outline
Recap of Lec 01
Camera Model and Image Formation
Color Model
Summary
Prerequisite & Textbook
Prerequisite:
  For senior and graduate students in EE/CS
  Good Matlab/C programming skills; some Python is also desirable
  Taken Signal & System, or Digital Signal Processing, or consent of the instructor
  Will have different expectation/evaluation schemes for MS/PhD and undergrad students
Textbook:
  None required (saving $$); will distribute relevant chapters, papers, and notes
Key References:
  R. Szeliski, Computer Vision: Algorithms and Applications, Springer, 2014. URL: http://szeliski.org/Book/
  J. E. Solem, Programming Computer Vision with Python, O'Reilly, 2015. URL: http://programmingcomputervision.com/downloads/ProgrammingComputerVision_CCdraft.pdf
Tentative Lecture Plan
Image Processing Basics:
  Camera model and image formation
  Image filtering
Image Features for Retrieval:
  Color features
  Texture and shape features
  Basic image retrieval systems and metrics
Object Identification in Images:
  Key point detection
  Key point feature description
  Fisher Vector aggregation
  MPEG Mobile Visual Search technology and standard
Holistic Approaches in Image Understanding:
  Subspace methods for face recognition: Eigenface, Fisherface, Laplacianface
  Deep learning in image understanding
Homework 1: Image Filtering and Features
Homework 2: Image Retrieval System
Homework 3: SIFT/VGG Feature Aggregation
Homework 4: Subspace Methods in Face Recognition
Homework 5: Deep Learning in Classification/Identification
Potential Course/MS Thesis Projects
Resources from last year: https://sce.umkc.edu/faculty-sites/lizhu/teaching/2019.spring.vision/main-cv.html
Potential projects with 25% bonus points (can form a 2-person team):
  Google Landmark Grand Challenge - Identification/Recognition (CDVS baseline, U of Surrey)
  VisDrone - UAV vision and object recognition (Pengfei)
  FlatCam Face Verification Challenge (Salman)
  Real-world smartphone image super resolution (NTIRE2020)
  Fast face detection in compressed video (OpenCV)
Outline
Recap of Lec 01
Camera Model and Image Formation
Color Model
Summary
Anatomy of human eye
+*
Top-down view
The Human Eye
Optics:
  Lens: cornea and aqueous humour
  Lens control: a muscle group called the zonula changes the shape and position of the lens
  Aperture control: the iris is a muscle that changes the size of the pupil
Human eye sensors:
  Photon sensors: the back of the eye is called the retina; photo sensor cells concentrate around the fovea
  Blind spot: where the optic nerve terminates
Look at the * with the right eye, moving closer until the + disappears
The Human Vision System Pipeline
The signal path:
The Retina Circuits
Retina photon sensor cells:
  Approx. 120 million rods
  Approx. 6 million cones
  Fewer than 1 million optic nerve fibers (ganglion cells) connecting to the brain
Visual Functions at Retina
Vision function at the retina:
  Cones concentrate around the yellow spot, or macula, about 2.5-3mm in diameter
  The center of the macula, approx. 0.3mm in diameter, has no rods; it is called the fovea centralis, for high acuity vision
  Rods are distributed sparsely away from the fovea and are good for low light vision and motion detection (night vision)
  2nd blind spot: on the fovea
Rods for low light vision; cones for normal light, high resolution vision
Lateral Geniculate Nucleus
The retina does low level luminance processing via rods/cones
Approx. 1 million optic nerve fibers carry the signal to the LGN (Lateral Geniculate Nucleus): mid level vision
  The LGN has 6 layers
  More on contrasts and movements
  First stage of stereo vision processing
  Color vision: paired responses for red-green and blue-yellow signals
Primary and secondary visual cortex:
  Optical radiations connect to the primary visual cortex
  Primary is then connected to the secondary cortex
  Completes higher level vision tasks
Lateral Inhibition - Edge Perceiving
  Edge info is processed at the retina circuits
  More rods/cones than optic nerve fibers
  Not all photon reception is fed back to the brain; the ganglion cells have this lateral inhibition function to suppress the amount of information fired back to the visual cortex
No inhibition vs. inhibition: inhibition enhances edges
Mach Band: the edge perception effect produced by inhibition :)
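The lateral inhibition effect can be sketched numerically: each cell subtracts a fraction of its neighbors' input, which leaves flat regions roughly unchanged but produces an undershoot/overshoot at an edge - the Mach band. A minimal 1D illustration (not a retina model; the inhibition weight `k` is a made-up parameter):

```python
def lateral_inhibition(signal, k=0.5):
    """Each cell's response = its own input minus k times the
    average of its two neighbors (edge samples replicated)."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [padded[i + 1] - k * (padded[i] + padded[i + 2]) / 2.0
            for i in range(len(signal))]

# A step edge: the response dips just before the edge (dark band)
# and overshoots just after it (bright band), as in the Mach band illusion.
step = [0, 0, 0, 1, 1, 1]
response = lateral_inhibition(step)
```

On the step input above, the flat regions stay put while the sample just before the edge goes negative and the sample just after it exceeds the plateau - exactly the perceived dark/bright bands.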
Color perception: more sensitive around green bands
Human Color perception
Why red?
Early Imaging Devices
Dark Chamber
Lens Based Dark Chamber Camera, 1568
Srinivasa Narasimhan's slide
First Film
Imaging with Chemical methods
Still Life, Louis-Jacques-Mandé Daguerre, 1837
Daguerreotype Imaging
Chemical oxidizing of a silver plate
Civil War, circa 1862
Modern Digital Camera
•Modern Digital Camera Pipeline
–Digital part:
»Auto focus
»White balance
»De-blurring
»...
»Only 1 square mm for the ISP on a Samsung Exynos chip!
Outline
Recap of Lec 01
Camera Model and Image Formation
Color Model
Summary
Color Sensing
CCD vs CMOS Sensors
  Charge-Coupled Device (CCD): requires 3 chips, one each for R, G, B (CCD(R), CCD(G), CCD(B)), with precise alignment; more expensive than CMOS sensors
  CMOS (complementary metal-oxide-semiconductor) sensor: cheaper (but noisier), and can easily be integrated with digital logic circuits
Samsung ISOCELL
Color Filter
Color sensing (Source: Steve Seitz)
Estimate missing components from neighboring values (demosaicing)
Why more green?
Bayer grid
Human Luminance Sensitivity Function
Bayer's pattern: more green samples than blue and red, matching the peak of the human luminance sensitivity function
YungYu Chuang’s slide
Color Filter
Demosaicing filter: fill in the holes of missing R, G, B values
Demosaic Results
Demosaicing filter: utilizing cross-channel Laplacian filters, i.e., let red values influence green values
Results from reference [3]
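As a much simpler baseline than the cross-channel Laplacian filters referenced above, bilinear demosaicing fills each missing green value with the average of its green neighbors. A sketch for the green plane of an RGGB Bayer mosaic (the RGGB layout and function name are illustrative assumptions):

```python
def interp_green(mosaic):
    """Bilinear demosaicing of the green plane of an RGGB Bayer mosaic.
    In RGGB, green samples sit where (row + col) is odd; every R or B
    site has only green 4-neighbors, so we average those."""
    H, W = len(mosaic), len(mosaic[0])
    green = [[0.0] * W for _ in range(H)]
    for r in range(H):
        for c in range(W):
            if (r + c) % 2 == 1:          # a green sample: keep it as-is
                green[r][c] = float(mosaic[r][c])
            else:                          # R or B site: average green neighbors
                nbrs = [mosaic[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < H and 0 <= cc < W]
                green[r][c] = sum(nbrs) / len(nbrs)
    return green
```

The R and B planes are interpolated analogously; the cross-channel idea on the slide improves on this by letting the reconstructed channels correct each other.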
Dynamic Range
High Dynamic Range Imaging
•What is the range of light intensity that a camera can capture?
–Called the dynamic range
–Digital cameras have difficulty capturing both high intensities and low intensities in the same image
–MPEG is launching new HDR video compression work
High Dynamic Range Imaging
  Align multi-exposure images
  Tone mapping: map input measurements to the target display
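A common global tone-mapping operator is Reinhard's L/(1+L), shown here as an illustrative example (not necessarily what any particular camera pipeline uses); it compresses an unbounded HDR luminance range into [0, 1):

```python
def reinhard_tonemap(luminance):
    """Global Reinhard operator: L / (1 + L).
    Low luminances pass nearly unchanged; high luminances
    are compressed asymptotically toward 1."""
    return [l / (1.0 + l) for l in luminance]
```

For example, a luminance of 1 maps to 0.5, while a very bright measurement of 1000 maps just below 1, so both highlights and shadows fit the display range.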
RGB vs CMY
Tri-Chromatic Theory
CIE XYZ Model
CIE Color Gamut
Visible and Printable Gamut
Visible and Printable Color
Y’CBCR
Rec. 601 specifies a range of [16, 235] for Y’ and [16, 240] for CB and CR.
To obtain Y’CBCR from 8-bit R’G’B’ values (i.e., in the range [0, 255]), use the transformation:
\[
\begin{bmatrix} Y' \\ C_B \\ C_R \end{bmatrix}
=
\begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}
+
\frac{1}{256}
\begin{bmatrix}
65.738 & 129.057 & 25.064 \\
-37.945 & -74.494 & 112.439 \\
112.439 & -94.154 & -18.285
\end{bmatrix}
\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}
\]
YUV/YCbCr/YIQ Model
Rec. 601 for TV specifies a range of [16, 235] for Y' and [16, 240] for CB and CR. To obtain Y'CBCR from 8-bit R'G'B' values (i.e., in the range [0, 255]), use the transformation:
\[
\begin{bmatrix} Y' \\ C_B \\ C_R \end{bmatrix}
=
\begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}
+
\frac{1}{256}
\begin{bmatrix}
65.738 & 129.057 & 25.064 \\
-37.945 & -74.494 & 112.439 \\
112.439 & -94.154 & -18.285
\end{bmatrix}
\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}
\]
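The Rec. 601 transformation is straightforward to code; a minimal Python sketch (clamping of the outputs to the legal [16, 235]/[16, 240] ranges is omitted):

```python
def rgb_to_ycbcr_rec601(r, g, b):
    """Rec. 601: 8-bit R'G'B' (0..255) to Y'CbCr."""
    y  =  16.0 + ( 65.738 * r + 129.057 * g +  25.064 * b) / 256.0
    cb = 128.0 + (-37.945 * r -  74.494 * g + 112.439 * b) / 256.0
    cr = 128.0 + (112.439 * r -  94.154 * g -  18.285 * b) / 256.0
    return y, cb, cr
```

Black maps to (16, 128, 128) and white to approximately (235, 128, 128) - the limits of the Rec. 601 luma range, with neutral chroma.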
Color Histogram
The First (very coarse) Image Feature
–Color is an important cue for image perception
–Why not use the statistics of the color distribution in an image to represent the image?
Color Blobs, without spatial info
Computing the Histogram
•The Input
–Collection of pixels, {x_i}, for i = 1..(w x h), in R^3
–Pre-computed color bins, expressed as n centroids, {m_1, m_2, ..., m_n}
•The Output
–A normalized pixel count w.r.t. each color bin
–Binarization
–Distance metrics
–A Matlab example:
  im = imread('cameraman.tif');
  h = imhist(im);
This creates a 256-bin histogram for a single channel grayscale image; how about a color image?
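For a color image, the same idea works against the pre-computed color bins: assign each pixel to its nearest centroid, count, and normalize. A minimal brute-force sketch of the input/output described above (the function and variable names are made up for illustration):

```python
def color_histogram(pixels, centroids):
    """Normalized color histogram: each pixel x_i in R^3 votes for
    its nearest centroid m_j (by squared Euclidean distance)."""
    counts = [0] * len(centroids)
    for p in pixels:
        d2 = [sum((pc - mc) ** 2 for pc, mc in zip(p, m)) for m in centroids]
        counts[d2.index(min(d2))] += 1
    total = float(len(pixels))
    return [c / total for c in counts]
```

The centroids can come from a fixed color-space partition or from k-means on a training set; either way the output sums to 1, so histograms of different-sized images are directly comparable.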
A toy problem:
Color Histogram as a Feature for Image Retrieval
Query / Database
d(q, i) = d(Hq, Hi)
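One common choice for d(Hq, Hi) is the histogram intersection distance (one option among many; L1, L2, and chi-squared distances are also used): identical normalized histograms are at distance 0, fully disjoint ones at distance 1.

```python
def hist_intersection_distance(h_q, h_i):
    """1 minus the sum of bin-wise minima; assumes both
    histograms are normalized to sum to 1."""
    return 1.0 - sum(min(a, b) for a, b in zip(h_q, h_i))
```

Ranking the database by this distance to the query histogram gives the toy retrieval system sketched above.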
MPEG-7 Scalable Color Descriptor
•The Scalable Color Descriptor (SCD) is a color histogram in the HSV color space, encoded using a Haar transform. H is quantized to 16 bins and S and V are quantized to 4 bins each, for a total of 256 bins.
•The pixel count for each bin is quantized to 4 bits, so at most 256x4 = 1024 bits for the representation. The distance between two descriptors is then a Hamming distance; scalability is achieved through the Haar transform.
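The 16x4x4 quantization can be sketched as a flat bin index (an illustrative layout; the actual MPEG-7 bin ordering and quantizer boundaries may differ):

```python
def scd_bin_index(h, s, v):
    """Map an HSV pixel (h in [0, 360), s and v in [0, 1]) to one of
    256 bins: 16 hue bins x 4 saturation bins x 4 value bins."""
    hb = min(int(h / 360.0 * 16), 15)   # 16 uniform hue bins
    sb = min(int(s * 4), 3)             # 4 saturation bins
    vb = min(int(v * 4), 3)             # 4 value bins
    return (hb * 4 + sb) * 4 + vb       # flat index in 0..255
```

Counting pixels into these 256 bins produces the raw histogram that the SCD then quantizes and Haar-transforms.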
[HSV color space figure: Hue angles - Red (0°), Yellow (60°), Green (120°), Cyan (180°), Blue (240°), Magenta (300°); Saturation and Value axes run from Black to White]
MPEG 7 Scalable Color Descriptor
•Achieving scalability by two-stage quantization and a Haar transform
•1st stage: a non-linear quantization from 11 bits to 4 bits, giving more resolution to lower pixel counts
•Haar transform: decorrelates the bins
•Linear quantization: generates the bit stream
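One level of the Haar transform over the histogram is just pairwise sums (low-pass) and differences (high-pass); transmitting only the low-pass half yields a coarser but still usable descriptor, which is where the scalability comes from. A minimal sketch (averaging convention chosen for readability, not the exact MPEG-7 normalization):

```python
def haar_1d(x):
    """One level of a Haar transform on an even-length sequence:
    first half of the output is pairwise averages (low-pass),
    second half is pairwise half-differences (high-pass)."""
    lo = [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    hi = [(x[i] - x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    return lo + hi
```

Applying it recursively to the low-pass half produces the multi-resolution coefficient pyramid that the SCD bit stream encodes.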
Scalable Color Descriptor Retrieval Results
•Not bad for color-dominated images...
Dark Image Enhancement from Sensor Field
Low light photography
Low light vision tasks:
  Object detection and recognition under low light conditions
  Surveillance
Figure 1. Low light photography for mobile devices
Figure 2. Low light pedestrian detection (Ref: Multispectral Deep Neural Networks for Pedestrian Detection)
Objectives
  To design an end-to-end network that performs image denoising and enhancement
  To design a network with less computational complexity for fast low light photography and video applications
  Input: RGBG Bayer pattern sensor data; output: RGB images
Figure 3. [a] Extreme low-light image from a Sony a7S II exposed for 1/10 second. [b] 100x intensity scaling of the image in [a]. [c] Ground truth image captured with a 10 second exposure time. [d] Output from [1]. [e] Output from our method.
Dark Image Enhancement from Sensor Field
•Residual based learning - learns the noise instead of an image prior
•Residual blocks with a 2x scaling layer at the end of the network
•LeakyReLU as the activation function for the residual blocks
•Each residual block is followed by a Squeeze-and-Excitation block - converges the network faster and increases performance
Figure 5. [a] Proposed network [b] Residual block with Squeeze-and-Excitation network
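The Squeeze-and-Excitation idea is channel re-weighting: global-average-pool each channel ("squeeze"), pass the result through a small two-layer bottleneck ending in a sigmoid gate ("excitation"), and rescale the channels. A framework-free numerical sketch (the weight shapes and values are made up; a real implementation, including the network above, would use a deep learning framework):

```python
import math

def squeeze_excite(feat, w1, w2):
    """feat: C x H x W nested lists; w1: (C/r, C) and w2: (C, C/r)
    bottleneck weights. Returns the channel-rescaled feature map."""
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feat]                                    # squeeze: per-channel mean
    s = [max(0.0, sum(w * zi for w, zi in zip(wr, z)))
         for wr in w1]                                      # FC 1 + ReLU
    gate = [1.0 / (1.0 + math.exp(-sum(w * si for w, si in zip(wr, s))))
            for wr in w2]                                   # FC 2 + sigmoid, in (0, 1)
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feat, gate)]                   # excite: rescale channels
```

Because the gate lies in (0, 1), the block can only attenuate channels, letting the network emphasize informative channels relative to noisy ones.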
Denoising and Color Accuracy
Denoising
Color accuracy
Figure 6: [a] Ground truth image [b] Output from SID; noise is still present in a few parts of the image [c] Output from BM3D; the denoised image is darker than the ground truth [d] Denoised output from our network
Figure 7: [a] Input dark image [b] 100x scaled version of the dark image [c] Ground truth with an exposure time of 10 seconds [d] SID output with missing color information, PSNR: 20.48dB [e] Output from our network, a close approximation to the ground truth image, PSNR: 27.17dB
Less Color Spread and Image Details
Less Color Spread
Image Details
Figure 9: [a] Ground truth [b] 300x amplified dark image [c] U-Net output; the image is not clear due to a pixelated effect [d] Output from our network with higher image quality
Figure 8: [a] Input dark image from the Sony 300x subset [b] 300x amplification of the dark image [c] Ground truth image with an exposure time of 10 seconds [d] U-Net output with unnecessary color spread at the ground [e] Output from our network, a close approximation to the ground truth image
Summary
In this lecture, we covered:
  Camera model
  Color space
  Color model manipulation and transforms
  Demosaicing - from the sensor field to image pixels
  Color based image features
  A taste of deep learning based dark image enhancement
To do:
  Install VLFeat v0.9.20 (v0.9.21 has some issues)
  Go through the ETHZ Matlab image processing tutorial - a very good one; will arrange a lab session on this
Next Lecture: Image Formation - Geometry: how pixels are related to 3D world points, and how pixels from images of the same scene are related
Q&A