Spring 2020: Venue: Haag 313, Time: M/W 4-5:15pm
ECE 5582 Computer Vision
Lec 02: Image Formation - Color Space
Zhu Li, Dept of CSEE, UMKC
Office: FH560E, Email: [email protected], Ph: x 2346. http://l.web.umkc.edu/lizhu
Z. Li, ECE/CS 5582 Computer Vision, 2020 p.1
slides created with WPS Office Linux and the EqualX LaTeX equation editor
Outline
Recap of Lec 01
Camera Model and Image Formation
Color Model
Summary
Prerequisite & Textbook
Prerequisite:
  For senior and graduate students in EE/CS
  Good Matlab/C programming skills; some Python is also desirable
  Taken Signal & System, or Digital Signal Processing, or consent of the instructor
  Will have different expectation/evaluation schemes for MS/PhD and undergrad students
Textbook:
  None required (saving $$); will distribute relevant chapters, papers, and notes
Key References:
  R. Szeliski, Computer Vision: Algorithms and Applications, Springer, 2014. URL: http://szeliski.org/Book/
  J. E. Solem, Programming Computer Vision with Python, O'Reilly, 2015. URL: http://programmingcomputervision.com/downloads/ProgrammingComputerVision_CCdraft.pdf
Tentative Lecture Plan
Image Processing Basics:
  Camera model and image formation
  Image filtering
Image Features for Retrieval:
  Color features
  Texture and shape features
  Basic image retrieval systems and metrics
Object Identification in Images:
  Key point detection
  Key point feature description
  Fisher Vector aggregation
  MPEG Mobile Visual Search technology and standard
Holistic Approaches in Image Understanding:
  Subspace methods for face recognition: Eigenface, Fisherface, Laplacianface
  Deep learning in image understanding
Homework 1: Image Filtering and Features
Homework 2: Image Retrieval System
Homework 3: SIFT/VGG Feature Aggregation
Homework 4: Subspace Methods in Face Recognition
Homework 5: Deep Learning in Classification/Identification
Potential Course/MS Thesis Projects
Resources from last year: https://sce.umkc.edu/faculty-sites/lizhu/teaching/2019.spring.vision/main-cv.html
Potential projects with 25% bonus points (can form a 2-person team):
  Google Landmark Grand Challenge - Identification/Recognition (CDVS baseline, U of Surrey)
  VisDrone - UAV vision and object recognition (Pengfei)
  FlatCam Face Verification Challenge (Salman)
  Real-world smartphone image super resolution (NTIRE2020)
  Fast face detection in compressed video (OpenCV)
Outline
Recap of Lec 01
Camera Model and Image Formation
Color Model
Summary
Anatomy of human eye
+*
Top-down view
The Human Eye
Optics:
  Lens: cornea and aqueous humour
  Lens control: a muscle group called the zonula changes the shape and position of the lens
  Aperture control: the iris is a muscle that changes the size of the pupil
Human eye sensors:
  Photon sensors: the back of the eye is called the retina; photo sensor cells concentrate around the fovea
  Blind spot: where the optic nerve terminates
Look at the * with the right eye, moving closer until the + disappears
The Human Vision System Pipeline
The signal path:
The Retina Circuits
Retina photon sensor cells:
  Approx. 120 million rods
  Approx. 6 million cones
  Fewer than 1 million optic nerve fibers (ganglion cells) connecting to the brain
Visual Functions at Retina
Vision function at the retina:
  Cones concentrate around the yellow spot, or macula, about 2.5-3mm in diameter
  The center of the macula, approx. 0.3mm in diameter, has no rods; it is called the fovea centralis, for high acuity vision
  Rods are distributed sparsely away from the fovea and are good for low light vision and motion detection (night vision)
  2nd blind spot: on the fovea
Rods for low light vision; cones for normal light, high resolution vision
Lateral Geniculate Nucleus
The retina does low level luminance processing via rods/cones
Approx. 1 million optic nerve fibers carry the signal to the LGN (Lateral Geniculate Nucleus): mid level vision
  The LGN has 6 layers
  More on contrasts and movements
  First stage of stereo vision processing
  Color vision: paired responses for red-green and blue-yellow signals
Primary and secondary visual cortex:
  Optical radiations connect to the primary visual cortex
  Primary is then connected to the secondary cortex
  Completes higher level vision tasks
Lateral Inhibition - Edge Perceiving
  Edge info is processed at the retina circuits
  More rods/cones than optic nerve fibers
  Not all photon reception is fed back to the brain; the ganglion cells have this lateral inhibition function to suppress the amount of information fired back to the visual cortex
No inhibition vs. inhibition: inhibition enhances edges
Mach Band: the edge perception effect produced by inhibition :)
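The lateral inhibition effect can be sketched numerically: each cell subtracts a fraction of its neighbors' input, which leaves flat regions roughly unchanged but produces an undershoot/overshoot at an edge - the Mach band. A minimal 1D illustration (not a retina model; the inhibition weight `k` is a made-up parameter):

```python
def lateral_inhibition(signal, k=0.5):
    """Each cell's response = its own input minus k times the
    average of its two neighbors (edge samples replicated)."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [padded[i + 1] - k * (padded[i] + padded[i + 2]) / 2.0
            for i in range(len(signal))]

# A step edge: the response dips just before the edge (dark band)
# and overshoots just after it (bright band), as in the Mach band illusion.
step = [0, 0, 0, 1, 1, 1]
response = lateral_inhibition(step)
```

On the step input above, the flat regions stay put while the sample just before the edge goes negative and the sample just after it exceeds the plateau - exactly the perceived dark/bright bands.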
Color perception: more sensitive around green bands
Human Color perception
Why red?
Early Imaging Devices
Dark Chamber
Lens Based Dark Chamber Camera, 1568
Srinivasa Narasimhan's slide
First Film
Imaging with Chemical methods
Still Life, Louis-Jacques-Mandé Daguerre, 1837
Daguerreotype Imaging
Chemical oxidizing of a silver plate
Civil War, circa 1862
Modern Digital Camera
•Modern Digital Camera Pipeline
–Digital part:
»Auto focus
»White balance
»De-blurring
»...
»Only 1 square mm for the ISP on a Samsung Exynos chip!
Outline
Recap of Lec 01
Camera Model and Image Formation
Color Model
Summary
Color Sensing
CCD vs CMOS Sensors
  Charge-Coupled Device (CCD): requires 3 chips, one each for R, G, B (CCD(R), CCD(G), CCD(B)), with precise alignment; more expensive than CMOS sensors
  CMOS (complementary metal-oxide-semiconductor) sensor: cheaper (but noisier), and can easily be integrated with digital logic circuits
Samsung ISOCELL
Color Filter
Color sensing (Source: Steve Seitz)
Estimate missing components from neighboring values (demosaicing)
Why more green?
Bayer grid
Human Luminance Sensitivity Function
Bayer's pattern: more green samples than blue and red, matching the peak of the human luminance sensitivity function
YungYu Chuang’s slide
Color Filter
Demosaicing filter: fill in the holes of missing R, G, B values
Demosaic Results
Demosaicing filter: utilizing cross-channel Laplacian filters, i.e., let red values influence green values
Results from reference [3]
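As a much simpler baseline than the cross-channel Laplacian filters referenced above, bilinear demosaicing fills each missing green value with the average of its green neighbors. A sketch for the green plane of an RGGB Bayer mosaic (the RGGB layout and function name are illustrative assumptions):

```python
def interp_green(mosaic):
    """Bilinear demosaicing of the green plane of an RGGB Bayer mosaic.
    In RGGB, green samples sit where (row + col) is odd; every R or B
    site has only green 4-neighbors, so we average those."""
    H, W = len(mosaic), len(mosaic[0])
    green = [[0.0] * W for _ in range(H)]
    for r in range(H):
        for c in range(W):
            if (r + c) % 2 == 1:          # a green sample: keep it as-is
                green[r][c] = float(mosaic[r][c])
            else:                          # R or B site: average green neighbors
                nbrs = [mosaic[rr][cc]
                        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < H and 0 <= cc < W]
                green[r][c] = sum(nbrs) / len(nbrs)
    return green
```

The R and B planes are interpolated analogously; the cross-channel idea on the slide improves on this by letting the reconstructed channels correct each other.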
Dynamic Range
High Dynamic Range Imaging
•What is the range of light intensity that a camera can capture?
–Called the dynamic range
–Digital cameras have difficulty capturing both high intensities and low intensities in the same image
–MPEG is launching new HDR video compression work
High Dynamic Range Imaging
  Align multi-exposure images
  Tone mapping: map input measurements to the target display
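A common global tone-mapping operator is Reinhard's L/(1+L), shown here as an illustrative example (not necessarily what any particular camera pipeline uses); it compresses an unbounded HDR luminance range into [0, 1):

```python
def reinhard_tonemap(luminance):
    """Global Reinhard operator: L / (1 + L).
    Low luminances pass nearly unchanged; high luminances
    are compressed asymptotically toward 1."""
    return [l / (1.0 + l) for l in luminance]
```

For example, a luminance of 1 maps to 0.5, while a very bright measurement of 1000 maps just below 1, so both highlights and shadows fit the display range.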
RGB vs CMY
Tri-Chromatic Theory
CIE XYZ Model
CIE Color Gamut
Visible and Printable Gamut
Visible and Printable Color
Y’CBCR
Rec. 601 specifies a range of [16, 235] for Y’ and [16, 240] for CB and CR.
To obtain Y’CBCR from 8-bit R’G’B’ values (i.e., in the range [0, 255]), use the transformation:
\[
\begin{bmatrix} Y' \\ C_B \\ C_R \end{bmatrix}
=
\begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}
+
\frac{1}{256}
\begin{bmatrix}
65.738 & 129.057 & 25.064 \\
-37.945 & -74.494 & 112.439 \\
112.439 & -94.154 & -18.285
\end{bmatrix}
\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}
\]
YUV/YCbCr/YIQ Model
Rec. 601 for TV specifies a range of [16, 235] for Y' and [16, 240] for CB and CR. To obtain Y'CBCR from 8-bit R'G'B' values (i.e., in the range [0, 255]), use the transformation:
\[
\begin{bmatrix} Y' \\ C_B \\ C_R \end{bmatrix}
=
\begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}
+
\frac{1}{256}
\begin{bmatrix}
65.738 & 129.057 & 25.064 \\
-37.945 & -74.494 & 112.439 \\
112.439 & -94.154 & -18.285
\end{bmatrix}
\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}
\]
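The Rec. 601 transformation is straightforward to code; a minimal Python sketch (clamping of the outputs to the legal [16, 235]/[16, 240] ranges is omitted):

```python
def rgb_to_ycbcr_rec601(r, g, b):
    """Rec. 601: 8-bit R'G'B' (0..255) to Y'CbCr."""
    y  =  16.0 + ( 65.738 * r + 129.057 * g +  25.064 * b) / 256.0
    cb = 128.0 + (-37.945 * r -  74.494 * g + 112.439 * b) / 256.0
    cr = 128.0 + (112.439 * r -  94.154 * g -  18.285 * b) / 256.0
    return y, cb, cr
```

Black maps to (16, 128, 128) and white to approximately (235, 128, 128) - the limits of the Rec. 601 luma range, with neutral chroma.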
Color Histogram
The First (very coarse) Image Feature
–Color is an important cue for image perception
–Why not use the statistics of the color distribution in an image to represent the image?
Color Blobs, without spatial info
Computing the Histogram
•The Input
–Collection of pixels, {x_i}, for i = 1..(w x h), in R^3
–Pre-computed color bins, expressed as n centroids, {m_1, m_2, ..., m_n}
•The Output
–A normalized pixel count w.r.t. each color bin
–Binarization
–Distance metrics
–A Matlab example:
  im = imread('cameraman.tif');
  h = imhist(im);
This creates a 256-bin histogram for a single channel grayscale image; how about a color image?
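For a color image, the same idea works against the pre-computed color bins: assign each pixel to its nearest centroid, count, and normalize. A minimal brute-force sketch of the input/output described above (the function and variable names are made up for illustration):

```python
def color_histogram(pixels, centroids):
    """Normalized color histogram: each pixel x_i in R^3 votes for
    its nearest centroid m_j (by squared Euclidean distance)."""
    counts = [0] * len(centroids)
    for p in pixels:
        d2 = [sum((pc - mc) ** 2 for pc, mc in zip(p, m)) for m in centroids]
        counts[d2.index(min(d2))] += 1
    total = float(len(pixels))
    return [c / total for c in counts]
```

The centroids can come from a fixed color-space partition or from k-means on a training set; either way the output sums to 1, so histograms of different-sized images are directly comparable.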
A toy problem:
Color Histogram as a Feature for Image Retrieval
Query / Database
d(q, i) = d(Hq, Hi)
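One common choice for d(Hq, Hi) is the histogram intersection distance (one option among many; L1, L2, and chi-squared distances are also used): identical normalized histograms are at distance 0, fully disjoint ones at distance 1.

```python
def hist_intersection_distance(h_q, h_i):
    """1 minus the sum of bin-wise minima; assumes both
    histograms are normalized to sum to 1."""
    return 1.0 - sum(min(a, b) for a, b in zip(h_q, h_i))
```

Ranking the database by this distance to the query histogram gives the toy retrieval system sketched above.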
MPEG-7 Scalable Color Descriptor
•The Scalable Color Descriptor (SCD) is a color histogram in the HSV color space, encoded using a Haar transform. H is quantized to 16 bins and S and V are quantized to 4 bins each, for a total of 256 bins.
•The pixel count for each bin is quantized to 4 bits, so at most 256x4 = 1024 bits for the representation. The distance between two descriptors is then a Hamming distance; scalability is achieved through the Haar transform.
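The 16x4x4 quantization can be sketched as a flat bin index (an illustrative layout; the actual MPEG-7 bin ordering and quantizer boundaries may differ):

```python
def scd_bin_index(h, s, v):
    """Map an HSV pixel (h in [0, 360), s and v in [0, 1]) to one of
    256 bins: 16 hue bins x 4 saturation bins x 4 value bins."""
    hb = min(int(h / 360.0 * 16), 15)   # 16 uniform hue bins
    sb = min(int(s * 4), 3)             # 4 saturation bins
    vb = min(int(v * 4), 3)             # 4 value bins
    return (hb * 4 + sb) * 4 + vb       # flat index in 0..255
```

Counting pixels into these 256 bins produces the raw histogram that the SCD then quantizes and Haar-transforms.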
[HSV color space figure: Hue angles - Red (0°), Yellow (60°), Green (120°), Cyan (180°), Blue (240°), Magenta (300°); Saturation and Value axes run from Black to White]
MPEG 7 Scalable Color Descriptor
•Achieving scalability by two-stage quantization and a Haar transform
•1st stage: a non-linear quantization from 11 bits to 4 bits, giving more resolution to lower pixel counts
•Haar transform: decorrelates the bins
•Linear quantization: generates the bit stream
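One level of the Haar transform over the histogram is just pairwise sums (low-pass) and differences (high-pass); transmitting only the low-pass half yields a coarser but still usable descriptor, which is where the scalability comes from. A minimal sketch (averaging convention chosen for readability, not the exact MPEG-7 normalization):

```python
def haar_1d(x):
    """One level of a Haar transform on an even-length sequence:
    first half of the output is pairwise averages (low-pass),
    second half is pairwise half-differences (high-pass)."""
    lo = [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    hi = [(x[i] - x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    return lo + hi
```

Applying it recursively to the low-pass half produces the multi-resolution coefficient pyramid that the SCD bit stream encodes.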
Scalable Color Descriptor Retrieval Results
•Not bad for color-dominated images...
Dark Image Enhancement from Sensor Field
Low light photography
Low light vision tasks:
  Object detection and recognition under low light conditions
  Surveillance
Figure 1. Low light photography for mobile devices
Figure 2. Low light pedestrian detection (Ref: Multispectral Deep Neural Networks for Pedestrian Detection)
Objectives
  To design an end-to-end network that performs image denoising and enhancement
  To design a network with less computational complexity for fast low light photography and video applications
  Input: RGBG Bayer pattern sensor data; output: RGB images
Figure 3. [a] Extreme low-light image from a Sony a7S II exposed for 1/10 second. [b] 100x intensity scaling of the image in [a]. [c] Ground truth image captured with a 10 second exposure time. [d] Output from [1]. [e] Output from our method.
Dark Image Enhancement from Sensor Field
•Residual based learning - learns the noise instead of an image prior
•Residual blocks with a 2x scaling layer at the end of the network
•LeakyReLU as the activation function for the residual blocks
•Each residual block is followed by a Squeeze-and-Excitation block - converges the network faster and increases performance
Figure 5. [a] Proposed network [b] Residual block with Squeeze-and-Excitation network
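The Squeeze-and-Excitation idea is channel re-weighting: global-average-pool each channel ("squeeze"), pass the result through a small two-layer bottleneck ending in a sigmoid gate ("excitation"), and rescale the channels. A framework-free numerical sketch (the weight shapes and values are made up; a real implementation, including the network above, would use a deep learning framework):

```python
import math

def squeeze_excite(feat, w1, w2):
    """feat: C x H x W nested lists; w1: (C/r, C) and w2: (C, C/r)
    bottleneck weights. Returns the channel-rescaled feature map."""
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feat]                                    # squeeze: per-channel mean
    s = [max(0.0, sum(w * zi for w, zi in zip(wr, z)))
         for wr in w1]                                      # FC 1 + ReLU
    gate = [1.0 / (1.0 + math.exp(-sum(w * si for w, si in zip(wr, s))))
            for wr in w2]                                   # FC 2 + sigmoid, in (0, 1)
    return [[[v * g for v in row] for row in ch]
            for ch, g in zip(feat, gate)]                   # excite: rescale channels
```

Because the gate lies in (0, 1), the block can only attenuate channels, letting the network emphasize informative channels relative to noisy ones.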
Denoising and Color Accuracy
Denoising
Color accuracy
Figure 6: [a] Ground truth image [b] Output from SID; noise is still present in a few parts of the image [c] Output from BM3D; the denoised image is darker than the ground truth [d] Denoised output from our network
Figure 7: [a] Input dark image [b] 100x scaled version of the dark image [c] Ground truth with an exposure time of 10 seconds [d] SID output with missing color information, PSNR: 20.48dB [e] Output from our network, a close approximation to the ground truth image, PSNR: 27.17dB
Less Color Spread and Image Details
Less Color Spread
Image Details
Figure 9: [a] Ground truth [b] 300x amplified dark image [c] U-Net output; the image is not clear due to a pixelated effect [d] Output from our network with higher image quality
Figure 8: [a] Input dark image from the Sony 300x subset [b] 300x amplification of the dark image [c] Ground truth image with an exposure time of 10 seconds [d] U-Net output with unnecessary color spread at the ground [e] Output from our network, a close approximation to the ground truth image
Summary
In this lecture, we covered:
  Camera model
  Color space
  Color model manipulation and transforms
  Demosaicing - from the sensor field to image pixels
  Color based image features
  A taste of deep learning based dark image enhancement
To do:
  Install VLFeat v0.9.20 (v0.9.21 has some issues)
  Go through the ETHZ Matlab image processing tutorial - a very good one; will arrange a lab session on this
Next Lecture: Image Formation - Geometry: how pixels are related to 3D world points, and how pixels from images of the same scene are related
Q&A