home.iitk.ac.in/~venkats/teaching/slide_1.pdf


EE 604 Image Processing

Philosophy, perception, optics

Introduction

What Is This About?

• Very broadly, creating, processing, representing, and presenting visual information. [Image/Video Processing]

• At the next level, interpreting, categorizing, and reasoning with visual information. [Computer Vision]

How Do We See?

"The eye is like a mirror, and the visible object is like the thing reflected in the mirror." -- Avicenna, early 11th century

Extromissive and Intromissive theories of vision

• Extromissive (active) vision: the eye emits ‘rays’ that go out and strike objects, thus discovering them.

• Intromissive (passive) vision: the eye receives ‘rays’ from objects outside, and thus perceives objects.

• Both theories explain satisfactorily why you cannot see if your eyes are closed, but extromission cannot explain why you cannot see in the dark.

What Do We Need?

For ‘seeing’ to happen, three things are required:

• Light (illumination): no light, no see, see?

• Object(s): the light must be intercepted by objects which should interact with it (reflect/transmit).

• Sensing system: given all the above, we need a physical mechanism to receive the redirected light from the object and finally process/sense it.

Sources of Light

• Active Sources (emitters): these produce light, that is, convert energy from some other form into light.

• Passive Sources (reflectors/transmitters): these receive light from other sources (active or passive) and simply pass on a part of it.

• Sources can be point- (tiny LED), discrete- (collection of tiny LEDs), or extended (tubelight, dome light) in space.

In the absence of sources of some kind, there will be no images.

Objects

An object is made visible (seen) only if it disrupts (interacts with) the light from the source in some manner, by diverting or deflecting it by virtue of its presence.

• Example: a glass rod or prism redirects the light coming from the source: we say we ‘see’ the object.

• Air is ever present around us, but we cannot see it, because it hardly interacts with the light.

• Generally, each point on an object scatters/reflects light in all directions.

Object-Light Interaction

• If an object deflects light from the source by less than 90° from the original direction, we shall call it transmissive. If it transmits the light without diffusing it, we shall call it a transparent object; else we call it translucent. Transparent transmitters are less common.

• If an object deflects light from the source by more than 90° from the original direction, we shall call it reflective. If it reflects the light without diffusing it, we shall call it a specular object; with increasing diffusion, we call it glossy and then Lambertian. Specular reflectors are less common.

• An object can partly transmit and partly reflect, too. [Glass! Anti-reflective coatings]

Sensors

Sensors receive light and absorb some of it.

• If a sensor absorbs zero light, it cannot be seeing/sensing the scene.

• Hence a truly invisible man has to be blind. (Get it, Harry Potter?)

• The region of space over which the sensor accepts light is called its aperture.

• Generally, a sensor accepts light from all (many) directions.

• Sensors can be point- (tiny photodiode/phototransistor), discrete- (a CCD/CMOS array, the eye) or extended in space (rare).

• A sensor with an array of individual independent elements, such as the CCD array or the eye, qualifies to be called an imaging sensor.

Imaging

• If there are multiple objects, or objects with their own internal detail, we have a nontrivial scene.

• Imaging or image acquisition consists of mapping the spatial distribution of the light coming from the scene towards the sensor to create a sort of replica of it that we call the image.

• In general, the image is expected to resemble the scene in some sense. This requires a physical mechanism to establish a one-one correspondence between scene elements and sensor elements.

The Problem with Imaging

• In general, every individual point in the scene (scene element) separately acts as a source of light, and it emits/reflects/transmits light in multiple/all directions.

• Simultaneously, each element of the imaging sensor collects light from multiple/all directions.

• Result: many/every scene element is ‘seen’ by many/every sensor element.

We don’t want this! We want a mechanism to establish a one-one correspondence between scene elements and sensor elements.

Eugene Hecht, 5th Ed

The Need for Optics

• The optical system is an imaging sensor (henceforth, camera) which has a mechanism to establish a one-one correspondence between scene elements and sensor elements.

• The optical system acts as a disentangler that separates out the light components coming from different scene elements and sends them to separate sensor elements (sensor pixels).

• Without the optical system, the scene-camera mapping is ‘from-all-to-all’. With it, the scene-camera mapping becomes ‘from-one-to-one’. Thus, we have an ‘image’.

The Need for Optics

[Figure: objects and sensor array, without optics (all-to-all) and with optics (one-to-one).]

Theories of Light

• Down the ages, light has been recognized as a form of energy. How does it propagate?

• Particles: light is a beam of particles that fly in straight lines.

• Waves: light propagates in the form of a wave.

• We accept the wave model, but with important simplifications: diffraction and interference phenomena are neglected.

• This leads to a very convenient approximate theory called geometric optics. Geometric optics is an acceptable approximation as long as all objects are far larger than the wavelength of the light.

READING ASSIGNMENT: The complete Wikipedia page on “Light”

Wave and Ray Diagrams

READING ASSIGNMENT: Hecht ‘Optics’. Chapter 5, Geometric Optics.

A hyperbolic interface between air and glass. (a) The wavefronts bend and straighten out. (b) The rays become parallel. (c) The hyperbola is such that the optical path from S to A to D is the same no matter where A is.

Eugene Hecht, 5th Ed

Converting Waves to Rays

• Rays are straight lines.

• Waves are represented as wavefront curves. All points on a curve are in phase.

• To convert a wave diagram into a ray diagram, draw lines orthogonal to each of the wavefronts, and connect up collinear rays.

• Thus a spherically expanding wavefront gives a diverging beam, a plane wavefront gives a parallel beam, and a spherically contracting wavefront gives a converging beam.

Convergent and Divergent Interfaces

(a) and (b) Hyperboloidal and (c) and (d) ellipsoidal refracting surfaces (n2 > n1), in cross section.

READING ASSIGNMENT: Hecht ‘Optics’. Chapter on Geometric Optics

Eugene Hecht, 5th Ed

Double Interfaces (Lenses)

Eugene Hecht, 5th Ed

The Simple Thin Spherical (STS) Lens: relations

The STS Lens: equations and limitations
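The equations themselves were images on the original slide and are lost in this extraction; the standard simple-thin-lens relations, which the slide presumably covers (notation as in Hecht: object distance s_o, image distance s_i, focal length f), are:

```latex
% Gaussian thin-lens equation
\frac{1}{s_o} + \frac{1}{s_i} = \frac{1}{f}

% Lensmaker's equation (thin lens in air, refractive index n,
% surface radii R_1 and R_2)
\frac{1}{f} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)

% Transverse (lateral) magnification
M_T = -\frac{s_i}{s_o}
```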

Spherical aberration [Figure: Wikipedia]

Chromatic aberration [Figure: Google Images]

The Simplest Camera: no lens

• An imaging system or camera ought to consist of the illumination, the optics as well as the sensor.

• However, the illumination system is often widely distributed in space, well beyond the confines of the sensor-optics setup, so we leave it out.

• Thus, the camera is considered to consist of just the optics followed by the sensor array.

• The simplest optics that one can construct is the pinhole. A pinhole camera approximately forms an image on a sensor plane.

• The sensor array can be at any distance behind the pinhole. It forms an inverted image.
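A minimal sketch (my own illustration, not from the slides; the function name and the sensor distance d are arbitrary) of the pinhole mapping just described: a scene point (X, Y, Z) projects through the pinhole to (-dX/Z, -dY/Z) on a sensor plane at distance d behind it, the minus signs producing the inversion.

```python
import numpy as np

def pinhole_project(points, d):
    """Project 3D scene points (X, Y, Z) through a pinhole at the origin
    onto a sensor plane at distance d behind it. Z is the depth (> 0)."""
    points = np.asarray(points, dtype=float)
    X, Y, Z = points[:, 0], points[:, 1], points[:, 2]
    # One ray per scene point; the negative signs invert the image.
    return np.stack([-d * X / Z, -d * Y / Z], axis=1)

# A point above the axis images below it, whatever the sensor distance d.
print(pinhole_project([[0.0, 1.0, 10.0]], d=0.05))   # [[ 0.    -0.005]]
print(pinhole_project([[0.0, 1.0, 10.0]], d=0.10))   # [[ 0.    -0.01 ]]
```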

The Thin Lens Camera: description

• A pinhole camera has two interesting properties, of which the first is that it focusses all scene depths at all image depths: everything is in focus, wherever the sensor is placed.

• The second property is a serious limitation. It accepts only very little light from the scene, in particular, only one ray from each point in the scene. This makes the images formed very dim.

• A thin lens camera replaces the pinhole by a thin lens at the same centre of projection. Unlike the pinhole, the lens has a nonzero, finite aperture, so that it admits a nonzero amount of energy.

• Now, for any given scene depth, there is a specific conjugate image depth, related reciprocally through the lens equation. Images of objects at a particular depth are formed in focus only at this conjugate depth, and are ‘defocussed’ at all other depths (a numerical sketch follows below).

Michael Veth: Google

The centre of projection is still taken to be the centre of the lens. But with a lens, every point in the aperture admits a ray from any given scene point, delivering more energy and making the image of the scene point brighter. So, the bigger the aperture and the lens, the brighter the image.
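A sketch of the conjugate-depth behaviour described above (my own construction; the helper names and the 50 mm / 20 mm numbers are arbitrary choices): the in-focus image depth follows from the thin-lens equation, and an object at any other depth forms a blur circle whose diameter, by similar triangles, grows with the aperture.

```python
import numpy as np

def image_depth(f, s_o):
    """Conjugate image distance from the thin-lens equation 1/s_o + 1/s_i = 1/f."""
    return 1.0 / (1.0 / f - 1.0 / s_o)

def blur_diameter(f, aperture, s_focus, s_obj):
    """Diameter of the blur circle for an object at depth s_obj when the
    sensor sits at the conjugate depth of s_focus (similar triangles)."""
    v_sensor = image_depth(f, s_focus)   # where the sensor actually is
    v_obj = image_depth(f, s_obj)        # where this object would focus
    return aperture * abs(v_sensor - v_obj) / v_obj

f, A = 0.05, 0.02                        # 50 mm lens, 20 mm aperture
print(image_depth(f, 2.0))               # sensor depth that focuses 2 m
print(blur_diameter(f, A, s_focus=2.0, s_obj=4.0))  # defocus blur at 4 m
```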

Capture parameters

Exposure and blur

Tech specs, capabilities, fallibilities

Visual System

The Human Eye

Sensor Cells and Image Formation

HVS Spectral Sensitivity Curves

[Figures: photopic spectral efficacy; scotopic luminosity function; individual cone responses.]

Thus, since the scotopic (rod) sensitivity curve peaks at a shorter wavelength than the photopic curve:

• Things look bluish in dim light

• Reddish objects are not seen well in dim light

Wikipedia

Scotopic (Low Light) Vision

[Figure: scotopic vision, human vs. cat.]


Subjective Brightness

To measure sensitivity to intensity change, the inner circular region is illuminated at an intensity different from that of the outer region.

Simultaneous Contrast

Simply put, the subjective brightness of any region is not independent of its surroundings. In the picture above, all the inner squares are of the same objective intensity. In the picture on the right (White’s Illusion), the squares A and B are of equal intensity.

Weber’s Law of Simultaneous Contrast

• Two luminances are just noticeably different from one another if their ratio is at least a certain value: the Weber ratio. This value itself varies slightly with intensity, as shown.

• Equal changes in the logarithm of the intensity result in equally noticeable changes over a wide range of intensities. This suggests that the human eye performs a pointwise logarithm operation on the input image (see the sketch below).
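A minimal sketch (my own, assuming a grayscale image as a NumPy array) of the pointwise log mapping suggested above; the scaling constant c is just a normalization choice:

```python
import numpy as np

def log_transform(img, out_max=255.0):
    """Pointwise log mapping s = c * log(1 + r), scaled so the full
    input range maps onto [0, out_max]."""
    img = img.astype(float)
    c = out_max / np.log1p(img.max())
    return c * np.log1p(img)

# Intensities in ratio 1:2:4 map to (roughly) equal output steps,
# mirroring Weber's observation about just-noticeable differences.
r = np.array([32.0, 64.0, 128.0])
print(np.diff(log_transform(r)))
```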

Contrast Sensitivity

Lateral Inhibition

• Lateral inhibition is a mechanism by which neurons are able to determine more precisely the origin of a stimulus. For instance, when the skin is touched by an object, several sensory neurons in the skin next to one another are stimulated.

• To determine more exactly the origin of the stimulus, neurons that are stimulated suppress the stimulation of neighbouring neurons.

• The amount of inhibition is greater when a neuron’s own stimulation is more powerful.

• Thus, only the neurons that are strongly stimulated will fire. These neurons are more to the centre of the stimulus, while the suppressed neurons lie somewhat away from the centre of the stimulus.

Lateral Inhibition Modeling

The grey level transition is shown along with the plot of the intensity and subjective brightness. Above: the impulse response of the system that yields the subjective response observed.
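A toy model (my construction, with arbitrary weights; not the slide's exact impulse response) of lateral inhibition as a center-surround filter: convolving a step edge with an excitatory center and inhibitory surround produces the overshoot and undershoot perceived as Mach bands.

```python
import numpy as np

# A 1D step edge in intensity: dark region, then bright region.
signal = np.concatenate([np.full(50, 0.2), np.full(50, 0.8)])

# Center-surround impulse response: excitatory center, inhibitory surround.
# Weights sum to 1, so uniform regions pass through unchanged.
kernel = np.array([-0.1, -0.2, 1.6, -0.2, -0.1])

# Edge-replicate padding avoids spurious dips at the signal boundaries.
padded = np.pad(signal, 2, mode="edge")
response = np.convolve(padded, kernel, mode="valid")

print(signal.min(), signal.max())      # 0.2 0.8
print(response.min(), response.max())  # undershoot below 0.2, overshoot above 0.8
```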

Lateral Inhibition Phenomena

How perceived brightness varies against the actual intensity. Observe the inherent enhancement of the edges.

Cornsweet Edge Illusion

Intensity vs perceived brightness

Adelson's Checker Shadow Illusion

The two squares A and B appear very different as a result of the illusion. The second picture includes a rectangular bridge connecting squares A and B to show that they are the same shade of gray.

Chubb Illusion

Both the inner square regions are identical, though the one on the right appears to have less contrast than the one on the left.

Visual Gestalts: interpretational ambiguities

Spatial Frequency Response

The subjective response to spatial frequency varies with the frequency. In the picture shown on the right, frequency is varied horizontally, while contrast is varied vertically; the response is shown in the plot to the left. The angular orientation of the spatial variation also matters.
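A sketch (my own; the sweep ranges are arbitrary choices) generating the kind of test image described: frequency increasing left to right, contrast increasing top to bottom, so the visible extent of the grating traces out one's contrast sensitivity curve.

```python
import numpy as np

H, W = 256, 512
x = np.linspace(0, 1, W)
y = np.linspace(0, 1, H)[:, None]

# Frequency sweeps logarithmically across columns, contrast across rows.
freq = 2.0 * 60.0 ** x               # cycles over the image width: 2 -> 120
contrast = 0.005 * 100.0 ** y        # contrast: 0.005 -> 0.5

# Use the integral of the instantaneous frequency as the sinusoid's phase
# so the sweep is smooth (a plain sin(2*pi*f(x)*x) would distort).
phase = 2 * np.pi * np.cumsum(freq) / W
grating = 0.5 + contrast * np.sin(phase)   # values stay within [0, 1]

image = (255 * grating).astype(np.uint8)   # viewable 8-bit chart
```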

Spatial Frequency Application

Important visual signals always make use of the properties of the visual system. In this case, we exploit the fact that an alternating pattern, having a spatial frequency > 0, is much more easily seen than a plain (uniform) one.

Temporal Frequency Response

Response to temporal frequency – ‘flicker’. The curves depend on the intensity range of the flicker. Sensitivity is greater at higher intensities.

Image Representation, Approximations

Digital Images

Quantification

Digital Image: Sampling, Quantization

Digital Image Quantization

Gonzalez
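The quantization figures themselves are not reproduced here; as a minimal sketch (assuming an 8-bit grayscale image in a NumPy array), uniform requantization to a smaller number of gray levels looks like this:

```python
import numpy as np

def quantize(img, levels):
    """Uniformly quantize an 8-bit grayscale image to the given number
    of gray levels, mapping the result back onto the 0-255 range."""
    step = 256 // levels
    return (img // step) * step + step // 2   # mid-rise reconstruction

img = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
for k in (2, 4, 16, 64):
    print(k, "levels:", np.unique(quantize(img, k)))
```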

Digital Image standards and formats

Parameter      Typical values
Rows           256, 512, 525, 625, 1024, 1035
Columns        256, 512, 768, 1024, 1320
Gray levels    2, 64, 256, 1024, 4096, 16384

Tools: The continuous 2D Fourier Transform
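The defining equations on the slide were images and are lost in this extraction; the standard continuous 2D Fourier transform pair (sign and scaling conventions vary across texts) is:

```latex
F(u,v) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
         f(x,y)\, e^{-j2\pi(ux+vy)}\, dx\, dy

f(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty}
         F(u,v)\, e^{+j2\pi(ux+vy)}\, du\, dv
```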

The 2D discrete Fourier Transform
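Likewise, the standard 2D DFT pair for an M×N image (placing the 1/MN factor on the inverse is one common convention) is:

```latex
F(k,l) = \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} f(m,n)\,
         e^{-j2\pi\left(\frac{km}{M}+\frac{ln}{N}\right)}

f(m,n) = \frac{1}{MN}\sum_{k=0}^{M-1}\sum_{l=0}^{N-1} F(k,l)\,
         e^{+j2\pi\left(\frac{km}{M}+\frac{ln}{N}\right)}
```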

Properties of the Fourier Transform

• The value of the space signal or its spectrum at the origin (standard results, stated with the transform convention above):

F(0,0) = \iint f(x,y)\,dx\,dy, \qquad f(0,0) = \iint F(u,v)\,du\,dv

• Derivatives:

\frac{\partial f}{\partial x} \leftrightarrow j2\pi u\,F(u,v), \qquad \frac{\partial f}{\partial y} \leftrightarrow j2\pi v\,F(u,v)

Significance of phase and magnitude

Both the magnitude and the phase functions are necessary for the complete reconstruction of an image from its Fourier transform.

Partial reconstructions

Reconstructed from magnitude alone (phase assumed to be 0) and from phase alone (magnitude set to a constant).
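A sketch of this experiment using NumPy's FFT (my own; the test image is arbitrary): reconstruct keeping only the magnitude (phase zeroed) or only the phase (magnitude replaced by a constant). The phase-only result typically preserves far more of the image structure.

```python
import numpy as np

def partial_reconstructions(img):
    """Return magnitude-only and phase-only reconstructions of an image."""
    F = np.fft.fft2(img.astype(float))
    mag, phase = np.abs(F), np.angle(F)

    # Magnitude only: discard the phase (set it to zero everywhere).
    from_mag = np.real(np.fft.ifft2(mag))
    # Phase only: discard the magnitude (set it to a constant 1).
    from_phase = np.real(np.fft.ifft2(np.exp(1j * phase)))
    return from_mag, from_phase

img = np.zeros((64, 64))
img[20:40, 10:50] = 1.0                  # a simple bright rectangle
m_only, p_only = partial_reconstructions(img)
# p_only retains the rectangle's edges; m_only loses its location entirely.
```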

2D FT pairs examples
