Vision in Man and Machine. STATS 19 SEM 2. 263057202. Talk 2.
Alan L. Yuille.
UCLA. Dept. Statistics and Psychology.
www.stat.ucla/~yuille
The Purpose of Vision.
“To Know What is Where by Looking”. Aristotle. (384-322 BC).
Information Processing: receive a signal by light rays and decode its information.
Vision appears deceptively simple, but there is more to Vision than meets the Eye.
Ames Room
Perspective.
Curved Lines?
Brightness of Patterns: Adelson (MIT)
Visual Illusions
The perception of the brightness of a surface, or the length of a line, depends on context, not on basic measurements such as the number of photons that reach the eye or the length of the line in the image.
Perception as Inference
Helmholtz. 1821-1894. “Perception as Unconscious Inference”.
Ball in a Box. (D. Kersten)
How Hard is Vision?
The Human Brain devotes an enormous amount of resources to vision.
(I) The optic nerve is the biggest nerve in the body.
(II) Roughly half of the neurons in the cortex are involved in vision (van Essen).
If intelligence is proportional to neural activity, then vision requires more intelligence than mathematics or chess.
Vision and the Brain
Half the Cortex does Vision
Vision and Artificial Intelligence
The hardness of vision became clearer when the Artificial Intelligence community tried to design computer programs to do vision.
In the ’60s, AI workers thought that vision was “low-level” and easy.
Prof. Marvin Minsky (pioneer of AI) asked a student to solve vision as a summer project.
Chess and Face Detection
The Artificial Intelligence community preferred chess to vision.
By the mid-90’s Chess programs could beat the world champion Kasparov.
But computers could not find faces in images.
Man and Machine.
David Marr (1945-1980) Three Levels of explanation:
1. Computation Level/Information Processing
2. Algorithmic Level
3. Hardware: Neurons versus silicon chips.
Claim: Man and Machine are similar at Level 1.
Vision: Decoding Images
Vision as Probabilistic Inference
Represent the World by S.
Represent the Image by I.
Goal: decode I and infer S.
Model image formation by a likelihood function (generative model), P(I|S).
Model our knowledge of the world by a prior P(S).
Bayes Theorem
Bayes’ Theorem then states that we should infer the world S from the image I by
P(S|I) = P(I|S)P(S)/P(I).
Rev. T. Bayes. 1702-1761
Bayes to Infer S from I
P(I|S): likelihood function. P(S): prior.
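As a minimal sketch of the rule above, the posterior P(S|I) can be computed directly when S and I each take only a few discrete values. The state names ("face", "no_face"), the observation names, and all the probabilities below are illustrative assumptions, not numbers from the talk; real vision models work with far larger state spaces.

```python
def posterior(prior, likelihood, image):
    """Return P(S | I=image) for a small discrete set of world states S."""
    # Unnormalized posterior: P(I|S) * P(S) for every candidate state S.
    joint = {s: likelihood[s][image] * p for s, p in prior.items()}
    # The evidence P(I) is the sum over all states; it normalizes the posterior.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("image has zero probability under every state")
    return {s: v / evidence for s, v in joint.items()}


# Hypothetical prior P(S): faces are assumed rare in an arbitrary image patch.
prior = {"face": 0.1, "no_face": 0.9}

# Hypothetical likelihood P(I|S): how often each state produces each observation.
likelihood = {
    "face": {"face_like": 0.8, "not_face_like": 0.2},
    "no_face": {"face_like": 0.05, "not_face_like": 0.95},
}

post = posterior(prior, likelihood, "face_like")
best = max(post, key=post.get)  # most probable state given the image
print(post, best)
```

With these assumed numbers the posterior probability of "face" is about 0.64: the observation favors a face strongly enough to outweigh the low prior.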
Technically very interdisciplinary
But applying Bayes is not straightforward.
A beautiful theory is being developed, adapting techniques from Computer Science, Engineering, Mathematics, Physics, and Statistics.
E.g., Probabilistic Reasoning (Pearl, CS), Level Sets (Osher, Maths).
Examples
Generative Models. Visual Inference:
(1) Estimating Shape.
(2) Segmenting Images.
(3) Detecting Faces.
(4) Detecting and Reading Text.
Generative Models
Learn Generative Models from a few images and then generate new images.
Uses of Generative Models
Univ. Oxford
Shape Inference: (Zhu Lab)
Shape and Photometry (Soatto Lab)
– Estimate geometry (shape) and photometry from multiple images.
Jin-Soatto-Yezzi
Compare ground truth (Soatto Lab)
Panels: estimated shape; alternative algorithm; ground truth.
Jin-Soatto-Yezzi 11/1/02
Compare w. ground truth (Soatto Lab)
Generated image: synthesized from novel viewpoint and illumination.
Ground truth: same lighting and viewpoint.
Segmentation (Level Sets)
Segmenting Images (Zhu Lab)
Characterize the set of image patterns that occur in natural images.
Provide mathematical models: P(I|S) and P(S).
Face and Text Detection.
Back to the Brain
Top-Level: compare human performance to Ideal Observers.
Explain human perceptual biases (visual illusions) as strategies that are “statistically effective”.
Brain Architecture
The Bayesian models have interesting analogies to the brain.
Generative Models require top-down processing.
High-Level Tells Low-Level to Shut Up (Kersten Lab)
Conclusion
Vision is unconscious inference. Theory of Vision for Man and Machine.
See more about Vision at UCLA in the Vision and Image Science Collective
http://visciences.ucla.edu