Computational Sensory Motor Systems Lab, Johns Hopkins University
Neuromorphic Image Processing
Ralph Etienne-Cummings, The Johns Hopkins University
Collaborators: Kwabena Boahen, Gert Cauwenberghs, Timothy Horiuchi, M. Anthony Lewis, Philippe Pouliquen
Students: Eugenio Culurciello, Viktor Gruev, Udayan Mallik
Sponsors: NSF, ONR, ARL
An Alternative Style of Neuromorphic Image Processing
• Traditional image processing uses pixel-serial image access, digitization and sequential processing
- Discrete levels, discrete time
- High-fidelity images, large vocabulary of functions (GP)
- High power, high latency, small sensor/processing area ratio
• Traditional neuromorphic vision systems typically use pixel-parallel processing
- Continuous and/or discrete levels, continuous time
- Low-fidelity images, large pixels, small vocabulary of functions (ASICs)
- Low power, low latency
• Computation-On-Readout (COR) vision systems use block-serial, pixel-parallel image processing
- Continuous levels, discrete time
- High-fidelity images, medium vocabulary of functions (pseudo-GP)
- Low power, medium/low latency, computation for “free”
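As a rough illustration of block-serial, pixel-parallel computation-on-readout, the sketch below steps a kernel-sized block across an image one position at a time and, at each position, forms the weighted sum of every pixel in the block. The function name `cor_readout`, the test image and the kernel are hypothetical; on a chip the sum would be formed by analog circuitry during readout, not in software.

```python
def cor_readout(image, kernel):
    """Scan a kernel-sized block over the image (block-serial) and return
    the weighted sum of each block (pixel-parallel within the block)."""
    kh, kw = len(kernel), len(kernel[0])
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - kh + 1):          # serial over block positions
        out_row = []
        for c in range(cols - kw + 1):
            # On hardware all pixels of the block contribute at once;
            # here that parallel sum is written as a single expression.
            acc = sum(kernel[i][j] * image[r + i][c + j]
                      for i in range(kh) for j in range(kw))
            out_row.append(acc)
        out.append(out_row)
    return out


# Hypothetical 4x4 image with a vertical step edge and a 2x2 difference kernel.
image = [[0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9],
         [0, 0, 9, 9]]
kernel = [[-1, 1],
          [-1, 1]]
result = cor_readout(image, kernel)
```

The output is non-zero only where the block straddles the step edge, which is the sense in which the filtered image arrives together with the readout.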
Adaptive SpatioTEmpoRal Imaging (ASTERIx) Architecture
Competition/recurrence possible
- Digitally controlled analog processing
- Image acts as memory
- Parallel execution of multiple filters
- Temporal evolution of results
- Standard Fetch-Decode-Compute-Store (RISC) architecture possible
Foveated Tracking Chip
Technology: 2 µm NWELL CMOS, 2 metal, 2 poly
Chip size: 6.4 x 6.8 mm²
Package: 132-pin DIP
Array sizes: Fovea: 9 x 9 @ 150 µm pitch; Periphery: 19 x 17 @ 300 µm pitch
Fill factor: Fovea: 18%; Periphery: 34%
Transistors/cell: Fovea: Receptor + Edge: 12, Motion: 8; Periphery: Receptor + Edge + ON: 12, Centroid: 15
Photosensitivity: 6 orders of magnitude
Contrast: 10 - 100%
Foveal direction sensitivity: 2.5 µW/cm²: 1.5 - 1.5K pixels/s; 25 µW/cm²: 3 - 4.5K pixels/s; 250 µW/cm²: 5 - 10K pixels/s
Peripheral ON-set sensitivity: 2.5 µW/cm²: < 0.1 - 63K Hz; 25 µW/cm²: < 0.1 - 250K Hz; 250 µW/cm²: < 0.1 - 800K Hz
Power consumption: 25 µW/cm²: >10 mW @ 3 V supply
• Spatially variant layout of sensors and processing elements
• Dynamically controllable spatial acuity
• Velocity measurement capabilities
• Combined high-resolution imaging and focal-plane processing
• A single-chip stereo vision system has been implemented
• Contains two 128 x 128 imagers
• Computes full-frame disparity in parallel
• Provides a confidence measure on computation
• Uses a vertical template to reduce noise and computation
• Operates at 20 fps
• Uses ~30 mW @ 5 V (can be reduced)
VLSI Implementation of Robotic Vision System: Single Chip Micro-Stereo System
[Figure: Matlab simulation of the VLSI algorithm; single-chip stereo optics; VLSI algorithm; chip layout (STEREO CHIP, IMAGE 1, IMAGE 2)]
[Figure: Disparity (# pixels shift from right to left) after confidence test. Measured data: line is disparate on the imagers]
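The slides do not spell out the stereo algorithm, so the following is only a minimal sketch of how a disparity and a confidence value could be obtained with a vertical template. It assumes each image is first collapsed by summing its columns over the rows, that disparity is the horizontal shift with the lowest mean absolute difference, and that confidence is the gap between the best and second-best cost. All names and the test images are hypothetical.

```python
def column_profile(image):
    """Collapse the image vertically: sum each column over all rows
    (a stand-in for the vertical template mentioned on the slide)."""
    return [sum(col) for col in zip(*image)]


def disparity_with_confidence(left, right, max_shift):
    """Return (best_shift, confidence) for two 1-D profiles.

    best_shift minimizes the mean absolute difference between left[x] and
    right[x + shift]; confidence is the gap between the best and the
    second-best cost (a hypothetical measure, larger means more reliable).
    max_shift should be small compared with the profile length."""
    costs = {}
    n = len(left)
    for shift in range(-max_shift, max_shift + 1):
        pairs = [(left[x], right[x + shift])
                 for x in range(n) if 0 <= x + shift < n]
        costs[shift] = sum(abs(a - b) for a, b in pairs) / len(pairs)
    ranked = sorted(costs, key=costs.get)
    best, runner_up = ranked[0], ranked[1]
    return best, costs[runner_up] - costs[best]


# Hypothetical images: a bright vertical line at column 2 (left) and 4 (right).
left_img = [[0, 0, 9, 0, 0, 0, 0, 0] for _ in range(4)]
right_img = [[0, 0, 0, 0, 9, 0, 0, 0] for _ in range(4)]
shift, confidence = disparity_with_confidence(
    column_profile(left_img), column_profile(right_img), max_shift=2)
```

A low confidence value would mark disparities that should be discarded, in the spirit of the "after confidence test" plot.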
Biological Inspiration of the GIP Chip: Orientation Detection
Parallel Processed Images
Spatiotemporal receptive fields
Spatially Processed: Orientation Selectivity
Temporally Processed: Motion Detection
• Implemented CMOS imagers with focal-plane spatiotemporal filters
• Realized high-resolution imaging and high-speed processing
• Consumes milliwatts of power
• Performs image processing at GOPS/mW (unmatched by any other technology)
• Used for optical flow measurement, object recognition and adaptive optics
VLSI Implementation of Robotic Vision System: Spatiotemporal Focal Plane Image Processing
• Implemented a chip that contains a camera and a recognition engine
• Decomposes the image into Hue, Saturation and Intensity (HSI)
• Creates an HSI template for each learned object
• Identifies parts of the scene that match a template
• Used by interactive toys, aids for the blind and robots
Color-Based Object Recognition on a Chip
Smart Camera Chip
Synapses
Neurons
“Learned” templates
Skin-tone Identification
Fruit Identification
Coke or Pepsi?
Low Noise Imaging and Motion Tracking Chip
• Implemented CMOS imager with active pixel sensor and motion tracking
• Obtains low-noise images
• Tracks multiple targets simultaneously
• Consumes milliwatts of power
• Used for optical flow measurement, target tracking, 3D mouse and robot-assisted surgical systems
VLSI Implementation of Robotic Vision System: Visual Tracking
Sample Image Target Tracking
Technology 0.5µm 3M CMOS
Array Size APS: 120 (H) x 36 (V)
Pixel Size APS: 14.7µm x 14.7µm
Fill Factor APS: 16%
Power Consumption (with 3.3V supply): 3.2mW
FPN (APS), Dark (Std. Dev./Full Scale): Pixel-Pixel (within column): 0.6%; Column-Column: 0.7%
FPN (APS), Half Scale (Std. Dev./Full Scale): Pixel-Pixel (within column): 0.7%; Column-Column: 1.2%
Ultrasonic Array Processing
• Implemented ultrasonic bearing estimation chip and change detection chip
• Uses sonic flow across microphone array to measure bearing of target
• Creates internal map of environment
• Detects changes in the structure of the environment
• Operates on milliwatts of power
• Used for surveillance and navigation
VLSI Implementation of Robotic Vision System: Ultrasonic Imaging and Tracking
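The chip measures bearing from the sonic flow across the microphone array; its circuit is not described in the text. As a hedged point of reference, the sketch below uses the standard far-field time-difference-of-arrival relation, sin(θ) = c·Δt/d, for two microphones. The function name, the microphone spacing and the delay are hypothetical values chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C


def bearing_from_delay(delay_s, spacing_m, speed=SPEED_OF_SOUND):
    """Far-field bearing (degrees from broadside) of a source, given the
    arrival-time difference between two microphones spacing_m apart."""
    ratio = speed * delay_s / spacing_m
    ratio = max(-1.0, min(1.0, ratio))   # guard against noise pushing |ratio| > 1
    return math.degrees(math.asin(ratio))


# Hypothetical numbers: 10 mm microphone spacing, wavefront arriving
# 14.6 microseconds later at the second microphone.
angle = bearing_from_delay(14.6e-6, 0.010)
```

With these numbers the ratio is close to 0.5, so the estimated bearing is close to 30 degrees from broadside.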
Bearing Estimation Algorithm
Bearing/Range Mapping and Novelty Detection
MEMS Front-End
[Figure: change-detection circuit. Blocks: burst oscillator, shift register, triggers (i, i+1, i+2), sample and hold, sample memory, difference detector, switches & encoders; signals: input signal, detection flag, analog bearing output, digital range output]
[Figure: Bearing Change Detection. Input voltage time series and change detection flag vs. time (seconds), sampling periods #1 and #2. Target blip changes in height]
[Figure: Range Change Detection. Input voltage time series and change detection flag vs. time (seconds), sampling periods #1 and #2. Target blip moves slightly later in time]
Bearing Estimation Chip
Bearing Estimation with Spatiotemporal Filters
Biologically Inspired Locomotion Controller
• Implemented a general-purpose CPG chip
• Contains 10 neurons
• Allows 10 fully connected neurons
• Allows 10 inputs from off-chip
• Allows spike and graded neuron inputs
• Allows digitally programmable synapses
• Operates on microwatts of power
• Used to control legged locomotion
VLSI Implementation of Central Pattern Generators (CPG) for Legged Locomotion
10 Neuron CPG Chip
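To make the idea of integrate-and-fire neurons with programmable synapses concrete, here is a minimal discrete-time sketch. The update rule, the integer weights and the two-neuron example are assumptions for illustration; the slides do not give the chip's neuron equations.

```python
def simulate_neurons(weights, external, threshold, steps):
    """Discrete-time integrate-and-fire network.

    weights[i][j] is the (programmable) synaptic weight from neuron j to
    neuron i; external[i] is a constant drive. Returns one spike list per step."""
    n = len(external)
    potential = [0] * n
    prev_spikes = [0] * n
    history = []
    for _ in range(steps):
        spikes = []
        for i in range(n):
            # Input from the spikes emitted on the previous step.
            synaptic = sum(weights[i][j] * prev_spikes[j] for j in range(n))
            potential[i] = max(0, potential[i] + external[i] + synaptic)
            if potential[i] >= threshold:
                spikes.append(1)
                potential[i] = 0          # reset after firing
            else:
                spikes.append(0)
        history.append(spikes)
        prev_spikes = spikes
    return history


# Hypothetical two-neuron circuit: neuron 0 is driven externally and
# excites neuron 1, which has no drive of its own.
history = simulate_neurons(weights=[[0, 0], [6, 0]],
                           external=[3, 0], threshold=6, steps=6)
```

After the first step the two neurons fire on alternate steps, a toy version of the alternating rhythm a CPG supplies to a pair of limbs.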
Silicon Integrate-and-Fire Neuron
Adaptive Locomotion Controller
[Figure labels: descending signals, non-linear sensory feedback, motor output]
New Biped: Snappy
[Figure labels: synapses, neurons]
Outline
• Photo-transduction:
- Active Pixel Sensors
- Dynamic Range Enhancement
- Current Mode
• Spatial Processing:
- Image Filtering
• Spatiotemporal Processing:
- Change Detection
- Motion Detection
• Spectral Processing:
- Color-Based Object Recognition
Photo-transduction
Conventional CMOS Cameras: Integrative Photo-detection
Integrative imagers: voltage domain; dense arrays (1.25-T); low noise; low dynamic range (~45 – 60 dB); not ideal for computation
Simple 3-T APS: Fossum, 1992
Conventional CMOS Cameras: Integrative Photo-detection
Camera phones are driving the CMOS camera market
- 150 million sold in 2004, 55% annual growth rate to 700 million by 2008
- Power consumption is relatively low (~10s of mW for VGA)
- 2 megapixels is probably the limit of usefulness
- Download bandwidth is a problem (service providers would like more people to download their pictures)
- There is a fear that it will represent the next technology bubble: so much hype, legal problems ...
- Small (~100 x 100 pixels) imagers with smarts (e.g. motion, color processing) have a market in toys, sensor networks, computer mice ...
Spike-Based CMOS Cameras: Octopus
[Figure: Imaging concept (pixel circuit with signals Ic, event, reset, Vdd_r) and sample image]
Other approaches:
- W. Yang, “Oscillator in a Pixel,” 1994
- J. Harris, “Time to first Spike,” 2002
Culurciello, Etienne-Cummings & Boahen, 2003
Front-End of Vision Chips: Photoreception Adaptation
Adaptive Phototransduction (Delbruck, 1994)
[Figure, panels (a) to (c): adaptive photoreceptor circuit (adaptive element, C1, C2, Vbias, Vcas, Vo), discrete-time circuit (gm, Cdiff, Clk, Vo), and adaptive elements (n-well, p-substrate, p+, n+) marked "Bad Choice" and "Good Choice"]
• Time adaptive (band-pass)
• Voltage domain
• Large dynamic range (9 orders)
• Can be large pixels (caps)
• Can have mismatch?
After Normann & Werblin, 1974
Front-End of Vision Chips: Photoreception
[Figure, panels (a) to (c): photoreceptor circuits with voltage output Vo, a 1 : B ratio label, and current output Io]
Current Domain Imaging (Mead et al., 1988)
• Wide dynamic range (9 orders)
• Simple to implement (2 transistors)
• Ideal for computation (KCL)
• Poor matching (10 – 15%)
• Slow turn-off
• Transfer function is non-linear
Photosensitive elements:
Phototransistors: ~100 pA/µm²
Photodiodes: ~1 pA/µm²
How Can We Improve Current Mode Imagers?
- Linear current-mode APS
  Photodiode discharges linearly with light intensity
  Amplified linear current output from the APS
- Incorporate noise correction techniques at the focal plane
  Current-mode Correlated Double Sampling (CDS)
  Improves the image noise characteristics
  Easy integration with processing units: convolution, ADC, others
Complete Imaging System
$$I_{reset} = \mu_p C_{OX} \frac{W}{L} \left[ (V_{reset} - V_t)\, V_{ref} - \frac{V_{ref}^2}{2} \right]$$
$$I_{photo} = \mu_p C_{OX} \frac{W}{L} \left[ (V_{photo} - V_t)\, V_{ref} - \frac{V_{ref}^2}{2} \right]$$
$$I_{out} = I_{photo} - I_{reset} = \mu_p C_{OX} \frac{W}{L}\, V_{ref}\, (V_{photo} - V_{reset})$$
Pixel Vt variations are eliminated from the final current output!
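The cancellation of Vt can be checked numerically. The sketch below assumes the linear-region current model I = k[(V − Vt)·Vref − Vref²/2], with k standing for μp·C_OX·W/L, and takes the output as the photo sample minus the reset sample. The sign convention and all component values are hypothetical; only the fact that Vt drops out of the difference is the point.

```python
def linear_region_current(v_gate, v_t, v_ref, k):
    """Linear-region current: k * [(v_gate - v_t) * v_ref - v_ref**2 / 2],
    with k standing for mu_p * C_ox * W / L."""
    return k * ((v_gate - v_t) * v_ref - v_ref ** 2 / 2)


def cds_output(v_photo, v_reset, v_t, v_ref, k):
    """Current-mode correlated double sampling: photo sample minus reset sample."""
    return (linear_region_current(v_photo, v_t, v_ref, k)
            - linear_region_current(v_reset, v_t, v_ref, k))


# Hypothetical values (volts, A/V^2); only v_t differs between the two pixels.
K, V_REF, V_RESET, V_PHOTO = 1e-4, 0.2, 2.5, 2.0
pixel_a = cds_output(V_PHOTO, V_RESET, v_t=0.70, v_ref=V_REF, k=K)
pixel_b = cds_output(V_PHOTO, V_RESET, v_t=0.85, v_ref=V_REF, k=K)
expected = K * V_REF * (V_PHOTO - V_RESET)
```

Both pixels produce the same output current despite a 150 mV threshold mismatch, because the Vt term appears identically in the two samples and subtracts away.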
Measured FPN figure
- Image quality has been improved
- Non-linearity due to mobility degradation degrades performance under bright light
Spatial Processing: Image Filtering
Architectural Concept: Visual Receptive Fields
[Figure: high-resolution imaging array, programmable scanning registers, spatiotemporal receptive fields, parallel processed images]
Etienne-Cummings, 2001
Results – Spatial Image Processing
• 1. Vertical Edge Detection (3x3)
• 2. Horizontal Edge Detection (3x3)
• 3. Laplacian Filter (3x3)
• 4. Intensity Image
• 6. Vertical Edge Detection (5x5)
• 7. Horizontal Edge Detection (5x5)
• 8. Laplacian Filter (5x5)
• 9. Gaussian Filter (5x5)
Enhanced Imaging
1. Intensity Image
2. Horizontal Edges
3. Enhanced Image = Intensity + Horizontal Edge Image
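The enhanced-imaging recipe, intensity plus horizontal-edge image, can be sketched in a few lines. The 3x3 kernel coefficients and the test image below are hypothetical; the chip's actual coefficients are programmable and are not listed on the slide.

```python
HORIZONTAL_EDGE_3X3 = [[-1, -1, -1],
                       [0, 0, 0],
                       [1, 1, 1]]   # hypothetical coefficients


def filter3x3(image, kernel):
    """Apply a 3x3 kernel to the interior of the image (borders are left at 0)."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(kernel[i][j] * image[r - 1 + i][c - 1 + j]
                            for i in range(3) for j in range(3))
    return out


def enhance(image, kernel=HORIZONTAL_EDGE_3X3):
    """Enhanced image = intensity image + horizontal-edge image."""
    edges = filter3x3(image, kernel)
    return [[p + e for p, e in zip(img_row, edge_row)]
            for img_row, edge_row in zip(image, edges)]


# Hypothetical 4x4 image: dark top half, bright bottom half (a horizontal edge).
image = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [5, 5, 5, 5],
         [5, 5, 5, 5]]
enhanced = enhance(image)
```

Pixels next to the edge are boosted while the untouched border keeps its original intensity.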
[Figure: filtered images for 3 x 3 kernels and 5 x 5 kernels]
Summary
Parameter | GIP version 1 | GIP version 2
Technology | 1.2 µm Nwell CMOS | 1.5 µm Nwell CMOS
No. Transistors | 6K | 13K
Array Size | 16 x 16 | 42 x 35
Pixel Size | 30 µm x 30 µm | 20 µm x 20 µm
FPN (STD/Mean) | 2.5% (Average) | 2.1% (Average)
Fill Factor | 20% | 35%
Dynamic Range | 1 – 6000 Lux | 1 – 6000 Lux
Frame Rate | DC – 400 kHz | DC – 400 kHz
Kernel Sizes | 2x2 to whole array | 2x2 to whole array
Kernel Coefficients | +/- 3.75 by 0.25 | +/- 3.75 by 0.25
Coeff. Precision | Intra-processor: <0.5% | Inter-processor: <2.5%
Temporal Delay | 1% decay in 150 ms @ 800 Lux | NA
Power | 5 x 5: 1 mW @ 20 kfps | 5 x 5: ~1 mW @ 20 kfps
Computation Rate (Add and Multiply) | 5 x 5: 1 GOPS/mW @ 20 kfps | 5 x 5: 1 GOPS/mW @ 20 kfps
Spatiotemporal Processing: Change & Motion Detection
Motivation: Free Space Laser Communication
Motivation
High speed, high resolution, high accuracy, pitch matched, Temporal Difference Imager (TDI)
Flexible control of exposure, inter-frame delay and read-out synchronization
Low fixed pattern noise on current and previous image
Pipelined readout mechanism for improved read-out rate and temporal difference accuracy
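In software terms, a temporal difference imager outputs the pixel-wise difference between the current and the previous frame. The sketch below shows that operation together with a simple change flag; the threshold and the test frames are hypothetical, and the chip performs the subtraction during readout, not on stored digital frames.

```python
def temporal_difference(previous, current, threshold=0):
    """Return (difference image, change flags) for two frames of equal size.
    A pixel is flagged when |current - previous| exceeds the threshold."""
    diff = [[c - p for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]
    flags = [[abs(d) > threshold for d in row] for row in diff]
    return diff, flags


# Hypothetical 2x3 frames: one pixel brightens strongly, one dims slightly.
frame_prev = [[10, 10, 10],
              [10, 10, 10]]
frame_cur = [[10, 40, 10],
             [8, 10, 10]]
diff, flags = temporal_difference(frame_prev, frame_cur, threshold=5)
```

Low fixed-pattern noise on both frames matters here because any per-pixel offset that differs between the two samples shows up directly in the difference.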
Photo Pixel Designs
Parameter | TDI version 1 | TDI version 2
Pixel Size | 25 µm x 25 µm | 25 µm x 25 µm
Fill Factor | 30% | 50%
FPN | 0.5% of saturation | 0.15% of saturation
Technology | 0.5 µm (SCMOS) | 0.35 µm (Native)
Results and Measurements
New Change Detection Chip
On-Set and Off-Set Imaging
[Figure: narrow rejection band vs. wide rejection band]
Video Compression
Video Reconstruction
Spectral Processing: Color Object Recognition
RGB to HSI: Why?
$$r = I_{bias}\frac{R}{R+G+B};\quad g = I_{bias}\frac{G}{R+G+B};\quad b = I_{bias}\frac{B}{R+G+B}$$
$$Sat(R,G,B) = I_{bias}\,[1 - \min(r,g,b)]$$
$$Hue(R,G,B) = \arctan(X/Y) = \arctan\left[\frac{0.866\,(G-B)}{2R-G-B}\right]$$
Etienne-Cummings et al., 2002
Examples: Chroma-Based Object Identification
Fruit Identification
Skin Identification
“Learned” templates
Chip Block Diagram
-Block addressable color imager
-White correction and R,G,B scaling
-R,G,B normalization
-R,G,B to HSI conversion
-HSI histogramming for an image block
-Stored “learned” HSI templates
-SAD template matching
Hue Computation
$$Hue(R,G,B) = \arctan(X/Y) = \arctan\left[\frac{0.866\,(G-B)}{2R-G-B}\right]$$
RGB-to-HSI Transformation
[Figure: theoretical hue value [degrees] (0 to 360) vs. chip-computed hue bins [10 degrees resolution] (0 to 35)]
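A small sketch of hue computation and binning, assuming the hue expression takes the form arctan(X/Y) with X = 0.866(G − B) and Y = 2R − G − B, and that the angle is quantized into 36 bins of 10 degrees. The four-quadrant `atan2` is an assumption made so that the result covers the full 0 to 360 degree range; the chip itself uses an analog-to-digital look-up, which this code does not model.

```python
import math


def hue_degrees(r, g, b):
    """Hue angle in degrees from arctan(X / Y), with X = 0.866 * (G - B)
    and Y = 2R - G - B, using atan2 to resolve the quadrant."""
    x = 0.866 * (g - b)
    y = 2 * r - g - b
    return math.degrees(math.atan2(x, y)) % 360.0


def hue_bin(r, g, b, bin_width=10):
    """Index of the hue bin (0..35 for 10-degree bins)."""
    n_bins = 360 // bin_width
    return int(hue_degrees(r, g, b) // bin_width) % n_bins


# Hypothetical inputs: saturated primary colors.
red_bin = hue_bin(1.0, 0.0, 0.0)
green_hue = hue_degrees(0.0, 1.0, 0.0)
blue_hue = hue_degrees(0.0, 0.0, 1.0)
```

Because hue depends only on the ratio X/Y, scaling R, G and B by a common factor (brighter or dimmer illumination) leaves the bin unchanged.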
Hue Based Segmentation
HSI Histogramming
- Filters saturation and intensity values
- Non-linear RGB-to-hue transformation using analog-to-digital look-up
- Hue histogram constructed by counting the number of pixels in a block that map to each hue bin
- 36 x 12b template per block
- Programmable bin assignment in next version
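The histogramming step can be sketched as follows. It assumes each pixel is already described by a hue angle plus saturation and intensity values, that pixels failing either threshold are skipped, and that each of the 36 bin counters saturates at the largest 12-bit value. The threshold semantics, the saturating counter and the sample pixels are assumptions for illustration.

```python
def hue_histogram(pixels, sat_threshold, int_threshold,
                  n_bins=36, bin_width=10, max_count=4095):
    """Count pixels per hue bin for one image block.

    pixels is a list of (hue_degrees, saturation, intensity) tuples.
    Pixels at or below either threshold are ignored; counts saturate at
    max_count (4095 is the largest 12-bit value)."""
    histogram = [0] * n_bins
    for hue, sat, inten in pixels:
        if sat <= sat_threshold or inten <= int_threshold:
            continue                      # too gray or too dark to trust hue
        index = int(hue // bin_width) % n_bins
        histogram[index] = min(max_count, histogram[index] + 1)
    return histogram


# Hypothetical block of five pixels; the fourth is nearly gray and is skipped.
block = [(5, 0.8, 0.9), (8, 0.7, 0.5), (125, 0.9, 0.6),
         (200, 0.05, 0.9), (359.9, 0.6, 0.4)]
hist = hue_histogram(block, sat_threshold=0.2, int_threshold=0.3)
```

The resulting 36-entry list is what would be compared against the stored templates.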
Template Matching Results
[Figure: SAD value (0 to 450) vs. image segment block index (1 to 105), with matching threshold]
Template Matching
$$SAD_{i,j} = \sum_k \left| I_{i,j,k} - T_k \right|$$
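Template matching by sum of absolute differences (SAD) compares the hue histogram of an image block, bin by bin, with each stored template and accepts the best match only if its score is at or below a threshold. The sketch below uses 4-bin histograms to stay short (the chip uses 36 bins); the template names, values and threshold are hypothetical.

```python
def sad(histogram, template):
    """Sum of absolute differences between two equal-length histograms."""
    return sum(abs(h - t) for h, t in zip(histogram, template))


def best_match(histogram, templates, threshold):
    """Return (name, score) of the stored template with the lowest SAD,
    or (None, score) if even the best score is above the matching threshold."""
    name, score = min(((n, sad(histogram, t)) for n, t in templates.items()),
                      key=lambda item: item[1])
    return (name if score <= threshold else None), score


# Hypothetical learned templates and two image blocks.
templates = {"apple": [9, 1, 0, 0], "lime": [0, 1, 8, 1]}
match_a = best_match([8, 2, 0, 0], templates, threshold=5)
match_b = best_match([3, 3, 3, 3], templates, threshold=5)
```

The first block is close to the "apple" template and is accepted; the second is far from both templates and is rejected even though "lime" has the lower score.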
Color-Based Object Recognition
Summary
Technology: 0.5µm 3M CMOS
Array Size (R,G,B): 128 (H) x 64 (V)
Chip Area: 4.25mm x 4.25mm
Pixel Size: 24.85µm x 24.85µm
Fill Factor: 20%
FPN: ~5%
Dynamic Range: >120 dB (current mode)
Region-Of-Interest Size: 1 x 1 to 128 x 64
Color Current Scaling: 4 bits
Hue Bins: 36, each 10 degrees wide
Saturation: Analog (~5 bits), one threshold
Intensity: Analog (~5 bits), one threshold
Histogram Bin Counts: 12 bits/bin
Template Size: 432 bits (36 bins x 12 bits)
No. Stored Templates: 32 (13.8 Kbits SRAM)
Template Matching (SAD): 4 parallel SAD, 18-bit results
Frame Rate: Array scan: ~2K fps; HSI computation: ~30 fps
Power Consumption: ~1mW @ 30 fps on 3.3V supplies
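The memory and word-length figures in the summary are mutually consistent, as the short check below shows. It assumes the worst-case SAD occurs when every one of the 36 bins differs by the full 12-bit range; the constant names are hypothetical.

```python
HUE_BINS = 36
BITS_PER_BIN = 12
STORED_TEMPLATES = 32

template_bits = HUE_BINS * BITS_PER_BIN               # bits in one template
sram_bits = STORED_TEMPLATES * template_bits          # bits for all templates
max_sad = HUE_BINS * (2 ** BITS_PER_BIN - 1)          # worst-case SAD value
sad_bits = max_sad.bit_length()                       # bits needed to hold it
```

This gives 432 bits per template, 13,824 bits (about 13.8 Kbits) of SRAM for 32 templates, and a worst-case SAD of 147,420, which needs 18 bits.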
Some Conclusions
• Block-serial, pixel-parallel focal-plane Computation-on-Readout (COR) is another style of neuromorphic image processing
- Computation for “free”, high-fidelity images, compact, low-power, high-speed, reconfigurable, multiple parallel kernels, can be iterated
• Although COR can be used for both voltage- and current-mode imagers, current-mode image processing is better suited to focal-plane implementation
- Linearize the photo-current, perform CDS to remove FPN
• Many different algorithms can be implemented with COR that are compatible with standard machine vision