CNN Technology for Brain-like Spatial-temporal Sensory Computing: Present and Future
ISCAS-2004 Plenary Lecture, Vancouver, May 2004
Tamás ROSKA
Pázmány University and the Hungarian Academy of Sciences, Budapest, Hungary
Acknowledgements
- Berkeley-Budapest-Seville (12 years) and Notre Dame-Harvard research groups
- Office of Naval Research (ONR)
- Future and Emerging Technologies Division of the EU R&D Directorate
- Human Frontiers of Science Program
- Hungarian National Research Fund
- Spanish National Research Council
- Pázmány University, Budapest
- Hungarian Academy of Sciences
Table of contents
1. Scenario: new vistas of complexity in circuits, systems and computers
2. A new framework for sensory-computing-activating circuits and systems: Cellular Wave Computers and Wave-Logic
3. Various physical implementations: towards (topographic) visual microprocessors
4. Bio-inspiration, sensor fusion and proactive systems: multichannel retina models and cross-modalities
5. Wave-algorithms: a new kind of software with embedded sensors
6. Applications: "CNN technology" and semantic embedding
7. CNN principles in nanotechnology
1. Scenario: new vistas of complexity
- New directions in micro- and nano-technologies: a must for cellular
- The sensory revolution and its impact on data: a move to image flows and multimodal sensor fusion
- Complexity: a billion devices and interconnect problems; bio-inspired architectures
- Mind-inspired and brain-inspired computing
- Sensory understanding and inferencing with embedded spatial-temporal semantics: events are patterns
2. Cellular Wave Computers and Wave-Logic
Data, Instructions, Subroutines, Events, and Algorithms are different!
Non-Boolean logic

A first departure from the all-digital-logic computing paradigm
- Turing-von Neumann framework: data are bit streams, time is discrete, elementary operations act on bits, and the machine is STORED PROGRAMMABLE
- Blum-Shub-Smale departure: data are reals; the role of accuracy and problem parameters in computational complexity; the Newton machine; the role of algebra and nonlinearity

A drastic departure
- Data are multivariable flows, continuous in time and signal value on a finite time interval
- Events are patterns in space, in space-time, or in multivariable synchronization
- Elementary instructions: the solution of a nonlinear wave equation plus local and global binary logic
- Architecture and algorithms => ?? Universality ??

Brain-like:
- analog signal array
- several 2D strata of analog processors
- mainly local and sparse global interconnections with variable delays
- spatial-temporal active waves
Mind-like:
- logical sequences
- algorithmic
Hint: Left-brain, Right-brain
Universal Machine on Flows
A generic view of CNN dynamics and the CNN Universal Machine on flows

How to form a generic spatial-temporal machine?
- Take the simplest dynamical system, a cell
- Take the simplest spatial grid for placing the cells (2D sheets)
- Introduce the simplest spatial interactions between dynamic cells: CNN, the Cellular Nonlinear Network (archetype: Turing morphogenesis)
- Place the CNN dynamics into the simplest stored programmable computing machine: the CNN Universal Machine on Flows
Cell Dynamics
\[ \frac{dx_{ij}}{dt} = -a\,x_{ij}, \qquad y_{ij} = f(x_{ij}) \]
[Figure: output characteristic f, with axis ticks at -1 and 1]
Introduction to CNN Dynamics
The Cellular Neural/Nonlinear Network (CNN) is an analog processor array on a rectangular grid (indices i, j) with space-invariant local interactions.
Local interconnection pattern: cloning TEMPLATE
u_ij - input
x_ij - state
y_ij - output
z - bias
Template - the program of the network: [A B z]
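\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 8 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \qquad z = -0.5 \]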
Interaction pattern defined by templates
\[ i_a = \sum_{kl \in N_r(ij)} A(ij,kl)\, y_{kl} \qquad (A * y_{ij}) \]
\[ i_b = \sum_{kl \in N_r(ij)} B(ij,kl)\, u_{kl} \qquad (B * u_{ij}) \]
\[ i_t = i_a + i_b + z_{ij} \]
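CNN Dynamics
\[ \frac{dx_{ij}}{dt} = -a\,x_{ij} + i_t = -a\,x_{ij} + \sum_{kl \in N_r(ij)} A(ij,kl)\, y_{kl} + \sum_{kl \in N_r(ij)} B(ij,kl)\, u_{kl} + z_{ij} \]
\[ y_{ij} = f(x_{ij}) \]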
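The state equation above can be explored numerically. The following is a minimal sketch, not the author's implementation: it assumes 3x3 templates, the standard unity-gain output saturating at -1 and +1, fixed-value boundary cells, and forward-Euler integration. The function names, the step size and the uncoupled example template (center feedback 2, no input coupling) are hypothetical choices made for illustration.

```python
def f(x):
    """Piecewise-linear output: unity gain, saturating at -1 and +1 (assumed)."""
    return max(-1.0, min(1.0, x))


def template_sum(T, img, i, j, boundary=0.0):
    """Weighted sum of the 3x3 neighbourhood of cell (i, j) under template T."""
    rows, cols = len(img), len(img[0])
    total = 0.0
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            k, l = i + di, j + dj
            inside = 0 <= k < rows and 0 <= l < cols
            value = img[k][l] if inside else boundary  # fixed-value boundary cells
            total += T[di + 1][dj + 1] * value
    return total


def simulate(x0, u, A, B, z, a=1.0, dt=0.05, steps=400):
    """Forward-Euler integration of dx/dt = -a*x + A*y + B*u + z; returns y."""
    x = [row[:] for row in x0]
    for _ in range(steps):
        y = [[f(v) for v in row] for row in x]
        x = [[x[i][j] + dt * (-a * x[i][j]
                              + template_sum(A, y, i, j)
                              + template_sum(B, u, i, j)
                              + z)
              for j in range(len(x[0]))]
             for i in range(len(x))]
    return [[f(v) for v in row] for row in x]


# Hypothetical uncoupled template: each cell latches the sign of its initial state.
A = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
x0 = [[0.3, -0.2], [-0.7, 0.1]]
u = [[0.0, 0.0], [0.0, 0.0]]
print(simulate(x0, u, A, B, z=0.0))  # prints [[1.0, -1.0], [-1.0, 1.0]]
```

With this template every cell is bistable, so the array acts as a pixel-wise sign detector. Swapping in other [A B z] triples changes the "program" without touching the integration loop.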
Operation of a CNN array (computation)
[Figure: CNN array operation, with labels input, state/output, A, B, z]
PDE formulation of reaction-diffusion processes
Reaction-diffusion type nonlinear PDE:
\[ \frac{\partial \Phi(x,y,t)}{\partial t} = \operatorname{div}\big( c(\Phi(x,y,t))\, \operatorname{grad}(\Phi(x,y,t)) \big) + F(\Phi(x,y,t)) \]
If the first (diffusion) term is the Laplacian and there are only two variables in \Phi, we get the simple reaction-diffusion equation of Turing morphogenesis.
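Deriving coupled ODEs from PDEs
Reaction-diffusion type nonlinear ODE:
[Equation not recoverable from the source: the coupled ODEs for the cell states, including a four-neighbor coupling term]
Templates (symmetric-isotropic class):
\[ A = \begin{bmatrix} 0 & c_1 & 0 \\ c_1 & c_0 & c_1 \\ 0 & c_1 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} b_2 & b_1 & b_2 \\ b_1 & b_0 & b_1 \\ b_2 & b_1 & b_2 \end{bmatrix}, \qquad z \]
The values of c_0 and c_1 determine whether the array produces diffusion or trigger waves.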
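As a worked check of how a feedback template can encode diffusion, the following sketch assumes zero input and bias, a template A with center entry c_0 and four nearest-neighbor entries c_1, and cells operating in the linear region of the output function, where y = x. These restrictions are assumptions made for illustration and are not taken from the slides.

```latex
\begin{align*}
\frac{dx_{ij}}{dt}
  &= -a\,x_{ij} + c_0\,x_{ij}
     + c_1\left(x_{i-1,j} + x_{i+1,j} + x_{i,j-1} + x_{i,j+1}\right) \\
  &= c_1\left(x_{i-1,j} + x_{i+1,j} + x_{i,j-1} + x_{i,j+1} - 4\,x_{ij}\right)
     + \left(c_0 + 4c_1 - a\right)x_{ij}
\end{align*}
```

The first term is the five-point discrete Laplacian on a unit grid scaled by c_1, and the second is a local linear reaction term. Under these assumptions, choosing c_0 = a - 4c_1 removes the reaction term and leaves pure linear diffusion.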
Computing with diffusion and waves derived from reaction-diffusion systems
Examples:
- Linear diffusion
- Trigger-waves
- Pattern formation

- A generalization of Turing's morphogenesis and von Neumann's vision of an analytic theory of computing
- Generating a plethora of different waves and patterns
- A framework for modeling chemical, physical, biological and abstract complex systems
- Introducing algorithms on waves
Make a computing machine with CNN elementary instructions
Extend each cell with
- a few local memory units (LAM, LLM)
- a Local Communication and Control Unit
- and possibly local logic and arithmetic units
Add a Global Analogic Programming Unit (GAPU) with
- analog and logic registers and
- a machine code storage (GACU)
CNN Universal Machine (CNN-UM)
[Figure: CNN-UM architecture: a cell array controlled by the GAPU; each cell contains the CNN nucleus with LCCU, LAM, LLM, LAOU and LLU]
GAPU: Global Analogic Programming Unit
LAM: Local Analog Memory
LLM: Local Logic Memory
LCCU: Local Communication and Control Unit
LAOU: Local Analog Output Unit
LLU: Local Logic Unit
APR: Analog Instruction Register
LPR: Logic Program Register
SCR: Switch Configuration Register
GACU: Global Analogic Control Unit
Analogic (analog + logic) algorithm: [A1 B1 z1], [A2 B2 z2], ...
Cellular Active Wave Computer on Image Flows
Universal Machine on Flows (UMF)
Data: Image Flow
\[ \Phi(t):\ \varphi_{ij}(t), \qquad t \in T = [0, t_d], \qquad 1 \le i \le m, \quad 1 \le j \le n \]
At t = t*, \Phi(t*) is an m x n Picture P.
If P is binary, it is a Mask M.
If t = t_0, t_0 + \Delta t, t_0 + 2\Delta t, ..., t_0 + k\Delta t, then \Phi is an image sequence or video stream.
Operators on Image Flows
The protagonist elementary instruction, also called the wave instruction, is defined as
\[ \Phi_{output}(t) := \Psi(\Phi_{input}(t), P_0, \partial), \qquad t \in T = [0, t_d] \]
where
\Psi: an array function on image flows or image sequences
P_0: a picture defining the initial state \Phi(0) and/or the bias map
\partial: boundary conditions (a frame); \partial(t) is a boundary input, which might also be connected to all cells in a row
A scalar functional on an image flow:
\[ q := \gamma(\Phi_{input}(t), P_0, \partial) \]
Algorithms on image flows: α-recursive function
- initial settings of image flows, pictures, masks, and boundary values: \Phi(0), P_0, M, \partial
- equilibrium and non-equilibrium solutions of partial differential difference equations (PDDE) via canonical CNN equations on \Phi(t)
- global (and local) minimization on the above
- memoryless (arithmetic) and logic combinations on the above results
- comparisons (thresholding) and logic conditions in branchings, via scalar global functionals
- recursions on the above operations
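The ingredients listed above (a wave instruction, a scalar global functional, thresholded branching, recursion) can be combined in a toy program. This is a rough sketch under stated assumptions, not the UMF formalism itself: explicit four-neighbor diffusion with zero-flux boundaries stands in for the wave instruction, the max-minus-min contrast stands in for the global functional, and all names (`diffuse`, `contrast`, `smooth_until`) are hypothetical.

```python
def diffuse(picture, steps, rate=0.2):
    """Toy wave instruction: explicit 4-neighbour diffusion, zero-flux boundary."""
    rows, cols = len(picture), len(picture[0])
    cur = [row[:] for row in picture]
    for _ in range(steps):
        nxt = [row[:] for row in cur]
        for i in range(rows):
            for j in range(cols):
                total = 0.0
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    k, l = i + di, j + dj
                    inside = 0 <= k < rows and 0 <= l < cols
                    # Outside the frame, mirror the cell itself (no flux).
                    total += cur[k][l] if inside else cur[i][j]
                nxt[i][j] = cur[i][j] + rate * (total - 4 * cur[i][j])
        cur = nxt
    return cur


def contrast(picture):
    """Toy scalar global functional: spread between brightest and darkest cell."""
    flat = [v for row in picture for v in row]
    return max(flat) - min(flat)


def smooth_until(picture, threshold, max_rounds=1000):
    """Repeat the wave instruction while the global functional exceeds a threshold."""
    rounds = 0
    while contrast(picture) > threshold and rounds < max_rounds:
        picture = diffuse(picture, steps=5)  # one wave instruction
        rounds += 1                          # recursion on the result
    return picture, rounds


start = [[0.0] * 4 for _ in range(4)]
start[1][2] = 1.0  # a single bright pixel as initial picture P0
result, rounds = smooth_until(start, threshold=0.05)
print(rounds, round(contrast(result), 4))
```

The loop condition plays the role of the "logic condition in branching": the program decides, from one scalar computed over the whole array, whether to issue another wave instruction.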
Properties of the Universal Machine on Flows
Universality in Turing sense and as a spatial-temporal nonlinear operator
Active waves in the region of edge of chaos within the locally active regime
Stored programmability is the key for practical applications
Combining analog spatial-temporal waves with local and global logic– analog-and-logic
Native operators for Programmable Sensor-Computing and in Nanotechnology
3. Physical implementation: towards visual microprocessors
- Mixed-signal CMOS
- Mixed-signal BiCMOS
- Emulated digital CMOS
- Optical
- FPGA
- Integration of topographic sensors
- SoC
- Software and development systems
- Self-contained units, e.g. Bi-i
Comparison between an IBM Cellular Supercomputer and an analogic processor

IBM Cellular Supercomputer 2002
- 65536 (32*32*64) Power PC processors
- A = 65536 x 1.06 cm2 = 6.9468 m2
- P = 491 kW
- Computing power ~ 12 * 10^12 FLOPS (12 TeraFLOPS)

An analog-and-logic CNN supercomputer
- 128 x 128 processors with optical input
- A = 1.4 cm2
- P = 4.5 W
- Computing power ~ 12 * 10^12 OPS (12 TeraOPS) equivalent
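To make the comparison concrete, the quoted figures can be turned into efficiency ratios. The short sketch below only does arithmetic on the numbers given on the slide; treating the FLOPS figure and the "OPS equivalent" figure as directly comparable is the slide's own premise, and the dictionary and function names are hypothetical.

```python
# Figures as quoted on the slide.
ibm = {"ops_per_s": 12e12, "power_w": 491e3, "area_cm2": 65536 * 1.06}
cnn = {"ops_per_s": 12e12, "power_w": 4.5, "area_cm2": 1.4}


def ops_per_watt(machine):
    return machine["ops_per_s"] / machine["power_w"]


def ops_per_cm2(machine):
    return machine["ops_per_s"] / machine["area_cm2"]


power_ratio = ops_per_watt(cnn) / ops_per_watt(ibm)  # same throughput, less power
area_ratio = ops_per_cm2(cnn) / ops_per_cm2(ibm)     # same throughput, less silicon

print(f"operations per watt: about {power_ratio:,.0f} times higher")
print(f"operations per cm2:  about {area_ratio:,.0f} times higher")
```

On these numbers the analogic chip comes out roughly five orders of magnitude ahead per watt and between four and five per unit area, for the class of operations it executes natively.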
CNN technology roadmap (complexity/resolution over time)
- ACE400 (1995-96): 20x22, binary I/O, optical input, 50 000 fr/sec
- ACE4k (1998-99): 64x64, gray I/O, parallel optical input, 1000 fr/sec
- ACE16k (2003): 128x128, gray I/O, optical input, 50 000 fr/sec
- XENON* (2004): 128x96, gray I/O, optical input, 10 000 fr/sec
Further labels on the roadmap: embedded digital microprocessor; A-D cells
The ACE CNN chip family has been designed at IMSE-CNM and AnaFocus Ltd., Seville, Spain. The XENON chip was designed by ANCLab at the Hungarian Academy of Sciences and AnaLogic Computers Ltd., Budapest, Hungary.
*under fabrication
Bi-i: a standalone visual system
• Standalone
• Compact
• Embedded 128x128 ACE16k* chip (1 or 2)
• Above 5000 fps
• Embedded 250 MHz DSP
• Embedded 1.3M CMOS imager with ROI
• Ethernet 100 Mbit/s
• USB
*The ACE16k chip was designed at IMSE-CNM, Seville, Spain
First prize and Product of the Year at Vision 2003, Stuttgart
CNN Universal Machine (CNN-UM) Chip
[Figure: chip layout with labels: base cell, logical circuitry, analog synapses, analog memory, optical input, logic memory, digital I/O bus, analog I/O bus, global control signals, analog and logical operations]
CNN-UM Chip / Multiscreen Theater
Standard image processing functions
Separation of the stationary and moving parts
[Figure panels: original image, moving parts, stationary parts]
High dynamic range (integration time is indicated)
[Figure panels: 24 μs, 200 μs, 1.6 ms, 6.4 ms, 12.8 ms, 25.6 ms, 51.2 ms, 102.4 ms]
High frame rate AND processing
- 1,000 to 50,000 frames per second, with processing during the 20-1000 microsec window
High-speed diffusion
- 5 grids in 100 nsec
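The quoted processing window follows directly from the frame rates: the time available per frame is the reciprocal of the frame rate. A small sketch of that arithmetic (the function name is hypothetical):

```python
def frame_period_us(frames_per_second):
    """Time available per frame, in microseconds."""
    return 1_000_000 / frames_per_second


for fps in (1_000, 50_000):
    print(fps, "frames/s ->", frame_period_us(fps), "microsec per frame")
# 1000 frames/s gives 1000.0 microsec; 50000 frames/s gives 20.0 microsec
```

So the 20-1000 microsec window is simply the per-frame time budget at the two ends of the frame-rate range, within which both sensing and processing have to complete.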
4. Biological Relevance
- Retinotopic visual pathway
- New discovery in 2001 (Nature): multichannel mammalian retina model
- Multilayer CNN dynamics with programmable space and time constants
- Towards a programmable vision prosthesis: the first 5 human retinal implants (at USC)
- Multichannel tactile (haptic) model
- Non-synaptic neural signal transmission
- Immune response inspired algorithms
Retina structure
[Figure: retina cross-section with labels Cones, OPL, IPL, Ganglions]
Ganglion cell types:
- Off Brisk L
- Off Brisk Tr
- On Brisk TrN
- On Brisk Tr
- On Beta
- On Sluggish
- Bistratified
- Local Edge Detector
IPL strata
Parallel space-time features: Botond Roska and Frank S. Werblin, Nature, March 29, 2001
[Figure: On Beta ganglion cell measurement: excitation, inhibition, spiking]
[Figure: space-time (x, t) response plots, each paired with an R-unit diagram labeled b1, b2, A11, A22, a12, a21, a13]
General block: input flow sampling, output sample/hold, spike generation
Complex R-Unit decomposition
[Figure blocks: OPL, GRB, IRE-A, IRE-D, IRI]
Model Simulations
Local Edge Detector and Off Brisk TrL cell
Stimulus and 3 different outputs together:
- Excitation
- Inhibition
- Spiking (transfer to the brain)
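3-Layer Prototype R-unit
[Figure: R-unit diagram with inputs, two mutually coupled layers (b1, b2, A11, A22, a12, a21) and an output layer (a13, a23)]
\[ \tau_1\,\dot{x}_{1,ij} = -x_{1,ij} + \sum_{kl \in N_{ij}} A_{11,ijkl}\, y_{1,kl} + a_{21}\, y_{2,ij} + b_1 u_{ij} + z \]
\[ \tau_2\,\dot{x}_{2,ij} = -x_{2,ij} + \sum_{kl \in N_{ij}} A_{22,ijkl}\, y_{2,kl} + a_{12}\, y_{1,ij} \]
\[ y_{1,ij} = f(x_{1,ij}), \qquad y_{2,ij} = f(x_{2,ij}), \qquad y_{3,ij} = f(a_{13}\, y_{1,ij} + a_{23}\, y_{2,ij}) \]
\[ 1 \le i \le M, \quad 1 \le j \le M, \quad |u_{ij}| \le 1; \qquad \text{Chua-Yang: } |x(0)| \le 1; \qquad \text{Full range: } |x(t)| \le 1 \]
• mutually coupled 1st-order "RC" cells, space constants
• double time-scale property
• separate inputs and initial states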
5. Wave-algorithms: a new kind of software with embedded sensors
- Are programmed complex nonlinear waves implementable on silicon?
- Are spatial-temporal signatures significant in coding some shapes?
- Can we combine these waves algorithmically, embedding them in proactive adaptive systems?

Five bio-inspired algorithmic wave computing principles:
- the twin wave principle
- the push-pull principle
- the multi-channel (e.g. color) opponent principle (center channel, surround channel)
- the programmable first action (proactive) principle
- the detection by emerging dynamics principle
Twin wave principle illustration
- Inhibition wave (large lateral inhibition)
- Excitation wave (small lateral excitation)
- Combination
[Figure: contours; concave curves from the bottom (sad mouth). Captions: "All sad" and "No, he is laughing!"]

Example: Contour Detection of a Spiral-wave
[Figure panels: original image; CNN initial state (a patch in the spiral-wave region); CNN input (noisy, compressed original image)]
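Strategies to Control a Trigger-wave - I.
Case 1: pixel-wise adaptation
\[ A = \begin{bmatrix} 0.25 & 0.25 & 0.25 \\ 0.25 & 3 & 0.25 \\ 0.25 & 0.25 & 0.25 \end{bmatrix}, \qquad B = \begin{bmatrix} b_2 & b_1 & b_2 \\ b_1 & b_0 & b_1 \\ b_2 & b_1 & b_2 \end{bmatrix}, \qquad z = z(t) \]
z and B:
\[ z = 3.75, \qquad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 2.5 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]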
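Strategies to Control a Trigger-wave - II.
Case 2: adaptation through an optimal reconstruction filter + bias control
[Figure: bias z versus time t over the intervals T1, T2, T3, with levels 3.75, -3.75, 2.25 and -2.25; labels: global propagation, local propagation]
when
\[ z = 3.75, \qquad B = \begin{bmatrix} 0.1 & 0.2 & 0.1 \\ 0.2 & 1.3 & 0.2 \\ 0.1 & 0.2 & 0.1 \end{bmatrix} \]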
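One way to read the "global" and "local" propagation labels is to ask how many already-active neighbors a resting cell needs before its state starts to rise. The sketch below is an interpretation under explicit assumptions, not the author's analysis: it uses the feedback template from Case 1 (center entry 3, eight surrounding entries 0.25), sets a = 1, ignores the input term B*u, and takes a cell resting at x = -1 (so y = -1). The function names are hypothetical.

```python
# Feedback template from Case 1 (values as shown on the slide).
A = [[0.25, 0.25, 0.25],
     [0.25, 3.0, 0.25],
     [0.25, 0.25, 0.25]]


def initial_slope(z, active_neighbours, a=1.0):
    """dx/dt of a cell resting at x = -1 when some of its 8 neighbours have y = +1.

    Assumes the input contribution B*u is zero (an illustrative simplification).
    """
    x = y = -1.0
    inactive = 8 - active_neighbours
    neighbour_weight = A[0][0]  # all eight off-centre entries are equal
    feedback = A[1][1] * y + neighbour_weight * (active_neighbours - inactive)
    return -a * x + feedback + z


def min_active_neighbours(z):
    """Smallest number of active neighbours that makes the cell start to rise."""
    for n in range(9):
        if initial_slope(z, n) > 0:
            return n
    return None  # the cell never switches on at this bias


for z in (3.75, 2.25, -2.25, -3.75):
    print(z, min_active_neighbours(z))
```

Under these assumptions a bias of 3.75 lets a single active neighbor ignite a cell, so a front spreads across the whole array, while 2.25 requires four active neighbors, so growth is confined to cells that are already largely surrounded. With the negative bias levels no resting cell switches on.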
Experiment:
Endocardial (inner) contour detection of the left ventricle from a sequence of echocardiographic images
Apical four-chamber view of the human heart
LV - left ventricle
LA - left atrium
RV - right ventricle
RA - right atrium
Motivation:
Feature extraction from echocardiographic images
Of major importance for both quantitative and qualitative analysis of the heart function
Tracking Experiments in Echocardiography
Active contour tracking based on trigger-waves
[Figure panels: (k-1)-th frame, k-th frame, (k-1)-th result, k-th result]
PDE formalism: [equations not recoverable from the source]
CNN-UM chip results (ACE4K): ~30 sec/fr, ~60 sec/fr, ~160 sec/fr, ~250 sec/fr
6. Applications
- Very high frame rate real-time detection: ~10-50 k frames per second
- Proactive, adaptive, topographic sensory computing with locally tuned sensors
- Very high computing power for complex, wave-type algorithms: TeraOPS
- Very high number of targets and pattern matching templates: immune response inspired CNN algorithms
- Embedded semantics: handwriting (geometrical features); multimodal (vision and tactile)
Virtual action closing a hole I.
3D model of the Starflex® septal occluder
Virtual action closing a hole II.
3D model of the Amplatzer® septal occluder
Virtual action closing a hole III.
Virtual ASD closure with the Amplatzer® septal occluder
Interventional closing
Virtual action closing a hole IV.
Virtual ASD closure with the Amplatzer® septal occluder
Interventional closing in 3D
7. Towards Nanotechnology
- Starting from nano-friendly device models
- Evolutionary and revolutionary nanotechnology (>100 nm)
- Integrating sensing and computing
- CNN and Crossnet/CMOL technologies

Integrated sensing-computing Cellular Nano Architecture
Cellular Wave Computer chip with 1000x1000 processors
[Figure labels: multiple sensor array, nano antenna]
Projected capability: 10 PetaOps speed, 100,000 frames/sec
Enabling technology for
• Ultra-high-speed multiple target detection
• Fusion reactor control
• Intelligent surveillance

Processing via Unconventional Nano-friendly Devices and Nano Systems
Start from nano-friendly devices that are easy to implement and interconnect => CNN
- Function-in-layout or non-transistor-based
- Analog signals and logic (e.g. CMOL/BiCWAS)
- I/O via radiation: add sensing arrays and optical transceivers
Lithographically-Defined Nanoantennas
Dipole antenna with MOM diode, which functions at THz frequencies
Bowtie antenna with MOM diode, which operates in the visible
I. Wilke, W. Herrmann, F. K. Kneubuhl, “Integrated Nanostrip Dipole Antennas for Coherent 30 THz Infrared Radiation,” Appl. Phys. B 58(2), pp. 87-95 (1994).
C. Fumeaux, J. Alda, and G. D. Boreman, “Lithographic Antennas at Visible Frequencies,” Optics Lett. 24, 1629-1631 (1999).
Detector Layout for 30 THz
[Figure: layer stack with labels Si, SiO2, antenna, Ni, NiO; dimensions 1.5 μm, 100 nm, 200 nm, 35 Å]
a) small enough? YES
b) sensitive enough? YES
c) fast enough? YES
d) sufficient spectral purity? Limited yes (Bias)
e) dynamically controllable? Maybe
NEP: 20.2 W/cm
References
L. O. Chua and T. Roska, Cellular Neural Networks and Visual Computing, Cambridge University Press, Cambridge, 2002
T. Roska and Á. Rodríguez-Vázquez, "Towards Visual Microprocessors", Proc. IEEE, July 2002
http://lab.analogic.sztaki.hu: Bibliography
Special Issues:
- IEEE Trans. CAS-I, May 2004
- Int. J. Bifurcation and Chaos, February 2004
- J. Circuits, Systems and Computers, Nos. 4 and 6, 2003
- Int. J. Circuit Theory and Applications, Nos. 1 and 2, 2002