Upload
vuongdien
View
227
Download
1
Embed Size (px)
Citation preview
OPEN Lab Prof. Sorger
Volker Sorger
14th NSF-Korea Nanotechnology Forum 2017
aJ/bit Modulators and Photonic Neuromorphic Computing
OPEN Lab Prof. Sorger
Orthogonal Physics Enabled Nanophotonics (OPEN) lab
compared to any photonic (non-plasmonic) mode and cavity structures where any placement of the metallic contact
close to the optical mode will introduce intolerable losses. This is different for our plasmonic mode, which is
inherently lossy, but the polaritonic (matter-like) mode allows to scale-down the device into a few micrometer small
device (a reduction of a factor of 100) compared to traveling-wave Silicon-based modulators. This ‘in-the-device-
basing’, as suppose to biasing the device few to 10’s of micrometer away from the active region. As such, the overall
design allows for a more compact overall footprint. Lastly, reducing the dielectric thickness (tox = 5 nm), improves the
electrostatics enabling a sub-1 Volt modulation performance. Despite this increasing the capacitance, the lower Rc and
small device area (5 mm2) the energy consumption is relatively low 18 fJ/bit. However, this can be further improved by
only biasing the device in the steepest region of the transfer function (e.g. 0-0.3V) for a small-signal modulation, and
narrowing the waveguide width from (currently 1 mm) to a SOI diffraction-limited waveguide width of 250 nm
reducing the E/bit to low 160aJ (Fig. 1d).
In summary, we have experimentally shown a graphene-based silicon-photonics integrated electro-absorption
modulator operating at telecom frequencies. Utilizing a hybrid-photon-plasmon modes high group index, field-
concentration to increase the optical overlap factor with the thin graphene active material, and improving the
electrostatics by reducing the gate oxide allows for a steep switching transfer function to enable sub-1 Volt
modulation, which has positive impacts on energy-efficiency. Further improvements allow for a 160aJ/bit efficient
modulator using this approach platform, thus paving the way for next-generation nanophotonics devices [13]. !
References [1] R. A. Soref, D. L. McDaniel, B. R. Bennett, “Guided-Wave Intensity Modulators Using Amplitude and-Phase Perturbations”
IEEE Journal of Lightwave Technology, vol. 6, no. 3, (1988).
[2] K. Liu, S. Sun, A. Majumdar, V. J. Sorger, “Fundamental Scaling Laws in Nanophotonics” Sci. Rep., 6, 37419 (2016).
[3] S. K. Pickus, S. Khan, C. Ye, Z. Li, and V. J. Sorger, “Silicon Plasmon Modulators: Breaking Photonic Limits” IEEE Phot.
Soc., 27, 6 (2013).
[4] M. Liu, X. Yin, E. Ulin-Avila, B. Geng, T. Zentgraf, L. Ju, F. Wang, and X. Zhang, “A graphene-based broadband optical
modulator,” Nature, vol. 474, no. 7349, pp. 64–67, (2011).
[5] M. Liu, X. Yin and X. Zhang, "Double-Layer Graphene Optical Modulator", Nano Letters, 12, no. 3, pp. 1482-1485, (2012).
[6] C. Ye, S. Khan, Z. R. Li, E. Simsek, V. J. Sorger (2014). “λ-size ITO and graphene-based electro-optic modulators on
SOI”. Selected Topics in Quantum Electronics, IEEE Journal of, 20(4), 40-49 (2014).![7] C. Ye, K. Liu, R. Soref, V. J. Sorger, “3-Waveguide 2x2 Plasmonic Electro-optic Switch” Nanophot., 4, 1, pp. 261-268 (2015).
[8] Z. Ma, M. Tahersima, S. Khan, V. J. Sorger, IEEE J. Sel. Topics in Quantum Electronics, 23(1), 3400208 (2017).
[9] R.Amin, C.Suer, Z.Ma, J. B. Khurgin, R. Agarwal, V. J. Sorger, “Active Material, Optical Mode and Cavity Impact on electro-
optic Modulation Performance”, arXiv:1612.02494, (2017)
[10] V. J. Sorger, R. Ma, C. Huang, Z. Li, M. Liu, and X. Zhang, "Graphene, Plasmonic and Silicon Optical Modulators,"
Conference on Lasers and Electro-Optics - IQEC,OSA, (2013).
[11] Z. Ma, Z. Li, K. Liu, C. Ye, V. J. Sorger, “Indium-Tin-Oxide for High-performance Electro-optic Modulation”, Nanophot., 4, 1
(2015).
[12] Ansell, D. et al. Hybrid graphene plasmonic waveguide modulators. Nature Commun. 6, 8846 (2015). [13] A. Fratalocchi, C. M Dodson, R. Zia, P. Genevet, et al. “Nano-optics gets practical”, Nature Nanotech. 10, 11-15 (2015).
Figure 1. a, Schematic of a hybrid-photon-plasmon
graphene-based electro-absorption modulator. The
modulation mechanism is based on Pauli-blocking upon
gating the Fermi level of graphene. b, Silicon waveguide-
integrated modulator. A cw laser (l = 1.55 mm) is fiber
coupled into the SOI waveguide via grating couplers.
Device length, L = 8 mm, tox = 5 nm. c, Electric field
density across the active MOS region of the modulator
showing an enhanced field strength coinciding with the
active graphene layer. This improves the optical overlap
factor by about 25%. Taking into consideration the grain
boundaries introduced during the metal deposited creates
in-plane field vectors inside the graphene layer. d
Modulator transfer function; normalized modulation depth
at different drive voltages (VD). The modulator
performance yields a high extinction ratio of 0.25dB/mm,
due to the combination of the plasmonic MOS mode
enhancing the electroabsorption in the active region (see
text for details).
!
Modulator
(i) (ii)
(iii) (iv)bias
input
output
Electro-OpticNonlinearity
ElectronicNonlinearity
pump
output
input
bias
output
input
output
pump
input
inputs
(c)
T(λ) output
input
t t t
λ1 λ2 λ3 λ1 λ2 λ3 λ4
photo-detector
spectralfilter
I(t) I(t) i(t) I(t)
t
weights summation nonlinearity output
(a)
(b)
!x1
x2
x3
w1
w2
w3
x⌃
f (x⌃ )
X
i
wi x ix4
modulator/laser
Laserelectrical wire
lightwave
Atto-Joule Optoelectronics
Photonic Functions
Analogue Computing
Today
Sorger, Zhang lab, Nature Photonics (2008) Sorger, Zhang lab, Nature (2009) Sorger lab , IEEE Photonics (2013) Sorger lab, Altug lab, Nature Nanotech. (2015) Sorger lab, Majumdar Lab, Sci. Reports (2016) Sorger lab, Optics Letters (2016) Sorger lab, IEEE STQE (2014 & 2017)
Sorger lab, IEEE Photonics (2015) Sorger lab, Nanophotonics (2016) Sorger lab, Optics Letters (2016) Sorger lab, El-Ghazawi lab, IPCC (2017) Sorger lab, El-Ghazawi lab, Mircoprocess. & MS (2017) Sorger lab, Frontiers in Optics (2017)
Sorger lab, Nanophotonics (2017) Sorger Lab, Grace lab, Biofabrication (2017) Sorger lab, IEEE Rebooting Computing (2017) Prucnal lab, Sorger lab, (in preparation)
OPEN Lab Prof. Sorger OPEN Lab
Prof. Sorger
5cm x 1cm
Modulators = Optical Transistors
EO Modulator
(=SS@FET)
OPEN Lab Prof. Sorger
Power-BW Roadmap
V. J. Sorger, Nature Nanotech., 10, 11-15 (2015)
Power = E bit( ) × Bitrate( )
J
bit
bit
s=
J
s= W
é
ëê
ù
ûú
OPEN Lab Prof. Sorger
EO Modulators
Sorger, Nanoph. (2012) Huang, IEEE Phot. (2013)
Ye, IEEE STQE (2014) Khan, (under review)
Ma, IEEE STQE (2017)
LASER &PHOTONICSREVIEWS
Vol. 9 | March 2015
Review and perspective on ultra-fast wavelength-size electro-optic modulators
Ke Liu, Chenran Ye, Sikandar Khan, Volker J. Sorger
www.lpr-journal.org Vol. 9 | March 2015 www.lpr-journal.org
Selective switching of individual multipole resonances in single dielectric nanoparticles
Pawel Wozniak, Peter Banzer, Gerd Leuchs
LASER &PHOTONICSREVIEWS
LPOR_9_2_cover.indd 2 11/03/15 5:05 PM
Sarpkaya, (in prep)
Grating period: 1um Fill Factor: 50%
Slot Gap < 50nm achieved Single EBL Exposure
ER = 1.2dB/um (single layer graphene)
ON
OFF 20 mm
5 mm
Control
2H 2H
1T’ Tr
ansm
issi
on
Control-Voltage
15 15
E/bit (fJ)
3.4
0.3
15
2.4 (0.16)
OPEN Lab Prof. Sorger
E/bit Scaling: FET vs. EAM
300K = 0.004aJ
= 4aJ
= 4fJ
15fJ
3fJ
0.3J
Boltzmann Approximation (=Thermal Smearing)
Sorger Group EAMs (Time not to scale)
10’s aJ Cavity
Finesse = 10
Mode Material
HPP ITO
HPP Graphene Slot Gr or TMD
SIS TMD
Technology Gap With equal E/bit, Photonics performance over Electronics:
1) <10x speed (<100GHz EAM vs. <2GHz link) 2) <1-5x WDM (depending on link length)
Sorger Nanophot. (2012)
Khan, (in review)
Ma, IEEE STQE (2017)
Cavity design based on Liu, J. Nanoph.(2015)
•
Thermally Unattainable Without Cooling
Transistors by (Intel)
Sorger Group, J. Opt., special issue (submitted)
100x
OPEN Lab Prof. Sorger
HyPPI: Hybrid Plasmon Photonics Interconnect
Sun, Badaway, Narayana, El-Ghazawi, Sorger, IEEE Photonics, 7,6 (2015).
High BFD
Mid BFD
low BFD
Electrical
Plasmonic Photonic
HyPPI (this work)
* BFD: Bit Flow Density [Gbps/um2]
Physics & Material (E/bit)Device SNR @ Rx Desired BER
OPEN Lab Prof. Sorger
Photonic Reconfigurable Computing
2015 2016 2017
MorphoNoC
DEVICE
LINK
NETWORK
SYSTEM & APPLICATION
HyPPI BFD
MoDetector
Universal CLEAR
For P
eer Review
Fig. 2 Capability-to-Length-Energy-Area-Resistance (CLEAR) based performance-cost comparison
between electrical (red) and hybrid photon-plasmon (blue) on-chip interconnect links as a function of link
length and technology evolution time. The chip scale (CS = 1 cm) link length and current year (2016) are
denoted in red. The following models are deployed; a) A capacity-area model based on the number of
transistors and on-chip optical devices, which can be regarded as the original Moore’s Law model; b) An
energy efficiency model is derived based on Koomey’s law, which is bounded by the kBT ln(2) ≈ 2.75zJ/bit,
Landauer limit (kB is the Boltzmann constant; T is the temperature); c) A the economic resistance model
based on technology-experience models and at the year 2016, the electronic link cost less than one billionth
to one millionth of the cost of the hybrid link; and d) A model for parallelism (after year 2006) capturing
multi-core architecture and the limitation from ‘dark’ silicon concepts in electrical link interconnects. The
yellow data point represents the actual CMOS silicon photonic chip that IBM fabricated in 2015.
Page 5 of 5 Spectrum
Link-CLEAR
Dynamic-CHyPPI HyPPI NoC
D-CHyPPI based Camera Sensor
Noise HyPPI
5x5 Optical Router
Fundamental Scaling Law
G. Pomrenke F. Darema
El-Ghazawi & Sorger Groups
OPEN Lab Prof. Sorger
Optical on-chip FFT
Hillerkus OE (2011)
Cooley-Tukey Method Addition
Multiplication
Sorger Group, Frontiers in optics (2017)
OPEN Lab Prof. Sorger
Convolutional Neural Networks based-on Optical FTT
Sorger Group, IEEE Computing (2017)
OPEN Lab Prof. Sorger
Example: Photonic Analogue Perceptron
Modulator
(i) (ii)
(iii) (iv)bias
input
output
Electro-OpticNonlinearity
ElectronicNonlinearity
pump
output
input
bias
output
input
output
pump
input
inputs
(c)
T(λ) output
input
t t t
λ1 λ2 λ3 λ1 λ2 λ3 λ4
photo-detector
spectralfilter
I(t) I(t) i(t) I(t)
t
weights summation nonlinearity output
(a)
(b)
!x1
x2
x3
w1
w2
w3
x⌃
f (x⌃ )
X
i
wi x ix4
modulator/laser
Laserelectrical wire
lightwave
Weight Bank = Partial Drop-Filter
Modulator Neuron
OPEN Lab Prof. Sorger
Neuromorphic Photonics
Prucnal, Sorger, El-Ghazawi, NSF E2CDA (2017) Ferreira de Lima, Nanophot (2016)
Nanophotonic Neuromorphic Computing
1 Introduction
TrueNorth
NeuroGrid HICANN
Neuromorphic Electronics
Current platform
Digital Electronics
Digital Efficiency Wall
State-of-the-Art
SpiNNaker Microwave Electronics
Neuromorphic Photonics
102
103
104
105
106
107
108
1
10
10-1
Effic
iency (G
MAC
/s/W
)
10-2
Computational Speed (MMAC/s/cm2)
1021 104 106 108 1010
109
1011
Nanophotonics
= 1 GMAC/s/nW @ 1x108 GMAC/s/cm2
This Project
Figure 1: Comparison of computational energy effi-
ciency and processing speed between existing elec-
tronic neuromorphic demonstrations and our proposed
programmable photonic platform meeting the NSF–SRC
challenge of 1 Giga-MAC/s/nW.
The gap between current computing capabilities and
current computing needs is widening due to the limitations
conventional, microelectronic processors. This insufficiency
is increasingly apparent in problems involving complex sys-
tems [1, 2], big data [3, 4], or real-time requirements [5].
In only some respects, processor performance has kept
pace with the expectations of Moore’s law; however, the ef-
ficiency of elemental multiply-accumulate (MAC) operations
has plateaued [6,7]at about 10 GigaMACs per second per
Watt (see digital efficiency wall in Fig. 1). This represents a
factor of only ⇠2 over the past 14 years. It is no longer pos-
sible for microelectronics to maintain previous rates of pro-
cessor evolution in speed, efficiency, and generality [8–10].
There is also a consensus that centralized, universal von-
Neumann architectures employed by conventional comput-
ers are no longer capable of being the one-size-fits-all ap-
proach to computing problems.
50µm
Broadcast Loop
Current Prototype:Silicon Photonic Chip (without EOM)
Photonic Neuron (EOM)
50µm
Microring Weight Bank
Broadcast-and-WeightAssembly
Small-world NetworkProcessing Stage
Interconnected PICsProcessing Networks
Processing Network Node
Primitive
Weights(Mircorings)
Neuron (EOM)
Figure 2: (Top) Scalable, programmable complex analog photonic processor. Hi-
erarchical architecture at different scales—from individual spike primitives to in-
terconnected complex photonic integrated circuits. (Bottom) Proposed computa-
tional primitive i.e. photonic neuron and current prototype of an integrated pho-
tonic neural network on silicon photonics chip.
Non-von-Neumann computing ar-
chitectures have been developed to
outperform in particular metrics, but
often at the expense of the other met-
rics (Section 2). Hardware devel-
oped for deep learning excels in ef-
ficiency at the cost of generality. Dig-
ital neuromorphic architectures offer
the generality of neural networks, but
at the expense of MAC efficiency.
Forays into unconventional comput-
ing have, so far, only been partially
successful because of the physical
limitations posed by metal intercon-
nects [11–13] and digital weighted
addition. Breaking the limitations
inherent in conventional microelec-
tronic computers will require a cross-
disciplinary approach to use new
physical phenomena and new pro-
cessing models.
Photonics can resolve the limita-
tions inherent in the digital MAC op-
eration and those of microelectronic
interconnects. Photonic weighted
addition, the analog photonic equivalent of a MAC, offers MAC energy consumption that does not tradeoff with
MAC speed. Photonic physics does not provide simple mechanisms to maintain logic levels in a way that is
cascadable and robust [14]. On the other hand, photonic phenomena can be employed to implement analog
weighted addition with unmatched speed, efficiency, and scalability (Section 3). Furthermore, photonic transport
physics can support massively distributed interconnections with commensurate performance.
The proposed program represents a vertically integrated approach to creating extremely high-performance
processors based on photonics (Figs. 1 and 2). In Sec. 4.1, we propose to investigate new devices, archi-
tectures, and evaluation standards, all of which must work in concert for the success of the program. Device
1
Implementation Options
A. Spiking photonic laser neurons on III-V platform
B. Perceptron photonic neurons on Si platform
Our Goal
MAC/s per Neuron
Computational Efficiency = J/MAC
Goal: 1GMAC/nJ VpCEOM = 1-10aC
For B.
Applications • Deep-Learning • Real-time • Non-linear Optimization
Vector Multiply Weighted Additions
OPEN Lab Prof. Sorger
Neuromorphic Performance
Deep Learning
Table 1: Comparison Between Different Neuromorphic Processors
ChipMAC Rate/
processor
Energy/
MAC
Processor
fan-in
Area/MAC
(µm2)
MAC
Rate/cm2
Silicon Photonic (Princeton) 2TMACs/s 5pJ 56 20,000 1⇥1014
Hybrid CMOS-Silicon Photonics 2TMACs/s 2.1 fJ 148 5,000 4⇥1014
Nanophotonic (This Project) 2 TMACs/s 7:4aJ 300 20 1 ⇥ 1017
TrueNorth (Electronic) [13] 2.5kMACs/s 26pJ 256 4.9 2⇥108
A table including performance metrics for various development stages of the proposed platform, and comparison to a popular
electronic implementation. 1 TMAC refers to 1⇥1012 MACs. The energy per MAC for TrueNorth was estimated by dividing
wall-plug power to number of neurons and to operational MAC rate per processor. The MAC Rate /cm2 was computed for the
photonic case assuming a 20GHz bandwidth for each node. All numbers include overheads in terms of footprint and area.
Density, Bandwidth and Speed Beyond energy efficiency, we evaluated how neuromorphic photonics will prac-
tically scale in density and speed. In Table 1, we have summarized the calculated metrics for the course of this
project, the MAC rate (per processor), energy per MAC (pJ), processor fan-in, and area per MAC. In the following
sections, we briefly discuss the physical effects that define these metrics.
Technology Limitation Ctotal (fF)Quantum
EfficiencyEMAC Notes
Silicon photonics (Eq. 1) Gain 47 7% 5 pJ Princeton
Hybrid CMOS (Eq. 2 )Switching energy +
noise35 7% 2.1 fJ
Nanophotonics (Eq. 2 ) noise 0.1 16% 7.4 aJ†
Figure 12: Efficiency estimations in different platforms. Silicon pho-
tonics (current technology). Hybrid CMOS (under research cur-
rently). Nanophotonic (next-generation silicon photonics + CMOS).
We assumed that NFO ⇡ NFI , with NFI on the order of 100. †Note:
Nanophotonic implementation requires high performance CMOS TIA:
power consumption less than 1µW while supporting a bandwidth of
more than 10GHz.
Number of Interconnects and Bandwidth The
network capacity is limited by two factors: (a)
the transform-limited bandwidth of the optical
transmission window, and (b) the finesse of op-
tical resonators utilized in weight banks. For ex-
ample, using a standard optical telecommunica-
tions band exceeding 4THz and 20GHz chan-
nels, we can expect >100 channels that each
node can receive. However, there are some en-
gineering challenges to reach that channel den-
sity. The finesse of the resonators must be
high enough to accommodate the 4THz band,
and the microrings must be engineered to have
transmission curves as packed as possible (cf. Fig. 5). Recent channel density studies [32]quantified analysis
for multiwavelength analog networks, and derived a limit of N 148 fan-in per PN. Using more advanced –-size,
nanophotonic cavities (i.e. photonic crystal defect state) could lead to higher Q filters and flatter resonances,
potentially increasing the channel account to 200.
MAC Unit Density Traditional photonic devices are diffraction-limited—i.e., devices and waveguides cannot
shrink much smaller than ⇠1µm2 in size. Although this number is much larger than seen in electronics (⇠5nm),
in digital architectures, many thousands of electronic transistors are needed to emulate simple mathematical or
floating point functions, which increases energy consumption and occupies space (cf. TrueNorth in Table 1). Uti-
lizing underlying physical phenomena to perform computations leads to 3–4x orders of magnitude improvements
in energy efficiency and information density (per area) in comparison.
MAC Rate Per Processor One of the most pronounced advantages of the nanophotonic platform is the speed
and bandwidths that it can achieve during operation. In digital systems, processing speed is bounded by the
clock rate, which puts an upper bound on the temporal resolution of digital information. The neuromorphic
photonic implementation avoids these limitations by using photonic interconnects together with electronic nodes
for nonlinear operations: optical signals surpass the bandwidth limitations of metal interconnects, while low-
power, analog electronic circuits have the potential to minimize (J/MAC) energy consumption to allow for more
scalability. Interconnects are optical, so the nodes are small enough to avoid transmission line effects.
Typical waveguide lengths on-chip are <1mm (i.e., <10ps time-of-flight in silicon), and typical resonator effects
have similar time constants. Thus, delays of the signal are no more than tens of picoseconds. Optoelectronic
components (such as photodetectors, modulators, or transimpedance amplifiers) in modern platforms such as
IMEC [68] can accept signals on the order of 20GHz to 30GHz. Since the optical response is much faster
than this range, optoelectronic circuits are the primary contribution to limitations in bandwidth. Considering both
bandwidth and latency together, the operating signal bandwidth can exceed >20GHz range.
9
Table 1: Comparison Between Different Neuromorphic Processors
ChipMAC Rate/
processor
Energy/
MAC
Processor
fan-in
Area/MAC
(µm2)
MAC
Rate/cm2
Silicon Photonic (Princeton) 2TMACs/s 5pJ 56 20,000 1⇥1014
Hybrid CMOS-Silicon Photonics 2TMACs/s 2.1 fJ 148 5,000 4⇥1014
Nanophotonic (This Project) 2 TMACs/s 7:4aJ 300 20 1 ⇥ 1017
TrueNorth (Electronic) [13] 2.5kMACs/s 26pJ 256 4.9 2⇥108
A table including performance metrics for various development stages of the proposed platform, and comparison to a popular
electronic implementation. 1 TMAC refers to 1⇥1012 MACs. The energy per MAC for TrueNorth was estimated by dividing
wall-plug power to number of neurons and to operational MAC rate per processor. The MAC Rate /cm2 was computed for the
photonic case assuming a 20GHz bandwidth for each node. All numbers include overheads in terms of footprint and area.
Density, Bandwidth and Speed Beyond energy efficiency, we evaluated how neuromorphic photonics will prac-
tically scale in density and speed. In Table 1, we have summarized the calculated metrics for the course of this
project, the MAC rate (per processor), energy per MAC (pJ), processor fan-in, and area per MAC. In the following
sections, we briefly discuss the physical effects that define these metrics.
Technology Limitation Ctotal (fF)Quantum
EfficiencyEMAC Notes
Silicon photonics (Eq. 1) Gain 47 7% 5 pJ Princeton
Hybrid CMOS (Eq. 2 )Switching energy +
noise35 7% 2.1 fJ
Nanophotonics (Eq. 2 ) noise 0.1 16% 7.4 aJ†
Figure 12: Efficiency estimations in different platforms. Silicon pho-
tonics (current technology). Hybrid CMOS (under research cur-
rently). Nanophotonic (next-generation silicon photonics + CMOS).
We assumed that NFO ⇡ NFI , with NFI on the order of 100. †Note:
Nanophotonic implementation requires high performance CMOS TIA:
power consumption less than 1µW while supporting a bandwidth of
more than 10GHz.
Number of Interconnects and Bandwidth The
network capacity is limited by two factors: (a)
the transform-limited bandwidth of the optical
transmission window, and (b) the finesse of op-
tical resonators utilized in weight banks. For ex-
ample, using a standard optical telecommunica-
tions band exceeding 4THz and 20GHz chan-
nels, we can expect >100 channels that each
node can receive. However, there are some en-
gineering challenges to reach that channel den-
sity. The finesse of the resonators must be
high enough to accommodate the 4THz band,
and the microrings must be engineered to have
transmission curves as packed as possible (cf. Fig. 5). Recent channel density studies [32]quantified analysis
for multiwavelength analog networks, and derived a limit of N 148 fan-in per PN. Using more advanced –-size,
nanophotonic cavities (i.e. photonic crystal defect state) could lead to higher Q filters and flatter resonances,
potentially increasing the channel account to 200.
MAC Unit Density Traditional photonic devices are diffraction-limited—i.e., devices and waveguides cannot
shrink much smaller than ⇠1µm2 in size. Although this number is much larger than seen in electronics (⇠5nm),
in digital architectures, many thousands of electronic transistors are needed to emulate simple mathematical or
floating point functions, which increases energy consumption and occupies space (cf. TrueNorth in Table 1). Uti-
lizing underlying physical phenomena to perform computations leads to 3–4x orders of magnitude improvements
in energy efficiency and information density (per area) in comparison.
MAC Rate Per Processor One of the most pronounced advantages of the nanophotonic platform is the speed
and bandwidths that it can achieve during operation. In digital systems, processing speed is bounded by the
clock rate, which puts an upper bound on the temporal resolution of digital information. The neuromorphic
photonic implementation avoids these limitations by using photonic interconnects together with electronic nodes
for nonlinear operations: optical signals surpass the bandwidth limitations of metal interconnects, while low-
power, analog electronic circuits have the potential to minimize (J/MAC) energy consumption to allow for more
scalability. Interconnects are optical, so the nodes are small enough to avoid transmission line effects.
Typical waveguide lengths on-chip are <1mm (i.e., <10ps time-of-flight in silicon), and typical resonator effects
have similar time constants. Thus, delays of the signal are no more than tens of picoseconds. Optoelectronic
components (such as photodetectors, modulators, or transimpedance amplifiers) in modern platforms such as
IMEC [68] can accept signals on the order of 20GHz to 30GHz. Since the optical response is much faster
than this range, optoelectronic circuits are the primary contribution to limitations in bandwidth. Considering both
bandwidth and latency together, the operating signal bandwidth can exceed >20GHz range.
9
System Efficiency Vectors
where ” L is the wall-plug-to-chip coupling efficiency of the laser source; ” pp, the total optical efficiency between
one optical stage and the next; ” r the photodiode’s quantum efficiency; NF O=NF I the ratio between fan-out and
fan-in, typically greater than 1; h⌫=e, the photon energy divided by electron’s charge; Vs, the inverse slope of the
modulator’s transmission curve; and, finally Cmod + CPD, the joint capacitances of the modulator and photodiode,
respectively. The laser pump power consumption by far dominates all others in this case, overcoming both
the noise and switching energy contributions to power consumption. The quantity Vs(Cmod + CPD) must be
minimized as much as possible, with capacitances on the order of attofarads to meet the energy efficiency
objective. This may be possible using the proposed modulator (Section 4.1) at very low temperatures (! 0K):
graphene, for example, experiences no carrier freeze-out, and its sensitivity ffen can be enhanced proportional
to the temperature ffen = T=T0. However, one must also consider the energy burden of cryogenics, which will
largely offset these advantages. Instead, we can take use electronic amplification to tackle this problem at room
temperature: namely, the use of an active CMOS-based trans-impedance amplifier (TIA).
Figure 10: PN signal pathway between two neural layers in
feedforward configuration.
Hybrid CMOS-Silicon Photonics An active trans-
impedance amplifier (TIA) serves several functions:
it can separate capacitive contributions of the pho-
todetector and modulator, and it also reduces the
impedances associated with each stage. That allows
us to effect a full voltage swing to the modulator even
if the current I p is low, bringing the pump power down
while maintaining cascadability.
There are two main limiting factors of signal power
in this link, namely, the shot noise at the photodetector
and TIA, and the resolution of the weight banks. The
combination of these two phenomena can be seen
in Fig. 11, and results in a noise-limited trade-off be-
tween low-power and bandwidth. With the power of
the lasers down to the noise-limit, the power con-
sumed by the switching energy of the modulator, Ebit , becomes more significant. The recalculated EMAC accounts
for both the laser energy consumption and the switching energy of the modulator.
EMAC ≥ NFO|{z}fan−out
·1
” pp” L ” r| {z }
quantumefficiency
·22Nb + 1
1− ¸ 2| {z }noise andresolution
·h⌫
NFI|{z}photon
energy/MAC
+Ebit
NFI|{z}switching
energy/MAC
(2)
Figure 11: SNR analysis considering both weight bank reso-
lution and noise cascadability requirements. The nonlinearity
factor of the modulator, ¸ , reduces the noise of the signal as it
travels through. This compares to transistors restoring the logic
voltage levels between gates. Therefore, the SNR of the signal
is limited above by how much the modulator can compensate
for shot noise accumulation. The signal must also match the
bit resolution of the MAC operation, Nb : SNR > 4Nb . That
imposes a lower limit on the SNR for a given bit resolution.
Equation (2) outlines the main vectors of improve-
ment for this project. For context, one attojoule per
MAC translates to EMAC⇠10h⌫. First, all the quantum
efficiencies need to be improved, namely, the photonic
link efficiency, ” pp; the laser wall-plug efficiency, ” L;
and the photodiode’s quantum efficiency, ” r. Second,
the modulator nonlinearity could be enhanced ¸ ! 0,
and the bit resolution, Nb of the neural network, lim-
ited. Finally, the modulator’s switching energy should
be minimized to sub-femtojoule levels, assuming a
fan-in number in the hundred.
Nanophotonics (This Project) To meet the NSF tar-
get, each MAC operation energy consumption must
be on the order of an attojoule. Figure 12 summa-
rizes the efficiency performance for the different fab
platforms discussed above. It shows that to achieve
attojoule-level energy efficiency, it is necessary toboth
cointegrate photonics with CMOS and bring better ef-
ficiencies with nanophotonic devices.
8
where ” L is the wall-plug-to-chip coupling efficiency of the laser source; ” pp, the total optical efficiency between
one optical stage and the next; ” r the photodiode’s quantum efficiency; NF O=NF I the ratio between fan-out and
fan-in, typically greater than 1; h⌫=e, the photon energy divided by electron’s charge; Vs, the inverse slope of the
modulator’s transmission curve; and, finally Cmod + CPD, the joint capacitances of the modulator and photodiode,
respectively. The laser pump power consumption by far dominates all others in this case, overcoming both
the noise and switching energy contributions to power consumption. The quantity Vs(Cmod + CPD) must be
minimized as much as possible, with capacitances on the order of attofarads to meet the energy efficiency
objective. This may be possible using the proposed modulator (Section 4.1) at very low temperatures (! 0K):
graphene, for example, experiences no carrier freeze-out, and its sensitivity ffen can be enhanced proportional
to the temperature ffen = T=T0. However, one must also consider the energy burden of cryogenics, which will
largely offset these advantages. Instead, we can take use electronic amplification to tackle this problem at room
temperature: namely, the use of an active CMOS-based trans-impedance amplifier (TIA).
Figure 10: PN signal pathway between two neural layers in
feedforward configuration.
Hybrid CMOS-Silicon Photonics An active trans-
impedance amplifier (TIA) serves several functions:
it can separate capacitive contributions of the pho-
todetector and modulator, and it also reduces the
impedances associated with each stage. That allows
us to effect a full voltage swing to the modulator even
if the current I p is low, bringing the pump power down
while maintaining cascadability.
There are two main limiting factors of signal power
in this link, namely, the shot noise at the photodetector
and TIA, and the resolution of the weight banks. The
combination of these two phenomena can be seen
in Fig. 11, and results in a noise-limited trade-off be-
tween low-power and bandwidth. With the power of
the lasers down to the noise-limit, the power con-
sumed by the switching energy of the modulator, Ebit , becomes more significant. The recalculated EMAC accounts
for both the laser energy consumption and the switching energy of the modulator.
EMAC ≥ NFO|{z}fan−out
·1
” pp” L ” r| {z }
quantumefficiency
·22Nb + 1
1− ¸ 2| {z }noise andresolution
·h⌫
NFI|{z}photon
energy/MAC
+Ebit
NFI|{z}switching
energy/MAC
(2)
Figure 11: SNR analysis considering both weight bank reso-
lution and noise cascadability requirements. The nonlinearity
factor of the modulator, ¸ , reduces the noise of the signal as it
travels through. This compares to transistors restoring the logic
voltage levels between gates. Therefore, the SNR of the signal
is limited above by how much the modulator can compensate
for shot noise accumulation. The signal must also match the
bit resolution of the MAC operation, Nb : SNR > 4Nb . That
imposes a lower limit on the SNR for a given bit resolution.
Equation (2) outlines the main vectors of improve-
ment for this project. For context, one attojoule per
MAC translates to EMAC⇠10h⌫. First, all the quantum
efficiencies need to be improved, namely, the photonic
link efficiency, ” pp; the laser wall-plug efficiency, ” L;
and the photodiode’s quantum efficiency, ” r. Second,
the modulator nonlinearity could be enhanced ¸ ! 0,
and the bit resolution, Nb of the neural network, lim-
ited. Finally, the modulator’s switching energy should
be minimized to sub-femtojoule levels, assuming a
fan-in number in the hundred.
Nanophotonics (This Project) To meet the NSF tar-
get, each MAC operation energy consumption must
be on the order of an attojoule. Figure 12 summa-
rizes the efficiency performance for the different fab
platforms discussed above. It shows that to achieve
attojoule-level energy efficiency, it is necessary to both
cointegrate photonics with CMOS and bring better ef-
ficiencies with nanophotonic devices.
8
Comparison Neuromorphic Processors
OPEN Lab Prof. Sorger
Mirror Symmetry Density with Delay in Spiking Neural Networks
Sorger Group. (2017) submitted
OPEN Lab Prof. Sorger
Prof. Volker Sorger sorger.seas.gwu.edu [email protected]
OPEN Sorger Team
Post Docs Grad Students U-grads Collaborators
Dr. I. Sarpkaya Dr. Ke Liu Dr. Hasan Goktas Dr. Elnaz Akbari
Mohammad Tahersima Sikandar Khan Matt Zhizhen Shuai Sun Jonathan George Rubab Amin Hani Nejadriahi Seyed Haghshenas Rohit Hemnani
D. D’Hemmecout T. Weinshel J. Crandall
Prof. Majumdar (UW) Prof. Agarwal (UPenn) Prof. Kimerling (MIT) Prof. Reed (Stanford) Prof. Bartels (UCR) Prof. Lee (KU) Prof. Prucnal (Princeton ) Prof. El Ghazawi (GWU) Dr. Sadana (IBM Watson) Prof. Cesare (NTU)