Advanced Technologies for Brain-Inspired Computing 

Abstract – This paper presents how new technologies can overcome classical implementation issues of Neural Networks. Resistive memories such as Phase Change Memories and Conductive-Bridge RAM can be used to obtain low-area synapses thanks to their programmable resistance, a behavior also called memristive. Similarly, the high capacitance of Through Silicon Vias can be exploited to greatly improve analog neurons and reduce their area. The very same devices can also improve the connectivity of Neural Networks, as demonstrated by an application. Finally, some perspectives are given on the usage of 3D monolithic integration for better exploiting the third dimension and thus obtaining systems closer to the brain.

I. Introduction

Brain-inspired computing has long been a nice theoretical research topic facing implementation issues. Even with powerful computing capabilities on some of the hardest problems, very few industrial applications have appeared.

As is widely known, these implementation limitations are linked to two main elements:

- Difficulty to "emulate" the behavior of synapses and neurons with transistors. Classical CMOS technology is not aimed at providing devices compatible with neuron requirements.

- Difficulty to connect neurons. This is linked to the 2D-layered implementation used to simplify CMOS integration.

Some solutions have been proposed to cope with these issues. The analog neuron is an elegant implementation solution that reduces the number of transistors needed per neuron. However, while reducing the transistor count, large capacitances are needed to obtain a suitable RC delay. On the connection side, time multiplexing is currently used, but it limits the overall performance while increasing the power consumption due to high-frequency links. Moreover, the increasing capacitance of wires in advanced technologies limits this solution. A global conclusion of all these works is the relative inexpediency of standard CMOS for Neural Network (NN) implementation.

In recent years, new technologies have appeared that can change the game. Some of these technologies and their application, or potential application, to NN are presented in this paper.

In section II, resistive memories are used to improve the implementation of synapses thanks to their attractive properties. Both Phase Change Memories (PCM) and Conductive-Bridge Random Access Memories (CBRAM) are studied for this purpose.

In section III, analog neurons are revisited through the usage of the high capacitance of Through Silicon Vias (TSV) used for 3D stacking. TSV also opens the way towards 3D integration and interconnection length reduction, which is discussed in section IV.

Finally, emerging 3D monolithic technology is discussed in section V. This technology allows a high density of vertical interconnects, thus coming closer to the brain and opening the way to a wide range of possibilities for NN implementation.

II. Exploiting Resistive Memories

Memristive memories can be defined as two-terminal devices whose conductance can be modulated by flowing current through them. In this section, we show how memristors can be used to produce brain-like circuits and in particular arrays of artificial synapses. The functional analogy between memristive technologies and the behavior of a synapse was anticipated by Chua [1] and later popularized by Strukov [2].

Since then, several authors have shown that this was a fruitful idea. Most notably, Snider [3] introduced the concept of using memristive memories to implement the Spike Timing Dependent Plasticity (STDP) learning rule, by virtue of their own physics. This idea has since been experimentally demonstrated by several groups: the University of Michigan using Si-Ag devices [4], the University of Aachen with nano-ionic devices [5], and Alibart et al. with organic transistors (NOMFET) [6].

However, more than the basic functionality of a single device is needed to yield an actual working memristive-based brain-like circuit. As a matter of fact, the circuit architecture allowing the exploitation of such devices is of utmost importance. Depending on the polarity and electrical characteristics of the investigated devices, three types of circuits have been identified; they are described in the following paragraphs.

Fabien Clermidy*, Rodolphe Heliot*, Alexandre Valentian*, Christian Gamrat**, Olivier Bichler**, Marc Duranton**, Bilel Belhadj# and Olivier Temam#
* CEA-LETI, Grenoble, FRANCE, [email protected]
** CEA-LIST, Paris, FRANCE, [email protected]
# INRIA, Paris, FRANCE, [email protected]


A. Circuits for Bipolar Memristors

Most of the works on memristive devices published over the last couple of years focus on bipolar resistive switching devices [2][3][4][5]. Indeed, these devices exhibit characteristics that are the closest to the original Memristor predicted by Chua. Their resistance can be gradually increased or decreased with voltage pulses of opposite polarity, and the resistance change is cumulative with the previous state of the device, which makes them particularly suitable to implement synaptic-like functionality.

Several (simplified) models have been proposed to develop nano-architectures capable of learning with memristive devices. In [7], a new behavioral model, especially tailored to modeling the conductance evolution of some popular memristive devices in pulse regime for synaptic-like applications, is introduced. This model demonstrates the high tolerance of NN architectures to Memristor variability.

A biologically-inspired spiking NN-based computing paradigm which exploits the specific physics of those devices is presented in [8]. In this approach, CMOS input and output neurons are connected by bipolar memristive devices used as synapses. It is natural to lay out the nanodevices in the widely studied crossbar, as illustrated in Figure 1, where CMOS silicon neurons and their associated synaptic driving circuitry are the dots, the squares being the nanodevices. Learning is competitive thanks to lateral inhibition, and fully unsupervised using a simplified form of STDP.

Using this topology, performance comparable to traditional supervised networks has been measured [8] on the textbook case of character recognition, despite extreme variations of the memristive devices' parameters. With the same approach, unsupervised learning of temporally correlated patterns from a spiking silicon retina has also been demonstrated. When tested with real-life data, the system is able to extract complex and overlapping temporally correlated features such as car trajectories on a freeway [9].

Figure 1: Basic circuit topology. Wires originate from the CMOS input layer (horizontal black wires) and from the CMOS output layer (vertical gray wires). Memristive nanodevices are at the intersections of the horizontal and vertical wires [8].
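To make this scheme concrete, the following Python sketch mimics the crossbar operation with a simplified STDP rule of the kind used in [8]: when an output neuron fires, the devices on its column whose input spiked recently are potentiated, the others are depressed, and lateral inhibition resets the losing neurons. All names and parameter values here are illustrative assumptions, not those of the cited implementation.

import numpy as np

rng = np.random.default_rng(0)

N_IN, N_OUT = 64, 8                        # crossbar size (inputs x outputs)
W = rng.uniform(0.3, 0.7, (N_IN, N_OUT))   # normalized device conductances
W_MIN, W_MAX = 0.0, 1.0
DW_PLUS, DW_MINUS = 0.05, 0.02             # assumed LTP / LTD increments
THRESHOLD, LEAK = 8.0, 0.9
STDP_WINDOW = 5                            # "recent input" window, in steps

v = np.zeros(N_OUT)                        # output membrane potentials
last_in_spike = np.full(N_IN, -10**9)      # last pre-synaptic spike time

def step(t, in_spikes):
    """One time step: integrate, fire with lateral inhibition, apply STDP."""
    global v
    last_in_spike[in_spikes] = t
    v = LEAK * v + W[in_spikes].sum(axis=0)     # crossbar current summation
    if v.max() >= THRESHOLD:
        winner = int(v.argmax())                # lateral inhibition: one winner
        recent = (t - last_in_spike) <= STDP_WINDOW
        W[recent, winner] = np.minimum(W[recent, winner] + DW_PLUS, W_MAX)
        W[~recent, winner] = np.maximum(W[~recent, winner] - DW_MINUS, W_MIN)
        v[:] = 0.0                              # inhibition resets every neuron
        return winner
    return None

for t in range(1000):
    active = np.nonzero(rng.random(N_IN) < 0.1)[0]  # random input spikes
    step(t, active)

Each output neuron thereby becomes selective to the input patterns it wins most often, without any supervision, which is the behavior reported in [8].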

B. Circuits for Unipolar Memristors (PCM)

Among the resistive technologies, Phase-Change Memory (PCM) has good maturity, scaling capability, high endurance, and good reliability [10]. PCM resistance can be modified by applying a temperature gradient that modifies the material organization between an amorphous and a crystalline phase. The amorphous region inside the phase change layer can be crystallized by applying Set pulses, thus increasing device conductance. It was shown that the magnitude of the relative increase in conductance can be controlled by the pulse amplitude and by the equivalent pulse width [11]. Amorphization, on the other hand, is a more power-hungry process and is not progressive with identical pulses. The current required for amorphization is typically 5-10 times higher than for crystallization, even for state-of-the-art devices.

To overcome these issues, a novel low-power architecture, the "2-PCM synapse", was introduced in [12]. The idea is to emulate synaptic functions in large-scale neural networks with two PCM devices constituting one synapse, as shown in Figure 2. The two devices have opposite contributions to the neuron's integration. When the synapse needs to be potentiated, the Long Term Potentiation (LTP) PCM device undergoes a partial crystallization, increasing the equivalent weight of the synapse. Similarly, when the synapse must be depressed, the Long Term Depression (LTD) PCM device is crystallized. As the LTD device has a negative contribution to the neuron's integration, the equivalent weight of the synapse is reduced. Furthermore, because gradual crystallization is achieved with successive identical voltage pulses, the pulse generation is greatly simplified.

Figure 2: (Left) Experimental LTP characteristics of Ge2Sb2Te5 (GST) PCM devices. For each curve, a reset pulse (7 V, 100 ns) is first applied, followed by 30 consecutive identical potentiating pulses (2 V). Dotted lines correspond to the behavioral model fit used in our simulations. (Right) 2-PCM synapse principle.
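The 2-PCM principle can be captured behaviorally in a few lines of Python. This is a sketch under a generic saturating-crystallization assumption, not the fitted GST model of [12]: each identical Set pulse gradually increases a device conductance, and the synaptic weight is the difference between the LTP and LTD contributions (I = ILTP - ILTD).

G_MIN, G_MAX = 0.0, 1.0   # normalized conductance bounds
ALPHA = 0.15              # assumed crystallization rate per Set pulse

class TwoPcmSynapse:
    def __init__(self):
        self.g_ltp = G_MIN   # device adding current to the neuron
        self.g_ltd = G_MIN   # device subtracting current from it

    @staticmethod
    def _set_pulse(g):
        # Gradual crystallization: each identical pulse closes a fixed
        # fraction of the remaining gap to full crystallization.
        return g + ALPHA * (G_MAX - g)

    def potentiate(self):
        self.g_ltp = self._set_pulse(self.g_ltp)

    def depress(self):
        self.g_ltd = self._set_pulse(self.g_ltd)

    @property
    def weight(self):
        return self.g_ltp - self.g_ltd

syn = TwoPcmSynapse()
for _ in range(5):
    syn.potentiate()
print(f"after 5 LTP pulses: w = {syn.weight:.3f}")
for _ in range(3):
    syn.depress()
print(f"after 3 LTD pulses: w = {syn.weight:.3f}")

Note that both conductances can only grow under Set pulses, so a real array must eventually reset and reprogram saturated devices; the sketch leaves this housekeeping out.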

C. Using Bipolar Memories (CBRAM)

1T-1R CBRAM multi-level programming was also proposed to emulate biological synaptic plasticity. As with PCM, it is difficult to emulate a gradual synaptic depression effect using CBRAM, due to the abrupt nature of the set-to-reset transition in those devices. Moreover, LTP behavior is possible, but requires gradually increasing the select transistor gate voltage. This implies keeping a history of the previous state of the synaptic device, thus leading to additional overhead in the programming circuitry.

In [13], it is shown that when weak SET programming conditions are used immediately after a RESET, probabilistic switching of the device appears, as illustrated in Figure 3. The switching probability can then be tuned by using the right combination of programming conditions. This opens the way to exploiting the intrinsic stochasticity of CBRAM to implement synapses. Similarly, at the system level, a functional equivalence [14] exists between multi-level deterministic synapses and binary probabilistic synapses. Using this property, a low-power stochastic neuromorphic system for auditory (cochlea) and visual (retina) cognitive processing applications was proposed in [13].

Figure 3: (Left) TEM of the CBRAM resistive element. (Right) Stochastic switching of a 1T-1R device during 1000 cycles using 100 ns pulses shows a switching probability of around 0.5.
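Such a binary probabilistic synapse is straightforward to model. The sketch below uses hypothetical switching probabilities, and also illustrates the functional equivalence noted in [14]: averaged over many binary devices, the population behaves like one multi-level analog weight.

import random

class StochasticCbramSynapse:
    def __init__(self, p_set=0.5, p_reset=0.5):
        self.on = False          # ON = low resistance state
        self.p_set = p_set       # probability a weak SET pulse switches ON
        self.p_reset = p_reset   # probability a RESET pulse switches OFF

    def potentiate(self):
        if not self.on and random.random() < self.p_set:
            self.on = True

    def depress(self):
        if self.on and random.random() < self.p_reset:
            self.on = False

# Mean weight of many binary devices approximates one analog weight [14].
random.seed(1)
population = [StochasticCbramSynapse(p_set=0.2) for _ in range(1000)]
for syn in population:
    syn.potentiate()
mean_w = sum(s.on for s in population) / len(population)
print(f"mean weight after one pulse: {mean_w:.2f} (expected ~0.20)")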

D. Conclusion

This section has demonstrated how the synaptic weight adaptation rules LTP and LTD can be implemented using various memristive devices. Thanks to this property, such devices can lead to an intrinsic implementation of the STDP-based learning rule when coupled with a spike coding scheme [15]. It is safe to say that memristive technology is a promising way of implementing synaptic arrays for brain-inspired computing, while neuron circuits will be implemented using advanced CMOS technology, as described in the following section. This hybrid CMOS-memristive approach is particularly appropriate for embedded neuromorphic devices.

III. Analog neurons and 3D-TSV

Beyond pure technological advantages, a mix between advanced design and new technologies is an appealing way to succeed in integrating more neurons and greatly increase the overall performance of NN. In this section, we discuss the advantages of the analog neuron design style and its combination with Through-Silicon Vias (TSV) used in 3D stacking technologies.

Analog neurons are considered powerful computational operators thanks to 1) compact design, 2) low power consumption, 3) the ability to interface sensors directly with the processing part, and 4) computational efficiency. Even if some analog neuron designs have already been proposed, most are geared towards fast emulation of biological neural networks [16][17]. Consequently, these hardware neurons contain a feedback loop for learning purposes [19], which is irrelevant when learning can be done off-line, as in many current applications. Still, some of these designs have been applied to computing applications [18], but they come with the aforementioned overhead of their bio-inspired motivation. Recently, we implemented and fabricated a Leaky Integrate and Fire (LIF) analog neuron in a 65nm technology node (Figure 4) without a feedback loop. This neuron is capable of harnessing input signals ranging from low (1kHz) to medium (1MHz) frequency, compatible with most signal processing applications [20]. Note that the aforementioned frequencies (1kHz to 1MHz) correspond to the rate at which the neuron and I/O circuits can process spikes per second, not the overall update frequency of the input data. This frequency can potentially scale up to much higher values by leveraging the massive parallelism of the architecture: an "x" MHz input signal can be de-multiplexed and processed in parallel using x input circuits.

Figure 4: Circuit diagram of the analog neuron (injection in purple, leak in blue, reset in yellow, capacitance in orange, comparator in red).

For instance, a motion estimation application processing a black and white SVGA (800x600) image at 100Hz corresponds to a maximum spike frequency of 48MHz (800*600*100). This application requires 4*Np + 3 neurons for Np pixels. Thus, (4*800*600+3)*100 spikes must be processed per second. Since the maximum processing rate of a neuron is 1 Mspikes/s in our design, we need ((4*800*600+3)*100)/10^6 = 193 neurons to process images at a speed compatible with the input rate.
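The sizing arithmetic can be checked in a few lines (purely illustrative):

import math

PIXELS = 800 * 600                   # SVGA frame
FRAME_RATE = 100                     # Hz
LOGICAL_NEURONS = 4 * PIXELS + 3     # application topology (4*Np + 3)
SPIKES_PER_S = LOGICAL_NEURONS * FRAME_RATE
NEURON_RATE = 1_000_000              # 1 Mspike/s per analog neuron

print(SPIKES_PER_S)                             # 192000300 spikes/s
print(math.ceil(SPIKES_PER_S / NEURON_RATE))    # 193 physical neurons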

However, an analog neuron has both assets and drawbacks. The main drawback is the need for a large capacitance. The capacitance has a central role as it accumulates spikes, i.e., it performs the addition of inputs. In spite of this drawback, the analog neuron can remain a very low-cost device: our analog neuron has an area of 120 µm² at 65nm. This area accounts for the capacitance size and the 34 transistors. Since the capacitance is implemented using only two metal layers, most of the transistor logic can fit underneath. As a result, most of the area actually corresponds to the capacitance.

The analog neuron of Figure 4 operates as follows. When a spike arrives at the input of a neuron, before the synapse, it triggers the Sinc switch (see bottom left), which brings the current to the capacitance via VI. The different transistors on the path from Sinc to VI form a current mirror, which aims at stabilizing the current before injecting it. Now, if the synapse has value V, this spike will be converted into a train of V pulses. These pulses are emulated by switching Spulse on and off V times, inducing V current injections into the capacitance. We have physically measured the neuron energy at the very low value of 1.4pJ per spike (we used the maximum value of V = 127 for the measurement). The group of transistors on the center-left part of the figure implements the variable leakage of the capacitance (which occurs when Sleak is closed), necessary to tolerate signal frequencies ranging from 1kHz to 1MHz. After the pulses have been injected, the neuron potential (corresponding to the capacitance charge) is compared against the threshold by closing Scmp, and an output spike is generated on Sspkout if the potential is higher than the threshold.
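The following behavioral sketch summarizes this operation at the functional level. It is a discrete-time abstraction with assumed constants, not a transistor-level model of the circuit of Figure 4: a spike through a synapse of value V injects V charge pulses, the leak drains the capacitance, and crossing the threshold emits an output spike and resets the potential.

DQ = 0.01        # normalized charge per injection pulse (assumed)
LEAK = 0.995     # leak factor per time step (assumed)
V_TH = 1.0       # firing threshold
V_REST = 0.0     # resting potential

class LifNeuron:
    def __init__(self):
        self.vm = V_REST

    def tick(self, input_spikes):
        """input_spikes: synaptic values V of the spikes received this step."""
        for v in input_spikes:
            self.vm += v * DQ          # V pulse injections per input spike
        self.vm *= LEAK                # Sleak closed: capacitance leakage
        if self.vm >= V_TH:            # comparison when Scmp closes
            self.vm = V_REST           # reset
            return True                # output spike on Sspkout
        return False

n = LifNeuron()
out = [t for t in range(100) if n.tick([3])]   # constant drive with V = 3
print("output spike times:", out)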

As explained before, analog spiking neurons require large capacitances to store the internal membrane voltage of the neuron. Typical capacitance values are in the order of 0.5-1 pF, which corresponds to 50-200 µm² capacitors in 32-65 nm technologies, and up to 50% of the total neuron area.

3D stacking is a technique that can provide massive parallelism between layers by stacking them and directly connecting them via a large number of Through-Silicon Vias (TSVs). TSVs are used to create vertical interconnections in 3D stacked chips. As shown in Figure 5, they are composed of a metal wire isolated from the substrate. They are not perfect wires, however, and act as MOS capacitors.

In traditional digital circuits, these TSVs are a significant limitation of 3D stacking: they consume on-chip area and suffer from their parasitic capacitive behavior [21]. However, we can actually take advantage of these capacitances to implement the capacitors of spiking neurons, turning a weakness into a useful feature.

An RLC π-shaped electrical model of TSVs was used, as described in [21] (Figure 5, middle). The TSV model features a resistance Rtsv and an inductance Ltsv. It is isolated from the substrate by three capacitors Cox, Cdep and Csi, while parasitic silicon substrate losses are represented by a conductance Gsi. All of these values are process dependent and can change with TSV density, height, diameter, operating frequency and oxide thickness.

Figure 5: Through Silicon Vias structure and electrical model
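As a rough plausibility check, the sketch below combines assumed values of these model capacitances into an equivalent capacitance at the neuron node. Both the numbers and the chosen series/parallel combination are assumptions for illustration; as stated above, the real values are process dependent and should be taken from the model of [21].

def series(c1, c2):
    return c1 * c2 / (c1 + c2)

# Assumed (not measured) values, chosen to land in the 0.5-1 pF range
# that a LIF neuron membrane requires.
C_OX = 1.0e-12    # F, oxide capacitance
C_DEP = 2.0e-12   # F, depletion capacitance
C_SI = 0.2e-12    # F, silicon capacitance

c_eff = series(C_OX, C_DEP) + C_SI   # assumed equivalent topology
print(f"effective TSV capacitance ~ {c_eff*1e12:.2f} pF")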

We designed an analog LIF neuron using the above TSV model and compared it through Spice simulations against the previously designed analog neuron. Figure 6 shows the behavior of the neuron internal potential Vm for standard (top) and TSV-based (bottom) neurons (for different densities). When Vm reaches a threshold voltage Vth, Vm is reset to a resting potential. With a medium-density TSV (400/mm²), it can be seen that the TSV-neuron exhibits exactly the same behavior as the standard neuron. TSVs with other densities exhibit similar behavior, albeit with different time constants [22].

Figure 6: Membrane potential of standard and TSV-based (several TSV densities) neurons.

Stacking neuromorphic architectures in 3D with TSVs offers tremendous opportunities. Figure 7 summarizes some of the options that are offered. These different setups come with different gains in terms of area or with additional connectivity. Some of these applications are discussed in the next section.

Figure 7: Illustrations of (a) a standard 2D neuromorphic architecture, (b) a 3D-IC with standard 2D neurons, and (c), (d) a 3D-IC with TSV-based neurons.

IV. 3D stacking integration

Neuromorphic architectures are fundamentally 3D structures with massive parallelism, but the 2D planar circuits on which they are implemented considerably limit the bandwidth between layers (or significantly increase the area and energy cost required to achieve sufficient bandwidth). Indeed, neural networks typically exhibit a high level of connectivity. Especially when biologically relevant neural networks are considered, a given neuron can receive information from up to 10,000 neurons. Deep computation and cognitive functions may be reached with thousands of neurons and millions of synapses [23][24][25][26].

In this section, we assume that the neural network is multilayer and densely connected, and that neurons are point-to-point connected (hardwired connections). Dense hardware integration is therefore required for advanced processing tasks. Analog neurons are compact circuits, but connecting thousands of them leads to routing and throughput problems. 3D architectures are expected to reduce the routing congestion related to neuron interconnect, increase the scalability of the network, reduce critical path lengths, and finally save circuit area and power.

Figure 8: 2D (a) and 3D (b) designs of a neural processor. The two layers of the neural network are mapped onto separate silicon layers. The mean wire-length and the routing complexity are reduced, and the inter-neuron throughput is improved.

The case study presented in this paper is a neural processor with a densely connected 2-layer neural network, aiming to recognize objects appearing in a 1000 frame/s video stream. The final objective is to execute complex tasks, such as real-time surveillance (motion detection + image filtering + shape recognition), with low-power and compact circuitry. The 3D design offers more flexibility to meet timing, area and power constraints. Figure 8 shows the block diagrams of both 2D and 3D partitioning of the neural processor design. The intensity of inter-block communication is represented by the thickness of edges (thick edges for intense communication). The routing congestion risk is represented using colors (red edges for higher congestion risk, green edges for the opposite). In the 2D design, the communication between the two neuron layers is the most intensive in terms of data exchange and throughput requirements. This complexity is mitigated after mapping the network layers onto separate silicon layers in the 3D design. Throughput and congestion problems are, interestingly, alleviated by the increased number of metal layers and the short connections created by 3D routing.

The design methodology we used consists in designing two separate circuits for a face-to-face stacking fabrication process. We use two types of TSV to link both circuits: TSVs that are created by bonding the last metal layers of both circuits (bumps), and TSVs that cross all the metal layers to output the pad signals of the hidden circuit (IO TSVs). Bumps have a size of 5x5 µm² and may be placed anywhere in the circuit body, while IO TSVs have a size of 60x90 µm² and can only be placed in the corona area of both circuits.

A 2-tier 3D layout has been built in a 130nm technology. The layouts show a decrease in the mean wire-length. Short connections reduce the footprint area in the 3D design, which is largely related to the buffer count reduction enabled by shorter wire-lengths and hence better timing. Routing demand is quite different between the 2D and 3D designs. In the 2D case, a large number of inter-neuron and over-neuron connections are required, and this increases both total and average wire-length. Thus, more high metal layers are necessary to complete inter-neuron routing. On the other hand, many wires in the 3D design are connected to nearby bumps, and this reduces wiring demand significantly. Figure 9 shows the mapping results of the same neural network in 2D (left) and 3D (right). 3D mapping tremendously decreases the total connection length, and hence the associated power consumption.

Figure 9: Mapping results of the same neural network (184 neurons, 1181 connections) in 2D (left) and 3D (right). Total connection length is reduced by 3x.
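A toy placement model illustrates why stacking helps, under assumed dimensions (this is not the actual place & route flow): in 2D the two neuron layers sit side by side, so every inter-layer net crosses the die, whereas in 3D the layers overlap and each net only pays a short vertical hop.

import random

random.seed(0)
N1, N2 = 92, 92            # 184 neurons split evenly (assumption)
CONNECTIONS = 1181         # connection count from Figure 9
PITCH = 15.0               # µm between neighboring neurons (assumed)
TSV_HOP = 10.0             # µm, assumed vertical detour per 3D net

def place(n, x_offset):
    # Place n neurons on a square-ish grid starting at x_offset.
    side = int(n ** 0.5) + 1
    return [(x_offset + PITCH * (i % side), PITCH * (i // side))
            for i in range(n)]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

nets = [(random.randrange(N1), random.randrange(N2))
        for _ in range(CONNECTIONS)]

# 2D: the two layers sit next to each other on the same die.
l1, l2 = place(N1, 0.0), place(N2, 200.0)
wl_2d = sum(manhattan(l1[a], l2[b]) for a, b in nets)

# 3D: the two layers overlap; each net pays only a short vertical hop.
l2_3d = place(N2, 0.0)
wl_3d = sum(manhattan(l1[a], l2_3d[b]) + TSV_HOP for a, b in nets)

print(f"2D total wire-length: {wl_2d/1e3:.0f} mm")
print(f"3D total wire-length: {wl_3d/1e3:.0f} mm (ratio {wl_2d/wl_3d:.1f}x)")

Even this crude model yields a severalfold reduction; the exact 3x figure above comes from the actual layouts.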



V. 3D monolithic integration

3D-TSV opens the way towards more integration thanks to new functions (3D capacitances) and reduced wire lengths, as demonstrated in the previous sections. However, the recent ITRS roadmap [27] shows that TSV alignment will be limited to between ~0.5 µm and 1 µm to guarantee correct operation (Table 1). This is due to the TSV process, which is performed outside the foundry and thus does not benefit from advanced lithography.

Due to this restriction in alignment, the pitch between two TSVs cannot be smaller than ~4-8 µm. In addition, the height of a TSV (~20-50 µm) implies a performance impact when signals crossing the TSV are on the chip's critical paths. As we have seen before, the high capacitance can be used for neuron design. However, the limited number of vertical TSVs per mm² prevents this technology from being used for ultra-dense wiring.

Recently, a new 3D technology called 3D Monolithic Integration (3DMI) has appeared. With 3DMI, transistor layers are fabricated one after another on the same die using high-end lithography equipment. This results in an improved alignment performance of ~10nm [28]. Therefore, vertical connections can be placed with a very small footprint of less than 100nm diameter in 65nm technology [29].

Table 1. 3DMI vs 3D-TSV comparison

                   Alignment (µm)   Diameter (µm)   Pitch (µm)   Minimum height (µm)
TSV                0.5 - 1          2 - 4           4 - 8        20 - 50
3DMI               0.01             0.1             0.2*         0.1
3DMI vs TSV gain   50x to 100x      20x to 40x      20x to 40x   200x to 500x

* Pitch is assumed to be at least two times larger than the diameter.

Moreover, the distance between the top and bottom layers can be reduced to 100nm [28]. Consequently, the performance while passing through the vertical via is greatly improved. As a result, efficient, very fine grain partitioning is achievable with 3DMI. Figure 10 shows some partitioning possibilities based on 3DMI integration. Contrary to TSVs, the 3D connections are of the same size as classical 2D vias. No guard interval is required, and classical design rules can be used. This characteristic allows ultra-fine grain partitioning: as shown in Figure 10.a, a classical library cell can be split over two layers. Figure 10.b shows a so-called "cell-on-cell" approach requiring new 3D place & route tools with fine grain partitioning [30]. In all cases, 3DMI becomes an ideal choice for highly integrated 3D circuits.

Figure 10: Monolithic 3D integration approaches: (a) transistor-level (N/P), (b) gate-level (cell-on-cell).

A 3DMI implementation of NN has not been demonstrated yet. However, compared to 3D-TSV, a high degree of improvement is expected, especially when considering multiple layers of neurons, which is not an issue with 3DMI. Considering that NN work at quite low frequency compared to classical devices, thermal issues should not be a problem.

VI. Conclusion

In this paper, we have presented the usage of emerging technologies to greatly improve the implementation of Neural Networks: synaptic effects can be obtained with memristors, implemented as PCM or CBRAM; neuron cost can be reduced by mixing analog design with high TSV capacitances; finally, neuron connection issues can be partially solved through 3D integration, monolithic integration being the most advanced case.

All these technologies appear at the same time and, when combined with the requirements of new "smart" and low-power applications, we believe we have here a unique opportunity to push NN forward.


References

[1] L. Chua, "Memristor - The missing circuit element," IEEE Transactions on Circuit Theory, vol. 18, no. 5, pp. 507-519, 1971.
[2] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," Nature, vol. 453, no. 7191, pp. 80-83, 2008.
[3] G. S. Snider, "Spike-timing-dependent learning in memristive nanodevices," Proc. IEEE International Symposium on Nanoscale Architectures (NANOARCH), 2008, pp. 85-92.
[4] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu, "Nanoscale Memristor Device as Synapse in Neuromorphic Systems," Nano Letters, vol. 10, no. 4, pp. 1297-1301, 2010.
[5] R. Waser and M. Aono, "Nanoionics-based resistive switching memories," Nature Materials, vol. 6, no. 11, pp. 833-840, 2007.
[6] F. Alibart, S. Pleutin, D. Guérin, C. Novembre, S. Lenfant, K. Lmimouni, C. Gamrat, and D. Vuillaume, "An organic nanoparticle transistor behaving as a biological spiking synapse," Advanced Functional Materials, vol. 20, no. 2, pp. 330-337, 2010.
[7] D. Querlioz, P. Dollfus, O. Bichler, and C. Gamrat, "Learning with memristive devices: How should we model their behavior?," Proc. IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH), pp. 150-156, 2011.
[8] D. Querlioz, O. Bichler, P. Dollfus, and C. Gamrat, "Immunity to Device Variations in a Spiking Neural Network with Memristive Nanodevices," IEEE Transactions on Nanotechnology, vol. 12, no. 3, pp. 288-295, 2013.
[9] O. Bichler, D. Querlioz, S. J. Thorpe, J.-P. Bourgoin, and C. Gamrat, "Extraction of temporally correlated features from dynamic vision sensors with spike-timing-dependent plasticity," Neural Networks, vol. 32, pp. 339-348, 2012.
[10] A. Fantini et al., "N-doped GeTe as performance booster for embedded phase-change memories," Proc. IEDM, 2010, pp. 29.1.1-29.1.4.
[11] D. Kuzum, R. G. D. Jeyasingh, B. Lee, and H.-S. P. Wong, "Nanoelectronic programmable synapses based on phase change materials for brain-inspired computing," Nano Letters, vol. 12, no. 5, pp. 2179-2186, 2012.
[12] O. Bichler, M. Suri, D. Querlioz, D. Vuillaume, B. DeSalvo, and C. Gamrat, "Visual pattern extraction using energy-efficient '2-PCM synapse' neuromorphic architecture," IEEE Transactions on Electron Devices, vol. 59, no. 8, pp. 2206-2214, 2012.
[13] M. Suri, D. Querlioz, O. Bichler, G. Palma, E. Vianello, D. Vuillaume, C. Gamrat, and B. DeSalvo, "Bio-Inspired Stochastic Computing Using Binary CBRAM Synapses," IEEE Transactions on Electron Devices, vol. 60, no. 7, pp. 2402-2409, 2013.
[14] D. H. Goldberg, G. Cauwenberghs, and A. G. Andreou, "Probabilistic synaptic weighting in a reconfigurable network of VLSI integrate-and-fire neurons," Neural Networks, vol. 14, pp. 781-793, 2001.
[15] F. Alibart, S. Pleutin, O. Bichler, C. Gamrat, T. Serrano-Gotarredona, B. Linares-Barranco, and D. Vuillaume, "A Memristive Nanoparticle/Organic Hybrid Synapstor for Neuroinspired Computing," Advanced Functional Materials, vol. 22, no. 3, pp. 609-616, 2012.
[16] J. V. Arthur and K. Boahen, "Silicon-Neuron Design: A Dynamical Systems Approach," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 58, no. 99, p. 1, 2011.
[17] G. Venkatesh, J. Sampson, N. Goulding-Hotta, S. K. Venkata, M. B. Taylor, and S. Swanson, "QsCORES: Trading Dark Silicon for Scalable Energy Efficiency with Quasi-Specific Cores," Proc. International Symposium on Microarchitecture, 2011.
[18] R. J. Vogelstein, U. Mallik, J. T. Vogelstein, and G. Cauwenberghs, "Dynamically reconfigurable silicon array of spiking neurons with conductance-based synapses," IEEE Transactions on Neural Networks, vol. 18, no. 1, pp. 253-265, 2007.
[19] A. Hashmi, A. Nere, J. J. Thomas, and M. Lipasti, "A case for neuromorphic ISAs," Proc. International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), ACM, New York, NY, 2011.
[20] A. Joubert, B. Belhadj, and R. Heliot, "A robust and compact 65 nm LIF analog neuron for computational purposes," Proc. IEEE NEWCAS Conference, vol. 10, pp. 9-12, 2011.
[21] C. Fuchs, J. Charbonnier, and S. Cheramy, "Process and RF modelling of TSV last approach for 3D RF interposer," Proc. IITC/MAM, 2011.
[22] A. Joubert, M. Duranton, B. Belhadj, O. Temam, and R. Heliot, "Capacitance of TSVs in 3-D stacked chips: a problem? Not for neuromorphic systems," Proc. IEEE/ACM Design Automation Conference (DAC'12), June 2012.
[23] W. Gerstner and W. M. Kistler, Spiking Neuron Models. Cambridge University Press, 2002.
[24] P. Merolla, J. Arthur, F. Akopyan, N. Imam, R. Manohar, and D. Modha, "A digital neurosynaptic core using embedded crossbar memory with 45pJ per spike in 45nm," Proc. IEEE Custom Integrated Circuits Conference, Sep. 2011, pp. 1-4.
[25] U. Rutishauser and R. J. Douglas, "State-dependent computation using coupled recurrent networks," Neural Computation, vol. 21, no. 2, pp. 478-509, 2009.
[26] J. Schemmel, J. Fieres, and K. Meier, "Wafer-scale integration of analog neural networks," Proc. International Joint Conference on Neural Networks, IEEE, Jun. 2008, pp. 431-438.
[27] Semiconductor Industry Association, "The International Technology Roadmap for Semiconductors (ITRS)," 2011 Edition.
[28] P. Batude et al., "GeOI and SOI 3D Monolithic Cell Integrations for High Density Applications," Symposium on VLSI Technology (VLSIT), IEEE, 2009.
[29] Soon-Moon Jung et al., "Highly cost effective and high performance 65nm S3 (stacked single-crystal Si) SRAM technology with 25F2, 0.16um2 cell and doubly stacked SSTFT cell transistors for ultra high density and high speed applications," Symposium on VLSI Technology, Digest of Technical Papers, pp. 220-221, June 2005.
[30] S. Bobba et al., "CELONCEL: Effective Design Technique for 3-D Monolithic Integration targeting High Performance Integrated Circuits," Proc. 16th Asia and South Pacific Design Automation Conference (ASP-DAC), IEEE Press, 2011.