Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
A Compact Model for Scalable MTJ SimulationFernando Garcıa-RedondoArm Ltd, Cambridge, UK
Pranay PrabhatArm Ltd, Cambridge, [email protected]
Mudit BhargavaArm Inc, Austin, USA
Cyrille DrayArm Ltd, La Paros, France
Abstract—This paper presents a physics-based modelingframework for the analysis and transient simulation of cir-cuits containing Spin-Transfer Torque (STT) Magnetic TunnelJunction (MTJ) devices. The framework provides the toolsto analyze the stochastic behavior of MTJs and to generateVerilog-A compact models for their simulation in large VLSIdesigns, addressing the need for an industry-ready model ac-counting for real-world reliability and scalability requirements.Device dynamics are described by the Landau-Lifshitz-Gilbert-Slonczewsky (s-LLGS ) stochastic magnetization consideringVoltage-Controlled Magnetic Anisotropy (VCMA) and the non-negligible statistical effects caused by thermal noise. Model be-havior is validated against the OOMMF magnetic simulator and itsperformance is characterized on a 1-Mb 28 nm Magnetoresistive-RAM (MRAM) memory product.
Index Terms—STT-MRAM, MTJ, Compact modeling, s-LLGS
I. INTRODUCTION
Recent advances in MTJ devices [1]–[4] open the path tothe next generation of system architectures, from embeddedbattery-less edge devices where off-chip Flash will be replacedby embedded MRAM, to future power-efficient caches in HPCsystems. STT-MRAM is being actively developed by foundriesand integrated into 28 nm generation CMOS Process DesignKits (PDKs) [4].
The complex multi-layered MTJ structures have heavilynon-linear magnetic and electric behaviors that depend onthe device structure, the thermal and external magnetic en-vironment, and the applied electrical stress (Figure 1). On topof this complex system of equations, the evolution of MTJmagnetization m shows a stochastic behavior caused by therandom magnetic field induced by thermal fluctuations.
Many MTJ models have been presented, from micro-magnetic [5], [6] approaches to macro-magnetic SPICE andVerilog-A compact models for circuit simulations [1]–[3],[7]. Compact models present in the literature account fordifferent behavioral aspects: a better temperature dependence[1], a more accurate anisotropy formulation for particular MTJstructures [1]–[3], and the integration of Spin-Orbit Torque(SOT) for three-terminal devices [2], [3]. These prior worksfocus on the ability to model a specific new behavior for thesimulation of a single or a few MTJ devices in a small circuit.In these approaches, the simulation of device stochasticityimposes a high computational load on the simulation, com-plicating the simulation of larger VLSI circuits. To addressthe need to simulate thousands of MTJ devices in a singlememory macro, circuit designers may have to oversimplify
m
mpm (AP)
m (P)
m (AP)
High
m (P)
LowResistance Resistance
mθ
φx
z
y
θ0
Free Layer Stack
Tunnel Junction
Pinned Layer Stack
m
mp
m
mp
Fig. 1. Basic MTJ structure, magnetization vector switching trajectory [10]from Parallel (P) to Anti-Parallel (AP) states, and reference coordinate system.
MTJ switching to a basic interpolated behaviour, and designfor worst-case fixed voltage and current targets. This hindersthe design of optimized circuits.
Therefore, there is a clear need for an accurate and modularcompact model accounting for the magnetization dynamicsof MTJ devices yet scalable enough for the simulation oflarge MRAM memory macros. In this paper we present threekey contributions: Section II presents a study of reliabilityissues in MTJ model simulation. The proposed modelingframework is presented in Section III, including a modularmulti-threaded s-LLGS solver validated against OOMMF [5],an efficient Verilog-A compact model for MRAM macrosimulation and a solution to model thermal noise effects.Finally, in Section IV we present a case study with a 1-Mb28 nm MRAM macro.
II. MTJ COMPUTATIONAL AND CONVERGENCE ISSUES
From early behavioral prototypes to mature, product-readyPDKs, the simulation of emerging devices is a process thatrequires not only research contributions capturing novel devicebehavior [1]–[3], [7], but also optimization stages enablingcircuit design [8]. For STT-MRAM, circuit design needs thecomplex device dynamics to be incorporated into the standardSPICE and Cadence® Spectre [9] solvers. In this sectionwe address the challenges of reliable and efficient simulationof MTJ devices. All results are generated using Spectre17.1.0.583 with default settings unless noted otherwise.
The s-LLGS system resolution can easily lead to artificialoscillations if solved in a Cartesian reference system [1] whenusing the Euler, Gear and Trapezoidal methods provided bycircuit solvers. In [10] it is studied how MTJ devices benefitfrom resolution in spherical coordinates, specially when usingimplicit midpoint and explicit RK45 methods. Additionally,Cartesian methods require [1] a non-physical normalizationstage for the magnetization vector [10]. Some further issuesarising from solving in a spherical reference system aredetailed below.
arX
iv:2
106.
0497
6v1
[cs
.ET
] 9
Jun
202
1
0 5 10 15 20 25 30Time [ns]
0.0
0.5
1.0
1.5
2.0
2.5
3.0Th
eta
[rad]
OOMMFNaive dyn. stepNaive 1ps stepTrapezoidal dyn. step
Naive Euler Naive Euler1ps timestep
Trapezoidal
102
103
104
Solv
er E
vent
s Cou
nt
Solver EventsTheta RMS Error
0.00
0.05
0.10
0.15
0.20
0.25
Solv
er R
MS
Erro
r
Fig. 2. MTJ simulation comparing a naive Euler integration againstSpectre trapezoidal solver. The naive method leads to larger RMSE (vs. theOOMMF reference) even under 1ps timestep with 100× computational load.
Fig. 3. MTJ s-LLGS simulation under no external current/field, consideringthermal Hth effects. The evolution of mφ and mθ over time highlights theasymptote on d
dtmφ at θ = 0. The trapezoidal solver under relaxed tolerance
successfully provides the desired solution while using a dynamic time step.
A. mφ Wrapping
Prior work on MTJ Verilog-A models uses a naive Eulermethod to integrate the magnetization vector [1], [3]
m(t) = m(t− dt) + dmdt. (1)
However, Equation 1 causes mφ to exceed the [0, 2π) physi-cally allowed range, and grow indefinitely over time, leadingto unnatural voltages that eventually prevent simulator conver-gence. Our solution is to use the circular integration idtmodfunction provided by Spectre [8], [9] which wraps mφ at2π and prevents it from increasing indefinitely.
B. mθ Accuracy and Minimum Timestep
Vector mθ represents the binary state of the MTJ databit and is the most critical component to accurately modelMTJ switching. The naive integration of mθ directly encodingEquation 1 in the Verilog-A description incurs a substantialerror during a switching event, as shown in Figure 2. Reducingthe timestep improves the RMS error but at the cost of a largeincrease in computational load. Our solution is to use the idtfunction for the circuit solvers provided by Spectre [9],which adapt the integration based on multiple evaluations ofthe differential terms at different timesteps, leading to betteraccuracy. Figure 2 describes the accuracy RMSE when usinga naive Euler method and the idt trapezoidal solver, whencompared against OOMMF reference. The graph highlights howeven with a smaller timestep involving 100× more computedevents, the naive integration leads to larger errors.
s-LLGS Solver
Anisotropy
- Bulk - Interfacial - Shape
Thermal Field Effects - Hth Field - Fake Hth
- Theta windowing
VCMA
TMR Module
Dynamics s-LLGS Module
Conduction Module
Rmtj
m, HthImtj
Vmtj
m,HVCMA
m, Hani
Rmtj
Temp
Hext
m
Compact Model
Foundry Specific Conduction
Pythonnon-stochastic
Start
OOMMF
Results match?
Adjust tolerancesPython Frozen
Verilog-A OOMMF
Results match?
Stochastic Simulations to regress thermal ci
f
Verilog-a Library
cif
Yes No
Yes
No
Adjust tolerances
Validation/Calibration
Vmtj
Temp
Fig. 4. Proposed MTJ modeling framework. The Conduction and Dynamicsmodules determine the device conduction and the magnetization based on theexternal, device anisotropy, STT and thermal induced fields. The Python andVerilog-A models validate the MTJ behavior against OOMMFC .
C. mφ acceleration
If a larger circuit is to be simulated and fixed timestepsbanned, to avoid instabilities it is essential to manage the toler-ance/timestep scheme [10], especially during the computationof themφ component. The asymptote d
dtmφ →∞ atmθ = 0accelerates the precession mechanism [10]. As described byFigure 3, this requires the solver to accordingly increase itsresolution. A bounded or fixed time step would fail either toprovide sufficient resolution or sufficient performance.
III. PROPOSED FRAMEWORK
In this section we describe the implementation and vali-dation of the proposed MTJ compact model addressing thechallenges described in Section II, emphasizing its dynamicsmodule computing the MTJ magnetization and proposingsolutions to incorporate thermal noise effects.
A. Compact Model Structure
Figure 4 describes the implemented compact model, com-posed of two modules: the Conduction and Dynamics modules.The conduction scheme describing the instantaneous MTJresistance is dependent on the foundry engineered stack. Ourmodular approach allows foundry-specific conduction mech-anisms to complement the basic Tunnel-Magneto-Resistance(TMR) scheme [1]–[3], [11]. The Dynamics module describesthe temporal evolution of the MTJ magnetization m as a mon-odomain nanomagnet influenced by external and anisotropyfields, thermal noise and STT [5], [10], described by thes-LLGS equations [5] in I.S.U.
dm
dt= −γ′m×Heff + αγ′m× dm
dt
+γ′βε(m×mp ×m)− γ′βε′(m×mp)
β = | ~µ0e| I
VMs, ε =
PΛ2
(Λ2 + 1) + (Λ2 − 1)(m ·mp)(2)
where alpha and γ are the Gilbert damping factor andgyromagnetic ratio respectively, related by γ′ = γµ0
1+α2 , P is
0 5 10 15Time [ns]
1.00
0.75
0.50
0.25
0.00
0.25
0.50
0.75
1.00
m_i
m_xOOMMFverilog-apython
0 5 10 15Time [ns]
m_y
0 5 10 15Time [ns]
m_z
Fig. 5. Validation against OOMMF Framework. A cylindrical MTJ with 50nm diameter, 1 nm thickness, Ki = 1e − 3 Jm−2, P = 0.75, α = 0.01,γ = 1.76e11 rad/s/T , Ms = 1.2 Am−1 switches by a current of 35uA.
the polarization factor, Ms is the magnetization saturation, Iis the current flowing through the MTJ volume V , mp is thepinned-layer unitary polarization direction and Λ and ε′ set theprimary and secondary spin transfer terms respectively. The ef-fective magnetic field for a Perpendicular Magnetic Anisotropy(PMA) is defined by the anisotropy field, the external field andthe thermal induced field Heff = Hani+Hext+Hth. Theanisotropy field is composed of the PMA uniaxial term, theshape anisotropy demagnetization field, and the VCMA field[1], [3], becoming the Heff vector
Heff = Hext +Huni +Hdemag −HV CMA +Hth
= Hext + 2Kitflµ0Ms
mz −MsN ·m
− 2ξIRmtjtfltoxµ0Ms
mzz + N (0, 1)√
2KBTαγ′MsV∆t
, (3)
where Ki is the interfacial energy constant, tfl and tox are thefree layer and oxide thicknesses, N is the shape anisotropydemagnetization factor, ξ the VCMA coefficient, KB theBoltzmann constant and N (0, 1) a Gaussian random vectorwith components in x, y, z meeting [10], [12] conditions.
The compact model has been implemented in Python byadapting the Scipy solve_ivp [13] solvers to supportthe Hth simulation as a pure Wiener process. The paral-lel Python engine enables MC and statistical studies. TheVerilog-A implementation uses idt /idtmod integrationschemes with parameterizable integration tolerances.
B. Validation Against OOMMF
The compact model is validated against the NISTOOMMF MicroMagnetic Framework [5], testing our resultsagainst its OOMMFC interface [6]. Following the methodfrom [10], we use SpinXferEvolve to simulate a singlemagnet under an induced spin current. Figure 5 shows thatPython and Verilog-A implementations compare wellwith OOMMF with adaptive tolerance driven computations.
C. Thermal noise and MTJ stochasticity
The random magnetic field Hth caused by thermal fluctu-ations induces stochastic MTJ behaviour resulting in a non-zero Write Error Rate (WER) upon switching events. Hth
follows a Wiener process [10], [12], in which each x, y, zrandom component is independent of each other and previous
0 20 40 60Time [ns]
0
1
2
3
4
5
6
Ang
le [
rad
]
d0
0 20 40 60Time [ns]
0 20 40 60Time [ns]
Curr
ent
[A]
60
50
40
30
20
10
0
d1
Phi
Theta
Currentd2
Fig. 6. Damping effect on mθ under Hth field absence. The absence ofSTT-current (during the d2 delay) collapses mθ preventing MTJ switching.
states. This implies that the computation of Hth requires largeindependent variations between steps, hindering the solver’sattempts to guarantee signal continuity under small tolerances.The scenario shown in Figure 3 leads to computational errorsunder default solver tolerances, and excessive computationalload under 1 ps bounded time steps.
Three solutions have been proposed in the literature. First,to emulate the random field by using an external current orresistor-like noise source [7]. However, SPICE simulatorsimpose a minimum capacitance on these nodes filtering therandomness response, therefore preventing a true Wiener pro-cess from being simulated. Second, to bound a fixed smalltimestep to the solver [1], [2], but as described before this isnot feasible for large circuits. Third, to only consider scenarioswhere the field generated by the writing current is muchlarger than Hth, forcing Hth = 0 [1], [3]. This has strongimplications when moving from single to multiple successiveswitching event simulations. Under no Hth thermal field,the magnetization damping collapses mθ to either 0 or π.s-LLGS dynamics imply that the smaller the mθ the harderit is for the cell to switch, and if completely collapsed, it isimpossible. This artificial effect, depicted in Figure 6, doesnot have an equivalent in reality, as Hth 6= 0 imposes arandom angle. Design, validation and signoff for large memoryblocks with integrated periphery and control circuits requiresthe simulation of sequences of read and write operations, witheach operation showing identical predictable and switchingcharacteristics. However, the damping artifact discussed aboveprevents or slows down subsequent switches after the firstevent, since the subsequent events see an initial θ valuevanishingly close to zero. Two solutions are proposed below.
1) Windowing Function Approach: Our first objective isto provide a mechanism for mθ to saturate naturally to theequilibrium value given by Hth during switching events. Byredefining the evolution of m on its θ component we are ableto saturate its angle at θ0, the second moment of the Maxwell-Boltzmann distribution of mθ under no external field [1]–[3],[7]. The new derivative function d
dtm′θ uses a Tukey window
with the form ddtm
′θ = wtukey(mθ, θ0) ddtmθ where
w(mθ) =
0 if mθ < θ′00.5− cos
2 (4π(mθ−θ′0)
θ′0) if mθ − θ′0 <
θ′04
1 otherwise(4)
Fig. 7. 105 stochastic simulations for the calibration of Hfth.
1T1R internal
Fig. 8. Memory macro simulation showing MTJ switching.
defined for mθ ∈ [0, π2 ] and defined symmetrically (w′(mθ) =w(π−mθ)) for mθ ∈ [π2 , π]. This function slows down d
dtm′θ
when reaching the angle θ′0 = cwθ0, therefore saturating mθ
and avoiding m from collapsing over z. Moreover, by usingcmeanw we are able to define the angle that statistically followsthe mean stochastic MTJ behavior. Similarly, by simply usingthe set of parameters [cworstw , cbestw , cWER0
w , ..., cWERiw ] we can
simulate the worst, best, WERi behaviors, analyzing how agiven circuit instantiating that MTJ device would behave sta-tistically with negligible simulation performance degradation.
2) Emulated Magnetic Term: The windowing function ap-proach prevents artificialmθ saturation at the end of switchingevents, but still does not capture the mean effect ofHth duringswitching or under low-current excitation, such as during readevents. Our next objective is to address this without performinga large ensemble of random transient simulations. The expan-sion of Equation 3 after its expression in spherical coordinatesdescribes mθ evolution as proportional to Heffφ+αHeffθ ,leaving d
dtmθ ' γ′
1+α2Heffφ. We propose to add a ficti-tious Hfthφ term Heff with the purpose of emulating themean/best/worst statistical Hth contribution that generates θ0
[1]–[3], [7]. By defining
Hfth = cf
√2KBTα
γ′MsV∆tφ (5)
we are able to efficiently model – by simply using[cworstf , cbestf , cWER0
f , ..., cWERif ] set of parameters – the sta-
tistical behavior caused by the thermal effects while avoidingtheir inherent simulation disadvantages.
The process of calibrating a particular compact modelinstance is described in Figure 4, and ensures the validation ofPython and Verilog-A instances against OOMMF, beforethe regression of cif or ciw coefficients takes place with a one-off stochastic simulation step as shown in Figure 7.
IV. 1-MB MRAM MACRO BENCHMARK
To validate scalability on a commercial product, the modelis instantiated into the the 64 × 4 memory top block of theextracted netlist from a 1-Mb 28 nm MRAM macro [4], andsimulated with macro-specific tolerance settings. The emulatedmagnetic term from Section III-C2 enables the previouslyimpossible capability of simulating successive writes withidentical transition times due to non-desired over-damping.Figure 8 –resistance/voltage units omitted for confidentiality–describes a writing operation 10 µs after power-on sequence.We combine the s-LLGS OOMMF validated dynamics withfoundry-given thermal/voltage conductance dependence, pro-viding the accurate resistance response over time. Comparedto using fixed resistors, there is an overhead of 3.1× CPU timeand 1.5× RAM usage. In return, circuit designers can observeaccurate transient switching behaviour and read disturbs.
V. CONCLUSIONS
This work presents an MTJ modeling approach for largeVLSI circuits. We analyze MTJ modeling challenges andpropose solutions to accurately capture stochastic thermalnoise effects. Accuracy is validated against OOMMF , andscalability shown by incorporation into a 1-Mb 28 nm MRAMmacro. The framework code is available upon request.
REFERENCES
[1] H. Lee et al., “Analysis and Compact Modeling of Magnetic TunnelJunctions Utilizing Voltage-Controlled Magnetic Anisotropy,” IEEETrans. Magn., vol. 54, no. 4, 2018.
[2] M. M. Torunbalci et al., “Modular Compact Modeling of MTJ Devices,”IEEE Trans. Electron Devices, vol. 65, no. 10, 2018.
[3] K. Zhang et al., “Compact Modeling and Analysis of Voltage-GatedSpin-Orbit Torque MTJ,” IEEE Access, vol. 8, 2020.
[4] E. M. Boujamaa et al., “A 14.7Mb/mm2 28nm FDSOI STT-MRAM withCurrent Starved Read Path, 52Ω/Sigma Offset Voltage Sense Amplifierand Fully Trimmable CTAT Reference,” IEEE Symp. VLSI Circuits, Dig.Tech. Pap., vol. 2020-June, 2020.
[5] M. J. Donahue et al., “OOMMF user’s guide, v1.0,” National Instituteof Standards and Technology, Gaithersburg, MD, Tech. Rep., 1999.
[6] M. Beg et al., “User interfaces for computational science: A domainspecific language for OOMMF embedded in Python,” AIP Adv., vol. 7,no. 5, may 2017.
[7] G. D. Panagopoulos et al., “Physics-based SPICE-compatible compactmodel for simulating hybrid MTJ/CMOS circuits,” IEEE Trans. ElectronDevices, vol. 60, no. 9, 2013.
[8] F. Garcia-Redondo et al., “Building Memristor Applications: FromDevice Model to Circuit Design,” IEEE Trans. Nanotechnol., vol. 13,no. 6, nov 2014.
[9] Cadence, “Spectre Accelerated Parallel Simulator.”[10] S. Ament et al., “Solving the stochastic Landau-Lifshitz-Gilbert-
Slonczewski equation for monodomain nanomagnets : A survey andanalysis of numerical techniques,” 2016.
[11] Y.-J. Tsou et al., “Write Margin Analysis of Spin–Orbit TorqueSwitching Using Field-Assisted Method,” IEEE J. Explor. Solid-StateComput. Devices Circuits, vol. 5, no. 2, dec 2019.
[12] W. F. Brown, “Thermal Fluctuations of a Single-Domain Particle,”Phys. Rev., vol. 130, no. 5, jun 1963.
[13] P. Virtanen et al., “SciPy 1.0: fundamental algorithms for scientificcomputing in Python,” Nat. Methods, vol. 17, no. 3, mar 2020.