A Compact Model for Scalable MTJ Simulation

A Compact Model for Scalable MTJ SimulationFernando Garcıa-RedondoArm Ltd, Cambridge, UK

[email protected]

Pranay PrabhatArm Ltd, Cambridge, [email protected]

Mudit BhargavaArm Inc, Austin, USA

[email protected]

Cyrille DrayArm Ltd, La Paros, France

[email protected]

Abstract—This paper presents a physics-based modelingframework for the analysis and transient simulation of cir-cuits containing Spin-Transfer Torque (STT) Magnetic TunnelJunction (MTJ) devices. The framework provides the toolsto analyze the stochastic behavior of MTJs and to generateVerilog-A compact models for their simulation in large VLSIdesigns, addressing the need for an industry-ready model ac-counting for real-world reliability and scalability requirements.Device dynamics are described by the Landau-Lifshitz-Gilbert-Slonczewsky (s-LLGS ) stochastic magnetization consideringVoltage-Controlled Magnetic Anisotropy (VCMA) and the non-negligible statistical effects caused by thermal noise. Model be-havior is validated against the OOMMF magnetic simulator and itsperformance is characterized on a 1-Mb 28 nm Magnetoresistive-RAM (MRAM) memory product.

Index Terms—STT-MRAM, MTJ, Compact modeling, s-LLGS

I. INTRODUCTION

Recent advances in MTJ devices [1]–[4] open the path tothe next generation of system architectures, from embeddedbattery-less edge devices where off-chip Flash will be replacedby embedded MRAM, to future power-efficient caches in HPCsystems. STT-MRAM is being actively developed by foundriesand integrated into 28 nm generation CMOS Process DesignKits (PDKs) [4].

The complex multi-layered MTJ structures have heavilynon-linear magnetic and electric behaviors that depend onthe device structure, the thermal and external magnetic en-vironment, and the applied electrical stress (Figure 1). On topof this complex system of equations, the evolution of MTJmagnetization m shows a stochastic behavior caused by therandom magnetic field induced by thermal fluctuations.

Many MTJ models have been presented, from micro-magnetic [5], [6] approaches to macro-magnetic SPICE andVerilog-A compact models for circuit simulations [1]–[3],[7]. Compact models present in the literature account fordifferent behavioral aspects: a better temperature dependence[1], a more accurate anisotropy formulation for particular MTJstructures [1]–[3], and the integration of Spin-Orbit Torque(SOT) for three-terminal devices [2], [3]. These prior worksfocus on the ability to model a specific new behavior for thesimulation of a single or a few MTJ devices in a small circuit.In these approaches, the simulation of device stochasticityimposes a high computational load on the simulation, com-plicating the simulation of larger VLSI circuits. To addressthe need to simulate thousands of MTJ devices in a singlememory macro, circuit designers may have to oversimplify

m

mpm (AP)

m (P)

m (AP)

High

m (P)

LowResistance Resistance

mθ

φx

z

y

θ0

Free Layer Stack

Tunnel Junction

Pinned Layer Stack

m

mp

m

mp

Fig. 1. Basic MTJ structure, magnetization vector switching trajectory [10]from Parallel (P) to Anti-Parallel (AP) states, and reference coordinate system.

MTJ switching to a basic interpolated behaviour, and designfor worst-case fixed voltage and current targets. This hindersthe design of optimized circuits.

Therefore, there is a clear need for an accurate and modularcompact model accounting for the magnetization dynamicsof MTJ devices yet scalable enough for the simulation oflarge MRAM memory macros. In this paper we present threekey contributions: Section II presents a study of reliabilityissues in MTJ model simulation. The proposed modelingframework is presented in Section III, including a modularmulti-threaded s-LLGS solver validated against OOMMF [5],an efficient Verilog-A compact model for MRAM macrosimulation and a solution to model thermal noise effects.Finally, in Section IV we present a case study with a 1-Mb28 nm MRAM macro.

II. MTJ COMPUTATIONAL AND CONVERGENCE ISSUES

From early behavioral prototypes to mature, product-readyPDKs, the simulation of emerging devices is a process thatrequires not only research contributions capturing novel devicebehavior [1]–[3], [7], but also optimization stages enablingcircuit design [8]. For STT-MRAM, circuit design needs thecomplex device dynamics to be incorporated into the standardSPICE and Cadence® Spectre [9] solvers. In this sectionwe address the challenges of reliable and efficient simulationof MTJ devices. All results are generated using Spectre17.1.0.583 with default settings unless noted otherwise.

The s-LLGS system resolution can easily lead to artificialoscillations if solved in a Cartesian reference system [1] whenusing the Euler, Gear and Trapezoidal methods provided bycircuit solvers. In [10] it is studied how MTJ devices benefitfrom resolution in spherical coordinates, specially when usingimplicit midpoint and explicit RK45 methods. Additionally,Cartesian methods require [1] a non-physical normalizationstage for the magnetization vector [10]. Some further issuesarising from solving in a spherical reference system aredetailed below.

arX

iv:2

106.

0497

6v1

[cs

.ET

] 9

Jun

202

1

0 5 10 15 20 25 30Time [ns]

0.0

0.5

1.0

1.5

2.0

2.5

3.0Th

eta

[rad]

OOMMFNaive dyn. stepNaive 1ps stepTrapezoidal dyn. step

Naive Euler Naive Euler1ps timestep

Trapezoidal

102

103

104

Solv

er E

vent

s Cou

nt

Solver EventsTheta RMS Error

0.00

0.05

0.10

0.15

0.20

0.25

Solv

er R

MS

Erro

r

Fig. 2. MTJ simulation comparing a naive Euler integration againstSpectre trapezoidal solver. The naive method leads to larger RMSE (vs. theOOMMF reference) even under 1ps timestep with 100× computational load.

Fig. 3. MTJ s-LLGS simulation under no external current/field, consideringthermal Hth effects. The evolution of mφ and mθ over time highlights theasymptote on d

dtmφ at θ = 0. The trapezoidal solver under relaxed tolerance

successfully provides the desired solution while using a dynamic time step.

A. mφ Wrapping

Prior work on MTJ Verilog-A models uses a naive Eulermethod to integrate the magnetization vector [1], [3]

m(t) = m(t− dt) + dmdt. (1)

However, Equation 1 causes mφ to exceed the [0, 2π) physi-cally allowed range, and grow indefinitely over time, leadingto unnatural voltages that eventually prevent simulator conver-gence. Our solution is to use the circular integration idtmodfunction provided by Spectre [8], [9] which wraps mφ at2π and prevents it from increasing indefinitely.

B. mθ Accuracy and Minimum Timestep

Vector mθ represents the binary state of the MTJ databit and is the most critical component to accurately modelMTJ switching. The naive integration of mθ directly encodingEquation 1 in the Verilog-A description incurs a substantialerror during a switching event, as shown in Figure 2. Reducingthe timestep improves the RMS error but at the cost of a largeincrease in computational load. Our solution is to use the idtfunction for the circuit solvers provided by Spectre [9],which adapt the integration based on multiple evaluations ofthe differential terms at different timesteps, leading to betteraccuracy. Figure 2 describes the accuracy RMSE when usinga naive Euler method and the idt trapezoidal solver, whencompared against OOMMF reference. The graph highlights howeven with a smaller timestep involving 100× more computedevents, the naive integration leads to larger errors.

s-LLGS Solver

Anisotropy

- Bulk - Interfacial - Shape

Thermal Field Effects - Hth Field - Fake Hth

- Theta windowing

VCMA

TMR Module

Dynamics s-LLGS Module

Conduction Module

Rmtj

m, HthImtj

Vmtj

m,HVCMA

m, Hani

Rmtj

Temp

Hext

m

Compact Model

Foundry Specific Conduction

Pythonnon-stochastic

Start

OOMMF

Results match?

Adjust tolerancesPython Frozen

Verilog-A OOMMF

Results match?

Stochastic Simulations to regress thermal ci

f

Verilog-a Library

cif

Yes No

Yes

No

Adjust tolerances

Validation/Calibration

Vmtj

Temp

Fig. 4. Proposed MTJ modeling framework. The Conduction and Dynamicsmodules determine the device conduction and the magnetization based on theexternal, device anisotropy, STT and thermal induced fields. The Python andVerilog-A models validate the MTJ behavior against OOMMFC .

C. mφ acceleration

If a larger circuit is to be simulated and fixed timestepsbanned, to avoid instabilities it is essential to manage the toler-ance/timestep scheme [10], especially during the computationof themφ component. The asymptote d

dtmφ →∞ atmθ = 0accelerates the precession mechanism [10]. As described byFigure 3, this requires the solver to accordingly increase itsresolution. A bounded or fixed time step would fail either toprovide sufficient resolution or sufficient performance.

III. PROPOSED FRAMEWORK

In this section we describe the implementation and vali-dation of the proposed MTJ compact model addressing thechallenges described in Section II, emphasizing its dynamicsmodule computing the MTJ magnetization and proposingsolutions to incorporate thermal noise effects.

A. Compact Model Structure

Figure 4 describes the implemented compact model, com-posed of two modules: the Conduction and Dynamics modules.The conduction scheme describing the instantaneous MTJresistance is dependent on the foundry engineered stack. Ourmodular approach allows foundry-specific conduction mech-anisms to complement the basic Tunnel-Magneto-Resistance(TMR) scheme [1]–[3], [11]. The Dynamics module describesthe temporal evolution of the MTJ magnetization m as a mon-odomain nanomagnet influenced by external and anisotropyfields, thermal noise and STT [5], [10], described by thes-LLGS equations [5] in I.S.U.

dm

dt= −γ′m×Heff + αγ′m× dm

dt

+γ′βε(m×mp ×m)− γ′βε′(m×mp)

β = | ~µ0e| I

VMs, ε =

PΛ2

(Λ2 + 1) + (Λ2 − 1)(m ·mp)(2)

where alpha and γ are the Gilbert damping factor andgyromagnetic ratio respectively, related by γ′ = γµ0

1+α2 , P is

0 5 10 15Time [ns]

1.00

0.75

0.50

0.25

0.00

0.25

0.50

0.75

1.00

m_i

m_xOOMMFverilog-apython

0 5 10 15Time [ns]

m_y

0 5 10 15Time [ns]

m_z

Fig. 5. Validation against OOMMF Framework. A cylindrical MTJ with 50nm diameter, 1 nm thickness, Ki = 1e − 3 Jm−2, P = 0.75, α = 0.01,γ = 1.76e11 rad/s/T , Ms = 1.2 Am−1 switches by a current of 35uA.

the polarization factor, Ms is the magnetization saturation, Iis the current flowing through the MTJ volume V , mp is thepinned-layer unitary polarization direction and Λ and ε′ set theprimary and secondary spin transfer terms respectively. The ef-fective magnetic field for a Perpendicular Magnetic Anisotropy(PMA) is defined by the anisotropy field, the external field andthe thermal induced field Heff = Hani+Hext+Hth. Theanisotropy field is composed of the PMA uniaxial term, theshape anisotropy demagnetization field, and the VCMA field[1], [3], becoming the Heff vector

Heff = Hext +Huni +Hdemag −HV CMA +Hth

= Hext + 2Kitflµ0Ms

mz −MsN ·m

− 2ξIRmtjtfltoxµ0Ms

mzz + N (0, 1)√

2KBTαγ′MsV∆t

, (3)

where Ki is the interfacial energy constant, tfl and tox are thefree layer and oxide thicknesses, N is the shape anisotropydemagnetization factor, ξ the VCMA coefficient, KB theBoltzmann constant and N (0, 1) a Gaussian random vectorwith components in x, y, z meeting [10], [12] conditions.

The compact model has been implemented in Python byadapting the Scipy solve_ivp [13] solvers to supportthe Hth simulation as a pure Wiener process. The paral-lel Python engine enables MC and statistical studies. TheVerilog-A implementation uses idt /idtmod integrationschemes with parameterizable integration tolerances.

B. Validation Against OOMMF

The compact model is validated against the NISTOOMMF MicroMagnetic Framework [5], testing our resultsagainst its OOMMFC interface [6]. Following the methodfrom [10], we use SpinXferEvolve to simulate a singlemagnet under an induced spin current. Figure 5 shows thatPython and Verilog-A implementations compare wellwith OOMMF with adaptive tolerance driven computations.

C. Thermal noise and MTJ stochasticity

The random magnetic field Hth caused by thermal fluctu-ations induces stochastic MTJ behaviour resulting in a non-zero Write Error Rate (WER) upon switching events. Hth

follows a Wiener process [10], [12], in which each x, y, zrandom component is independent of each other and previous

0 20 40 60Time [ns]

0

1

2

3

4

5

6

Ang

le [

rad

]

d0

0 20 40 60Time [ns]

0 20 40 60Time [ns]

Curr

ent

[A]

60

50

40

30

20

10

0

d1

Phi

Theta

Currentd2

Fig. 6. Damping effect on mθ under Hth field absence. The absence ofSTT-current (during the d2 delay) collapses mθ preventing MTJ switching.

states. This implies that the computation of Hth requires largeindependent variations between steps, hindering the solver’sattempts to guarantee signal continuity under small tolerances.The scenario shown in Figure 3 leads to computational errorsunder default solver tolerances, and excessive computationalload under 1 ps bounded time steps.

Three solutions have been proposed in the literature. First,to emulate the random field by using an external current orresistor-like noise source [7]. However, SPICE simulatorsimpose a minimum capacitance on these nodes filtering therandomness response, therefore preventing a true Wiener pro-cess from being simulated. Second, to bound a fixed smalltimestep to the solver [1], [2], but as described before this isnot feasible for large circuits. Third, to only consider scenarioswhere the field generated by the writing current is muchlarger than Hth, forcing Hth = 0 [1], [3]. This has strongimplications when moving from single to multiple successiveswitching event simulations. Under no Hth thermal field,the magnetization damping collapses mθ to either 0 or π.s-LLGS dynamics imply that the smaller the mθ the harderit is for the cell to switch, and if completely collapsed, it isimpossible. This artificial effect, depicted in Figure 6, doesnot have an equivalent in reality, as Hth 6= 0 imposes arandom angle. Design, validation and signoff for large memoryblocks with integrated periphery and control circuits requiresthe simulation of sequences of read and write operations, witheach operation showing identical predictable and switchingcharacteristics. However, the damping artifact discussed aboveprevents or slows down subsequent switches after the firstevent, since the subsequent events see an initial θ valuevanishingly close to zero. Two solutions are proposed below.

1) Windowing Function Approach: Our first objective isto provide a mechanism for mθ to saturate naturally to theequilibrium value given by Hth during switching events. Byredefining the evolution of m on its θ component we are ableto saturate its angle at θ0, the second moment of the Maxwell-Boltzmann distribution of mθ under no external field [1]–[3],[7]. The new derivative function d

dtm′θ uses a Tukey window

with the form ddtm

′θ = wtukey(mθ, θ0) ddtmθ where

w(mθ) =

0 if mθ < θ′00.5− cos

2 (4π(mθ−θ′0)

θ′0) if mθ − θ′0 <

θ′04

1 otherwise(4)

Fig. 7. 105 stochastic simulations for the calibration of Hfth.

1T1R internal

Fig. 8. Memory macro simulation showing MTJ switching.

defined for mθ ∈ [0, π2 ] and defined symmetrically (w′(mθ) =w(π−mθ)) for mθ ∈ [π2 , π]. This function slows down d

dtm′θ

when reaching the angle θ′0 = cwθ0, therefore saturating mθ

and avoiding m from collapsing over z. Moreover, by usingcmeanw we are able to define the angle that statistically followsthe mean stochastic MTJ behavior. Similarly, by simply usingthe set of parameters [cworstw , cbestw , cWER0

w , ..., cWERiw ] we can

simulate the worst, best, WERi behaviors, analyzing how agiven circuit instantiating that MTJ device would behave sta-tistically with negligible simulation performance degradation.

2) Emulated Magnetic Term: The windowing function ap-proach prevents artificialmθ saturation at the end of switchingevents, but still does not capture the mean effect ofHth duringswitching or under low-current excitation, such as during readevents. Our next objective is to address this without performinga large ensemble of random transient simulations. The expan-sion of Equation 3 after its expression in spherical coordinatesdescribes mθ evolution as proportional to Heffφ+αHeffθ ,leaving d

dtmθ ' γ′

1+α2Heffφ. We propose to add a ficti-tious Hfthφ term Heff with the purpose of emulating themean/best/worst statistical Hth contribution that generates θ0

[1]–[3], [7]. By defining

Hfth = cf

√2KBTα

γ′MsV∆tφ (5)

we are able to efficiently model – by simply using[cworstf , cbestf , cWER0

f , ..., cWERif ] set of parameters – the sta-

tistical behavior caused by the thermal effects while avoidingtheir inherent simulation disadvantages.

The process of calibrating a particular compact modelinstance is described in Figure 4, and ensures the validation ofPython and Verilog-A instances against OOMMF, beforethe regression of cif or ciw coefficients takes place with a one-off stochastic simulation step as shown in Figure 7.

IV. 1-MB MRAM MACRO BENCHMARK

To validate scalability on a commercial product, the modelis instantiated into the the 64 × 4 memory top block of theextracted netlist from a 1-Mb 28 nm MRAM macro [4], andsimulated with macro-specific tolerance settings. The emulatedmagnetic term from Section III-C2 enables the previouslyimpossible capability of simulating successive writes withidentical transition times due to non-desired over-damping.Figure 8 –resistance/voltage units omitted for confidentiality–describes a writing operation 10 µs after power-on sequence.We combine the s-LLGS OOMMF validated dynamics withfoundry-given thermal/voltage conductance dependence, pro-viding the accurate resistance response over time. Comparedto using fixed resistors, there is an overhead of 3.1× CPU timeand 1.5× RAM usage. In return, circuit designers can observeaccurate transient switching behaviour and read disturbs.

V. CONCLUSIONS

This work presents an MTJ modeling approach for largeVLSI circuits. We analyze MTJ modeling challenges andpropose solutions to accurately capture stochastic thermalnoise effects. Accuracy is validated against OOMMF , andscalability shown by incorporation into a 1-Mb 28 nm MRAMmacro. The framework code is available upon request.

REFERENCES

[1] H. Lee et al., “Analysis and Compact Modeling of Magnetic TunnelJunctions Utilizing Voltage-Controlled Magnetic Anisotropy,” IEEETrans. Magn., vol. 54, no. 4, 2018.

[2] M. M. Torunbalci et al., “Modular Compact Modeling of MTJ Devices,”IEEE Trans. Electron Devices, vol. 65, no. 10, 2018.

[3] K. Zhang et al., “Compact Modeling and Analysis of Voltage-GatedSpin-Orbit Torque MTJ,” IEEE Access, vol. 8, 2020.

[4] E. M. Boujamaa et al., “A 14.7Mb/mm2 28nm FDSOI STT-MRAM withCurrent Starved Read Path, 52Ω/Sigma Offset Voltage Sense Amplifierand Fully Trimmable CTAT Reference,” IEEE Symp. VLSI Circuits, Dig.Tech. Pap., vol. 2020-June, 2020.

[5] M. J. Donahue et al., “OOMMF user’s guide, v1.0,” National Instituteof Standards and Technology, Gaithersburg, MD, Tech. Rep., 1999.

[6] M. Beg et al., “User interfaces for computational science: A domainspecific language for OOMMF embedded in Python,” AIP Adv., vol. 7,no. 5, may 2017.

[7] G. D. Panagopoulos et al., “Physics-based SPICE-compatible compactmodel for simulating hybrid MTJ/CMOS circuits,” IEEE Trans. ElectronDevices, vol. 60, no. 9, 2013.

[8] F. Garcia-Redondo et al., “Building Memristor Applications: FromDevice Model to Circuit Design,” IEEE Trans. Nanotechnol., vol. 13,no. 6, nov 2014.

[9] Cadence, “Spectre Accelerated Parallel Simulator.”[10] S. Ament et al., “Solving the stochastic Landau-Lifshitz-Gilbert-

Slonczewski equation for monodomain nanomagnets : A survey andanalysis of numerical techniques,” 2016.

[11] Y.-J. Tsou et al., “Write Margin Analysis of Spin–Orbit TorqueSwitching Using Field-Assisted Method,” IEEE J. Explor. Solid-StateComput. Devices Circuits, vol. 5, no. 2, dec 2019.

[12] W. F. Brown, “Thermal Fluctuations of a Single-Domain Particle,”Phys. Rev., vol. 130, no. 5, jun 1963.

[13] P. Virtanen et al., “SciPy 1.0: fundamental algorithms for scientificcomputing in Python,” Nat. Methods, vol. 17, no. 3, mar 2020.

Documents

A Compact Model for Scalable MTJ Simulation