PMC-3D: A Parallelized 3D Monte Carlo Simulator for Electronic and
Electro-optic Devices
S. Pennathur, U.A. Ranawake, V.K. Tripathi, Patrick Lenders and
Stephen M. Goodnick
Department of Electrical and Computer Engineering Oregon State
University Corvallis, OR 97331
Abstract

A parallel Monte Carlo device simulator, PMC-3D, has been developed
for multiprocessors. Through the use of parallel architectures, full
three-dimensional modeling of the device domain is possible. Here a
discussion of the parallel algorithm is given for coupling the Monte
Carlo particle simulation with Poisson's equation for quasi-static
problems, and with the full Maxwell's equations for electro-optic
devices.
I. Introduction
PMC-3D is a parallel three-dimensional (3-D) Monte Carlo device
simulator written for multiprocessors [1][2]. The parallel algorithm
is an extension of the standard Monte Carlo device simulation model
in 3-D, in which the particle dynamics generated by the stochastic
Monte Carlo method are solved simultaneously with the appropriate
set of field equations on a 3-D mesh using finite differences. For
quasi-static problems, such as the behavior of a three-terminal
MESFET device, Poisson's equation is solved for the potential, and
the corresponding electric field is used to drive the particle
dynamics. Both the Poisson solver and the Monte Carlo phase are
parallelized due to the large computational requirements of solving
Poisson's equation on a 3-D grid. The implementation of the Poisson
solver is based on an iterative method that uses an odd/even
ordering with Chebyshev acceleration. The code was developed both
for a distributed memory 1024-node nCUBE multicomputer and a 4-node
shared memory Ardent Titan multiprocessor.
II. Parallel Algorithm
In the distributed memory implementation, the spatial domain of the
device is subdivided among separate processors according to a
recursive bisection algorithm. Figure 1 shows the decomposition of
the MESFET problem for a simple four-processor case. The particles
and mesh associated with each subregion are mapped onto a separate
processor. The subdomain sizes are chosen to roughly balance the
number of particles per processor, and adjust throughout the
simulation to maintain a balanced processor load. The code is
written in C, and contains compiler directives to distribute
portions of the code across the various processors.
Figure 1: Mapping onto a 4-processor hypercube and external
interaction region

The parallel-to-sequential speedup was characterized in several
ways, as shown in Fig. 2. The speedups for the fixed-size and
scaled-size problems were measured for both the Poisson phase and
the Monte Carlo phase of the code on a 1024-node nCUBE hypercube
with dynamical load balancing turned off. For the fully scaled
problem, an efficiency of 69.8% was obtained for 512 processors
compared to the ideal (Amdahl's law) speedup. An additional
increase of 8% was obtained using dynamical load balancing. In the
fully scaled case, several million grid points and several million
particles were simulated.
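As a consistency check using only the numbers quoted above, an
efficiency of 69.8% on P = 512 processors corresponds to a measured
speedup of

    S = eta * P = 0.698 x 512 ~ 357

relative to the ideal linear speedup of 512.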
III. Electromagnetic Modeling
For high-frequency operation or for electro-optic devices,
Poisson's equation is insufficient to correctly characterize the
dynamical behavior of the system, and thus PMC-3D has been extended
to include full electromagnetic solutions of Maxwell's equations
coupled with the particle transport models. The Lorentz force
equation, which determines the carrier momentum between collisions,
includes the effects of the magnetic and electric fields obtained
from the time-dependent solution of Maxwell's equations, and is
given by
F = h-bar dk/dt = q(E + v x B)
The standard finite-difference/time-domain (FD/TD) formulation of
the field equations is used to compute the time-evolving quantities
E, H, and J, with rho(x, y, z, t) being the source term
corresponding to the charges present, as described by Connolly et
al. [3]. The algorithm involves two grids in space (for the E and H
fields) displaced by half mesh increments in the three spatial
directions. Every component of E (or H) is computed at a given time
using the four adjacent values of H (or E) that contribute to the
loop integral resulting from Maxwell's curl equations. In addition,
the E and H fields are computed at time instants differing by a
half time step, amounting to a leapfrog method of computing the
time evolution of the field quantities.
Figure 2: Parallel to sequential speedups of PMC-3D (fixed grid and
fixed particles; fixed grid and scaled particles; scaled grid and
scaled particles; scaled grid and fixed particles; ideal)
The current density term J is central to the feedback between the
Monte Carlo part and the field computations. It is computed once
every timestep at every grid point by summing the weighted
velocities of the particles in the ensemble that lie within a unit
cell volume about the grid point, and is given as

J = (1 / dV) sum_n S_n v_n

where S_n and v_n refer respectively to the charge and velocity of
the nth particle associated with the grid point, and dV is the unit
cell volume [3].
The electric fields are initialized to the values obtained from a
one-time solution of Poisson's equation, corresponding to the
applied DC bias voltages on the strip line, while the magnetic
fields are assumed to be zero. Dirichlet boundary conditions are
imposed on the metal-semiconductor boundaries as well as on all the
conducting walls. Some preliminary results have also been obtained
by imposing Neumann boundary conditions at the semiconductor-air
interfaces.
The FD/TD algorithm essentially requires the solution of an
identical set of difference equations at every grid point and hence
is ideally suited for parallel implementation. In addition to
updating the electric and magnetic fields at the grid points
belonging to the subgrid mapped to its local memory, every
processor also needs to communicate with its neighboring processors
via message passing to obtain the electric or magnetic field values
required to compute the fields along the boundaries of its subgrid.
The program flow during a typical timestep in each processor starts
off with the
computation of the electric fields at time t_n (after computing the
current density vector J at each grid point at time t_n), followed
by evolution of the particles in momentum and real space for half a
timestep. After computing the magnetic fields at time t_n + dt/2
using the electric field values at t_n, the ensemble is allowed to
evolve for another half timestep, thereby completing one timestep
of Monte Carlo particle evolution.
One such structure that was modeled is the photoconductive switch
[4][5] shown in Fig. 3.

Figure 3: Experimental configuration of photoconductive switching

As shown there, a femtosecond laser pulse is used to generate
electron-hole pairs between two DC-biased coplanar striplines on a
GaAs bulk or
superlattice substrate. The transient dynamical response of the
electrons and holes as they accelerate in opposite directions
induces a subpicosecond time scale transient electrical pulse in
the coplanar waveguide, which propagates down the stripline and is
subsequently detected at a different point. Through this technique,
ultra-short time scale electrical pulses may be generated and
detected optically. The gap separating the microstrip lines is 10
um, while the strips are 2 um wide. The GaAs semiconducting layer
is 1.2 um thick, and 15 um long along the z-direction. A 20
femtosecond laser pulse with a photon energy of 1.55 eV is assumed,
with a spatial width of 2 um.
Figure 4 shows the simulated particle current for a fairly low
(peak) injection density of 1 x 10^15 cm^-3. The results obtained
using FD/TD solutions of Maxwell's equations, as well as the case
in which only Poisson's equation was solved to update the fields,
are shown for comparison. There is reasonable qualitative agreement
at low density between the quasi-static solution and the more
complete model, in which velocity overshoot is directly observed.
As the density is increased, the perturbation of the DC electric
fields due to the higher density of moving charges results in
significant modification of the time-dependent magnetic and
electric fields. At very high densities, velocity overshoot is
barely observed, with additional effects due to reflections of the
Figure 4: Simulated particle current for an injection density of
1 x 10^15 cm^-3 (Poisson's equation vs. Maxwell's equations)
electromagnetic fields from the boundary walls coming into play.
The effects of absorbing versus reflecting boundary conditions are
currently being compared.
Acknowledgment
Computer time provided by Sandia National Laboratories and partial
support of NSF contract ECS-8821107 are gratefully
acknowledged.
References
[1] U.A. Ranawake, P. Lenders and S.M. Goodnick, "Parallelization
of Monte Carlo Algorithms in Semiconductor Device Physics on
Hypercube Multiprocessors," Computational Electronics, Kluwer
Academic Publishers, pp. 137-141, 1991.

[2] U.A. Ranawake, C. Huster, P. Lenders and S.M. Goodnick, "The
3-D Monte Carlo Simulation of a Semiconductor Device on a Hypercube
Multicomputer," in Proceedings of the Workshop on Computational
Electronics, Urbana, Illinois, May 27-28, 1992, pp. 193-196.

[3] K.M. Connolly, R.O. Grondin, R.P. Joshi, and S.M. El-Ghazaly,
"Numerical Modeling of Ultrafast Electrical Waveform Generation and
Characterization," in Proceedings of the Microwave Symposium, 1990.

[4] W.H. Knox, J.E. Henry, K.W. Goossen, K.D. Li, B. Tell, D.A.B.
Miller, D.S. Chemla, A.C. Gossard, J. English, and S. Schmitt-Rink,
"Femtosecond Excitonic Optoelectronics," IEEE J. Quantum Electron.,
Vol. 25, No. 12, pp. 2586-2595, 1989.

[5] W.H. Knox, G.E. Doran, J. Cunningham, and S.M. Goodnick,
"Femtosecond Spatio-Temporal Field Measurements in GaAs Quantum
Wells," in Proceedings of the Topical Meeting on Picosecond
Electronics and Optoelectronics, Salt Lake City, March 1991
(invited).