PMC-3D: A Parallelized 3D Monte Carlo Simulator for Electronic and
Electro-optic Devices
S. Pennathur, U.A. Ranawake, V.K. Tripathi, Patrick Lenders and
Stephen M. Goodnick
Department of Electrical and Computer Engineering Oregon State
University Corvallis, OR 97331
Abstract

A parallel Monte Carlo device simulator, PMC-3D, has been developed
for multiprocessors. Through the use of parallel architectures, full
three-dimensional modeling of the device domain is possible. Here a
discussion of the parallel algorithm is given for coupling the Monte
Carlo particle simulation with Poisson's equation for quasi-static
problems, and with the full Maxwell's equations for electro-optic
devices.
I. Introduction
PMC-3D is a parallel three-dimensional (3-D) Monte Carlo device
simulator written for multiprocessors [1][2]. The parallel algorithm
is an extension of the standard Monte Carlo device simulation model
in 3-D, in which the particle dynamics generated by the stochastic
Monte Carlo method are solved simultaneously with the appropriate
set of field equations on a 3-D mesh using finite differences. For
quasi-static problems, such as the behavior of a three-terminal
MESFET device, Poisson's equation is solved for the potential, and
the corresponding electric field is used to drive the particle
dynamics. Both the Poisson solver and the Monte Carlo phase are
parallelized due to the large computational requirements of solving
Poisson's equation on a 3-D grid. The implementation of the Poisson
solver is based on an iterative method that uses an odd/even
ordering with Chebyshev acceleration. The code was developed both
for a distributed memory 1024-node nCUBE multicomputer and a 4-node
shared memory Ardent Titan multiprocessor.
II. Parallel Algorithm
In the distributed memory implementation, the spatial domain of the
device is subdivided among separate processors according to a
recursive bisection algorithm. Figure 1 shows the decomposition of
the MESFET problem for a simple four-processor case. The particles
and mesh associated with each subregion are mapped onto a separate
processor. The subdomain sizes are chosen to roughly balance the
number of particles per processor, and adjust throughout the
simulation to maintain a balanced processor load. The code is
written in C, and contains compiler directives to distribute
portions of the code across the various processors.
Figure 1: Mapping onto a 4-processor hypercube and external
interaction region

The parallel-to-sequential speedup was characterized in several
ways, as shown in Fig. 2. The speedups for the fixed-size and
scaled-size problems were measured for both the Poisson phase and
the Monte Carlo phase of the code on a 1024-node nCUBE hypercube
with dynamical load balancing turned off. For the fully scaled
problem, an efficiency of 69.8% was obtained for 512 processors
compared to the ideal (Amdahl's law) speedup. An additional
increase of 8% was obtained using dynamical load balancing. In the
fully scaled case, several million grid points and several million
particles were simulated.
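As a consistency check using only the numbers quoted above, an
efficiency of 69.8% on P = 512 processors corresponds to a measured
speedup of

    S = eta * P = 0.698 x 512 ~ 357

relative to the ideal linear speedup of 512.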
III. Electromagnetic Modeling
For high-frequency operation or for electro-optic devices,
Poisson's equation is insufficient to correctly characterize the
dynamical behavior of the system, and thus PMC-3D has been extended
to include full electromagnetic solutions of Maxwell's equations
coupled with the particle transport models. The Lorentz force
equation, which determines the carrier momentum between collisions,
includes the effects of the magnetic and electric fields obtained
from the time-dependent solution of Maxwell's equations, and is
given by
F = h-bar dk/dt = q(E + v x B)
The standard finite-difference/time-domain (FD/TD) formulation of
the field equations is used to compute the time-evolving quantities
E, H, and J, with rho(x, y, z, t) being the source term
corresponding to the charges present, as described by Connolly et
al. [3]. The algorithm involves two grids in space (for the E and H
fields) displaced by half mesh increments in the three spatial
directions. Every component of E (or H) is computed at a given time
using the four adjacent values of H (or E) that contribute to the
loop integral resulting from Maxwell's curl equations. In addition,
the E and H fields are computed at time instants differing by a
half time step, amounting to a leapfrog method of computing the
time evolution of the field quantities.
Figure 2: Parallel to sequential speedups of PMC-3D (fixed grid and
fixed particles; fixed grid and scaled particles; scaled grid and
scaled particles; scaled grid and fixed particles; ideal)
The current density term J is central to the feedback between the
Monte Carlo part and the field computations. It is computed once
every timestep at every grid point by summing the weighted
velocities of the particles in the ensemble that lie within a unit
cell volume about the grid point, and is given as

J = (1 / dV) sum_n S_n v_n

where S_n and v_n refer respectively to the charge and velocity of
the nth particle associated with the grid point, and dV is the unit
cell volume [3].
The electric fields are initialized to the values obtained from a
one-time solution of Poisson's equation, corresponding to the
applied DC bias voltages on the strip line, while the magnetic
fields are assumed to be zero. Dirichlet boundary conditions are
imposed on the metal-semiconductor boundaries as well as on all the
conducting walls. Some preliminary results have also been obtained
by imposing Neumann boundary conditions at the semiconductor-air
interfaces.
The FD/TD algorithm essentially requires the solution of an
identical set of difference equations at every grid point and hence
is ideally suited for parallel implementation. In addition to
updating the electric and magnetic fields at the grid points
belonging to the subgrid mapped to its local memory, every
processor also needs to communicate with its neighboring processors
via message passing to obtain the electric or magnetic field values
required to compute the fields along the boundaries of its subgrid.
The program flow during a typical timestep in each processor starts
off with the
computation of the electric fields at time t_n (after computing the
current density vector J at each grid point at time t_n), followed
by evolution of the particles in momentum and real space for half a
timestep. After computing the magnetic fields at time t_n + dt/2
using the electric field values at t_n, the ensemble is allowed to
evolve for another half timestep, thereby completing one timestep
of Monte Carlo particle evolution.
One such structure that was modeled is the photoconductive switch
[4][5] shown in Fig. 3.

Figure 3: Experimental configuration of photoconductive switching

As shown there, a femtosecond laser pulse is used to generate
electron-hole pairs between two DC-biased coplanar striplines on a
GaAs bulk or
superlattice substrate. The transient dynamical response of the
electrons and holes as they accelerate in opposite directions
induces a subpicosecond time scale transient electrical pulse in
the coplanar waveguide, which propagates down the stripline and is
subsequently detected at a different point. Through this technique,
ultra-short time scale electrical pulses may be generated and
detected optically. The gap separating the microstrip lines is 10
um, while the strips are 2 um wide. The GaAs semiconducting layer
is 1.2 um thick, and 15 um long along the z-direction. A 20
femtosecond laser pulse with a photon energy of 1.55 eV is assumed,
with a spatial width of 2 um.
Figure 4 shows the simulated particle current for a fairly low
(peak) injection density of 1 x 10^15 cm^-3. The results obtained
using FD/TD solutions of Maxwell's equations, as well as the case
in which only Poisson's equation was solved to update the fields,
are shown for comparison. There is reasonable qualitative agreement
at low density between the quasi-static solution and the more
complete model, in which velocity overshoot is directly observed.
As the density is increased, the perturbation of the DC electric
fields due to the higher density of moving charges results in
significant modification of the time-dependent magnetic and
electric fields. At very high densities, velocity overshoot is
barely observed, with additional effects due to reflections of the
Figure 4: Simulated particle current for an injection density of
1 x 10^15 cm^-3 (Poisson's equation vs. Maxwell's equations)
electromagnetic fields from the boundary walls coming into play.
The effects of absorbing versus reflecting boundary conditions are
currently being compared.
Acknowledgment
Computer time provided by Sandia National Laboratories and partial
support of NSF contract ECS-8821107 are gratefully
acknowledged.
References
[1] U.A. Ranawake, P. Lenders and S.M. Goodnick, "Parallelization
of Monte Carlo Algorithms in Semiconductor Device Physics on
Hypercube Multiprocessors," Computational Electronics, Kluwer
Academic Publishers, pp. 137-141, 1991.

[2] U.A. Ranawake, C. Huster, P. Lenders and S.M. Goodnick, "The
3-D Monte Carlo Simulation of a Semiconductor Device on a Hypercube
Multicomputer," in Proceedings of the Workshop on Computational
Electronics, Urbana, Illinois, May 27-28, 1992, pp. 193-196.

[3] K.M. Connolly, R.O. Grondin, R.P. Joshi, and S.M. El-Ghazaly,
"Numerical Modeling of Ultrafast Electrical Waveform Generation and
Characterization," in Proceedings of the Microwave Symposium, 1990.

[4] W.H. Knox, J.E. Henry, K.W. Goossen, K.D. Li, B. Tell, D.A.B.
Miller, D.S. Chemla, A.C. Gossard, J. English, and S. Schmitt-Rink,
"Femtosecond Excitonic Optoelectronics," IEEE J. Quantum Electron.,
Vol. 25, No. 12, pp. 2586-2595, 1989.

[5] W.H. Knox, G.E. Doran, J. Cunningham, and S.M. Goodnick,
"Femtosecond Spatio-Temporal Field Measurements in GaAs Quantum
Wells," in Proceedings of the Topical Meeting on Picosecond
Electronics and Optoelectronics, Salt Lake City, March 1991
(invited).