Electrostatic Lens Systems - Dipartimento di Matematica e ...gavioli/corsi/MSFM/ref/Analizz... · 4.3 The calculation of ray paths 65 4.3.1 Ray tracing in a known potential 65 4.3.2

Electrostatic Lens SystemsSecond Edition

Electrostatic Lens SystemsSecond Edition

D W O HeddleEmeritus Professor of PhysicsRoyal Holloway and Bedford New CollegeUniversity of London

Institute of Physics PublishingBristol and Philadelphia

c© IOP Publishing Ltd 2000

All rights reserved. No part of this publication may be reproduced, storedin a retrieval system or transmitted in any form or by any means, electronic,mechanical, photocopying, recording or otherwise, without the prior permissionof the publisher. Multiple copying is permitted in accordance with the terms oflicences issued by the Copyright LicensingAgency under the terms of its agreementwith the Committee of Vice-Chancellors and Principals.

British Library Cataloguing-in-Publication Data

A catalogue record for this book is available from the British Library.

ISBN 0 7503 0697 1

Library of Congress Cataloging-in-Publication Data are available

Publisher: Nicki DennisCommissioning Editor: Jim RevillProduction Editor: Simon LaurensonProduction Control: Sarah PlentyCover Design: Victoria Le BillonMarketing Executive: Colin Fenton

Published by Institute of Physics Publishing, wholly owned by The Institute ofPhysics, London

Institute of Physics Publishing, Dirac House, Temple Back, Bristol BS1 6BE, UK

US Office: Institute of Physics Publishing, The Public Ledger Building, Suite1035, 150 South Independence Mall West, Philadelphia, PA 19106, USA

Typeset in TEX by Digital by Design, CheltenhamPrinted in the UK by Bookcraft, Midsomer Norton, Somerset

Contents

Preface ix

1 The optics of simple lenses i

1.1 Analogies between particle and photon optics 11.2 The cardinal points of a lens 4

1.2.1 The construction of an image 51.2.2 The general ray 81.2.3 The nodal points 9

1.3 Matrix representation of lens parameters 10

2 The motion of charged particles in an electrostatic field 13

2.1 The equation of paraxial motion 132.2 Some general results 15

2.2.1 The nodal and principal points 172.2.2 Relativistic effects 18

3 The determination of the axial potential 20

3.1 Analytic expressions 213.1.1 Approximations 233.1.2 The value of ω 25

3.2 The boundary element method 263.3 The finite element method 29

3.3.1 The basic equations 303.4 The finite difference method: relaxation 32

3.4.1 A two dimensional example 333.4.2 Cylindrical symmetry 383.4.3 Target conditions 403.4.4 Precision and accuracy 43

3.5 The nine point method applied to cylindrical geometry 463.5.1 The nine point formula for a general point 463.5.2 The nine point formula for an axial point 48

v

vi Contents

3.5.3 The convergence of the potential in the lens geometry 493.5.4 Comparisons of the five and nine point methods 50

3.6 Bessel function expansions 523.6.1 The variational principle 543.6.2 The problem of the non-linear potential in the gap 563.6.3 Calculation of the potential 57

4 The optics of simple lens systems 61

4.1 Aperture lenses 614.1.1 Three-aperture lenses 62

4.2 Cylinder lenses 644.3 The calculation of ray paths 65

4.3.1 Ray tracing in a known potential 654.3.2 Integration of the Picht equation 69

4.4 Real and asymptotic cardinal points 704.4.1 The determination of the cardinal points by ray tracing 71

4.5 Windows and pupils 714.5.1 The energy-add lens 73

4.6 Multi-element lenses 734.6.1 Zoom lenses 734.6.2 The afocal lens 754.6.3 The zoom afocal lens 77

5 Aberrations 80

5.1 Spherical aberration 805.1.1 The axial displacement 825.1.2 The disc of minimum confusion 83

5.2 Chromatic aberration 845.3 Off-axis aberrations 865.4 Interrelations of the spherical aberration coefficients 87

5.4.1 The principal surfaces 885.5 The determination of aberration coefficients 89

5.5.1 Direct ray tracing 895.5.2 Perturbation of the paraxial solution 92

5.6 The aberrations of retarding lenses 985.6.1 Spherical aberration 985.6.2 Fifth order effects 995.6.3 Chromatic aberration 1025.6.4 Figures of merit 103

6 The LENSYS program 105

6.1 The programs on the disk 1056.1.1 Loading and running the programs 106

Contents vii

6.2 The screen display 1066.2.1 The menu system 106

6.3 The Set Options option 1076.4 The Lens Data option 107

6.4.1 Error conditions 1086.5 The Imaging option 109

6.5.1 Types of solution 1096.5.2 Aperture lenses 1116.5.3 Cylinder lenses 1116.5.4 The Afocal and Varimag lenses 1126.5.5 The special lenses 114

6.6 The Print/Save option 1156.6.1 The Save to Files option 1156.6.2 The printing options 1166.6.3 Printing under Windows 117

6.7 The auxiliary programs 1176.7.1 The RELAX51A.EXE and RELAX51P.EXE programs 1176.7.2 The RELAXCYL.EXE program 118

Appendix: Technical aspects of LENSYS 120

A.1 The programming language 120A.2 Initialization 120

A.2.1 The effect of processor type on the speed of the program 120A.3 Calculation of potentials 121

A.3.1 Aperture lenses 121A.3.2 Cylinder lenses 121A.3.3 Details of the calculations 121

A.4 Ray tracing for focal data 124A.4.1 Accuracy and consistency 125

A.5 Solving the imaging problem 125A.5.1 Finding the image 125A.5.2 Solving the lens 126A.5.3 The afocal lens 126A.5.4 The Varimag lens 127

References 128

Preface

Interest in electrostatic lenses for the control of ion and (especially) electronbeams is of long standing and has grown considerably in the past few decades. Inaddition, recent innovations in the production of low energy positrons have openeda whole new field of research for which electrostatic lenses are required. Thisbook makes no attempt to be a comprehensive text on electron and ion optics, butaddresses the need for data on simple lenses consisting of two or three apertures orcylinders, together with some more complex cylinder lenses which have interestingproperties. There are a number of quite detailed and extensive texts, of which tworecent books, by Hawkes and Kasper [1] and by Szilagyi [2], contain moderntreatments of the subject and are well worth consulting. The book by Harting andRead [3] which has objectives similar to the present volume has, unfortunately,been out of print for some years.

The book itself forms an introductory text on electrostatic optics and isaccompanied by a disk containing a program for the IBM personal computer. Thisprogram, which is described in detail in chapter 6 and an appendix, will calculatethe focal and aberration properties of a range of lenses and will also allow thestudy of the imaging behaviour between conjugate planes. Ray paths are traced inthe paraxial approximation. A graphical display (Hercules, CGA, EGA or VGA)is required to run the program, and a maths coprocessor is very desirable. Onlynon-relativistic energies are considered and the electrode geometries are all verysimple and easily manufactured.

The text bears some resemblance to courses I have given to first yearpostgraduate students over a number of years. The LENSYS program, on the otherhand, is the result of a pressing need for lens data in a readily usable form forresearch purposes, and includes a number of lenses which have been found to beparticularly useful for the control of electron and positron beams.

It is a pleasure to acknowledge the contribution made by two former researchstudents, Dr Roger Cook and Dr Tony Renau, to the development of the Besselfunction expansion method and its application to reveal hitherto unsuspectedinterrelationships between the aberration coefficients. I am particularly indebtedto my colleague, Dr Susan Kay, who has worked with the LENSYS program from its

ix

x Preface

earliest manifestation as a number of separate routines. Her suggestions of materialto include or omit, and her assistance in developing a reasonably ‘user-friendly’interface, have been invaluable.

The principal changes in this new edition are in chapters 3 and 5 though chapter 6and the appendix have also been substantially revised to reflect the changes in theLENSYS program itself. Chapter 3 now includes an account of the finite elementmethod for the solution of the potential problem as well as the boundary element,finite difference and Bessel function expansion methods, which have all beenexpanded. There is a particular emphasis on the FDM, particularly in the ninepoint form, because it is the easiest of the methods for the reader to develop fromscratch. The full equations are derived and, with material in the appendix, allowefficient, convergent programs to be written.

While LENSYS remains a DOS program, the rise in the Windows operatingsystem and the associated great increases in computer memory since the firstedition have been taken into account.

The LENSYS program has been expanded to include the calculation of thecoefficients of chromatic aberration and the Petzval integral. The solution ofthe imaging problem has been extended and now, as well as the ability to find afocusing voltage appropriate to a specified object/image pair, the image positioncan be found for a specified set of potentials even if it lies outside the displayedscreen. Data calculated during the ray tracing process may now be saved to filesfor further study and screen copies in colour may now be made via the Windowsclipboard.

D W O Heddle6 July 2000

1 The optics of simple lenses

1.1 Analogies between particle and photon optics

The actions of lenses and mirrors in controlling beams of light are matters ofeveryday experience and it may be helpful to look at some systems which havesimilar behaviour in photon and particle optics. The simplest, shown in figure 1.1,is the plane boundary separating two regions which differ in some property. In thecase of photon optics the important property is the refractive index.

Figure 1.1 (a) Refraction of light at a plane boundary between twomedia having refractive indices n1 and n2; and (b) deviation of a beamof charged particles at a plane boundary separating regions havingpotentials V1 and V2.

The path of a ray of light incident non-normally onto the boundary is changed oncrossing the boundary, the directions in the two regions being related by Snell’s law.The analogue in particle optics is a boundary separating two regions at differentelectrostatic potentials. Here we consider a particle having a velocity v1 in thefirst region and v2 in the second. These velocities are related to the potentials, V1

1

2 Electrostatic Lens Systems

and V2 in the two regions by

12mv2

1 + eV1 = 12mv2

2 + eV2 = 0 (1.1)

where e is the particle charge. This expression, which is fundamental to the simpleanalysis of electrostatic lenses, defines the zero of potential as that for which theparticle is at rest. The potential changes abruptly at the boundary, but there is nochange parallel to the boundary and so no force parallel to the boundary acts onthe particle. The component of momentum parallel to the boundary is thereforeunchanged and we can write

mv1 sin α1 = mv2 sin α2 (1.2)

where α1 and α2 are the angles between the normal to the boundary and thepath of the particle in the two regions. From equations (1.1) and (1.2) we have(sin α1)/(sin α2) = (V2/V1)

1/2 which is exactly Snell’s law with V 1/2 playing therole of the refractive index.

That example was rather artificial, because abrupt changes of potential do notoccur in free space, but can be only approximated by the use of closely spaced grids.Our second example is more realistic and demonstrates the focusing behaviour ofa lens consequent on curved boundaries. Figure 1.2(a) depicts two regions, havingrefractive indices n1 and n2, separated by a spherical boundary of radius R. Raysof light incident parallel to the axis, at small distances, h2 and h1, from the rightand left are refracted at the boundary in such a way that they cross the axis atpoints labelled F1 and F2 respectively. These are the first and second focal pointsof this simple lens and the distances measured from the boundary (and here weassume that h � R so we can define the position of the boundary by the pole, O,where the boundary cuts the axis) are called the first and second focal lengths anddenoted by f1 and f2 respectively. Remembering that h/R is very small, so thesines and tangents of the angles can be approximated by the angles themselves,we can write for the upper part of figure 1.2(a)

n1α1 = n2α2 α1 = h1/R α1 − α2 = h1/f2

and for the lower part

n1α4 = n2α3 α3 = h2/R α3 − α4 = h2/f1.

Eliminating the angles we find

1

f1= (n1 − n2)

n1

1

R

1

f2= (n2 − n1)

n2

1

R. (1.3)

Taking the ratio of these two expressions we see that the ratio of the twofocal lengths is the negative of the ratio of the refractive indices. The sign is a

The optics of simple lenses 3

(a)

Figure 1.2 (a) Refraction of light at a spherical boundary between twomedia having refractive indices n1 and n2. The inset illustrates the signconvention. The angles α1 and α2 are positive as are the angles betweenthe rays and the axis at the focal points. Since α3 and α4 are negative,the angle at F1 is equal to α3 − α4. (b) Paths of charged particles in atwo cylinder lens. The paths become asymptotic to straight lines onlyaway from the central region of the lens. The values of the equipotentialsurfaces, shown in section, are discussed in the text.

consequence of defining the focal lengths as distances measured from the boundaryand taking the pole to be the origin of Cartesian coordinates. The sign conventionfor angles is that angles measured from the axis to the ray are positive if the rotationis clockwise. These conventions lead to angles having the signs of their tangents,with the two distances involved measured from the right angle. One consequenceof this choice is that the sign of the angle a ray makes with the axis is opposite tothat of dr

dz for that ray. For angles defined with respect to some other direction, suchas a radius (as in figure 1.2(a)) or a second ray (as in figure 1.3), clockwise rotation


from the ray to the other direction defines the positive sense. In a number of thefigures we have indicated the direction in which angles are measured according tothese conventions.

Figure 1.2(b) illustrates an electrostatic analogue consisting of two coaxialcylinders separated by half their common diameter for clarity and held atpotentials V1 and V2. No mathematical analysis should be needed to show that,deep inside each cylinder, the potentials are very close to those of the electrodes,but within a diameter or so of the centre of the system the potential changesquite rapidly and we show a number of equipotentials to illustrate this. These aresymmetric about a flat, central, equipotential which has a value of (V2 + V1)/2 andthose to the left have values V1 + (V2 − V1)/2n where n = 2, . . . , 5. Notice thatthe paths of the particles are curved and only away from the central region of thelens do they become asymptotic to straight lines. We shall have more to say aboutthe curvature later, but for the moment we concentrate on the asymptotes. As infigure 1.2(a), rays entering the system parallel to the axis from the right and leftare deviated towards the axis, crossing it at the first and second focal points, butas the deviation is not abrupt in this case there is no physical reference point fromwhich to measure the focal lengths. Instead we consider the asymptotic paths andmeasure the focal lengths from the intersections of the asymptotes to the incidentand emergent rays. These intersections define, for incident rays close to the axis,the principal planes of the lens. We shall see later that, for non-paraxial rays,the principal surfaces are actually curved; nonetheless the term ‘principal plane’is universally employed. Notice that the actual paths are not perpendicular to theequipotential surfaces and approach the asymptotes from the outside on the highpotential side of the lens. It would be possible to regard the equipotentials asboundaries separating regions having some mean potential and to use Snell’s lawto follow the refraction, boundary by boundary, through the lens, but this wouldbe rather tedious and not very accurate. There are much better methods.

1.2 The cardinal points of a lens

The contrast between the ray paths in the photon and particle lenses of the previoussection was very marked. In the case of a photon lens of many elements, suchas is common in even quite simple cameras, the ray paths, while still made up ofstraight line segments, may experience many changes of direction within the lensand resemble somewhat the ray paths in the particle lens. The details of the pathsare of little importance as long as the asymptotic paths are well defined and theappropriate planes of intersection known, because the imaging properties of bothsorts of lens can be completely described (in the paraxial approximation, at least)by the positions of the focal points and the principal points, that is, the intersectionsof the principal planes with the axis. These are four of the six cardinal points ofthe lens and their use is illustrated in figure 1.3. The lens is specified in this figureby these four points, and their separations and the distances from an arbitrary


P

u

1F

1f

p

2

rI

q2

f

F2

v

Q

I’

2

2F

1K

2H H

11F

1

R

O

1r

O’ 2K

Figure 1.3 This diagram shows the focal and principal points of a lensand illustrates a construction, using asymptotes to the principal rays, forfinding the position of the image of a given object. The anglesα1 and α2are measured from the upper ray to the lower and have opposite signs.

reference plane, R, are indicated. If there is a plane of (mechanical) symmetryin the lens, this plane is commonly used as reference and the distances from thisplane to the focal points are called the ‘mid-focal distances’ and usually denotedby F1 and F2. The ambiguity of the same symbols being used for these distancesand for the focal points themselves is not usually a problem.

Chapter 5 will contain a discussion of the aberrations of particle lenses, butuntil then we shall concentrate on the rather simpler situation of Gaussian imagingappropriate to paraxial rays, and in the next section shall develop some importantinterrelations. We have restricted the field of this book to lenses of axial symmetry,and we shall consider only rays which lie in planes containing the optic axis, thatis, the meridional rays.

1.2.1 The construction of an imageWe consider an object, OO′ of height r1 at a distance p from the first focal point,u from the first principal point or P from the reference plane, and construct theimage II′ by drawing two principal rays from O′. The first is drawn parallel to theaxis to represent the initial asymptote and, from its intersection with the secondprincipal plane at K2, a line representing the asymptote to the emergent ray isdrawn to pass through the second focal point, F2. The second is drawn throughthe first focal point to meet the first principal plane at K1 from where it continuesparallel to the axis. The intersection of these two rays defines the image point, I′.


The image has a height r2 and is a distance q from the second focal point, v fromthe second principal point or Q from the reference plane. With our Cartesian signconvention p, P, f1, F1, u and r2 are all negative. The other distances are positive.The angles α1 and α2 have opposite signs because the rotations from one ray tothe other are in different senses: we have chosen to mark α1 as the positive angle.

In order to find relationships among the various distances, we consider a numberof similar triangles. From triangles OF1O′ and H1F1K1 we see that r1/p = −r2/f1

and, from triangles IF2I′ and H2F2K2, r2/q = −r1/f2. Triangles H1F1K1 andH2F2K2 are not similar and allow us to represent the angles as α1 = r2/f1 andα2 = −(−r1/f2). The (transverse) magnification of the lens, M , is given by theratio r2/r1 and the angular magnification, Mα , by α2/α1 so we can write

M = −f1/p = −q/f2 and Mα = (−r1/r2)(f1/f2)

soMMα = −f1/f2 = n1/n2 = (V1/V2)

1/2. (1.4)

This last relationship, between the magnification, angular magnification and theratio of the potentials in the object and image regions, is known as the law ofHelmholtz and Lagrange and is essentially a statement that the brightness of animage cannot exceed that of the object.

Using the two expressions for the magnification we obtain pq = f1f2, whichis Newton’s relation, and if we substitute for u (=p + f1) and v (=q + f2) weobtain f1/u + f2/v = 1. This is not often used (except in an approximate formvalid for thin lenses), but substitution for P (=p + F1) and Q (=q + F2) showsthat a graph of Q against P is a rectangular hyperbola with its centre at F1, F2.

This is frequently used to present data for lenses of two elements with the voltageratio as a parameter. Lines of constant magnification can be shown on the sameaxes, leading to a very simple procedure for the choice of a (two element) lens tomeet a specification in terms of a voltage ratio and magnification. In the same waythat it is only the ratio of the potentials which governs the behaviour of a lens, sothe actual size of the lens does not matter and it is only the ratios P/D, F1/D

etc where D is some characteristic dimension of the lens, usually the diameter,which are important. It is very common indeed to find focal lengths and otherdistances represented with D as the unit of length. In this book we shall normallywrite ‘F1’, for example, instead of ‘F1/D’ unless a specific point is to be made.Figure 1.4 illustrates the general form of a family of P–Q curves showing bothbranches of the hyperbolæ and also the real object-image region for a lens of twoclosely spaced coaxial cylinders.

With our sign convention, the transverse and angular magnifications of a realimage are negative. It is, however, common usage to speak of magnifications asthough they were positive and to ignore the sign when using terms such as ‘larger’,‘ smaller’, ‘increase’ etc. Except in critical situations we shall cite magnificationsin this fashion.


Figure 1.4 (a) P –Q diagram for a lens. The upper left region(P < 0, Q > 0) applies to the most common case of real objectsand images. In the other regions one or both are virtual. Notice that,close to the origin, the P –Q relationship is virtually independent of thevoltage ratio. (b)P –Qdiagram for a two cylinder lens, with logarithmicscales to simplify the presentation of data over a wide range of distancesand voltage ratios. Lines of constant magnification and of voltage ratioare shown.


2(z)

F2

r

I

(z)1

r

1F

1H

2H

O

Figure 1.5 Asymptotes to two principal rays, r1(z) and r2(z), and toa general ray r(z) = −[r1(z) + (1/M)r2(z)].

It is easy to see that an object in the first principal plane will be imaged in thesecond principal plane unchanged in size and the principal planes are sometimesreferred to as ‘planes of unit positive magnification’. Note that either the object orthe image must be virtual in this case. We shall see later that this property of theprincipal planes has been exploited in particle optics.

1.2.2 The general rayWhile the two principal rays shown in figure 1.3 offer a simple construction of theposition and size of an image, a ray from an axial object point to the correspondingimage point is often of greater interest. Provided that we restrict ourselves toparaxial, Gaussian, imaging, any linear combination of two rays which themselvessatisfy the focusing condition will also satisfy that condition. Figure 1.5 illustratesa construction to demonstrate this property using the asymptotes to the various rays.Any independent pair of rays may be used as the basis, but it is convenient to chooserays which are parallel to the axis in one region. We define the first ray, r1(z), asthat passing through the first focal point of the lens at an angle of +45◦. This rayis therefore parallel to the axis in the image space and a distance f1 from the axis.The second ray, r2(z), is parallel to the axis in the object space and a distance, −f1,from the axis. It crosses the axis at the second focal point, with a slope f1/f2.

The general ray, r(z), is then represented by r(z) = ζ r1(z) + ξr2(z). Theobject is a distance p from the first focal point, so at the object we haver(z) = ζ(−p) + ξ(−f1) = 0, and we can therefore write ξ = ζ/M where M

is the lens magnification between these conjugates. At the image we write


r(z) = ζ [f1 + q(f1/f2)/M] = 0 for all values of ζ because the term in thesquare brackets is zero. The single parameter, ζ, is sufficient to define a particularray and we write

r(z) = ζ

(r1(z) + 1

Mr2(z)

). (1.5)

Figure 1.5 illustrates the construction of the asymptotes to one such general ray.We take ζ = −1, and draw the asymptote from the object point, O, until it reachesthe first principal plane, H1. Then, from the second principal plane, H2, at the sameradial distance, draw the emergent asymptote to the image point, I. The slope ofthis asymptote is (−1/M)f1/f2 and, as M is known from the ratio of the first focallength and the object distance, the position of the image point does not have to beknown beforehand. The values used for illustration in figure 1.5 are

f1 = −4 f2 = 8 p = −5 q = 6.4

givingM = −0.8.

1.2.3 The nodal pointsThe two remaining cardinal points are the nodal points, N1 and N2. These are pointsof unit positive angular magnification in the sense that a ray directed towards thefirst emerges as though from the second with the same slope. All rays from apoint on the object must pass through the corresponding point on the image andwe show a particular linear combination of the two principal rays in figure 1.6.We draw a ray for which the incident asymptote crosses the first principal plane atsome distance h from the axis. The emergent asymptote appears to come from apoint in the second principal plane at this same distance, h, from the axis. If theseasymptotes are parallel we have

r1 − h

p + f1= r2 − h

q + f2

which leads to

h = r1

(f1 + f2

f2 − p

).

These asymptotes cross the axis at the nodal points which are clearly separated bythe same distance as the principal points. We leave it as an exercise for the readerto show that these points are at distances f2 and f1 from the focal points F1 andF2 respectively. Notice that if the refractive indices (potentials) on the two sidesof the lens are the same, i.e. f1 = −f2, then h = 0 and the asymptotes cross theaxis at the principal points which, for this case, coincide with the nodal points.


f1

1

hF

p

H2

q2

f

2N

1F

2N

H

1

1r

r2

Figure 1.6 Diagram to illustrate the action at the nodal points of alens.

1.3 Matrix representation of lens parameters

While the focal lengths and distances are the parameters most often presented, it ispossible to express the action of a lens by a two by two matrix which acts on someinitial ray asymptote for which the ray position, r , and slope, r ′, at some plane areexpressed as a column vector, to produce a modified asymptote at a second plane.(

r2

r ′2

)=

(a11 a12

a21 a22

) (r1

r ′1

). (1.6)

The planes between which the matrix acts determine the values of the matrixelements and some compromise is needed between simple expressions for theelements and a convenient choice of planes. The significance of the individualelements is clear from equation (1.6): a11 is the linear magnification betweenthe planes and a22 the angular magnification. The other elements relate theradius at one plane and the slope at the other. The simplest matrix is onewhich transfers the ray from the first principal plane to the second. In this casethe element a11 = 1 because the principal planes are planes of unit positivemagnification, and a22 = − f1

f2from equation (1.4). r2 does not depend on r ′

1

so a12 = 0 and it is easy to show that a21 = − 1f2

by considering an incidentasymptote parallel to the axis.

The positions of the principal planes of an electrostatic lens vary with thepotentials and it would be more convenient to work with a matrix operating betweenplanes which did not move in this way. We now develop the matrix which transfersan incident asymptote at the reference plane to an emergent asymptote at the sameplane. Despite the change in radial position which may occur, this is usuallyreferred to as a bending matrix. We do this by operating on the incident columnvector with three matrices. The first transfers the ray from the reference plane to the


Figure 1.7 Matrix elements for a lens of two coaxial cylindersseparated by 0.1 diameters. Note that all four matrix elements arecontinuous as the potential ratio passes through a value of one.

first principal plane, the second is the matrix acting between the principal planesand the third transfers the ray from the second principal plane to the referenceplane. (

r2

r ′2

)=

(1 f2 − F2

0 1

) (1 0

− 1f2

− f1

f2

) (1 F1 − f1

0 1

) (r1

r ′1

)

=( F2

f2

F1F2−f1f2

f2

− 1f2

−F1f2

) (r1

r ′1

).

To use this matrix to study the formation of an image we first translate the rayfrom the object to the reference plane, a distance −P , then use the lens matrix toproduce the emergent ray and finally transfer this ray a distance L. The productof these three matrices gives the transfer matrix(

1 L

0 1

)( F2f2

F1F2−f1f2

f2

− 1f2

−F1f2

)(1 −P

0 1

)=

( F2−Lf2

(F2−L)(F1−P)−f1f2

f2

− 1f2

P−F1f2

).

The condition for imaging is that the position of the ray in the exit plane does notdepend on the slope of the incident ray. This is equivalent to the element, a12, ofthe transfer matrix being zero, which will be the case if L = Q. The element a11

is equal to the magnification of the lens between the conjugate planes, and a22 isthe angular magnification.


The analysis of combinations of lenses is frequently simpler if the matrixformulation is used, because the only calculation required is a sequence of matrixmultiplications. A further situation in which the matrix formulation has advantagesis the analysis of very weak two element lenses. This is because the matrix elementsvary smoothly as the potential ratio passes through one, in contrast to the focallengths and distances which become infinite for this value of the ratio. Figure 1.7shows the matrix elements for a lens of two coaxial cylinders separated by a gapof 0.1 diameters.

2 The motion of charged particles in anelectrostatic field

2.1 The equation of paraxial motion

Our concern is with axisymmetric lenses and so we use cylindrical polarcoordinates with the z-axis as the axis of symmetry of the lens system. Laplace’sequation is

∂2V

∂z2+ 1

r

∂

∂r

(r∂V

∂r

)= 0 (2.1)

where V (r, z) is the potential. Because of the rotational symmetry an expansionof the potential in even powers of r can be made

V (r, z) =∞∑n=0

An(z)r2n.

The two terms of Laplace’s equation are

1

r

∂

∂r

(r∂V

∂r

)=

∞∑n=0

4n2An(z)r2n−2

and∂2V

∂z2=

∞∑n=0

A′′n(z)r

2n

where primes (′) are used to denote differentiation with respect to z. The sum ofthe coefficients of each power of r must be zero so we have a recurrence relation

An+1(z) = − A′′n(z)

4(n + 1)2

giving the series expansion as

V (r, z) = A0(z) − A′′0(z)r

2

22+ A

(4)0 (z)r4

22 · 42+ · · · . (2.2)

13


The axial potential, V (z), is just A0(z) and we can write the axial and radialcomponents of the electric field as

Ez = −∂V

∂z= −V ′ Er = −∂V

∂r= r

2V ′′

where we omit the explicit reference to z and retain terms only to second orderbecause we are considering only paraxial rays. When we come to consider theaberrations of particle lenses we shall have to make allowance for the higher orderterms.

The axial velocity is so much greater than the radial in these circumstancesthat we can write 1

2m(dz/dt)2 + eV = 0 for the total energy of the particle. Theequation of radial motion is

md2r

dt2= eEr = er

2V ′′

writing e for the charge on the particle. The sign of the charge matters only insofaras the potential must always have the opposite sign and we shall soon see that themagnitude of the charge does not matter. To eliminate t from these expressionswe write

d2r

dt2= dz

dt

d

dz

(dz

dt

dr

dz

)giving (−2eV

m

)1/2 d

dz

[(−2eV

m

)1/2 dr

dz

]= er

2mV ′′

which reduces tod2r

dz2+ 1

2

V ′

V

dr

dz= − r

4

V ′′

V. (2.3)

This equation contains neither the charge of the particle nor its mass and it istherefore valid for electrons, positrons and ions of either sign. The only constraintis that the potential should have a sign opposite from that of the particle to ensurea positive total energy.

In order to integrate this equation of motion we require very precise informationabout the axial potential, because we need to determine the second derivative. Analternative approach is to change the independent variable in such a way as toremove this term. We introduce a reduced radius, R, defined by R = rV 1/4

where V must be read as |V | for positive particles. Successive differentiation ofthis expression yields

dR

dz= V 1/4 dr

dz+ 1

4V −3/4V ′r

d2R

dz2= V 1/4

[d2r

dz2+ 1

2

V ′

V

dr

dz+ r

4

V ′′

V︸︷︷︸= 0

−3r

16

(V ′

V

)2]

The motion of charged particles in an electrostatic field 15

where equation (2.3) shows that the grouped terms sum to zero, and we are leftwith the very simple equation of motion

d2R

dz2= − 3

16R

(V ′

V

)2(2.4)

which is known as the Picht equation [4].

2.2 Some general results

The independent variable in the Picht equation is the ratio of the axial potentialgradient to the potential itself and it will be convenient to define a new variable,T (z) = (V ′(z)/V (z)), and to note that, because this appears squared the sign doesnot matter.

Figure 2.1(a) shows the variation of the axial potential, the potential gradientand the parameter T (z) for a lens having a gap equal to the radius with a potentialratio of 1 : 16. Figure 2.1(b) shows the lens geometry and equipotentials at valuesof 1

32 ,1

16 ,18 ,

14 ,

12 ,

34 ,

78 ,

1516 and 31

32 of the overall potential difference. Thepotential distribution was calculated by the nine point finite difference method ofsection 3.5 and the two principal rays were traced in this lens by the routines ofthe LENSYS2 program.

It is not difficult to show that all electrostatic lenses having uniform potentialregions to each side are converging and to obtain an approximate expression fortheir focal lengths. We consider a ray incident parallel to the axis at a reducedradius R1 and make the assumption that this reduced radius does not change inpassing through the lens. Naturally, r , the true radius, will change or there wouldbe no lens action, but the change in V will act in the opposite sense. A formalintegration of the Picht equation gives∫ ∞

−∞R′′ dz = R′

2 − R′1 = − 3

16R1

∫ ∞

−∞T 2 dz

though in practice the limits of integration can be very much narrower as aconsequence of the sharply peaked nature of T 2. R′

1 = 0 because the incidentray is parallel to the axis and so R′

2 has the opposite sign to R1. Writing

R′2 = r ′

2V1/4

2 + r2

4V

−3/42 V ′

2

and, noting that V ′2 will be zero away from the lens proper and that V

1/42 is

intrinsically positive, we see that r ′2 has the same sign as R′

2 and so for rays incidentabove the axis the emergent ray moves towards the axis giving a convergent lensaction. Lenses for which the object and image positions lie in regions of uniformpotential are known as immersion lenses.


T V'

z/D

V

(a)

(b)

Figure 2.1 (a) The variation of V (z), V ′(z) and their ratio, T (z) withaxial position z/D in a two cylinder lens having a gap to diameter ratioof 0.5. Note that the maximum value of T occurs on the low potentialside of the lens. The ordinate scale applies to the axial potential. (b) Raypaths for a voltage ratio of 1 : 16 in this lens showing that most of thedeviation of the particle paths occurs in the region where |T | is large.

We saw in section 1.2.1 that the second focal length of a lens can be writtenas −r1/α2. We identify α2 with r ′

2 and write

1

f2= − r ′

2

r1= −R′

2

R1

(V1

V2

)1/4= 3

16

(V1

V2

)1/4∫ ∞

−∞

(V ′

V

)2dz. (2.5)


Figure 2.2 Paths of the ray r(z) (· · · · · ·) and the reduced ray R(z)(——) in the vicinity of the nodal points. The reduced ray is boundedby the asymptotes and the curvature of its path is always towards theaxis. The slopes of the asymptotes to the ray, r(z), are equal, whilethose of the reduced ray depend on the potential ratio. The radial scaleshave been chosen to make the incident asymptotes coincide.

If we were to trace a ray incident parallel to the axis, but from the other side, wewould obtain

1

f1= − r ′

1

r2= −R′

1

R2

(V2

V1

)1/4= 3

16

(V2

V1

)1/4∫ −∞

∞

(V ′

V

)2dz (2.6)

and would find, for the ratio of the focal lengths,(f1

f2

)= −

(V1

V2

)1/2= −n1

n2.

The ratio is negative because the integrals in equations (2.5) and (2.6) have oppositesigns and the result is what we expect from our previous general discussion.

An alternative expression is sometimes quoted

1

f2= 1

8V 1/22

∫ ∞

−∞

V ′2

V 3/2dz

but the derivation of this expression is valid only if V ′′ does not change sign andthis is never true unless the potential is constrained by grids.

2.2.1 The nodal and principal pointsWe consider an incident ray having its asymptote directed towards the first nodalpoint as illustrated in figure 2.2. By a suitable choice of vertical scale the same


line can be used to represent the asymptotes of both the ray itself and the reducedray. The Picht equation shows that R′′ ∝ −R and so the path of the reduced rayis always curved towards the axis. The reduced ray therefore crosses the axis tothe left of the first nodal point; the curvature then reverses and the ray approachesits emergent asymptote from below. This asymptote passes through the secondnodal point which must lie to the left of the point at which the ray crossed theaxis. Note that both the ‘real’ ray, shown by the dotted line, and the reduced raycross the axis at the same point. While the slope of the real asymptotes are equal,those of the reduced asymptotes are in the ratio of the one fourth power of thepotential ratio. As figure 2.2 shows, the actual ray path is not bounded by theasymptotes. In chapter 1 we saw that the separation of the nodal points is equal tothat of the principal points and our result, above, shows that the principal pointsof an electrostatic immersion lens are crossed, in the sense that H2 is further to thelow potential side of the lens than H1. This is just the way in which the principalplanes were represented in figure 1.3.

2.2.2 Relativistic effectsThe independence of the paths of charged particles in electrostatic fields from thecharge to mass ratio is only true for velocities sufficiently small that the mass of theparticle remains essentially equal to the rest mass. Certainly the lenses discussedin chapter 6 and illustrated in the program would not be suitable for use with highpotentials, but it is perhaps worth noting here the ways in which relativistic effectscan be taken into account. Writing the momentum of a particle as

p = m0v(1 − β2)1/2 (2.7)

and the kinetic energy as

E = m0c2((1 − β2)−1/2 − 1) (2.8)

the velocity can be expressed in terms of the momentum (using equation (2.7))and substituted into equation (2.8) to give

m0c2 + E = c

√m2

0c2 + p2

which can be rearranged to give an explicit expression for the momentum in termsof the kinetic energy. Recalling that this is just equal to the negative of the chargemultiplied by the potential (the zero being taken as usual for the particle to be atrest), we finally express the momentum in terms of the accelerating potential, V .

p2 = 2m0E(

1 + E

2m0c2

)= −2m0eV

(1 − eV

2m0c2

)= −2m0eVrel


where

Vrel = V

(1 − eV

2m0c2

)= V (1 + ε)

= V (1 + 0.9785 × 10−6V ) for electrons

is known as the relativistically corrected acceleration potential. Note that theproduct eV is always negative, and therefore ε is a positive quantity.

The equations of motion can be analysed in terms of this corrected potentialand the Picht equation retains much the same form. The reduced radius, R, is nowdefined by R = rV

1/4rel = rV 1/4(1 + ε)1/4 and the equation itself becomes

R′′ + 3

16R

(V ′

V

)2× K(ε) = 0

where

K(ε) = 1 + 43ε + 4

3ε2

(1 + ε)2.

3 The determination of the axial potential

In order to trace the path of a particle through a lens, we have to integrate theequation of motion and it is convenient and simple to work in terms of the Pichtequation. We need to know the function T (z) to find the focal properties andwe shall see later that we also need to know the axial derivative, T ′(z), if wewish to obtain information on the aberrations. The first stage in the calculationis the determination of the axial potential, from which, with its derivative, weconstruct T (z). We shall consider ways of generating the potential distribution forlenses with more than two electrodes later, but in the first instance we concentrateour attention on lenses having only two electrodes. If the electrode potentialsare V1 and V2, we can express the axial potential in one of the forms

V (z) = 12 (V1 + V2) + 1

2 (V2 − V1)+(z)

= V1 + (V2 − V1),(z)

where +(z) and ,(z) are functions which range between ∓1 and between 0 and 1respectively as z → ∓∞.

There are various methods of calculating these functions. There is an exactsolution for the case of two coaxial cylinders at different potentials, but with nogap between them. This is an unrealistic situation, but it has been used as an aidto the normalization of other expressions.

The general form can be approximated by a number of analytic expressionschosen more for their mathematical tractability than their connection with thebasic physics of the problem. Such an approach has a long history, but withthe amount of computing power readily available nowadays, there is little pointin using this sort of method though we give a brief account in the next section.Results of much greater accuracy can be obtained by approximations based on thelaws of electrostatics. In later sections we consider three such approaches: thedirect application of Coulomb’s law, numerical solutions of Laplace’s equationon discrete networks of points and the analytical solution of Laplace’s equationsubject to boundary conditions which closely approximate the true situation.

20

The determination of the axial potential 21

3.1 Analytic expressions

An exact analytic solution of Laplace’s equation can be found for the axisymmetriccase, where the potential, V (z, r), can be written as a constant term plus theproduct of separate functions, Z(z) and R(r) of the axial and radial coordinates.If we differentiate this product and substitute into Laplace’s equation we obtain anequation in which the terms in z and in r may each be set equal to some parameterwhich we write as −k2.

1

R

(d2R

dr2+ 1

r

dR

dr

)= − 1

Z

d2Z

dz2= −k2.

The differential equations for the two functions are then

d2Z

dz2− k2Z = 0

d2R

d(kr)2+ 1

kr

dR

d(kr)+ R = 0.

The solution for Z(z) is straightforward:

Z(z) = Akekz + Bke−kz

while that for R(r) requires some consideration of the context. The equation isBessel’s equation of zero order for which the formal solution would be

R(kr) = C1J0(kr) + C2N0(kr)

where J0 and N0 are Bessel functions of zero order and of the first and second kindrespectively. The latter become infinite for zero argument and cannot thereforegive a physically real description of the potential on the axis, so C2 = 0 and theoverall solution for the potential becomes

V (z, r) = (Akekz + Bke−kz)J0(kr) + V0 (3.1)

where we have incorporated the coefficient C1 into Ak and Bk and V0 is a constantpotential determined by the boundary conditions. Notice that the zero order Besselfunction of the first kind can be expressed as

J0(r) =∞∑i=0

(−1)i(r/2)2i

(i!)2= 1 − r2

4+ r4

64− r6

2304+ · · ·

which is exactly the form given in equation (2.2) by the Taylor expansion. Thegeneral solution for a particular lens geometry will consist of a sum of suchterms with, perhaps, a constant representing the potential of an electrode.


If k is a real number then, in equation (3.1), Ak must be set to zero for positivevalues of kz and Bk must be zero for negative values. In section 3.6 we shall usethese distinct forms, but for a general case we have to assume that k = jκ and is animaginary number. Using the modified Bessel function, In(x) = j−nJn(jx), andexpanding the exponentials of equation (3.1) we write

V (z, r) = (Gκ cos(κz) + Fκ sin(κz)

)I0(κr) + V0.

This is a solution of Laplace’s equation for an axially symmetric field and for aspecific value of κ . To find the general solution we must integrate over all possiblevalues of κ .

V (z, r) =∫ ∞

−∞

(Gκ cos(κz) + Fκ sin(κz)

)I0(κr) dκ + V0.

We consider the case of two coaxial cylinders of radius R separated by aninfinitesimal gap and take the following values of the potentials to define theboundary conditions:

for z < 0 V (z, R) = V1

for z = 0 V (0, r) = (V1 + V2)/2

for z > 0 V (z, R) = V2.

The second of these conditions both determines the value of V0 and requires thatGκ should be zero to remove the cosine term.

V (z, r) =∫ ∞

−∞Fκ sin(κz)I0(κr) dκ + V1 + V2

2. (3.2)

Application of the first and third boundary conditions allows us to show, after anapplication of Fourier’s integral theorem, that the coefficient Fk can be expressedas

(V2 − V1)

2πκI0(κR)

and if we substitute this into equation (3.2), rearrange the terms and note that theintegrand is an even function, we find that the potential can be written as

V (z, r) = V1 + V2

2+ V2 − V1

2

2

π

∫ ∞

0

sin(κz)

κ

I0(κr)

I0(κR)dκ (3.3)

which reduces to the correct boundary values for r = R as∫ ∞

0

sin(κz)

κdκ = −π

2, 0,

π

2for z < 0, 0, > 0.


This expression is not amenable to evaluation except by numerical methods andthere is only one reason for continuing the analysis.

3.1.1 ApproximationsIt was pointed out by Gray in 1939 [5] that the axial potential function for the twocylinder zero gap lens can be represented approximately by the function

+(z) = tanhωz

R(3.4)

which has the correct asymptotic behaviour. The precise shape is governed by theparameter ω. If we differentiate this expression we have

d+

dz= ω

Rsech2 ωz

R= ω

Rat z = 0.

From equation (3.3) we can extract the true shape of the axial potential as

+(z) = 2

πR

∫ ∞

0

sin(κz)

κ

dκ

I0(κ)(3.5)

and we can equate the differential of this at z = 0 to ω/R to find

ω = 2

π

∫ ∞

0

dκ

I0(κ). (3.6)

Cylinders with no gap between them are somewhat impracticable, but thisapproximate form for +(z) does give a reasonable guide to the behaviour of thelens with a small gap. Writing γ for the potential ratio V1/V2, the parameter, T (z),of the Picht equation can be expressed as

T = 2ω(1 − γ ) exp 2ωz

(1 + exp 2ωz)(γ + exp 2ωz)(3.7)

which has a maximum value,

Tm = −2ω(γ 1/2 − 1

γ 1/2 + 1

)at zm = (1/4ω) ln γ . Analysis using this approximate potential function usuallycontinues by replacing equation (3.7) by one of two expressions having a similarshape:

T = Tm

1 + ((z − zm)/a)2(3.8)

orT = Tm sech((z − zm)/a) (3.9)


Figure 3.1 The differences between the potential functions, +, ofequations (3.4) (•) and (3.10) (©) and the true axial potential for a twocylinder lens having a gap of D/10.

where a is found by equating the area under the curves of equations (3.7), (3.8)and (3.9). The focal lengths and distances can then be expressed as rathercomplicated functions of γ , a, ω and zm.

For the more practical case of cylinders separated by a small, but notinfinitesimal, gap, g, the function

+ = 1

ω′g/Rln

(cosh(ωz/R + ω′g/2R)

cosh(ωz/R − ω′g/2R)

)(3.10)

has often been used. Ifω′ = ω this reduces to tanh ωz/R asg → 0. The expressionfor T (z) is more complex and the approximation becomes progressively pooreras g increases. Some improvement may be made by modifying the value of ω′

to take account of the smaller value of the slope of the potential at z = 0, and avalue of 1.67 has been used, but this is not really a good model for a wide gaplens. It was in any case common practice to use equations (3.8) or (3.9) for actualcalculations. Figure 3.1 shows the differences between the values of + given bythese two approximations and the true axial potential for a two cylinder lens witha gap of D/10.


Some of the other approximate forms which have been suggested are:

, = 1 − 1

πarccot

( z

a

), =

(1 + z/a

1 + |z/a|)

+ = 1

a

∫ z

0exp(−π(z/a)2) dz

where the value of a depends on the length of the gap, g. There is little point ingiving further discussion of this type of approximation.

3.1.2 The value of ωValues for ω were first calculated some 60 years ago [5] by three rather differentmethods which gave values of 1.318, 1.32 and 1.315. These values have appearedin many textbooks since then, including the first edition of the present work,frequently without specific attribution.

A value of 1.326 227 5, obtained by a numerical integration of equation (3.6),was quoted by Verster in 1963 [6], but this seems to have passed unnoticed and hasnot been cited in a textbook. This omission might suggest that there were doubtsabout the accuracy of this value and a recalculation would seem appropriate.

It is now quite practicable to evaluate the integral with much higher precisionand accuracy, and we have used a standard series representation of I0(κ) [7] todetermine values of the integrand from 0 to 50 at intervals of 0.001. We have thenused three quadrature expressions, each at intervals of 0.01, 0.002 and 0.001, tocalculate the integral. While there are differences beyond the 11th decimal placeall the results are consistent with a value of ω = 1.326 227 505 1 in completeagreement with Verster, but with three more significant figures. This confirmationof Verster’s value underlines the question of why an older, approximate, value wasstill being cited 25 years later [1, 2].

The numerical methods described in this chapter cannot calculate the field fora true zero gap lens, but values can be found for progressively smaller gaps andmay be extrapolated to zero.

While it is no surprise that the Bessel function expansion method of section 3.6gives an excellent agreement, two other methods have been applied to the problemand also agree well with our calculated value.

The finite difference method of section 3.5 lends itself to a simple type ofextrapolation in which both the size of the gap and the density of the finite latticeare extrapolated together. With a cylinder radius, R1, in terms of the lattice spacing,the smallest value of the gap to diameter ratio which can be modelled is 1/R1.Figure 3.2 shows values of the field at the centre of the gap as a function of 1/R2

1 ,which is proportional to the lattice density. A linear fit to these data extrapolatesto 1.326 227 506 with a standard deviation of 5 in the last figure. This is a verysatisfactory agreement. Similar agreement has been found by Read [8] using aspecial version of the CPO2D program which uses the boundary element methodof section 3.2.


Figure 3.2 Values of the field at the centre of a two cylinder lens ofradius R1. Extrapolation to the zero value of 1/R2

1 implies zero gaplength and also infinite lattice density.

3.2 The boundary element method

The solution of Laplace’s equation requires the conditions to be known over aboundary which completely encloses the space. Almost without exception thiscalls for some approximation of the conditions in the gaps between the electrodes.The boundary element method, also known as the charge density method, doesnot have this constraint. It is in principle a very simple method: any electrodemay be replaced by a system of charges at the electrode surface provided thatthe potentials they produce are everywhere the same as those produced by theelectrode. In particular, the potential of the electrode itself must be reproduced bythe charge distribution. The formal way to express this is

Vk = 1

4πε0

n∑j=1

∫Sj

σj (rj ) dSj|rj − rk| (3.11)

where Vk is the potential at a position rk due to charges qj = σj dSj at positions rjon each of n electrodes. In the present context of round lenses with axial symmetry,the problem reduces to the summation of the potentials at a point due to ringsof charge on all the electrodes. The geometry of this situation is illustrated infigure 3.3 in which the potential at the ring of radius ri due to the charge on thering of radius rj is to be found. In terms of the distances shown in this diagramthe denominator of equation (3.11) is

l = [r2i + r2

j − 2rirj cos(αi − αj ) + (zi − zj )2]1/2.


Figure 3.3 Diagram to indicate the coordinates appropriate to thecalculation of the potential of a point due to a ring of charge.

The absolute values of the angles are arbitrary because of the axial symmetry so,for convenience, let αi = π which allows us to simplify this expression as

l = [(ri + rj )2 − 4rirj sin2(αj/2) + (zi − zj )

2]1/2

={

[(ri + rj )2 + (zi − zj )

2](

1 − 4rirj sin2(αj/2

(ri + rj )2 + (zi − zj )2

)}1/2.

The total charge on the ring, Qj , can be written in terms of a line density,λj = Qj/2πrj , and each term in the summation of equation (3.11) becomes

1

4πε0

∫ 2π

0

λj rj dαjl

= 1

πε0

λj rj

[(ri + rj )2 + (zi − zj )2]1/2

×∫ π/2

0

(1 − 4rirj sin2(αj/2)

(ri + rj )2 + (zi + zj )2

)−1/2d(αj/2).

The integral is just K(τ), the complete elliptic integral of the first kind,

∫ π/2

0(1 − τ 2 sin2 β)−1/2 dβ


with

τ 2 = 4rirj(ri + rj )2 + (zi − zj )2

and so in terms of τ and Qj the potential at a point i can be written as

Vi =n∑

j=1

Qj

4π2ε0

τ√rirj

K(τ). (3.12)

The practicability of solving this set of equations depends on a sensible choice of‘ring’elements and the distribution of charge between them. The charge density onthe surface of an electrode is directly proportional to the electric field normal to thesurface and a preliminary estimate of this field may be made in other ways, such asby the finite difference method. The charge density near the edge of an electrodewill be high and change rapidly so small rings are needed here: conversely, deepinside a cylindrical electrode the charge density will be small and a substantiallength of the cylinder may be used as a single ‘ring’. This choice of elements isprobably the factor which most severely limits the overall accuracy of the method.The most direct way of solving the problem is to treat equation (3.12) as a matrixequation relating the column vectors Vi and Qj with the individual elements beingfunctions of the positions of the two rings involved, but other methods have beenused [9].

The use of discrete rings of charge leads to a non-uniformity of potential onthe electrode surfaces. In place of a constant potential, the potential will varyin a ‘saw-tooth’ fashion with discontinuities at the edges of the ring elements.This effect can be minimized by making the charges on each element the same.An alternative approach would be to use a continuous charge distribution with asurface charge density described analytically.

Even though the only boundary conditions which need to be applied are those onthe electrode surfaces, the effects of field penetration from charges on the surface ofthe vacuum system enclosing the electrodes should be assessed and the electrodesdesigned to minimize such effects.

Once the charge distribution consistent with the required electrode potentialshas been found, the axial potential distribution follows readily, since the ellipticintegrals are then just equal to 1. The gradient of the axial potential can alsobe expressed in closed form and the function T (z) of the Picht equation followsimmediately.

When properly optimized, the boundary element method is capable of veryhigh accuracy. There are commercial programs available which use the boundaryelement method and, in some cases, a limited, trial version is available for downloadfrom the Web†.

† The CPO program written by RB Consultants Ltd can be downloaded fromcpo.ph.man.ac.uk and the Optics program written by Dr Li Wang can be downloaded fromwww.jps.net/liwang/download.htm.


Figure 3.4 The right hand half of a two cylinder lens showing apattern of triangular elements, suitable for analysis by the finite elementmethod. Note that the sizes of the elements are far from constant.

3.3 The finite element method

Instead of treating the space within a system of electrodes as continuous, we mayestablish a set of discrete points at which we determine the potential by somemeans. The finite elements of this method are, usually, triangles and it is thepotentials at the vertices of these triangles which we seek to determine.

Figure 3.4 shows a system of triangular elements appropriate to the solution ofthe potential distribution in a two cylinder lens with thick walls. We consider thewalls to be thick compared to the length of the gap for two reasons. For purposes ofcalculation it is necessary to specify the potential at all boundaries, and at the outeredge of the gap a linear variation with position is usually a good approximationand one which can be tested. In addition, for practical purposes, the thick wallsreduce the effect of field penetration from, for example, the walls of the vacuumchamber. This factor is important whatever method is used for the calculation.

Only half of the lens system is shown in figure 3.4 as the system is symmetricabout the mid-plane. If we are representing a lens with potentials 1 and 0 on theleft hand and right hand electrodes, we should set the potential at all points on themid-plane to be 0.5 and at points at the outer edge of the gap to fall linearly from0.5 to 0.

The axial potential gradient has a maximum value at the centre of the lensand is also large close to the inner end of the cylinder because of the 90◦ changeof the boundary equipotential. In both regions we should use elements of smallsize though elsewhere, particularly deep inside the cylinder, they may be muchlarger.


Figure 3.5 A representative element of the network illustrated infigure 3.4.

The physical principle behind the finite element method is that the potentialdistribution will be such that the potential energy of the electrostatic field is aminimum and so we need to express this potential energy in terms of the potentialsat the vertices of each and all of the triangular elements.

3.3.1 The basic equationsWe consider a typical element in figure 3.5 where we have shown both Cartesianand axisymmetric cylindrical coordinates. We shall use the latter for the lensexample. We assume that the potential varies linearly with position and write

V1 = V (z1, r1) = a + bz1 + cr1 etc (3.13)

where the coefficients can be expressed in terms of the coordinates of the verticesand their potentials.

a = [V1(z2r3 − z3r2) + V2(z3r1 − z1r3) + V3(z1r2 − z2r1)]/2S

b = [V1(r2 − r3) + V2(r3 − r1) + V3(r1 − r2)]/2S

c = [V1(z3 − z2) + V2(z1 − z3) + V3(z2 − z1)]/2S

S = [z1(r2 − r3) + z2(r3 − r1) + z3(r1 − r2)]/2

(3.14)

where S is the area of the triangle.


The potential energy of the electrostatic field is

W =∫∫ [(∂V

∂r

)2+

(∂V

∂z

)2]ε0

22πr dr dz. (3.15)

The condition for minimum potential energy is that

∂W

∂V1= ∂W

∂V2= · · · = ∂W

∂Vk

= · · · = ∂W

∂VN

= 0. (3.16)

For a single element, ∂V∂r

= c and ∂V∂z

= b so the associated potential energycan be written as

BW = ε0

2

(b2 + c2

)2πr dr dz

= πrε0S(b2 + c2

).

From figure 3.4 it is clear that any given node is common to a number of triangularelements and the potential at that node affects the potential energy of all of these.We should, therefore, write each term of equation (3.16) in the form

∂W

∂Vk

= ∂

∂Vk

( k+n∑j=k+1

BWj

)= 0 (3.17)

where there are n elements having node k in common. We may move thedifferential operator inside the summation and write for a representative element

∂(BW)

∂Vj

= πrε0[b(rj+1 − rj+2) + c(zj+2 − zj+1)

].

The radial coordinate r in this equation has come from the volume element inequation (3.15) and should be assigned the value at the centroid of the triangle,that is the average of the values at the vertices. If we substitute for b and c usingequation (3.14) this expression reduces to the sum of the potentials at the threenodes multiplied by coefficients which involve the coordinates of these nodes andwhich can be calculated once for all. Equation (3.17) can then be written as

∂W

∂Vk

=k+n∑m=k

gmVm = 0.

This is one of a set ofN equations which are usually solved by matrix methods. Thesolution is the set of free node potentials from which equipotential plots, such asthat shown in figure 3.6, may be generated. Remember that all the nodes, includingthose at the boundaries, must be considered in evaluating these equations.

The finite element method is of limited accuracy for electrostatic lenses as thelinear interpolations used in equation (3.13) militate against the determination


Figure 3.6 Equipotentials for the electrode and finite element systemshown in figure 3.4. Notice that the equipotentials are not quite smooth;this is the result of using the linear potential variation of equation (3.13)within each triangular element.

of the accurate axial potentials and gradients required for ray tracing and thecalculation of aberrations. It is a much more useful tool for the analysis of magneticfield problems and has been extensively used in this area.

Commercial programs based on the method exist, and the data of figures 3.4and 3.6 were obtained using Student’s QuickField 4.1† which is a limited version ofa very powerful program. We noted earlier the need to vary the size of the elementssuch that, in regions of high field strength, the elements were small. QuickFieldallows the user to determine the sizes during the problem definition phase, butother programs, such as FlexPDE‡ will generate an optimal set of elements fromthe boundary conditions.

3.4 The finite difference method: relaxation

One of the simplest methods for calculating the potential distribution in abounded region is to treat the space not as continuous, but as a regu-lar lattice of discrete points. We can then replace the differential form ofLaplace’s equation by difference equations. We shall analyse systems of the

† This program, a development of one called ELCUT, can be downloaded fromwww.tor.ru/quickfield.‡ FlexPDE lite can be downloaded from www.pdesolutions.com.


Figure 3.7 Lattice of points spaced by a distance h on which thepotential may be relaxed using groups of five points (•).

cylindrical symmetry appropriate to electrostatic lenses later, but first con-sider a simpler case which allows us to illustrate some general features.

3.4.1 A two dimensional exampleLaplace’s equation in rectangular Cartesian coordinates is

∂2V

∂x2+ ∂2V

∂y2= 0. (3.18)

We can represent this space by a rectangular array of points separated by distanceshin both x and y directions as shown in figure 3.7, and identify five points eachmarked by the symbol •. The point at the centre of this group is at (x, y). Wecan express the potential at the four outer points using a Taylor expansion of thepotential at the centre point and its derivatives at that point.

V (x − h, y) = V (x, y) − h∂V

∂x+ 1

2h2 ∂

2V

∂x2

− 1

6h3 ∂

3V

∂x3+ 1

24h4 ∂

4V

∂x4− · · ·

V (x + h, y) = V (x, y) + h∂V

∂x+ 1

2h2 ∂

2V

∂x2

+ 1

6h3 ∂

3V

∂x3+ 1

24h4 ∂

4V

∂x4+ · · ·


with two similar expressions, but involving y-derivatives, for V (x, y − h)

and V (x, y + h). Adding all four equations we find

V (x − h, y) + V (x + h, y) + V (x, y − h)+V (x, y + h) − 4V (x, y)

= h2(∂2V

∂x2+ ∂2V

∂y2

)ignoring higher order terms beginning with

+ 1

12h4

(∂4V

∂x4+ ∂4V

∂y4

).

Equation (3.18) shows that the right hand side of this equation is zero. The potentialat any point on the lattice can therefore be written as the mean of the potentials atthe four nearest neighbour points. In order to apply this result to a real problem,fixed values of potential must be assigned to all the points on a boundary enclosinga region of our two dimensional space. For the non-boundary points it is sometimespossible to guess approximate potentials, but unless the array is quite large it isusually enough to set all the other potentials to zero. We then move systematicallyover the space replacing the potential at each point by the appropriate average. It isusual to start at a place on the boundary where regions at different potentials meet.As an example, suppose that the potentials at points (x, 10) and (0, y) are fixedat potential 1 and those at (0, x) and (10, y) are fixed at zero, x and y taking allvalues from 1 to 9; the potentials at the corners are set to 0.5, 1, 0.5, 0 in sequence,defining a square boundary. We would then begin by writing

V (1, 1) = 14 [V (0, 1) + V (2, 1) + V (1, 0) + V (1, 2)]

= 14 [1 + 0 + 0 + 0] = 0.25

then increase x to the end of the row and then move up one row, return to the leftand so on. A single passage across the array is not going to give us the solutionand we must repeat the whole cycle many times until the changes from one cycleto the next are sufficiently small. This process of repeated iteration is known asrelaxation. Notice the change in notation here: distances are expressed in unitsof h.

3.4.1.1 Improving the speed and precision. The relaxation method outlinedabove will always give a solution, but the precision will depend on the scaleof the array of points and the time taken to converge to an adequate degree willincrease as the square of the number of points. This conflict between speed andprecision can be resolved to some extent by first relaxing a rather coarse arrayand then inserting additional points between the array points, interpolating to findstarting potentials for these new points and then continuing the relaxation process.However, a much greater increase in speed can be obtained by a procedure known


Figure 3.8 The approach of the potential on the diagonal of a squarelattice (51×51 points) to the final value. (a) The difference between thepotential after n cycles of iteration and the final potential, illustratingthe exponential approach. The upper curve and those of figure 3.8(b)are the result of five point iteration; the lower curve is the resultof nine point iteration, discussed later. (b) The potential after ncycles of iteration with acceleration parameters of (from the right)1.0, 1.4, 1.8, 1.85, 1.90.


as ‘successive over-relaxation’. The basis for this is that each time we calculatethe potential at some point, we move it towards its final value, but by only a smallamount, and indeed by an amount which is progressively smaller. Over-relaxationinvolves replacing the potential at a point not by the simple mean of that at its fourneighbours but, recognizing that next time around it will be changed even more,we make a bigger change immediately. We write

V (x, y) = g

4(V (x − 1, y) + V (x + 1, y) + V (x, y − 1) + V (x, y + 1))

− (g − 1)V (x, y)†

where V (x, y)† is the value of V (x, y) from the preceding iteration and g is calledthe acceleration parameter. As an example of the increased speed that can beobtained by appropriate choice of acceleration parameter, we show in figure 3.8the behaviour of a square array similar to that of figure 3.7, but with 51 points on aside. We know from symmetry that the potential of all the points on the diagonalwith x = y will become 0.5 eventually and so the sum of the potentials for the 49points with 0 < x = y < 50 will be 24.5. Figure 3.8(a) shows the exponentialapproach to this final value for no acceleration, and figure 3.8(b) shows, on linearscales, the sum potential for a number of values of g. With no acceleration thesum reaches a value within 0.0001 of 24.5 only after more than 3000 cycles ofiteration, while for g = 1.90 this condition is reached after 95 cycles. Noticethat the final value is reached, in this case, from above. The effect of increasing g

from 1 to larger values resembles very much the effect of reducing the dampingof a resonant circuit and, if g is increased too much, the system becomes unstableand will oscillate for g � 2; it is the first signs of this which we see for g = 1.90.The maximum safe value of g depends on the size of the array and also on itsshape. This example has only 2401 free points and would be regarded as a rathersmall array. We commonly use arrays of up to some 500 000 points with aspectratios of six to 20 and have found no serious problems with values as high as 1.97.The Pascal program RELAX51 used for this calculation is shown, in outline, on thefollowing page, and a compiled version, with certain refinements, is included onthe program disk.

3.4.1.2 Estimates of g. Expressions from which the value of the optimum valueof the acceleration parameter may be estimated are given by Press et al [7] andfigure 3.9 is based on these expressions. It is clear that only for very small arrayswill values less than 1.7 be needed.


PROGRAM relax51;VAR V:array[0..50,0..50] of double;

x,y,c:integer;g,Vsum:double;

PROCEDURE average;VAR x,y:integer;BEGINfor x:=1 to 49 do BEGINfor y:=1 to 49 do BEGINV[x,y]:=g*(V[x+1,y]+V[x-1,y]+V[x,y+1]+V[x,y-1])/4

-(g-1)*V[x,y];END;

END;Vsum:=0;for x:=1 to 49 do Vsum:=Vsum+V[x,x];writeln(c:4,Vsum:16:5);

END;BEGINREPEATREPEAT

clrscr;gotoxy(16,6);write(’ENTER a value for the acceleration parameter,

g: ’);c:=wherex; writeln;gotoxy(16,7);write(’(1’,chr(243),’ g < 2), but g=0 terminates

the program’);gotoxy(c,6);readln(g);IF g=0 THEN exit;

UNTIL (g >= 1) and (g < 2);clrscr;c:=1;Vsum:=0;for x:=1 to 50 do for y:=0 to 49 do V[x,y]:=0;for x:=0 to 49 do V[x,50]:=1;for y:=1 to 49 do V[0,y]:=1;V[0,0]:=0.5;V[50,50]:=0.5;while abs(Vsum - 24.50) > 0.0001 do BEGIN

average;c:=c+1;

END;readln;

UNTIL g=0;END.


Figure 3.9 Estimates of the acceleration parameter, g, for arrays ofY (R) × X(Z) points. The abscissa is the aspect ratio, X/Y , whichallows values for a wide range to be shown compactly.

3.4.2 Cylindrical symmetryIn cylindrical polar coordinates Laplace’s equation (2.1) is

∂2V

∂z2+ ∂2V

∂r2+ 1

r

∂V

∂r= 0.

Consider a rectangular array of points separated by distances h with the polar axisforming the lower boundary. We can write for the potential close to the point (z, r)

V (z, r + h) = V (z, r) + h∂V

∂r

+ 1

2h2 ∂

2V

∂r2+ 1

6h3 ∂

3V

∂r3+ 1

24h4 ∂4

∂r4+ · · · (3.19a)

V (z + h, r) = V (z, r) + h∂V

∂z

+ 1

2h2 ∂

2V

∂z2+ 1

6h3 ∂

3V

∂z3+ 1

24h4 ∂4

∂z4+ · · · (3.19b)

with similar expressions for the two other neighbouring points. Taking appropriatesums and differences and ignoring terms of fourth order and above we can write


V (z + h, r) + V (z − h, r) − 2V (z, r) = h2 ∂2V

∂z2

V (z, r + h) + V (z, r − h) − 2V (z, r) = h2 ∂2V

∂r2

h

2r

[V (z, r + h) − V (z, r − h)

] = h2

r

∂V

∂r.

The right hand sides of these three equations are, term by term, equal to h2 timesthe terms of Laplace’s equation, and so the sum of the left hand sides must be zero.We can therefore write

V (z, r) = 14

[V (z + h, r) + V (z − h, r) + V (z, r + h) + V (z, r − h)

]+ h

8r

[V (z, r + h) − V (z, r − h)

]=

[V (z + h, r) + V (z − h, r) +

(1 + h

2r

)V (z, r + h)

+(

1 − h

2r

)V (z, r − h)

]/4

where the latter version is better suited to computation as the r-dependentcoefficients can be calculated once for all.

These expressions cannot be applied to points on the axis, as the final termwould become infinite. For this case we have to consider six points: two axialpoints, one to either side of the target point, and four points in a plane perpendicularto the axis.

V (z, 0) = 16

[V (z + h, 0) + V (z − h, 0) + 4V (z, h)

].

As in the two dimensional case, the convergence of the calculation can be madesignificantly faster by over-relaxation.

V (z, r) = g

4

[V (z + h, r) + V (z − h, r) +

(1 + h

2r

)V (z, r + h)

+ (1 − h

2r

)V (z, r − h)

]− (g − 1)V (z, r)†

V (z, 0) = g

6(V (z + h, 0) + V (z − h, 0) + 4V (z, 1)) − (g − 1)V (z, 0)†.

3.4.2.1 Boundary conditions. It is a straightforward matter to apply theconditions at the electrode surfaces, but the gaps between electrodes can presentproblems and there are other ‘boundaries of convenience’which can be introducedto reduce the computational space and time requirements. Considerable reductions


can be made by recognizing planes of symmetry in the problem. Figure 3.10shows the electrodes of two aperture lenses, one with two apertures and the otherwith three. The origin of coordinates is at the centre of the lens. In the case ofthe two aperture lens, the symmetry is obvious and a suitable set of boundaryconditions would be V = 0 on the left hand electrode, V = 0.5 on the plane ofsymmetry (a boundary of convenience) and a linear rise from 0 to 0.5 along theline from a to b. It is easy to test the validity of this last condition by applying it ata large radial distance and examining the change in potential distribution betweenthe aperture plates as a function of radial position: it is usually very good providedthat (D3 −D1)/A is greater than about two. Only the mesh in the region indicatedneeds to be relaxed. The potential gradients towards the left hand end of thisregion are quite low and the relaxation process will converge very slowly if it isstarted at that end. It is far better to start the process in a region which will havelarge potential gradients such as the junction between the mid-plane boundary andthe outer cylindrical surface as this will tend to drive the effects of the differentboundaries across the mesh. One might think of the method as a type of diffusionprocess and, in a later section, we shall see that this is quite a realistic description.

For the three aperture lens it is necessary to find the potential distribution for twocases. The first, asymmetric, case has potentials of 0, 0.5 and 1 on the left, centreand right electrode respectively, and a potential of 0.5 can be assigned to the centreplane leaving, again, only the hatched area to be analysed. The symmetric case,with potentials 0, 0.5 and 0 on the three electrodes needs more thought becausethe potential on the centre plane is not part of the fixed boundary conditions buthas to be found. This is not difficult. For 0 < r < D1/2 we write

V (0, r) = 14 [V (0, r − h) + V (0, r + h) + 2V (−h, r)]

+ h

8r[V (0, r + h) − V (0, r − h)]

and on the axisV (0, 0) = 1

3 [V (h, 0) + 2V (0, h)]

with appropriate modifications for acceleration.If the electrode potentials are set at V1, V2 and V3, the axial potential can be

written as

V (z) = V1 + (V3 − V1)+a(z) + (2V2 − V1 − V3)+s(z)

where +a(z) and +s(z) are the axial potential functions found for the asymmetricand symmetric cases respectively.

3.4.3 Target conditionsFor the example of a square mesh considered in section 3.4.1, the values of thepotential on all the diagonal points were known from the symmetry, and testingfor convergence by examining the difference between the value after each iteration


Figure 3.10 Aperture lenses of two and three elements showing theimportant dimensions and the regions over which the potential mustbe solved. The cylinders have to be bounded by a known potential attheir outer ends, but the length would normally be much greater than isshown here, as is implied by the broken-line sections.

cycle and this known value was not a problem. Even in this situation, a simple testmay fail if the acceleration parameter is too large, because the overshoot illustratedin figure 3.8(b) for g = 1.90 may, by chance, satisfy the condition on the risingpart. In general the true potential distribution is not known a priori and someother method of testing for the convergence of the iterative procedure has to beemployed.

It is a fair assumption that eventually the potentials at all points will be as closeto the true values as the discrete nature of the analysis will allow. At some pointsthe potential will be too high and at others it will be too low, but after the nextcycle of iteration these specific values will change and may, indeed, alternate onsubsequent cycles. The magnitude of these changes will be some measure of the


Figure 3.11 The dependence of |BV | on the iteration number for thearray of RELAX51 calculated using the five point method using singleand double precision arithmetic and with two values of the accelerationparameter, g.

effect of the discrete nature of the process and will also depend on the accelerationparameter. It is useful to examine their behaviour as it may be a guide to thecompletion of the iterative process.

It would be impracticable to examine the changes at every point for a largearray, typical of lens systems, but we shall see later that all the lens parameters ofgeneral interest can be deduced from the axial potential and changes in this can bean adequate guide. We write δV (j) as the change in the potential at the j th (axial)point, and form the sums

BV =∑j

δV (j) |BV | =∑j

|δV (j)|.

We expect the first sum to take either sign and to approach zero, but the second sumis always positive and will decrease, tending to some finite value. This limitingvalue will depend on the precision of the calculation and also, though this is onlyapparent for quite large arrays, on the acceleration parameter, g.

Figure 3.11 illustrates the behaviour of |BV | for the array of RELAX51 wherethe δV are summed along the diagonal and the iteration is continued for 2000 ormore steps. The results are shown for two values of g, 1.60 and 1.90, using singleand double precision arithmetic, and also for a sequence, shown by the brokenlines, in which g = 1.90 until |BV | < 10−8 from which point the lower value ofg is used.

There are a number of points of interest in this figure. The asymptotic valuesfor the single and double precision calculations do not depend significantly on the


value of g. We shall see later that there is a much clearer dependence in the case oflarge arrays. The approach to the asymptotes has a slope which depends only onthe value of g even in the case where most of the approach is made more rapidly,using a larger initial value. This shows that much calculation time can be savedby starting the iteration with a large, even excessive value. Again, this will bediscussed further in the context of lens geometries in the appendix. Finally, thereare glitches separated by 100 iterations in the approach of the g = 1.90 curve toits asymptote and, at the point where the value of g is reduced, the value of |BV |falls rapidly before adopting the slower decrease appropriate to the new value.

The demonstration program RELAXCYL.EXE on the disk uses this criterion anddisplays both BV and |BV | at each step and halts when |BV | = 0.00001. Notethat the sign of BV alternates to some extent for large values of g. The glitches infigure 3.11 mentioned above correspond to large negative values ofBV . The readermay care to examine the dependence on g of the number of iteration cycles requiredto reach the end point. A graph of log n against g emphasizes the importance of thecorrect choice of value, while a graph of n against log(2 − g) is also interesting.Both graphs show the effects of incipient oscillation for values above the optimum.

3.4.4 Precision and accuracyThe precision of the final result of a relaxation calculation depends on the sizeof the array. With other factors being the same, the precision increases directlywith the number of points in the array. The accuracy is also affected, because thecontribution of the neglected higher order terms of the Taylor expansion becomessmaller.

There are particular problems close to edges and corners of electrodes, wherethe potential changes rapidly. It is possible to alleviate these if the mesh densityis increased locally in these regions by reducing h to a half or a quarter of itsvalue over the main system [10]. Unfortunately, it is just at the edges of electrodesthat the boundary conditions are not well known and increasing the mesh densitymay give a false sense of accuracy when only the precision of the solution to asomewhat different problem has been improved.

3.4.4.1 The limitations of the five point method. The accuracy of the five pointmethod may be estimated by noting that the potential at a given point differsfrom the mean of that at the nearest neighbours by an amount proportional to h4

or N−2 where N is the number of mesh points. However this is not the absoluteaccuracy, because these latter points do not have their ‘correct’values either so thereis a progressive change which will depend on N , leaving the absolute accuracyproportional to N−1. It is very easy to exploit this dependence by solving the sameproblem with different mesh spacing. We noted earlier that the speed of calculationcould be increased by repeated halvings of the mesh separation, and if the problemis tackled in this way, the information needed to extrapolate to an infinite numberof points is available. In figure 3.12, values of the potential at five points on the axisof a two cylinder lens calculated with mesh spacings of D/40, D/80 and D/160


Figure 3.12 (a) The potential at five points on the axis of a cylindricallens calculated by the five point method on lattices having spacing ofD/40, D/80 and D/160 plotted against the reciprocal of the numberof lattice points, illustrating extrapolation to an infinite lattice density.The five sets of data are shown with substantially suppressed zeros toemphasize the variation in slope. The data for 0.9 and 1.4 diametersare shown on ordinate scales magnified by 10 and 100 respectively.(b) The dependence of the slope of the lines of (a) (identified by theirsymbols), and similar plots, on axial position. The scale is magnifiedby a factor of ten for the positive values.


are plotted against the reciprocal of the number of mesh points. There is a cleardependence on N−1 though the slope varies and even changes sign. While thisextrapolation can only be applied at the spacing of the coarsest mesh, the smoothvariation of the slope can itself be interpolated so that the values of the potentialat intermediate points can be corrected.

This position dependent slope implies that the five point method can never givethe true axial potential and it is only this extrapolation procedure which allows thetrue potential to be found.

3.4.4.2 The nine point method. It is possible to increase the accuracy of thecalculation by considering not just the average of the potentials at four nearestneighbours, but by taking account of four further points. This leads to relaxationmethods known as ‘nine point relaxation’ in distinction to the ‘five point relaxation’we have discussed so far. If we consider the four points a distance ±2h from thetarget point in the two dimensional array [11] we can write

V (x − 2, y) + V (x + 2, y) + V (x, y − 2) + V (x, y + 2) − 4V (x, y)

= 16h2(∂2V

∂x2+ ∂2V

∂y2

)and this time the first neglected term is

+4

3h4

(∂4V

∂x4+ ∂4V

∂y4

)which is 16 times as great as for the four nearest neighbour case. We can eliminatethis term, leaving the sixth order term as the first to be neglected, if we give thenearest neighbour sum a weight of 16 and the ‘2h’ sum a weight of −1

60V (x, y) = 16[V (x − h, y) + V (x + h, y) + V (x, y − h) + V (x, y + h)]

− [V (x − 2h, y) + V (x + 2h, y) + V (x, y − 2h) + V (x, y + 2h)].

This choice of additional points leads to some problems close to boundary pointsand it is more usual to use the points at the corners of a square to eliminate thefourth order term. The expansion of the potential at these points to fourth orderrequires 15 terms:

V (x + h, y + h) = V (x, y)

+ h∂V

∂x+ h

∂V

∂y+ 1

2h2 ∂

2V

∂x2+ h2 ∂2V

∂x∂y+ 1

2h2 ∂

2V

∂y2

+ 1

6h3 ∂

3V

∂x3+ 1

2h3 ∂3V

∂x2∂y+ 1

2h3 ∂3V

∂x∂y2+ 1

6h3 ∂

3V

∂y3

+ 1

24h4 ∂

4V

∂x4+ 1

6h4 ∂4V

∂x3∂y+ 1

4h4 ∂4V

∂x2∂y2

+ 1

6h4 ∂4V

∂x∂y3+ 1

24h4 ∂

4V

∂y4. (3.20)


We leave it as an exercise for the reader to show that the fourth order terms canbe removed by giving the nearest neighbour terms a weight of four and the cornerterms a weight of unity.

The lower of the two curves of figure 3.8(a) is the result of this form of thenine point method. It reaches the target condition after 2577 cycles of iterationcompared with the 3094 cycles taken by the five point method. The total calculationtime is greater, however, being 0.35 m s per cycle compared with 0.7 m s per cycleon a 450 MHz Pentium III machine.

3.5 The nine point method applied to cylindrical geometry

The increases in precision and, especially, accuracy of the nine point method areso significant that a full derivation of the formulæ is appropriate.

In cylindrical polar coordinates, equation (3.20) becomes

V (z + h, r + h) = V (z, r) + h∂V

∂z+ h

∂V

∂r

+ 1

2h2 ∂

2V

∂z2+ h2 ∂2V

∂z∂r+ 1

2h2 ∂

2V

∂r2

+ 1

6h3 ∂

3V

∂z3+ 1

2h3 ∂3V

∂z2∂r+ 1

2h3 ∂3V

∂z∂r2+ 1

6h3 ∂

3V

∂r3

+ 1

24h4 ∂

4V

∂z4+ 1

6h4 ∂4V

∂z3∂r+ 1

4h4 ∂4V

∂z2∂r2

+ 1

6h4 ∂4V

∂z∂r3+ 1

24h4 ∂

4V

∂r4(3.21)

with three further expressions which differ from this only in the signs of the termsinvolving the odd order differentials.

3.5.1 The nine point formula for a general pointLaplace’s equation relates only first and second derivatives and, if we are to improvethe calculations beyond those considered in section 3.4.2, we have to eliminate thehigher order derivatives from the eight expressions of equations (3.19) and (3.21).By taking appropriate sums and differences of these expressions, we can writeexplicit expressions for the higher order derivatives in terms of the potentials andthe lower order derivatives.


h3 ∂3V

∂r3= 3

[V (z, r + h) − V (z, r − h)

] − 6h∂V

∂r

h4 ∂4V

∂r4= 12

[V (z, r + h) + V (z, r − h)

] − 24V (z, r) − 12h2 ∂2V

∂r2

h3 ∂3V

∂r∂z2= 1

2

[V (z + h, r + h) + V (z − h, r + h)

− (V (z + h, r − h) + V (z − h, r − h)

)]+ V (z, r − h) − V (z, r + h) (3.22)

h4 ∂4V

∂r2∂z2= 4V (z, r) + V (z + h, r + h) + V (z − h, r + h)

+ V (z + h, r − h) + V (z − h, r − h)

− 2(V (z + h, r) + V (z − h, r) + V (z, r + 1) + V (z, r − h)

)h4 ∂

4V

∂z4= 12

(V (z + h, r) + V (z − h, r)

) − 24V (z, r) − 12h2 ∂2V

∂z2.

We can find relations between these higher differentials by differentiating Laplace’sequation once and twice with respect to r and twice with respect to z.

∂3V

∂r3+ ∂3V

∂r∂z2+ 1

r

∂2V

∂r2− 1

r2

∂V

∂r= 0

∂4V

∂r4+ ∂4V

∂r2∂z2+ 1

r

∂3V

∂r3− 2

r2

∂2V

∂r2+ 2

r3

∂V

∂r= 0

∂4V

∂r2∂z2+ ∂4V

∂z4+ 1

r

∂3V

∂r∂z2= 0.

Substitution of the five expressions (3.22) into these three equations leads to asingle equation for V (z, r) in terms of the potentials at the eight adjacent points.The use of a computer algebra package, in our case DERIVE†, is very useful atthis stage.

60(8r2 + 3h2)V (z, r) = 6(16r2 + 7h2)(V (z + h, r) + V (z − h, r))

+ (96r2 + 48rh + 30h2 + 23h3/r)V (z, r + h)

+ (96r2 − 48rh + 30h2 − 23h3/r)V (z, r − h)

+ (24r2 + 12rh + 9h2 + 13h3/2r)(V (z − h, r + h) + V (z + h, r + h))

+ (24r2 − 12rh + 9h2 − 13h3/2r)(V (z − h, r − h) + V (z + h, r − h)).

(3.23a)Again, this cannot be applied to points on the axis, but the calculation in terms ofthe potentials at the five nearest points in the plane is straightforward.

† Soft Warehouse, Inc., 3660 Waialae Avenue, Suite 304, Honolulu, HI, USA.


3.5.2 The nine point formula for an axial pointA Taylor expansion of the potential near the axis gives

V (z, h) = V (z, 0) − h2

4

∂2V

∂z2+ h4

64

∂4V

∂z4(3.24a)

and the potential at the adjacent axial points follows from equation (3.19b) withr = 0 and the corresponding expression for V (z − h, 0).

V (z − h, 0) + V (z + h, 0) = 2V (z, 0) + h2 ∂2V

∂z2+ h4

12

∂4V

∂z4. (3.24b)

The potentials at the remaining two points are found by writing equation (3.19b)as V (z ± h, h), remembering that ∂2V /∂z2 has to be evaluated at these valuesof z :

V (z − h, h) + V (z + h, h) = V (z − h, 0) + V (z + h, 0)

−(h2

4

∂2V

∂z2

)z−h

−(h2

4

∂2V

∂z2

)z+h

+ h4

32

∂4V

∂z4

= V (z − h, 0) + V (z + h, 0)

−(h2

2

∂2V

∂z2

)z

− h4

4

∂4V

∂z4+ h4

32

∂4V

∂z4

= 2V (z, 0) + h2 ∂2V

∂z2

+ h4

12

∂4V

∂z4− h2

2

∂2V

∂z2− h4

4

∂4V

∂z4+ h4

32

∂4V

∂z4

= 2V (z, 0) + h2

2

∂2V

∂z2− 13

96

∂4V

∂z4. (3.24c)

The differential terms in equations (3.24a–c) can be eliminated by taking thesum, weighted by multipliers of 34, five and seven, respectively, which gives thefollowing expression for the potential at a point on the axis.

58V (z, 0) = 34V (z, h) + 5(V (z − h, 0) + V (z + h, 0))

+ 7(V (z − h, h) + V (z + h, h)).(3.23b)

Equations (3.23) were first derived by Durand [12], but a more recent,and perhaps more accessible, derivation by Szilagyi [2], has a misprint in thecoefficients of the V (z, r ± h) terms. Kasper [13] quotes an expression as validfor fields of various symmetries. However, for the axisymmetric case we areconsidering, his expressions give the coefficient of the V (z, r +h) term as exactlyfour times that of the V (z±h, r±h) terms which is only true in the limit of large r .The correct expression has been found by Edwards [10], but is not given in his


Figure 3.13 The fractional difference between the axial potential aftersufficient cycles of iteration to ensure convergence. The two data setsare for 40 000 iterations with g = 1.80 and 10 000 iterations withg = 1.97.

paper, but only in a separately published appendix with only the barest outline ofderivation.

3.5.3 The convergence of the potential in the lens geometryThe criterion that |BV | has reached some chosen value, or that it has reached itsasymptotic value, is often adequate for a ‘closed box’ situation, but is not goodenough in the context of lens design, where the values of the potential are notknown and the range of these is very large. In this situation we must recognizethat reaching the asymptotic value is only the beginning and subsequent cycles ofiteration are required to drive the correct values of the potential along the lengthof the lens. The potential distribution has therefore to be tested for consistencyunder continued iteration. If the iteration has converged completely, values ofthe potential at each axial point will not have changed, to within the numericalprecision, after some further cycles.

The most sensitive way to examine this is to calculate the fractional differencebetween the values

V1(z) − V2(z)

V2(z)

for all values of the axial position z, where V1 and V2 correspond to differentnumbers of iterations. As an indication of the precision we can expect, figure 3.13shows the comparison between two data sets after full convergence has beenobtained. There is a clear periodicity which shows that we are seeing differences inthe last significant digit of the double precision numbers we have used. The valueof the potential at z/D = 0 is 0.5 while at z/D = 9.9 it is 1.0 · · · × 10−21. Each


Figure 3.14 The fractional difference between the axial potential after10 000 cycles of iteration with g = 1.95 and at five intermediate stages.

of the 21 � marks the axial position where the number changes from 1.0 × 10−n

to 9.9 × 10−(n+1) and they echo the periodicity of the data. A histogram of thesedata with bins of width 10−15 shows that 48% are in the zero bin with almost equalnumbers greater (26.3%) and less (25.7%) than zero.

Figure 3.14 shows the situation before convergence. The values of the fractionaldifference are so large that we cannot use the linear display of figure 3.13. Thelogarithmic display we have to use loses the negative data, but the progress ofconvergence along the axis is quite clear. After 20% of the final number of iterationsthe axial potential is nowhere correct; after 30% it has converged to z/D ≈ 1.2;but it is only after some 55% that there is full convergence. In the calculationsused in the preparation of the potential data file, LPOTS.DAT, for LENSYS2.EXE thetests for convergence were of this type, but were automated, as described in theAppendix, and were made at rather closer intervals.

3.5.4 Comparisons of the five and nine point methodsThe advances in personal computer technology since the first edition have reducedthe importance of the extrapolation procedure of section 3.4.4.1 and it is nowpracticable to apply the finite difference method to significantly larger arrays andto realize the advantage in accuracy of the nine point relaxation method. For thecalculation of the potentials and fields used in the LENSYS2.EXE program we haveused arrays of up to 240 by 2400 and 160 by 3200 points. The details of thesecalculations are discussed in the appendix, but it will be useful to compare someaspects of the five and nine point methods here.

To illustrate the similar precision of the two methods, we show in figure 3.15 theiteration of a cylindrical array system with 80 radial points and 1600 axial pointsfor 8000 cycles. The iterations were started with g = 1.95 which was reduced to


Figure 3.15 The variation of |BV | for a cylinder lens lattice having 80radial and 1600 axial points calculated by both the five point (the uppercurve) and nine point methods. The approach of the axial potential toits final value is indicated by the © curve and was determined by anextension of the procedure mentioned in section 3.5.3 and described inthe appendix.

1.80 when |BV | fell below 10−12. Both sets of data approach the same limitingvalue, 7.0 × 10−15 with standard deviations of 6.5 × 10−16. The iteration timeson a 120 MHz Pentium computer were 0.19 and 0.28 s per cycle for the 1000 and822 cycles taken to reach the abrupt ‘step’ similar to that noted in the discussionof figure 3.11. The progress of the convergence of the axial potential is shownfor the nine point case. There is little point in continuing the convergence testbeyond z = 9D because the potential is distorted by the zero potential ‘end cap’at z = 10D.

The difference in accuracy is illustrated in figure 3.16. This compares the axialpotential found after adequate convergence of both calculations. The figure is acomparison of the fractional difference between the potentials determined by thetwo methods, both with g = 1.80.

V9(z) − V5(z)

V9(z)

where the notation should be transparent.We know that the five point method does not converge to the correct potential

and we shall see later that the nine point method does converge correctly in acertain sense, so the differences shown in this figure indicate the scale of the errorin the five point method as applied to a lens geometry. The information in thisfigure is similar to that in figure 3.12.


Figure 3.16 The fractional difference between the axial potentialsof the lens of figure 3.15 calculated by the five point and nine pointmethods.

The finite difference method is one of the easiest to program, provided that thegeometry is simple; curved surfaces need to be approximated by chords. It canbe much more accurate than practical lens systems justify, giving very good focaland aberration data for both cylinder and aperture lenses. There are commercialprograms available. The best known of these is SIMION†, which we have used toprepare figures 1.2(b), 4.3 and 5.6, but they all use the five point method which aswe have seen has limitations which may be significant.

3.6 Bessel function expansions

We saw in section 3.1 that the potential in an axisymmetric situation could beexpressed as

V (z, r) = (Akekz + Bke−kz)J0(kr) + V0. (3.1)

The general solution for a particular lens geometry will consist of a sum of suchterms and, in contrast to the treatment of section 3.1, we choose to treat positive andnegative values of kz explicitly. Figure 3.17 shows the geometry of a lens formedby two coaxial cylinders of equal diameter, D, separated by an axial distance, g,and having potentials V1 and V2. We show the walls of the cylinder to be of finitethickness as we shall later use a result from a finite difference calculation to resolvea problem inherent in this method. Three regions are indicated, in each of whichwe can write a specific expression for the potential. In regions I and III, within the

† A demonstration version of SIMION can be downloaded fromwww.srv.net/∼klack/simion.html.


Figure 3.17 Schematic diagram of a two cylinder lens showing thedimensions and identifying the regions to which different expressionsfor the potential apply. The origin of the coordinates is at the centre ofthe system.

cylinders, only one of the terms of equation (3.1) has physical meaning; the otherwould become infinite as |z| → ∞.

VI (z, r) = V1 +∞∑n=1

An exp(knz)J0(knr) (3.25a)

VII (z, r) = V1 + V2

2+

(V2 − V1

g

)z

+∞∑n=1

[Bn exp(−knz) + B ′n exp(knz)]J0(knr) (3.25b)

VIII (z, r) = V2 +∞∑n=1

Cn exp(−knz)J0(knr). (3.25c)

In order to find the coefficients, we must consider the conditions at the variousboundaries. At the inner surfaces of the cylinders the potentials are V1 and V2 so,unless An = Cn = 0, the Bessel functions must all vanish for r = D/2. Thepossible values of knD/2 are therefore the roots, j0,n, of J0 = 0 and will be foundin tables of Bessel functions [14]. This restriction on the values of kn means thatin region II the potential at r = D/2 varies linearly with z. This is not correct,but it is not a bad approximation for small values of g/D and allowances can bemade. Symmetry allows us to write

An = −Cn and Bn = −B ′n


and from the continuity of potential at the boundary between regions I and II wehave

An exp(−kng/2) = Bn[exp(kng/2) − exp(−kng/2)]

which allows us to express the potentials in all three regions in terms of the singleset of coefficients Bn.

VI (z, r) =V1 +∞∑n=1

Bn[exp(kng) − 1] exp(knz)J0(knr)

VII (z, r) =V1 + V2

2+

(V2 − V1

g

)z

+∞∑n=1

Bn[exp(−knz) − exp(knz)]J0(knr)

VIII (z, r) =V2 −∞∑n=1

Bn[exp(kng) − 1] exp(−knz)J0(knr).

The continuity of potential across the boundary between regions I and II impliesthat the radial component of the field is continuous across this boundary, but wehave not used the constraint that the axial component of the field must also becontinuous across this boundary,

∞∑n=1

Bnkn exp(kng/2)J0(knr) = V2 − V1

2gfor all r � D/2.

It is not easy to apply this constraint directly, but another approach allows all theBn

to be determined.

3.6.1 The variational principleIt is a general feature of any physical system that it will adjust itself to a condition ofminimum potential energy. We may therefore test the suitability of any expressionfor the potential by examining the energy stored in the associated field. Theenergy density of an electrostatic field can be written as 1

2ε0E2. If we write the

total energy of the field as the integral of the sums of the squares of the axial andradial differentials of the potentials we can determine the coefficients by equatingthe differentials of this integral with respect to Bn to zero. The expression for theenergy, W , contains three terms, one for each of the regions, but we need onlyevaluate one and a half of these because of the symmetry about the central plane.

W = πε0

{ −g/2∫−∞

D/2∫0

[(∂VI

∂z

)2+

(∂VI

∂r

)2]r dr dz

+0∫

−g/2

D/2∫0

[(∂VII

∂z

)2+

(∂VII

∂r

)2]r dr dz

}.


The evaluation of these integrals is considerably simplified by the orthogonalityof the Bessel functions.

Applying the condition ∂W/∂Bn = 0 gives

Bn = (V2 − V1)exp(−kng/2)

(g/2)k2nDJ1(knD/2)

where J1(knD/2) is the Bessel function of order one evaluated at the zeros of J0.The properties of the field do not depend on the scale of the system and it

is convenient to express all the dimensions in terms of the diameter, D, of thecylinders.

G = g/D Z = z/D R = 2r/D

withKn = knD/2

Values of Kn and the corresponding values of J1(Kn) to ten decimal places canbe found in tables [14] for 1 � n � 150. Writing Qn = 1/2K2

nJ1(Kn) we haveBn = (V2 − V1)Qn exp(−KnG)/G. The exponential terms involve (±2Z ± G)

so we write

a1n = exp[Kn(2Z + G)] and a2n = exp[Kn(2Z − G)]

and express the potentials as

VI (Z,R) = V1 + V2 − V1

G

∞∑n=1

QnJ0(KnR)(a1n − a2n) (3.26a)

VII (Z,R) = V1 + V2

2+ V2 − V1

GZ

+ V2 − V1

G

∞∑n=1

QnJ0(KnR)(1/a1n − a2n)

(3.26b)

VIII (Z,R) = V2 − V2 − V1

G

∞∑n=1

QnJ0(KnR)(1/a2n − 1/a1n). (3.26c)

Successive terms of the summations converge quite rapidly except close tothe boundaries between the regions, and at axial distances greater than some 1 1

2diameters only the first term makes any significant contribution. The axial potentialthen approaches that of the electrode as e−2K0Z where K0 = 2.4048. As G → 0the exponential terms approach 2KnG exp(−2KnZ), the (2KnG) cancels and theexpressions reduce to

2∞∑n=1

exp(−2Kn|Z|)KnJ1(Kn)

J0(KnR).


Figure 3.18 The potential in the gap between two cylinders atpotentials 0 and 1. The full curve shows the ‘cubic’ result of Bonjour[15] and the points show the results of nine point relaxation with alattice spacing of D/320. The straight line is a least squares fit to thesedata and its significance is discussed in the text.

3.6.2 The problem of the non-linear potential in the gapFor small gaps the departure of the true potential between the cylinders at r = D/2from a linear dependence on z is not really serious, but nonetheless it is possibleto make a correction in one of two ways. Instead of writing the second term ofequation (3.25b) as (V2 − V1/g)z a third order term can be added [15], and thevariational calculation repeated with the slightly more complicated expression

(V2 − V1)

[(1 − a

2

)z

g+ 2a

(z

g

)3].

For a zero gap lens, a = 1, and decreases quite slowly, being about 0.75for g = D/2. This is too simple to represent the true situation except, perhaps, forcylinders with extremely thin walls. Thick walled cylinders are more practicableand have the further virtues of shielding the cylindrical space from charged surfacesoutside the lens system, but within the vacuum chamber.

The true variation of the potential at r = D/2 may be found using the finitedifference method to relax the potential not only within the cylindrical surface,but outwards to a greater radius within the gap as discussed in section 3.4.2.1in the context of aperture lenses. Figure 3.18 illustrates this variation for agap, g = D/10. The cubic form is clearly a poorer representation than the simplelinear variation and the true shape requires higher order terms to describe it. It


Figure 3.19 The fractional difference between the axial potentialcalculated using the Bessel function expansion and that calculatedusing nine point relaxation. Six values of the gap scaling factor, from1.089 960 to 1.089 970 were used, the topmost curve corresponding tothe smallest value.

would be quite difficult to repeat the variational calculation with fifth and seventhorder terms but the straight line of slope 9.135 provides a guide to improving theapproximation. Instead of using the true value of the gap in the expressions forthe potential, we can multiply the this by a scaling factor slightly larger than onesuch that the axial potential agrees with that calculated by the nine point relaxationmethod to a high degree of accuracy except close to the actual gap.

We show in figure 3.19 the fractional difference between the potential calculatedusing a range of scaling factors, and that calculated by the nine point relaxationmethod for a lens represented by arrays 160×3200 for the cylinder and 160×160for the gap between the cylinders. Interpolation between these values of gap scalingfactor suggests that a value of 1.089 662 will give quite excellent agreement forz/D > 1 and very good agreement close to the gap. The greatest fractionaldifference is 3.4 × 10−6 at D/4.

In addition to providing a value for the gap scaling factor for a nominal gap ofD/10, figure 3.19 shows that the nine point method gives an accurate descriptionof the axial potential. The agreement extends over the whole volume of the lenscylinder as is shown in figure 3.20. Contours of constant fractional difference showthat, out to a radius of D/4, the agreement is better than 10−5. The disagreementfor smaller values of z/D is due to the approximation used in the Bessel expansioncalculation.

3.6.3 Calculation of the potentialA Pascal function which calculates the potential at the point (Z,R) using


Figure 3.20 The fractional difference between the potentialcalculated using the Bessel function expansion and that calculated usingnine point relaxation with a gap scaling factor of 1.089 662. The spaceis divided by the contours of zero value into region in which the valueof the fractional difference is positive or negative. The other contourscorrespond to values of ±10−6, 10−5 and 10−4.

equations (3.26) is shown on the following page. It uses a further function BessJ0(x)

which can be taken directly from an example in Numerical Recipes [7]. The valuesare appropriate for cylinder potentials of zero and unity and for a nominal gap ofD/10. The main program must supply values for K[n], the nth zero of the Besselfunction of order zero, and Q[n]. Both of these can be determined from values intables [14].

At each position the potential, V , is calculated by summing terms dV [c] untilthe absolute value of a term is less than some limiting value (taken here as 10−8)or the number of terms exceeds 150, when a rounding-off procedure is applied toestimate the sum to infinity.

The series converges rather slowly close to the boundaries between the regions.If it has not converged after 150 terms, we need to invoke the round-off procedure.By considering the continuity of the gradient at the boundary between regions Iand II, it is easy to show that

∞∑n=1

1

KnJn

≡∞∑n=1

1

k[n]j [n]= 0.5

but the sum to 150 terms is, at 0.471 15 · · ·, significantly different from this. Thesums to odd and even numbers of terms differ from the true sum by amounts whichalternate in sign and decrease progressively. A simple average introduces some


FUNCTION BesselPot(R,Z:double):double;CONSTGap=0.109362;

VARLimit,Z1,Z2,V0:double;C:integer;S:ARRAY[0..156] of double;dV:ARRAY[0..150] of double;

PROCEDURE RoundOff(C:integer);BEGIN

S[C+3]:=((S[C]*S[C-2]-SQR(S[C-1]))/(S[C]-2*S[C-1]+S[C-2]));

END;BEGIN

Limit:=0.00000001;Z1:=2*Z + Gap; Z2:=2*Z - Gap;C:=1; S[0]:=0; dV[0]:=1;WHILE ABS(dV[C-1]) > LIMIT DOBEGIN

IF Z > Gap/2 THEN BEGINdV[C]:=Q[C]*BessJ0(K[C]*R)*(exp(-K[C]*Z1)

-exp(-K[C]*Z2))/Gap;V0:=1; END

ELSE IF Z < -Gap/2 THEN BEGINdV[C]:=Q[C]*BessJ0(K[C]*R)*(exp(K[C]*Z1)

-exp(K[C]*Z2))/Gap;V0:=0; END

ELSE BEGINdV[C]:=Q[C]*BessJ0(K[C]*Z)*(exp(-K[C]*Z1)

-exp(K[C]*Z2))/Gap;V0:=0.5+Z/Gap; END;

S[C]:=S[C-1]+dV[C]; V:=V0+S[C];IF C < 150 THEN C:=C+1

ELSE BEGINRoundOff(148);RoundOff(149);RoundOff(150);RoundOff(153);V:=V0+S[156];dV[149]:=0;

END;END;BesselPot:=V;

END;


bias as a result of this progressive decrease, but a mean of three successive termsis really very good. The round-off procedure used in this program is much morepowerful than is really necessary and could probably be applied safely to estimatethe sum to infinity after only the first 30 or so terms had been calculated. It is usedto extrapolate from the sums to 146 to 150 terms by using

S[C + 3] = S[C] × S[C − 2] − S2[C − 1]

S[C] − 2S[C − 1] + S[C − 2]

with C = 148, 149 and 150 to generate pseudosums, S[151], S[152] and S[153]which are then averaged again to give a final pseudosum, S[156], which is within10−10 of 0.50 for the exact boundary case. Away from the boundaries, the seriesconverge quite quickly: 20 terms being enough by |z/D| = 0.15.

The Bessel function expansion is by far the fastest method for calculating thepotential in a two cylinder lens. On a 450 MHz Pentium III machine, the potentialcan be evaluated for a lattice of 160 radial and 3200 axial points in a quarter of aminute. The nine point finite difference method requires about half an hour!

4 The optics of simple lens systems

4.1 Aperture lenses

One might almost say that aperture lenses are the ‘traditional’ particle lens. Agreat variety of detailed shapes and proportions have been used which makes itdifficult to prepare tables of focal data for general application. In the context oflow energy beams there is no need for smoothly rounded electrodes: simple thindiscs with circular holes can be used and, provided that there is some agreementabout the spacing of the discs, it should be possible to make systems of two orthree apertures which behave in a predictable fashion. Figure 3.10 shows theimportant dimensions of such systems. For the purposes of scaling, it is usual toexpress all the dimensions in terms of the diameter,D1, of the apertures themselves.The significance of the outer diameter of the discs, D3, was discussed earlier inthe context of establishing a simple boundary condition to use in the relaxationcalculation, but it has a further, practical, significance in that the lens system doesnot exist in a vacuum, but rather in a vacuum system which has walls which willeither be of metal and therefore have some definite potential, or of an insulatingmaterial and have a rather indefinite potential. In both cases it is important thatthis should not influence the potential distribution within the lens and it is as wellto ensure that (D3 − D1)/A > 3.

A similar argument requires that the first and last apertures be mounted oncylinders to guarantee uniform potentials in the object and image spaces, andthe diameter, D2, of these cylinders will affect the potential distribution and,therefore, to some extent the focusing properties. Figure 4.1 shows the axialpotential distribution in a two aperture lens with A = D1/2 and a number ofvalues of D2/D1. Near the centre of the lens the distributions are very similar, butwith increasing distance from the centre the potential tends towards the simpleexponential decay characteristic of the dominant term in the Bessel functionexpansion, V (z, 0) ∝ exp(−k0z/

12D2). Most of the calculations were made with

the length of the D2-cylinder great enough that the potential fell smoothly, butone example is included of a cylinder of length 4D1, and therefore only 4D2/3, toillustrate the way in which the end-plate can influence the potential distribution.

61


Figure 4.1 Axial potential distributions for two aperture lenses havingA = D1/2. Values of the ratio D2/D1 are shown on the diagram. Thepotentials all have the value 0.5 at the lens centre, but fall with increasingaxial distance, z, as exp(−4.8096(z/D2)). The potentials approachthese asymptotes from above except in the case of the cylinder lens. Isthere a value of D2/D1 for which the potential falls exponentially forall z?

The different potential distributions away from the apertures themselves affectthe properties of the lens as illustrated in figure 4.2. This shows the first focallength, f1, mid-focal distance, F1, and spherical aberration coefficient, Cs0 , fora voltage ratio, V2/V1 = 10, as functions of the ratio D2/D1, and includes thecase of the cylinder lens of the same spacing, which one might consider to bea rather extreme aperture lens having D2/D1 = 1. The overall effect on thefocal properties is not large, but the lower aberration of the cylinder lens is quitenoticeable. Changes of a similar scale result if the separation of the apertures isdoubled, but whereas this is a fairly obvious effect, the effect of changes in D2 isa little more subtle. The thickness of the aperture discs also plays a part, but noton the same scale. The data presented in the LENSYS program are for D2/D1 = 3which represents a compromise between reduced dependence at large values ofthe ratio and sheer physical size of the electrodes.

4.1.1 Three-aperture lensesAs is the case with all lenses with three elements, the principal function is thedecoupling of the focal properties of the lens, in particular the behaviour betweenconjugates, from the ratio of the potentials of the image and object spaces. Datafor this sort of application are generally presented in the form of plots of theratio V2/V1 against V3/V1. These plots are usually called ‘focal loci’ or ‘zoom

The optics of simple lens systems 63

Figure 4.2 Focal lengths (©), mid-focal distances (•) and sphericalaberration coefficients (�) for two aperture lenses having A = D1/2and V2/V1 = 10.

curves’ and lenses used in this fashion are referred to as ‘zoom lenses’, but thebehaviour is more restricted than the use of this term in light optics would imply.We shall return to this topic in section 4.6.1.

4.1.1.1 The Einzel lens. Nowadays this term is understood to refer to a threeelement lens operated symmetrically, with V3 = V1 and with V2 taking anyappropriate value. Historically the Einzel, or unipotential, lens signified a systemwith V2 held at the potential of the cathode. The major advantage of this is thatthe focal properties, depending as they do on voltage ratios, are the same for allvalues of V1 = V3. The disadvantage is that the aberrations of this lens are greaterthan those of the symmetric lens with V2 > V1,3.

The potential distribution in an Einzel lens is different in character from thatshown for the two cylinder lens in figure 2.1(b) and we illustrate this in figure 4.3.Apart from the bowing out of the equipotentials which occurs for any aperture lens,the potential close to the centre forms a saddle where the potential rises (say) inthe r-direction and falls in the ±z-direction. This general form will apply to anythree aperture lens for which V2 is outside the range spanned by V1 and V3. Oneof the equipotentials crosses the axis at the centre and it is simple to show that theangle between this equipotential (the value of which will depend on the thicknessof the aperture plates) and the axis is ±54.7◦. Expanding the axial potential as

V (z, 0) = V0(z, 0) + zV ′0(z, 0) + 1

2z2V ′′

0 (z, 0) + · · ·and substituting this into equation (2.2) gives

V (z, r) = V0(z, 0) + zV ′0(z, 0) + 1

2z2V ′′

0 (z, 0) − 14 r

2V ′′0 (z, 0) + · · · .


Figure 4.3 Equipotentials in an Einzel lens. Near the centre of thelens the equipotentials are at 0.300, 0.294 and 0.288 of the potentialdifference between the outer and centre electrodes. Close to the axisthe equipotentials are hyperbolæ, and at the centre they form a pair ofstraight lines which cross the axis at angles of ±54.7◦.

V ′0(z, 0) = 0 at the saddle point and V (z, r) = V0(z, 0) along the equipotential,

so the equation of the equipotential close to the saddle point is 12z

2 = 14 r

2 fromwhich the result for the angle follows immediately.

4.2 Cylinder lenses

Cylinder lenses offer a number of advantages over aperture lenses of the sameinner diameter. For a given voltage ratio they are a little stronger and have smalleraberration coefficients, but there are two quite important mechanical advantagesas well. They are easy to make and to mount: it is easier to turn cylinders on alathe than to punch clean-edged holes in metal sheet. It is also simple to makelens systems with rather small inter-electrode gaps. This is an important factor inkeeping the potential distribution in the gap at the inner surface of the electrodesclose to that assumed in the calculation of the entire potential distribution and alsoensures good screening from any stray electric fields around the lens structure.

The potential and field distributions in a system of several coaxial cylinderscan be found quite adequately by the superposition of those for contiguous pairs,provided that the gaps between the cylinders are small (say, �> 0.2 D) and thecylinders themselves are rather longer than the gaps. In practice it is common


to use cylindrical elements no shorter than D/2 (gap centre to gap centre) withgaps of D/10. This allows lens systems having a large number of cylindersto be designed and analysed, procedures which would be much more difficultwith aperture electrodes. If the gaps were large, it would be necessary to solveeach geometry with appropriate symmetric and antisymmetric potentials on allelectrodes, as in the case of the three aperture lens.

4.3 The calculation of ray paths

There are no purely analytic solutions to this problem, only various proceduresfor the numerical integration of the path starting from a known position and slopeat some point. The choice of method depends on the extent and quality of thedata. If only the axial potential is known, then integration of the Picht equationcan be done, and there is one procedure which is particularly well suited to this.On the other hand, if the potential and its derivatives are known quite accuratelythroughout the space, a direct application of the laws of simple dynamics willgive good results. If the potential is known everywhere, but to limited accuracy,various standard procedures are available. However, care should be taken todistinguish between the precision of the integration procedure and the accuracy ofthe trajectory. Ray tracing in the correct potential will give results which showthe effects of aberrations, while integration of the Picht equation will give only theparaxial, Gaussian, behaviour though corrections can be applied and aberrationcoefficients calculated even in this case.

4.3.1 Ray tracing in a known potentialIf the potential is not known really well throughout the space, the best approach isto use some standard expression for the integration of the differential equations,but if it and its derivatives are well known an analysis by direct application of thelaws of dynamics using expansions of the coordinates and velocities about eachpoint will give good results [16].

4.3.1.1 Integration of the equation of motion. Taking the time, t , to be theindependent variable we write the coordinates as

z(t) = z0 + t(dz

dt

)0+ t2

2

(d2z

dt2

)0+ t3

6

(d3z

dt3

)0+ · · · (4.1)

with a similar expression for r(t), where z(t) and r(t) are the axial and radialpositions of a particle, having initial coordinates denoted by the ‘0’subscripts, aftera short time interval t . The components of the velocity follow from differentiationof these equations

dz(t)

dt=

(dz

dt

)0+ t

(d2z

dt2

)0+ t2

2

(d3z

dt3

)0. (4.2)


The accelerations are given by(d2z

dt2

)0

= e

m

∂V (z, r)

∂z

and the third time-derivatives of the coordinates are therefore(d3z

dt3

)0

= e

m

(∂2V

∂z2

)0

(dz

dt

)0+ e

m

( ∂2V

∂z∂r

)0

(dz

dt

)0.

Substituting these expressions into equations (4.1) and (4.2), and introducing a‘reduced time’ τ = t

√2e/m to avoid the ‘ e

m’ which occurs in nearly every term,

we construct the following recurrence relationships.

z(t) = z0 + τ( dz

dτ

)0+ τ 2

4

(∂V

∂z

)0

+ τ 3

12

[(∂2V

∂z2

)0

( dz

dτ

)0+

( ∂2V

∂z∂r

)0

( dr

dτ

)0

]dz(t)

dτ=

( dz

dτ

)0+ τ

2

(∂V

∂z

)0

+ τ 2

4

[(∂2V

∂z2

)0

( dz

dτ

)0+

( ∂2V

∂z∂r

)0

( dr

dτ

)0

]with two similar expressions for the radial coordinate and velocity.

Only expressions for the potential which can be evaluated and differentiatedanalytically at any point, such as those expressed in terms of Bessel functionsor elliptic integrals, are likely to be good enough to yield accurate values forthe second derivative required for this type of analysis. The calculations aresurprisingly simple and we illustrate this using the potential in the gap regionof a two cylinder lens developed in section 3.3.6 (equation (3.26b))

VII (Z,R) = V1 + V2

2+ V2 − V1

GZ + V2 − V1

G

∞∑n=1

QnJ0(KnR)(1/a1n − a2n).

The five differential expressions can be written as

∂V

∂Z= (V2 − V1)

G− 2(V2 − V1)

G

∞∑n=1

QnKnJ0(KnR)(1/a1n + a2n)

∂V

∂R= −2(V2 − V1)

G

∞∑n=1

QnKnJ1(KnR)(1/a1n − a2n)

∂2V

∂Z2= 4(V2 − V1)

G

∞∑n=1

QnK2nJ0(KnR)(1/a1n − a2n)

∂2V

∂R2= − 2

R

∂V

∂R− ∂2V

∂Z2

∂2V

∂Z∂R= 4(V2 − V1)

G

∞∑n=1

QnK2nJ1(KnR)(1/a1n − a2n).


Very similar expressions apply in the outer regions, the principal differences arisingfrom the different exponents.

Simple modifications of the BesselPot function of section 3.6.3 may be usedto calculate these differentials. A Pascal procedure for the calculation of the J1

Bessel function required for the R-derivatives is given in [7].The accuracy of the ray integration depends on the interval in reduced time, Bτ ,

which is related to the step length, Bl, by Bτ = Bl/V 1/2 and it is important tochoose a suitable value. Too short an interval will waste computer time, but theoptimum interval will depend on the strength of the field and so it needs to beadjusted dynamically. The accuracy of the calculations is open to a simple test:the difference in potential between the initial and final points of each step shouldbe equal to the difference in the squares of the velocities of the particle multipliedby m/2e. If the fractional difference found after a step is greater than some chosenvalue, then the step should be recalculated with Bτ halved. If the differenceexceeds some smaller criterion, Bτ may be doubled for the next step.

4.3.1.2 Runge–Kutta methods. Unless we have a good enough potential wecannot make proper use of terms beyond the first order of the Taylor expansionof equation (4.1) and need to find some other way to make allowance for thehigher order terms. The Runge–Kutta methods are based on the replacement ofthese terms by first order expansions at nearby points. Figure 4.4 illustrates thesteps required to apply these methods. The function describing some aspect ofthe trajectory is represented by a curve starting at the point A, (xn, yn), and wewish to find the y-coordinate at C, for which xn+h = xn + h. We can writethis as yn+1 = yn + k where k is h times some mean gradient. The very firstapproximation would be to write k = k1 = hy ′(xn, yn) where the gradient of thefunction is evaluated at the point A(xn, yn). This would take us to the point D1

and is plainly not a good approximation.However, common sense and the mean value theorem tell us that at some point

along the curve the tangent is parallel to the chord AC joining the initial and thetrue final points. The next stage of approximation is therefore to evaluate k at themid-point, B1, of the chord AD1 and write k = k2 = hy ′(xn + h

2 , yn + k12 ), which

will take us to the point D2. This is a significantly better approximation and theerrors are of order h3: it is therefore referred to as the second order Runge–Kuttamethod. It is worth making two further steps of this sort, because the gain inprecision outweighs the complexity of the calculation, but beyond that point oneshould really question whether the potential function is good enough to justify theeffort. The four successive values for k can be summarized as

k1 = hy ′(xn, yn)k2 = hy ′(xn + h/2, yn + k1/2)

k3 = hy ′(xn + h/2, yn + k2/2)

k4 = hy ′(xn + h, yn + k3)

at A

at B1

at B2

at D3.


D1

C3

D

D2

1B

B2

A

Figure 4.4 A schematic illustration of the steps in the fourth orderRunge–Kutta method. AC represents the real trajectory and AD1 is thetangent to the trajectory at A. The gradient of the function at B1 is usedto find D2, the second approximation to C, then the gradients at B2 andD3 are evaluated and a weighted sum of the four gradients is then agood approximation to that of the chord AC.

These expressions have a nice symmetry and we write

k = k1

6+ k2

3+ k3

3+ k4

6where the weights given to the ki have been chosen such that the third and fourthorder terms in the Taylor expansions cancel and the errors are now of order h5

giving the fourth order Runge–Kutta method.To apply this to the tracing of ray paths, we write the axial, zn, and radial, rn,

positions and the corresponding velocities, un and vn respectively, in terms of thereduced time, τ , as before (i.e. un = dz

dτ etc) and apply the recurrence relations toall four variables.

zn+1 = zn + j rn+1 = rn + k un+1 = un + l vn+1 = vn + m

where j, k, l and m are the appropriate weighted sums. The individualapproximations are

j1 = hun

j2 = h

(un + l1

2

)

j3 = h

(un + l2

2

)j4 = h(un + l3)

l1 = −hEz(zn, rn)/2

l2 = −hEz

(zn + l1

2, rn + j1

2

)/2

l3 = −hEz

(zn + l2

2, rn + j2

2

)/2

l4 = −hEz

(zn + l3, rn + j3

)/2


with a similar set for the radial motion. h is an interval of reduced time, and Ez

and Er are the axial and radial components of the electric field strength which mustbe found at all the indicated points by numerical differentiation of the potential.

4.3.2 Integration of the Picht equationThe Picht equation describes the evolution of a reduced radius defined by R(z) =r(z)V 1/4(z, 0) as a function of z. It involves the axial potential, V (z, 0) and itsfirst derivative and contains only the second derivative of R(z). There is a standardprocedure for the integration of such equations which is faster than fourth orderRunge–Kutta with the same step size and is effectively of higher order. We writethe Picht equation as

d2R(z)

dz2= −T ∗(z)R(z)

where

T ∗(z) = 3

16

(V ′(z, 0)

V (z, 0)

)2.

T ∗(z) = 316T (z)2 where T (z) was introduced in chapter 2. Some authors use T (z)

where we useT ∗(z), but it is necessary to distinguish the two symbols becauseT (z)

itself will appear in expressions for the aberration coefficients. For compactnesswe shall omit the explicit references to z in R and T ∗ in what follows.

We shall integrate the equation using equal steps, h, in z and so write for thevalues of R at adjacent steps

Rn±1 = Rn ± hR′n + h2

2R′′

n ± h3

6R′′′

n + h4

24R(4)

n ± · · ·

and rearrange this to obtain

Rn+1 − 2Rn + Rn−1

h2= R′′

n + h2

12R(4)

n + · · · .

R′′n = −(T ∗R)n and the fourth differential of Rn can be written in terms of the

second differential of (T ∗R)n as

R(4)n = d2(−T ∗R)n

dz2

= − (T ∗R)n+1 − 2(T ∗R)n + (T ∗R)n−1

h2

using an exactly similar expansion of T ∗R, but neglecting the fourth order term inthe expansion, giving

Rn+1 − 2Rn + Rn−1

h2= −(T ∗R)n

− h2

12

[(T ∗R)n+1 − 2(T ∗R)n + (T ∗R)n−1]

h2


and, rearranging this equation, we finally have

Rn+1 = 2Rn − Rn−1 − h2

12 [(T ∗R)n−1 + 10(T ∗R)n]

[1 + h2

12T∗n+1]

(4.3)

This equation is known as Numerov’s algorithm† and it is very effective withinits limitations. Two values of R(z) are required to start the process, but this is nota problem if the starting point is in a field free region. If this is not the case, someother method has to be used to start the process, and a Runge–Kutta method wouldnormally be used. The step size has to be chosen carefully as there is no easy wayto adjust it dynamically, since the particle velocity is not calculated at any stage.This means that a step size suitable for the worst part of the trajectory has to beused throughout.

4.3.2.1 The Runge–Kutta method applied to the Picht equation. As usual, theRunge–Kutta method requires parameters to be calculated in the middle of theintegration steps. If the values of T ∗ are known at intervals (h/2) the Runge–Kutta method must use a step size of h while the Numerov method can use thesmaller step giving it a 16-fold advantage in precision. The expressions for thereduced radius R and its gradient R′ are

Rn+1 = Rn − Rn

h2

6

(T ∗n + 2T ∗

n+1/2 − h2

4T ∗n T

∗n+1/2

)

+ R′n

(1 − h2

6T ∗n+1/2

)

R′n+1 = −Rn

h

6

(T ∗n + 4T ∗

n+1/2 − h2

2T ∗n T

∗n+1/2

)

+ R′n

(1 − h2

3T ∗n+1/2

)− h

6T ∗n+1Rn+1.

Notice that it is necessary to calculate Rn+1 before R′n+1.

4.4 Real and asymptotic cardinal points

Particle lenses differ from those for light in having no abrupt boundaries betweenregions of different refractive index, so there is no clear cut point at which thelens begins or ends. Though the ray paths are asymptotically straight, away fromthe lens, a particle emerging from the lens field may cross the axis before thisasymptotic region is reached. If the particle had approached the lens parallel tothe axis, this crossing point will be a focal point and the mid-focal distance will

† It is also known as the Manning–Millman and the Fox–Goodwin methods.


be well defined, but how should the focal length be defined? In chapter 1 the focallength was defined as the distance of the focal point from a principal plane whichwas itself defined by the intersection of asymptotes. This is not appropriate to thepresent case and we must define a real principal plane in terms of the incidentasymptote and a tangent to the ray path at the focal point. These real cardinalpoints are of limited use in the analysis of lens behaviour because for an object notat infinity the image will be formed beyond the focal point and therefore closer tothe asymptotic region, if not fully in it. Fortunately, the distinction is necessaryonly for lenses which are stronger than are commonly used except, perhaps, formicroscope objectives.

4.4.1 The determination of the cardinal points by ray tracingIf the path of a ray starting some three to four diameters from the lens is traced,using any of the methods outlined in section 3 the position of the focal point canbe found by noting the values of z for which the radius, r , or reduced radius, R,changes sign and interpolating to find the position of the crossing point. This willbe the true focal point, but, unless the lens is very strong indeed, it will be very closeto the asymptotic focal point. If the ray path has not crossed the axis by the end ofthe integration, the focal point can be located by extrapolating the last few pointscalculated. The corresponding principal point is found from the slope of either thetangent to the path at the focal point or of the asymptote. With a good potentialthe effects of spherical aberration can be seen by changing the off-axis distance ofthe incident ray, and the spherical aberration coefficient can be determined. If thetrace is done by integrating the Picht equation, it is not necessary to convert backto real radius as both r and R are zero at the crossing point and the asymptote isonly reached in a region of constant potential. Nonetheless, it is interesting andinstructive to make the conversion as the approach to the asymptotes has a ratherdifferent appearance.

4.5 Windows and pupils

While it is the paths of individual rays which are calculated, measurements aremade with beams of particles and these have finite lateral and angular extent.These beams must be defined by suitable apertures, and in figure 4.5 we illustratesome of the terms and parameters involved. The beam is defined by two apertures,called conventionally the window and the pupil. The window is regarded as theobject on which the lens acts to produce an image and the distinction is sometimesmade between the object window and the image window. The pupil, or angle stop,defines the range of angles which may enter the lens from any point in the window.Naturally the lens produces an image of the pupil also, and the position and sizeof the image pupil are important parameters of an optical system. In figure 4.5 thelens is represented as a thin lens with coincident principal planes purely to avoidcomplexity in the diagram. The lateral dimensions are grossly exaggerated. The


Figure 4.5 Diagram to illustrate the formation of the image of anentrance, or object, window W0, by a bundle of rays limited by anentrance pupil, P0. The semi-angular divergence of the bundle of raysis specified by the pencil angle, θ0, and the maximum inclination ofthe central ray of the bundle to the axis in the plane of the entrancepupil is the beam angle, α0. The lens forms images of the object-sidewindow and pupil, and the lower diagram shows the special case wherethe object-side pupil is at the first focal point, giving an image pupil atinfinity and zero beam angle on the image side.

upper part of the diagram shows the paths of a number of rays from the bottom ofthe object window and a point within it which pass through the object pupil at itsupper and lower edges and also through the centre of the pupil. In the image spacethe rays from each point in the object window cross at the corresponding point inthe image window and those rays which passed through particular points of theobject pupil pass through the corresponding parts of the image pupil. Two sets ofangles are indicated: α and θ . α is known as the beam angle and is the symbolused for the angle between the principal rays in the construction of figure 1.3. Itis the angle subtended at the centre of the pupil by the radius of the window and istherefore the angle at which the central ray from the edge of the window crossesthe axis in the plane of the pupil. θ is the pencil angle, describing the angularspread of a pencil of rays from a point in the window, and is the angle subtended atthe centre of the window by the radius of the pupil. Suffices distinguish the object


and image space values of these angles. If the radii of the window and pupil arewritten as rw and rp respectively and the distance between them is l then the anglescan be expressed as

α = rw

lθ = rp

l.

The lower part of figure 4.5 shows a special case. The object pupil is at the firstfocal point of the lens and is therefore imaged at infinity. The image side beamangle is now zero and the total angular divergence of the beam, which can bewritten as (|α| + |θ |), is a minimum.

4.5.1 The energy-add lensIf a real image of a window is formed at the first principal plane of a lens, theemergent rays appear to come from a virtual image of the same size at the secondprincipal plane. These rays may be focused to give a final, real image. Thesimplest practical system consists of four cylinders. The central two cylindersform the energy-add lens proper, and exploit the fact that the positions of theprincipal planes of a two cylinder lens do not change much over a fairly widerange of voltage ratios. The lens formed by the first two cylinders produces theimage at the first principal plane of this central lens, and that formed by the lasttwo cylinders gives the final image. The voltage ratios V2/V1 and V4/V3 areheld constant, but the ratio V3/V2 is changed, adding energy e(V3 − V2) to theparticle. This apparently simple method of maintaining the position of the finalimage over a range of overall voltage ratio, V4/V1, has major drawbacks for allbut the crudest applications. While the imaging of the entrance window may bequite well controlled, there is no control over the position and size of the pupilimage formed by the first lens. The action of the second and third lenses will onlymake matters worse, most probably leading to excessive beam angles at the finalimage. The presence of an intermediate image is also likely to cause increasedaberrations. The simple multi-element lenses described in the next section aremuch better behaved.

4.6 Multi-element lenses

4.6.1 Zoom lensesA lens having two elements allows a conjugate focus condition to be met in just twoways, with an accelerating or a decelerating voltage ratio. The addition of a thirdelement allows the condition to be met over a wide range of overall voltage ratiosby a suitable choice of the potential of the third, centre, electrode. The focal locusfor one such lens is shown in figure 4.6. The magnification of the lens can only becontrolled in the most general fashion by, for example, selecting the lengths of thecylinders to give particular values at a limited number (three, at the most) of pointson the locus. In order to make a true zoom lens, one in which the magnification fora given pair of conjugates can be controlled independently of the overall voltage


Figure 4.6 Focal locus for a three cylinder lens having elements oflength 3.0, 0.75 and 2.5 diameters. The variation of the magnificationand of the spherical aberration round the locus are also shown. A •identifies corresponding points on the three loci.

ratio, a fourth element is necessary. It is difficult to obtain much flexibility withaperture lenses because the electrodes have to be comparatively close togetherbut, with cylindrical electrodes, the distance between the two outer electrodes,which determine the overall voltage ratio, can be usefully large. The two centralelectrodes must have potentials which combine in some way to satisfy the focuscondition, but this can be done in many ways and one might consider the twopotentials to have some average magnitude (though not a simple arithmetic mean)and also a ‘centre of action’ which can be moved to the left or right by changingtheir ratio. Figure 4.7 illustrates this for two four cylinder lenses with symmetricconjugates separated by six diameters and with their central two elements oflength D in one case and D/2 in the other. This figure shows the magnificationswhich may be obtained for an overall voltage ratio of unity. The lens with thelonger central elements offers a range of magnifications from 0.67 to 1.5 while theother will only cover from 0.87 to 1.15. Notice that the extreme magnificationsare obtained with both centre electrodes at higher potentials than the outer onesand not, as one might have expected, with one of the centre elements at the outerpotential. It is left to the reader to investigate (using the LENSYS program) other


Figure 4.7 The potentials of the two centre elements of four cylinderlenses of total length six diameters required to give a range ofmagnifications. The upper curves relate to a lens with centre elementseach of length one diameter, and the lower curves to one with lengthsof 0.5 diameter. The potentials of the first and fourth elements are eachequal to 1.

possible pairs of values giving the same magnifications and to deduce why theyare not good combinations.

The greater range of magnification obtained with the longer central elementsraises the question of further increase. However, just extending the lengths ofthese two elements will not help a great deal. A much better approach is to placea fifth electrode in the middle of the system, and for the moment one can think ofthis merely as a ‘spacer’ keeping the two magnification-controlling electrodes agood distance apart! Before discussing this way of obtaining zoom behaviour ona par with that obtainable with quite modest cameras, it will be helpful to examinea somewhat specialized five element cylinder lens.

4.6.2 The afocal lensWe consider in this section a combination of two lenses which has an interestingand useful property. Figure 4.8 shows the optics of this combination schematicallyand illustrates the construction, using principal rays, of the image of an object tothe left of the first lens of the pair. The essential point of the lens combination isthat the second focal point of the first (left hand) lens coincides with the first focalpoint of the second (right hand) lens. This means that the first principal ray, drawnfrom the top of the object parallel to the axis, passes through this common focalpoint and emerges from the second lens again parallel to the axis. The immediate


Figure 4.8 The construction of the image of a given object by twolenses having a common focal point in the region between them.

consequence of this is that, wherever the object may be, the image will have thesame size: in other words, the magnification of the lens combination does notdepend on the position of the object. The second principal ray is drawn throughthe first focal point of the left hand lens, emerges from that lens parallel to the axisand is then deviated to emerge from the right hand lens as though from the secondfocal point of that lens.

The focal lengths and the object and image distances are indicated on the figureand we use Newton’s relation to examine their relationship. The first lens wouldproduce an intermediate image I of O a distance q from the common focal pointsuch that q = f1f2/p wherep is the distance of the object from the first focal pointof the left hand lens. This image lies beyond the first focal point of the secondlens and is imaged by that lens as I ′, a distance q ′ from the second focal point ofthis lens such that q ′ = f ′

1f′2/q. In terms of the position of the original object

q ′ = (f ′1f

′2/f1f2)p. If the two lenses are identical then we have the interesting

result that q ′ = p and the final image is the same distance from the second focalpoint of the right hand lens as the object is from the first focal point of the lefthand lens. The fact that rays entering parallel to the axis emerge parallel to theaxis means that the lens system has no focal points—it is afocal or telescopic.

Remembering that F1 is a negative quantity, the distance between the objectand the final image is 2(F2 − F1), and this must be held constant, independent ofthe overall voltage ratio, if the special property of this lens is to be maintained.The values of F1 and of F2 will change, of course, and if the difference is not tochange, each lens must consist of three electrodes. The last element of the lefthand lens and the first element of the right hand lens are at the same potential(otherwise there would be further lens action in the middle of the system) so theentire lens consists of five elements. Because the two lenses are identical


V2

V1= V4

V3and

V3

V1= V5

V3

and thereforeV3

V1=

(V5

V1

)1/2.

The magnification of the lens can be calculated from the figure and isM = f ′1/f2 =

f1/f2 for identical lenses. The angular magnification is given by

Mα = α′

α=

( h

f ′2

)/( h

f1

)= f1

f ′2

= f1

f2= M.

The magnification, angular magnification and the overall voltage ratio are relatedby the law of Helmholtz and Lagrange

MMα

(V5

V1

)1/2= 1 so M = Mα =

(V5

V1

)−1/4.

Apart from the trivial case of a lens which is electrically and mechanicallysymmetric, this is the only lens for which the magnification can be calculatedwithout any knowledge of the potential distribution; it depends only on the identityof the two component lenses.

An important application of this lens is in the production of a well definedparticle beam. The images of an object side window and pupil will have the sameaxial separation and be magnified by the same factor, so if the object side aperturesare of equal diameter, the particle beam will be bounded by a cylindrical surfacein a region of the image space though no apertures are present there, reducing thechance of unwanted collisions of the beam with the metalwork. The beam angleand the pencil angle will be equal in this situation.

4.6.3 The zoom afocal lensIf we remove the constraint on the two component lenses of figure 4.8, but maintainthe same distance between conjugates, the lens system will not, in general, remainafocal. It does then, however, have the desirable property of the zoom lens. Themagnification is controlled by the relative strengths of the first and second lensesin just the same way as is done by the two centre electrodes of a four cylinder lens.For the moment we keep the potential of the centre electrode at the geometric meanof those of the first and fifth electrodes and show in figure 4.9 (by broken curves)the values of the potentials of the second and fourth electrodes required to givea range of magnifications. The particular lens geometry consists of two lenses,each of three cylinders having lengths of 1.25, 0.5 and 1.25 diameters joined endto end to give a five element lens with a centre element 2.5 diameters long and atotal length of six diameters, the same as the two four cylinder lenses shown infigure 4.7. The very substantial increase in the range of magnifications is apparent.


Figure 4.9 The electrode potentials required to give a range ofmagnifications for a five cylinder lens of total length six diameterswith V5 = V1 = 1. Two sets of data are shown: the broken curves arefor the case of V3 = 1, and the solid curves are for V3 adjusted to givea pupil magnification equal to the window magnification.

Only four elements are necessary for a lens to have a zoom capability overa range of voltage ratios, and the present lens has, therefore, an extra elementwhich might be used to control another parameter of the lens. There is more thanone possibility, but in the present context the control of a second conjugate pairwould allow the lens to be again afocal. The effect of adjusting the potential ofthe centre electrode (with compensating adjustments of those of the second andfourth) is illustrated in figure 4.10 for this same six diameter length lens. Thepotentials have been adjusted to give a window magnification of −1 and the figureshows the magnification of a pupil located D/2 to the right of the window as afunction of the voltage ratio V3/V1 for a number of values of the overall voltageratio, V5/V1. The points marked • correspond to V3/V1 = √

V5/V1. Notice that,for both accelerating and retarding lenses, V3 has to be increased to make the pupilmagnification equal that of the window and so make the lens afocal. For an overall


Figure 4.10 The pupil magnification of a five cylinder lens as afunction of the potential of the centre element for various values ofthe overall voltage ratio, V5/V1. The lens is operated to give a windowmagnification of −1 and the object pupil is at a distance D/2 from theobject window.

voltage ratio of unity, the potentials of the three centre electrodes are shown alsoin figure 4.9 (by full curves) for this zoom afocal configuration.

5 Aberrations

So far we have considered only the focusing of paraxial rays of charged particleshomogeneous in energy. The true situation is more complex and we shouldconsider rays which are non-paraxial and particles which have a small spreadin energy. Particle lenses are subject to the same optical aberrations as photonlenses and the effects can be much worse because it is not possible to use materialsof different dispersion to reduce chromatic aberration or to grind surfaces of specialforms to reduce spherical and other aberrations.

5.1 Spherical aberration

The handling of particle beams is usually an axis-centred problem so sphericalaberration is normally much more serious than the off-axis aberrations, such ascoma, astigmatism and distortion. We shall therefore consider only rays whichcross the axis at some point. Figure 5.1 is a schematic diagram showing theemergent asymptote corresponding to a ray incident from an axial object point.For meridional rays, the relationship of the radial positions and slopes of the rays atthe first and second focal planes of the lens can be expressed, following Verster [6]as

r ′2 = − r1

f2+ m13r

′31 + m14r

′21r1

f2+ m15r

′1

( r1

f2

)2

+ m16

( r1

f2

)3+ q11r

′51 + q12r

′41r1

f2+ q13r

′31

( r1

f2

)2

+ q14r′21

( r1

f2

)3+ q15r

′1

( r1

f2

)4+ q16

( r1

f2

)5+ · · · (5.1a)

− r2

f1= r ′

1 + m23r′31 + m24r

′21r1

f2+ m25r

′1

( r1

f2

)2

+ m26

( r1

f2

)3+ q21r

′51 + q22r

′41r1

f2+ q23r

′31

( r1

f2

)2

+ q24r′21

( r1

f2

)3+ q25r

′1

( r1

f2

)4+ q26

( r1

f2

)5+ · · · (5.1b)

80

Aberrations 81

r

ir’

2

l

2

F2

q

r

1H

2H

1r

F1

o

p

r’1

Figure 5.1 Definition of the ray parameters for the description of themeridional aberration coefficients.

remembering that f1 is negative†. The coefficients, mij and qij , are dimensionlessand also negative. The terms of fifth order are not often used, but are includedhere as we shall later show how some of them may be deduced from fairly simpleray traces.

The coefficient of spherical aberration for a lens operated with magnificationM , Cs(M), is defined by

Br = −MCs(M)α3o

where αo = −r ′1. From figure 5.1 we can write r1 = −pr ′

1, and Br = r2 + qr ′2

for the distance off-axis at which the ray crosses the Gaussian image plane. Theobject and image distances can be written in terms of the focal lengths and themagnification as p = −f1/M and q = −Mf2. Substitution of these expressionsinto equation (5.1), ignoring the fifth order terms, gives the following relationshipbetween the coefficient of spherical aberration and the meridional coefficients:

Cs(M) = −m13f2 − (m14 + m23)f1

M− (m15 + m24)

(f1

M

)2/f2

− (m16 + m25)(f1

M

)3/f 2

2 − m26

(f1

M

)4/f 3

2 . (5.2)

Each term of equation (5.2) has a positive value so no cancellation is possible. Cs

has the dimensions of Br , but it is a common convention to express both in termsof the diameter of the lens, and all the values presented in this book and in theLENSYS program are strictly of Cs/D. Equation (5.2) shows that Cs(M) could beexpressed as a fourth order polynomial in the object distance, p = −f1/M , but itis more usual to write it in terms of the magnification, M ,

Cs(M) = Cs0 + Cs1/M + Cs2/M2 + Cs3/M

3 + Cs4/M4 (5.3)

where the Csi are properties of the lens itself.

† A number of the equations in this chapter differ from their counterparts in other texts asa consequence of our Cartesian sign convention.


There are certain well established relationships between some of the eightmeridional aberration coefficients

m14 = 3m23 + 1.5 m15 = m24 m25 = 3m16 + 1.5 (5.4)

and so the five spherical aberration coefficients are

Cs0 = −m13f2

Cs1 = −(m14 + m23)f1 = −(4m23 + 1.5)f1

Cs2 = −(m15 + m24)f21 /f2 = −2m24f

21 /f2

Cs3 = −(m16 + m25)f31 /f

22 = −(4m16 + 1.5)f 3

1 /f22

Cs4 = −m26f41 /f

32 .

(5.5)

If the object is at infinity, r ′1 = 0 and the magnification is therefore zero. It is

then necessary to use equation (5.1b) directly. Including the relevant fifth orderterm, q26(r1/f2)

5, we have

Br = r2 = −m26f1

( r1

f2

)3− q26f1

( r1

f2

)5.

The third order term can be written as

Br = Cs4

( r1

f1

)3. (5.6)

5.1.1 The axial displacementThe aberrated ray crosses the axis a distance Bl before the Gaussian image pointwhere

Bl = −Br

r ′2

= −MCs

α3o

αi

= − M

Mα

Csα2o

= M2Csα2o

f2

f1= M4Csα

2i

(f2

f1

)3(5.7)

and where αi = −r ′2 is the beam angle in the image space.

For parallel input M = 0 so M4Cs = Cs4 and, using αi = r1/f2,

Bl =(f2

f1

)3Cs4α

2i = f2

f1Cs4

( r1

f1

)2. (5.8)

If we wish to include the fifth order term without excessive algebra, we can write

r ′2 = −r2/Bl = −r1/(f2 + Bl)

Aberrations 83

gh

l

gz

G

mz

m

Figure 5.2 Aberrated rays in the region of the disc of minimumconfusion.

giving

r2 = r1Bl

f2 + Bl

from which we find the following working expression:

G(Bl) =(f2

f1

) Bl

f2 + Bl= −m26

( r1

f2

)2− q26

( r1

f2

)4(5.9)

which we shall use later.

5.1.2 The disc of minimum confusionIf the object is a point source of particles, the ‘image’ in the Gaussian plane willbe a round spot with a radius determined by the pencil angle and the aberrationcoefficients of the lens. Rays which have passed close to the limits of the entrancepupil will contribute to the outer parts of this spot, having crossed the axis somewhatcloser to the lens. Between the Gaussian plane and the point on the axis wherethese marginal rays are focused, there will be some rays which cross the axis goingupwards and some going downwards, with the former being more steeply inclined,and the spot formed by the envelope of all the rays will have a minimum radiussomewhere between these points. This spot is known as ‘the disc of minimumconfusion’.

Figure 5.2 shows the marginal ray and a general ray which intersect a distance hfrom the axis; the axial position of the intersection is a distance l from the Gaussianfocus, G. We can express h in two ways

h = αg(zg − l) = αm(zm − l)

Aα3g − lαg = Aα3

m − lαm


sol = A(α2

m + αmαg + α2g)

andh = −A(α2

mαg + αmα2g)

where A = Cs4

(f2/f1

)3by equation (5.8), and is negative. We find the minimum

value of h by differentiation

dh

dαg

= −A(α2m + 2αmαg) = 0 if αg = −αm

2

and this leads to the radius of the disc of minimum confusion as

hmin = Aα3m

4

and its position as

l = 3

4Aα2

m.

The radius is a quarter of that of the Gaussian plane spot and it is three quarters ofthe way from the Gaussian image to the most aberrated image point.

5.2 Chromatic aberration

Figure 5.3 illustrates the passage of two pairs of rays through a lens system ignoringthe effects of spherical aberration. One ray of each pair has the correct energy toform an image at I. The other has a slightly higher energy. We consider first thepair of rays which start from a point on the axis. The ‘correct’ ray crosses the axisat the image, but the higher energy ray crosses the axis a distance δl beyond thisand, at the image plane, is a distance δr above the axis. These distances are relatedby the angle, αi in the image space and can be referred to the corresponding angleat the object using the Helmholtz–Lagrange relation.

δr = αiδl = Mααoδl = 1

M

(Vo

Vi

) 12αoδl.

A coefficient of chromatic aberration, Cc, has historically been defined byδr = −MCcαo

δVVo

and we may, therefore, express this coefficient in terms ofeither observable by

Cc = 1

Mαo

δr

(δV/Vo)(5.10a)

= 1

M2

(Vo

Vi

) 12 δl

(δV/Vo). (5.10b)

Aberrations 85

I

r 2(z )

r (z ) δl

δr

Figure 5.3 Two pairs of rays traced to illustrate chromatic aberration.The lower ray of each pair has the correct energy to form an image at Iand the other rays, which have a slightly higher energy, cross the imageplane at points which are displaced from the true image points.

The second of these equations illustrates the non-dependence of the aberrationcoefficient on the angle αo.

This coefficient is properly referred to as the coefficient of axial chromaticaberration to distinguish it from the effect illustrated by the other pair of rays infigure 5.3 which are launched parallel to the axis from a point above the axis. The‘correct’ ray crosses the image plane and defines the magnification of the lens.The higher energy ray crosses the image plane closer to the axis, with a lowermagnification. This effect is described as chromatic aberration of magnificationand the coefficient, CD , which quantifies the effect is sometimes referred to asthe coefficient of lateral chromatic aberration. It is defined in terms of the radialchange, δrI , in the ray at the image plane by the following equations.

δrI = −MCDro

(δV

Vo

)

CD = −1

Mro

δr

(δV/Vo)(5.11)

where ro is the initial off-axis distance of the ray and, for the case of the generalray, r2(z), which crosses the axis at an angle of 45◦, is equal to f1.

This aberration is an off-axis effect and we discuss, in the next section, theoff-axis effects due to the angle dependent behaviour.


r

0rM

q

r2

r’2

2F1

HH

r1

21F

p

0r

Figure 5.4 Definition of the ray parameters for the description of theoff-axis aberrations of a lens.

5.3 Off-axis aberrations

Figure 5.4 is similar to figure 5.1, but indicates the asymptotes to a ray from anobject point, a distance r0 from the axis, which crosses the first focal plane ata radial distance r1. The path of the ray is still in a plane containing the axis;we shall not be considering skew rays. Using equations (5.1) again, we writethe position of the ray at the image plane as the sum of the geometric, Gaussian,distance, Mr0, and an aberration term which can be written as the sum of termsinvolving r3

0 , r20 r1, r0r

21 and r3

1 .

Br = Mr0 + r30

[M4m13

f2

f 31

+ M3m231

f 21

]

− r20 r1

[3M4m13

f2

f 31

+ M3(m14 + 3m23)1

f 21

+ M2m241

f1f2

]

+ r0r21

[3M4m13

f2

f 31

+ M3(2m14 + 3m23)1

f 21

+ M2(m15 + 2m24)1

f1f2+ Mm25

1

f 22

]

− r31

[M4m13

f2

f 31

+ M3(m14 + m23)1

f 21

+ M2(m15 + m24)1

f1f2

+ M(m16 + m25)1

f 22

+ m26f1

f 32

].

The terms in r30 describe the distortion of the image, because they imply a

magnification which depends on the position of the point in the object but not on thelens aperture. From equation (5.5) we see that Cs0 and Cs1 describe this aberration.The terms in r2

0 r1 describe the combined effects of astigmatism and curvature offield and those in r0r

21 describe coma. The terms in r3

1 describe spherical aberration

Aberrations 87

and transform to the form of equation (5.2) on writing r1 = − f1

Mαo. If r0 = r1,

which implies M = 0, only the term in m26, equivalent to Cs4 , remains andequation (5.6) is recovered. Notice that m26 appears only in the coefficient of r3

1 .While the terms in r2

0 r1 determine the curvature of field for a specificmagnification—and the curvature is usually dominated by the effect ofastigmatism—there is a limiting value which obtains even in the absence ofastigmatism. This minimum curvature is expressed by the Petzval integral, which isnever zero for a particle lens. The integral can only be zero if there are contributionsfrom divergent lens elements and, as we saw in chapter 2, immersion lenses, withwhich we are exclusively concerned, are always convergent. It is a property ofthe given lens and depends only on the potential distribution and not on the choiceof conjugates. It is, therefore, most readily calculated along with the basic lensparameters.

5.4 Interrelations of the spherical aberration coefficients

By considering a lens to be ‘thin’ in the sense that R does not change over thedistance for whichT is significant, it is possible to factor a term out of the aberrationintegral and hence develop simple relationships between the mij coefficients [17].It has been demonstrated [16] that these remain reasonably valid even when the‘thin’ approximation does not appear to be appropriate. The relationships areof the form m2j /m1j = σ where the value of the ratio has been determinedempirically as σ = (V2/V1)

1/4 for a two cylinder lens. Taken in conjunction withthe well established relationships between three pairs of mij (equation (5.4)) andignoring the added ‘1.5’ terms, which is a reasonable approximation except forvery strong lenses, there appears to be a pattern of interrelations between all themij coefficients,

mi6 � σmi5/3 � σ 2mi4/3 � σ 3mi3 m2j � σm1j .

With these factors, equation (5.2) becomes

Cs(M) = − m13f2

[1 + 4

σ

M

f1

f2+ 6

( σ

M

)2(f1

f2

)2

+ 4( σ

M

)3(f1

f2

)3+

( σ

M

)4(f1

f2

)4]. (5.12)

If σ is indeed (V2/V1)1/4, then σ 2 = −f2/f1 so σf1/f2 = −1/σ and, writing Cs0

for −m13f2, we can express the spherical aberration coefficient for a givenmagnification as

Cs(M) = Cs0

(1 − 1

σM

)4

. (5.13)


H(r) R

2r

2F

2r’

1H(r)

2

1F

1r

r

1r’

Figure 5.5 Diagram to illustrate the curved principal surfaces.

A careful study of more complex lenses shows that this is indeed ratherapproximate, but another result based on the same relationships does seem tohave a wider validity. We define a function, Y , by

Y = Cs0

(−σf1

f2

)4+ Cs1

(−σf1

f2

)3+ Cs2

(−σf1

f2

)2+ Cs3

(−σf1

f2

)+ Cs4

(5.14)and substitute for the Csi the appropriate terms of equation (5.12), giving

Y = Cs0

(−σf1

f2

)4(1 − 4 + 6 − 4 + 1) = 0.

If we write Y+ to represent the sum of the first, third and fifth terms ofequation (5.14) and Y− to represent the sum of the second and fourth (whichis usually negative), then the ratio Y++Y−

Y+−Y−will be a fair measure of the accuracy of

the statement Y = 0. For a wide range of lenses (and the reader is invited to testthis using LENSYS) this ratio is less than 0.01 and usually much less.

5.4.1 The principal surfacesWe noted in chapter 1 that the principal ‘planes’were not in fact flat and we are nowin a position to examine their true form. Figure 5.5 shows the principal surfaces asdefined by the intersection of the object and image asymptotes of rays incident at aradial distance, r , parallel to the axis. We can express the distance from the secondGaussian focal point to the projection of the intersection on to the axis, H2(r), as

F2 − H2(r) = −(r2 + r)/r ′2

Aberrations 89

using equations (5.1)

F2 − H2(r) = f2 + (f1m26 + f2m16)( r

f2

)2

and writing σ 2 = −f2/f1

F2 − H2(r) = f2

[1 + m13σ

2(σ − 1)( r

f2

)2].

A similar analysis gives, for the first principal surface,

F1 − H1(r) = −(r1 + r)/r ′1

= f1 + (f1m23 + f2m13)( r

f1

)2

= f1

[1 − m13σ(σ − 1)

( r

f1

)2].

Note that m13 is negative and σ > 1 for an accelerating lens, so both principalsurfaces are concave towards the high potential side of the lens.

5.5 The determination of aberration coefficients

If the potential distribution throughout the lens is known with sufficient accuracy,rays will follow their correct paths and, if they are not paraxial, will show the effectsof the lens aberrations. In principle, then, all that is needed is careful ray tracingto find the intercept of the ray at the Gaussian image plane. This raises the firstproblem: where is the Gaussian image plane? It is necessary to trace more thanone ray and extrapolate to zero αo or r1 as appropriate and it is tempting to use theaxial displacement directly. If only the axial potential is known, then calculationsmust be based on the effect of the terms omitted from the axial equation of motion.

5.5.1 Direct ray tracingFigure 5.6 shows rays traced in an accelerating and a decelerating lens.The much greater effect of spherical aberration in the retarding lens is apparent.To illustrate a procedure for the determination of the aberration coefficientsappropriate to parallel input rays we shall analyse some data obtained with the‘student’ version of the CPO-2D program.

For the purposes of the present analysis, the important parameters of thetrajectory are those indicated in figure 5.7. The position of the focal point, F2,is not known a priori and we cannot, therefore, determine r2 directly, but,frommeasurements made at the reference plane, the program provides values for theaxial position, zi , at which a ray launched parallel to the axis at a radial distancer1 crosses the axis and the slope, r ′

2, of the ray at that point. From these values


Figure 5.6 Rays traced in a two cylinder lens having voltage ratiosof (a) 1 : 10, and (b) 10 : 1. The aberrations of the retarding lens are sogreat that the outermost ray of (a) would strike the inner surface of theelectrode well before the end of the diagram.

∆l pla

nepla

ne

z r

r

refe

ren

ce

Ga

uss

ian

F

Figure 5.7 The path of an aberrated ray which crosses the Gaussianplane at a distance r2 having crossed the axis at a position zi a distanceBl before.

we can determine the focal parameters f2 and F2, and the aberration coefficientsm16, m26, q16 and q26. If we trace rays in a second lens, having the reciprocal

Aberrations 91

–0.25

0

0.25

0.50

0 0.05 0.10 0.15 0.20

r /f

r'G

(∆l)

Figure 5.8 Data plots used to determine the spherical aberrationcoefficients of the lenses shown in figure 5.6. The curves representr ′

2 = − r1f2

+m16(r1f2)3 +q16(

r1f2)5 and G(Bl) = m26(

r1f2)2 +q26(

r1f2)4.

The solid data points refer to the accelerating lens and the open datapoints to the decelerating lens.

voltage ratio, we also find f1, F1, m13, m23, q11 and q21.The first step is to find F2 by writing zi = F2 + Bl, where Bl is negative

and is given by equation (5.8). A graph of zi against r21 is almost a straight line

and a quadratic fit will give a very good value for F2. We next plot r ′2 against r1

and fit an expression in odd powers of r1. From equation (5.1a) the coefficientof the first term is −1/f2 and those of the other terms will give the aberrationcoefficients. It is a little neater to repeat this with r1/f2 as the abscissa and weshow such a plot in figure 5.8. Now that we have a value for f2 it is possible touse equation (5.9), remembering that the ratio f2/f1 = −(V2/V1)

1/2, to find theremaining coefficients. In figure 5.8 we show the results for both the accelerating(1:10) and decelerating (10 :1) lenses. The values of the coefficients are found tobe

1 : 10

m16 = −6.41

m26 = −12.1

q16 = −100

q26 = −134

10 : 1

−2.27

−1.42

−10.8

−10.6

Figure 5.9 shows values of the chromatic aberration coefficient, Cc, determinedby measuring the image plane ‘miss distance’, δrI , for a ray launched with an


1/M

Cc

Figure 5.9 The coefficient of chromatic aberration for a two cylinderlens with a voltage ratio of 1 : 10. The data can be represented byCc(M) = 0.834 − 1.005/M + 0.314/M2.

energy 2% greater than nominal in a two cylinder lens with a potential ratio of1 : 10. The coefficients of a power law fit to the data agree very well with valuescalculated directly using expressions which we develop later.

5.5.2 Perturbation of the paraxial solutionWe can modify the equation of motion to take account of terms omitted in ourearlier analysis or to allow for a spread in energy and, provided that the effects aresmall, we can write the solution to the modified equations as the simple sum of theparaxial solution and an additional term. The details of the calculation depend onthe problem and are expanded in the next two sections.

5.5.2.1 Spherical aberration. In chapter 2, the higher order terms in theexpansion of the axial potential were ignored as was the contribution made to thetotal energy of the particle by its radial velocity. These simplifications led to theparaxial equation of motion, equation (2.3), and then to the Picht equation, (2.4).In order to describe the non-paraxial behaviour, it is necessary to replace theseterms. The equation of radial motion becomes

md2r

dt2= eEr = e

[ r2V ′′ − r3

16V (4)

]and the total energy of the particle has to be written as

1

2m

[(dz

dt

)2+

(dr

dt

)2]+ e

[V − r2

2V ′′

]= 0.

Aberrations 93

The equation of motion is then

d2r

dz2+ 1

2

V ′

V

dr

dz+ r

4

V ′′

V=

[ 1

32

V (4)

V− 1

16

(V ′′

V

)2]r3

+ 1

8

(V ′′′

V− V ′′V ′

V 2

)r2r ′ − 1

4

V ′′

Vrr ′2 − 1

2

V ′

Vr ′3. (5.15)

To solve this equation, we write r = rp +ra where rp is the solution to the equationof paraxial motion and ra is a small perturbation representing the aberration dueto the use of better approximations to the real potential. ra is small enough thatonly first order terms need to be retained. Each term of the left hand side ofequation (5.15) splits into one containing rp and one containing ra . Those in rpsum to zero, because they are just the terms of the paraxial equation. The termson the right hand side involve the third power of r and so only the rp part survives,giving an equation for the perturbation, ra ,

d2ra

dz2+ 1

2

V ′

V

dradz

+ ra

4

V ′′

V=

[ 1

32

V (4)

V− 1

16

(V ′′

V

)2]r3p

+ 1

8

(V ′′′

V− V ′′V ′

V 2

)r2pr

′p − 1

4

V ′′

Vrpr

′2p − 1

2

V ′

Vr ′3p . (5.16)

Solutions of the homogeneous equation, obtained by setting the right hand sideof equation (5.16) equal to zero, are known: they are just the general solutionsof the Gaussian ray equation discussed in chapter 1. We can therefore write theperturbation as ζ(z)r1(z) + ξ(z)r2(z) where we stress the variability of ζ and ξ ,reflecting the non-zero right hand side. The problem reduces to the calculationof ξ(zi) where zi is the position of the Gaussian image. After a lot of algebra,two integrations by parts and transformation of the radial variable to the reducedradius, R = rV 1/4, the coefficient of spherical aberration can be written as

Cs(M) = 1

64V 1/20

∫ ∞

−∞

R4

V 1/2

[(3T 4 + 5T ′2 − 11

2T 2T ′) + 4T T ′R

′

R

]dz

(5.17)where V0 is the potential at the lower limit of integration†. In this equation, R(z)

corresponds to the ray which passes through the axial object point with unit slope.Every variable, bar one, in this equation will have been determined in the courseof calculating the paraxial image position. The missing variable is T ′ and it is atrivial matter to find this. We noted earlier that the sign of T is not important to thecalculation of the paraxial focus as it appears as T 2 in the Picht equation, but thepresence of T ′ here shows that it does matter for the calculation of the aberrations.

† The use of the reduced radius has transformed a term in (V/V0)1/2 into (1/V0V )1/2.

Almost always, the potentials will be expressed in terms of V0.


Figure 5.10 The integrands of the aberration integrals of equa-tion (5.17) for three aperture and Afocal8 lenses.

Though the limits of integration are marked as ±∞, in practice a much shorterinterval will suffice.

Figure 5.10 shows the integrand of equation (5.17) for the cases of a threeaperture lens, used asymmetrically, and a five cylinder lens used in the afocalmode. The range of axial position over which the integrand is significantly greaterthan zero is appreciably less than the distance between the conjugate points. For atwo element lens, we might expect the integrand to show two maxima, one whereR is large and the other where R′ is large. In the three element lens, especially withaperture electrodes, two of the four possible maxima run into each other, but thecylinder lens shows two maxima associated with each gap. The ray paths shownin this figure represent the true radius, r(z).

The ratio of the reduced and true radii has the same value at any point for alltrajectories, so we can write R(z) = R1(z) + 1

MR2(z) and expand the radial terms

Aberrations 95

Figure 5.11 The integrands of the integrals analogous to equa-tion (5.17) for the five spherical aberration coefficients for a two aper-ture lens. The curves can readily be associated with particular coeffi-cients as these are markedly different in magnitude.

asR4 = R4

1 + 4R31R2/M + 6R2

1R22/M

2 + 4R1R32/M

3 + R42/M

4

and similarly for R3R′. This allows us to separate the integral of equation (5.17)into the sum of five integrals, one for each of the Csi . Each integral is written asthe product of terms involving the T -parameters,

c1 = 1

V 1/2

[3T 4 + 5T ′2 − 11

2T 2T ′

]

c2 = 4T T ′

V 1/2

which, being functions of only the electrode geometry and potentials, are the samefor all applications of the lens, and terms specific to each coefficient

I0 = c1R41 + c2R

31R

′1

I1 = c14R31R2 + c2(3R

21R2R

′1 + R3

1R′2

I2 = c16R21R

22 + c2(3R1R

22R

′1 + 3R2

1R2R′2)

I3 = c14R1R32 + c2(R

32R

′1 + 3R1R

22R

′2)

I4 = c1R42 + c2R

32R

′2.

(5.18)

r1 and r2 are asymptotic to ±f1, which is not known a priori so the focalproperties must be calculated using arbitrary horizontal asymptotes (usually 1) andthen scaled to provide the values required for the aberration calculation. However,


to calculate the coefficient Cs(M) between conjugates, the ray R(z) can be usedwith no further scaling. Figure 5.11† shows the integrands used to calculate thefive Csi . The individual curves can easily be associated with particular coefficientsand for this lens they are, from the top, Cs2 , Cs0 , Cs4 , Cs3 , Cs1 .

5.5.2.2 The Petzval integral. The Petzval sum describes the curvature of theimage surface in the absence of astigmatism. In photon optics, where one has sharpboundaries separating regions of different refractive index, the radius of curvaturecan be expressed in terms of the radii of the boundaries and the refractive indicesand, using equations (1.3) to introduce the focal lengths of each boundary, reducesto 1/rp = ∑

(1/njfj ). The surface is a paraboloid of revolution and is concavetowards the lens elements if

∑(1/njfj ) is positive.

This will be the case for a single lens, but it is quite possible for the Petzvalsum to be zero for discrete lens combinations. Indeed there is a well knownphotographic lens, the Petzval lens, which has a slight telephoto action as it consistsof a convergent element followed by a divergent element. Notice that the detailsof the ray path make no contribution to the sum.

For charged particle optics we have the Petzval integral, a direct analogue, inwhich the square root of the axial potential has the role of the refractive index andT ′ accounts for the bending of the rays. The full expression is

P tz = 1

8

∫ ∞

−∞

(Vo

V

)1/2T ′ dz. (5.19)

Some examples of the integrand are shown in the appendix and, though the changesof sign are apparent, the integral as a whole is never negative.

5.5.2.3 Chromatic aberration. In this case, the perturbing factor can beexpressed as a constant change in the potential, and we can write the equationof motion as

d2r

dz2+ 1

2

V ′

(V + δV )

dr

dz+ r

4

V ′′

(V + δV )= 0

and, as δV is small in comparison with V , we can multiply top and bottom byV − δV to give

d2r

dz2+ 1

2

V ′

V

dr

dz+ r

4

V ′′

V= δV

[1

2

V ′

V 2

dr

dz+ 1

r

V ′′

V 2

].

We continue, as before, by writing r = rp+rc where rc is a change in the trajectory,small enough that the product rcδV can be neglected.

d2rc

dz2+ 1

2

V ′

V

drcdz

+ rc

4

V ′′

V= δV

[1

2

V ′

V 2

drpdz

+ rp

4

V ′′

V 2

].

† Figures 5.10 and 5.11 have been prepared with a version of LENSYS modified to print largerdots.

Aberrations 97

We again write the perturbation as ζ(z)r1(z) + ξ(z)r2(z) where r1 and r2 are thegeneral solutions of the ray equation. In this case we are interested in the valuesof both ζ(zi) and ξ(zi)

Using the expressions for the chromatic aberration coefficients (equations (5.10)and (5.11)) and then changing the radial variable to the reduced radius we find

Cc = 1

V1/2

0

∫ ∞

−∞

3

8

V ′2

V 5/2r2 dz

= 1

V1/2

0

∫ ∞

−∞

3

8

V ′2

V 2R2 dz

= 1

V1/2

0

∫ ∞

−∞

3

8T 2R2 dz (5.20)

and

CD = 1

V1/2

0 f1

∫ ∞

−∞

3

8

V ′2

V 5/2r2r dz

= 1

V1/2

0 f1

∫ ∞

−∞

3

8

V ′2

V 2R2R dz

= 1

V1/2

0 f1

∫ ∞

−∞

3

8T 2R2R dz. (5.21)

We have also substituted T for the ratio of the potential gradient to the potential.In these expressions R(z) is the ray which passes through the axial object

point with unit slope and we can expand it in terms of the general solutions byR(z) = R1(z)+(1/M)R2(z). This enables us to find the magnification dependenceof the chromatic aberration coefficients. As in the case of spherical aberration,these coefficients have to be calculated using arbitrarily normalized representationsof r1 and r2 and then scaled by multiplying by f 2

1 . Only in the calculation ofCc(M)

is this scaling not necessary, because R(z) is the correctly normalized reduced ray.

Cc = 1

V1/2

0

∫ ∞

−∞

3

8T 2

(R2

1 + 2

MR1R2 + 1

M2R2

2

)dz

= Cc0 + Cc1/M + Cc2/M2 (5.22)

CD = 1

V1/2

0 f1

∫ ∞

−∞

3

8T 2

(R1R2 + 1

MR2

2

)dz

= CD0 + CD1/M. (5.23)

It follows from these equations that CD0 = Cc1/2f1 and CD1 = −Cc2/f 1. Notethat both Cc1 and CD1 are negative quantities.


5.6 The aberrations of retarding lenses

5.6.1 Spherical aberrationThe equations (5.1) which describe the ray position and slope at the second focalplane in terms of the values of these parameters at the first may be invertedalgebraically to describe the effect of reversing the direction of the rays

r ′1 = − r2

f1+ m26r

′32 + m25r

′22r2

f1+ m24r

′2

( r2

f1

)2+ m23

( r2

f1

)3+ · · ·

r1

f2= −r ′

2 − m16r′32 − m15r

′22r2

f1− m14r

′2

( r2

f1

)2− m13

( r2

f1

)3+ · · ·

(5.24)

or equations (5.1) may be written in terms of the coefficients of the lens havingthe reciprocal voltage ratio

r ′2 = − r1

f ∗2

+ m∗13r

′31 + m∗

14r′21

r1

f ∗2

+ m∗15r

′1

( r1

f ∗2

)2+ m∗

16

( r1

f ∗2

)3+ · · ·

− r2

f ∗1

= r ′1 + m∗

23r′31 + m∗

24r′21

r1

f ∗2

+ m∗25r

′1

( r1

f ∗2

)2+ m∗

26

( r1

f ∗2

)3+ · · ·

(5.25)where the asterisks (∗) denote the coefficients and focal lengths of the reversedlens. A term by term comparison of equations (5.24) and (5.25) shows that eachcoefficient for the reversed lens is the same as one for the forward lens, with anobvious symmetry

m∗13 = m26 m∗

14 = m25 m∗15 = m24 m∗

16 = m23

and also f ∗1 = −f2. Notice that as we have not specified which lens is the

accelerating one, this set of equations can be read either way.Using equation (5.5) we can develop the relation between the coefficients of

spherical aberration:Cs0 = −m13f2

soC∗

s0= −m∗

13f∗2 = −m26f

∗2

substituting Cs4 = −m26(f41 /f

32 ) gives

C∗s0

= Cs4f∗2 (f

32 /f

41 )

= −Cs4(f2/f1)3 as f ∗

2 = −f1

= Cs4(Vn/V1)3/2

whereVn is the potential of the last electrode and the ratio is that for the unasterisked

direction. It is easy to show, by similar arguments, that C∗s1

= Cs3

(Vn

V1

)3/2

Aberrations 99

and C∗s2

= Cs2

(Vn

V1

)3/2. Knowing that the reverse magnification M∗ = 1/M

we can write equation (5.3) for the reversed direction as

C∗s (M

∗) =(Vn

V1

)3/2[Cs4 + Cs3/M

∗ + Cs2/M∗2 + Cs1/M

∗3 + Cs0/M∗4

]=

(Vn

V1

)3/2[Cs4 + MCs3 + M2Cs2 + M3Cs1 + M4Cs0

]= M4

(Vn

V1

)3/2[Cs4/M

4 + Cs3/M3 + Cs2/M

2 + Cs1/M + Cs0

]= M4

(Vn

V1

)3/2Cs(M). (5.26)

In the case of the afocal five cylinder lens discussed in chapter 4, the magnificationis given simply by Maf = (V5/V1)

−1/4, so the ratio of the spherical aberrationcoefficients in the forward and reverse directions is (V5/V1)

1/2.

5.6.2 Fifth order effectsIn the preceding section, we ignored the fifth order terms of equation (5.1) because,in the reverse direction, the coefficients, q∗

ij are not so simply related to those forthe forward direction as are the third order coefficients. These can be evaluated ina similar way and are:

q∗21 = q16 + 3m2

16 − m15m26

q∗22 = q15 + m15(5m16 − m25) − 2m14m26

q∗23 = q14 + 2m14(2m16 − m25) + 2m2

15 − 3m13m26 − m15m24

q∗24 = q133m13(m16 − m25) + m14(3m15 − 2m24) − m15m23

q∗25 = q12 + m13(2m15 − 3m24) + m2

14 − 2m14m23

q∗26 = q11 + m13(m14 − 3m23)

q∗11 = q26 + 3m16m26 − m25m26

q∗12 = q25 + 3m15m26 + 2m16m25 − 2m24m26 − m2

25

q∗13 = q24 + 3m14m26 + 2m15m25 + m16m24 − 3m23m26 − 3m24m25

q∗14 = q23 + 3m13m26 + 2m14m25 + m15m24 − 4m23m25 − 2m2

24

q∗15 = q22 + 2m13m25 + m14m24 − 5m23m24

q∗16 = q21 + m13m24 − 3m2

23

where the mij coefficients are those for the forward direction.

5.6.2.1 Spherical aberration of the afocal lens. While the magnification of anafocal lens does not depend on the position of the object, the coefficient of spherical


O I

d

d

F*

F*FF

dr r'

r'

r'

r

r

r

Figure 5.12 The path of a ray which leaves the axis a distance, d, tothe left of the first focal plane of an afocal lens and crosses the imageplane at a distance dr below the axis as a consequence of sphericalaberration.

aberration does so depend. In figure 5.12 we show a schematic path for a ray whichoriginates at a point, O, on the axis a distance, d, to the left of the first focal planeof the system. There are three focal planes to consider. The first crosses the axisat F1, the first focal point of the first lens of the pair which constitutes the afocalcombination. The second is at the second focal point, F2, of this lens and the firstfocal point, F∗

1, of the second lens. The third focal plane is at the second focalpoint, F∗

2, of the second lens. The afocal lens produces an image of the object at apoint at this same distance, d , to the left of this third focal plane.

Spherical aberration causes the ray to cross the image plane at a distance, dr ,from the axis where dr = −MCsr

′30 . We can use equation (5.1) to follow the path

of the ray from the point where it crosses the first focal plane at a distance r0 fromthe axis, to to the point at which it crosses the third focal plane at a distance r2 fromthe axis. The starting point is to note that r0 = −dr ′

0 and then to use equation (5.1)to express the ray parameters, r1 and r ′

1, at the central focal plane in terms of r0

and r ′0. These are third order expressions. Equation (5.1) can then be used again

to express the ray parameters r2 and r ′2 at the third focal plane in terms of r1 and r′

1and so, by substitution, in terms of r0 and r ′

0. Note that all the mij coefficients arethe same for the two parts of the lens. These expressions are of ninth order andan algebra package was used to calculate them. Terms above the third order werethen neglected and the spherical aberration written as

dr = −MCsr′30 = r2 − dr ′

2.

Substitution into this equation leads to a quite tidy expression for the aberrationof the afocal lens.

Aberrations 101

dL

–MC

s

Figure 5.13 Values of the spherical aberration, −MCs , for afocallenses operated with objects a distance dL to the right of the symmetricposition. Details of the data points and the curves are discussed in thetext.

−MCs =( d

f2

)4[Cs0 − Cs4

(f2

f1

)3]−

( d

f2

)3[Cs1 − Cs3

(f2

f1

)2]+

( d

f2

)2[Cs2

(1 − f2

f1

)]−

( d

f2

)[Cs3 − Cs1

]+ Cs4 − Cs0

(f1

f2

). (5.27)

Figure 5.13 shows values of −MCs calculated using equation (5.27) for afocallenses of total length 8 D and overall voltage ratios of 1, 4, 9, 16 and 25 andtheir reciprocals together with values obtained using the Movable Afocal routinesof LENSYS. The abscissæ are the values of dL used in these routines and are relatedto the displacement, d , of equation (5.27) by d = dL − (2.0 + F1). The brokencurve passes through the first focal points and helps to show that the minimum


z

Cs

Figure 5.14 The aberration integrals for an afocal lens having avoltage ratio, V5/V1 = 16 with the object at the symmetric position,at a distance 0.38 to the left, where −MCs has a minimum value andat 0.63 to the left, at the first focal point. The curves are displacedvertically for clarity.

aberration occurs for positions between this focal point and the symmetric position.This figure shows that, for accelerating lenses, it is better to place the object to theleft of the symmetric position. This has a further advantage in that, with a longercylinder at potential V1, it is possible to insert an angle stop, or entrance pupil,with little disturbance to the potential distribution near the first gap.

Minimum values of −MCs occur when the perturbing action is divided roughlyequally between the two component lenses. This is illustrated in figure 5.14 foran afocal lens of overall voltage ratio V5/V1 = 16 with objects at the symmetricposition, at the first focal point and at an intermediate position.

5.6.3 Chromatic aberrationIf we consider our earlier expressions to describe accelerating lenses we can obtainresults for the corresponding decelerating lenses by reversing the direction of therays. Inspection of equation (5.22) shows that Cc0 and Cc2 must be interchangedas r1 and r2 are redefined so C∗

c0becomes identified with the term in 1/M2. The

magnitudes of all three coefficients are changed by a factor (Vn/V1) becausef ∗

1 = f2 and by a further factor (Vn/V1)1/2 because of the V 1/2

0 term. We thereforehave C∗

ci= Cc(2−i)

(Vn/V1)3/2 where i = 0, 1, 2.

It is easy to show, using an analysis similar to that of equation (5.26), that

C∗c (M

∗) = M2(Vn

V1

)3/2Cc(M).

Aberrations 103

The position is a little more complicated for the coefficient of lateral chromaticaberration, as CD1 is associated with the r2

2 term in the integral for Cc and so C∗D1

must be associated with the r21 in the integral for C∗

c . The magnitude ratios arejust (Vn/V1), because f1 (f ∗

2 ) appears only once. The result, however, is just assimple,

C∗D0

= CD0(Vn/V1) C∗D1

= CD1(Vn/V1)

though this leads to a rather strange result for C∗D(M

∗)

C∗D(m

∗) = M(Vn

V1

)(CD1 − CD0/M).

5.6.4 Figures of meritThe geometric aberrations of a lens system for a given application may be reducedby the correct choice of the number and structure of the electrodes. The applicationwill usually specify the potential ratio of the image and object spaces, the separationof the object and image, the magnification to be used and the maximum acceptablepencil angle of the beam in one or other of the spaces. The desirable parameterwill be a minimum size of aberration disc. The diameter of the lens system mightbe restricted, but it is frequently a free parameter. We can write the radius of theaberration disc as

Br

D= MCs(M)r ′3

1

where we express the coefficient in units of D. The value of the diameter alsoappears in the actual length, L, of the system, in terms of lens parameters, as

L

D= Q − P

D= F2 + q − F1 − p = F2 − F1 − Mf2 + f1/M (5.28)

and so we can write

Br = LMCs(M)

F2 − F1 − Mf2 + f1/Mr ′3

1 . (5.29)

It would plainly be difficult to examine all possible lenses to find that which gavethe smallest value of Br , but, by making two reasonable approximations, it is easyto find a good criterion to point us towards a suitable lens.

The first approximation is to use equation (5.13) to express the sphericalaberration coefficient in terms of the magnification and overall voltage ratio. Thesecond is to equate the arithmetic mean of the mid-focal distances to the geometricmean of the focal lengths†. The denominator of equation (5.29) becomes

2(−f1f2)1/2 − Mf2 + f1/M = −Mf2(1 − 1/σM)2

† This result can be shown to be true for lenses weak enough that Y = 0, where Y is definedby equation (5.14).


and soBr = −LCs0(1 − 1/σM)2r ′3

1 /f2

which can be separated into

Br = −gL(1 − 1/σM)2r ′31

g = Cs0/f2 = −m13

where g contains all the lens dependent information and none of the specificationand is therefore it is a suitable figure of merit for the inter-comparison of lenses.For the type of problem considered here, where the overall voltage ratio forms partof the specification, it will almost certainly be the case that a lens of three or moreelements will have an appreciably smaller value of g than any two element lens.Once the lens geometry and inner voltage ratios have been selected, the choice oflens diameter follows from equation (5.28) where the focal properties are in unitsof D.

For the case of rays focused from infinity, for which the magnification is zero,this analysis fails because the distance between conjugates is infinite. However, theappropriate specification in this case is more likely to be in terms of the distanceof the image from the lens itself, probably specified as L = F2, the mid-focaldistance. The radius of the aberration disc is given by equation (5.6) and the figureof merit is just Cs4/f

31 = −m26(f1/f2), though, as the ratio of focal lengths is

implicit in the specification of the voltage ratio, −m26 is itself suitable.If the specification allows for an adjustable magnification, the quantity to be

optimized becomes the product MCs , which, again using equation (5.13) can bewritten as

MCs(M) = MCs0(1 − 1/σM)4.

We may differentiate this to find an optimum value of M = −3/σ , which givesfor the minimum radius spot in the Gaussian image plane

Br = −9.48(Cs0/σ)r′31 = 2.37Cs1r

′31

and suggests that m23 would be a better figure of merit with this looser constraint.

6 The LENSYS program

6.1 The programs on the disk

The disk which accompanies this book contains a number of files for use on anIBM or compatible personal computer. There are two versions of the LENSYS

program, one, LENSYS2.EXE, compiled in real mode and the other, LENSYSP.EXE,in protected mode. The difference between the two compilations is that LENSYS2

runs in the lowest 640kB memory of the computer and is therefore suitable for abasic PC, while LENSYSP will use memory above this region and will normallybe the version of choice. The program source is exactly the same for the twocompilations.The files are arranged in three subdirectories:

PROTECT REALMODE RELAXDPMI16BI.OVL LENSYS2.EXE RELAX51A.EXE

LENSYSP.EXE LENSYS2.ICO RELAX51A.ICO

LENSYSP.ICO LPOTS.DAT RELAX51P.EXE

LPOTS.DAT RELAX51P.ICO

RTM.EXE RELAXCYL.EXE

RELAXCYL.ICO

The file LPOTS.DAT, which is loaded automatically by LENSYS, contains datafrom which the potentials and potential gradients are calculated. The filesDPMI16BI.OVL and RTM.EXE are required to run any protected mode program.The RELAX subdirectory contains the auxiliary programs which are discussed insection 6.7.

The programs have been written to run under MSDOS, and will run as DOSprograms under Windows 98. The various icon (.ICO) files are provided so thatWindows shortcuts may be generated for all the programs and placed, if desired,on the desktop. LENSYS2.EXE itself requires some 400kB of memory and, if this isnot available, the program will stop with a message that there has been a memoryoverflow error. This might occur on a basic PC if several TSR programs havebeen installed, but is unlikely to occur on a modern machine. If memory is not alimitation, then LENSYSP.EXE is the better choice.

105


A graphics screen is necessary: VGA is best, but EGA is acceptable and, thoughthe program will use a CGA screen, the resolution is only just adequate, especiallyfor text messages and data tables. The Hercules monochrome graphics screen isusually very clear, but the lack of colour is not a trivial loss. The one place wherethe program behaves differently on a monochrome screen is in the menus, in whichthe selected item appears in inverse video rather than in a different colour. All thegraphics routines are specific to the programs and do not use any of the facilitiesof Windows.

6.1.1 Loading and running the programsNone of the programs should be run from the floppy disk, because any data filethey may write is put into the directory from where they are run. All the files inthe REALMODE or PROTECTED subdirectory should be copied to a dedicatedsubdirectory on the computer hard disk. LENSYS will only look in its own directoryfor the LPOTS.DAT data file and the protected mode support files

In DOS mode the program is started by typing the filename at the DOS prompt.In Windows mode it can be started using the RUN command, double clicking onthe filename in Windows Explorer or on the shortcut icon if the user has preparedone. It presents a screen with brief details of the program. If it does not find amaths coprocessor in the computer, it writes a message to the screen telling youthat it would run much faster with one, but it continues nevertheless. Once thisintroductory screen has been displayed, it reads the data file and will then make nofurther use of the program on the disk. After the data file has been loaded, you areinvited to ‘Press ENTER to continue’. This starts the program and shows a screenwhich will provide the communication channel.

6.2 The screen display

The lower three quarters of the screen is the graphics window. This will showschematic representations of the lens being studied and the rays traced duringthe analysis of the lens. The breadth of this window is equivalent to eight lensdiameters. The upper portion of the screen forms two text windows. The widerone, to the right, is called the data window and is for the display of lens data andthe results of the calculations, while that to the left is called the message windowand shows the menus, together with other prompts and messages in the courseof the program. It is usually of a different colour from the data window. Themanual input of lens parameters (potentials, lengths etc) is normally made to thedata window, but sometimes to the message window.

6.2.1 The menu systemAll choices are made via a system of menus, each offering between two and fiveoptions. The opening menu describes itself as ‘Main Menu’ and invites a choicebetween

The LENSYS program 107

Set OptionsLens DataImaging

Exit to DOS

with the top entry displayed in a different colour (inverse video on a monochromescreen) to signify that it is the ‘selected’ item. The selection may be changedby the UP- and DOWN-arrow keys and, as the motion rolls over, the quickest way toselect the bottom item is to use the UP key once. Pressing the ENTER key makes thechoice. Choosing ‘Exit to DOS’ is the normal exit route from the program. If itis chosen, the program does not request verification, but exits immediately with asimple closing message leaving the display in a 25 line text mode. The programdoes not remain in memory and has to be reloaded if required again.

6.3 The Set Options option

Choosing ‘Set Options’ takes you to an ‘Option Menu’ which offers the choice of

Show V-axialShow Cs

Show bothShow neither

Save all

This choice enables the display of the selected data in addition to the ray diagrams.After making the choice, you are returned to the main menu. The axial potentialand the integrands of the spherical aberration integrals will be written to files ifthey are selected to be shown. The ‘Save all’ option saves, in addition to these,the integrands of the chromatic aberration integrals and the Petzval integral, and,though these are not displayed, they can be examined with a suitable graphicsprogram.

6.4 The Lens Data option

Choosing ‘Lens Data’ takes you to another menu ‘Select a Lens’ which offers thechoice of

2-Aperture A=D/22-Aperture A=D

3-Aperture A=D/22-Cylinder G=D/103-Cylinder G=D/10

This section allows you to find the focal properties of a lens. The geometriesof the first four lenses are predefined, but if you choose the three cylinder lensyou are asked, in the message window, for the length of the second element, in


units of the diameter, and this should not exceed 3. If you enter too large a value,the invitation is repeated. No sign is expected, three places of decimals will beaccepted, but the length will be rounded down to the next multiple of 0.0125.The program then draws a schematic of the lens in the graphics window and asksfor the voltage ratios, V2/V1 and V3/V1 (if appropriate). These are entered in themessage window. Up to three decimal places are accepted here. All voltage valuespresented in the data window are on the basis that V1 = 1 and so Vn representsVn/V1. The chosen values of the electrode potentials are written in the graphicswindow, above the schematic electrodes. The message window will tell you thatthe program is calculating the trajectories and the aberrations of the lens† andtwo rays will be drawn in the graphics window. These are the usual principalrays, drawn from the left and from the right, parallel to the axis. The focal pointsare marked and labelled as are the principal points. The axial potential and thefive spherical aberration integrands will also be shown if they were selected fromthe Options Menu. The data window shows three columns of data. The firstcolumn displays values of the focal lengths, mid-focal distances and the positionsof the principal points, the second column values of the five spherical aberrationcoefficients, Csi and the Petzval sum, and the last column shows values of the fivechromatic aberration coefficients, Cci and CD1 .

When the calculation and display are complete, the message window shows afurther menu headed ‘Continue?’

Swap Cs/Mij coeffNew Potentials

New LensPrint/Save

Main Menu

We shall discuss ‘Print/Save’ later. The first choice, ‘Swap Cs/Mij coeff’ changesthe display in the right hand section of the data window to show the eight mij

coefficients and the four elements of the bending matrix described in chapter 1.A repeated choice restores the original display. The effect of choosing ‘NewPotentials’ is obvious. The ‘New Lens’ option leaves you in the ‘Lens Data’ partof the program by returning you to ‘Select a Lens’. If you wish to examine theproperties of a lens between conjugates it is necessary to choose ‘Main Menu’ andthen ‘Imaging’ from that menu.

6.4.1 Error conditionsThere are a number of input parameter values which the program will not rejectimmediately, but it will discover that they are not sensible and will halt with anappropriate message and wait for ENTER to be pressed. A very simple error is toinput V2 = 1 for a two element lens which gives no lens action at all. Anotherinput leading to error conditions is an excessively high voltage ratio. There are anumber of tests for these. If the emergent ray has the wrong sign of slope it has

† The speed of a modern PC is such that these messages may not be seen.


probably crossed the axis twice. If the ray crosses the axis between the outermostelectrodes, the focal length will not be a very useful parameter and it would bemisleading to display the results in the data window.

It is possible to input a zero or negative value for a potential. This would benonsense in the case of the final electrode and the program will reject the negativesign, but it may be acceptable for the centre electrode of the three aperture lensor a cylinder lens having a very short centre element. As an example, the threeaperture lens having V3 = 1 will accept values down to −0.8 for V2. A zero valuegives the condition for the classic Einzel lens.

6.5 The Imaging option

This opens with a further menu, ‘Select Lens Type’, and you have to choosebetween

Aperture lensesCylinder lensesSpecial lenses

The details of the lens specifications and the positions of the conjugate points haveto be entered in a slightly different fashion for the two lens types. These differenceswill be explained later. The program traces a ray starting from the axial objectpoint at the left of the graphics window and with a slope of 1, but the subsequentaction depends on a further choice which you be asked to make by a menu headed‘Solve the Lens’.

Adjust PotentialsFind the Image

6.5.1 Types of solution

6.5.1.1 Automatic adjustment. If you choose the automatic adjustment of apotential, the potential of one electrode will be adjusted, in a series of steps, untilthe ray crosses the axis at the image point you have specified. Each ray is deletedas the next ray is being drawn. In the case of lenses with two or three elementsthe potential of the second electrode is adjusted, but if there are more than threeelectrodes you are asked, by a message, to indicate which one. You may notchoose the first or the last electrode. After the program has made an automaticadjustment it draws two further rays. These both start from an object point abovethe axis and one has a slope of zero and the other a slope of −1. These rayswill cross at the image plane and indicate the magnification. The data windowshows the lens specification (a little differently in the various cases), the electrodepotentials, updated at each iteration step, and the magnification and the sphericaland chromatic aberration coefficients.


For any lens geometry, more than one set of electrode potentials will satisfythe conjugate focus condition. It is possible to persuade the program to pickone combination rather than another by choosing a suitable starting point for thepotential which is to be varied. For example, even a two element lens has twopossible voltage ratios, one accelerating and the other decelerating, and if V2 isinitially set to 0.1 the program should find the retarding solution. However, if thisrequires a potential of less than 0.01 it will fail, with a message ‘This is unlikely toconverge. ENTER continues’. A three element lens started with V3 = V2 = V1 = 1will converge to the solution with V2 > 1. There is a general tendency built intothe program to look for the higher potential solutions because these are usually theones with smaller aberrations.

6.5.1.2 Error conditions. If the automatic routine cannot find a value whichallows the ray to pass through the axial image point, it will issue a message to thateffect and wait for ENTER to be pressed. This will sometimes happen because it islooking in the wrong region. For example, it might be trying very small valuesand meeting the �< 0.01 constraint while a much larger value may be valid. Thiscondition can usually be caught by noting the way in which the variable potentialchanges after each iteration. This may not be possible with the speeds at whichmodern PCs operate, though there are ways, discussed in the appendix, by whichthis factor may be circumvented.

A more common cause for failure is that the overall voltage ratio is too high.If the potential of the variable electrode is observed to be varying around, or nearto, the geometric mean of the neighbouring potentials, this can be taken as a signthat that region of the lens is too strong and new fixed potentials should be tried. Itis often worthwhile to try some values close to these critical ones, using the ‘Findthe Image’ option, to assess the nature of the change that is required.

6.5.1.3 Find the image. If you choose to find the image, only a single ray isdrawn and the program will find the position of the image point appropriate to thepotentials you have chosen and will draw a vertical line through that point. Theimage may lie beyond the right hand end of the screen, but the position is noted inthe data window. The aberration coefficients and the magnification are calculatedand shown in the data window.

You are then asked if you wish to draw another ray. If so, you are asked for thestarting position and slope of the ray and it is then drawn and the ‘Another Ray?’question repeated. None of these rays are deleted until you exit by declining theinvitation. This mode allows a particular set of rays to be drawn for a lens for whichthe automatic adjustment has already provided values of the electrode potentials.It might, for example, be used to find the position of the image of a second aperture,an entrance pupil say, which can be simulated by the intersection of a ray parallelto the axis and some distance above it with the original ray.

In this mode saving the data files will only save the first, image finding ray.Other rays can only be recorded by a screen print.


At the end of the lens solution routines the message window shows the same‘Continue?’ menu as in the Lens Data case, but now the choice of ‘New Lens’returns you to the ‘Select Lens Type’ menu.

6.5.2 Aperture lensesA menu offers you the choice of the three aperture lenses already seen in the LensData option. A message asks for values of the object and image distances, P andQ, with a constraint that the total distance from object to image should not exceedeight diameters. Despite the sign convention requiring P to be negative, onlythe magnitude should be entered as the program can only handle real objects andimages and there can, therefore, be no ambiguity. The values appear in the datawindow. A schematic diagram of the lens is drawn in the graphics window, withthe object at the left hand edge of the window, and the centre of the lens a distanceP to the right. A vertical line is drawn across the axis a further distance Q beyondthe lens centre. You are then asked for the electrode potentials which are written inthe data window replacing ‘?’ prompts. Next the ‘Solve the Lens’ menu appearsand the subsequent behaviour is as we have already discussed.

6.5.3 Cylinder lensesA menu, ‘Select a Lens’, offers a choice of various cylinder lenses

2-5 CylinderAfocal6Afocal8

Varimag6Varimag8

The last four of these will be discussed later. Choosing the first option leads to yetanother menu, ‘How many cylinders’, and thence to the particular multi-cylinderlens you require.

In contrast to the aperture lenses, for which only a limited set of spacings areavailable in this program, the cylinder lenses may use a range of lengths of cylinderand it is rather simpler to specify the object and image positions in terms of theselengths. This is particularly the case for a lens with cylinder lengths of (say)2, 1, 2 and 3 diameters where the choice of reference plane might be a matter forargument. The program, therefore, expects the object to be at the left hand end ofthe first cylinder and the image at the right hand end of the last cylinder, though itwill extend the last cylinder to the right hand edge of the screen in the schematicdiagram, marking the image plane with a transverse line as before. Having chosenthe number of cylinders, you are asked for their lengths (which will be roundeddown to the next multiple of D/80) and their potentials, which may be specified tothree decimal places. The values of the lengths are taken to include a half share ofthe inter-cylinder gap ofD/10 and the total length must not exceed eight diameters.As a trivial example, a lens with L1 = 2.00, L2 = 1.00 and L3 = 3.00, for whichthe reference plane is unambiguously in the middle of the second element, would


have P = 2.50 and Q = 3.50. The ‘Solve the Lens’ menu then appears and thesubsequent behaviour is as we have already discussed. Lenses with three cylindersare widely used to satisfy the simple condition of maintaining a conjugate pair ofpoints for a range of overall voltage ratios. An example was shown in figure 4.6.

The length of the centre element affects the behaviour of both the magnificationand the spherical aberration, though not to a great degree. The addition of a fourthcylinder allows for better control of the magnification, but a fifth cylinder opensthe possibility of controlling not just the magnification, but also the sphericalaberration of the lens system.

6.5.3.1 The aberration behaviour of the five cylinder lens. The conjugate focuscondition for a lens having the potentials of the third and fifth electrodes fixed maybe achieved with each of the intermediate potentials at a high or low value comparedwith their neighbours. The four possible combinations lead to significantlydifferent coefficients of spherical aberration for all values of the magnification.This is illustrated in figure 6.1 for a five cylinder lens having V5 = V3 = V1 = 1.For all values of the magnification, the spherical aberration coefficient is leastfor V2/V1 > 1 and V4/V3 > 1.

6.5.4 The Afocal and Varimag lensesThese lenses each have five cylinder elements with lengths

for the “6”

L1 = 1.25

L2 = 0.50

L3 = 2.50

L4 = 0.50

L5 = 1.25

and the “8” lenses

1.50

1.00

3.00

1.00

1.50.

6.5.4.1 The Afocal lens. The afocal lens was discussed in chapter 4 and the onlyinput required is the value of the overall voltage ratio. The program automaticallysets V3 = V

1/25 and adjusts the other potentials maintaining V4/V3 = V2/V1. The

data window reflects the changes in the potentials and, after convergence, showsthe magnification and the spherical aberration coefficient. The program operatesonly in the regime with V2 and V4 both at the higher of their two possible values.The shorter lens has much smaller aberrations, but the end elements are rathershort. Placing a real angle stop within the electrode would therefore affect thepotential distribution to a certain extent.

The behaviour of a non-symmetric afocal lens of total length 8 D is discussedlater in section 6.5.5.1.

6.5.4.2 The Varimag lens. These are very powerful lenses which allow themagnification and the overall voltage ratio to be specified and a further parameter


Figure 6.1 The spherical aberration, expressed as the product MCs , of a five cylinderlens with V5 = V3 = V1 as a function of the magnification for values of V2 and V4 in thefollowing ranges:

Region

AB

BC

CD

DA

V2

> 1

> 1

< 1

< 1

V4

> 1

< 1

< 1

> 1.

to be adjusted. This may be either the magnification of an entrance pupil orthe spherical aberration coefficient of the lens. These are controlled, though notindependently, by the potential of the centre electrode. The program assumes thatthere is an entrance pupil at a distance D/2 from the entrance window. You areasked to supply values for V5, V3 and the target magnification. This will be anegative quantity, but your input will be read and displayed in the initial data asthe modulus, |MagT|. The program first examines the range of magnificationswhich can be achieved with these potential values, and may halt with a messageto the effect that the target magnification is outside the usable range. Otherwiseit adjusts V2 to satisfy the conjugate condition and then adjusts V4 to change themagnification. Both V2 and V4 are constrained to take the higher of their possiblevalues.

These adjustments of V2 and V4 continue alternately until the windowmagnification is close enough to the target value. At this point the data window,


which has monitored the potentials and the window magnification throughout,displays the target, window, pupil and longitudinal magnifications (MagT, MagW,MagP & MagL) and the spherical aberration coefficient of the lens. A menu invitesa choice of

Modify V3Continue

The first allows you to change V3 and the whole convergence procedure is repeatedwhile the second opens the usual ‘Continue?’ menu.

Increasing V3 from the ‘afocal’ value of V1/2

5 will both decrease the pupilmagnification and reduce the spherical aberration. If the target magnificationis close to one of the limiting values with the initial value of V3, it is possiblethat changing V3 might modify the limit sufficiently to cause the program tofail to converge. If this happens, it will be necessary either to change the targetmagnification or to accept a restriction on the value of V3.

In the case of the six diameter lens, the exit pupil can be seen, provided that thelongitudinal magnification is less than four. The reader will notice, if this lens isoperated with V5 = V3 = 1 and MagT= −1, that the pupil magnification is notquite equal to −1. This is because the potential 1.25 diameters inside a cylinderdiffers sufficiently from the potential of the electrode itself that the trajectories arenot quite straight. The effect is significant only at the 1% level and, for the eightdiameter lens, it is even smaller.

6.5.5 The special lensesThe ‘Special Lens’ menu has two choices:

Movable AfocalExcitation Lens

6.5.5.1 The movable afocal lens. This is the same as the ‘Afocal8’lens of section5.4.1, but with provision for moving the object point along the axis by an amountdL, which may be positive or negative, but should not exceed 1 D. You are askedfor a value for dL and for the overall voltage ratio, V5. The program behavesexactly as the ‘Afocal8’ program and then offers a choice

Change dLContinue

‘Continue’ takes you to the ‘New Potentials’ etc menu and that choice returns youto the lens, with the previously chosen value of dL, and allows you to set a newvalue for V5. This procedure is slightly more clumsy than for the preset geometryof ‘Afocal8’, but the two variables allow you to study the dependence of Cs on bothparameters. While the magnification of an afocal lens depends only on the overallvoltage ratio, the spherical aberration depends also on the location of the objectand image points within the lens. This behaviour was discussed in chapter 5.


All the data for this lens are written to a file MOVFOCAL.DAT. Note that thisfile may be added to if ‘Movable Afocal’ is chosen again during a single run ofLENSYS, but will be overwritten if the program is stopped and restarted.

6.5.5.2 The excitation lens. This is a five cylinder lens developed for onepurpose, to study the excitation of a gas target by a beam of electrons and thesubsequent radiative decay. The requirement is for a reasonably constant potentialin the observation region at the centre of the lens and some form of potentialbarrier to restrict secondary electrons, formed at beam-forming apertures, fromentering this region. It is geometrically and electrically symmetric with V2 andV4 operated, usually, at a lower potential than V3. An application of the lens to aspecific problem is discussed by Heddle and Kay [18].

You are asked for values of the potentials and to specify which one shouldbe varied. After solving the lens, the program draws two rays of slope ± 1from an axial object point and displays the magnification, the spherical aberrationcoefficient and the radius of the trajectory at the centre of the lens, rmid , in unitsof the lens diameter.

6.6 The Print/Save option

The lens data, the ray paths and the axial potential and spherical aberrationintegrands if selected, may be written to files. A simple pixel level dump ofthe screen can be made to a suitable printer connected to the first parallel port(LPT1) of the computer.

On selecting Print/Save from the ‘Continue?’ menu, a second menu, ‘PrinterType’ invites a choice of

Save to Files‘Laser’

‘Matrix’

The terms ‘Laser’ and ‘Matrix’ are meant to define respectively a 300 dpi laseror inkjet printer and a nine pin dot matrix printer.

6.6.1 The Save to Files optionChoosing ‘Save to Files’will cause a counter to advance by one and the data for thelens distances, potentials, aberration coefficients and magnifications to be writtento a file LENSnnn.DAT where nnn is the number of the lens. This is reset to zerowhen LENSYS is restarted and incremented when each lens for which ‘Save toFiles’ is selected. The ray paths are saved to a similar file with extension .RAY. Ifa choice has previously been made to show the axial potential or the aberrationintegrand, then these will be written to files with extensions .POT and .ABR. If the‘Save all’ choice has been made, then the integrands of the chromatic aberrationintegrals will be written to a .CHR file and, in the ‘Lens data’ mode, the integrandof the Petzval integral will be written to the .ABR file.


z/D

Figure 6.2 The ray and aberration data for a two aperture lens plottedusing the .RAY and .ABR data saved from the same run as the screendump of figure 5.11.

If the choice of ‘Find the Image’ ray tracing has been made the .DAT file showsthe image position. Even if a second ray is traced, only the original ray will besaved to the .RAY file, because further rays would overwrite the first one. It wasnot practicable to use different ray variables as there is no limit to the number ofadditional rays, for various launch positions and angles, which may be traced.

The saved .RAY, .ABR, .CHR and .POT files can be displayed and scaled by asuitable graphics program and show much smoother variation than the screencopies. Figure 6.2 shows the same data as figure 5.11 displayed using the ‘EasyPlot’program†.

All the data files are properly normalized and numerical integration of the saveddata will give the same values as listed in the .DAT file.

A complete save of the data files for a ‘lens data’ run amounts to some 125 kBand for an ‘imaging’ run to some 70 kB. On leaving the program, the exit screenwill remind you if you have saved any files. You should examine these before yourestart the program as they will be overwritten, or set to zero by fresh saves.

6.6.2 The printing optionsThe other choices from the ‘Printer Type’ menu lead to a ‘Which Region’ menu

† Cherwell Scientific Publishing Limited.


Whole ScreenData Window

Graphics WindowContinue

The ‘Continue’ option is offered in this menu, because a choice of ‘Save to Files’for a lens for which this is not possible, would otherwise force you to an unwanted‘Print’. This gives a safe exit.

The message window then clears and some information about the lens isdisplayed there. The printing is done by reading each pixel and copying it as a blackdot unless it has the colour of the background. The DOS Graphics command is notused and the quality of the printed output reflects that of the monitor, but with allthe colours lost. Figures 2.1(b), 5.10 and 5.11 were printed from a VGA monitor.The program issues two Line Feed characters after printing, but no Form Feedcharacter. A VGA display printed to a laser printer retains the correct proportionsand is of such a size that two complete screens or seven data windows may beprinted on A4 paper.

If the printer is not connected, not turned on or has run out of paper a messagewill be issued and the program will wait for ENTER to be pressed, at which point itwill return to the ‘Continue?’ menu.

This method of obtaining a printed image of the screen will only work if theprinter can interpret the codes of the PCL level III language, in the case of alaser/inkjet printer, or of the Epson FX language in the case of a dot matrix printer.Many modern laser/inkjet printers cannot do this as they rely on drivers built in tothe Windows system.

6.6.3 Printing under WindowsIf the program is run under Windows, a copy of the screen can be saved to theclipboard by pressing the PrintScreen key and can be pasted into the PAINT programusing Ctrl V. It can then be printed in colour to any Windows printer. This is notvery satisfactory, because the blue background, which is very suitable for visualdisplay, does not offer adequate contrast when printed. More sophisticated paintingprograms are available which allow the colours to be changed and also allow thescreen to be captured and cropped. A white background is the most effective forprinting, but any white areas, such as text, should first be changed to (e.g.) black.

6.7 The auxiliary programs

These are included to illustrate the manner in which the potential distributionapproaches its final value. They also allow the oscillatory behaviour, which occurswith large values of g, to be observed.

6.7.1 The RELAX51A.EXE and RELAX51P.EXE programsThese programs illustrate the solution of the two dimensional problem described


in section 3.4.1 using the five point relaxation methods. The initial condition isthat the potential along the left hand side and the top of the 51 × 51 array is set to1, while the rest of the array is set to zero except for the bottom left and top rightcorners which are set to 0.5.

They differ in the order of scanning the array of points. RELAX51A follows thesequence

for x:=1 to 49 dofor y:=1 to 49 do

while RELAX51P follows the sequence

for x:=1 to 49 dofor y:=49 downto 1 do

The effect of these sequences is to drive the potential distribution along the x = y

diagonal and perpendicular to that diagonal, respectively, which gives a moresymmetrical distribution along the diagonal at all stages.

In both cases the criterion for satisfactory completion of the iterative proceduresis more severe than that illustrated in figure 3.8. Instead of just requiring thatthe sum of the potentials on the diagonal should be smaller in magnitude thansome value, these programs require the sum of the absolute differences betweenthe individual diagonal potentials and 0.5 to be below some level. In the originalprogram, it would be quite possible to satisfy the criterion while having a systematicexcess for x(= y) < 25 and a systematic deficit for x(= y) > 25. The differentscanning modes of these two programs allow the asymmetry in the axial potentialto be seen to develop in the ‘A’ version, while a symmetry is imposed by the ‘P’scanning mode.

On running these programs you are asked for a value for the accelerationparameter to be used, and the value of the sum of the potentials of the pointson the diagonal is displayed on the screen after each cycle of iteration. This sumis also saved to a file identified by the value of the acceleration parameter.

The screen also displays an array of 51 × 51 small squares which are colouredto represent the potentials at each point. The way in which the potentials changewith successive iterations can be seen and the effect of acceleration and, especially,excessive acceleration is very clear.

6.7.2 The RELAXCYL.EXE programThis treats a similar problem in cylindrical coordinates, the two cylinder lens havinga narrow gap. The five point relaxation method is again used here and the criterionfor satisfactory completion is one which is appropriate for a demonstration, but isnot as safe as that used for accurate calculations. The change in the potential at eachpoint on the axis after each stage of iteration is determined and these values areadded algebraically to give a figure, BV , and then their moduli are added to give asecond figure, |BV |, which is never negative. This latter value decreases, thoughnot without limit, and the iteration is stopped when it becomes less than some


value, which, in this example, is set as 0.000 01. Accurate methods of assessingconvergence are described in the appendix. The method used here correspondsto the convergence of the axial potential to a few parts in 10−5, which is quiteadequate for the very coarse display of the potential used in this demonstration.

The lower part of the screen displays the iteration number, BV and |BV |.Notice that these last are equal for small values of the acceleration parameter, butBV changes sign to a greater or lesser degree as this parameter is increased.

The screen displays a representation of the electrode system with the potentialsat each point represented by coloured squares. In contrast to the square arrayproblem, potentials in adjacent regions differ by constant ratios, i.e. the coloursare chosen to represent the logarithm of the potential. The effect of too large avalue of the acceleration parameter is quite dramatic.

Appendix: Technical aspects of LENSYS

A.1 The programming language

The LENSYS2.EXE program is written mainly in Borland Pascal with some sectionsin Assembly Language. The floating point variables are of type double and requirea maths coprocessor or its emulation: this latter is included in the code and willbe used automatically if needed.

A.2 Initialization

The program first looks for a maths coprocessor and, if it does not find one, writesa message to the opening screen. It then loads the data file, stores the contentsin dynamic memory and waits for a response from the user. The type of graphicsadaptor is then examined and various scale and colour parameters are set in thelight of this. If a CGA adaptor is found, it is set to the four colour resolution of320 by 200. This does not affect the effective resolution of the display seriouslyand is preferred to the two colour 640 by 200 mode.

A.2.1 The effect of processor type on the speed of the programWhen the program was first written, some ten years ago, typical processor clockspeeds were 10–25 MHz and the type of processor was significant. The clockspeeds in modern personal computers are some 20 times as fast and even the longestcalculations, such as the Varimag routines, are completed in under a second. Thisis not an unalloyed gain, as there are occasions when it would be of interest toobserve the convergence process. There are various programs available which canslow the effective clock speed† and so allow the successive traces to be observed.

† For example, SLOWDOWN.COM, which can be downloaded frommembers.aol.com/bretjohn.

120

Technical aspects of LENSYS 121

A.3 Calculation of potentials

The file LPOTS.DAT contains values of the axial potential and potential gradient forall the lens geometries at intervals of D/80, which is the step size used by the raytracing routines, from the centre to a distance 8 D into the lens. The potentialswere calculated at closer intervals to improve both the precision and the accuracyof the data.

A.3.1 Aperture lensesThe potential distributions for the aperture lenses were calculated by the ninepoint relaxation method of equation (3.23) using, in the notation of figure 3.10,D1 = 160, D2 = 480 and D3 = 800 with a length from the centre of the lens toone end cap of 2400 and half-gaps, A/2, of 40 or 80. By using a length of 5 D2

the effect of the end cap is completely unimportant in determining the potential at8 D1 from the lens centre.

A.3.2 Cylinder lensesThe potential inside two closely spaced (D/10) cylinders is very well representedby an expansion in terms of Bessel functions and the particular method described inSection 3.6 was used in the original version of LENSYS. This required an estimateof the gap length correction. This correction is now known to a much greateraccuracy, but only as a result of comparison with the very accurate results of thenine point relaxation method [19] and so we have used these results directly in thepresent version. The diameter of the cylinder was 320, the length, from centre toend cap, 3200, and the gap half-width 16 all in units of the lattice spacing.

A.3.3 Details of the calculationsThe space within the half-lens was represented by two arrays. The main, cylinder,array had dimensions 241 × 2401 for the aperture lenses and 321 × 3201 for thecylinder lens. The gap array had dimensions 160×41 or 81 for the aperture lensesand 160 × 17 for the cylinder lens. The relaxation procedure is started, usingequation (3.23a) at the outermost point at the midplane and progresses row byrow down to the cylinder radius when the pattern changes to column by columnadvancing axially to the closed end of the electrode. The axial potential is thencalculated using equation (3.23b).

Though the lens has a nominal gap of D/10, the finite spacing of the lattice ofpoints on which the potential is calculated leads to an effective gap of 0.09964,some 0.3% less. This not a serious discrepancy.

The convergence of the iteration is assessed by an automated version of thatillustrated in figure 3.14. Values of the axial potential are recorded after somenumber, n, cycles of iteration and are compared with values found after a further δncycles. The initial value of n is usually 1000 and that of δn 100. The test iswhether |((Vn(z, 0) − Vn+δn(z, 0))/Vn+δn(z, 0))| < 10−14 as z is advanced fromthe midplane. At the point where this test fails, the values of n+ δn and (z−1)/D


Figure A.1 The convergence of the iterative process. The left hand ordinate shows valuesof |BV | for iterations started with g = 1.980 and, when |BV | < 10−12, continued withvalues of (from the top) 1.973, 1.967, 1.960, 1.950, 1.900, 1.850, 1.800. The resultfor iteration with g = 1.950 throughout is also shown. The right hand ordinate shows theapproach to convergence for the fastest choice of g : 1.980 initially and 1.973 later (+), andfor g = 1.950 in both modes. The full line interwoven with the © data is discussed in thetext.

are recorded, n is reset to n + δn and the iteration is continued until the test issatisfied out to z/D > 9.5. There is no point in testing further as the potential isaffected by the end cap and, in the present application, values greater than z/D = 8will not be used.

Figure 3.9 suggests that an acceleration parameter, g, somewhat greater than1.97 should give optimum convergence and the results from a sequence of valuesshow that g = 1.972 is, indeed, best giving convergence after 4700 cycles ofiteration, though the minimum is not very sharp.

This value was obtained with a single acceleration parameter, but we have seenearlier (figure 3.11, for example) that the initial approach to convergence may bespeeded up by using a significantly greater value until the magnitude of the changesin the potential have decreased towards their final value. We show in figure A.1the behaviour of |BV | obtained using g = 1.98 until |BV | has decreased to 10−12

and then smaller values. The different asymptotic values of |BV | are clearlyshown, covering a range of almost 1 : 10 for g = 1..800 − 1.973 which lattervalue gives convergence after 2400 cycles. The progression of this convergence isshown by the + symbols. Increasing the acceleration parameter to 1.974 increasesthe number of cycles for convergence to 4900. This would overwrite the data forg = 1.973 and is, therefore, not shown.


The figure also shows, by � symbols, the convergence for g = 1.950 afterthe initial over-acceleration. The interesting feature here is that convergence toz/D ≈ 5.5 occurs very rapidly. This is always the case if the second value of g issignificantly smaller than the first. Even if g = 1.00 there is still this very rapidconvergence to z/D ≈ 5. Notice that this is faster than the initial convergence forg = 1.973 which gives the fastest complete convergence. This step, for that is howit appears with δn = 100, is associated with the even more rapid fall in |BV | whenthe g-value changes. This very rapid convergence, assessed as it is by the valuesof the axial potential, does not mean that the potential away from the axis hasreached its final value. A rough guide to the behaviour with excessive accelerationcan be found by running the RELAXCYL program. The advancing equipotentialsare clearly not of the same shape as for more reasonable acceleration, though thefinal state does represent a complete convergence. It can be helpful to PAUSE theprogram if the computer speed is too high.

The fall in |BV | obtained with a single value, 1.950, of the accelerationparameter, and the associated approach to convergence (©) are also shown. Thefull line which weaves through these points is a copy of the � data displaced to theright by 6340 and shows that, after the initial rapid rise, the behaviour is controlledby the final value of g. It is very similar to the g = 1.973 behaviour (+), but withthe abscissa stretched by a factor of seven.

The optimum number of cycles of iteration is determined by a balance betweenthe convergence criterion and the limiting value of |BV |. This is roughly inverselyproportional to 1/(2 − g). If the convergence criterion were relaxed to, say,2 × 10−14, g = 1.974 would give convergence after 2300 cycles, a fairly trivialgain, not justified by the loss of precision.

A.3.3.1 Multi-cylinder lenses. To determine the potential distribution in a multi-cylinder lens, we write

Vi,i+1[z] = V [z + zi,i+1] and Ei,i+1[z] = E[z + zi,i+1]

where zi,i+1 represents the position of the gap between the ith and (i + 1)thcylinders. We can then represent the axial potential and its gradient in terms of theelectrode potentials as

V [z] = V1 + (V2 − V1)V12[z]

+ (V3 − V2)V23[z]

+ (V4 − V3)V34[z]

+ (V5 − V4)V45[z]

E[z] =(V2 − V1)E12[z]

+(V3 − V2)E23[z]

+(V4 − V3)E34[z]

+(V5 − V4)E45[z]

and then calculateT [z] = E[z]/V [z]

T 2[z] = 0.1875 ∗ T 2[z]


where T 2 ≡ T ∗ of chapter 4, and

F [z] = V 1/4[z]

which is needed for conversion from real to reduced radii and vice versa.

A.4 Ray tracing for focal data

The screen has a width of 640 pixels and we set this equal to 8 D. We calculatethe ray paths at intervals of D/80 but display only every fourth point. Asthe positions of the focal points are not known, we must trace the principalrays starting from points a unit distance from the axis and directed parallel tothe axis. The reduced rays are traced in the forward and reverse directionsby integrating the Picht equation using Numerov’s algorithm (equation (4.3)).This requires two initial values of the reduced radius to be known. These aretaken to be RF [0] = F [0] and RF [1] = F [1] for the forward direction, andRB[639] = F [639] and RB[638] = F [638] for the reverse direction. RF

corresponds to r2 of equation (1.5) and figure 1.5 of chapter 1, and RB to r1.Both pairs of values correspond to a real initial radius of one. Subsequent valuesof RF [z] are calculated and, if z MOD 4 = 0, a point is plotted to the screenat (z, RF [z]/F [z])with the ordinate suitably scaled to take account of the differentresolutions of the various graphics cards. IfRF [z+1]/RF [z] < 0 then the ray hascrossed the axis and the position of the first focal point is found by interpolation.The position of the corresponding principal point is calculated from the slope ofthe real ray, extrapolating back to a radius of 1. The focal length is then found bysubtraction. If the ray has not crossed the axis by the end of the calculation (i.e.ifRF [639] is still positive), which will happen for weak lenses, the focal propertiesare all found by extrapolation. Values for the reverse direction are determined inexactly the same manner. The matrix elements are also calculated.

The coefficients of spherical aberration are calculated using equationsanalogous to equation (5.17) with the integrands expressed by equation (5.18).RF [z] and RB[z] are first scaled to give R1[z] = f1 ∗ RB[z] and R2[z] =−f1 ∗ RF [z], and the integrals are evaluated using Simpson’s rule. Similarcalculations, using equations (5.19)–(5.22) give the Petzval integral and the variouscoefficients of chromatic aberration. The Petzval integrands for the three lenses forwhich the spherical aberration is illustrated in chapter 5 are shown in figure A.2.

There are certain conditions which will halt the program. If the slope of theray at the exit plane is zero, and if the ray has not crossed the axis, then there isno lens action. If the ray crosses the axis between the lens centre and the outerelectrode, it is treated as being too strong to be useful. If the slope is positive at theexit plane, then the ray must have crossed, or be about to cross, the axis a secondtime, and the lens is unlikely to be useful.


Figure A.2 The Petzval integrands for the three aperture lens (- - - -),the Afocal8 lens (——) and the two aperture lens (– – –) illustrated infigures 5.10 and 5.11.

A.4.1 Accuracy and consistencyThe lens data are presented to 3 1

2 significant figures, though the last may notalways be too reliable. The calculation of the forward and reverse characteristicsare made independently and the ratio of the focal lengths provides a test of theprecision of the ray tracing routines. This ratio should equal the square root ofthe overall voltage ratio. The determinant of the lens matrix should equal −f1/f2.Further tests may be made by examining lenses which are reversals of each other,for example 10 : 1 and 1 : 10, or 1 : 8 : 2 and 1 : 4 : 0.5 for a three elementlens. These tests can also examine the coefficients of spherical aberration, whichshould be related by equation (5.26).

A.5 Solving the imaging problem

A.5.1 Finding the imageThe two initial values of the reduced radius are taken to be R[0] = 0 andR[1] = h ∗ F [1], where h = 1/80 is the lattice spacing, corresponding to aslope of one for the real radius. The Picht equation is again integrated step by stepfor the whole length of the system, and each fourth point is plotted, again as realradius. The point where the ray crosses the axis is found in the same way as thefocal point and the magnification and the aberration coefficients are calculated asexplained later. No rays are traced beyond the limits of the screen and the imagepoint is found by extrapolation if necessary.


Further rays may be traced with initial values R[0] = r × F [0] and R[1] =(r + s) × F [1] where r and s are values of the initial radius and slope input inresponse to the ‘Another ray?’ prompt.

A.5.2 Solving the lensThe same initial values of the reduced ray are used and traced for the length ofthe system. The value of R[I ], where I represents the image plane, is noted andthe focusing voltage, Vf , is adjusted by an amount which depends on this missdistance as

Vf = Vf + k × R[I ].

After the first trace, k can only be guessed and a value of 0.1 is taken in order thatthe effect will be quite small. After this first correction, a suitable value for k canbe estimated, because R[I ] is then known for two values of Vf , and we write

k = (Vf − V†f )/(R†[I ] − R[I ])

where the † indicates the previous value.The first change is deliberately made small, because even the sign of k is not

initially known. If R[I ] is positive, the lens is too weak and must be strengthened.The correct value of Vf may have one of two values: ‘high’ or ‘low’, with the‘high’ value generally offering better aberration behaviour. However, there areoccasions when the ‘low’ value is of interest. In this case, the appropriate changein Vf is a reduction. The small initial change is a way of letting the program knowwhere the target value of Vf lies.

If the change in Vf leads to a value less than 0.01, it is replaced by a smallrandom value, greater than 0.01. The value is small to keep the search for Vf inthe ‘low’ region, and random so that the program does not immediately get stuckin a loop. The target for satisfactory imaging is that |R[I ]| � 0.001.

When the correct value of Vf has been found, the coefficients of spherical andchromatic aberration are calculated using equations (5.17) and (5.20). Two furtherrays are then drawn from a point above the axis; one is initially parallel to theaxis, and the other has a slope of −1 in real space. These two rays cross again atthe image plane, and the magnification can be calculated from the final and initialoff-axis positions. The exact starting point is chosen to be such that the magnifiedimage lies within the length of the marker drawn perpendicular to the axis at theimage plane. This requires that the program knows the magnification before theserays are drawn. The reader should have no difficulty in working out how this ispossible!

A.5.3 The afocal lensThis is handled in exactly the same way as any other lens. Though only V5 isspecified by the user, V3 = V

1/25 and the two focusing voltages are related by

V4 = V2 × V3. Only the higher values of the focusing voltages are considered.


This is achieved by setting the lower limit for V2 not at 0.01, but at√V3, a value

for which the V1−V2−V3 lens is close to minimum power†.

A.5.4 The Varimag lensConstraints are first applied to the possible values of V2 (�V

1/23 ) and V4

(�(V3 × V5)1/2). V2 is set to its minimum value, and V4 is first set to the greater

of V3 and V5, and then adjusted to focus the lens. Under these conditions, themagnification of the lens is close to the minimum value possible for the givenvalues of V3 and V5, because the V1−V2−V3 lens contributes minimally to thetotal power. To find a value for the magnification close to the maximum possiblevalue, V4 is set to its minimum value andV2 set to the greater ofV1, V3 and adjustedto focus the lens again.

If the target magnification lies outside these limits, the program halts with amessage that the maximum (or minimum) value has a certain value. This valueis given as some 5% less than the true maximum as a safeguard against laterproblems if V3 is changed. If the target magnification is acceptable, then V4 isset to a value found by interpolating between its values for the minimum andmaximum magnification, and V2 is adjusted to focus the lens. The magnificationis compared to the target value and V4 is re-adjusted. Cycles of focusing usingV2 and magnification adjustment using V4 continue until the target value is metto within 0.1%. This is quite good enough for any practical purpose, but it isnoticeable that the dependence of magnification on the ratio V4/V2 is often quiteweak.

† The reader might like to examine the focal properties of three element lenses with thecentre element at potentials between those of the outer two.

References

[1] Hawkes P W and Kasper E 1989 Principles of Electron Optics (London:Academic)

[2] Szilagyi M 1988 Electron and Ion Optics (New York: Plenum)[3] Harting E and Read F H 1976 Electrostatic Lenses (Amsterdam: Elsevier)[4] Picht J 1932 Ann. Phys., Lpz. 15 926–64[5] Weber E 1939 J. Appl. Phys. 10 663–6

Gray F 1939 Bell Syst. Tech. J. 18 1–18Bertram S 1940 Proc. IRE 28 418–20

[6] Verster J L 1963 Philips Res. Rep. 18 465–605[7] Press W H, Flannery B P, Teukolsky S P and Vetterling W T 1986 Numerical

Recipes (New York: Cambridge University Press)[8] Read F H 1999 private communication[9] Renau A, Read F H and Brunt J N 1982 J. Phys. E: Sci. Instrum. 15 347–54

van Hoof H A 1980 J. Phys. E: Sci. Instrum. 13 1081–9Martinez G and Sancho M 1982 Am. J. Phys. 51 170–4

[10] Edwards D 1983 Rev. Sci. Instrum. 54 1729–35[11] Natali S, Di Chio D and Kuyatt C E 1972 J. Res. NBS A 76 27–35[12] Durand E 1957 C. R. Acad. Sci. Paris 244 2355–8[13] Kasper E 1990 Nucl. Instrum. Methods A 298 295[14] 1958 British Association Mathematical Tables vol 6 (Cambridge: Royal

Society)[15] Bonjour P 1979 Revue Phys. Appl. 14 533–40[16] Renau A and Heddle D W O 1986 J. Phys. E: Sci. Instrum. 19 284–95[17] Hawkes P W 1987 J. Phys. E: Sci. Instrum. 20 234–5[18] Heddle D W O and Kay S M 1996 J. Phys. B: At. Mol. Opt. Phys. 29 1263–71[19] Heddle D W O 1999 J. Phys. D: Appl. Phys. 32 1447–54

128

Documents

Electrostatic Lens Systems - Dipartimento di Matematica e ...gavioli/corsi/MSFM/ref/Analizz... · 4.3 The calculation of ray paths 65 4.3.1 Ray tracing in a known potential 65 4.3.2