Improved stochastic optimization algorithms for adaptive optics

Computer Physics Communications

ELSEVIER Computer Physics Communications 99 (1997) 255-269

Improved stochastic optimization algorithms for adaptive optics *

T.E. Kalogeropoulos a, Y.G. Saridakis b, M.S. Zakynthinaki b

a Physics Dept., Syracuse University, Syracuse, NY 13244-1130, USA

b Applied Mathematics & Computers Lab., Technical University of Crete, Chania 73100, Greece

Received 10 November 1995; revised 1 July 1996

Abstract

Optical observations by ground-based astronomical telescopes have long been affected by the distorting effects of the earth's atmosphere. A real time system of optical elements (Adaptive Optics (AO) system) must be introduced to change the distorted wavefront into a nearly planar one. Adaptive corrections are to be achieved by a deformable or a segmented mirror, which is driven by an optimization device. Easy and cost effective implementation by analog networks and the independence of the dynamics of the system were some of the reasons which led us to consider the ALOPEX stochastic optimization algorithm. The optimal determination of the free parameters of the ALOPEX algorithm leads to efficient convergence and rapid tracking of the optimal shape of the AO mirror.

* This work was supported by the General Secretariat of Research & Technology of Greece under the grant PENED - AA 1431.

0010-4655/97/$17.00 Copyright © 1997 Elsevier Science B.V. All rights reserved. PII S0010-4655(96)00101-4

1. Introduction

The resolving power of optical astronomical telescopes with apertures greater than about 10 cm has been limited by atmospheric distortion, rather than diffraction, ever since the first astronomical telescopes were built. Variations in the index of refraction, due to fluctuations of temperature and pressure in the atmosphere, cause an incident plane wavefront to deviate from planarity. The deviations have spatial and temporal characteristic properties that depend on telescope location, weather conditions, and the wavelength used. Atmospheric seeing limits the resolving angle of large telescopes to about 1 sec of arc, that is, to the diffraction limit of a 10 cm diameter telescope. This limit has made it impossible to image the disks of nearby stars, close binary systems, and the detailed structure of the cores of galaxies [9-11].

Many attempts to lessen atmospheric distortion by locating the observatory at high altitudes have been made, but were only partially successful. An alternative to postdetection processing is real time compensation of the telescope optics system in order to cancel out the phase distortions introduced by the atmosphere. This must be accomplished adaptively, i.e. in phase with the rapidly changing atmospheric conditions.

Adaptive optics systems work in a conceptually simple manner. Light arriving from a distant star is essentially a plane wave until atmospheric turbulence deforms the wavefront's shape or, equivalently, induces local phase delays across the wavefront. A number of such adaptive optic devices have already been built and operated on large ground-based telescopes, delivering near diffraction-limited performance at infrared and visible wavelengths [13]. Adaptive optics not only provides the means for increasing the angular resolution in direct imaging, but also higher performance for many spectroscopic, interferometric and photometric measurements.

In the present work adaptive corrections are to be achieved either by a deformable mirror, or by an array of smaller plane mirrors, called a segmented mirror. Incoming light from an astronomical source reflected off the deformable mirror leaves the mirror's surface in its original pristine state, as if it had never encountered any atmospheric distortion. Control of this deformable or segmented mirror may be achieved by sets of ordinarily piezo-electric devices, called actuators, that produce motions of the order of several kHz. These actuators can be controlled by a full aperture detecting system, where an image quantity is used as an objective (cost) function in an optimizing algorithm [4,6-8].

The technological difficulties of achieving full compensation for atmospheric turbulence are many, and some of the methods under development are quite costly. In its technological application, our study is to make use of a multi-channel optimization device, which would satisfy the requirements imposed by the problem, and will have the following advantages:

• Easy implementation by analog networks.
• No interferometry is required, making this method applicable to even weak sources, and at the same time greatly reducing the complexity and cost of the system.
• No knowledge of the dynamics of the system, or of the functional dependence of the cost function on the many variables, is required.

For the purpose of optimization, let

$$f = f(x_1, x_2, \ldots, x_n; y_1, y_2, \ldots, y_m) \equiv f(x_1, x_2, \ldots, x_n)$$

denote the cost function, that is, the function that describes the system, which depends on the group of the $n$ parameters $x_1, x_2, \ldots, x_n$, called the control variables, and the group of the $m$ parameters $y_1, y_2, \ldots, y_m$, that are not under control, and may be internal or external parameters of the system. The stochastic optimization algorithm we used for the maximization of the above cost function is called ALOPEX (ALgorithm Of Pattern EXtraction), and was originally devised in [1,2] for the purpose of experimentally determining receptive fields of individual neurons in the visual pathway, and later applied successfully to experiments in neurophysiology [3,14-17].

The ALOPEX process operates as follows:

• The procedure is iterative. In every iteration all variables that determine the cost function are changed simultaneously by small increments, and the cost function is computed.
• The changes in the variables depend stochastically on the change of the cost function and the change in that variable over the preceding two iterations.
• All increments are retained from one iteration to the next. Since the changes are cumulative, the value of each variable reflects at all times the dependence of the cost function on changes in that variable over all past iterations.
• The process is guided by two free parameters, which are the step size of the increments and the degree of stochasticity (that is, the amplitude of noise).

This process is applicable in a variety of real-time systems, since

• no knowledge of the dynamics of the system is required;
• the functional dependence of the cost function on the control variables may be linear, nonlinear, or unknown;
• the number of the control variables may be large, as they correspond to the number of the 'segments' of the segmented or deformable mirror;
• local extrema are effectively avoided by the introduction of noise.

After a brief overview, in Section 2, of the theoretical background together with the main set of equations used in our telescope simulation, the main results pertaining to the optimal determination of the free parameters and convergence analysis of the ALOPEX algorithm are included in Section 3. The simulated experiments included in Section 4 deal with the following aspects:

• Convergence time and dependence of the algorithm on the
  - choice of the cost function;
  - number of variables;
  - arithmetic value of the algorithm's free parameters.

• Comparison of the performance between the different versions of the ALOPEX stochastic optimization algorithm.
• Introduction of a procedure able to adjust automatically one of the free parameters to its theoretically optimal value.
• The equations specifying the algorithm have been presented in a number of essentially equivalent forms. One of them is modified in order to reduce the sensitivity observed, as far as the value of the free parameters is concerned.

Based on our theoretical and numerical analysis we concluded that the observed fast convergence properties of the ALOPEX algorithm, as well as the potential for effective implementation by analog networks, make it a very attractive tool for driving AO systems.

2. Background

One of the basic physical quantities needed in our analysis is that of the Irradiance $I$. For a definition, consider the image plane $(x, y)$ in which an irradiance distribution $I(x, y)$ is observed due to focused light from a distant object of small angular extent. The focusing is provided by a telescope objective placed at a distance $F$ away. Let $(u, v)$ be the coordinates of the aperture (objective) plane, and consider the objective and image planes both to be perpendicular to the line that connects the center of the objective and the center of the light distribution of the distant object. The irradiance $I(x, y)$ is then given by a modified Fresnel-Kirchhoff integral over the surface of the telescope objective,

$$I(x, y) = \int d\theta\, d\phi\; Q(\theta, \phi) \left| \int du\, dv\; e^{ik[\delta(u,v) + (ux + vy)/F + (u\theta + v\phi)]} \right|^2,$$

where we have neglected the angular obliquity factors (see [5,9]). The light is taken to be monochromatic, with a wave number $k = 2\pi/\lambda$ ($\lambda$ is the wavelength of the light). The function $Q(\theta, \phi)$ denotes the irradiance from angles $(\theta, \phi)$ falling on the telescope objective. It is assumed that the object under observation is radiating incoherently. The real function $\delta(u, v)$ includes both the effects of the disturbing atmosphere and of the correcting optical elements. The fact that the atmosphere modulates the amplitude, as well as the phase of the incoming wave, has been ignored.

If the object being viewed by a ground-based telescope is of sufficiently small angular extent that light from all parts of the object incident on the same point $(u, v)$ has passed through essentially the same perturbing atmosphere, then we have aperture-plane distortion. Such an object is said to lie within a single isoplanatic patch. In such a case we can assume that the atmospheric distortion can be written in the form $\delta(u, v)$ instead of the more general form $\delta(u, v, \phi, \theta)$.

The role of the cost function in our simulations is given to the Sharpness function $S$. This function can be defined in such a way that under certain conditions the value of the sharpness for an atmospherically degraded image is always less than that of the true (undistorted) image. In other words, any $\delta(u, v)$ other than simple translation of the image ($\delta(u, v) = a + bu + cv$, where $a$, $b$, $c$ are constants) will reduce the value of $S$.

One such sharpness definition is (see [9])

$$S = \int dx\, dy\; I^2(x, y), \qquad (1)$$

while in the same reference the interested reader would find other equivalent definitions. All these forms of the image sharpness function reach maxima when the image distortion (excluding simple translation) has been removed, that is, when $\delta(u, v) = 0$.

To simplify our investigations, we considered a point star and a wavefront along the x-axis. We refer to this as the one-dimensional case, or 1-D. Evidently, in such a case the expression for the irradiance reduces to

$$I(x) = \left| \int_{-L}^{L} du\; e^{ik[\delta(u) + (u/F)x]} \right|^2,$$

where $2L$ is the length of the aperture.

Without any loss of generality, we set $L = F = 1$ to get the normalized version of the above expression, that is,

$$I(x) = \left| \int_{-1}^{1} du\; e^{ik(\delta(u) + ux)} \right|^2. \qquad (2)$$

In the case of perfect adjustment, that is, when $\delta(u) = 0$, the function $I(x)$ becomes

$$M(x) \equiv I_{\max}(x) = \left| \int_{-1}^{1} du\; e^{ikux} \right|^2 = \left| \int_{-1}^{1} du\; e^{i 2\pi (x/\lambda) u} \right|^2, \qquad (3)$$

which takes its maximum at $x = 0$,

$$M(0) = \left| \int_{-1}^{1} du\; e^{0} \right|^2 = \left| \int_{-1}^{1} du \right|^2 = 4. \qquad (4)$$

Fig. 1. $M(x/\lambda)$ versus $x/\lambda$.

In telescope system simulations, a random phase variation of the incident wavefront is introduced at the telescope aperture to simulate the atmospheric distortion. The aperture of the telescope is considered to be broken into $n$ separate movable segments. Image quality is defined by the narrowness of the point spread function of the image or, alternatively, by the Strehl Ratio SR. This is defined as the ratio between the image intensity at the center and the theoretical image intensity peak of the diffraction limited image. In the absence of any distortion, the above ratio becomes

$$SR = \frac{I(x_{peak})}{M(0)} = \frac{I(x_{peak})}{4}. \qquad (5)$$

A plot of the normalized undistorted irradiance function $M(x/\lambda)$ is shown in Fig. 1.

Having introduced the necessary physical quantities for the simulation of the system, we now describe the optimization algorithm we used for the problem of the image restoration. This is the ALOPEX stochastic iterative optimization algorithm, whose basic forms are listed below:

• ALOPEX I is a widely used version of the algorithm [6-8] by which the cost function is maximized by the following rule: Let $x_i^{(k)}$ be the $i$th control variable in the $k$th iteration, and let $f_k = f_k(x_0^{(k)}, x_1^{(k)}, \ldots, x_N^{(k)})$ be the value of the cost function in the $k$th iteration. Each control variable updates its value according to the rule

$$x_i^{(k+1)} = x_i^{(k)} + \delta_i^{(k)},$$

where

$$\delta_i^{(k)} = \begin{cases} \delta, & \text{with probability } p_i^{(k)}, \\ -\delta, & \text{with probability } 1 - p_i^{(k)}, \end{cases}$$

and

$$p_i^{(k)} = \left[ 1 + \exp\left( -\frac{\Delta_i^{(k)}}{T} \right) \right]^{-1},$$

where $\Delta_i^{(k)} = (x_i^{(k)} - x_i^{(k-1)})(f_k - f_{k-1})$ and $T$ is a characteristic quantity of the system, called the effective temperature of the system.

• ALOPEX II is another form of the algorithm, expressed as (see [6,7])

$$x_i^{(k+1)} = x_i^{(k)} + c\, \Delta x_i^{(k)}\, \Delta f_k + g_i^{(k)} \qquad (6)$$

(where $\Delta x_i^{(k)} = x_i^{(k)} - x_i^{(k-1)}$, $\Delta f_k = f_k - f_{k-1}$).

The terms $g_i$ are essential ingredients in the process because they provide the agitation necessary to drive the process, as well as to overcome local extrema. The dynamics of the process depends strongly on the amplitude and the frequency spectrum of the $g_i$. For example, by increasing the amplitude of the $g_i$, faster convergence may be achieved, at the expense of large fluctuations in the cost function.
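The two update rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' FORTRAN 77 program: the function names, the uniform distribution chosen for the noise terms $g_i$, and the clamping of the exponent (added to avoid floating-point overflow) are all assumptions made for this example.

```python
import math
import random


def alopex1_step(x, x_prev, f, f_prev, delta, T, rng):
    """One ALOPEX I update: every variable moves by +delta or -delta."""
    df = f - f_prev
    x_new = []
    for xi, xpi in zip(x, x_prev):
        corr = (xi - xpi) * df                  # Delta_i^(k) of the text
        z = max(-50.0, min(50.0, corr / T))     # clamp: an assumption, avoids overflow
        p = 1.0 / (1.0 + math.exp(-z))          # probability of taking the +delta step
        step = delta if rng.random() < p else -delta
        x_new.append(xi + step)
    return x_new


def alopex2_step(x, x_prev, f, f_prev, c, noise_amp, rng):
    """One ALOPEX II update, Eq. (6).

    The noise g_i is drawn uniformly from [-noise_amp, noise_amp];
    the source does not fix the noise distribution, so this is an assumption.
    """
    df = f - f_prev
    return [
        xi + c * (xi - xpi) * df + rng.uniform(-noise_amp, noise_amp)
        for xi, xpi in zip(x, x_prev)
    ]
```

Both functions need the variable values and cost values of the last two iterations, which mirrors the statement that the changes depend on the preceding two iterations. Passing `noise_amp=0.0` gives the noiseless form of Eq. (6).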

3. Feedback optimization

For a vectorization of our components, let $f$ be a function of $n$ variables and let $x$ denote the $n$-dimensional vector in $\mathbb{R}^n$, that is,

$$x = [x_0, x_1, \ldots, x_N]^T, \qquad n = N + 1,$$

and $f(x) = f(x_0, x_1, \ldots, x_N)$. Assuming that $f(x)$ is a twice differentiable function in $\mathbb{R}^n$ (i.e. $f \in C^2(\mathbb{R}^n)$), its second order Taylor expansion is given by

$$f(x + h) = f(x) + G^T h + \tfrac{1}{2} h^T H h, \qquad (7)$$

where $G$ is the gradient vector of $f$ at $x$ and $H$ is the Hessian matrix of $f$ at $x$, defined respectively as

$$G = \left[ \frac{\partial f}{\partial x_0}, \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_N} \right]^T, \qquad H = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x_0^2} & \cdots & \dfrac{\partial^2 f}{\partial x_0\, \partial x_N} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_N\, \partial x_0} & \cdots & \dfrac{\partial^2 f}{\partial x_N^2} \end{bmatrix}.$$

The idea to accelerate the tracking speed of the ALOPEX algorithm was

• to allow the free parameter $c$ in (6) to get different values in each iteration step, and therefore the ALOPEX II process will take the form

$$x_{k+1} = x_k + c_k\, \Delta x_k\, \Delta f_k + g_k; \qquad (8)$$

• to determine the free parameter $c_k$ of (8) in such a way that, in the absence of noise, the process will converge as fast as possible to a maximizer.

This will reduce the time the algorithm needs at the beginning of the process to locate a local optimum of the cost function. Then the appropriate noise will be able to drive the process to a global optimum. Working towards this direction, we first prove the following lemma:

Lemma 1. Let $f(x) \in C^2(\mathbb{R}^n)$. If the Hessian matrix of $f$ at $x$ is negative-definite, i.e., if

$$z^T H z < 0 \quad \forall z \in \mathbb{R}^n,\ z \neq 0, \qquad (9)$$

then the value

$$\hat{c}_k = - \frac{G_k^T\, \Delta x_k}{\Delta x_k^T H\, \Delta x_k\; \Delta f_k} \qquad (10)$$

maximizes the value of the function $f$ at

$$x_{k+1} = x_{k+1}(c_k) = x_k + c_k\, \Delta x_k\, \Delta f_k, \qquad (11)$$

where $x_k = [x_0^{(k)}, x_1^{(k)}, \ldots, x_N^{(k)}]^T$.

Proof. By setting

$$h_k = \Delta f_k\, \Delta x_k, \qquad (12)$$

the relation (11) becomes

$$x_{k+1} = x_k + c_k h_k, \qquad (13)$$

where the constant $c_k$ is to be chosen to maximize $f(x_{k+1})$ on some interval of the line that passes through $x_k$ in the direction of $h_k$. Upon combination of Eq. (7) and (13), we obtain

$$W(c_k) \equiv f(x_{k+1}) - f(x_k) = c_k\, G_k^T h_k + \tfrac{1}{2} c_k^2\, h_k^T H h_k. \qquad (14)$$

The extremum of the function $W(c_k)$ is found, by asking $dW/dc_k = 0$, to be

$$\hat{c}_k = - \frac{G_k^T h_k}{h_k^T H h_k}$$

or, equivalently,

$$\hat{c}_k = - \frac{G_k^T\, \Delta x_k}{\Delta x_k^T H\, \Delta x_k\; \Delta f_k}. \qquad (15)$$

It is therefore evident that whenever $H$ is negative definite, the value of $\hat{c}_k$ in (10) is a maximizer, since

$$W(\hat{c}_k) = - \frac{(G_k^T\, \Delta x_k)^2}{2\, \Delta x_k^T H\, \Delta x_k} > 0. \qquad (16)$$

This completes the proof.
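Lemma 1 can be checked numerically on a small example. The sketch below assumes a hypothetical concave quadratic cost $f(x) = b^T x + \tfrac{1}{2} x^T H x$ with a hand-picked negative-definite $H$; for such a cost the Taylor expansion (7) is exact, so the gain of the noiseless step with $c_k = \hat{c}_k$ should match (16). All names and numbers are illustrative and are not taken from the paper.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def matvec(M, v):
    return [dot(row, v) for row in M]


# Hypothetical test problem: H is negative definite (trace < 0, determinant > 0).
H = [[-2.0, 0.5], [0.5, -1.0]]
b = [1.0, -1.0]


def f(x):
    return dot(b, x) + 0.5 * dot(x, matvec(H, x))


def gradient(x):
    return [bi + hi for bi, hi in zip(b, matvec(H, x))]


def optimal_feedback(x, x_prev):
    """Return c_hat of Eq. (10) together with Delta x_k and Delta f_k."""
    dx = [xi - xpi for xi, xpi in zip(x, x_prev)]
    df = f(x) - f(x_prev)
    c_hat = -dot(gradient(x), dx) / (dot(dx, matvec(H, dx)) * df)
    return c_hat, dx, df


def noiseless_step(x, dx, df, c):
    """Eq. (11): x_{k+1} = x_k + c * Delta x_k * Delta f_k."""
    return [xi + c * df * dxi for xi, dxi in zip(x, dx)]


x_prev, x = [0.0, 0.0], [0.2, -0.1]
c_hat, dx, df = optimal_feedback(x, x_prev)
gain = f(noiseless_step(x, dx, df, c_hat)) - f(x)
# Right-hand side of Eq. (16).
predicted = -(dot(gradient(x), dx) ** 2) / (2.0 * dot(dx, matvec(H, dx)))
print(c_hat, gain, predicted)
```

Scaling `c_hat` up or down by a small factor and repeating the step gives a smaller gain, which is what the lemma states for a negative-definite Hessian.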

In what follows we prove convergence of the ALOPEX algorithm, in the absence of noise, by making convenient use of the energy norm.

Definition 1. The energy inner product and the energy norm corresponding to a negative-definite matrix $H$ are defined as follows:

$$(x, y)_H = -x^T H y \qquad (17)$$

and

$$\| x \|_H = (x, x)_H^{1/2} = (-x^T H x)^{1/2},$$

respectively.

Definition 2. The quantity

$$E(x) = \| x \|_H^2 = -x^T H x \qquad (18)$$

is defined as the Energy of the system.

Proposition 1. Consider the ALOPEX II process (8) in the absence of noise, that is

$$x_{k+1} = x_k + c_k\, \Delta x_k\, \Delta f_k. \qquad (19)$$

Under the hypothesis that the Hessian matrix $H$ is negative-definite, if we set

$$c_k = \hat{c}_k, \qquad (20)$$

where $\hat{c}_k$ is as defined in Lemma 1, the process in (19) converges in the sense that

$$\| x_{k+1} - x_{opt} \|_H < \| x_k - x_{opt} \|_H, \qquad (21)$$

where $x_{opt}$ is a maximizer of the function $f(x)$.

Proof. Recalling the Taylor expansion in (7) and observing that the gradient vector $G$ vanishes at $x_{opt}$, it is evident that in its neighbourhood there holds

$$f(x) - f(x_{opt}) = \tfrac{1}{2} (x - x_{opt})^T H (x - x_{opt}), \qquad (22)$$

and for the energy $E$ the relation

$$E(x - x_{opt}) = -2 \left[ f(x) - f(x_{opt}) \right]. \qquad (23)$$

Combination of the above equations and Eq. (16) yields

$$E(x_k - x_{opt}) > E(x_{k+1} - x_{opt}), \qquad (24)$$

which in turn leads to the relation

$$\| x_{k+1} - x_{opt} \|_H < \| x_k - x_{opt} \|_H, \qquad (25)$$

completing the proof.

3.1. Practical optimal feedback: ALOPEX III

In its technological implementation, one of the most attractive features of the ALOPEX algorithm is that no knowledge of the functional dependence of $f$ on the $n$ control variables is required. This makes it impossible to evaluate the components of the gradient vector of $f$, or its Hessian matrix, unless sophisticated and time-consuming numerical methods are used. Thus it still remains a challenging problem to find an approximation of the optimal value $\hat{c}_k$ of (10). In an attempt to resolve this problem, we devised the following approximation process, which implements the idea of second order interpolation to the cost function $f(x)$:

Let $c_0$, $c_1$ and $c_2$ be three trial values of the parameter $c$. If $x_{k,j}$, $j = 0, 1, 2$, denote the corresponding values of the vector $x_k$, that is,

$$x_{k,j} = x_k + c_j\, \Delta f_k\, \Delta x_k, \qquad j = 0, 1, 2,$$

then the quadratic interpolating polynomial $P_2^{(k)}(c)$ of the cost function $f(x_k, c)$ is given by

$$P_2^{(k)}(c) = \sum_{j=0}^{2} f(x_{k,j})\, \ell_{2,j}(c),$$

where

$$\ell_{2,j}(c) = \prod_{\substack{i=0 \\ i \neq j}}^{2} \frac{c - c_i}{c_j - c_i},$$

and has an optimum value at

$$c = c_3 = \frac{\displaystyle \sum_{j=0}^{2} \frac{\zeta_{2,j}}{\zeta_{1,j}}\, f(x_{k,j})}{\displaystyle 2 \sum_{j=0}^{2} \frac{1}{\zeta_{1,j}}\, f(x_{k,j})}, \qquad (26)$$

with the constants $\zeta_{1,j}$ and $\zeta_{2,j}$ being defined as

$$\zeta_{1,j} = \prod_{\substack{i=0 \\ i \neq j}}^{2} (c_j - c_i) \qquad \text{and} \qquad \zeta_{2,j} = \sum_{\substack{i=0 \\ i \neq j}}^{2} c_i.$$
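The stationary point of the quadratic interpolant can be computed directly from its Lagrange form: setting the derivative of $\sum_j f(x_{k,j})\, \ell_{2,j}(c)$ to zero gives a ratio of two weighted sums of the three trial cost values. The following sketch illustrates this; the function name and the test polynomial are hypothetical and serve only as an illustration.

```python
def interpolated_optimum(c, fvals):
    """Stationary point of the quadratic through (c[j], fvals[j]), j = 0, 1, 2.

    c     -- three distinct trial values of the feedback parameter
    fvals -- cost function values measured at those trial values
    Raises ZeroDivisionError when the three points are collinear.
    """
    num = 0.0
    den = 0.0
    for j in range(3):
        others = [c[i] for i in range(3) if i != j]
        zeta1 = (c[j] - others[0]) * (c[j] - others[1])  # product of (c_j - c_i), i != j
        zeta2 = others[0] + others[1]                    # sum of c_i, i != j
        num += fvals[j] * zeta2 / zeta1
        den += fvals[j] / zeta1                          # den is the leading coefficient
    return num / (2.0 * den)


# Hypothetical check: q(c) = -(c - 3)^2 + 5 has its maximum at c = 3.
trial = [0.0, 0.5, 1.0]
values = [-(t - 3.0) ** 2 + 5.0 for t in trial]
print(interpolated_optimum(trial, values))
```

The denominator sum is the leading coefficient of the interpolant, so the returned point is a maximum only when that sum is negative; a practical implementation would probably want to test this sign before accepting the value.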

As will be verified in the section that follows, the use of

$$\hat{c}_k \approx c_3$$

results in a rapidly convergent ALOPEX algorithm. The computational cost for such an improvement is the increase in the number of function evaluations.

4. Simulations and results

4.1. Scope of the tests

In this section, we discuss the numerical results obtained by applying the three ALOPEX stochastic optimization methods to a simulated, one-dimensional telescope system.

This system is described by the number $n$ of movable segments of the mirror, by the exact position of these segments and finally, by the phases $\delta_i$ of the distorted wavefront. Our aim is to compare the ALOPEX optimization methods, to test their ability using several cost functions from those listed in Table 2 and finally, to examine the behavior of the optimization algorithms in terms of the number of segments $n$.

The ALOPEX versions under comparison are listed in Table 1. The optimal value $c = 8.1$ for ALOPEX II is found by a "trial and error" numerical procedure.

For the case of ALOPEX III, we must keep in mind that the computational effort here is quadrupled in every iteration, and this is because of the function evaluations needed for the determination of the appropriate value of constant $\hat{c}_k$.

For the numerical tests, we chose $c_0 = 0$ together with two fixed nonzero trial values $c_1$ and $c_2$, so that

$$\hat{c}_k \approx c_3 = \frac{1}{2} \left( c_1 - \frac{w_0}{w_2} \right),$$

where

$$w_0 = \frac{f(c_1) - f(c_0)}{c_1}, \qquad w_1 = \frac{f(c_2) - f(c_1)}{c_2 - c_1} \qquad \text{and} \qquad w_2 = \frac{w_1 - w_0}{c_2}.$$

For the requirements of the problem, eight cost functions are reduced to their one-dimensional expression, as shown in Table 2.

The algorithms described have been implemented in a computer program written in FORTRAN 77, and the numerical results were obtained using double-precision arithmetic.

Table 1
ALOPEX versions under comparison

ALOPEX I: $x_i^{(k+1)} = x_i^{(k)} + \delta_i^{(k)}$, where $\delta_i^{(k)} = \delta$ with probability $p_i^{(k)}$ and $\delta_i^{(k)} = -\delta$ with probability $1 - p_i^{(k)}$, and $p_i^{(k)} = \left[ 1 + \exp\left( -\Delta_i^{(k)}/T \right) \right]^{-1}$.

ALOPEX II: $x_i^{(k+1)} = x_i^{(k)} + c\, \Delta x_i^{(k)}\, \Delta f_k + g_i^{(k)}$, where $c = 8.1$.

ALOPEX III: $x_i^{(k+1)} = x_i^{(k)} + c_k\, \Delta x_i^{(k)}\, \Delta f_k + g_i^{(k)}$, where $c_k = c_3$ as in (26).

ALOPEX IV: $x_i^{(k+1)} = x_i^{(k)} + c_k\, \Delta x_i^{(k)}\, \Delta f_k + g_i^{(k)} / \sqrt{\,\cdots\,}$, where $c_k$ as in (10).

Table 2
Eight cost functions reduced to their one-dimensional expression

| Cost function | Definition |
|---|---|
| $S_0$ | $\int_{-1}^{1} d(x/\lambda)\, (I M)^2$ |
| $S_1$ | $\int_{-1}^{1} d(x/\lambda)\, I^{1/4}$ |
| $S_2$ | $\int_{-1}^{1} d(x/\lambda)\, I^2$ |
| $S_3$ | $\int_{-1}^{1} d(x/\lambda)\, I^3$ |
| $S_4$ | $\int_{-1}^{1} d(x/\lambda)\, I^4$ |
| $S_5$ | $I(x_0 = 0.0)$ |
| $S_6$ | $-\int_{-1}^{1} d(x/\lambda)\, \lvert I - M \rvert$ |
| $S_7$ | $-\int_{-1}^{1} d(x/\lambda)\, \lvert I - M \rvert^2$ |

Fig. 2. Initial distorted image on the telescope objective.
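The divided-difference form of $c_3$ used for the numerical tests in Section 4.1 can be sketched as follows. The trial values `c1=0.5` and `c2=1.0`, the toy cost function, and all function names are assumptions made for this illustration; the step shown is the noiseless one, i.e. Eq. (19) with $c_k = c_3$.

```python
def c3_divided_differences(f0, f1, f2, c1, c2):
    """Stationary point of the quadratic through (0, f0), (c1, f1), (c2, f2)."""
    w0 = (f1 - f0) / c1
    w1 = (f2 - f1) / (c2 - c1)
    w2 = (w1 - w0) / c2          # second divided difference, valid because c0 = 0
    return 0.5 * (c1 - w0 / w2)


def alopex3_noiseless_step(cost, x, x_prev, c1=0.5, c2=1.0):
    """One noiseless ALOPEX III step; c1 and c2 are illustrative trial values."""
    f_x = cost(x)
    df = f_x - cost(x_prev)
    h = [df * (xi - xpi) for xi, xpi in zip(x, x_prev)]   # h_k of Eq. (12)

    def trial(c):
        return cost([xi + c * hi for xi, hi in zip(x, h)])

    # With c0 = 0 the first trial point is x itself, so f_x is reused.
    c3 = c3_divided_differences(f_x, trial(c1), trial(c2), c1, c2)
    return [xi + c3 * hi for xi, hi in zip(x, h)], c3


def toy_cost(x):
    # Hypothetical concave cost with its maximum at x = (1, 1).
    return -sum((xi - 1.0) * (xi - 1.0) for xi in x)


x_new, c3 = alopex3_noiseless_step(toy_cost, [0.1, 0.2], [0.0, 0.0])
print(c3, x_new, toy_cost(x_new))
```

Because the toy cost is exactly quadratic along the search line, the interpolated value coincides with the exact line maximizer here; for a general cost function it is only an approximation of $\hat{c}_k$.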

Table 3
Comparison between SR of the different ALOPEX versions

| Iteration | ALOPEX I | ALOPEX II (c = 8.1) | ALOPEX III |
|---|---|---|---|
| 1 | 0.5303 | 0.5303 | 0.5303 |
| 2 | 0.5565 | 0.6506 | 0.7361 |
| 3 | 0.5801 | 0.5084 | 0.4024 |
| 4 | 0.5668 | 0.5577 | 0.5742 |
| 5 | 0.5871 | 0.5560 | 0.5579 |
| 6 | 0.5922 | 0.5384 | 0.8631 |
| 7 | 0.5772 | 0.5738 | 0.9649 |
| 8 | 0.5977 | 0.6209 | 0.9647 |
| 9 | 0.6032 | 0.6842 | 0.9617 |
| 10 | 0.6114 | 0.8039 | 0.9633 |
| 11 | 0.6119 | 0.7567 | 0.9620 |
| 12 | 0.6191 | 0.6997 | 0.9603 |
| 13 | 0.6267 | 0.3628 | 0.9628 |
| 14 | 0.6189 | 0.6274 | 0.9638 |
| 15 | 0.6267 | 0.5992 | 0.9607 |
| 16 | 0.6351 | 0.6316 | 0.9610 |
| 17 | 0.6515 | 0.6752 | 0.9611 |
| 18 | 0.6663 | 0.7238 | 0.9586 |
| 19 | 0.6682 | 0.7073 | 0.9592 |
| 20 | 0.6627 | 0.7284 | 0.9607 |
| 21 | 0.6487 | 0.7359 | 0.9592 |
| 22 | 0.6522 | 0.8854 | 0.9591 |
| 23 | 0.6566 | 0.9219 | 0.9613 |
| 24 | 0.6479 | 0.9426 | 0.9603 |
| 25 | 0.6569 | 0.9466 | 0.9600 |

For the numerical evaluation of the involved integrals, the 8-point Gauss-Legendre formula was utilized.

4.2. Exploration of ALOPEX algorithms

For a comparison of the computational efficiency of the algorithms under study, all simulations started with the same distorted wavefront. For this, we introduce a constant phase distortion (see Fig. 2), and an initial position of the control variables independent of the algorithm used.

In this section the cost function used is $S_1$ from Table 2. The results are displayed in Table 3, where we include the computational performance of the three algorithms for 25 iterations. As a satisfactory convergence criterion of the algorithms we considered the consistent production of results satisfying SR > 0.8. By inspection of Table 3 and Figs. 3-5, it becomes evident that ALOPEX III converges much faster than ALOPEX I and II. If, on top of this, we also consider the extreme sensitivity of ALOPEX II to the choice of the value of variable $c$, it becomes apparent that the proposed ALOPEX III version of the algorithm performs considerably better than the previous versions of ALOPEX I and II. All this, of course, comes at an extra cost of three more function evaluations per iteration.

To complete the picture drawn from Table 3, we include Figs. 3, 4 and 5 where the long time (200 iterations) behaviour of the three algorithms ALOPEX I, II and III is shown, respectively. Inspecting Fig. 5 it becomes clear that ALOPEX III rapidly locates and stays in the neighborhood of the optimal value of the cost function (SR = 1).

Fig. 3. Optimization performance (SR versus iteration) for a typical run of 200 iterations, using ALOPEX I.

Fig. 4. Optimization performance (SR versus iteration) for a typical run of 200 iterations, ALOPEX II algorithm.
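The convergence criterion of Section 4.2 can be applied mechanically to the SR values of Table 3. The sketch below reads "consistent production of results satisfying SR > 0.8" as "every later entry of the 25-iteration record also exceeds 0.8"; that reading, and the function name, are assumptions made for this example.

```python
def first_consistent_iteration(sr, threshold=0.8):
    """1-based iteration from which every later SR in the record exceeds threshold.

    Returns None when the record does not end in such a run.
    """
    first = None
    for k, value in enumerate(sr, start=1):
        if value > threshold:
            if first is None:
                first = k
        else:
            first = None   # the run was broken, start looking again
    return first


# SR values copied from Table 3 (iterations 1 to 25).
sr_alopex1 = [0.5303, 0.5565, 0.5801, 0.5668, 0.5871, 0.5922, 0.5772, 0.5977,
              0.6032, 0.6114, 0.6119, 0.6191, 0.6267, 0.6189, 0.6267, 0.6351,
              0.6515, 0.6663, 0.6682, 0.6627, 0.6487, 0.6522, 0.6566, 0.6479,
              0.6569]
sr_alopex2 = [0.5303, 0.6506, 0.5084, 0.5577, 0.5560, 0.5384, 0.5738, 0.6209,
              0.6842, 0.8039, 0.7567, 0.6997, 0.3628, 0.6274, 0.5992, 0.6316,
              0.6752, 0.7238, 0.7073, 0.7284, 0.7359, 0.8854, 0.9219, 0.9426,
              0.9466]
sr_alopex3 = [0.5303, 0.7361, 0.4024, 0.5742, 0.5579, 0.8631, 0.9649, 0.9647,
              0.9617, 0.9633, 0.9620, 0.9603, 0.9628, 0.9638, 0.9607, 0.9610,
              0.9611, 0.9586, 0.9592, 0.9607, 0.9592, 0.9591, 0.9613, 0.9603,
              0.9600]

for name, sr in [("ALOPEX I", sr_alopex1), ("ALOPEX II", sr_alopex2),
                 ("ALOPEX III", sr_alopex3)]:
    print(name, first_consistent_iteration(sr))
```

Under this reading ALOPEX III meets the criterion from iteration 6, ALOPEX II from iteration 22 (its single crossing at iteration 10 is not sustained), and ALOPEX I does not meet it within 25 iterations.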

Fig. 5. Optimization performance (SR versus iteration) for a typical run of 200 iterations, ALOPEX III algorithm.

Fig. 6. Restored image after 25 iterations, ALOPEX III algorithm.

Fig. 7. Optimization performance (SR versus iteration) for a typical run of 200 iterations, ALOPEX IV algorithm.

A visual perspective of the results achieved by ALOPEX III is included in Fig. 6. For a comparison, recall that in Fig. 1 we include the optimal undistorted image while in Fig. 2 we include the initially distorted image.

For completeness purposes we also include Table 4 and Fig. 7 which demonstrate the results obtained from the numerical evaluation for the computational performance of the ALOPEX IV algorithm. As for this case one has to evaluate the associated gradient vector and the Hessian matrix, we used function $S_5$ (Table 2), instead of $S_1$, as the cost function of the system in order to keep the complexity at reasonable levels. Observe that ALOPEX IV smoothly and quickly locates and stays near the optimizer (Fig. 7) while the values of $\hat{c}_k$ differ a lot from the corresponding values of $c_3$ used in ALOPEX III (Table 4).

4.3. Exploration of cost functions

While Table 3 summarizes the rapidity of all ALOPEX versions, it still does not give the complete picture of their ability. In order to further analyze the performance of the three algorithms in connection with the eight different cost functions, we include two additional tests: the number of iterations and the number of equivalent function evaluations needed to achieve a certain Sharpness of the image. In the case of the ALOPEX III algorithm, the latter, more than the former, is a measure of the total computational effort involved. It is defined as the number of iterations augmented by the number of test-function evaluations during each iteration.

For the study of each algorithm, the problem of the image restoration has been solved for 35 different combinations of the system's noise terms (ALOPEX II, III) or values of the random parameters (ALOPEX I). The results presented have a statistical meaning, as $m$ is the mean number of iterations required for convergence, $m_f$ is the mean number of function calls required for convergence when the ALOPEX III algorithm is used and finally

$\sigma$ is the standard deviation of the number of iterations.

In the tables that follow, the index of $S$ denotes the cost function used.

In each optimization cycle, we use the same initial phase distortion, and the same initial positions of the control variables, which are also independent of the cost function and the version of the algorithm used.

Table 4
Performance of the ALOPEX III and ALOPEX IV algorithms

| Iteration | ALOPEX III SR | ALOPEX III $c_3$ | ALOPEX IV SR | ALOPEX IV $\hat{c}_k$ |
|---|---|---|---|---|
| 1 | 0.5303 | 30.8101 | 0.5220 | 9.1325 |
| 2 | 0.7361 | 2.3179 | 0.6461 | 1.5002 |
| 3 | 0.4024 | 0.8849 | 0.6949 | 2.6986 |
| 4 | 0.5742 | 0.0000 | 0.7171 | 4.4891 |
| 5 | 0.5579 | 537.6612 | 0.7125 | 139.8572 |
| 6 | 0.8631 | 2.6446 | 0.7128 | -270.9617 |
| 7 | 0.9649 | 1.3229 | 0.7246 | 21.8133 |
| 8 | 0.9647 | 0.0000 | 0.7238 | 125.0609 |
| 9 | 0.9617 | 0.0000 | 0.7167 | 45.6181 |
| 10 | 0.9633 | 0.0000 | 0.7306 | 5.1740 |
| 11 | 0.9620 | 0.0000 | 0.7354 | 209.5636 |
| 12 | 0.9603 | 0.0000 | 0.7489 | 27.0588 |
| 13 | 0.9628 | 0.0000 | 0.7621 | 3.5077 |
| 14 | 0.9638 | 883.0985 | 0.7654 | 228.8462 |
| 15 | 0.9607 | 0.0000 | 0.7770 | 69.8030 |
| 16 | 0.9610 | 0.0000 | 0.8067 | 4.7094 |
| 17 | 0.9611 | 405.9387 | 0.8117 | 5.1054 |
| 18 | 0.9586 | 0.0000 | 0.8164 | 145.9551 |
| 19 | 0.9592 | 0.0000 | 0.8224 | 35.2519 |
| 20 | 0.9607 | 0.0000 | 0.8253 | 178.1935 |
| 21 | 0.9592 | 0.0000 | 0.8264 | -52.7298 |
| 22 | 0.9591 | 0.0000 | 0.8233 | 159.4618 |
| 23 | 0.9613 | 0.0000 | 0.8300 | 96.9611 |
| 24 | 0.9603 | 0.0000 | 0.8376 | -0.3082 |
| 25 | 0.9600 | 0.0000 | 0.8380 | 839.6131 |

Table 5
ALOPEX I convergence time when different cost functions are used

| S | m | σ |
|---|---|---|
| $S_0$ | 78.28 | 30.08 |
| $S_1$ | 81.54 | 30.28 |
| $S_2$ | 78.29 | 33.85 |
| $S_3$ | 81.57 | 29.94 |
| $S_4$ | 78.14 | 33.16 |
| $S_5$ | 73.54 | 29.70 |
| $S_6$ | 88.26 | 40.66 |
| $S_7$ | 95.80 | 40.45 |

Table 6
ALOPEX II convergence time and constant c value when different cost functions are used

| S | c | m | σ |
|---|---|---|---|
| $S_0$ | 0.55 | 52.2 | 7.71 |
| $S_1$ | 8.10 | 24.65 | 8.65 |
| $S_2$ | 10.00 | 19.48 | 11.53 |
| $S_3$ | 2.00 | 30.09 | 13.33 |
| $S_4$ | 0.40 | 100.58 | 10.74 |
| $S_5$ | 10.00 | 105.34 | 93.97 |
| $S_6$ | 6.50 | 114.52 | 34.90 |
| $S_7$ | 0.05 | 158.24 | 55.32 |

The study of the behavior of the algorithms in terms of the change of the cost function has taken into account the convergence criterion of the previous section, that is, the relation $SR = I_{\max}(x_{peak}) / M(0) > 0.8$. The computational times of the following results have implications with regard to this stopping rule.

Tables 5-7 give the mean number of iterations after which the iterative procedure is reset to its


starting point, in terms of change of the cost function. The number of the control variables is kept constant and equal to 8.

Table 7
Convergence time of the ALOPEX III algorithm, when different cost functions are used

| S | $m_f$ | m | σ |
|---|---|---|---|
| $S_0$ | 28.057 | 7.5142 | 1.94 |
| $S_1$ | 20.057 | 5.5142 | 1.51 |
| $S_2$ | 26.685 | 7.1714 | 0.696 |
| $S_3$ | 27.029 | 7.2571 | 0.552 |
| $S_5$ | 32.971 | 8.7428 | 3.184 |
| $S_6$ | 58.914 | 15.2286 | 10.28 |
| $S_7$ | 55.829 | 14.4571 | 8.91 |

As expected, there is no essential difference between the convergence rates for the various cost functions. This is due to the fact that the functional dependence of $S$ on the control variables is of no importance during the optimization procedure.

However, ALOPEX II, due to its extreme sensitivity to the value of constant $c$, which has to be empirically computed for every change of the cost function, does not seem to obey the above conclusion (see Table 6). Moreover, additional computational effort is needed, in comparison to that of ALOPEX III, in spite of the latter algorithm's extra function evaluations.

4.4. Convergence speed versus number of variables

In order to study the behavior of the algorithm when the number of the control variables (and consequently, of the number of movable segments of the aperture 'line') varies, let us first consider the following two relations describing the distortion and the initial position of the $k$th variable, respectively:

$$0 < \delta_k = 0.5 + 0.4 \cos\left( 12\pi\, \frac{k}{N} \right) < 1,$$

$$x_k = 0.$$

The positions of the movable segments of the aperture are now no longer determined by the 8-point Gauss-Legendre nodes, but are computed by the formula

$$y_i = -1.0 + 2.0\, \frac{i}{N},$$

so that $y_i \in [-1, 1]$, and $y_0 = -1$, $y_N = 1$, $\forall N$. The aperture is thus separated into $n = N + 1$ equally spaced segments, the positions of which are listed in Table 8, for the first ten cases of $4 \le N \le 13$. The 1-D expression for the irradiance $I(x)$ (formula (2)) is approximated by the sum

$$I(x) = \left[ \left( \sum_{i=0}^{N} \cos k(\delta_i - \Delta_i + y_i x) \right)^2 + \left( \sum_{i=0}^{N} \sin k(\delta_i - \Delta_i + y_i x) \right)^2 \right] \left( \frac{2}{N+1} \right)^2,$$

which, as an order $O(h^2)$ ($h = 2/(N+1)$), is a good approximation to $I(x)$, as $N$ becomes large.

Table 8
Positions $y_i$ of the aperture segments, on the x-axis

| i | N = 4 | N = 5 | N = 6 | N = 7 | N = 8 | N = 9 | N = 10 | N = 11 | N = 12 | N = 13 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -1.000 | -1.000 | -1.000 | -1.000 | -1.000 | -1.000 | -1.000 | -1.000 | -1.000 | -1.000 |
| 1 | -0.500 | -0.600 | -0.667 | -0.714 | -0.750 | -0.778 | -0.800 | -0.818 | -0.833 | -0.846 |
| 2 | 0.000 | -0.200 | -0.333 | -0.429 | -0.500 | -0.556 | -0.600 | -0.636 | -0.667 | -0.692 |
| 3 | 0.500 | 0.200 | 0.000 | -0.143 | -0.250 | -0.333 | -0.400 | -0.455 | -0.500 | -0.538 |
| 4 | 1.000 | 0.600 | 0.333 | 0.143 | 0.000 | -0.111 | -0.200 | -0.273 | -0.333 | -0.385 |
| 5 | | 1.000 | 0.667 | 0.429 | 0.250 | 0.111 | 0.000 | -0.091 | -0.167 | -0.231 |
| 6 | | | 1.000 | 0.714 | 0.500 | 0.333 | 0.200 | 0.091 | 0.000 | -0.077 |
| 7 | | | | 1.000 | 0.750 | 0.556 | 0.400 | 0.273 | 0.167 | 0.077 |
| 8 | | | | | 1.000 | 0.778 | 0.600 | 0.455 | 0.333 | 0.231 |
| 9 | | | | | | 1.000 | 0.800 | 0.636 | 0.500 | 0.385 |
| 10 | | | | | | | 1.000 | 0.818 | 0.667 | 0.538 |
| 11 | | | | | | | | 1.000 | 0.833 | 0.692 |
| 12 | | | | | | | | | 1.000 | 0.846 |
| 13 | | | | | | | | | | 1.000 |


Table 9
Dependence of ALOPEX I convergence time on the number of variables n

| n | m | σ | n | m | σ |
|---|---|---|---|---|---|
| 5 | 32.5161 | 16.2577 | 28 | 118.4194 | 95.2477 |
| 6 | 40.6774 | 19.0015 | 29 | 114.3548 | 70.1903 |
| 7 | 44.6452 | 26.7878 | 30 | 130.3548 | 75.3544 |
| 8 | 42.6129 | 20.0156 | 31 | 136.8387 | 88.0839 |
| 9 | 47.9677 | 26.6137 | 32 | 106.5806 | 48.7076 |
| 10 | 44.7419 | 18.7306 | 33 | 109.7419 | 96.9871 |
| 11 | 46.7097 | 27.7526 | 34 | 147.9677 | 101.8561 |
| 12 | 50.8064 | 34.2528 | 35 | 139.5483 | 81.7434 |
| 13 | 52.0645 | 22.9515 | 36 | 191.0645 | 117.3019 |
| 14 | 57.5161 | 35.9443 | 37 | 203.6774 | 178.3145 |
| 15 | 61.0645 | 34.0454 | 38 | 170.2258 | 169.0406 |
| 16 | 57.7097 | 33.6128 | 39 | 193.1290 | 175.2952 |
| 17 | 68.1290 | 51.5975 | 40 | 212.483 | 162.7984 |
| 18 | 73.8387 | 49.1417 | 41 | 266.4838 | 187.4427 |
| 19 | 65.1936 | 37.5151 | 42 | 161.5483 | 91.3327 |
| 20 | 73.6129 | 39.9667 | 43 | 241.2258 | 183.3557 |
| 21 | 93.8064 | 51.2838 | 44 | 235.9032 | 218.3257 |
| 22 | 66.9032 | 40.4542 | 45 | 237.0967 | 209.4099 |
| 23 | 73.3871 | 44.6208 | 46 | 256.2580 | 221.2480 |
| 24 | 87.3871 | 64.2443 | 47 | 272.1290 | 219.8911 |
| 25 | 91.8710 | 50.3714 | 48 | 273.0322 | 226.062 |
| 26 | 123.6129 | 67.9700 | 49 | 276.8064 | 215.7024 |
| 27 | 117.6452 | 86.3588 | 50 | 285.7742 | 255.3951 |

Table 10
Dependence of the ALOPEX III convergence time on the number of variables n

| n | m | σ | n | m | σ | n | m | σ |
|---|---|---|---|---|---|---|---|---|
| 5 | 9.0323 | 7.8842 | 17 | 29.1290 | 16.7923 | 29 | 34.2903 | 18.9688 |
| 6 | 14.1935 | 6.0664 | 18 | 34.0000 | 22~439 | 30 | 32.000 | 15.0654 |
| 7 | 15.3226 | 9.5896 | 19 | 29.2903 | 19.4856 | 35 | 45.7096 | 23.7231 |
| 8 | 21.7097 | 14.3576 | 20 | 32.0968 | 15.2450 | 40 | 46.4838 | 15.7375 |
| 9 | 23.5806 | 16.2357 | 21 | 27.8387 | 17.3021 | 45 | 50.7741 | 20.4603 |
| 10 | 22.5806 | 16.5408 | 22 | 30.4516 | 15.2819 | 50 | 51.6774 | 22.2832 |
| 11 | 20.1290 | 7.9828 | 23 | 34.8064 | 20.9060 | 55 | 52.3225 | 18.8207 |
| 12 | 27.2258 | 18.3720 | 24 | 37.5161 | 17.3890 | 65 | 52.6373 | 17.4744 |
| 13 | 25.6774 | 20.4078 | 25 | 33.1290 | 18.4124 | 75 | 53.3870 | 22.0522 |
| 14 | 22.2903 | 10.5102 | 26 | 37.2903 | 20.7367 | 85 | 63.0645 | 29.1259 |
| 15 | 24.6452 | 15.7717 | 27 | 31.4839 | 16.7464 | 95 | 61.5161 | 22.1925 |
| 16 | 31.0000 | 18.6582 | 28 | 34.4516 | 17.1818 | 105 | 69.6451 | 30.155 |
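The two least-squares fits discussed for the ALOPEX III data of Table 10, a straight line in n and a straight line in ln(n), can be sketched as follows. This is only an illustration under stated assumptions: the helper `least_squares` and the dictionary `TABLE_10_MEAN` are names invented for the example, the fit is an ordinary unweighted one on the tabulated means m, and the standard deviations σ are ignored. Because the tabulated means are rounded, the printed coefficients need not agree digit for digit with the published ones.

```python
import math


def least_squares(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b, r).

    r is the Pearson correlation coefficient between xs and ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r


# Mean number of iterations m for ALOPEX III, copied from Table 10
# (hypothetical variable name; keys are n, values are m).
TABLE_10_MEAN = {
    5: 9.0323, 6: 14.1935, 7: 15.3226, 8: 21.7097, 9: 23.5806,
    10: 22.5806, 11: 20.1290, 12: 27.2258, 13: 25.6774, 14: 22.2903,
    15: 24.6452, 16: 31.0000, 17: 29.1290, 18: 34.0000, 19: 29.2903,
    20: 32.0968, 21: 27.8387, 22: 30.4516, 23: 34.8064, 24: 37.5161,
    25: 33.1290, 26: 37.2903, 27: 31.4839, 28: 34.4516, 29: 34.2903,
    30: 32.000, 35: 45.7096, 40: 46.4838, 45: 50.7741, 50: 51.6774,
    55: 52.3225, 65: 52.6373, 75: 53.3870, 85: 63.0645, 95: 61.5161,
    105: 69.6451,
}

ns = sorted(TABLE_10_MEAN)
ms = [TABLE_10_MEAN[n] for n in ns]

# Linear model: m = a + b * n
linear_fit = least_squares(ns, ms)
# Logarithmic model: m = a + b * ln(n), i.e. a linear fit against ln(n)
log_fit = least_squares([math.log(n) for n in ns], ms)

print("linear:      a=%.3f b=%.3f r=%.4f" % linear_fit)
print("logarithmic: a=%.3f b=%.3f r=%.4f" % log_fit)
```

Fitting against ln(n) reuses the same routine on a transformed abscissa, so the two correlation coefficients can be compared directly. The same function applied to the (n, m) pairs of Table 9 gives the corresponding straight-line fit for ALOPEX I.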

Table 11
ALOPEX II convergence time and change in the number of variables n

| n | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|
| m | 55.6265 | 52.7145 | 27.4387 | 94.9432 | 24.6534 |
| σ | 11.7434 | 11.2856 | 18.1165 | 44.5987 | 8.6533 |


Fig. 8. ALOPEX I regression analysis in terms of the number of variables n (m plotted against n).

Fig. 9. ALOPEX III logarithmic regression analysis in terms of the number of variables n (m plotted against n).

In an effort to obtain information concerning the rate at which the cost function approximations approach their optimal value, Tables 9-11 were constructed. They summarize the numerical results with respect to computational times and the change of the number of variables n (where n = N + 1). They also include the mean number of iterations m needed for convergence and its standard deviation σ, for every value of n.

For ALOPEX I, it is evident that the dependence of the convergence time on the number of variables is linear. This is in agreement with theoretical arguments [15]. A further study, using the least-squares approximation method, gave for the number of effective iterations the relation

$$m = 14.95 + 4.43\,n$$

with a correlation coefficient r = 0.7789738. Fig. 8 shows the data points together with the least-squares approximating line.

The dependence of the ALOPEX III algorithm on the number n is much weaker, as follows from Table 10 and as one might expect from the fact that the feedback is continuously adjusted for fast convergence to the optimum. The least-squares regression analysis led to two possible approximations:

(a) logarithmic regression, by which the best correlation coefficient (r = 0.9602) is obtained. The curve is given by

$$m = -18.846 + 17.169 \ln(n)$$

(b) linear regression, with a correlation coefficient r = 0.9215, given by the relation

$$m = 19.728 + 0.5\,n.$$

The above curves, together with the data points, are plotted in Figs. 9 and 10, respectively.

Fig. 10. ALOPEX III linear regression analysis in terms of the number of variables n.

In the ALOPEX II case, the computation of the exact appropriate value of c for every value of n was beyond reach, for the reasons previously discussed. Table 11 lists some characteristic results, for the first five values of n, and illustrates the inability of the ALOPEX II process to give reliable data for a satisfactory regression analysis.

To conclude, we remark that due to the superiority of the proposed ALOPEX III algorithm, revealed


from all the numerical tests performed in this work, as well as the simplicity of its technical implementation, the new version of the ALOPEX method appears to be attractive for the solution of a vast variety of complicated and multi-dimensional real time problems.

5. Discussion and conclusions

An optimized feedback to ALOPEX has been found which significantly improves its performance in its application to adaptive optics. On the one hand, it maximizes the image sharpness faster and, on the other hand, the speed of convergence depends logarithmically on the number of control variables, in contrast to ALOPEX II, for which the dependence is linear. For a wavefront with eight phase variables, the time to recover the wavefront is less by a factor of two with ALOPEX III as compared to ALOPEX II. This is a consequence of the continuous adjustment of the feedback in an optimized way. These improvements will extend the potential applicability of ALOPEX in closed-loop control systems in general, and active or adaptive optics in particular. It is important to note, however, that our studies and results are relevant to catch-up times, namely to the time required to adjust the flexible mirror and compensate a fixed and arbitrary turbulent wavefront. It does not necessarily follow that the same improvements will hold for the dynamic case where the wavefront is continuously changing and that ALOPEX will still maintain the image sharpness by continuous adjustment of the flexible mirror shape. We are currently investigating this in conjunction with a 1.3 m telescope, where we ultimately plan to demonstrate the effectiveness of our method. However, our current results are directly relevant to problems of adaptive optics.

The image sharpness has been maximized with a monochromatic wavefront, and one may question whether the optimizing algorithm will converge to the sharpest image produced with a "white" wavefront. We are confident from extensive applications of ALOPEX [8] that it will converge to the sharpest image that can be produced by the adjustment of the mirror surface. However, this has also been demonstrated [10], using the similar but much less efficient (sequential and fixed feedback) Metropolis [17] algorithm, for imaging a white wavefront.

The ALOPEX algorithms are driven by the incoherent dithering of the control variables in parallel and the time dependence of the image sharpness (feedback). Their effectiveness and practicality depend on how frequently the image sharpness has to be used, as compared to the period of the turbulent wavefront, and on whether the mirror can follow the wavefront time evolution. The sharpness can be measured as fast as the photon statistics allow, while the latter is a technical question. In ALOPEX III three extra ditherings are required per cycle in order to determine the optimum feedback on step sizes. Although the computational overhead is greater than that in the fixed size ALOPEX, there is an overall improvement as measured by the number of feedback readings. For example, in a closed-loop adaptive optics implementation in which the lowest eight Zernike [12] polynomial coefficients are the control variables, one can foresee practical applications, even for the largest telescopes, in the infrared region at least, as long as the sharpness is measured using a star of magnitude as small as ten. Our continuing studies will define more sharply the limits of applicability, when photon statistics and wide band realistic [18,19] wavefronts are considered.

References

[1] E. Harth and E. Tzanakou, ALOPEX: A stochastic method for determining visual receptive fields, Vision Research 14 (1974) 1475-1482.

[2] E. Tzanakou, R. Michalak and E. Harth, The ALOPEX process: Visual receptive fields by response feedback, Biol. Cyber. 35 (1979) 161-174.

[3] E. Harth, K.P. Unnikrishnan and A.S. Pandya, The inversion of sensory processing by feedback pathways: a model of visual cognitive functions, Science 237 (1987) 187-189.

[4] E. Harth and T. Kalogeropoulos, Multiparameter optimization circuit, U.S. Patent No. 4 912 624, March 1990.

[5] M. Born and E. Wolf, Principles of Optics, 3rd ed. (Pergamon Press, New York, 1965).

[6] T. Kalogeropoulos, P. Saulson and E. Harth, Adaptive optics through stochastic optimization, Syracuse U. Internal Report (1993) 1-34.

[7] E. Harth, T. Kalogeropoulos and Z. Sobolewski, An electronic device for multi-parameter adaptive control, Syracuse U. Internal Report (1990) 1-27.

[8] E. Harth, T. Kalogeropoulos and A.S. Pandya, A universal optimization network, Special Symposium Volume of the 10th Annual International Conference of the IEEE-EMBS (New Orleans, LA, Nov. 1988).

[9] R. Muller and A. Buffington, Real-time correction of atmospherically degraded telescope images through image sharpening, J. Opt. Soc. Am. 64 (1974) 1200-1210.

[10] A. Buffington, F.S. Crawford, R.A. Muller, A.J. Schwemin and R.G. Smits, Correction of atmospheric distortion with an image-sharpening telescope, J. Opt. Soc. Am. 67 (1977) 298-303.

[11] D.L. Fried, Statistics of a geometric representation of wavefront distortion, J. Opt. Soc. Am. 55 (1965) 1427.

[12] R.J. Noll, Zernike polynomials and atmospheric turbulence, J. Opt. Soc. Am. 66 (1976) 207.

[13] L.A. Thompson, Adaptive Optics in Astronomy, Physics Today (Dec. 1994) 24-31.

[14] E. Harth and A.S. Pandya, Biomathematics and Related Computational Problems, L.M. Ricciardi, ed. (Kluwer Acad. Publ., Dordrecht, 1988) pp. 459-471.

[15] E. Harth, T. Kalogeropoulos and A.S. Pandya, Proceedings of the Special Symposium on Maturing Technologies and Emerging Horizons in Biomedical Engineering, J.B. Myklebust and G.F. Harris, eds. (1989) 97-107.

[16] E. Harth, Visual perception: a dynamic theory, Biol. Cyber. 22 (1976) 169-180.

[17] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller and E. Teller, J. Chem. Phys. 21 (1953) 1087.

[18] Love et al., Appl. Opt. 34 (1995) 6058-6066.

[19] Lane, Glindemann and Dainty, Waves Random Media 2 (1992) 209-224.
