A new trust region filter algorithm

Applied Mathematics and Computation 204 (2008) 485–489


Shujun Li 1, Zhenhai Liu *

School of Mathematics and Computing Science, Changsha University of Science and Technology, Changsha 410076, PR China

Article info

Keywords: Unconstrained optimization; Trust region; Adaptive strategy; Filter method; Global convergence

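0096-3003/$ - see front matter © 2008 Elsevier Inc. All rights reserved.
doi:10.1016/j.amc.2008.07.007

This work was supported by the National Nature Science Foundation of China (Grant No. 10671211) and Nature Science Foundation of Hunan province (Grant No. 07JJY3005).

* Corresponding author. E-mail addresses: [email protected] (S. Li), [email protected] (Z. Liu).

1 Present address: Shaanxi Costume and Art College, Xianyang 712046, PR China.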

Abstract

In this paper, an adaptive trust region filter algorithm for unconstrained optimization is presented. The global convergence results of the algorithm are established and the numerical results show that the new algorithm is efficient in practical computation.

© 2008 Elsevier Inc. All rights reserved.

1. Introduction

Consider an unconstrained optimization problem

$$\min f(x), \tag{1.1}$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is twice continuously differentiable. For convenience, we introduce some notation. Throughout the paper, $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^n$. Suppose $\{x_k\}$ is a sequence of points generated by our algorithm. We denote $f_k = f(x_k)$, $g_k = g(x_k)$, $g_{k,i} = g_i(x_k)$ and $H_k = H(x_k)$, where $g(x_k) \in \mathbb{R}^n$ and $H(x_k) \in \mathbb{R}^{n \times n}$ are the gradient and Hessian of $f(x)$ evaluated at $x_k$, respectively, and $B_k$ is a symmetric matrix which is either $H_k$ or an approximation to $H_k$. Trust region algorithms [14] usually obtain the trial step by solving the following subproblem:

$$\min\ \varphi_k(d) = g_k^T d + \frac{1}{2} d^T B_k d, \quad \text{s.t. } \|d\| \le \Delta_k, \tag{1.2}$$

where $\Delta_k$ is the trust region radius. It is well known that the trust region radius $\Delta_k$ is independent of $g_k$ and $B_k$ in the traditional trust region filter algorithms [2,9–12], so we do not know whether the radius $\Delta_k$ is suitable for the algorithm as a whole. This may increase the number of subproblems to be solved and decrease the efficiency of the algorithms. Furthermore, the choice of $\Delta_0$ also affects the efficiency of these algorithms, but there is no general rule for choosing $\Delta_0$. Therefore, Sartenaer [13] presented a strategy which can automatically determine an initial trust region radius. Zhang et al. [15,16] presented two other strategies for determining the trust region radius, namely

$$\Delta_k = c^p \|g_k\| \delta_k \quad \text{and} \quad \Delta_k = c^p \frac{\|g_k\|^3}{g_k^T \hat{B}_k g_k},$$

where $c \in (0,1)$, $p$ is a nonnegative integer, $\delta_k = \min\{\|B_k\|, 1\}$ and $\hat{B}_k = B_k + iI$ is a positive definite matrix for some $i$. In this paper, we present a new adaptive trust region strategy, which is different from those in [6,15,16]. By using the information of the former iterates, it is not necessary to compute $\hat{B}_k$ to obtain $\Delta_k$, which decreases the computational complexity of our algorithm. In this sense, the new strategy is an improvement. Furthermore, combined with the filter technique, the new strategy avoids the situation $\Delta_k = \infty$ when $g_k = g_{k-1}$ that arises in [6].

The rest of this paper is organized as follows. In Section 2, we introduce the filter technique. The new algorithm is presented in Section 3. Some assumptions and the global convergence results are presented in Section 4. The numerical results are summarized in Section 5.

2. The multidimensional filter

In this section, we introduce the mechanism of the filter, which was proposed by Fletcher and Leyffer [3]. The definition of the filter is based on the definition of dominance. For our problem, we say that a point $x_1$ dominates a point $x_2$ if and only if

$$|g_{1,i}| \le |g_{2,i}| \quad \text{for all } i \in \{1, 2, \ldots, n\}.$$

Thus, if iterate $x_1$ dominates iterate $x_2$ and we focus our attention on first-order critical points only, the latter is of no real interest to us, since $x_1$ is at least as good as $x_2$ for each of the components of the gradient. Therefore, all we need to do is to remember the iterates that are not dominated by other iterates, using a structure called a filter [2–4,10]. A multidimensional filter $F$ [3] is a list of $n$-tuples of the form $(g_{k,1}, g_{k,2}, \ldots, g_{k,n})$ such that

$$|g_{k,j}| < |g_{l,j}| \quad \text{for at least one } j \in \{1, 2, \ldots, n\}, \quad g_l \in F \text{ and } k \ne l.$$

By $F_k$ we denote the filter $F$ at the $k$th iterate.

However, we do not wish to accept a new point $x_k^+$ if one of the components of $(g_1(x_k^+), g_2(x_k^+), \ldots, g_n(x_k^+))$ is arbitrarily close to being dominated by the other points already in the filter. In order to avoid this situation, we slightly strengthen our acceptance condition and say that a new trial point $x_k^+$ is acceptable for the filter $F_k$ if and only if

$$\forall g_m \in F_k \ \ \exists j \in \{1, 2, \ldots, n\} \ \text{ such that } \ |g_j(x_k^+)| < |g_{m,j}| - \gamma_g \|g(x_m)\|, \tag{2.1}$$

where $\gamma_g$ is a small positive constant. If $x_k^+$ satisfies (2.1), then we add $(g_1(x_k^+), g_2(x_k^+), \ldots, g_n(x_k^+))$ to the filter and remove from the filter all the points which are dominated by $x_k^+$. This operation is also called "add $x_k^+$ to the filter" in the sequel.
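The acceptance test (2.1) and the "add to the filter" operation can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB code: the function names (`is_acceptable`, `dominates`, `add_to_filter`) and the representation of the filter as a list of gradient vectors are assumptions made here.

```python
import math

def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))

def is_acceptable(g_trial, filter_entries, gamma_g):
    """Test (2.1): every stored gradient g_m must be beaten, with margin
    gamma_g * ||g_m||, in at least one component j."""
    for g_m in filter_entries:
        margin = gamma_g * norm(g_m)
        if not any(abs(t) < abs(m) - margin for t, m in zip(g_trial, g_m)):
            return False
    return True

def dominates(g_a, g_b):
    """g_a dominates g_b if |g_a[i]| <= |g_b[i]| for every component i."""
    return all(abs(a) <= abs(b) for a, b in zip(g_a, g_b))

def add_to_filter(g_trial, filter_entries):
    """Return a new filter: drop entries dominated by g_trial, then append it."""
    kept = [g_m for g_m in filter_entries if not dominates(g_trial, g_m)]
    kept.append(list(g_trial))
    return kept

# Hypothetical two-dimensional example with gamma_g = 1e-3.
filt = [[1.0, 1.0]]
gamma_g = 1e-3
print(is_acceptable([0.5, 2.0], filt, gamma_g))     # first component improves by more than the margin
print(is_acceptable([0.9995, 1.5], filt, gamma_g))  # improvement is smaller than the margin
filt = add_to_filter([0.5, 2.0], filt)
filt = add_to_filter([0.2, 0.3], filt)
print(filt)
```

In this example the first trial gradient is acceptable and the second is not, and the last call leaves a single entry in the filter because `[0.2, 0.3]` dominates both earlier entries.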

3. Algorithm

In this section, we propose the new algorithm. At the current iterate $x_k$, we need to solve the subproblem (1.2) to obtain the trial step $d_k$. If $d_k$ satisfies

$$r_k = \frac{f(x_k) - f(x_k + d_k)}{\varphi_k(0) - \varphi_k(d_k)} > u,$$

then we call $d_k$ an f-step; otherwise, $d_k$ is a g-step if the trial point $x_k^+ = x_k + d_k$ satisfies (2.1). The algorithm for the solution of problem (1.1) can now be presented as follows:

Algorithm 3.1.

Step 0: (Initialization) Given $x_0 \in \mathbb{R}^n$ and $B_0 \in \mathbb{R}^{n \times n}$, set the parameters $0 < u < 1$, $0 < \lambda < 1$, $\varepsilon > 0$. Initialize the filter as $F_0 = (g_1(x_0), g_2(x_0), \ldots, g_n(x_0))$. Let $k := 0$, $i := 0$.

Step 1: If $\|g_k\| \le \varepsilon$, then stop.

Step 2: Compute $h_k = \min\{\|g_m\| \mid (g_1(x_m), g_2(x_m), \ldots, g_n(x_m)) \in F_k\}$.

Step 3: Let $\Delta_k = \lambda^i \min\left\{\frac{\|d_{k-1}\|}{\|g_k - g_{k-1}\|}\|g_k\|,\ \|g_k\|,\ h_k\right\}$, solve the subproblem (1.2) to obtain $d_k$, and set $x_k^+ = x_k + d_k$.

Step 4: Compute $r_k$. If $d_k$ is an f-step, then set $x_{k+1} = x_k^+$, $F_{k+1} = F_k$ and go to Step 6; else go to Step 5.

Step 5: If $d_k$ is a g-step, then set $x_{k+1} = x_k^+$, add $x_k^+$ to $F_k$, set $F_{k+1} = F_k$ and go to Step 6; else set $i := i + 1$ and go to Step 3.

Step 6: Modify $B_k$ to obtain $B_{k+1}$. Set $i := 0$, $k := k + 1$ and go to Step 1.
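The steps of Algorithm 3.1 can be sketched as follows. This is a minimal illustration under several assumptions that are not in the paper: the subproblem (1.2) is solved only approximately by the Cauchy point rather than exactly, $B_0$ is the identity, the ratio term in Step 3 is skipped at $k = 0$ (where no previous step exists), and the quadratic test function, the name `atrf` and the safeguard `max_inner` are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    return math.sqrt(dot(v, v))

def matvec(B, v):
    return [dot(row, v) for row in B]

def cauchy_step(g, B, delta):
    """Approximate solution of (1.2): model minimizer along -g inside the ball."""
    gn, curv = norm(g), dot(g, matvec(B, g))
    tau = 1.0 if curv <= 0 else min(1.0, gn ** 3 / (delta * curv))
    return [-tau * delta / gn * gi for gi in g]

def acceptable(g_new, filt, gamma_g):
    """Filter test (2.1)."""
    return all(any(abs(t) < abs(m) - gamma_g * norm(gm) for t, m in zip(g_new, gm))
               for gm in filt)

def bfgs_update(B, s, y):
    """Standard BFGS update of B, skipped when y^T s is not safely positive."""
    ys, Bs = dot(y, s), matvec(B, s)
    sBs = dot(s, Bs)
    if ys <= 1e-12 or sBs <= 1e-12:
        return B
    n = len(s)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / ys for j in range(n)]
            for i in range(n)]

def atrf(f, grad, x0, u=0.01, lam=0.75, eps=1e-6, gamma_g=1e-3,
         max_iter=1000, max_inner=60):
    n = len(x0)
    x, g = list(x0), grad(x0)
    B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # B_0 = I (assumed)
    filt = [g]                                   # Step 0: filter holds the initial gradient
    d_prev = g_prev = None
    for k in range(max_iter):
        gn = norm(g)
        if gn <= eps:                            # Step 1
            break
        h = min(norm(gm) for gm in filt)         # Step 2
        base = min(gn, h)
        if d_prev is not None:                   # ratio term needs a previous step (assumption)
            dg = norm([a - b for a, b in zip(g, g_prev)])
            if dg > 0:
                base = min(base, norm(d_prev) / dg * gn)
        for i in range(max_inner):               # inner cycle, Steps 3 to 5
            d = cauchy_step(g, B, lam ** i * base)
            x_new = [a + b for a, b in zip(x, d)]
            pred = -(dot(g, d) + 0.5 * dot(d, matvec(B, d)))   # phi(0) - phi(d)
            r = (f(x) - f(x_new)) / pred
            g_new = grad(x_new)
            if r > u:                            # f-step: filter unchanged
                break
            if acceptable(g_new, filt, gamma_g): # g-step: add x_new to the filter
                filt = [gm for gm in filt
                        if not all(abs(t) <= abs(m) for t, m in zip(g_new, gm))]
                filt.append(g_new)
                break
        else:
            raise RuntimeError("inner cycle did not terminate")
        B = bfgs_update(B, d, [a - b for a, b in zip(g_new, g)])   # Step 6
        d_prev, g_prev, x, g = d, g, x_new, g_new
    return x, norm(g), k

def f(x):       # hypothetical test function with minimizer (1, -0.5)
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

def grad(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 0.5)]

x_star, gnorm, iters = atrf(f, grad, [0.0, 0.0])
print(x_star, gnorm, iters)
```

The Cauchy point satisfies the decrease bound used in the convergence analysis, which is why it is a reasonable stand-in here; an exact subproblem solver would normally need fewer outer iterations.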

4. Convergence analysis

In the convergence analysis, we need the following assumptions:

A1 The gradient of the objective function $f$ is Lipschitz continuous with constant $L$.
A2 The iterates $\{x_k\}$ remain in a closed, bounded convex domain of $\mathbb{R}^n$.
A3 $\{B_k\}$ is uniformly bounded, i.e., there exists a constant $M$ such that $\|B_k\| \le M$ for all $k$.

The cycle between Steps 1 and 6 is called the outer cycle, and the cycle between Steps 3 and 5 is called the inner cycle. Let $d_k^i$ be the solution of (1.2) with respect to

$$\Delta_k = \Delta_k^i = \lambda^i \min\left\{\frac{\|d_{k-1}\|}{\|g_k - g_{k-1}\|}\|g_k\|,\ \|g_k\|,\ h_k\right\},$$

and let $d_k^*$ be the solution of (1.2) when the inner cycle at iterate $x_k$ is terminated.


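Lemma 4.1. Suppose the assumptions A1–A3 hold. For all $k$, we have

$$-\varphi_k(d_k^i) \ge \frac{1}{2}\|g_k\| \min\left\{\Delta_k^i,\ \frac{\|g_k\|}{\|B_k\|}\right\} \ge \frac{1}{2}\|g_k\| \min\left\{\lambda_k^* \min\left\{\frac{\|g_k\|}{L},\ \|g_k\|,\ h_k\right\},\ \frac{\|g_k\|}{M}\right\}, \tag{4.1}$$

where $i$ indicates the $i$th inner cycle and $\lambda_k^*$ is the value of $\lambda^i$ when the inner cycle is terminated at iterate $x_k$.

Proof. See Lemma 13.1.3 in [14]. □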
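The first inequality in (4.1) can be checked numerically on a small example. The sketch below is an illustration only: it uses the Cauchy point (the model minimizer along $-g_k$ inside the trust region) in place of the exact solution of (1.2), which gives no more decrease than the exact solution, and it replaces $\|B_k\|$ by the Frobenius norm, which is at least as large as the spectral norm and therefore makes the bound slightly weaker. The vector `g`, the matrix `B` and all function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matvec(B, v):
    return [dot(row, v) for row in B]

def model(g, B, d):
    """phi(d) = g^T d + 0.5 d^T B d."""
    return dot(g, d) + 0.5 * dot(d, matvec(B, d))

def cauchy_point(g, B, delta):
    """Minimizer of phi along -g subject to ||d|| <= delta."""
    gnorm = math.sqrt(dot(g, g))
    curv = dot(g, matvec(B, g))
    tau = 1.0 if curv <= 0 else min(1.0, gnorm ** 3 / (delta * curv))
    return [-tau * delta / gnorm * gi for gi in g]

def decrease_bound(g, B, delta):
    """0.5 * ||g|| * min{delta, ||g|| / ||B||_F}."""
    gnorm = math.sqrt(dot(g, g))
    bnorm = math.sqrt(sum(b * b for row in B for b in row))
    return 0.5 * gnorm * min(delta, gnorm / bnorm)

g = [1.0, -2.0]
B = [[2.0, 0.5], [0.5, 1.0]]
for delta in (0.1, 1.0, 10.0):
    d = cauchy_point(g, B, delta)
    # Model decrease on the left, lower bound on the right.
    print(delta, -model(g, B, d), decrease_bound(g, B, delta))
```

For each radius the printed model decrease is at least as large as the printed bound, both when the trust region is active (small radius) and when it is not (large radius).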

Theorem 4.2. Suppose the assumptions A1–A3 hold. Then Algorithm 3.1 is well defined, i.e., the algorithm does not cycle infinitely in the inner cycle.

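Proof. Suppose that Algorithm 3.1 cycles infinitely between Step 3 and Step 5 at the $k$th iterate. By the algorithm we get

$$r_k^i \le u \quad \text{and} \quad \Delta_k^i = \lambda^i \min\left\{\frac{\|d_{k-1}\|}{\|g_k - g_{k-1}\|}\|g_k\|,\ \|g_k\|,\ h_k\right\} \to 0 \quad \text{as } i \to \infty.$$

By (4.1), we have

$$|r_k^i - 1| = \left|\frac{f(x_k) - f(x_k^+) + \varphi_k(d_k^i)}{\varphi_k(0) - \varphi_k(d_k^i)}\right| = \left|\frac{\int_0^1 [g_k - g(x_k + t d_k^i)]^T d_k^i \, dt + \frac{1}{2}(d_k^i)^T B_k d_k^i}{-\varphi_k(d_k^i)}\right| \le \frac{\frac{1}{2}(L + M)(\Delta_k^i)^2}{\frac{1}{2}\|g_k\| \min\left\{\Delta_k^i,\ \frac{\|g_k\|}{\|B_k\|}\right\}} \to 0 \quad (i \to \infty),$$

which implies that $r_k^i \to 1$. This contradicts $r_k^i \le u < 1$, and the proof is completed. □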
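The last inequality in the proof above can be unpacked as follows. This is a sketch of the standard argument under A1 and A3, writing $d = d_k^i$ and $\Delta = \Delta_k^i$ for brevity (a shorthand introduced here, not used in the original):

```latex
\begin{aligned}
\left|\int_0^1 [g_k - g(x_k + t d)]^T d \, dt\right|
  &\le \int_0^1 L\, t\, \|d\|^2 \, dt = \tfrac{1}{2} L \|d\|^2 \le \tfrac{1}{2} L \Delta^2, \\
\left|\tfrac{1}{2} d^T B_k d\right|
  &\le \tfrac{1}{2} M \|d\|^2 \le \tfrac{1}{2} M \Delta^2, \\
|r_k^i - 1|
  &\le \frac{\tfrac{1}{2}(L + M)\Delta^2}{\tfrac{1}{2}\|g_k\|\,\Delta}
   = \frac{(L + M)\,\Delta}{\|g_k\|}
   \quad \text{once } \Delta \le \frac{\|g_k\|}{\|B_k\|}.
\end{aligned}
```

Since $g_k$ is fixed and nonzero during the inner cycle, the right-hand side tends to zero as $\Delta_k^i \to 0$.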

Theorem 4.3. Suppose the assumptions A1–A3 hold and infinitely many trial points are added to the filter. Then

$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Proof. Let $K_1 = \{k \mid x_k \text{ is added to the filter}\}$. By the mechanism of the filter (see (2.1)), $x_k$ is acceptable for the filter, which implies that for each sufficiently large $k_i \in K_1$ there exists an index $j \in \{1, 2, \ldots, n\}$ such that

$$|g_j(x_{k_i})| - |g_j(x_{k_{i-1}})| < -\gamma_g \|g(x_{k_{i-1}})\|, \quad k_i, k_{i-1} \in K_1. \tag{4.2}$$

Furthermore, by assumption A2, $g_j(x)$ $(j = 1, 2, \ldots, n)$ is a continuous function on a closed, bounded convex domain, thus $\{g_j(x_k)\}$ is bounded and has a convergent subsequence. Without loss of generality, we assume that the sequence $\{g_j(x_k)\}$ itself is convergent. Then the left side of (4.2) tends to zero as $k_i \to \infty$, which implies $\liminf_{k \to \infty} \|g_k\| = 0$, and the proof is completed. □

Theorem 4.4. Suppose the assumptions A1–A3 hold, only finitely many trial points are added to the filter and $\varepsilon = 0$. If Algorithm 3.1 does not terminate finitely, then

$$\liminf_{k \to \infty} \|g_k\| = 0.$$

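Proof. Suppose, by contradiction, that $\liminf_{k \to \infty} \|g_k\| \ne 0$. Then there exists $\varepsilon_0 > 0$ such that $\|g_k\| \ge \varepsilon_0 > 0$ for all $k$. Furthermore, since only finitely many points are added to the filter, there exist $\varepsilon_1 > 0$ and an integer $k_0 > 0$ such that $h_k > \varepsilon_1$ and

$$r_k = \frac{f(x_k) - f(x_k + d_k)}{\varphi_k(0) - \varphi_k(d_k)} > u$$

for all $k > k_0$. Let $K_2 = \{k \mid r_k > u \text{ and } k > k_0\}$. Then, by (4.1), we have

$$+\infty > \sum_{k \in K_2} \left(f(x_k) - f(x_k + d_k^*)\right) \ge \sum_{k \in K_2} -u\,\varphi_k(d_k^*) \ge \sum_{k \in K_2} \frac{u}{2}\|g_k\| \min\left\{\lambda_k^* \min\left\{\frac{\|g_k\|}{L},\ \|g_k\|,\ h_k\right\},\ \frac{\|g_k\|}{M}\right\},$$

which implies $\lambda_k^* \to 0$.

Let $\hat{d}_k$ be the solution of subproblem (1.2) with

$$\hat{\Delta}_k = \frac{1}{\lambda}\Delta_k^* = \frac{1}{\lambda}\lambda_k^* \min\left\{\frac{\|d_{k-1}\|}{\|g_k - g_{k-1}\|}\|g_k\|,\ \|g_k\|,\ h_k\right\},$$

where $\Delta_k^*$ is the radius of (1.2) when the inner cycle terminates at the iterate $x_k$. From the mechanism of the algorithm, we get the following inequality:

$$\hat{r}_k = \frac{f(x_k) - f(x_k + \hat{d}_k)}{\varphi_k(0) - \varphi_k(\hat{d}_k)} \le u, \quad k \in K_2.$$

By assumption A1, Lemma 4.1 and $\lim_{k \in K_2,\, k \to \infty} \lambda_k^* = 0$, we can prove that $\hat{r}_k \to 1$ as $k \to \infty$ (the proof is similar to that of $r_k^i \to 1$ in Theorem 4.2 and is omitted here), which contradicts $\hat{r}_k \le u < 1$. This contradiction shows that $\liminf_{k \to \infty} \|g_k\| = 0$. Hence, the proof is completed. □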

5. Numerical experiments

In this section, we report on implementations of our algorithm (ATRF), the traditional trust region algorithm (TTR) [14] and the traditional trust region filter algorithm (TTRF) [7] on the test problems in [5,8], which are now widely used by optimization researchers due to their various applications in nonlinear optimization. All test programs are written in MATLAB 6.0 and the parameters are specified as follows:

$$s_k = -g_k, \quad \lambda = 0.75, \quad q = 0.75, \quad u = 0.01, \quad \gamma_g = \min\left\{0.001,\ \frac{1}{2\sqrt{n}}\right\}.$$

$\{B_k\}$ is updated by the BFGS formula, and the stopping criterion is $\|g_k\| \le 10^{-8}$. The detailed results are summarized in Table 1.

Table 1. Numerical results. Problems p1–p3 are small scale, p4–p6 are medium scale and p7–p9 are large scale.

| Algorithm | | p1 | p2 | p3 | p4 | p5 | p6 | p7 | p8 | p9 |
|---|---|---|---|---|---|---|---|---|---|---|
| | n | 3 | 5 | 6 | 80 | 93 | 115 | 500 | 500 | 500 |
| ATRF | nf | 17 | 29 | 95 | 47 | 97 | 79 | 171 | 231 | 247 |
| | ng | 16 | 33 | 91 | 47 | 103 | 83 | 173 | 247 | 249 |
| | nsub | 24 | 45 | 121 | 63 | 117 | 95 | 191 | 257 | 281 |
| | f* | 7.6562E-5 | -1.0160E-6 | 3.0113E-2 | 2.7319E-2 | 1.0150E-6 | -1.1065E-2 | 1.5561E-3 | 2.1509E-8 | -6.0182E-4 |
| TTRF | nf | 29 | 51 | 101 | 53 | 107 | 89 | 165 | 241 | 217 |
| | ng | 31 | 52 | 117 | 51 | 113 | 87 | 167 | 247 | 229 |
| | nsub | 45 | 71 | 146 | 75 | 120 | 112 | 198 | 273 | 273 |
| | f* | 7.1631E-4 | -1.1401E-6 | 1.0215E-2 | 2.1456E-2 | 1.1487E-6 | -1.9825E-2 | 1.5560E-3 | 2.1005E-8 | -6.1196E-4 |
| TTR | nf | 55 | 68 | 213 | 78 | 119 | 91 | 187 | 221 | 243 |
| | ng | 57 | 70 | 217 | 83 | 125 | 93 | 187 | 220 | 249 |
| | nsub | 69 | 89 | 235 | 95 | 131 | 130 | 209 | 278 | 231 |
| | f* | 4.1109E-5 | -2.0192E-6 | 2.2145E-2 | 2.3210E-2 | 1.0290E-6 | -1.6512E-2 | 1.5590E-3 | 2.0250E-8 | -6.5102E-4 |

Table 1 reads as follows:

• n refers to the number of free variables.
• nf and ng denote the numbers of function evaluations and gradient evaluations.
• nsub is the number of times the subproblem (1.2) is solved.
• f* is the final value of the objective function.

Table 1 shows that all three algorithms solve the test problems efficiently. Comparing the results, however, our algorithm needs fewer subproblem solves (nsub) than the other two algorithms on eight of the nine problems (all except p9). Furthermore, we use the performance profile proposed by Dolan and Moré [1] to compare the CPU time of the three algorithms on the test problems. The profile gives a summary of the CPU time of the three algorithms. In Fig. 1, the proportion $P(t)$ is defined as $p(r)$ in [1] (see [1] for a more complete discussion). According to [1], the "best" algorithm is the one whose curve is highest in Fig. 1.
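To make the construction of a performance profile concrete, the following sketch computes one in the sense of Dolan and Moré [1]: for each problem, every solver's cost is divided by the best cost on that problem, and the profile value at a threshold is the fraction of problems whose ratio does not exceed it. Because the CPU times behind Fig. 1 are not reported in the paper, the sketch uses the nsub counts from Table 1 as the cost measure instead; the function name `performance_profile` and the chosen thresholds are assumptions made here.

```python
# Subproblem counts (nsub) for problems p1..p9, taken from Table 1.
nsub = {
    "ATRF": [24, 45, 121, 63, 117, 95, 191, 257, 281],
    "TTRF": [45, 71, 146, 75, 120, 112, 198, 273, 273],
    "TTR":  [69, 89, 235, 95, 131, 130, 209, 278, 231],
}

def performance_profile(costs, taus):
    """For each solver, return the fraction of problems whose cost ratio
    (cost / best cost on that problem) is at most tau, for every tau."""
    solvers = list(costs)
    n_prob = len(next(iter(costs.values())))
    best = [min(costs[s][p] for s in solvers) for p in range(n_prob)]
    ratios = {s: [costs[s][p] / best[p] for p in range(n_prob)] for s in solvers}
    return {s: [sum(r <= tau for r in ratios[s]) / n_prob for tau in taus]
            for s in solvers}

taus = [1.0, 1.25, 1.5, 2.0, 3.0]
profile = performance_profile(nsub, taus)
for solver, values in profile.items():
    print(solver, [round(v, 2) for v in values])
```

The value at threshold 1 is the fraction of problems on which a solver is the cheapest; with the nsub data this is 8/9 for ATRF, which matches the reading of Table 1 above. A profile based on CPU time, as in Fig. 1, would be computed in the same way from the timing data.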

Fig. 1 shows that our algorithm is the best of the three algorithms, since it is significantly more efficient than the other two algorithms for the small scale problems and has a very slight advantage for the large scale problems in terms of CPU time. Therefore, this adaptive trust region algorithm combined with the multidimensional filter is reasonable and efficient.

Fig. 1. CPU time performance profile of the three algorithms (ATRF, TTRF, TTR). Horizontal axis: Problem; vertical axis: P(t).

Acknowledgement

The authors would like to thank the referee and editors for their helpful suggestions and excellent work.

References

[1] E.D. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming 91 (2002) 201–213.
[2] R. Fletcher, S. Leyffer, N.I.M. Gould, Global convergence of a trust region SQP filter algorithm for general nonlinear, SIAM Journal on Optimization 13 (2003) 635–659.
[3] R. Fletcher, S. Leyffer, Nonlinear programming without a penalty function, Mathematical Programming 91 (2002) 239–269.
[4] N.I.M. Gould, S. Leyffer, Ph.L. Toint, A multidimensional filter algorithm for nonlinear programming, SIAM Journal on Optimization 15 (2005) 17–38.
[5] N.I.M. Gould, D. Orban, Ph.L. Toint, A constrained and unconstrained testing environment, ACM Transactions on Mathematical Software 29 (2003) 373–394.
[6] G.D. Li, A trust region method with automatic determination of the trust region, Chinese Journal of Engineering Mathematics 5 (2006) 843–848.
[7] W.H. Miao, Combination and applications of filter technique and nonmonotone technique in numerical optimization, Ph.D. Thesis, Nanjing Normal University, 2006.
[8] J.J. Moré, B.S. Garbow, K.E. Hillstrom, Testing unconstrained optimization software, ACM Transactions on Mathematical Software 7 (1981) 17–24.
[9] P.N. Nie, C.F. Ma, A trust region filter method for general nonlinear programming, Applied Mathematics and Computation 172 (2006) 1000–1017.
[10] P.Y. Nie, M.Y. Lai, S.J. Zhu, P.A. Zhang, A line search filter approach for the system of nonlinear equations, Computers & Mathematics with Applications 55 (2008) 2134–2141.
[11] P.Y. Nie, Sequential penalty quadratic programming filter methods for nonlinear programming, Nonlinear Analysis: Real World Applications 8 (1) (2007) 118–129.
[12] Y.H. Peng, Z.H. Liu, A derivative-free filter algorithm for nonlinear complementarity problem, Applied Mathematics and Computation 182 (2006) 846–853.
[13] A. Sartenaer, Automatic determination of an initial trust region in nonlinear programming, SIAM Journal on Scientific Computing 18 (1997) 1788–1803.
[14] Y. Yuan, W. Sun, Optimization Theory and Method of Optimization, Science Press, Beijing, 1997.
[15] X.S. Zhang, Z.W. Chen, L.Z. Liao, A self-adaptive trust region method for unconstrained optimization, Operations Research Transaction 5 (2001) 53–62.
[16] X.S. Zhang, J.L. Zhang, L.Z. Liao, An adaptive trust region method and its convergence, Science in China (Series A) 45 (2002) 620–631.
