A Computationally Fast and Approximate Method for Karush-Kuhn-Tucker Proximity Measure

Kalyanmoy Deb and Mohamed Abouhawwash1

Department of Electrical and Computer Engineering
Computational Optimization and Innovation (COIN) Laboratory
Michigan State University, East Lansing, MI 48824, USA
Email: [email protected], [email protected]

http://www.egr.msu.edu/~kdeb

COIN Report Number 2015015

Abstract

Karush-Kuhn-Tucker (KKT) optimality conditions are used by theoretical and applied optimization researchers to check whether a solution obtained by an optimization algorithm is truly an optimal solution. When a point is not the true optimum, a simple violation measure of the KKT optimality conditions cannot indicate anything about its proximity to the optimal solution. Past studies by the first author and his collaborators suggested a KKT proximity measure (KKTPM) that is able to identify the relative closeness of any point to the theoretical optimum without actually knowing the exact location of the optimum point. In this paper, we suggest several computationally fast methods for computing an approximate KKTPM value, so that a convergence measure for iteration-wise best solutions of an optimization algorithm can be quantified for terminating an optimization run. The KKTPM value can also be used to isolate less-converged solutions in a population-based optimization algorithm, so that they can be specially modified for an overall faster execution of the optimization run. The approximate KKTPM values are evaluated against the original exact KKTPM value on standard single-objective, multi-objective and many-objective optimization problems. In all cases, our proposed 'estimated' approximate method achieves a strong correlation with the exact KKTPM values while requiring two to three orders of magnitude less computational time. These results are extremely motivating for further studies on using the proposed estimated KKTPM procedure for establishing termination criteria and for developing other modified optimization procedures.

Keywords: Multi-objective optimization, Evolutionary optimization, KKT optimality conditions, Direct method, Estimated method.

1 Also Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt. Email: [email protected].


1. Introduction

Karush-Kuhn-Tucker (KKT) optimality conditions [1, 2] are necessary requirements for an optimal solution of a single- or multi-objective optimization problem to satisfy [3, 4, 5]. These conditions require first-order derivatives of the objective function(s) and constraint functions, although extensions of them for handling non-smooth problems using subdifferentials exist [6, 4]. Although KKT conditions are guaranteed to be satisfied at optimal solutions, the mathematical optimization literature is mostly silent about the regularity in the violation of these conditions in the vicinity of optimal solutions. This lack of literature prevents optimization algorithmists, particularly those working with approximation-based algorithms that at best attempt to find a near-optimal solution, from using KKT optimality conditions to check or evaluate the closeness of a solution to the theoretical optimum.

However, recent studies have suggested interesting new definitions of neighboring and approximate KKT points [7, 8, 9], which have resulted in a KKT proximity measure [10, 7]. These studies have clearly shown that a naive measure of the extent of violation of the KKT optimality conditions cannot provide a proximity 'measure' to the theoretical optimum; however, the KKT proximity measure (KKTPM), which uses the approximate KKT point definition, does provide a 'measure' of closeness of a solution to the KKT point. This is a remarkable achievement in its own right, as KKTPM offers a way to know the proximity of a solution to a KKT point without actually knowing the location of the KKT point. A recent study has extended the KKTPM procedure developed for single-objective optimization problems to multi-objective optimization problems [11]. Since multiple Pareto-optimal solutions exist in a multi-objective optimization problem with conflicting objectives, the KKTPM procedure can be applied to each obtained solution to obtain a proximity measure of that solution from its nearest Pareto-optimal solution.

The KKTPM computation opens up several new and promising avenues for the further development of evolutionary single- [12] and multi-objective optimization algorithms [13, 14]: (i) KKTPM can be used as a reliable convergence measure for terminating a run, (ii) the KKTPM value can be used to identify poorly converged non-dominated solutions in a multi-objective optimization problem, (iii) a KKTPM-based local search procedure can be invoked in an algorithm to speed up convergence, and (iv) the dynamics of KKTPM variation from the start to the end of an optimization run can provide useful information about multi-modality and other attractors in the search space. However, since the exact KKTPM computation involves a nested optimization task within an algorithm, the computational effort is a bottleneck for the above developments.

In this paper, we analyze the KKTPM computation procedure and suggest three approximate algorithmic procedures for a fast computation of KKTPM. For certain points, including the KKT point or optimal solution, the proposed approximate KKTPM values are identical to the exact KKTPM value, and we are able to find an implementable condition for when this happens. In other cases, we find two approximate methods that are guaranteed to bracket the exact KKTPM value and another approximate method that obtains a value closer to the exact KKTPM. We then demonstrate the use of the approximate KKTPM methods, show the accuracy of the proposed estimated KKTPM values compared to the exact KKTPM values, and, importantly, report two to three orders of magnitude faster computation with the approximate method.

In the remainder of the paper, we first discuss the importance of computing KKTPM in Section 2. Then, in Section 3, we describe the principle of the KKT proximity measure for single- and multi-objective optimization problems, based on the achievement scalarizing function (ASF) concept proposed in the multiple criterion decision making (MCDM) literature. In Section 4, we propose three fast yet approximate methods for computing the KKT proximity measure without using any explicit optimization procedure. Section 5 suggests replacing the ASF with its augmented version so as to avoid weak Pareto-optimal solutions. Section 6 compares all three proposed approximate methods with previously-published exact KKTPM values on standard single-, multi- and many-objective optimization problems, including a few engineering design problems. Finally, conclusions are drawn in Section 7.

2. Need for Computing KKT Proximity Measure

The earlier study by the authors [11] has clearly demonstrated that the recently proposed KKT proximity measure (KKTPM) provides a quantitative measure of the proximity of population-best solutions in an evolutionary algorithm to theoretical optimal solutions, which can then be used as a termination condition. Although that study did not simulate a classical point-by-point optimization method, the proposed KKTPM is also applicable to classical optimization methods. A recent study [15] demonstrated that evolutionary multi-objective optimization (EMO) methods constitute a parallel search that finds multiple trade-off solutions in a computationally quicker manner compared to point-by-point generative approaches [16, 17]. Such exploratory and application studies are making EMO methods popular and useful in practice.

For solving single-objective optimization problems, the best solution at every generation (or iteration) can be recorded and its KKTPM value can be computed. As demonstrated in the original studies [7, 11], the KKTPM value reduces almost monotonically to zero as the iterate approaches the optimal solution (or KKT point). The studies have also shown the correlation between the distance from the known optimal solution and the corresponding KKTPM value. Thus, a threshold on the KKTPM value can be enforced for a suitable termination of an optimization run. This avoids any guesswork in choosing a pre-specified number of generations or function evaluations, or in judging the stability of the objective value of the current best solution over the past few generations. Besides allowing an automatic termination, it also provides confidence in the final solution, as a small value of KKTPM guarantees that the solution is close to a KKT point.

For multi- and many-objective optimization problems, the non-dominated solutions at every generation of an EMO run can be recorded and their KKTPM values can be computed. Although these solutions are non-dominated with respect to each other, their closeness to the true Pareto-optimal front is likely to differ. Thus, the respective KKTPM value for each non-dominated solution at a generation is expected to be different. A few statistics of these KKTPM values (such as the smallest, median and largest) can be computed and analyzed for termination. Although the smallest and largest KKTPM values may not be good indicators of the convergence of the entire set, the median KKTPM value can be checked against a threshold value. In such a case, termination then means that at least 50% of the trade-off non-dominated solutions have come close to the Pareto-optimal front within the specified threshold value.
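The median-based termination rule described above can be sketched as follows (a minimal illustration; the threshold value and the function name are placeholders of ours, not prescriptions from the study):

```python
import statistics

def should_terminate(kktpm_values, threshold=1e-3):
    """Terminate when the median KKTPM of the current non-dominated set
    falls below a threshold, i.e., at least 50% of the solutions are
    within the specified proximity of the Pareto-optimal front."""
    stats = {
        'smallest': min(kktpm_values),
        'median': statistics.median(kktpm_values),
        'largest': max(kktpm_values),
    }
    return stats['median'] <= threshold, stats

# Example: a population where more than half the members have converged
done, s = should_terminate([0.0004, 0.0009, 0.0007, 0.2, 1.5], threshold=1e-3)
```

Note that the largest value (1.5 above) would by itself wrongly suggest non-convergence, which is exactly why the median is the more robust set-level indicator.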

There are a few other reasons for pursuing the computation of KKTPM, which we highlight here.

1. It is important to realize that not all non-dominated solutions are likely to be close to the Pareto-optimal front. Figure 1 shows that, with respect to a few true Pareto-optimal solutions (each with a KKTPM value of zero) shown as filled circles, two types of solutions can be non-dominated. First, there can be solutions that are close to the Pareto-optimal front but are not truly Pareto-optimal; these points are shown as open circles. Second, there can be solutions that are not even close to the Pareto-optimal front but are non-dominated with respect to the rest of the trade-off solutions; these solutions are marked as open boxes. A computation of KKTPM for each of these solutions will reveal the properties of each of the three types of non-dominated solutions. For true Pareto-optimal solutions (in the theoretical sense), the KKTPM value is guaranteed to be zero; for near-Pareto-optimal solutions, the KKTPM value will be small; and for far-away solutions, the KKTPM value will be large.

Figure 1: A set of non-dominated points in the f1-f2 plane may have widely different KKT proximity measure values: zero on the Pareto-optimal front, small near it, and large away from it.

For the first time, the above observation provides unique information about individual non-dominated population members to an EMO algorithm. So far, the only ways to differentiate non-dominated solutions in an EMO procedure have been based on their neighborhood information [18, 19] or closeness to a reference direction [20, 21]. KKTPM computation opens up a unique way to differentiate non-dominated solutions in an EMO population based on their relative proximity to the true Pareto-optimal front. Special action can then be taken to improve the non-dominated solutions that are away from the true Pareto-optimal front so as to obtain a faster overall convergence. We do not address these additional and special efforts in this paper, but before we are able to do so, we need a faster way to compute the KKTPM value, which is the main goal of this paper.

2. In many problems, objective values near the optimal or Pareto-optimal region can take more or less similar values (making the landscape somewhat 'flat'). In such problems, evolutionary optimization algorithms lack adequate search power to converge close to the true optimal or Pareto-optimal solutions quickly; rather, populations wander around them as generations elapse. For problem SRN, Figure 2 shows the NSGA-II population after 250 generations. While the true Pareto-optimal solutions lie along the vertical line at x1 = -2.5, NSGA-II solutions wander around this optimal line until 500 generations. Since evolutionary optimization methods do not use any gradient information, there is no real way to salvage any second-order information to attain better convergence. Since it has been established that KKTPM has a direct correlation with the 'distance' of a point from its nearest optimum, it can provide the additional information needed to distinguish a solution that is relatively close to an optimum from one that is farther away. Thereafter, an appropriate local search method can be employed to make the solution converge to the optimal solution. Again, we do not address this aspect of fine-tuning near-optimal solutions in this paper; we are currently pursuing it in a separate study and shall report results in a later publication.

Figure 2: NSGA-II population members, plotted in the x1-x2 plane, wander around the Pareto-optimal solutions even after 250 generations.

It is amply clear from the above discussion that, besides using KKTPM to aid an automatic and more confident termination of an optimization run, KKTPM can be used to expedite the search process. However, there are a few bottlenecks in computing and using KKTPM, which we discuss next.

1. The original study [11] proposed an optimization procedure for computing the exact KKTPM value. This optimization procedure uses Lagrange multipliers (one for each constraint or each variable bound) and two additional parameters (the KKTPM itself and a slack variable for converting a non-differentiable problem into a smooth problem) as variables. The problem also has two main inequality constraints, one quadratic and one linear in the Lagrange multipliers. The solution of this optimization problem can be time-consuming, particularly if the KKTPM value needs to be computed for every population member in each generation. In this paper, we address this issue and avoid solving the optimization problem by suggesting a computationally fast and approximate method without sacrificing much accuracy in the KKTPM value. With this ability, we believe that the KKTPM approach paves the way for its use in the above-mentioned algorithmic modifications for a faster overall approach.

2. Second, KKTPM requires gradient information for the original objective functions and constraints at the point at which the KKTPM value needs to be computed. The original study used exact gradients for most examples, but demonstrated on a single problem, using numerical gradient computation (the forward difference method), that the induced error is small and proportional to the step size used for the numerical gradient computation. This is encouraging, and we plan to pursue this aspect in more detail in a later study. Other interval-based KKT optimality conditions [22] will also be explored.

3. Third, a zero KKTPM value can come from both local and global optimal solutions. However, this is not a big issue: in addition to the KKTPM value, a solution's objective value can be used to differentiate local and global solutions, and a hierarchical selection operator using the KKTPM value followed by the objective value (or non-domination level) can be used to steer the search towards the global optimal solution.

4. Fourth, the KKTPM value indicates a measure of a solution's proximity to the respective optimal solution. In multi-objective optimization, in addition to convergence, diversity preservation among non-dominated solutions is another equally important goal. Thus, a termination criterion, as well as the other modifications suggested above, must be accompanied by a careful balance between convergence and diversity of solutions. Fortunately, the convergence issue can now be dealt with reliably using KKTPM, and further research is needed to combine it with a diversity measure for achieving a fast and reliable optimization.

Having set the stage by highlighting the importance and role of KKTPM in enhancing performance and reliably terminating an optimization algorithm from a convergence standpoint, we now return to the main goal of this paper: a fast and approximate computation of the KKT proximity measure. We highlight the fact that if we are able to achieve a fast computation of KKTPM without sacrificing much accuracy, it will open up a number of avenues for algorithmic improvements, as discussed above. In the following, we first review the philosophy behind computing KKTPM and then suggest the proposed fast but approximate procedure.

3. Review of KKT Proximity Measure

The KKT proximity measure was first proposed elsewhere [7] for single-objective optimization problems. Since the KKT error measure, derived from the violation of the KKT equilibrium conditions, has a 'singularity' property at any KKT point and is unable to relate the proximity of a solution to a KKT point, the development of the KKT proximity measure (KKTPM) was a seminal achievement, and for the first time it demonstrated its importance in arriving at a suitable termination condition. Recently, the single-objective KKT proximity measure has been extended to multi-objective optimization problems [11]. In this section, we briefly introduce both studies, so that we can build up a fast computational method in the next section.

3.1. KKT Proximity Measure for Single-Objective Optimization

Dutta et al. [7] defined an approximate KKT solution in order to compute a KKT proximity measure for any iterate x^k of a single-objective optimization problem of the following type:

\[
\begin{aligned}
& \underset{x}{\text{Minimize}} \quad f(x), \\
& \text{subject to} \quad g_j(x) \le 0, \quad j = 1, 2, \ldots, J.
\end{aligned} \tag{1}
\]

After a lengthy theoretical derivation, they suggested the following procedure for computing the KKT proximity measure for an iterate x^k:

\[
\begin{aligned}
& \underset{\epsilon_k,\, u}{\text{Minimize}} \quad \epsilon_k, \\
& \text{subject to} \quad \Bigl\| \nabla f(x^k) + \textstyle\sum_{j=1}^{J} u_j \nabla g_j(x^k) \Bigr\|^2 \le \epsilon_k, \\
& \qquad\qquad\;\; \textstyle\sum_{j=1}^{J} u_j g_j(x^k) \ge -\epsilon_k, \\
& \qquad\qquad\;\; u_j \ge 0, \quad \forall j.
\end{aligned} \tag{2}
\]

However, that study restricted the computation of the KKT proximity measure to feasible solutions only; for an infeasible iterate, the computation was simply skipped. For the above optimization problem, the variables are (ε_k, u). The KKT proximity measure (KKTPM) was then defined as follows:

\[
\text{KKT Proximity Measure}(x^k) = \epsilon_k^*, \tag{3}
\]

where ε_k^* is the optimal objective value of the problem stated in Equation 2. To illustrate the working of KKTPM, we consider a two-variable constrained minimization problem:

\[
\begin{aligned}
& \text{Minimize} \quad f(x_1, x_2) = x_1^2 + x_2^2, \\
& \text{s.t.} \quad g_1(x_1, x_2) \equiv 3x_1 - x_2 + 1 \le 0, \\
& \qquad\; g_2(x_1, x_2) \equiv x_1^2 + (x_2 - 2)^2 - 1 \le 0.
\end{aligned} \tag{4}
\]

Figure 3 shows the feasible space and the optimum point C at x* = (0, 1)^T.

Figure 3: Feasible search space of the example problem in the x1-x2 plane, with the constraint boundaries g1 = 0 and g2 = 0, the points A, B, C, D and O, and the directions -∇f, ∇g1 and ∇g2 marked.

Figure 4: Variation of the KKTPM value along line AC, plotted against the distance from point A, for the example problem.

For points along line AC, Figure 4 shows the variation of KKTPM. It is clear that as points get closer to the optimal point, the KKTPM value reduces monotonically and finally becomes zero at the optimal point.
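To make the computation of problem (2) concrete, the following sketch evaluates the KKTPM of the example problem (4) at a given iterate with an off-the-shelf nonlinear solver (the use of SciPy's SLSQP is our choice for illustration; the original study does not prescribe a particular solver):

```python
import numpy as np
from scipy.optimize import minimize

# Example problem (4): f = x1^2 + x2^2, g1 = 3*x1 - x2 + 1, g2 = x1^2 + (x2-2)^2 - 1
grad_f = lambda x: np.array([2 * x[0], 2 * x[1]])
g_vals = lambda x: np.array([3 * x[0] - x[1] + 1, x[0]**2 + (x[1] - 2)**2 - 1])
grad_g = lambda x: np.array([[3.0, -1.0], [2 * x[0], 2 * (x[1] - 2)]])

def kktpm(x):
    """Solve problem (2) in the variables z = (eps, u_1, ..., u_J)
    for a feasible iterate x of the example problem."""
    gf, gv, gg = grad_f(x), g_vals(x), grad_g(x)
    cons = [
        # ||grad f + sum_j u_j grad g_j||^2 <= eps
        {'type': 'ineq', 'fun': lambda z: z[0] - np.sum((gf + z[1:] @ gg) ** 2)},
        # sum_j u_j g_j >= -eps
        {'type': 'ineq', 'fun': lambda z: z[0] + gv @ z[1:]},
    ]
    res = minimize(lambda z: z[0], np.ones(3), method='SLSQP',
                   bounds=[(0, None)] * 3, constraints=cons)
    return res.fun
```

At the optimum x* = (0, 1) both constraints are active with multipliers (u1, u2) = (0, 1), so the measure vanishes, while at an interior feasible point such as (0, 2) it is strictly positive, mirroring the monotone decay along AC in Figure 4.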

3.2. Exact KKT Proximity Measure for Multi-Objective Optimization

The theory behind the development of the exact KKTPM for multi-objective optimization problems was presented in an earlier study by the authors [11]. Here, we summarize the main results for completeness.

For an n-variable, M-objective optimization problem with J inequality constraints:

\[
\begin{aligned}
& \underset{x}{\text{Minimize}} \quad \{ f_1(x), f_2(x), \ldots, f_M(x) \}, \\
& \text{subject to} \quad g_j(x) \le 0, \quad j = 1, 2, \ldots, J,
\end{aligned} \tag{5}
\]

the Karush-Kuhn-Tucker (KKT) optimality conditions for Equation 5 are given as follows [6, 17]:

\[
\sum_{m=1}^{M} u_m \nabla f_m(x^k) + \sum_{j=1}^{J} u_j \nabla g_j(x^k) = 0, \tag{6}
\]
\[
g_j(x^k) \le 0, \quad j = 1, 2, \ldots, J, \tag{7}
\]
\[
u_j g_j(x^k) = 0, \quad j = 1, 2, \ldots, J, \tag{8}
\]
\[
u_j \ge 0, \quad j = 1, 2, \ldots, J, \tag{9}
\]
\[
u_m \ge 0, \quad m = 1, 2, \ldots, M, \quad \text{and} \quad (u_1, \ldots, u_M) \neq \mathbf{0}. \tag{10}
\]

The multipliers u_m are non-negative, but at least one of them must be non-zero. The parameter u_j is called the Lagrange multiplier for the j-th inequality constraint, and these are also non-negative. Any solution x^k that satisfies all the above conditions is called a KKT point [4]. Variable bounds of the type x_i^(L) ≤ x_i ≤ x_i^(U) can be split into two inequality constraints: g_{J+2i-1}(x) = x_i^(L) - x_i ≤ 0 and g_{J+2i}(x) = x_i - x_i^(U) ≤ 0. Thus, in the presence of all n pairs of specified variable bounds, there are a total of J + 2n inequality constraints for the above problem.

For a given iterate (solution) x^k, the original study [11] formulated an achievement scalarizing function (ASF) optimization problem [23]:

\[
\begin{aligned}
& \underset{x}{\text{Minimize}} \quad \mathrm{ASF}(x, z, w) = \max_{m=1}^{M} \left( \frac{f_m(x) - z_m}{w_m} \right), \\
& \text{subject to} \quad g_j(x) \le 0, \quad j = 1, 2, \ldots, J.
\end{aligned} \tag{11}
\]

The reference point z ∈ R^M was taken to be a utopian point, and the weight vector w ∈ R^M is computed for x^k as follows:

\[
w_i = \frac{f_i(x^k) - z_i}{\sqrt{\sum_{m=1}^{M} \left( f_m(x^k) - z_m \right)^2}}. \tag{12}
\]
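Equation (12) simply normalizes the direction from the utopian point to the iterate's objective vector; a minimal sketch (the utopian point z is an input the user must supply, and the function name is ours):

```python
import numpy as np

def asf_weights(f_k, z):
    """Weight vector of Eq. (12): the unit direction from the utopian
    point z to the objective vector f(x^k)."""
    d = np.asarray(f_k, dtype=float) - np.asarray(z, dtype=float)
    return d / np.linalg.norm(d)

w = asf_weights([3.0, 4.0], [0.0, 0.0])   # -> [0.6, 0.8]
```

By construction the weight vector has unit Euclidean norm, so each scaled term (f_i(x^k) - z_i)/w_i in the ASF equals the distance from z to f(x^k).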

Thereafter, the KKT proximity measure was computed by applying the KKTPM computation procedure developed for single-objective optimization problems, stated in Equation 2, to the ASF formulation given above. Since the ASF formulation makes the objective function non-differentiable, a smooth transformation of the ASF problem was first made by introducing a slack variable x_{n+1} and reformulating the original problem as follows:

\[
\begin{aligned}
& \text{Minimize} \quad F(x, x_{n+1}) = x_{n+1}, \\
& \text{subject to} \quad \left( \frac{f_i(x) - z_i}{w_i^k} \right) - x_{n+1} \le 0, \quad i = 1, 2, \ldots, M, \\
& \qquad\qquad\;\; g_j(x) \le 0, \quad j = 1, 2, \ldots, J.
\end{aligned} \tag{13}
\]

Now, the KKTPM optimization problem for the above smooth single-objective problem, in terms of y = (x; x_{n+1}), can be written as follows:

\[
\begin{aligned}
& \underset{\epsilon_k,\, x_{n+1},\, u}{\text{Minimize}} \quad \epsilon_k + \sum_{j=1}^{J} \left( u_{M+j}\, g_j(x^k) \right)^2, \\
& \text{subject to} \quad \Bigl\| \nabla F(y) + \textstyle\sum_{j=1}^{M+J} u_j \nabla G_j(y) \Bigr\|^2 \le \epsilon_k, \\
& \qquad\qquad\;\; \textstyle\sum_{j=1}^{M+J} u_j G_j(y) \ge -\epsilon_k, \\
& \qquad\qquad\;\; \left( \frac{f_j(x) - z_j}{w_j^k} \right) - x_{n+1} \le 0, \quad j = 1, \ldots, M, \\
& \qquad\qquad\;\; u_j \ge 0, \quad j = 1, 2, \ldots, (M + J).
\end{aligned} \tag{14}
\]


The added term in the objective function imposes a penalty associated with the violation of the complementary slackness condition [11]. The constraints G_j(y) are given below:

\[
G_j(y) = \frac{f_j(x) - z_j}{w_j^k} - x_{n+1} \le 0, \quad j = 1, \ldots, M, \tag{15}
\]
\[
G_{M+j}(y) = g_j(x) \le 0, \quad j = 1, 2, \ldots, J. \tag{16}
\]

The optimal objective value ε_k^* of the above problem corresponds to the exact KKT proximity measure. It is observed that ε_k^* ≤ 1 for feasible solutions; hence the exact KKTPM was defined as follows:

\[
\text{Exact KKT Proximity Measure}(x^k) =
\begin{cases}
\epsilon_k^*, & \text{if } x^k \text{ is feasible}, \\
1 + \sum_{j=1}^{J} \bigl\langle g_j(x^k) \bigr\rangle^2, & \text{otherwise},
\end{cases} \tag{17}
\]

where ⟨α⟩ denotes the bracket operator returning the constraint violation: ⟨α⟩ = α if α > 0, and zero otherwise.
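For an infeasible point, the measure of Equation (17) needs only the constraint values; a minimal sketch (here the bracket operator is taken as the positive part of each constraint value, i.e., the violation, which is our reading of the notation):

```python
import numpy as np

def infeasible_kktpm(g_values):
    """Eq. (17), infeasible branch: one plus the sum of squared
    constraint violations, so the measure always exceeds 1 and
    never overlaps with the feasible-branch values (<= 1)."""
    g = np.asarray(g_values, dtype=float)
    return 1.0 + float(np.sum(np.maximum(g, 0.0) ** 2))
```

The offset of 1 keeps the infeasible values strictly above the feasible range, so the combined measure still decreases monotonically as an algorithm approaches feasibility and then optimality.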

4. Proposed Fast and Approximate Computation of KKTPM

The above exact KKTPM computation procedure requires an optimization problem (Equation 14) to be solved for every iterate x^k. Since the variables for this optimization task are (ε_k, x_{n+1}, u), that is, (M + J + 2) of them, the computational time needed to compute the KKTPM value for every non-dominated solution at every generation of an EMO run can be large. To enable a fast computation of an approximation of the KKTPM value, we suggest a few fast but approximate methods here.

4.1. Direct KKTPM Computation Method

There are two main constraints in the optimization task of Equation 14. The first constraint requires the computation of the gradients of the F and G functions and can be written as:

\[
\epsilon_k \ge \left\| \sum_{j=1}^{M} \frac{u_j}{w_j^k} \nabla f_j(x^k) + \sum_{j=1}^{J} u_{M+j} \nabla g_j(x^k) \right\|^2 + \left( 1 - \sum_{j=1}^{M} u_j \right)^2. \tag{18}
\]

The second constraint in equation 14 can be written as follows:

\[
\epsilon_k \ge -\sum_{j=1}^{M} u_j \left( \frac{f_j(x^k) - z_j}{w_j^k} - x_{n+1} \right) - \sum_{j=1}^{J} u_{M+j}\, g_j(x^k). \tag{19}
\]

Since x_{n+1} is a slack variable, the third set of constraints is satisfied by setting

\[
x_{n+1} = \max_{j=1}^{M} \left( \frac{f_j(x^k) - z_j}{w_j^k} \right). \tag{20}
\]

For any feasible solution x^k, f_j(x^k) is always greater than the respective j-th component of any utopian point z; hence, x_{n+1} is also non-negative. Thus, the constraint set of the problem given in Equation 14 reduces to the first and second constraints and the non-negativity constraints on the Lagrange multipliers u = (u_M, u_J), where u_M and u_J are the M-dimensional and J-dimensional vectors of Lagrange multipliers, respectively.


Let us consider the first constraint alone. It is clearly a quadratic function of the Lagrange multipliers u and can be written as follows:

\[
\epsilon_k \ge \sum_{j=1}^{n} \left( \sum_{i=1}^{M+J} u_i a_{ij} \right)^2 + \left( 1 - \sum_{i=1}^{M} u_i \right)^2, \tag{21}
\]

where a_{ij} can be written in terms of the derivatives of the f_i and g_j functions:

\[
a_{ij} =
\begin{cases}
\dfrac{1}{w_i^k}\left.\dfrac{\partial f_i}{\partial x_j}\right|_{x^k}, & \text{for } i = 1, 2, \ldots, M, \\[2ex]
\left.\dfrac{\partial g_{(i-M)}}{\partial x_j}\right|_{x^k}, & \text{for } i = M+1, M+2, \ldots, M+J,
\end{cases}
\qquad \text{for } j = 1, 2, \ldots, n.
\]
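Assembling the matrix A = [a_{ij}] from the objective and constraint gradients can be sketched as follows (a minimal illustration; passing the gradients as callables is our implementation choice, not part of the paper):

```python
import numpy as np

def build_A(grad_fs, grad_gs, w, x):
    """Rows 1..M: objective gradients at x scaled by 1/w_i;
    rows M+1..M+J: constraint gradients at x."""
    A_M = np.array([np.asarray(gf(x), dtype=float) / wi
                    for gf, wi in zip(grad_fs, w)])
    A_J = np.array([np.asarray(gg(x), dtype=float) for gg in grad_gs])
    return A_M, A_J
```

Returning the two blocks A_M and A_J separately is convenient because the subsequent derivation works with them block-wise.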

Writing the above constraint in terms of vectors uM and uJ, we have

\[
\epsilon_k \ge \left( A_M^T u_M + A_J^T u_J \right)^T \left( A_M^T u_M + A_J^T u_J \right) + \left( 1 - \mathbf{1}_M^T u_M \right)^2, \tag{22}
\]

where \mathbf{1}_M is an (M × 1)-dimensional vector of ones, A_M is an (M × n)-dimensional matrix and A_J is a (J × n)-dimensional matrix, extracted from the overall ((M + J) × n)-dimensional matrix

\[
A = [a_{ij}] = \begin{bmatrix} A_M \\ A_J \end{bmatrix}.
\]

In our first approximation method (referred to here as the 'Direct' method), we assume that the second constraint, given in Equation 19, is inactive and the first constraint is active at the optimal solution of the KKTPM optimization problem given in Equation 14. Thus, the problem becomes as follows:

\[
\begin{aligned}
& \underset{\epsilon_k,\, u_M,\, u_J}{\text{Min.}} \quad \epsilon_k + u_J^T G G^T u_J, \\
& \text{subject to} \quad \epsilon_k \ge \left( A_M^T u_M + A_J^T u_J \right)^T \left( A_M^T u_M + A_J^T u_J \right) + \left( 1 - \mathbf{1}_M^T u_M \right)^2, \\
& \qquad\qquad\;\; u_M \ge 0, \quad u_J \ge 0,
\end{aligned} \tag{23}
\]

where G is the J-dimensional constraint vector (g_1(x^k), g_2(x^k), ..., g_J(x^k))^T. Since ε_k is an independent variable, the above optimization problem can be simplified as follows:

\[
\begin{aligned}
& \underset{u_M,\, u_J}{\text{Min.}} \quad \left( A_M^T u_M + A_J^T u_J \right)^T \left( A_M^T u_M + A_J^T u_J \right) + \left( 1 - \mathbf{1}_M^T u_M \right)^2 + u_J^T G G^T u_J, \\
& \text{subject to} \quad u_M \ge 0, \quad u_J \ge 0.
\end{aligned} \tag{24}
\]

The above objective function is quadratic in u_M and u_J. By differentiating the objective function with respect to u_M and u_J and equating the derivatives to zero, we have

\[
A_M A_M^T u_M + A_M A_J^T u_J - \left( 1 - \mathbf{1}_M^T u_M \right) \mathbf{1}_M = 0_{M \times 1}, \tag{25}
\]
\[
A_J A_M^T u_M + A_J A_J^T u_J + G G^T u_J = 0_{J \times 1}. \tag{26}
\]

Rearranging the above two vector equations, we have

\[
\begin{bmatrix}
A_M A_M^T + \mathbf{1}_{M \times M} & A_M A_J^T \\
A_J A_M^T & A_J A_J^T + G G^T
\end{bmatrix}
\begin{pmatrix} u_M \\ u_J \end{pmatrix}
=
\begin{pmatrix} \mathbf{1}_{M \times 1} \\ 0_{J \times 1} \end{pmatrix}. \tag{27}
\]


Interestingly, the above equations are a set of linear equations, which can be solved to obtain the (u_M, u_J)^T vector. However, it is not necessary that each component of this vector be non-negative. To satisfy the non-negativity constraints, if any component of u_M or u_J is calculated to be negative, we set it to zero, eliminate the corresponding equation from the above set of linear equations, and re-solve the remaining equations. We continue this process until the non-negativity of u_M and u_J is established. Let us say that the final vector is (u_M^D, u_J^D)^T.
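The active-set elimination just described can be sketched as follows (a minimal NumPy illustration; the function name is ours, and degenerate inputs are not specially handled):

```python
import numpy as np

def direct_kktpm(A_M, A_J, g):
    """Solve the linear system (27); components that come out negative are
    fixed at zero and the reduced system is re-solved, as in the Direct
    method. Returns (u_M, u_J, eps_D), with eps_D computed as in Eq. (30)."""
    M, J = A_M.shape[0], A_J.shape[0]
    G = np.asarray(g, dtype=float).reshape(J, 1)
    K = np.block([[A_M @ A_M.T + np.ones((M, M)), A_M @ A_J.T],
                  [A_J @ A_M.T,                   A_J @ A_J.T + G @ G.T]])
    b = np.concatenate([np.ones(M), np.zeros(J)])
    free = np.arange(M + J)                 # multipliers not yet fixed at zero
    u = np.zeros(M + J)
    while free.size:
        sol = np.linalg.solve(K[np.ix_(free, free)], b[free])
        if np.all(sol >= -1e-12):
            u[free] = np.maximum(sol, 0.0)
            break
        free = free[sol >= -1e-12]          # eliminate negative multipliers
    u_M, u_J = u[:M], u[M:]
    gtu = float(np.dot(G.ravel(), u_J))     # G^T u_J
    eps_D = 1.0 - float(np.sum(u_M)) - gtu ** 2    # Eq. (30)
    return u_M, u_J, eps_D
```

As a made-up check: for the bi-objective problem min (x1, x2) subject to g = 1 - x1 - x2 ≤ 0 at the Pareto-optimal point x = (0.5, 0.5) with z = (0, 0), Equation (12) gives w = (1/√2, 1/√2), so A_M = √2 I, A_J = (-1, -1) and g = 0; the method returns u_M = (0.5, 0.5) with Σ u_M = 1 and ε_D = 0, consistent with the KKT-point identity (Equation 35) discussed later in this section.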

By transposing and post-multiplying Equation 25 by u_M^D, and doing the same for Equation 26 with u_J^D, we obtain

\[
\left( A_M^T u_M^D + A_J^T u_J^D \right)^T A_M^T u_M^D = \left( 1 - \mathbf{1}_M^T u_M^D \right) \mathbf{1}_M^T u_M^D, \tag{28}
\]
\[
\left( A_M^T u_M^D + A_J^T u_J^D \right)^T A_J^T u_J^D = - \left( u_J^D \right)^T G G^T u_J^D. \tag{29}
\]

Substituting these left-hand-side terms into the expression for ε_k and simplifying, we obtain the Direct approximate KKTPM value arising from the first (quadratic) constraint alone:

\[
\begin{aligned}
\epsilon_k^D &= \left( 1 - \mathbf{1}_M^T u_M^D \right) \mathbf{1}_M^T u_M^D - \left( u_J^D \right)^T G G^T u_J^D + \left( 1 - \mathbf{1}_M^T u_M^D \right)^2 \\
&= 1 - \mathbf{1}_M^T u_M^D - \left( G^T u_J^D \right)^2. \tag{30}
\end{aligned}
\]

Since u_m^D ≥ 0 and u_j^D ≥ 0, and since (G^T u_J^D)^2 is non-negative, it can be concluded that ε_k^D ≤ 1 for any feasible iterate x^k. This result also supports our earlier definition of the exact KKTPM value for infeasible solutions, as given in Equation 17.

The above-computed ε_k^D value corresponds to the exact ε_k^* value if the second constraint is also satisfied at the above (u_M^D, u_J^D)^T vector. Using Equations 12 and 13, we obtain the following condition for this to be true:

\[
x_{n+1} \ge \sqrt{ \sum_{m=1}^{M} \left( f_m(x^k) - z_m \right)^2 }. \tag{31}
\]

Since x_{n+1} is an independent variable, the minimization of the objective function in Equation 14 will force x_{n+1} to take the smallest positive value; hence, the above relation is satisfied with the equality sign. This simplifies Equation 19 as follows:

\[
\epsilon_k \ge -G^T u_J. \tag{32}
\]

Therefore, ε_k^D = ε_k^* happens only when the following condition is true at the (u_M^D, u_J^D)^T vector:

\[
1 - \mathbf{1}_M^T u_M^D - \left( G^T u_J^D \right)^2 \ge -G^T u_J^D, \tag{33}
\]
or,
\[
\mathbf{1}_M^T u_M^D - \left( G^T u_J^D \right) \left( 1 - G^T u_J^D \right) \le 1. \tag{34}
\]

Here is the summary of our approach so far. After finding the feasible solution of the linearsystem of equation 27 with non-negativity restriction of uD-vector, it can be used to check theabove condition. If the condition is satisfied, εDk is identical to the exact KKTPM value (ε∗k ).If this happens, we are then able to avoid a complete optimization run to compute ε∗k and havemanaged to find a computationally faster method of computing it. This scenario is illustrated witha sketch in Figure 5 with a single Lagrange multiplier u j. The first constraint is quadratic to u j

(equation 18) and the second constraint is a linear function of u j (equation 32). In a generic case11

Page 12: A Computationally Fast and Approximate Method for Karush

Figure 5: First constraint governs at iterate $x^k$, which is not a KKT point (A) and a KKT point (B).

Figure 6: Second constraint governs at iterate $x^k$, which is not a KKT point.

satisfying condition 34, the second constraint becomes redundant in the KKTPM optimization process having $\epsilon_k \geq 0$, and the use of the first constraint alone is adequate to find the optimal $\epsilon^*_k$. Also, since only a set of linear equations needs to be solved, the computation of $\epsilon^*_k$ becomes faster.
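As a concrete sketch of this check, condition 34 can be evaluated directly from a candidate multiplier vector. The snippet below is our own Python/NumPy rendering (function name and array layout are our choices, not code from the paper); the second call uses the values of the single-variable example discussed later in Section 6.1.1, for which the left side is about 1.197.

```python
import numpy as np

def direct_condition_holds(u_m, u_j, G):
    """Check condition (34): 1^T u_M - (G^T u_J)(1 - G^T u_J) <= 1.

    u_m : multipliers for the M objectives (the u^D_M part)
    u_j : multipliers for the J constraints (the u^D_J part)
    G   : constraint values g_j(x^k) at the iterate
    """
    s = float(np.sum(u_m))   # 1^T_{M x 1} u^D_M
    t = float(G @ u_j)       # G^T u^D_J
    return s - t * (1.0 - t) <= 1.0

# At a KKT point, G^T u_J = 0 and sum(u_M) = 1, so the condition holds
# with equality; the second case violates it (left side about 1.197).
print(direct_condition_holds(np.array([1.0]), np.array([1.0]), np.array([0.0])))      # -> True
print(direct_condition_holds(np.array([0.556]), np.array([0.889]), np.array([-0.5]))) # -> False
```

When the check returns True, the direct value $\epsilon^D_k$ can be used as the exact KKTPM without launching any optimization run.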

An interesting scenario to consider is the case when the iterate $x^k$ is a KKT point (or an optimal point). In this case, the second constraint passes through the minimum point of the first constraint function, and condition 34 becomes

$$\mathbf{1}^T_{M\times 1} u^D_M - \left(G^T u^D_J\right)\left(1 - G^T u^D_J\right) = 1. \quad (35)$$

At the KKT or an optimal point, the complementary slackness condition is satisfied, making $G^T u^D_J = 0$. With this condition, equations 25 and 26 yield $\mathbf{1}^T_{M\times 1} u^D_M = 1$. These two conditions satisfy the above identity, thereby meaning that at an optimal or a KKT point, the identity in equation 35 is always true.

The above brings out an interesting property of a KKT or an optimal point using our derivation in this paper: the respective Lagrange multipliers can be scaled so that the sum of the Lagrange multipliers $u_m$ for the objective functions is one. Although the original KKT conditions for an optimal or KKT point $x^k$ may not make $\sum_{k=1}^{M} u_k$ equal to one, our formulation for the KKT proximity measure forces this condition to be true. This can be explained as follows. Let us consider the KKT optimality conditions given in equations 6 to 10 again. Dividing these equations by the positive quantity $\sum_{k=1}^{M} u_k$ and writing the KKT conditions for the normalized Lagrange multipliers $\bar{u}_m = u_m/\sum_{k=1}^{M} u_k$ and $\bar{u}_j = u_j/\sum_{k=1}^{M} u_k$, we have the following



modified KKT conditions:

$$\sum_{m=1}^{M} \bar{u}_m \nabla f_m(x^k) + \sum_{j=1}^{J} \bar{u}_j \nabla g_j(x^k) = 0, \quad (36)$$
$$g_j(x^k) \leq 0, \quad j = 1, 2, \ldots, J, \quad (37)$$
$$\bar{u}_j g_j(x^k) = 0, \quad j = 1, 2, \ldots, J, \quad (38)$$
$$\bar{u}_j \geq 0, \quad j = 1, 2, \ldots, J, \quad (39)$$
$$\bar{u}_m \geq 0, \quad m = 1, 2, \ldots, M, \quad \text{and} \quad \bar{u} \neq 0. \quad (40)$$

Since a KKT or an optimal point $x^k$ must also satisfy the above conditions, the sum of the modified Lagrange multipliers $\bar{u}_m$ can be written as follows:

$$\sum_{m=1}^{M} \bar{u}_m = \sum_{m=1}^{M} \frac{u_m}{\sum_{k=1}^{M} u_k} = 1. \quad (41)$$
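The normalization argument can be verified numerically. The sketch below uses arbitrary illustrative multiplier values (not taken from the paper):

```python
import numpy as np

# Dividing arbitrary KKT multipliers by sum(u_m) yields modified
# multipliers whose objective part sums to exactly one, as in (41).
u_m = np.array([0.5, 1.0, 0.5])   # multipliers for M = 3 objectives (illustrative)
u_j = np.array([0.5, 0.0])        # multipliers for J = 2 constraints (illustrative)

scale = u_m.sum()                 # sum_{k=1}^{M} u_k, assumed positive
u_m_bar = u_m / scale             # normalized objective multipliers
u_j_bar = u_j / scale             # constraint multipliers are scaled the same way

print(u_m_bar.sum())              # -> 1.0
```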

Let us now return to a more generic scenario in which condition 34 is not satisfied for an iterate $x^k$. In this scenario, $\epsilon^*_k \neq \epsilon^D_k$ and, in fact, $\epsilon^*_k > \epsilon^D_k$. The scenario is illustrated in Figure 6. The second constraint (line) now dictates the location of the optimal $\epsilon_k$, which is marked as the point 'O' in the figure. The feasible region due to both constraints is shown shaded. The minimum $\epsilon_k$ occurs at the point 'O'. The only way to find this point exactly is to solve the optimization problem given in equation 14, which can be computationally demanding. Note that for a single $u_j$, the intersection of the quadratic and linear constraint functions occurs at only two points, as shown in the figure, and choosing the intersection point with the smaller $\epsilon_k$ value will ensure finding $\epsilon^*_k$. However, for more than one $u_j$ element, the intersection between the quadratic and linear constraint functions produces an ellipsoidal intersecting surface with infinitely many points. The use of a minimization method to choose the minimum $\epsilon_k$ value among all intersection points will produce $\epsilon^*_k$; hence the need for an optimization procedure.

Note that this optimum $\epsilon^*_k$ occurs at the intersection of a quadratic function and a linear function. Thus, a quadratic programming (QP) method cannot be used; rather, a sequential quadratic programming (SQP) method must be used. This is why, in the original study [11], we used Matlab's fmincon routine, in which SQP is an option. There is no computationally simpler way to avoid the optimization run to compute $\epsilon^*_k$ for this scenario. However, we can find approximate values of $\epsilon^*_k$ using computationally faster approaches, which we discuss in the next few subsections.

4.2. Adjusted KKTPM Computation Method

From the direct solution $u^D$ and corresponding $\epsilon^D_k$ (point 'D' in Figure 6), we compute an adjusted point 'A' (marked in the figure) by simply computing the $\epsilon_k$ value from the second constraint boundary at $u = u^D$, as follows:

$$\epsilon^{Adj}_k = -G^T u^D_J. \quad (42)$$

From the geometry, it is easy to realize that this adjusted KKTPM value, $\epsilon^{Adj}_k$, will always overestimate the optimal $\epsilon^*_k$. Moreover, since $\epsilon^D_k$ is the minimum of the quadratic constraint, the following relationship is always true:

$$\epsilon^D_k \leq \epsilon^*_k \leq \epsilon^{Adj}_k. \quad (43)$$



4.3. Projected KKTPM Computation Method

Next, we consider another approximation method using $u^D_J$. This time, we make a projection from the direct solution (point 'D') $(u^D, \epsilon^D_k)$ onto the second constraint boundary. We then obtain the projected point 'P' on the second constraint. Let us say the projected point has the coordinate $(u^P, \epsilon^P_k)$. The linear (second) constraint boundary can be written as $\epsilon^P_k + G^T u^P_J = 0$. The perpendicular distance, $d_{PD}$, from point 'D' to the linear constraint boundary is given as

$$d_{PD} = \frac{\epsilon^D_k + G^T u^D_J}{\sqrt{1 + G^T G}}.$$

The unit vector along the line PD is $(-1, G^T)^T/\sqrt{1 + G^T G}$. Writing $\vec{PD}$ in two different ways and equating them, we have

$$\begin{pmatrix} \epsilon^P_k - \epsilon^D_k \\ u^P_J - u^D_J \end{pmatrix} = \frac{d_{PD}}{\sqrt{1 + G^T G}} \begin{pmatrix} -1 \\ G \end{pmatrix} = \frac{\epsilon^D_k + G^T u^D_J}{1 + G^T G} \begin{pmatrix} -1 \\ G \end{pmatrix}.$$

From the first element of the above vector equation, we estimate the projected KKTPM value as follows:

$$\epsilon^P_k = \epsilon^D_k - \frac{\epsilon^D_k + G^T u^D_J}{1 + G^T G} = \frac{G^T\left(\epsilon^D_k G - u^D_J\right)}{1 + G^T G}. \quad (44)$$

While $\epsilon^P_k$ is always smaller than $\epsilon^{Adj}_k$, $\epsilon^P_k$ can be larger or smaller than $\epsilon^*_k$, depending on the shape and location of the two constraints. The projected KKTPM value can thus be either smaller or larger than the exact KKTPM value, but it is always guaranteed to lie within the $\epsilon^D_k$ and $\epsilon^{Adj}_k$ values.

4.4. Estimated KKTPM Computation Method

Due to the uncertainty in the range of the bracketing $\epsilon_k$ values ($\epsilon^D_k$ and $\epsilon^{Adj}_k$) for the optimal $\epsilon^*_k$, and due to the fact that $\epsilon^P_k$ always lies within these two bounds, we suggest taking the average of the three approximated $\epsilon_k$ values as our proposed estimated KKTPM value:

$$\epsilon^{est}_k = \frac{1}{3}\left(\epsilon^D_k + \epsilon^P_k + \epsilon^{Adj}_k\right). \quad (45)$$

Due to the above properties, this measure can be regarded as a more reliable approximation of the exact KKTPM value than the other three approximate KKTPM values.
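Equation 45 amounts to a one-line computation; the sketch below uses arbitrary illustrative inputs (not from the paper) satisfying the bracketing order $\epsilon^D_k \leq \epsilon^P_k \leq \epsilon^{Adj}_k$:

```python
def estimated_kktpm(eps_d, eps_p, eps_adj):
    """Equation (45): the mean of the three approximated KKTPM values."""
    return (eps_d + eps_p + eps_adj) / 3.0

# Arbitrary illustrative values with eps^D <= eps^P <= eps^Adj:
print(round(estimated_kktpm(0.2, 0.3, 0.4), 4))   # -> 0.3
```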

It is intuitive to realize from Figure 5 that for a KKT point or an optimal iterate $x^k$, the following condition is always true:

$$\epsilon^D_k = \epsilon^{Adj}_k = \epsilon^P_k = \epsilon^{est}_k = \epsilon^*_k = 0. \quad (46)$$



5. KKT Proximity Measure with Augmented ASF

The ASF procedure calculated from an ideal or a utopian point may result in a weak Pareto-optimal solution [13]. To avoid this, an augmented version of the ASF was proposed [17]:

$$\text{Minimize}_{(x)} \quad \text{ASF}(x, z, w) = \max_{i=1}^{M}\left(\frac{f_i(x) - z_i}{w_i}\right) + \rho \sum_{i=1}^{M}\left(\frac{f_i(x) - z_i}{w_i}\right),$$
$$\text{subject to} \quad g_j(x) \leq 0, \quad j = 1, 2, \ldots, J. \quad (47)$$

Here, the parameter $\rho$ takes a small positive value ($\sim 10^{-4}$). The approximate KKT proximity measure computation procedures ($\epsilon^D_k$ from equation 30, $\epsilon^{Adj}_k$ from equation 42, $\epsilon^P_k$ from equation 44, and $\epsilon^{est}_k$ from equation 45) can easily be extended using the above augmented ASF formulation. For brevity, we do not show the calculations here. For all simulations in this paper, we have used the augmented ASF function so that weak Pareto-optimal solutions have a non-zero KKTPM value.

6. Results

In this section, we consider single-objective, multi-objective and many-objective test problems to demonstrate the working of the proposed approximate KKT proximity measures for solutions obtained using different evolutionary optimization algorithms. Single-objective problems are solved using the elite-preserving real-coded genetic algorithm [24] with $p_c = 0.9$ and $\eta_c = 0$ and the polynomial mutation operator [13] with $p_m = 1/n$ (where $n$ is the number of variables) and $\eta_m = 20$. A C programming code is available from the website http://www.egr.msu.edu/~kdeb/codes.shtml. For bi-objective problems, we use NSGA-II [18], and for three- or many-objective problems we use the recently proposed NSGA-III procedure [21, 25]. For problems solved using NSGA-II and NSGA-III, we use the simulated binary crossover (SBX) operator [24] with $p_c = 0.9$ and $\eta_c = 30$, and the polynomial mutation operator [13] with $p_m = 1/n$ (where $n$ is the number of variables) and $\eta_m = 20$.
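For readers unfamiliar with the SBX operator used above, a minimal single-variable sketch of the standard SBX recurrence is shown below. This is our own Python rendering, not the paper's code; real NSGA-II/NSGA-III implementations additionally handle variable bounds and apply the operator per variable with some probability.

```python
import numpy as np

rng = np.random.default_rng(1)

def sbx_pair(p1, p2, eta_c=30.0):
    """Simulated binary crossover (SBX) for one parent pair, one variable.

    The spread factor beta follows the standard polynomial distribution
    controlled by eta_c; larger eta_c keeps children closer to the parents.
    """
    u = rng.random()
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta_c + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0))
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2

c1, c2 = sbx_pair(1.0, 3.0)
print(round(c1 + c2, 10))   # children preserve the parents' mean -> 4.0
```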

6.1. Single-Objective Optimization Problems

We start with results on a simple problem and then present results on the standard G-series of constrained test problems [26].

6.1.1. A Proof-of-Principle Example Problem

First, we consider a single-variable, single-constraint problem to make the different approximate methods proposed in this study clear:

$$\text{Minimize} \quad f(x) = x^2, \qquad \text{subject to} \quad g(x) = 0.5 - x \leq 0. \quad (48)$$

The optimal solution is $x^* = 0.5$ and, by theory, $\epsilon^*_k = 0$ at this solution. Noting that $\nabla f(x) = 2x$ and $\nabla g(x) = -1$, the first constraint (18) at any iterate $x^{(k)}$ becomes

$$\epsilon_k \geq \left\|u_1 \nabla f(x^{(k)}) + u_2 \nabla g(x^{(k)})\right\|^2 + (1 - u_1)^2, \quad \text{or} \quad \epsilon_k \geq \left(2 u_1 x^{(k)} - u_2\right)^2 + (1 - u_1)^2.$$

The second constraint at iterate $x^{(k)}$ becomes

$$\epsilon_k \geq -u_2\left(0.5 - x^{(k)}\right). \quad (49)$$


Let us now take two $x^{(k)}$ values and investigate the difference between exact and approximated KKTPM values. For $x^{(k)} = 0.5$ (the optimal solution of the above optimization problem), the two constraint surfaces are given as follows:

$$\epsilon_k \geq (u_1 - u_2)^2 + (1 - u_1)^2, \quad (50)$$
$$\epsilon_k \geq 0. \quad (51)$$

Figure 7 shows these two constraint surfaces, with the z-axis indicating $\epsilon_k$. It is clear that the first

Figure 7: Two constraint surfaces are shown for $x^{(k)} = 0.5$ (optimal solution).

Figure 8: Two constraint surfaces are shown for $x^{(k)} = 1.0$ (a non-optimal feasible solution).

constraint value is never smaller than the second constraint value at any point $(u_1, u_2)^T$. Hence, we have a scenario similar to that discussed in Figure 5. Interestingly, the two constraint surfaces meet at $(u_1, u_2)^T = (1, 1)^T$ and the corresponding $\epsilon^*_k = 0$.

First, we use the direct method and formulate the optimization problem given in equation 24 using the first constraint, as follows:

$$\text{Min.}_{(u_1, u_2)} \quad (u_1 - u_2)^2 + (1 - u_1)^2, \qquad \text{subject to} \quad u_1 \geq 0, \ u_2 \geq 0. \quad (52)$$

The solution to the above problem is $u^D_1 = u^D_2 = 1$ with $\epsilon^D_k = 0$. Thus, the direct method makes an accurate estimation of the optimal $\epsilon^*_k$. It is also interesting to note that since $G = (0.5 - 0.5) = 0$ for this solution, equations 42 and 44 estimate $\epsilon^{Adj}_k = \epsilon^P_k = 0$ and $\epsilon^{est}_k = 0$, thereby supporting our conclusion about exact and approximated KKTPM values (equation 46) for optimal solutions.

Next, we consider a feasible but non-optimal solution, $x^{(k)} = 1$. At this point, the solution of the KKT optimality conditions yields the optimal KKTPM value $\epsilon^*_k = 0.3441$. Let us now estimate the different approximate KKTPM values. First, we compute $\epsilon^D_k$ using the direct method. For this purpose, we formulate the following quadratic problem:

$$\text{Min.}_{(u_1, u_2)} \quad \epsilon_k = \left(2 u_1 (1) - u_2\right)^2 + (1 - u_1)^2 + u_2^2 (0.5 - 1)^2, \qquad \text{subject to} \quad u_1 \geq 0, \ u_2 \geq 0. \quad (53)$$

The solution to this problem is $u^D_1 = 0.556$, $u^D_2 = 0.889$, with the corresponding $\epsilon^D_k = 0.2469$, which is smaller than the exact KKTPM value ($\epsilon^*_k = 0.3441$). Figure 8 marks the quadratic constraint



function and the corresponding $\epsilon^D_k$ value. Next, we observe that the second constraint function is as follows:

$$\epsilon_k \geq 0.5 u_2. \quad (54)$$

Figure 8 shows that the $\epsilon^D_k$ surface lies below this linear constraint surface at $u^D_j$. Substituting the $u^D_m$ and $u^D_j$ values found above into condition 34, we find that the left-side value is 1.197, which is greater than 1, thereby violating the condition. Since condition 34 is violated, $\epsilon^D_k \neq \epsilon^*_k$.

Using equations 42, 44 and 45, we then obtain the following estimations of the KKTPM:

$$\epsilon^{Adj}_k = 0.4444, \quad \epsilon^P_k = 0.4049, \quad \epsilon^{est}_k = 0.3654.$$

Note that the exact KKTPM (0.3441) is bounded within the direct KKTPM ($\epsilon^D_k = 0.2469$) and the adjusted KKTPM ($\epsilon^{Adj}_k = 0.4444$). Moreover, due to the use of $\epsilon^P_k$ in the averaging process, the estimated KKTPM value of 0.3654 is closer to the exact KKTPM value of 0.3441 than the two bracketing KKTPM values. This example clearly illustrates the relationship of each approximated KKTPM value with the exact KKTPM value.

6.1.2. G-Series Constrained Problems

Now we are ready to present results of our proposed approximation methods, compared with exact KKTPM values, on standard single-objective constrained optimization problems, the g-functions [26]. We apply an elitist RGA [24] and record the population-best solution $x^{(k)}$ at every generation $k$. Thereafter, we compute exact and approximate KKTPM values and compare them.

Figure 9 shows results for the g01 problem from [26], which has 35 constraints and 13 variables. We run an elite-preserving real-coded genetic algorithm (RGA) [27, 28] for 200 generations with a population size of 100. The KKT proximity measure of the population-best solution is calculated using the optimal (shown with a solid line), direct, projected, adjusted and estimated (shown with upside-down triangles) methods discussed above. The proposed estimated values are close to the exact KKTPM values. It can also be observed that $\epsilon^*_k$ is always bounded by two approximated KKTPM values: $\epsilon^D_k$ (pink line) and $\epsilon^{Adj}_k$ (green line). For this problem, the RGA is able to find the true optimum $(1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 3, 1)^T$ with $f^* = 0$ at generation 94. The exact KKTPM value $\epsilon^*_k$ is found to be exactly zero. The inset figure shows this fact better in a semi-log plot. At this generation, all approximated KKTPM values are also zero.

The population-best solution for the above run is infeasible until generation 10; then, in generation 11, the first feasible solution is found. Notice how all KKTPM measures, including the approximated ones, have an identical and large value (more than one) until generation 10; thereafter, when feasible solutions are found, the approximated KKTPM values differ from the exact KKTPM value. However, the estimated KKTPM matches the exact KKTPM value well. Importantly, the relative change in exact KKTPM values over generations is in agreement with the changes in the estimated KKTPM values. We compute the correlation coefficient between $\epsilon^*_k$ and $\epsilon^{est}_k$ and observe a value of 0.9897, meaning that there is a very strong positive correlation between the changes in the two sets of KKTPM values.

Next, we solve the g02 problem using the elitist RGA with a population size of 100 and 200 generations. There are 20 variables and the optimal function value is $f^* = -0.80362$. The RGA improves its best solution slowly towards the end of the run and after 200 generations finds its best solution having $f = -0.754629$, indicating that the RGA is unable to find the true optimal solution for this problem. Figure 10 shows the variation of exact and approximate KKTPM values of the population-best solution with generations. To show that the true optimal solution



Figure 9: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g01.

Figure 10: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g02.

has a zero KKTPM value, we replace the 200-th generation RGA solution with the true optimal solution. The figure shows that all the KKTPM values drop to zero at the 200-th generation. In this problem, the projected KKTPM approximation is closer to the exact KKTPM values. The correlation coefficient between $\epsilon^*_k$ and $\epsilon^{est}_k$ is found to be 0.9760, meaning a strong positive correlation in their changing pattern with generations.

Next, we consider the g04 problem, which has five variables and six constraints besides 10 variable bounds. Figure 11 shows that the elitist RGA with a population size of 2,000, run for 500 generations, is able to quickly reduce the population-best function value close to the known optimum ($f^* = -30665.539$), but gets stuck at a near-optimal solution having $f = -30665.506$ at the 71-st generation. The exact KKTPM value for this solution is 0.0271. However, when the optimal solution is computed for its KKTPM value, it is found to be exactly zero. The estimated KKTPM ($\epsilon^{est}_k$) value at the final RGA solution is 0.0108. All approximated KKTPM values are found to be close to the exact KKTPM values. It is also interesting to note that the optimal KKTPM value is always bounded by the $\epsilon^D_k$ and $\epsilon^{Adj}_k$ values. The correlation coefficient between $\epsilon^*_k$ and $\epsilon^{Adj}_k$ is found to be 0.9976.

The problem g06 contains two variables and six constraints, including four variable bounds which are treated as constraints. The elitist RGA is applied with 1,000 population members and is run for 2,000 generations. The reason for these large numbers is that we wanted to achieve high accuracy for this problem. The optimal solution of this problem has a function value of $f^* = -6961.813876$. Figure 12 shows the variation of the different KKTPM values with generation. The elitist RGA finds a solution with $f = -6961.014$ in 674 generations (having an exact KKTPM value of 0.395626) and a solution with $f = -6961.813655$ at the 1,898-th generation, having an exact KKTPM value of 0.000059. The corresponding estimated KKTPM values at these two solutions are 0.246461 and 0.000073, respectively. The correlation coefficient between $\epsilon^*_k$ and $\epsilon^{Adj}_k$ is found to be 0.9914.

Problem g07 is a 10-variable, 28-constraint problem. The elitist RGA is run with 1,000 population members for 1,000 generations. The optimal solution of this problem has a function value $f^* = 24.3062$. Figure 13 shows the variation of the different KKTPM values with generation. The elitist RGA converges to a near-optimal solution having $f = 24.3612$. This solution



Figure 11: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g04.

Figure 12: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g06.

has an exact KKTPM value of 0.0661. The corresponding estimated KKTPM value is 0.0857. The correlation coefficient between $\epsilon^*_k$ and $\epsilon^{Adj}_k$ is found to be 0.9931.

The problem g08 has two variables and 10 constraints, including variable bounds. The elitist RGA is applied to this problem using 20 population members for 100 generations. Figure 14 shows the variation of the different KKTPM values with generation. The elitist RGA finds the exact optimum (having $f^* = -0.095825$) at generation 48. All KKTPM values at this generation are zero. The correlation coefficient between the $\epsilon^*_k$ and $\epsilon^{Adj}_k$ values from all generations of an RGA run is found to be 0.9985, as is also evident from the closeness of these two KKTPM values in the figure.

Figure 13: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g07.

Figure 14: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g08.

The next problem, g09, is a seven-variable, 18-constraint problem. The elitist RGA is applied using 1,000 population members and is run for 500 generations. The optimal function value is $f^* = 680.630057$. As seen from Figure 15, the elitist RGA converges gradually near this optimal



solution with generations, but cannot find the true optimal solution within 499 generations. It reaches a solution having $f = 680.633284$, which has an exact KKTPM value of 0.090192, whereas the optimal solution has a KKTPM value of zero. Figure 15 adds the known optimal solution at the 500-th generation to show this fact. Interestingly, for this problem, all approximated KKTPM values are almost identical to each other and to the exact KKTPM value. This can also be explained by the fact that the correlation coefficient of $\epsilon^*_k$ and $\epsilon^{Adj}_k$ is found to be one.

Figure 15: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g09.

Figure 16: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g10.

The next problem, g10, is a challenging eight-variable, 22-constraint problem. The optimal function value for this problem is $f^* = 7049.248$. There is a scaling issue among the constraint functions of this problem, which makes it difficult to solve to optimality. As shown in Figure 16, the elitist RGA with 200 population members fails to converge near the optimum solution. The KKTPM values also stay away from zero even after 500 generations. The best RGA solution has a function value $f = 7848.562$ (11.3% worse than the optimal function value). The exact KKTPM value at this point is 0.499358, as shown in the figure, whereas when the optimal solution is deliberately added at the 500-th generation for KKTPM computation, its value is calculated to be zero. The corresponding estimated KKTPM value at the best RGA solution is 0.499358, which is very close to the exact KKTPM value. For this problem as well, all approximated KKTPM values are very close to each other. It is not surprising that the correlation coefficient between $\epsilon^*_k$ and $\epsilon^{Adj}_k$ is found to be one.

Problem g18 is a nine-variable, 31-constraint problem. The optimal function value is $f^* = -0.866025$. The elitist RGA is run with 100 population members. Figure 17 shows that g18 needs 30 generations to find a feasible solution. At the 31-st generation, the best RGA solution has a function value $f = -0.366352$ with an exact KKTPM value of 0.072097. At generation 199, the population-best near-optimal solution with $f = -0.861109$ has a KKTPM value of 0.003453. The corresponding estimated KKTPM values are 0.064172 and 0.017034, respectively. The correlation coefficient between $\epsilon^*_k$ and $\epsilon^{Adj}_k$ is found to be 0.9255.

The final problem, g24, is a two-variable, six-constraint problem. The optimal solution has a function value $f^* = -5.508013$. The elitist RGA is applied with a population size of 20 and is run for 100 generations. Figure 18 shows that the KKT proximity measure reduces with generations and quickly reaches a near-optimal solution. The best RGA solution after 100 generations has a



Figure 17: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g18.

Figure 18: Generation-wise KKT proximity measure for the population-best RGA solution for Problem g24.

function value of $f = -5.500590$ with an exact KKTPM value of 0.006491. The corresponding estimated KKTPM value is 0.002637. The $\epsilon^*_k$ and $\epsilon^{Adj}_k$ values follow a similar pattern, and the correlation coefficient between them for all generation-wise best solutions is found to be 0.9925, meaning a strong positive correlation between them.

6.1.3. Computational Time

The extensive results on the above constrained problems indicate that the correlation between the exact KKTPM and the proposed estimated KKTPM is quite strong and close to one. This means that the patterns of variation of these two KKTPM quantities are similar to each other, indicating that the estimated KKTPM can be used to indicate the level of convergence of the population-best solution towards the theoretical optimal solution, without knowing the true optimal solution.

As mentioned earlier, the purpose of computing the approximated KKTPM, instead of the exact KKTPM, is the computational advantage, which we demonstrate in Table 1. The computational time needed to solve each of the G-series constrained problems, with the exact and approximated KKTPM values computed at every generation, is presented in seconds of CPU time. It is clear from the table that the optimization-based KKTPM computation takes two to three orders of magnitude more time than the approximated methods. The results in the above figures and the computational times shown in this table together indicate that the proposed estimated KKTPM procedure computes a near-exact KKTPM value with a fraction of the computational effort. The final column indicates the computational advantage of the proposed estimated KKTPM over the exact KKTPM computation time. As the table shows, the computational advantage varies from about 31 to 781. This is encouraging for the use of the KKTPM in modifying evolutionary or classical optimization methods. While we do not address such possible modifications in this paper, in the next subsection we present results on multi-objective optimization problems.

6.2. Two-Objective ZDT Problems

First, we consider the commonly used two-objective ZDT problems [29]. ZDT1 has 30 variables and the efficient solutions occur for $x_i = 0$ for $i = 2, 3, \ldots, 30$. NSGA-II is applied with


Table 1: Computational time (in seconds) for a complete optimization run using the optimization, direct, projected, adjusted and estimated methods on single-objective test problems.

Problem | Opt. Method KKTPM | Direct KKTPM | Projected KKTPM | Adjusted KKTPM | Estimated KKTPM | Gain
--- | --- | --- | --- | --- | --- | ---
g01 | 96.88 | 0.93 | 0.95 | 0.89 | 1.06 | 91.74
g02 | 69.83 | 0.80 | 0.79 | 0.83 | 0.83 | 83.73
g04 | 227.09 | 0.79 | 0.80 | 0.82 | 0.81 | 280.36
g06 | 497.94 | 2.38 | 2.38 | 2.38 | 2.39 | 208.52
g07 | 491.18 | 2.21 | 2.18 | 2.18 | 2.23 | 220.06
g08 | 5.44 | 0.17 | 0.17 | 0.17 | 0.17 | 31.24
g09 | 203.45 | 0.85 | 0.85 | 0.89 | 0.93 | 218.76
g10 | 877.15 | 1.09 | 1.09 | 1.09 | 1.12 | 781.77
g18 | 38.41 | 0.43 | 0.47 | 0.50 | 0.51 | 75.31
g24 | 9.32 | 0.16 | 0.16 | 0.16 | 0.17 | 55.46

a population of size 40 for 200 generations. We then compute the exact and approximate KKT proximity measures with the AASF formulation for each non-dominated solution at every generation. Thereafter, we record the median KKTPM value of all non-dominated solutions at each generation and show its variation with the generation number in Figure 19. This metric indicates the level of convergence of at least 50% of the non-dominated solutions in a generation. First, it is

Figure 19: Generation-wise median KKT proximity measure computed for non-dominated solutions (except $x_1 = 0$ solutions) of NSGA-II populations for Problem ZDT1.

Figure 20: Generation-wise median KKT proximity measure computed for non-dominated solutions of NSGA-II populations for Problem ZDT2.

interesting to observe how all KKT proximity measures reduce with generation, which means that NSGA-II's non-dominated solutions move towards the true Pareto-optimal front. The direct KKTPM value ($\epsilon^D_k$) is much smaller than the exact KKTPM value ($\epsilon^*_k$). This indicates that the second constraint (equation 19) is critical and that the condition given in equation 34 is not satisfied for most points. A comparison of the optimal KKTPM with the approximated KKTPM indicates a



similar change in their values. To show that the convergence is fast, we plot the KKTPM values on a log scale. Since the drop is almost linear, it indicates an exponentially fast convergence. The correlation coefficient between $\epsilon^*_k$ and the proposed estimated $\epsilon^{est}_k$ is found to be 0.9608, meaning a strong positive correlation. As in the single-objective case, the exact KKTPM is always bracketed by $\epsilon^D_k$ and $\epsilon^{Adj}_k$.

Next, we consider the 30-variable ZDT2 problem, which has a non-convex efficient front. NSGA-II with a population of size 40 is run for a maximum of 200 generations to record generation-wise non-dominated solutions. Figure 20 shows the variation of all KKT proximity measures with the generation number. Our proposed estimated KKTPM is mostly smaller than the respective exact KKTPM value, but their changing pattern is similar, as indicated by a high correlation coefficient of 0.9763 between these two KKTPM values. A fast convergence to the near-Pareto-optimal front is evident from this plot. Since the median KKTPM values of all non-dominated solutions are plotted, some non-dominated solutions with much smaller KKTPM values are also expected to exist in the population.

Figure 21 shows the median KKTPM value variation with generation for the 30-variable ZDT3 problem. NSGA-II is run with 40 population members for 200 generations. Again, a

Figure 21: Generation-wise median KKT proximity measure computed for non-dominated solutions of NSGA-II populations for Problem ZDT3.

Figure 22: Generation-wise median KKT proximity measure computed for non-dominated solutions (except $x_1 = 0$ solutions) of NSGA-II populations for Problem ZDT4.

faster convergence near the true Pareto-optimal front is observed for this problem. The exact KKTPM and estimated KKTPM values are close to each other and have a similar pattern of variation (with a correlation coefficient of 0.9834).

The 30-variable ZDT4 problem is multimodal, has 99 local efficient fronts, and is more difficult to solve than ZDT1, ZDT2 and ZDT3 [29]. Using identical NSGA-II parameter values with a population size of 48, Figure 22 shows the generation-wise variation of the KKTPM for the non-dominated population members of a typical NSGA-II run. Due to the convergence difficulties in this problem, the drop in the KKTPM value is not exponential. However, the pattern of variation of the different approximated KKTPM quantities is similar to that of the exact KKTPM. The correlation coefficient between $\epsilon^*_k$ and $\epsilon^{est}_k$ is 0.9954, indicating a strong positive correlation.

Finally, the median KKT proximity measure values for NSGA-II non-dominated members


(with a population size of 40) for the ZDT6 problem are shown in Figure 23. The convergence is

Figure 23: Generation-wise median KKT proximity measure computed for non-dominated solutions of NSGA-II populations for Problem ZDT6.

fast, and the adjusted KKTPM matches well with the exact KKTPM value; the correlation coefficient between ε∗_k and the proposed ε^est_k is 0.9241, which indicates a definite positive correlation.

To summarize, the proposed estimated KKTPM value is found to follow a similar pattern of variation with generation number as the exact KKTPM value on the ZDT problems. Interestingly, such well-correlated KKTPM values come at a small computational cost, as shown in Table 2, thereby making our proposed estimated KKTPM procedure worthy of use within an optimization algorithm. It can be observed that the estimated KKTPM computation is more than two orders of magnitude faster than the exact KKTPM computation. We discuss this aspect in more detail in Section 6.7.
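The correlation coefficients quoted throughout this section are ordinary Pearson coefficients between the exact and estimated KKTPM series collected over generations. As a minimal sketch (our own illustration, not the authors' code), such a coefficient can be computed as:

```python
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length
    series, e.g. exact vs. estimated median KKTPM values per generation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A value near one, as observed on the ZDT problems, means the estimated KKTPM preserves the relative ordering of the exact values even when the two quantities differ in magnitude.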

6.3. Three-objective DTLZ Problems

The DTLZ1 problem has a number of locally efficient fronts on which some points can get stuck; hence it is relatively difficult to solve this problem to global optimality. The recently proposed NSGA-III procedure [21] is applied to DTLZ1 with 92 population members for 1,000 generations. Figure 24 shows the variation of the median KKT proximity measure values (with the AASF formulation) versus the generation counter. Due to the difficulties posed by this problem, we notice from Figure 24 that the median KKTPM value stays close to one until around the first 300 generations. It then starts to reduce slowly in steps. Even after 1,000 generations, the median KKTPM value is 0.0269. The presence of multiple local efficient fronts hampers a steady and global convergence in this problem. However, Figure 25 shows that the NSGA-III non-dominated points reach the true Pareto-optimal front (f1 + f2 + f3 = 0.5) at generation 1,000, and the 92 points are well distributed over the entire efficient front. The correlation coefficient between ε∗_k and ε^est_k is found to be 0.9997, which indicates a strong positive correlation between the two measures.


Figure 24: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for three-objective DTLZ1.

Figure 25: NSGA-III non-dominated population members at generation 1,000 for the three-objective DTLZ1 problem.

Next, we apply NSGA-III with a population size of 92 to the three-objective DTLZ2 problem, which has a concave efficient front and is relatively easier to solve than DTLZ1. NSGA-III was run for a maximum of 400 generations. Figure 26 shows the median values of the different KKT proximity measures. Figure 27 shows that the NSGA-III points at the end of a run are very close

Figure 26: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for three-objective DTLZ2.

Figure 27: NSGA-III non-dominated population members at generation 400 for the three-objective DTLZ2 problem.

to the true Pareto-optimal front (shown with a wire-mesh quadratic surface), thereby visually confirming the convergence and diversity of NSGA-III points. A comparison of this plot with that for DTLZ1 makes it clear that DTLZ2 is a relatively easier problem to solve: the convergence behavior is smoother and steadier. The different KKTPM measures have different ranges of values, but their variation patterns are similar. The correlation coefficient between the exact ε∗_k and proposed ε^est_k is found to be 0.9901. In both these problems, the second constraint of the KKTPM computation is critical, and hence the direct KKTPM values differ from the exact KKTPM values.


The DTLZ5 problem has a degenerate efficient front: although the problem has three objectives, the efficient front is two-dimensional. Figure 28 shows the median KKT proximity measure values versus generation number obtained using NSGA-III with a population size of 92. A similar dynamic variation of the different KKTPM values is observed. The correlation coefficient between the exact ε∗_k and the estimated ε^est_k is found to be 0.9900. As always, the exact KKTPM values are bracketed by ε^D_k and ε^Adj_k.

Figure 28: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for three-objective DTLZ5.

6.4. Many-Objective DTLZ Problems

As the number of objectives increases, EMO algorithms have reportedly found the DTLZ problems difficult to optimize [21], as is also evident from Figures 29 and 30 for the five-objective and 10-objective DTLZ1 problems. For the five and 10-objective versions of DTLZ1 and DTLZ2, we have used 212 and 276 population members, respectively, to obtain a reasonable number of diverse points [21]. The increased population size allows NSGA-III to work relatively better than on the three-objective DTLZ1 problem with 92 population members. NSGA-III performs in a consistent manner, and KKTPM values reduce with generations. The estimated KKTPM is correlated with the exact (but computationally expensive) KKTPM values with correlation coefficients of 0.9991 and 0.9994 on the five and 10-objective DTLZ1 problems, respectively. Figures 31, 32 and 33 show the parallel coordinate plots (PCP) for NSGA-III's non-dominated population members at generations 30, 100 and 2,000, respectively, on the 10-objective DTLZ1 problem. For this problem, the values of all 10 objective functions for Pareto-optimal solutions are expected to lie in [0.0, 0.5]. The figures show that at generation 30 the non-dominated solutions do not come close to this range, but with generations they progress towards it. At generation 100 they come close, although the distribution of objective values is non-uniform. At generation 2,000, the convergence and distribution are best. This fact is reflected in the KKTPM values: the estimated median KKTPM values at the respective generations are found to be 0.9738, 0.0975, and 0.0002. It is amply clear that the accuracy of convergence and the corresponding estimated KKTPM values are correlated.
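The progress visible in the PCP plots can also be quantified directly: since every objective value of a DTLZ1 Pareto-optimal solution lies in [0.0, 0.5], one simple diagnostic (our own illustration, applied to hypothetical population data) is the fraction of the population falling inside that box:

```python
def fraction_in_box(population, lo=0.0, hi=0.5):
    """Fraction of objective vectors whose every component lies in [lo, hi].
    `population` is a list of objective vectors (hypothetical data)."""
    inside = sum(1 for f in population if all(lo <= v <= hi for v in f))
    return inside / len(population)
```

This fraction rises toward one over the same generations in which the estimated median KKTPM falls, which is consistent with the correlation reported above.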


Figure 29: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for five-objective DTLZ1.

Figure 30: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for 10-objective DTLZ1.

Since the DTLZ2 problem is relatively easier to solve to Pareto-optimality, the median KKT proximity measure values for the five and 10-objective DTLZ2 problems steadily approach zero, as evident from Figures 34 and 35, respectively. Identical parameter settings to those used for the DTLZ1 problems are used here. The similarity in the dynamics of the exact ε∗_k and estimated ε^est_k is evident, with correlation coefficients of 0.9963 and 0.9532 for the five and 10-objective DTLZ2 problems, respectively.

Table 2 presents the computational time of NSGA-III runs with the exact KKTPM and with each of the approximated KKTPMs. Since a large positive correlation between the optimal and estimated KKTPM values is found, the use of a faster but approximate KKTPM computation (such as the proposed estimated KKTPM) is promising for further development of and modification to existing EMO algorithms.

6.5. Constrained Test Problems

The above multi- and many-objective optimization problems involved only variable bounds as constraints. In this section, we consider further problems with non-linear inequality constraints. First, we consider the problem TNK, which has two inequality constraints, two variables and two objectives, as shown below:

Min. f1(x) = x1,
Min. f2(x) = x2,
s.t. g1(x) ≡ x1^2 + x2^2 − 1 − 0.1 cos(16 arctan(x1/x2)) ≥ 0,
     g2(x) ≡ (x1 − 0.5)^2 + (x2 − 0.5)^2 ≤ 0.5,
     0 ≤ x1 ≤ π, 0 ≤ x2 ≤ π.          (55)
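For reference, the TNK objectives and constraints are straightforward to transcribe. The sketch below (our own illustration, not the authors' code) normalizes both constraints so that a non-negative value means feasible, and uses atan2 so the arctangent remains defined at x2 = 0:

```python
import math

def tnk(x1, x2):
    """TNK objectives and constraints; g >= 0 indicates feasibility."""
    f1, f2 = x1, x2
    # g1(x) = x1^2 + x2^2 - 1 - 0.1*cos(16*arctan(x1/x2)) >= 0
    g1 = x1**2 + x2**2 - 1.0 - 0.1 * math.cos(16.0 * math.atan2(x1, x2))
    # g2(x) = (x1-0.5)^2 + (x2-0.5)^2 <= 0.5, rewritten as 0.5 - g2(x) >= 0
    g2 = 0.5 - ((x1 - 0.5)**2 + (x2 - 0.5)**2)
    return (f1, f2), (g1, g2)
```

At x = (1, 1), for example, the second constraint is exactly active, which is the kind of boundary situation in which constraint activity matters for the KKTPM computation.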

NSGA-II is run with 40 population members. Figure 36 shows the variation of the median KKTPM values with generations. Interestingly, the variation patterns of all approximated KKTPMs and the exact KKTPM are very similar. The correlation coefficient between the exact ε∗_k and estimated ε^est_k over all generations of an NSGA-II run is 0.9142, indicating a strong correlation between the two KKTPM entities.


Figure 31: PCP plot of NSGA-III's 276 non-dominated members at generation 30. The median value of the estimated KKTPM is 0.9738.

Figure 32: PCP plot of NSGA-III's 276 non-dominated members at generation 100. The median value of the estimated KKTPM is 0.0975.

Figure 33: PCP plot of NSGA-III's 276 non-dominated members at generation 2,000. The median value of the estimated KKTPM is 0.0002.


Figure 34: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for five-objective DTLZ2.

Figure 35: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for 10-objective DTLZ2.

Figure 36: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-II populations for problem TNK.

Figure 37: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for problem BNH.


Next, we consider the problem BNH which has two variables and two constraints:

Minimize f1(x) = 4x1^2 + 4x2^2,
Minimize f2(x) = (x1 − 5)^2 + (x2 − 5)^2,
subject to g1(x) ≡ (x1 − 5)^2 + x2^2 ≤ 25,
           g2(x) ≡ (x1 − 8)^2 + (x2 + 3)^2 ≥ 7.7,
           0 ≤ x1 ≤ 5, 0 ≤ x2 ≤ 3.          (56)

Due to the known difficulty in solving this problem, NSGA-III is run with 200 population members. Figure 37 shows the variation of the median KKT proximity measures with generations for problem BNH. A much smoother convergence pattern is observed for this problem.

Next, we consider the SRN problem, stated below:

Minimize f1(x) = 2 + (x1 − 2)^2 + (x2 − 1)^2,
Minimize f2(x) = 9x1 − (x2 − 1)^2,
subject to g1(x) ≡ x1^2 + x2^2 ≤ 225,
           g2(x) ≡ x1 − 3x2 + 10 ≤ 0,
           −20 ≤ x1 ≤ 20, −20 ≤ x2 ≤ 20.          (57)
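The stated property of the SRN Pareto-optimal points (x1 = −2.5, x2 ∈ [2.50, 14.79]) can be checked numerically with a direct transcription of the problem. The sketch below is our own illustration, with both constraints normalized so that g ≥ 0 denotes feasibility:

```python
def srn(x1, x2):
    """SRN objectives and constraints; g >= 0 indicates feasibility."""
    f1 = 2.0 + (x1 - 2.0)**2 + (x2 - 1.0)**2
    f2 = 9.0 * x1 - (x2 - 1.0)**2
    g1 = 225.0 - (x1**2 + x2**2)      # x1^2 + x2^2 <= 225
    g2 = -(x1 - 3.0 * x2 + 10.0)      # x1 - 3*x2 + 10 <= 0
    return (f1, f2), (g1, g2)
```

At (−2.5, 2.5), the lower end of the stated range, the linear constraint g2 is exactly active, consistent with the Pareto set touching that constraint boundary there.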

The Pareto-optimal points have the following properties: x1 = −2.5 and x2 ∈ [2.50, 14.79]. Figure 38 plots the median KKT proximity measures for generation-wise NSGA-III non-dominated solutions. NSGA-III points come very close to the true Pareto-optimal front quickly and then

Figure 38: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for problem SRN.

Figure 39: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-II populations for problem OSY.

move slowly closer to the front. We also notice that the projected and the direct KKTPM values are identical, which means that the linear (second) constraint is nearly vertical. The correlation coefficient between ε∗_k and ε^est_k over all generations of an NSGA-III run is 0.9997.


Next, we consider problem OSY, stated below:

Min. f1(x) = −[25(x1 − 2)^2 + (x2 − 2)^2 + (x3 − 1)^2 + (x4 − 4)^2 + (x5 − 1)^2],
Min. f2(x) = x1^2 + x2^2 + x3^2 + x4^2 + x5^2 + x6^2,
s.t. g1(x) ≡ −x1 − x2 + 2 ≤ 0,
     g2(x) ≡ x1 + x2 − 6 ≤ 0,
     g3(x) ≡ −x1 + x2 − 2 ≤ 0,
     g4(x) ≡ x1 − 3x2 − 2 ≤ 0,
     g5(x) ≡ (x3 − 3)^2 + x4 − 4 ≤ 0,
     g6(x) ≡ −(x5 − 3)^2 − x6 + 4 ≤ 0,
     0 ≤ x1, x2, x6 ≤ 10, 1 ≤ x3, x5 ≤ 5, 0 ≤ x4 ≤ 6.          (58)
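A direct transcription of OSY (our own sketch, following the standard Osyczka-Kundu formulation) with every constraint normalized so that g ≥ 0 means feasible:

```python
def osy(x):
    """OSY objectives and constraints; every g >= 0 indicates feasibility."""
    x1, x2, x3, x4, x5, x6 = x
    f1 = -(25.0 * (x1 - 2.0)**2 + (x2 - 2.0)**2 + (x3 - 1.0)**2
           + (x4 - 4.0)**2 + (x5 - 1.0)**2)
    f2 = x1**2 + x2**2 + x3**2 + x4**2 + x5**2 + x6**2
    g = (
        x1 + x2 - 2.0,               # x1 + x2 >= 2
        6.0 - x1 - x2,               # x1 + x2 <= 6
        2.0 + x1 - x2,               # x2 - x1 <= 2
        2.0 - x1 + 3.0 * x2,         # x1 - 3*x2 <= 2
        4.0 - (x3 - 3.0)**2 - x4,    # (x3 - 3)^2 + x4 <= 4
        (x5 - 3.0)**2 + x6 - 4.0,    # (x5 - 3)^2 + x6 >= 4
    )
    return (f1, f2), g
```

The point x = (5, 1, 2, 0, 5, 0), for instance, is feasible with three constraints exactly active, which is typical of the constraint-active Pareto regions reported for this problem [13].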

Pareto-optimal solutions for this problem were identified in another study [13]. NSGA-II is run with a population size of 200. Figure 39 shows the different median KKT proximity measures. A steady convergence is observed, and the variation pattern of all KKTPM values is similar. The correlation coefficient between ε∗_k and ε^est_k over all generations of an NSGA-II run is 0.9975.

6.6. Engineering Design Problems

Next, we consider three of the commonly-used real-world multi-objective optimization problems [21, 25].

6.6.1. Welded Beam Design Problem

The welded beam design problem has two objectives and four non-linear constraints [13]. This problem is solved using NSGA-III with 60 population members, initially created at random. Figure 40 shows the variation of the median optimal, direct, projected, adjusted and estimated KKT proximity measure values with generation. After an initial fast convergence pattern until

Figure 40: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for the welded-beam design problem.

Figure 41: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for the car side impact problem.

about 40 generations, NSGA-III slows its convergence towards the optimum, despite an overall downward trend. Such a dynamic performance plot can be useful for modifying an EMO algorithm interactively when a change in the rate of convergence occurs. The estimated and exact KKTPM values show a similar pattern of change, with a correlation coefficient of 0.9981.

6.6.2. Car Side Impact Problem

Next, we consider a three-objective car cab design problem having seven variables, 10 constraints, and 14 variable bounds. NSGA-III [25] is applied with 92 initial population members. Figure 41 shows the variation of the KKT proximity measures with generations. Although the KKTPM values reduce with generations, there are more local undulations of the median KKTPM values in these real-world problems than we have observed in the test problems. The objective and constraint functions in these problems often make the Pareto-optimal front and its neighborhood non-smooth, with periods of infeasible holes and feasible islands. A slight change in a non-dominated solution may make it infeasible or a better solution. Such incidences make the median KKTPM value fluctuate as the algorithm proceeds from one generation to another. Interestingly, although the approximated KKTPM values are expected to differ from their exact (optimal) values, the correlation coefficient is still found to be 0.9929 for this problem. This is remarkable and motivates us to use the approximated values instead of the exact values in the interest of computational time.

6.6.3. WATER Problem

Finally, we consider the five-objective WATER problem having three variables and seven constraints. NSGA-III [25] is applied with 68 initial population members. Figure 42 shows the variation of the median KKT proximity measures with generations. An interesting and previously unknown property of this problem becomes clear: since there are three variables and five objective functions, the efficient front is at most three-dimensional, and most random solutions lie very close to the Pareto-optimal front in this problem. Due to the greater regularity of the search space in this problem, the KKTPM variations are smooth. A similar convergence pattern is evident here as well: after an initial fast convergence, the algorithm slows down in the close vicinity of the Pareto-optimal solutions. The optimal and estimated KKTPM methods have a correlation coefficient of 0.9969.

6.7. Computational Time

Our earlier study [11] computed the KKT proximity measure by solving an optimization problem, which takes considerable time; the estimated KKT proximity measure proposed in this study is much faster. We now present the computational times for solving all the above multi- and many-objective problems in Table 2. For example, on the ZDT1 problem, while the NSGA-II run with exact KKTPM computation takes 62.63 min, NSGA-II with the estimated KKTPM takes only 1.16 min. Previously, Figure 19 showed that the variation patterns of these two KKTPM values are very similar, with a correlation coefficient of 0.9608 (a high positive correlation). The low computational time required by the estimated KKTPM indicates that it can very well be used, instead of the exact KKTPM, to obtain a good trade-off on time without sacrificing the accuracy of the relative KKTPM values. Similar conclusions can be drawn for the other problems as well. As shown in the final column of the table, the time advantage of computing ε^est_k instead of ε∗_k ranges from about 47 times to 478 times, which is a remarkable achievement.
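The "Gain" column of Table 2 is simply the ratio of the exact-KKTPM run time to the estimated-KKTPM run time. Recomputing a few rows from the tabulated (rounded) minutes, as sketched below, reproduces the reported gains up to rounding of the times themselves; for the very short TNK runs the sub-minute times are rounded heavily, so the recomputed ratio differs slightly from the tabulated 153.69:

```python
# (exact, estimated) run times in minutes, taken from Table 2
times = {
    "ZDT1":     (62.63, 1.16),
    "DTLZ2-10": (10870.59, 22.70),
    "TNK":      (11.87, 0.08),
}
# gain = exact-KKTPM time / estimated-KKTPM time
gains = {name: exact / est for name, (exact, est) in times.items()}
```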

6.8. Accuracy of the Proposed Estimated KKTPM

Although the proposed estimated εk computation is about two orders of magnitude faster, we next investigate the accuracy of the estimated εk with respect to the optimal εk. Figure 43


Figure 42: Generation-wise median KKT proximity measures computed for non-dominated solutions of NSGA-III populations for the WATER problem.

Table 2: Computational time (in minutes) for a complete optimization run using the optimization, direct, projected, adjusted and estimated methods on multi- and many-objective problems.

Problem      Opt. Method   Direct   Projected   Adjusted   Estimated    Gain
             KKTPM         KKTPM    KKTPM       KKTPM      KKTPM
ZDT1             62.63      1.12      1.15       1.15       1.16       54.11
ZDT2             53.77      1.03      1.02       1.03       1.06       50.87
ZDT3             56.33      1.13      1.16       1.16       1.19       47.49
ZDT4            126.37      1.51      1.52       1.60       1.58       79.80
ZDT6             30.62      0.20      0.20       0.20       0.21      145.72
DTLZ1-3         214.22      1.15      1.21       1.27       1.21      177.40
DTLZ1-5         790.81      4.13      4.33       4.52       4.77      165.62
DTLZ1-10       4981.90     30.39     30.70      30.95      30.60      162.82
DTLZ2-3         135.24      0.91      0.76       0.81       0.83      162.86
DTLZ2-5        1047.58      3.45      3.49       3.45       3.47      301.74
DTLZ2-10      10870.59     22.73     22.55      23.07      22.70      478.88
DTLZ5-3         199.55      1.34      1.42       1.35       1.28      155.34
BNH             203.60      0.99      0.99       1.00       1.10      184.59
SRN             245.58      0.96      1.04       1.05       1.09      225.57
TNK              11.87      0.07      0.07       0.08       0.08      153.69
OSY            1261.30      7.18      7.60       7.59       7.61      165.66
Welded           71.85      0.42      0.43       0.46       0.50      142.44
Car            1074.20      2.59      2.57       2.69       2.87      374.73
Water           213.65      2.14      2.16       2.17       2.15       99.51


plots the ε^est_k and ε∗_k values for a median run of NSGA-II on problem ZDT1. A high correlation (R = 0.9608) between these two measures is observed. Table 3 presents the correlation coefficient

Figure 43: Correlation coefficient between ε^est_k and ε∗_k values for ZDT1.

between the median values of ε^est_k and ε∗_k for all the problems considered in this study. It can be seen

Table 3: Correlation coefficient between estimated εk and optimal εk for single, multi- and many-objective optimization problems.

g01       g02       g04       g06       g07       g08       g09
0.9897    0.976     0.998     1.000     1.000     1.000     1.000

g10       g18       g24       ZDT1      ZDT2      ZDT3      ZDT4
1.000     1.000     0.993     0.961     0.976     0.983     0.995

ZDT6      BNH       TNK       SRN       OSY       CAR       WATER
0.924     0.998     0.914     1.000     0.998     0.993     0.997

WELD      DTLZ1-3   DTLZ1-5   DTLZ1-10  DTLZ2-3   DTLZ2-5   DTLZ2-10
0.998     1.000     0.999     0.999     0.990     0.996     0.953

DTLZ5
0.990

that in all cases the correlation coefficient is very close to one, indicating the accuracy of the trend of the estimated εk values.

7. Conclusions

This paper has proposed a computationally fast method for calculating the KKT proximity measure introduced in two previous studies [7, 11]. The original computational procedure involved an optimization process, which was found to be too time-consuming for the metric to be used frequently within an optimization run. This paper has suggested three different approximation procedures for KKTPM computation that avoid solving the optimization problem associated with the original process. The analysis has revealed that, when a certain condition is met, the approximation procedures produce the exact (optimal) KKTPM value. In other cases, the exact KKTPM value is found to be bracketed by two of the approximation estimates. Based on these observations, an estimated KKTPM (ε^est_k) has been proposed.

The closeness of ε^est_k to the exact KKTPM value has been demonstrated on the generation-wise best or non-dominated solutions of a real-coded genetic algorithm (for single-objective problems), NSGA-II (for two-objective problems) and NSGA-III (for three or more objectives) on the g-series of constrained test problems and on standard single and multi-objective test problems. The measures have also been computed for a few engineering design problems. A detailed computational time comparison has revealed the efficacy of the proposed estimated KKTPM procedure. Moreover, the correlation coefficients between the exact and estimated KKTPM on all problems studied here are very close to one, demonstrating a strong positive correlation between the two quantities.

All the results presented in this paper clearly indicate that the proposed estimated KKTPM computation provides relative values similar to those of the exact optimization-based procedure, but at a fraction (between 1/31 and 1/781) of the computational effort needed by the exact procedure. With this achievement, we believe that the concept of KKTPM is now ready to be integrated with an evolutionary or any other optimization procedure to (i) estimate the convergence characteristics of the population-best solutions relative to the true optimal solutions without knowing their true values, (ii) identify population members that are relatively deficient in convergence to their respective optimal solutions, so that a differential treatment of them can be invoked to make the overall optimization task quicker, and (iii) constitute a suitable termination condition ensuring proper convergence, integrated with other performance measures covering other termination aspects, such as desired spread and global optimality. These follow-up studies will be pursued next.
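As one concrete illustration of point (iii), a termination rule built on the estimated KKTPM might look like the following sketch. This is our own hypothetical rule, not one proposed in the paper, which defers the actual design to the follow-up studies; the threshold and window length are illustrative parameters:

```python
def should_terminate(median_kktpm_history, threshold=1e-3, window=10):
    """Stop when the generation-wise median estimated KKTPM has stayed
    below `threshold` for `window` consecutive generations."""
    recent = median_kktpm_history[-window:]
    return len(recent) == window and all(v < threshold for v in recent)
```

Requiring a sustained window rather than a single sub-threshold value guards against the local fluctuations of the median KKTPM observed on the real-world problems above.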

Acknowledgments

The authors acknowledge the efforts of Mr. Haitham Seada in providing us with the NSGA-III results. The second author's support from the Ministry of Higher Education in Egypt is highly appreciated.

References

[1] H. W. Kuhn, A. W. Tucker, Nonlinear programming, in: J. Neyman (Ed.), Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 1951.

[2] S. I. Birbil, J. B. G. Frenk, G. J. Still, An elementary proof of the Fritz-John and Karush-Kuhn-Tucker conditions in nonlinear programming, European Journal of Operational Research 180 (1) (2007) 479–484.

[3] G. V. Reklaitis, A. Ravindran, K. M. Ragsdell, Engineering Optimization Methods and Applications, New York: Wiley, 1983.

[4] R. T. Rockafellar, Convex Analysis, Princeton University Press, 1996.

[5] D. Bertsekas, Convex Analysis and Optimization, Athena Scientific, 2003.

[6] C. R. Bector, S. Chandra, J. Dutta, Principles of Optimization Theory, New Delhi: Narosa, 2005.

[7] J. Dutta, K. Deb, R. Tulshyan, R. Arora, Approximate KKT points and a proximity measure for termination, Journal of Global Optimization 56 (4) (2013) 1463–1499.

[8] G. Haeser, M. L. Schuverdt, Approximate KKT conditions for variational inequality problems, Optimization Online. URL http://www.optimization-online.org/DB_HTML/2009/10/2415.html

[9] R. Andreani, J. M. Martinez, B. F. Svaiter, A new sequential optimality condition for constrained optimization and algorithmic consequences, SIAM Journal on Optimization 20 (2010) 3533–3554.

[10] R. Tulshyan, R. Arora, K. Deb, J. Dutta, Investigating EA solutions for approximate KKT conditions for smooth problems, in: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2010), ACM Press, 2010, pp. 689–696.

[11] K. Deb, M. Abouhawwash, J. Dutta, A KKT proximity measure for evolutionary multi-objective and many-objective optimization, in: Proceedings of the Eighth Conference on Evolutionary Multi-Criterion Optimization (EMO-2015), Part II, Heidelberg: Springer, 2015, pp. 18–33.

[12] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, MA: Addison-Wesley, 1989.

[13] K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms, Wiley, Chichester, UK, 2001.

[14] C. A. C. Coello, D. A. VanVeldhuizen, G. Lamont, Evolutionary Algorithms for Solving Multi-Objective Problems, Boston, MA: Kluwer, 2002.

[15] P. Shukla, K. Deb, On finding multiple Pareto-optimal solutions using classical and evolutionary generating methods, European Journal of Operational Research 181 (3) (2007) 1630–1652.

[16] V. Chankong, Y. Y. Haimes, Multiobjective Decision Making Theory and Methodology, New York: North-Holland, 1983.

[17] K. Miettinen, Nonlinear Multiobjective Optimization, Kluwer, Boston, 1999.

[18] K. Deb, S. Agrawal, A. Pratap, T. Meyarivan, A fast and elitist multi-objective genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary Computation 6 (2) (2002) 182–197.

[19] E. Zitzler, M. Laumanns, L. Thiele, SPEA2: Improving the strength Pareto evolutionary algorithm for multiobjective optimization, in: K. C. G. et al. (Ed.), Evolutionary Methods for Design Optimization and Control with Applications to Industrial Problems, International Center for Numerical Methods in Engineering (CIMNE), 2001, pp. 95–100.

[20] Q. Zhang, H. Li, MOEA/D: A multiobjective evolutionary algorithm based on decomposition, IEEE Transactions on Evolutionary Computation 11 (6) (2007) 712–731.

[21] K. Deb, H. Jain, An evolutionary many-objective optimization algorithm using reference-point-based non-dominated sorting approach, Part I: Solving problems with box constraints, IEEE Transactions on Evolutionary Computation 18 (4) (2014) 577–601.

[22] H.-C. Wu, The Karush-Kuhn-Tucker optimality conditions in an optimization problem with interval-valued objective function, European Journal of Operational Research 176 (1) (2007) 46–59.

[23] A. P. Wierzbicki, The use of reference objectives in multiobjective optimization, in: G. Fandel, T. Gal (Eds.), Multiple Criteria Decision Making Theory and Applications, Berlin: Springer-Verlag, 1980, pp. 468–486.

[24] K. Deb, R. B. Agrawal, Simulated binary crossover for continuous search space, Complex Systems 9 (2) (1995) 115–148.

[25] H. Jain, K. Deb, An evolutionary many-objective optimization algorithm using reference-point-based non-dominated sorting approach, Part II: Handling constraints and extending to an adaptive approach, IEEE Transactions on Evolutionary Computation 18 (4) (2014) 602–622.

[26] J. J. Liang, T. P. Runarsson, E. Mezura-Montes, M. Clerc, P. N. Suganthan, C. A. C. Coello, K. Deb, Special session on constrained real-parameter optimization (http://www.ntu.edu.sg/home/epnsugan/) (2006).

[27] K. Deb, M. Goyal, A robust optimization procedure for mechanical component design based on genetic adaptive search, Transactions of the ASME: Journal of Mechanical Design 120 (2) (1998) 162–164.

[28] H. Seada, K. Deb, U-NSGA-III: A unified evolutionary optimization procedure for single, multiple, and many objectives: Proof-of-principle results, in: Proceedings of the Eighth Conference on Evolutionary Multi-Criterion Optimization (EMO-2015), Part II, Heidelberg: Springer, 2015, pp. 34–49.

[29] E. Zitzler, K. Deb, L. Thiele, Comparison of multiobjective evolutionary algorithms: Empirical results, Evolutionary Computation Journal 8 (2) (2000) 125–148.
