
A comparative landscape analysis of fitness functions for search-based testing

    Raluca Lefticaru, Florentin Ipate

    Department of Computer Science and Mathematics

    University of Pitesti

    Str. Targu din Vale 1, 110040 Pitesti, Romania

    [email protected], [email protected]

Abstract: Landscape analysis of fitness functions is an important topic. This paper makes an attempt to characterize the search problems associated with the fitness functions used in search-based testing, employing the following measures: diameter, autocorrelation and fitness distance correlation. In a previous work, a general form of objective functions for structural search-based software testing was tailored for state-based testing. A comparison is performed in this paper between the general fitness functions and some problem-specific fitness functions, taking into account their performance with different search methods.

    Keywords: search-based testing, finite state machines, metaheuristic search techniques, fitness functions, landscape analysis.

    I. INTRODUCTION

Software testing is a very expensive, tedious and time-consuming task, which was estimated to require around 50%

    of the total cost of software development. Therefore, different

    techniques have been employed to automate test generation,

    among them random testing [1], symbolic execution [2],

    domain reduction [3] and constraint-based testing [4]. An

approach with great potential for the automatic generation of test data is to model the testing task as a search problem,

    which can be solved by applying search techniques like genetic

    algorithms, simulated annealing, particle swarm optimization

    or tabu search.

    Search-based testing is characterized by the usage of search

    techniques for test generation. Whatever the test aim is, it

is first transformed into an optimization problem with respect to some fitness (cost or objective) function. The search

    space, in which the test data that fulfils a given test aim is

    searched for, is the input domain of the test object (program,

    function). The search spaces obtained are usually complex,

    discontinuous, and non-linear, due to the non-linearity of

    software (if-statements, loops etc.) [5]. Therefore the use

    of neighborhood search methods, such as hill climbing, is

    not recommended; instead, metaheuristic search methods are

employed [5]. Many papers have been published which present applications of metaheuristic algorithms for test data generation, followed by a comprehensive survey [6] of search-based test data generation.

    Most studies have concentrated on the application of such

    techniques in structural (program-based or white-box) testing

[7]-[9]. In structural testing, the program is represented as a directed graph, in which each node corresponds to a statement or a sequence of statements and each edge (branch) to

    a transfer of control between two statements. Search-based

    techniques are then used to generate test data to cover the

    desired graph elements (nodes, branches or paths).

    Functional search-based testing has been less investigated.

    Some papers have concentrated on generating test data from

Z specifications [10], [11]. This idea of using the functional specification of a program was extended to conformance testing [11]: an objective function of the form pre-condition ∧ ¬post-condition, which measures the closeness of the test data to uncovering a fault in the implementation, was employed. The application of evolutionary functional testing to

    some industrial software (an automated parking system) is

    presented in [12], [13]. Other approaches tackle the problem of

    generating realistic signals and propose an innovative solution,

    by building the overall signal from a series of simple signal

    types, for example sine, spline and linear curves [14].

    Evolutionary methods for testing state-based programs have

    been studied, mainly from a structural testing point of view,

regarding the flags (loop-assigned flags), enumerations and counters used to manage the internal state of the objects [15],

    [16].

In this paper we discuss the use of search-based methods

    for testing from state-based specifications. A general fitness

    function was tailored in [17] to find input parameters for

    a given path in a state machine, which satisfy some given

    constraints for each transition, and preliminary experiments

    showed that its usage with genetic algorithms and particle

swarm optimization is promising [17]-[19].

    The use of this general fitness function, based on state

    machines, is further investigated in this paper. Furthermore, the

investigation is also extended to other search techniques: simulated annealing and particle swarm optimization. Experimental results show that this general fitness function may produce

    results comparable to those produced by fitness functions

    designed especially for a particular situation, which assume

a much richer knowledge of the problem at hand. Metrics that

    help the assessment of the fitness functions are also presented.

    The paper is structured as follows. Section II provides a

brief overview of metaheuristic search techniques and Section III describes some measures used for the landscape characterization of

    the search dynamics. Section IV presents the fitness functions



for search-based testing from state machines, some experimental results are summarized in Section V and the proposed approach is compared with previous work in Section VI. Finally, conclusions and future work are drawn in Section VII.

II. METAHEURISTIC SEARCH TECHNIQUES

    Simulated Annealing (SA) is a local search technique,

    proposed by Kirkpatrick et al. in 1983 [20]. In the original

    algorithm, the optimization starts with an initial candidate

solution; a randomly generated neighbour replaces the current solution if it has a better objective value or, when it has a worse objective value, with a probability that depends on the difference between the corresponding function values and on a global parameter t (the temperature). The dependency on the current temperature t is such that the current solution changes almost randomly when t is large, but increasingly downhill as t goes to zero. The probability of accepting a worse candidate is higher at the beginning of the algorithm, when the temperature t is high; t is gradually decreased according to a cooling schedule.
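For illustration, the acceptance loop described above can be written as follows (a minimal sketch; the geometric cooling schedule and the parameter values t0, alpha and max_iter are illustrative assumptions, not the settings of the experiments reported later):

    import math
    import random

    def simulated_annealing(f, initial, neighbor, t0=100.0, alpha=0.95,
                            max_iter=2000):
        """Minimize f by simulated annealing (minimal sketch).

        f        -- cost function to minimize
        initial  -- starting candidate solution
        neighbor -- function returning a randomly generated neighbour
        """
        current, f_current = initial, f(initial)
        t = t0
        for _ in range(max_iter):
            candidate = neighbor(current)
            f_candidate = f(candidate)
            delta = f_candidate - f_current
            # Better candidates always replace the current solution; worse
            # ones are accepted with probability exp(-delta / t), which is
            # high while t is large and vanishes as t goes to zero.
            if delta < 0 or random.random() < math.exp(-delta / t):
                current, f_current = candidate, f_candidate
            t *= alpha  # geometric cooling schedule controlling t
        return current, f_current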

    Genetic Algorithms (GA) are global search heuristics, that

    use techniques inspired from biology, such as selection, re-

    combination (crossover) and mutation, applied on a population

    of potential solutions, called chromosomes (or individuals).

    GA are a particular class of evolutionary algorithms; they are

    closely related to Evolutionary Strategies (ES), the main dif-

    ference being that for GA recombination is the main operator

    that drives the search, whereas ES principally use mutation.

    High level descriptions of these algorithms can be found in

[6], [20], [21].

Particle Swarm Optimization (PSO) is a relatively recent

    search technique introduced by Eberhart and Kennedy in 1995,

    inspired by social metaphors of behavior and swarm theory.

    The system is initialized with a population of random solutions

(particles) and searches for the optimum by updating generations.

    Each particle maintains its current position, velocity and best

    position explored so far. It flies through the problem space by

following the current optimum particles, with a velocity dynamically adjusted according to its own previous best position and also to the previous best position of its companions [22].

    Unlike in GA, in PSO the selection and crossover operators

    are not employed: all the particles are kept as members of

the population. In the basic PSO algorithm, the particles are manipulated according to the following equations [22]:

$$v_{id} = w \, v_{id} + c_1 \, rand() \, (p_{id} - x_{id}) + c_2 \, Rand() \, (p_{gd} - x_{id})$$
$$x_{id} = x_{id} + v_{id}$$

where $X_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$ represents the i-th particle, $V_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$ the rate of change (velocity) of the i-th particle and $P_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$ its best previous position. The index of the best particle among all the population is g, $c_1, c_2 > 0$ are the acceleration coefficients, w the inertia weight and rand(), Rand() random functions in the range [0, 1].
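As an illustration, one generation of these update equations might be sketched as follows; the parameter values w, c1 and c2 are typical choices from the literature, not the tuned settings of the experiments:

    import random

    def pso_generation(xs, vs, pbest, gbest, f, w=0.7, c1=2.0, c2=2.0):
        """One generation of the basic PSO update (minimal sketch).

        xs, vs -- positions and velocities of the swarm (lists of lists)
        pbest  -- best position found so far by each particle
        gbest  -- best position found so far by the whole swarm
        f      -- fitness function to minimize
        """
        for i in range(len(xs)):
            for d in range(len(xs[i])):
                # Velocity update: inertia, cognitive pull towards the
                # particle's own best, social pull towards the swarm's best.
                vs[i][d] = (w * vs[i][d]
                            + c1 * random.random() * (pbest[i][d] - xs[i][d])
                            + c2 * random.random() * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]  # position update
            if f(xs[i]) < f(pbest[i]):
                pbest[i] = list(xs[i])
                if f(pbest[i]) < f(gbest):
                    gbest = list(pbest[i])
        return xs, vs, pbest, gbest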

III. MEASURES TO CHARACTERIZE SEARCH PROBLEMS

    This section presents some measures used for characterizing

    the search problems and the metaheuristics applied to solve

them. Let S be a solution space, f a cost (objective or fitness) function assigning a real number to each point in S and N a neighbourhood operator connecting each point in S to a non-empty set of neighbours.

The diameter D of a search landscape is the maximal distance between two points in S, in terms of successive applications of N. Intuitively, small diameters should be more favorable than large ones, because any point can potentially

    be reached quicker [21]. For a successful SA process, the

diameter D should be significantly lower than MAX_ITER, the maximum number of iterations per run. This criterion is

    intended to ensure that an optimal solution has a chance to be

    reached before the end of the run, whatever the starting point

    of the search is [21].

The autocorrelation $\rho_d$ of a series measures the variation of fitness for points that are at distance d. The most important value is $\rho_1$ and it is estimated by means of a random walk, whose steps consist in moving to a new point chosen randomly among the neighbors of the current point (for example, using N, the neighbourhood operator of an SA process, as in Fig. 1).

    Select a starting point x_1
    f_1 = f(x_1)
    FOR i = 1 TO n-1 DO
        Randomly select x_{i+1} ∈ N(x_i)
        f_{i+1} = f(x_{i+1})
    END FOR

Fig. 1. Random walk algorithm

An estimation of $\rho_1$ is:

$$\rho_1 = \frac{\sum_{i=1}^{n-1} (f_i - \bar{f})(f_{i+1} - \bar{f})}{\sum_{i=1}^{n} (f_i - \bar{f})^2}$$

where n is the number of samples, $f_i$ the fitness values and $\bar{f}$ their mean [23]. If $|\rho_1|$ is close to one, the landscape is smooth. If $|\rho_1|$ is close to zero, the neighboring cost values are unrelated and neighborhood search techniques are not expected

    to be effective.
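A direct sketch of this estimation procedure (the random walk of Fig. 1 followed by the estimator above) could be:

    import random

    def estimate_rho1(f, start, neighbor, n=1000):
        """Estimate rho_1 over a random walk of n samples (minimal sketch;
        `neighbor` returns a random element of N(x))."""
        x = start
        fs = [f(x)]
        for _ in range(n - 1):
            x = neighbor(x)      # x_{i+1} randomly selected from N(x_i)
            fs.append(f(x))
        mean = sum(fs) / n
        num = sum((fs[i] - mean) * (fs[i + 1] - mean) for i in range(n - 1))
        den = sum((v - mean) ** 2 for v in fs)
        return num / den if den else 0.0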

    Another measure, proposed for GA, which indicates the

problem hardness by characterizing the joint variation of distances and objective values is the fitness distance correlation (FDC). Given a set $F = \{f_1, f_2, \ldots, f_n\}$ of n individual fitness values and a corresponding set $D = \{d_1, d_2, \ldots, d_n\}$ of the n distances to the nearest global optimum, the correlation coefficient r is:

$$r = \frac{c_{FD}}{s_F s_D}$$

where

$$c_{FD} = \frac{1}{n} \sum_{i=1}^{n} (f_i - \bar{f})(d_i - \bar{d})$$


is the covariance of F and D, and $s_F$, $s_D$, $\bar{f}$ and $\bar{d}$ are the standard deviations and means of F and D respectively [24]. The FDC definition involves the notion of distance to

    the nearest global optimum and consequently this restricts its

    applicability to problems with known optima.

The ideal fitness function will have r = 1 when minimizing (the fitness decreases as the distance to the global minimum decreases) and r = -1 when maximizing (the fitness increases as the distance to a global maximum decreases). In the rest of the paper

    we are interested in minimizing the fitness function. As shown

    in [24], GA problems can be classified, when minimizing,

according to the value of the FDC coefficient in: misleading ($r \le -0.15$), in which the fitness tends to increase as the global minimum approaches; difficult ($-0.15 < r < 0.15$), in which there is virtually no correlation between fitness and distance; and straightforward ($r \ge 0.15$), in which fitness decreases as the global minimum approaches. The second class

    corresponds to problems for which the difficulty cannot be

    estimated because r doesnt bring any information and in thiscase the fitness-distance scatterplot may give information on

    problem difficulty.
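A direct sketch of the FDC computation and of this classification (for minimization) could be:

    import math

    def fdc(fs, ds):
        """FDC coefficient r for fitness values fs and distances ds to
        the nearest global optimum (direct sketch of the formulas above)."""
        n = len(fs)
        f_mean, d_mean = sum(fs) / n, sum(ds) / n
        c_fd = sum((f - f_mean) * (d - d_mean) for f, d in zip(fs, ds)) / n
        s_f = math.sqrt(sum((f - f_mean) ** 2 for f in fs) / n)
        s_d = math.sqrt(sum((d - d_mean) ** 2 for d in ds) / n)
        return c_fd / (s_f * s_d) if s_f and s_d else 0.0

    def classify_when_minimizing(r):
        """Jones and Forrest's classes for minimization [24]."""
        if r <= -0.15:
            return "misleading"
        if r < 0.15:
            return "difficult"
        return "straightforward"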

    IV. FITNESS FUNCTION FOR STATE-BASED TESTING

    Extended forms of finite state machines (e.g. statecharts

    [25], X-machines [26]) have received extensive study in recent

    years. They are formally described by finite state machines

    (FSMs) which have the transitions between the states labelled

    by function (method) names. One of the great benefits of

FSMs is the associated testing techniques, e.g. the W-method

    proposed by Chow [27]. These methods were generalized and

    used for testing extended finite state machines [26], [28], [29].

    However, the transformation of a function sequence, obtained

    with a certain testing method for extended state machines,

into method call sequences is hindered by the difficulty of choosing the input values to be provided to the methods.

In this paper we investigate the applicability of metaheuristic search techniques to the following type of fitness function,

    which is tailored in [17] and further analysed in [18], [19].

    The purpose of this type of fitness function is to find input

    data for state machine constrained paths, as described in the

    following.

Given a particular path in a state machine,

$$p = S_0 \xrightarrow{f_1[g_1]} S_1 \xrightarrow{f_2[g_2]} S_2 \xrightarrow{f_3[g_3]} \cdots S_{m-1} \xrightarrow{f_m[g_m]} S_m,$$

where $f_i$ represent methods, $S_i$ states and $g_i$ some constraints (for example guards in the state machine diagram), an individual (possible solution) is a list of input values, $x = (x_1, x_2, \ldots, x_n)$, corresponding to all parameters of the methods $f_1, f_2, \ldots, f_m$ (in the order they appear). If the sequence of method calls, having the parameter values $x = (x_1, x_2, \ldots, x_n)$, determines the transitions between the states specified by the path p and validates the predicate $g_i$ of each transition, then x is a solution for p [17], [18].

The transition constraints $g_1, g_2, \ldots, g_m$ are not necessarily the guards from the state machine diagram; they can be predicates obtained from stronger variants of state-based criteria, such as full predicate [28], in which all individual

    clauses in a decision are exercised, or disjunct coverage [29],

    in which guard conditions are transformed into disjunctive

    normal form and a separate transition is defined for each

disjunct. Furthermore, these predicates can be arbitrarily chosen,

    for example to realize a partitioning into equivalence classes

    of the parameters provided to the methods f1, f2, . . . , f m.

The fitness value of one individual will be computed as fitness = approach level + normalized branch level [17], [18], as described in Fig. 2.

    The first component of the fitness function, namely the

approach (approximation) level, has a similar metric in evolutionary structural test data generation. It is calculated by

    subtracting one from the number of critical branches lying

    between the node from which the individual diverged away

    from the target node, and the target itself. In the structural

    approach, a critical branch is a program branch which leads

    to a miss of the current structural target for which test data

    is sought [16]. For state-based testing we consider all the

transitions to be critical and, for the transitions without a guard, we take the true predicate as pre-condition [17], [18].

    An individual has missed the target path if at some point

    a pre-condition is not satisfied (this could cause a transition

to another state). The approach level will be 0 for the individuals which follow the target path; for those which

    diverge from the path, it will be calculated as described in

    Fig. 2.

[Figure: a path $S_0 \xrightarrow{f_1[g_1]} S_1 \xrightarrow{f_2[g_2]} S_2 \cdots S_{m-1} \xrightarrow{f_m[g_m]} S_m$; taking [not g1] misses the target with approach level = m-1 and branch level = obj(g1); taking [not g2] gives approach level = m-2 and branch level = obj(g2); failing the last guard [not gm] gives approach level = 0 and branch level = obj(gm).]

Fig. 2. Calculating the fitness function

    A fitness function employing just the approach level has

plateaux (one for each value $0, 1, \ldots, m-1$) and it does not offer enough guidance to the search. Consequently a second

    component of the fitness function will compute, for the place

    where the actual path diverges from the required one, how

close the pre-condition predicate was to being true. This

    second metric is called branch level and can be derived from

    the guard predicates using the transformations from Table I [6],


[10], [11], where K is a positive constant. The branch level is then mapped onto [0, 1), or normalized. Thus the fitness value is: fitness = approach level + normalized branch level, for short al + nbl.

    Predicate    Objective function obj
    a = b        if abs(a - b) = 0 then 0 else abs(a - b) + K
    a ≠ b        if abs(a - b) ≠ 0 then 0 else K
    a < b        if a - b < 0 then 0 else (a - b) + K
    a ≤ b        if a - b ≤ 0 then 0 else (a - b) + K
    Boolean      if TRUE then 0 else K
    a ∧ b        obj(a) + obj(b)
    a ∨ b        min(obj(a), obj(b))

TABLE I
TRACEY'S OBJECTIVE FUNCTIONS
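Assuming K = 1 and the normalization used later in Section V, the transformations of Table I and the al + nbl scheme of Fig. 2 might be sketched as follows; obj_eq, obj_lt and path_fitness are illustrative names, not code from the paper:

    K = 1.0  # the positive constant from Table I

    def obj_eq(a, b):
        """Tracey's objective function for the predicate a = b."""
        return 0.0 if abs(a - b) == 0 else abs(a - b) + K

    def obj_lt(a, b):
        """Tracey's objective function for the predicate a < b."""
        return 0.0 if a - b < 0 else (a - b) + K

    def norm(d):
        """Map a branch-level value onto [0, 1); the base 1.001 is the
        normalization constant used in Section V."""
        return 1.0 - 1.001 ** (-d)

    def path_fitness(guard_objs):
        """fitness = approach_level + normalized_branch_level (al + nbl)
        for a target path of m guarded transitions, as in Fig. 2.

        guard_objs -- objective values obj(g_1), ..., obj(g_m) computed
                      for a candidate input; obj = 0 means the guard holds."""
        m = len(guard_objs)
        for i, o in enumerate(guard_objs):
            if o > 0:                      # diverges at transition i+1
                return (m - 1 - i) + norm(o)
        return 0.0                         # follows the whole target path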

    V. EMPIRICAL EVALUATION

    A. Motivating Example

A measurement approach was applied in [21] to various landscapes of search problems used for test generation, in order to better tune SA. One of the examples, Cal_1 (the calendar problem), is an artificial problem, inspired

    from software testing [11] and has exactly one optimum

    to be found, arbitrarily taken as December 31, 2000. The

    solution space is the set of dates from January 1, 1900 up

    to December 31, 3000 and each date is encoded by a triplet

    (day,month, year), for short (d,m,y).Waeselynck et al. considered 24 landscapes for Cal 1

    problem, corresponding to the combinations of 4 cost functions

and 6 neighborhood operators, described below [21]; a code sketch of two of the cost functions and one operator follows the operator list.

cost1 = 0 if (d, m, y) = (31, 12, 2000), else cost1 is randomly chosen in [1, 1000]

cost2 = |31 - d| + |12 - m| + |2000 - y| + K_d + K_m + K_y, where K_d = 0 if d = 31, else K_d = 10; K_m = 0 if m = 12, else K_m = 10; K_y = 0 if y = 2000, else K_y = 100

cost3 = |31 - d| + 31 · |12 - m| + 365 · |2000 - y|

cost4 = exact number of days between date (d, m, y) and date (31, 12, 2000)

N1  Any date obtained by letting one, two or three triplet parameters (d, m, y) vary by ±1, modulo the maximal parameter value

N2  Same as N1, without modulo (e.g., a December date is no longer connected to a January one)

N3  Any date no more than 15 days apart from the current date, modulo the solution space boundaries, that is, (1, 1, 1900) and (31, 12, 3000) are connected

N4  Same as N3, without modulo

N5  Any date no more than 400 days apart from the current date, modulo the solution space boundaries

N6  Same as N5, without modulo
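For concreteness, minimal sketches of cost3, cost4 and the N3 operator are given below, assuming that candidate triplets encode valid calendar dates; the function names are ours, not from [21]:

    import random
    from datetime import date, timedelta

    TARGET = date(2000, 12, 31)
    START, END = date(1900, 1, 1), date(3000, 12, 31)
    SPAN = (END - START).days + 1  # size of the Cal_1 solution space, in days

    def cost3(d, m, y):
        """cost3 = |31 - d| + 31*|12 - m| + 365*|2000 - y|."""
        return abs(31 - d) + 31 * abs(12 - m) + 365 * abs(2000 - y)

    def cost4(d, m, y):
        """cost4 = exact number of days between (d, m, y) and the target
        (assumes the triplet encodes a valid calendar date)."""
        return abs((date(y, m, d) - TARGET).days)

    def n3_neighbor(current):
        """N3: a date at most 15 days from the current one, modulo the
        solution-space boundaries, so the first and last dates connect."""
        step = random.choice([k for k in range(-15, 16) if k != 0])
        return START + timedelta(days=((current - START).days + step) % SPAN)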

    The measures retained by the authors are: diameter of the

search landscape and autocorrelation, for which the experimental values obtained are provided. Also, a new measure,

the Generation Rate of Better Solutions (GRBS), was introduced to monitor the convergence of the SA search process and implement a stopping criterion.

    B. Extending the Motivating Example

However, several cost functions can be built for the Cal_1 problem and we consider 6 other fitness functions: cost_ydm, cost_ymd, cost_mdy, cost_myd, cost_dmy and cost_dym. All of them have the form approach level + normalized branch level (al + nbl), discussed previously. Fig. 3 shows how cost_dmy is computed, checking first the condition day = 31, then the constraint month = 12 and finally year = 2000. The other 5 functions are calculated similarly; the only difference is the order in which the constraints are checked, suggested by their names. To calculate the branch level we employed the formulas from Table I, where the constant considered was K = 1 and the normalization function was $norm(d) = 1 - 1.001^{-d}$, d > 0 (just for better visualization of the scatterplots in Fig. 4 we replaced this with $norm(d) = 1 - 1.01^{-d}$, d > 0).
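Under the same assumptions (K = 1, norm(d) = 1 - 1.001^{-d}), the six variants can be sketched by a single order-parameterized constructor; make_cost is an illustrative name, not code from the paper:

    K = 1.0

    def obj_eq(a, b):
        """Tracey's objective for a = b (Table I)."""
        return 0.0 if a == b else abs(a - b) + K

    def norm(d):
        """Normalization norm(d) = 1 - 1.001**(-d)."""
        return 1.0 - 1.001 ** (-d)

    def make_cost(order):
        """Build one of the six al + nbl variants; `order` is a permutation
        of 'dmy' giving the order in which the constraints are checked."""
        targets = {'d': 31, 'm': 12, 'y': 2000}
        def cost(d, m, y):
            values = {'d': d, 'm': m, 'y': y}
            for i, c in enumerate(order):
                o = obj_eq(values[c], targets[c])
                if o > 0:                      # constraint i+1 is missed
                    return (len(order) - 1 - i) + norm(o)
            return 0.0
        return cost

    cost_dmy = make_cost('dmy')  # day first, then month, then year (Fig. 3)
    cost_mdy = make_cost('mdy')

    # The aliasing discussed with Fig. 4: both dates fail the month check
    # first, so cost_mdy cannot tell 1 month from almost 1000 years.
    assert abs(cost_mdy(31, 11, 2000) - cost_mdy(31, 11, 3000)) < 1e-12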

[Figure (a): state machine diagram with states S0, D, M, Y, DM, DY, MY and DMY, and transitions ins_day(d) [d = 31], ins_month(m) [m = 12] and ins_year(y) [y = 2000] between them.]

[Figure (b): the path S0 -ins_day(d)[d == 31]-> D -ins_month(m)[m == 12]-> DM -ins_year(y)[y == 2000]-> DMY; taking [d ≠ 31] misses the target with approach level = 2 and branch level = abs(d - 31); [m ≠ 12] gives approach level = 1 and branch level = abs(m - 12); [y ≠ 2000] gives approach level = 0 and branch level = abs(y - 2000).]

Fig. 3. Computing the fitness for cost_dmy: (a) state machine diagram; (b) corresponding path S0 -> D -> DM -> DMY

Having the previous 4 cost functions [21], of which cost3 and cost4 are the most adequate for the Cal_1 search problem, we compared the other 6 fitness functions with them (note that cost3 is a good metric, counting the approximate number of days to the solution of the problem, and cost4 returns the exact number of days). The latter 6 cost functions, of the


form al + nbl, have a general form and incorporate no additional information, such as the number of days to the target date (31, 12, 2000) or other properties, which could give better guidance and turn them into perfect metrics for Cal_1.

    The diameter criterion was used in [21] to ensure that

an optimal solution has a chance to be reached before the end of the run of an SA process, whatever the starting point of the search. The diameter values for each neighbourhood operator N1, ..., N6 are presented in Table II and show that, for MAX_ITER = 2000, N3 and N4 do not pass this criterion (the diameter should be significantly lower than MAX_ITER, the maximum number of iterations per run).

[Figure: ten fitness vs. distance scatterplots, one per cost function: cost1, cost2, cost3, cost4, cost_ydm, cost_ymd, cost_mdy, cost_myd, cost_dmy and cost_dym, with distances up to 4 x 10^5 days on the x-axis.]

Fig. 4. Fitness distance scatterplots

In Tables II and III, bold values indicate that a criterion is passed, i.e. for the autocorrelation that $|\rho_1|$ is close to 1 and for FDC that $r \ge 0.15$.

Another measure employed for comparison was the autocorrelation of each fitness function, using random walks with the neighbourhood operators N1, ..., N6. The results, given in Table II, calculated over 100 random walks of 1000 samples each, show that cost3, cost4, cost_ymd and cost_ydm pass the criterion for N1, ..., N6; cost2 passes the criterion just for N1 and N2; the remaining combinations do not satisfy it.

    The last measure used was the FDC, computed for 1000

    random samples; the results were averaged over 100 runs. To

    measure the distance between a point and the global optimum

    we used 2 metrics: d1. the exact number of days between thetwo dates; d2. the euclidian distance between 2 real pointsfrom R3, as the solutions were coded as triplets (d,m,y).The mean and standard deviations of the FDC coefficients

    obtained for both distances are presented in Table III. These

results show that for cost2, cost3 and cost4 the fitness values vary according to the distance to the optimum and they are ideal for a search. Among the other 6 functions only cost_ydm and cost_ymd have a high FDC coefficient (they also have a high autocorrelation) and we would expect them to guide the search better than, for example, cost_dmy (which the FDC criterion would classify as difficult).

The scatterplots from Fig. 4 correspond to the fitness functions presented and the distance d1 (exact number of days). It is obvious from Fig. 4 that cost1 (the random function) is the worst; cost2, cost3 and cost4 are straightforward; cost_ymd and cost_ydm are, in a certain way, guiding the search to the optimum. However, the last 4 functions are somewhat levelled and not so appropriate for guiding the search. The same fitness values from the intervals $[1, 1+\epsilon]$, $[2, 2+\delta]$, where $0 < \epsilon, \delta < 1$, might correspond to points in the search space that are very close to or very far away from the optimum. For example, $x_1 = (31, 11, 2000)$ is just 1 month away from the target date (31, 12, 2000); $x_2 = (31, 11, 3000)$ is more than 999 years away from the target, yet $cost\_mdy(x_1) = cost\_mdy(x_2) = 2 + \epsilon$, $0 < \epsilon < 1$.

                 N1      N2      N3      N4      N5      N6
    Diameter    550    1100    6700   13400     250     500
    cost1      0.00    0.00   -0.01    0.00    0.00    0.00
    cost2      0.88    0.90    0.21    0.20    0.34    0.32
    cost3      0.99    0.99    0.90    0.90    0.98    0.98
    cost4      0.99    0.99    0.99    0.99    0.99    0.99
    cost_ydm   0.95    0.97    0.91    0.88    0.98    0.97
    cost_ymd   0.96    0.97    0.90    0.91    0.95    0.97
    cost_mdy   0.38    0.38    0.67    0.70    0.03    0.03
    cost_myd   0.39    0.40    0.71    0.71    0.04    0.04
    cost_dmy   0.19    0.24    0.01    0.01    0.00    0.00
    cost_dym   0.22    0.25    0.02    0.02    0.00    0.00

TABLE II
DIAMETER AND AUTOCORRELATION (MEAN)


               Mean1 (Std1)     Mean2 (Std2)
    cost1      0.000 (0.032)    -0.004 (0.031)
    cost2      0.998 (


    the other hand, the same study provides some examples where

    the simple hill climbing approach significantly outperforms

    evolutionary testing.

    Arcuri et al. formally analyse in [34] the expected runtime

    of three different search algorithms on the problem of test

    data generation for an instance of the triangle classification

program.

Intuitively, the relief of the landscape should have a

    strong impact on the dynamics of exploration strategies [21]

    and some papers tackle this issue. The properties of fitness

    functions employed in search-based testing, more precisely the

    landscape characterizations, are analyzed in [6], [12], [21].

    McMinn presents different types of fitness functions, with

their associated landscapes, which are characterized as deceptive or as having plateaux (which do not offer enough

    guidance to the search) [6].

An evaluation of different fitness functions for the evolutionary testing of an autonomous parking system, based on a number of experiments analyzing the corresponding fitness landscapes, is performed in [12].

    Waeselynck et al. investigated a measurement approach to

    find an adequate setting of simulated annealing parameters,

    applied to test generation [21].

VII. CONCLUSIONS AND FUTURE WORK

This paper investigates the usage of general fitness functions for specification-based testing and applies an empirical measurement approach (employing the diameter, the autocorrelation and the FDC as characterizing measures). It also carries out an evaluation on various examples, which shows that

    this general fitness function may produce results comparable

    to those produced by fitness functions designed especially for

    a particular situation.

Metaheuristics are general search techniques and have to be tuned for each cost function considered; consequently, finding adequate parameter settings that make the search perform well is a difficult problem. The experiments performed suggest some tuning choices for SA, GA and PSO.

Future work concerns analyzing other variants of the fitness function for state-based testing, experimentation on a larger benchmark of real-world objects and extending the approach

    to the case in which the method parameters have complex

    types.

    REFERENCES

[1] R. Hamlet, "Random testing", Encyclopedia of Software Engineering. Wiley, 1994, pp. 970-978.
[2] J. King, "Symbolic execution and program testing", Communications of the ACM, vol. 19, no. 7, pp. 385-394, 1976.
[3] A. J. Offutt, Z. Jin, and J. Pan, "The dynamic domain reduction procedure for test data generation", Software - Practice and Experience, vol. 22, no. 2, pp. 167-193, 1999.
[4] R. A. DeMillo and A. J. Offutt, "Constraint-based automatic test data generation", IEEE Trans. Softw. Eng., vol. 17, no. 9, pp. 900-909, 1991.
[5] A. Baresel, H. Sthamer, and M. Schmidt, "Fitness function design to improve evolutionary structural testing", Proc. of the Genetic and Evolutionary Computation Conference (GECCO'02), 2002, pp. 1329-1336.
[6] P. McMinn, "Search-based software test data generation: a survey", Softw. Test., Verif. Reliab., vol. 14, no. 2, pp. 105-156, 2004.
[7] R. P. Pargas, M. J. Harrold, and R. Peck, "Test-data generation using genetic algorithms", Softw. Test., Verif. Reliab., vol. 9, no. 4, pp. 263-282, 1999.
[8] P. Tonella, "Evolutionary testing of classes", Proc. of the 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'04), 2004, pp. 119-128.
[9] P. McMinn, M. Harman, D. Binkley, and P. Tonella, "The species per path approach to search-based test data generation", Proc. of the 2006 International Symposium on Software Testing and Analysis (ISSTA'06), 2006, pp. 13-24.
[10] N. Tracey, J. Clark, and K. Mander, "Automated program flaw finding using simulated annealing", Proc. of the 1998 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'98), ACM Press, 1998, pp. 73-81.
[11] N. J. Tracey, "A search-based automated test-data generation framework for safety-critical software", Ph.D. dissertation, University of York, 2000.
[12] J. Wegener and O. Buhler, "Evaluation of different fitness functions for the evolutionary testing of an autonomous parking system", Proc. of the Genetic and Evolutionary Computation Conference (GECCO'04) (2), 2004, pp. 1400-1412.
[13] O. Buhler and J. Wegener, "Evolutionary functional testing", Computers and Operations Research, 2007.
[14] A. Baresel, H. Pohlheim, and S. Sadeghipour, "Structural and functional sequence test of dynamic and state-based software with evolutionary algorithms", Proc. of the Genetic and Evolutionary Computation Conference (GECCO'03), 2003, pp. 2428-2441.
[15] P. McMinn and M. Holcombe, "The state problem for evolutionary testing", Proc. of the Genetic and Evolutionary Computation Conference (GECCO'03), 2003, pp. 2488-2498.
[16] P. McMinn and M. Holcombe, "Evolutionary testing of state-based programs", Proc. of the Genetic and Evolutionary Computation Conference (GECCO'05), 2005, pp. 1013-1020.
[17] R. Lefticaru and F. Ipate, "Automatic state-based test generation using genetic algorithms", Proc. of the Ninth International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC'07). IEEE Computer Society, 2007, pp. 188-195.
[18] R. Lefticaru and F. Ipate, "Functional search-based testing from state machines", Proc. of the 2008 International Conference on Software Testing, Verification, and Validation. IEEE Computer Society, 2008, pp. 525-528.
[19] R. Lefticaru and F. Ipate, "Search-based testing using state-based fitness", Proc. of the 2008 IEEE International Conference on Software Testing Verification and Validation Workshop. IEEE Computer Society, 2008, p. 210.
[20] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing", Science, vol. 220, no. 4598, pp. 671-680, 1983.
[21] H. Waeselynck, P. Thevenod-Fosse, and O. Abdellatif-Kaddour, "Simulated annealing applied to test generation: landscape characterization and stopping criteria", Empirical Software Engineering, vol. 12, no. 1, pp. 35-63, 2007.
[22] Y. Shi and R. Eberhart, "Empirical study of particle swarm optimization", Proc. of the 1999 Congress on Evolutionary Computation (CEC'99), 1999, pp. 1945-1950.
[23] W. Hordijk, "A measure of landscapes", Evolutionary Computation, vol. 4, no. 4, pp. 335-360, 1996.
[24] T. Jones and S. Forrest, "Fitness distance correlation as a measure of problem difficulty for genetic algorithms", Proc. of the 6th International Conference on Genetic Algorithms, 1995, pp. 184-192.
[25] D. Drusinsky, Modeling and Verification Using UML Statecharts: A Working Guide to Reactive System Design, Runtime Monitoring and Execution-based Model Checking. Newnes, 2006.
[26] F. Ipate, "Testing against a non-controllable stream X-machine using state counting", Theoretical Comput. Sci., vol. 353, no. 1-3, pp. 291-316, 2006.
[27] T. S. Chow, "Testing software design modeled by finite-state machines", IEEE Trans. Softw. Eng., vol. 4, no. 3, pp. 178-187, 1978.
[28] A. J. Offutt, S. Liu, A. Abdurazik, and P. Ammann, "Generating test data from state-based specifications", Softw. Test., Verif. Reliab., vol. 13, no. 1, pp. 25-53, 2003.
[29] L. C. Briand, M. D. Penta, and Y. Labiche, "Assessing and improving state-based class testing: A series of experiments", IEEE Trans. Softw. Eng., vol. 30, no. 11, pp. 770-793, 2004.
[30] Genetic Algorithm and Direct Search Toolbox 2.2, http://www.mathworks.com/products/gads/.


[31] PSOtoolbox, http://psotoolbox.sourceforge.net/.
[32] B. Birge, "PSOt - a particle swarm optimization toolbox for use with Matlab", Proc. IEEE Swarm Intelligence Symposium, 2003, pp. 182-186.
[33] M. Harman and P. McMinn, "A theoretical & empirical analysis of evolutionary testing and hill climbing for structural test data generation", Proc. of the 2007 International Symposium on Software Testing and Analysis (ISSTA'07), 2007, pp. 73-83.
[34] A. Arcuri, P. K. Lehre, and X. Yao, "Theoretical runtime analyses of search algorithms on the test data generation for the triangle classification problem", Proc. of the 2008 IEEE International Conference on Software Testing Verification and Validation Workshop. IEEE Computer Society, 2008, pp. 161-169.
