Ants, Mutants and Beyond - BCS SGAI

Nature-inspired Computing Search Based Software Engineering Genetic Improvement Automatic Improvement References References

“Ants, Mutants and Beyond”
Combining formal and stochastic techniques to improve software

[email protected] [email protected]

[email protected]

October 12, 2015


Overview

1 Nature-inspired Computing.

2 Search Based Software Engineering (SBSE).

3 Genetic Improvement (GI).

4 Automatic Improvement (AIP).

5 Industry perspective on SBSE — Douglas Carson (Keysight).

6 Hylas and Antbox — Zoltan A. Kocsis (U. Stirling).


Naturally-inspired Computing

Some classes of problem are well-understood and have efficient solution methods (e.g. the simplex algorithm for problems with linear objectives/constraints).

However, in many cases we only have an operational description of the problem; calculating derivatives to drive a ‘conventional’ optimization method such as Newton’s method may not be easy (or even possible).

In such cases, it is common to turn to a variety of ‘nature-inspired’ metaheuristic search methods: Genetic Algorithms, Ant-Colony Optimization etc.


General problem-solving with Metaheuristics

Metaheuristics are stochastic (a form of ‘generate-and-test’) and require only a few problem-specific ingredients:

Solution representation: e.g. list of cities for the TSP.

Objective function: measures the quality of a solution (e.g. tour length).

Search operators: some means of mutating or (re)combining solutions to produce a new solution (e.g. swapping two cities).
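The three ingredients above can be sketched for a tiny TSP instance. This is a minimal hill-climbing illustration, not a method taken from the talk: the four city coordinates, the class name `TspHillClimber`, the iteration count and the seed are all assumptions made for the example.

```java
import java.util.Random;

/** Hill-climbing for a tiny TSP instance: representation, objective, search operator. */
class TspHillClimber {
    // Hypothetical instance: four cities on the corners of a unit square.
    static final double[][] CITIES = {{0, 0}, {1, 1}, {1, 0}, {0, 1}};

    /** Objective function: length of the closed tour visiting cities in the given order. */
    static double tourLength(int[] tour) {
        double total = 0.0;
        for (int i = 0; i < tour.length; i++) {
            double[] a = CITIES[tour[i]];
            double[] b = CITIES[tour[(i + 1) % tour.length]];
            total += Math.hypot(a[0] - b[0], a[1] - b[1]);
        }
        return total;
    }

    /** Search operator: a copy of the tour with two randomly chosen positions swapped. */
    static int[] swapTwoCities(int[] tour, Random rng) {
        int[] next = tour.clone();
        int i = rng.nextInt(next.length);
        int j = rng.nextInt(next.length);
        int tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
        return next;
    }

    /** Generate-and-test loop: keep a mutated tour only if it is no worse. */
    static int[] search(int[] start, int iterations, long seed) {
        Random rng = new Random(seed);
        int[] best = start.clone();
        for (int k = 0; k < iterations; k++) {
            int[] candidate = swapTwoCities(best, rng);
            if (tourLength(candidate) <= tourLength(best)) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] start = {0, 1, 2, 3};  // solution representation: a permutation of city indices
        int[] best = search(start, 200, 42L);
        System.out.println("start length = " + tourLength(start));
        System.out.println("best length  = " + tourLength(best));
    }
}
```

Only `tourLength` and `swapTwoCities` are problem-specific; the loop in `search` would stay the same for a different problem.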


Genetic Algorithms/Genetic Programming - I

Idea: generate/breed solutions and apply ‘survival of the fittest’.

John H. Holland (1929-2015) John Koza
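The ‘breed and select’ idea can be sketched as a very small generational Genetic Algorithm. The example below is an illustration under stated assumptions rather than anything shown in the talk: it uses the standard OneMax toy problem (maximise the number of 1-bits), and the population size, mutation rate and all names are arbitrary choices.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

/** A minimal generational GA on the OneMax toy problem (maximise the number of 1-bits). */
class OneMaxGa {
    /** Fitness: the number of true bits in the individual. */
    static int fitness(boolean[] individual) {
        int ones = 0;
        for (boolean bit : individual) {
            if (bit) ones++;
        }
        return ones;
    }

    /** Selection: binary tournament, the fitter of two random individuals wins. */
    static boolean[] tournament(boolean[][] population, Random rng) {
        boolean[] a = population[rng.nextInt(population.length)];
        boolean[] b = population[rng.nextInt(population.length)];
        return fitness(a) >= fitness(b) ? a : b;
    }

    /** Breeding: one-point crossover of two parents followed by bit-flip mutation. */
    static boolean[] breed(boolean[] mum, boolean[] dad, Random rng) {
        int cut = rng.nextInt(mum.length);
        boolean[] child = new boolean[mum.length];
        for (int i = 0; i < child.length; i++) {
            child[i] = i < cut ? mum[i] : dad[i];
            if (rng.nextDouble() < 1.0 / child.length) {
                child[i] = !child[i];
            }
        }
        return child;
    }

    /** Runs the GA and returns the fittest individual of the final generation. */
    static boolean[] evolve(int popSize, int length, int generations, long seed) {
        Random rng = new Random(seed);
        boolean[][] population = new boolean[popSize][length];
        for (boolean[] individual : population) {
            for (int i = 0; i < length; i++) individual[i] = rng.nextBoolean();
        }
        Comparator<boolean[]> byFitness = Comparator.comparingInt(OneMaxGa::fitness);
        for (int g = 0; g < generations; g++) {
            boolean[][] next = new boolean[popSize][];
            // Elitism: the current best individual always survives unchanged.
            next[0] = Arrays.stream(population).max(byFitness).get();
            for (int i = 1; i < popSize; i++) {
                next[i] = breed(tournament(population, rng), tournament(population, rng), rng);
            }
            population = next;
        }
        return Arrays.stream(population).max(byFitness).get();
    }

    public static void main(String[] args) {
        boolean[] best = evolve(30, 20, 50, 1L);
        System.out.println("best fitness = " + fitness(best) + " out of " + best.length);
    }
}
```

Genetic Programming follows the same loop, with the bitstring replaced by an expression tree or program.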


Genetic Algorithms/Genetic Programming - II


Search Based Software Engineering

There is a recent trend to tackle problems in software engineering using metaheuristics:

Requirements selection:

Objective function: cost, value, ...
Representation: bitvector of requirements.

Test case prioritization:

Objective function: rate of coverage, time, faults, ...
Representation: permutation of the test suite.

Program synthesis:

Objective function: correctness, speed, power, memory, ...
Representation: proof trees; expression trees; source code.
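As a concrete instance of the first formulation, the sketch below encodes requirements selection with a bitvector and a cost/value objective. The costs, values and budget are invented for illustration, and the search is plain enumeration, which is only feasible because the instance is tiny; on a realistic instance a metaheuristic would take the place of the loop in `bestSelection`.

```java
/** Requirements selection as a search problem: bitvector representation, cost/value objective. */
class RequirementsSelection {
    // Hypothetical data: cost and value of four candidate requirements.
    static final int[] COST = {4, 3, 2, 5};
    static final int[] VALUE = {7, 4, 3, 8};
    static final int BUDGET = 9;

    /** Objective: total value of the selected requirements, or -1 if the budget is exceeded. */
    static int objective(int selection) {
        int cost = 0;
        int value = 0;
        for (int i = 0; i < COST.length; i++) {
            if ((selection & (1 << i)) != 0) {  // bit i set: requirement i is selected
                cost += COST[i];
                value += VALUE[i];
            }
        }
        return cost <= BUDGET ? value : -1;
    }

    /** Enumerates every bitvector and returns the one with the highest objective value. */
    static int bestSelection() {
        int best = 0;
        for (int s = 0; s < (1 << COST.length); s++) {
            if (objective(s) > objective(best)) best = s;
        }
        return best;
    }

    public static void main(String[] args) {
        int best = bestSelection();
        System.out.println("selection = " + Integer.toBinaryString(best)
                + ", value = " + objective(best));
    }
}
```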


Pragmatics of Program Synthesis

Scalability remains an issue for program synthesis:

We don’t yet know how to generate sizeable algorithms from scratch.

Generative approaches such as GP still work best at the scale of expressions ...

... but human ingenuity already provides a vast repertoire of (abstract) algorithms and (concrete) programs ...


Templar - Improving algorithms

The ‘Template Method’ design pattern [Gamma, Helm, et al. 1995] divides an algorithm into a fixed skeleton and some variants.

The fixed parts orchestrate the behaviour of the variants.

Example: Quicksort performance depends on the pivot function, so we can treat it as a variant (pseudocode; the base case for short arrays is omitted):

    DoubleArray qsort(DoubleArray arr) {
      double pivot = pivotFn(arr);
      // Implementation of pivotFn can be varied generatively
      return qsort(arr.filter(< pivot))
        ++ arr.filter(== pivot)
        ++ qsort(arr.filter(> pivot));
    }

Expressing algorithms as templates allows us to learn good implementations for the variant parts.
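A runnable Java version of the quicksort template might look as follows. It is a sketch under assumptions: `List<Double>` stands in for the slide's `DoubleArray`, the pivot function is passed as a `ToDoubleFunction`, and the two pivot rules shown are hand-written placeholders for variants that would otherwise be generated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Quicksort as a template: the skeleton is fixed, the pivot function is the variant. */
class TemplateQuicksort {
    /** Fixed skeleton; pivotFn is the variation point and must return an element of the list. */
    static List<Double> qsort(List<Double> arr, ToDoubleFunction<List<Double>> pivotFn) {
        if (arr.size() <= 1) {
            return arr;
        }
        double pivot = pivotFn.applyAsDouble(arr);
        List<Double> less = new ArrayList<>();
        List<Double> equal = new ArrayList<>();
        List<Double> greater = new ArrayList<>();
        for (double x : arr) {
            if (x < pivot) less.add(x);
            else if (x > pivot) greater.add(x);
            else equal.add(x);
        }
        List<Double> sorted = new ArrayList<>(qsort(less, pivotFn));
        sorted.addAll(equal);
        sorted.addAll(qsort(greater, pivotFn));
        return sorted;
    }

    /** Two hand-written variants; a generative method would search for others. */
    static double firstElement(List<Double> arr) { return arr.get(0); }
    static double middleElement(List<Double> arr) { return arr.get(arr.size() / 2); }

    public static void main(String[] args) {
        List<Double> input = Arrays.asList(3.0, 1.0, 2.0, 3.0, 0.5);
        System.out.println(qsort(input, TemplateQuicksort::firstElement));
        System.out.println(qsort(input, TemplateQuicksort::middleElement));
    }
}
```

Every pivot rule yields the same sorted output; only the amount of work done by the skeleton changes, which is what makes the pivot function a safe thing to vary.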


‘Template Method Hyper-heuristics’ [Woodward and Swan 2014]

Templar [Swan and Burles 2015] is a Java™ framework designed to make the generation of customized algorithms as simple as possible. Ingredients:

A list of variation points describing the parts of the algorithm to be automatically generated.

An algorithm template expressing the algorithm skeleton. The template produces a customized version of the algorithm from automatically-generated implementations of the variation points.

An objective function to evaluate the customized algorithm.

An algorithm factory that searches the space of variation points to produce an optimized version of the algorithm.
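To show how the four ingredients fit together, here is a self-contained sketch. It does not use Templar's actual API: every class and method name is hypothetical, the space of variation points is three hand-enumerated pivot rules rather than generated code, and the objective is a simple count of elements examined on an already-sorted input.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

/** Sketch of the four ingredients; all names are illustrative, not Templar's API. */
class PivotFactory {
    static long partitionSteps;  // cost counter updated by the template

    /** Algorithm template: quicksort skeleton with the pivot index as variation point. */
    static List<Integer> qsort(List<Integer> arr, ToIntFunction<List<Integer>> pivotIndex) {
        if (arr.size() <= 1) return arr;
        int pivot = arr.get(pivotIndex.applyAsInt(arr));
        List<Integer> less = new ArrayList<>();
        List<Integer> equal = new ArrayList<>();
        List<Integer> greater = new ArrayList<>();
        for (int x : arr) {
            partitionSteps++;
            if (x < pivot) less.add(x);
            else if (x > pivot) greater.add(x);
            else equal.add(x);
        }
        List<Integer> out = new ArrayList<>(qsort(less, pivotIndex));
        out.addAll(equal);
        out.addAll(qsort(greater, pivotIndex));
        return out;
    }

    /** Objective function: elements examined while sorting an already-sorted input. */
    static long objective(ToIntFunction<List<Integer>> pivotIndex) {
        List<Integer> sortedInput = new ArrayList<>();
        for (int i = 0; i < 64; i++) sortedInput.add(i);
        partitionSteps = 0;
        qsort(sortedInput, pivotIndex);
        return partitionSteps;
    }

    /** Variation points: a small hand-enumerated space of candidate pivot rules. */
    static Map<String, ToIntFunction<List<Integer>>> candidates() {
        Map<String, ToIntFunction<List<Integer>>> space = new LinkedHashMap<>();
        space.put("first", a -> 0);
        space.put("middle", a -> a.size() / 2);
        space.put("last", a -> a.size() - 1);
        return space;
    }

    /** Algorithm factory: returns the name of the candidate with the lowest objective value. */
    static String bestPivotRule() {
        String best = null;
        long bestCost = Long.MAX_VALUE;
        for (Map.Entry<String, ToIntFunction<List<Integer>>> e : candidates().entrySet()) {
            long cost = objective(e.getValue());
            if (cost < bestCost) {
                bestCost = cost;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("best pivot rule on sorted input: " + bestPivotRule());
    }
}
```

Swapping the objective for a different measurement (time, or energy as on the following slides) changes which variant the factory selects without touching the template.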


Hyper-quicksort: Optimizing for energy consumption

[Figure: energy consumption in Joules (log2 scale) against input array size (8 to 2048, log2 scale) for the Mid, Sedgewick, Random and Hyper-quicksort pivot functions.]


Hyper-Quicksort - Results

Array size   Middle index   Sedgewick   Random index   Hyper-quicksort
         8          0.191       0.163          0.446             0.094
        16          0.296       0.345          0.410             0.173
        32          0.651       0.757          0.967             0.410
        64          1.366       1.145          1.708             0.976
       128          3.505       4.034          5.221             2.341
       256          8.175       7.646          9.269             6.387
       512         19.777      21.391         27.685            15.268
      1024         62.961      42.508         41.245            33.012
      2048        198.438     132.663        111.894            70.234

Energy consumption (Joules) against input array size
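One way to read these results is as a relative saving over the best of the three fixed pivot rules at each size. The snippet below does that arithmetic on the figures from the results slide; the choice of ‘best fixed rule’ as the baseline is an assumption made here, not a comparison stated in the talk.

```java
/** Relative energy saving of Hyper-quicksort, using the figures from the results slide. */
class EnergySaving {
    static final int[] SIZE = {8, 16, 32, 64, 128, 256, 512, 1024, 2048};
    // Joules per array size; columns: Middle index, Sedgewick, Random index, Hyper-quicksort.
    static final double[][] JOULES = {
        {0.191, 0.163, 0.446, 0.094},
        {0.296, 0.345, 0.410, 0.173},
        {0.651, 0.757, 0.967, 0.410},
        {1.366, 1.145, 1.708, 0.976},
        {3.505, 4.034, 5.221, 2.341},
        {8.175, 7.646, 9.269, 6.387},
        {19.777, 21.391, 27.685, 15.268},
        {62.961, 42.508, 41.245, 33.012},
        {198.438, 132.663, 111.894, 70.234},
    };

    /** Percentage saved by Hyper-quicksort relative to the best of the three fixed pivot rules. */
    static double savingPercent(int row) {
        double[] r = JOULES[row];
        double bestFixed = Math.min(r[0], Math.min(r[1], r[2]));
        return 100.0 * (bestFixed - r[3]) / bestFixed;
    }

    public static void main(String[] args) {
        for (int row = 0; row < SIZE.length; row++) {
            System.out.printf("%5d  %.1f%%%n", SIZE[row], savingPercent(row));
        }
    }
}
```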


Genetic Improvement (GI)

Addresses scalability by applying (a variant of) GP to pre-existing programs. It can be used to:

Fix bugs [Le Goues, Forrest, et al. 2013] (maintenance dominates software lifecycle cost).

Obtain a multi-objective trade-off between Non-Functional Properties (NFPs) [Harman, Langdon, et al. 2012].

Optimize/improve functional properties [Burles, Swan, et al. 2015].
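The bug-fixing use case can be illustrated with a deliberately tiny example. Everything here is hypothetical: the three-instruction language, the buggy program and the test suite are invented, and the search is an exhaustive scan over single edits rather than the stochastic, GP-style search over edit sequences that GI tools use. It only shows the shape of the idea: start from an existing program, apply small edits, and let the test suite decide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy repair: search for a single edit that makes an existing program pass its tests. */
class ToyRepair {
    /** Interpreter for a hypothetical three-instruction accumulator language. */
    static int run(List<String> program, int x) {
        for (String op : program) {
            switch (op) {
                case "DOUBLE": x *= 2; break;
                case "INC": x += 1; break;
                case "DEC": x -= 1; break;
                default: throw new IllegalArgumentException(op);
            }
        }
        return x;
    }

    /** Test suite acting as the fitness function: the intended behaviour is 2x + 3. */
    static boolean passesTests(List<String> program) {
        int[] inputs = {0, 1, 5};
        for (int x : inputs) {
            if (run(program, x) != 2 * x + 3) return false;
        }
        return true;
    }

    /** All variants reachable by deleting one line or overwriting it with a copy of another. */
    static List<List<String>> singleEdits(List<String> program) {
        List<List<String>> variants = new ArrayList<>();
        for (int i = 0; i < program.size(); i++) {
            List<String> deleted = new ArrayList<>(program);
            deleted.remove(i);
            variants.add(deleted);
            for (int j = 0; j < program.size(); j++) {
                List<String> replaced = new ArrayList<>(program);
                replaced.set(i, program.get(j));  // reuse code that already exists in the program
                variants.add(replaced);
            }
        }
        return variants;
    }

    /** Returns the first variant that passes every test, or null if no single edit suffices. */
    static List<String> repair(List<String> program) {
        for (List<String> variant : singleEdits(program)) {
            if (passesTests(variant)) return variant;
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> buggy = Arrays.asList("DOUBLE", "INC", "INC", "DEC");  // computes 2x + 1
        System.out.println("repaired: " + repair(buggy));
    }
}
```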


Gen-O-Fix - Improving programs

A Scala framework for self-improving software systems [Swan, Epitropakis, et al. 2014], i.e. it can improve a system as it runs.

Performs both Genetic Improvement Programming (GIP) and GP, rather than ‘plastic surgery’.

Tight integration of compiler and improvement mechanism (via reflection) is more efficient and less brittle than existing approaches [Langdon and Harman 2013]; [Le Goues, Nguyen, et al. 2012].

Callbacks to newly-generated functionality can also be injected into legacy Java code.

Uses an actor-based approach for executing program variants.


Why Scala?

Supports expressive webservice frameworks (‘Hello, Web’ in 6 lines of code).

Increasingly popular for concurrency support (Twitter core rewritten in Scala).

Extensively used in industry:


Gen-O-Fix System Diagram


Gen-O-Fix example: Hotswapping webserver code

A stock-price predictor for shares* in David Bowie**.

Achieved via univariate symbolic regression of a function extracted from web-application source code.

* Not actual shares.
** Not the actual David Bowie.
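The hot-swapping idea can be sketched in plain Java with a replaceable function reference. This is not Gen-O-Fix's mechanism (which is Scala-based and uses reflection and actors); the class name, the initial predictor and the ‘improved’ expression are all placeholders chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.DoubleUnaryOperator;

/** Hot-swapping sketch: callers go through a reference that can be replaced at runtime. */
class HotSwapPredictor {
    // Initial hand-written predictor (hypothetical): tomorrow's price equals today's.
    private static final AtomicReference<DoubleUnaryOperator> PREDICTOR =
            new AtomicReference<>(price -> price);

    /** Entry point used by the rest of the application, e.g. a request handler. */
    static double predict(double price) {
        return PREDICTOR.get().applyAsDouble(price);
    }

    /** Called by the improvement mechanism when it has found a better function. */
    static void swap(DoubleUnaryOperator improved) {
        PREDICTOR.set(improved);
    }

    public static void main(String[] args) {
        System.out.println(predict(100.0));
        // Stand-in for an expression that symbolic regression might have produced.
        swap(price -> 1.5 * price + 2.0);
        System.out.println(predict(100.0));
    }
}
```

Because callers only ever see `predict`, the running server does not need to be restarted when the function behind it changes.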


Automatic Improvement


AIP versus Formal Methods

Automatic Improvement Programming (AIP) provides (some of) the benefits of formal methods without the difficulty of writing formal specs.

Formal Methods:

Require highly specialized developers.
Hard to write large programs.

AIP:

Doesn’t need formal specs or tools with a steep learning curve.
Has no scalability issues since we search for transformable patterns in the source code.


AIP versus GI

Metaheuristic approaches to program improvement (e.g. GI):

Rely heavily on random perturbation/recombination.
Can degrade program structure/correctness/explanatory power.

AIP:

Can use semantics-preserving transformations.
Can come with an asymptotic guarantee of superiority.


Automatic Improvement Programming - 1

It is well known that implementing hashCode in Java is (often fatally) error-prone.

Our analysis revealed 487 incorrect implementations in Apache Hadoop [Kocsis, Neumann, et al. 2014].

We repaired Hadoop by correcting hashCode implementations (semantics-preserving), whilst simultaneously improving on the efficiency of the incorrect version (generative).
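For readers unfamiliar with the failure mode, the sketch below shows one common way to break the `equals`/`hashCode` contract and a repaired version. The classes are invented for illustration and are not taken from Hadoop; the particular bug (hashing a field that `equals` ignores) is just one example of an incorrect implementation.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** The equals/hashCode contract: objects that are equal must report the same hash code. */
class HashCodeContract {
    /** Incorrect: equals ignores 'revision', but hashCode includes it. */
    static final class BrokenKey {
        final String name;
        final int revision;
        BrokenKey(String name, int revision) { this.name = name; this.revision = revision; }
        @Override public boolean equals(Object o) {
            return o instanceof BrokenKey && ((BrokenKey) o).name.equals(name);
        }
        @Override public int hashCode() { return Objects.hash(name, revision); }
    }

    /** Repaired: hashCode is computed from exactly the fields that equals compares. */
    static final class FixedKey {
        final String name;
        final int revision;
        FixedKey(String name, int revision) { this.name = name; this.revision = revision; }
        @Override public boolean equals(Object o) {
            return o instanceof FixedKey && ((FixedKey) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode(); }
    }

    /** Looks up an equal-but-differently-hashed key in a HashSet. */
    static boolean brokenLookup() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey("block", 1));
        return set.contains(new BrokenKey("block", 2));
    }

    /** The same lookup with the repaired key class. */
    static boolean fixedLookup() {
        Set<FixedKey> set = new HashSet<>();
        set.add(new FixedKey("block", 1));
        return set.contains(new FixedKey("block", 2));
    }

    public static void main(String[] args) {
        System.out.println("broken lookup finds equal key: " + brokenLookup());
        System.out.println("fixed lookup finds equal key:  " + fixedLookup());
    }
}
```

In the broken version the two keys are equal according to `equals` yet hash differently, so hash-based collections silently fail to find them.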


Automatic Improvement Programming - 2

PolyFunic uses Category Theory to replace stochastic Gen-O-Fix mutation operators with catamorphisms:

This guarantees that the mutation is semantics-preserving.

Trial study obtained an asymptotic improvement in efficiency [Kocsis and Swan 2014] (O(n) to O(1)).

[Figure: performance comparison of naïve and AIP-optimized code.]
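For orientation, a catamorphism is a generalised fold: it consumes a recursive data structure by replacing each constructor with a function. The sketch below shows only that concept on a linked list; it is not the PolyFunic transformation, and the `Cons` type and helper names are assumptions made for the example.

```java
import java.util.function.BiFunction;

/** A catamorphism (fold) over an immutable singly linked list. */
class ListFold {
    /** Cons cell: a head element plus a tail; the empty list is represented by null. */
    static final class Cons<A> {
        final A head;
        final Cons<A> tail;
        Cons(A head, Cons<A> tail) { this.head = head; this.tail = tail; }
    }

    /** Replaces every Cons with 'step' and the empty list with 'seed'. */
    static <A, R> R fold(Cons<A> list, R seed, BiFunction<A, R, R> step) {
        return list == null ? seed : step.apply(list.head, fold(list.tail, seed, step));
    }

    /** Sum of the elements, expressed as a fold. */
    static int sum(Cons<Integer> xs) {
        return ListFold.<Integer, Integer>fold(xs, 0, (x, acc) -> x + acc);
    }

    /** Number of elements, expressed as a fold. */
    static int length(Cons<Integer> xs) {
        return ListFold.<Integer, Integer>fold(xs, 0, (x, acc) -> 1 + acc);
    }

    public static void main(String[] args) {
        Cons<Integer> xs = new Cons<Integer>(1, new Cons<Integer>(2, new Cons<Integer>(3, null)));
        System.out.println("sum = " + sum(xs) + ", length = " + length(xs));
    }
}
```

Because a fold is defined purely by the structure of the data type, transformations expressed this way can be reasoned about algebraically, which is the property the slide relies on.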


Real-World Impact

£80K research income from:

Dataductus: funded the application of search combinators to hybrid search.

BT: funding a research studentship in adaptive scheduling.

Keysight: funded development of Hylas (one of only 30 such grants worldwide).


References I

[1] Nathan Burles, Jerry Swan, Edward Bowles, Alexander E. I. Brownlee, Zoltan A. Kocsis, and Nadarajen Veerapen. “Object-Oriented Genetic Improvement for Improved Energy Consumption in Google Guava”. In: Search-Based Software Engineering - 7th International Symposium, SSBSE 2015, Bergamo, Italy, September 5-7, 2015, Proceedings. 2015, pp. 255–261. doi: 10.1007/978-3-319-22183-0_20.

[2] Jerry Swan and Nathan Burles. “Templar - A Framework for Template-Method Hyper-Heuristics”. In: Genetic Programming, LNCS 9025. Ed. by Penousal Machado et al. 2015. isbn: 978-3-319-16500-4.

[3] Z. A. Kocsis, G. Neumann, J. Swan, M. G. Epitropakis, A. E. I. Brownlee, S. O. Haraldsson, and E. Bowles. “Repairing and optimizing Hadoop hashCode implementations”. In: Symposium on Search-Based Software Engineering, Brazil, August 26 - 29. 2014.

[4] Zoltan A. Kocsis and Jerry Swan. “Asymptotic Genetic Improvement Programming with Type Functors and Catamorphisms”. In: Parallel Problem Solving from Nature - PPSN XIV - 13th International Conference, Ljubljana, Slovenia, September 13-17, 2014, Proceedings. Ed. by Jurij Silc. Lecture Notes in Computer Science. Springer, 2014.


References II

[5] Jerry Swan, Michael G. Epitropakis, and John R. Woodward. Gen-O-Fix: An embeddable framework for Dynamic Adaptive Genetic Improvement Programming. Tech. rep. CSM-195. Stirling FK9 4LA, Scotland: Computing Science and Mathematics, University of Stirling, 2014, pp. 1–12.

[6] John R. Woodward and Jerry Swan. “Template Method Hyper-heuristics”. In: Proceedings of the 2014 Conference Companion on Genetic and Evolutionary Computation Companion. GECCO Comp ’14. Vancouver, BC, Canada: ACM, 2014, pp. 1437–1438. isbn: 978-1-4503-2881-4. doi: 10.1145/2598394.2609843. url: http://doi.acm.org/10.1145/2598394.2609843.

[7] William B. Langdon and Mark Harman. “Optimising Existing Software with Genetic Programming”. In: IEEE Transactions on Evolutionary Computation (2013). Accepted. issn: 1089-778X. doi: 10.1109/TEVC.2013.2281544.

[8] Claire Le Goues, Stephanie Forrest, and Westley Weimer. “Current Challenges in Automatic Software Repair”. In: Software Quality Journal 21 (3 2013), pp. 421–443.


References III

[9] Mark Harman, William B. Langdon, Yue Jia, David Robert White, Andrea Arcuri, and John A. Clark. “The GISMOE challenge: constructing the pareto program surface using genetic programming to find better programs.” In: ASE. Ed. by Michael Goedicke, Tim Menzies, and Motoshi Saeki. ACM, 2012, pp. 1–14. isbn: 978-1-4503-1204-2.

[10] Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. “GenProg: A Generic Method for Automatic Software Repair”. In: IEEE Transactions on Software Engineering 38 (2012), pp. 54–72. issn: 0098-5589. doi: http://doi.ieeecomputersociety.org/10.1109/TSE.2011.104.

[11] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design patterns: elements of reusable object-oriented software. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1995. isbn: 0-201-63361-2.