

Information and Software Technology 54 (2012) 1202–1214

Contents lists available at SciVerse ScienceDirect

Information and Software Technology

journal homepage: www.elsevier.com/locate/infsof

Automated refactoring to the Strategy design pattern

Aikaterini Christopoulou a, E.A. Giakoumakis a,⇑, Vassilis E. Zafeiris a, Vasiliki Soukara b

a Department of Informatics, Athens University of Economics and Business, 76 Patission Str., Athens 104 34, Greece
b Information Society S.A., 2-4 Ilioupoleos Str., Ymittos 172 37, Greece

Article info

Article history:
Received 11 October 2011
Received in revised form 30 April 2012
Accepted 30 May 2012
Available online 7 June 2012

Keywords:
Refactoring
Design patterns
Strategy pattern
Polymorphism
Total replacement of conditional logic

0950-5849/$ - see front matter © 2012 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.infsof.2012.05.004

⇑ Corresponding author. Tel.: +30 2108203183; fax: +30 2108203275.
E-mail addresses: katerinaxristopoulou@yahoo.gr (A. Christopoulou), [email protected] (E.A. Giakoumakis), [email protected] (V.E. Zafeiris), vsoukara@ktpae.gr (V. Soukara).

Abstract

Context: The automated identification of code fragments characterized by common design flaws (or “code smells”) that can be handled through refactoring, fosters refactoring activities, especially in large code bases where multiple developers are engaged without a detailed view on the whole system. Automated refactoring to design patterns enables significant contributions to design quality even from developers with little experience on the use of the required patterns.

Objective: This work targets the automated identification of refactoring opportunities to the Strategy design pattern and the elimination through polymorphism of respective “code smells” that are related to extensive use of complex conditional statements.

Method: An algorithm is introduced for the automated identification of refactoring opportunities to the Strategy design pattern. Suggested refactorings comprise conditional statements that are characterized by analogies to the Strategy design pattern, in terms of the purpose and selection mode of strategies. Moreover, this work specifies the procedure for refactoring to Strategy the identified conditional statements. For special cases of these statements, a technique is proposed for total replacement of conditional logic with method calls of appropriate concrete Strategy instances. The identification algorithm and the refactoring procedure are implemented and integrated in the JDeodorant Eclipse plug-in. The method is evaluated on a set of Java projects, in terms of quality of the suggested refactorings and run-time efficiency. The relevance of the identified refactoring opportunities is verified by expert software engineers.

Results: The identification algorithm recalled, from the projects used during evaluation, many of the refactoring candidates that were identified by the expert software engineers. Its execution time on projects of varying size confirmed the run-time efficiency of this method.

Conclusion: The proposed method for automated refactoring to Strategy contributes to simplification of conditional statements. Moreover, it enhances system extensibility through the Strategy design pattern.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

Refactoring is the process of applying changes to the internal structure of software without changing its observable behaviour [1]. It focuses, according to Fowler [1], on making the source code more understandable, while lowering the cost of introducing modifications to it. These objectives denote the relevance of refactoring with system design improvement and, thus, its applicability to both system implementation and maintenance stages of the software life cycle. During system implementation, refactoring enables continuous design, i.e., alternation of coding and design modification, a practice that is adopted in agile development methodologies, such as eXtreme Programming (XP), that do not emphasize up-front design [2]. On the other hand, refactoring prevents software quality degradation that emanates from software maintenance tasks. The support for automated execution of multiple, though simple, types of refactorings by widely-used IDEs (e.g., Eclipse, Netbeans, MS Visual Studio), is indicative of the refactoring activities’ adoption by current developers.

The refactoring process comprises a set of activities such as identification of the candidate code fragments and the required refactoring(s), evaluation of behaviour preservation guarantees, application of the refactoring and assessment of its effects on software quality [3,4]. However, the refactoring automation provided by IDEs and specialized tools is, primarily, related to the application of code transformations, in a way that ensures code correctness and behaviour preservation while providing the capability of reverting them. However, the developer’s role is still significant, since he/she is required to specify to the tool the target code fragments and the type of refactoring to apply, a task that highly depends on the skills and experience of the developer. Fowler [1] tried to systematically put refactoring into practice by specifying common design flaws (or “code smells”) in object oriented code, along with sequences of primitive refactoring operations for their remedy. Kerievsky [5] focused on the improvement of system design quality through refactoring to patterns. His work specifies “bad smells” and refactoring procedures for their elimination through the use of appropriate design patterns. Despite the detailed presentation of “bad smells” in both works, their identification is often based on code semantics that makes the automation of this task rather challenging.

The automated identification of refactoring opportunities, i.e., code fragments characterized by “code smells” that can be handled through refactoring, fosters refactoring activities, especially in large code bases where multiple developers are engaged without a detailed view on the whole system. Automation in the introduction of design patterns enables significant contributions to design quality even from developers with little or no practical experience on the use of the required patterns. This work focuses on the automated identification of refactoring opportunities to the Strategy design pattern [6] and the elimination, through polymorphism, of respective “code smells” related to extensive use of complex conditional statements.

We propose an algorithm for the identification of conditional statements that emulate, in their use, the Strategy design pattern. These statements are characterized by certain analogies to the Strategy pattern, in terms of the purpose and selection mode of strategies [6]: (a) conditional branches include mutually exclusive, non-trivial computations, corresponding to alternative algorithm implementations represented by strategies, (b) branch selection is not controlled by the logic of the class declaring the conditional statement but, instead, by its clients, analogously to Strategy selection in the Strategy pattern. Moreover, this work specifies the procedure for refactoring to Strategy the identified conditional statements. For special cases of these statements, we propose a technique for total replacement of conditional logic with method calls of appropriate concrete Strategy instances. The identification algorithm, as well as the refactoring procedure, have been implemented and integrated in the JDeodorant Eclipse plug-in [7]. We have evaluated our approach, on a set of Java projects, in terms of the quality of identified refactoring opportunities and its execution time scalability. The evaluation resulted in the identification of a number of refactoring opportunities on each project, whose relevance was verified by two expert software engineers. Moreover, the execution time of the identification algorithm on projects of varying size confirms the execution efficiency of this method.
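To make the two analogies (a) and (b) concrete, the following minimal Java sketch shows the kind of conditional statement that would plausibly qualify. It is a hypothetical illustration, not code from the evaluated projects: the class PriceCalculator, its constants and the field pricingMode are assumed names. Each branch performs a distinct, non-trivial computation, and the value that selects the branch is supplied by the client through the constructor and is never reassigned by the class itself.

```java
// Hypothetical example; all names are illustrative assumptions.
class PriceCalculator {
    static final int REGULAR = 0;
    static final int SEASONAL = 1;
    static final int LOYALTY = 2;

    // Supplied once by the client; the class never changes it on its own.
    private final int pricingMode;

    PriceCalculator(int pricingMode) {
        this.pricingMode = pricingMode;
    }

    double price(double base, int quantity) {
        double total;
        // Mutually exclusive branches, each an alternative pricing algorithm.
        if (pricingMode == REGULAR) {
            total = base * quantity;
        } else if (pricingMode == SEASONAL) {
            double discount = quantity > 10 ? 0.20 : 0.10;
            total = base * quantity * (1 - discount);
        } else {
            // Loyalty pricing: one point per 10 currency units, 0.5 off per point.
            double points = Math.floor(base * quantity / 10);
            total = base * quantity - points * 0.5;
        }
        return total;
    }
}
```

A client would write, for instance, `new PriceCalculator(PriceCalculator.SEASONAL).price(10.0, 20)`, which is the client-side selection that the Strategy pattern makes explicit.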

The rest of the paper is organized as follows: Section 2 presents relevant literature in the area of automated refactoring to patterns. Section 3 specifies the proposed algorithm for identification of refactoring opportunities to Strategy and the respective procedure for applying suggested refactorings. The evaluation of our method is included in Section 4 and the paper is concluded in Section 5 with open issues and possible extensions for future research.

2. Related work

This section provides an overview of methods for automated refactoring to design patterns. The reader may refer to the work of [4] for an extensive review in the research area of software refactoring. Among the earlier approaches for introducing automation in the application of design patterns, for source code maintenance and evolution, is the method proposed by Tokuda and Batory [8]. The method is based on the specification of primitive object-oriented transformations that can be automated with appropriate tool support. Moreover, it prescribes the introduction of a design pattern as a composition of a sequence of parameterized object-oriented transformations. The authors demonstrate their method through the introduction of the Abstract Factory design pattern in a simple case study.

Refactoring to design patterns is also treated as a series of mini-transformations in the methodology proposed by Cinneide and Nixon [9]. The methodology introduces a systematic way for enabling refactoring towards a specific design pattern that involves (a) selection of the target design pattern, (b) definition of the precursor, i.e., structural properties, expressing the intent of the pattern, of the source code to which the pattern will be applied, (c) decomposition of the pattern in a series of mini-patterns, i.e., frequently occurring design motifs that are lower level constructs than design patterns, (d) definition of an appropriate mini-transformation for each mini-pattern, (e) specification of the refactoring as a composition of mini-transformations. A mini-transformation comprises pre-conditions, post-conditions, transformation steps and an argument over how the mini-transformation supports behaviour preservation after its application. The authors present the application of the methodology for the Factory Method pattern that is also implemented as a Java tool. The methodology is mostly focused on structure-rich, rather than behavioural patterns, and has been applied to the Abstract Factory, Builder and Singleton patterns.

An approach for automated identification of refactoring opportunities to the Abstract Factory design pattern in a Java code base has been proposed by Jeon et al. [10]. The authors propose a method for inferring refactoring opportunities, through employment of logic programming, and a refactoring strategy for transforming the source code towards the target structure. Inference is based on extraction from Java code of the system design and its representation as a set of Prolog-like predicates that are then converted to Prolog facts. Moreover, design pattern inference rules are defined, for each target pattern for refactoring, that are then transformed to Prolog rules. Each inference rule has input components that correspond to the roles of the design pattern. The identification of refactoring opportunities takes place through issuing of Prolog queries. Jebelean et al. [11] also used logic programming for the detection of certain “bad smells” specified by Kerievsky [5]. The approach involves transformation of the Java project’s Abstract Syntax Tree (AST) into Prolog facts through the JTransformer engine [12]. Problematic code fragments are identified through definition of appropriate Prolog rules. The work specifies a respective rule for detecting refactoring opportunities to the Composite design pattern.

As concerning behavioural patterns, Juillerat and Hirsbrunner specify an algorithm for automated refactoring to the Template Method design pattern [13,14]. The approach applies to a pair of methods of two different classes that share a common abstract class ancestor. The authors employ and extend existing methods for clone detection in order to identify the common and different statements of the compared methods. The differences are extracted to a new method for each class in a way that (a) extract method preconditions are satisfied for behaviour preservation and (b) the new methods have a common signature in order to be polymorphically invoked from the template method. The common parts of the initial methods are moved as a single template method to the common superclass of the refactored classes. Moreover, an additional abstract method is created in the superclass that has the same signature with the extracted methods. The findings of this work have been implemented as an Eclipse plug-in.

The approaches, described so far for refactoring to patterns, focus mainly on structural patterns (e.g. Abstract Factory, Builder, Composite) with the exception of the work of Juillerat and Hirsbrunner [13,14] that handle refactoring to the Template Method behavioural pattern. Our method is not directly comparable to these works, since refactoring to the Strategy design pattern is not in their scope. From a methodological perspective, relevant work can be categorized to (a) generic methods [8–11] (e.g. mini-transformations, logic inference) that can be applied with appropriate parameterization to various patterns and (b) methods specialized for a specific pattern [13–15]. The proposed method (along with the work of [15] that will be described thereafter) belongs to the second category since it builds on the special structural and behavioural properties of the Strategy pattern.

Fig. 1. State and Strategy design patterns.

Tsantalis and Chatzigeorgiou [15] also focus on behavioural patterns, through identification of refactoring opportunities towards the State/Strategy design patterns. Moreover, the authors propose a behaviour preserving transformation for applying the suggested refactorings. The identification of the candidate code fragments for refactoring is based on the syntactic analysis of the system’s source code, construction and traversal of the corresponding AST. It focuses on two cases of refactoring for simplification of conditional expressions, as described by Fowler [1]: (a) replace type code with State/Strategy and (b) replace conditional logic with polymorphism. In the first case, a set of criteria is applied on a subset of the variables, defined inside the target class, that participate in conditional expressions. In the second case, code fragments that perform Runtime Type Identification (RTTI) are discovered in order to be replaced with invocation of polymorphic methods. The method has been implemented and integrated in the JDeodorant Eclipse plug-in [7], an open source project in which Tsantalis and Chatzigeorgiou participate as core members of the development team.

The method proposed in this work, for automated refactoring to the Strategy design pattern, is differentiated from the approach of [15] (or JDeodorant for brevity) in a series of aspects. First of all, the two methods analyze different parts of a conditional block in order to characterize it as a refactoring opportunity. While JDeodorant focuses on the conditional expressions of a conditional statement, our approach takes also into account properties of their branches (e.g., average number of statements per branch). The scope of applied checks for the evaluation of a given conditional statement, i.e., the code fragments that need to be analyzed in order to determine its suitability for refactoring to Strategy, is broader in this work as compared to JDeodorant. The latter targets its checks to the method that includes the candidate statement (intra-procedural analysis), while our approach extends its scope to methods that directly or indirectly invoke that method (inter-procedural analysis) and either belong to the declaring class or client classes (intra- and inter-class analysis). The analysis of these methods takes place by following the system’s call graph in the inverse direction, starting from the method enclosing the candidate statement. JDeodorant employs syntactic analysis for the identification of the candidate code fragments for refactoring. Our approach additionally employs control and data flow analysis that is based on the Program Dependence Graph (PDG) of each analyzed method. As concerning the application of refactoring to the suggested conditional statements, the proposed method performs (where applicable) total replacement of conditional statements with polymorphic invocation of methods on appropriate concrete Strategy implementations.

3. Automated refactoring to Strategy

3.1. State vs. Strategy design pattern

The State and Strategy design patterns, both members of the behavioural patterns family, are identical, as regarding their structure. Specifically, both patterns employ composition and polymorphism in order to circumvent the use of complex conditional logic. Fig. 1a and b highlight the structural similarity of the two patterns.

The Strategy pattern enables the use and interchange of different algorithms or implementations for a certain policy. It prescribes the definition of an abstract type (abstract class or interface), representing the policy contract (Strategy), and a series of different concrete subtypes (Concrete Strategies) that correspond to alternative implementations of the policy [6]. On the other hand, the State design pattern allows an object to alter its behaviour as a result of changes to its state. The object (instance of the Context class) appears to its clients as changing its class at runtime. The State pattern’s realization is based on the introduction of an abstract type (usually abstract class) that represents an Abstract State and defines methods corresponding to state-dependent operations of the Context class. Each discrete Context state is mapped to a concrete subtype of Abstract State (Concrete State) that provides the state-specific implementation of Abstract State’s methods and controls transitions to appropriate target states.
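The structural similarity and the behavioural difference of the two patterns can be sketched in a few lines of Java. The example below is a minimal illustration under assumed names (Archiver, CompressionStrategy, Connection, ConnectionState and their subtypes are all hypothetical). In the Strategy half the client hands a Concrete Strategy to the Context, whereas in the State half each Concrete State decides the transition to the next state.

```java
// Strategy: the policy contract and two interchangeable implementations.
interface CompressionStrategy {
    String compress(String data);
}

class NoCompression implements CompressionStrategy {
    public String compress(String data) { return data; }
}

class RunLengthCompression implements CompressionStrategy {
    public String compress(String data) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < data.length()) {
            char c = data.charAt(i);
            int run = 1;
            while (i + run < data.length() && data.charAt(i + run) == c) run++;
            out.append(run).append(c);   // e.g. "aaa" becomes "3a"
            i += run;
        }
        return out.toString();
    }
}

// Context for Strategy: the client selects the algorithm at construction.
class Archiver {
    private final CompressionStrategy strategy;
    Archiver(CompressionStrategy strategy) { this.strategy = strategy; }
    String store(String data) { return strategy.compress(data); }
}

// State: each Concrete State controls the transition to its successor.
interface ConnectionState {
    String describe();
    ConnectionState next();
}

class Closed implements ConnectionState {
    public String describe() { return "closed"; }
    public ConnectionState next() { return new Open(); }
}

class Open implements ConnectionState {
    public String describe() { return "open"; }
    public ConnectionState next() { return new Closed(); }
}

// Context for State: clients never see or choose the state objects.
class Connection {
    private ConnectionState state = new Closed();
    void toggle() { state = state.next(); }
    String status() { return state.describe(); }
}
```

Both Context classes delegate through a field of an abstract type, which is why the class diagrams coincide; only the party that chooses the concrete subtype differs.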

The similarity of these patterns, in terms of their structure and suitability for tackling complex conditional logic, often leads to their indistinguishable treatment by software designers. For instance, Fowler [1] proposes a series of code transformations for refactoring conditional logic to State/Strategy (termed as replace type code with State/Strategy) without emphasizing on a clear distinction between the two patterns. On the other hand, Kerievsky [5] brings out in his work the different purpose of these patterns by fully separating refactorings to the State pattern (replace state-altering conditionals with State) from refactorings to Strategy (replace conditional logic with Strategy).

A key difference between State and Strategy lies in their scope of use. While State is most appropriate for an object that changes its behaviour as it passes through different states, Strategy allows transparent interchange of alternative algorithms (strategies) for a certain aspect of an object’s behaviour. Note that although strategies are interchangeable, this does not hold for state implementations that are only relevant to a certain part of the Context object’s life-cycle. Algorithm selection in Strategy usually takes place during the initialization of a Context instance and does not change during its life-cycle. State transitions, though, occur throughout the life-cycle and independently for each Context instance [16].

Further differentiation among the two patterns is based on the degree of client involvement during their use, as quoted by Gamma et al. [6]: “Clients must be aware of different Strategies. The pattern has a potential drawback in that a client must understand how Strategies differ before it can select the appropriate one”. In other words, Strategy selection is delegated to the client or to a factory. As regarding the State pattern, the client interacts with the Context object without being aware of its available states or directly affecting state selection. However, as State transitions are controlled by state objects, a Concrete State subtype may be aware of the presence of other states, unlike alternative strategies in the Strategy pattern that are fully decoupled from each other.

As concerning the coupling of the types that introduce polymorphism (states or strategies) with the Context class, the State pattern is characterized by the highest coupling. Specifically, concrete state subtypes are aware of all aspects of Context’s behaviour and usually have access to its State transition operations. On the other hand, strategies usually do not require access to the Context interface, while they are relevant to a single behaviour of the Context class.

3.2. Automated refactoring of conditional logic in JDeodorant

The work of [15] focuses on the identification of refactoring opportunities that introduce the State/Strategy pattern. The implementation of their approach is part of the JDeodorant Eclipse plug-in [7] that also supports full automation in the application of the identified refactorings. Identification of refactoring opportunities concerns two cases of refactoring for simplification of conditional expressions, as described by Fowler [1]:

• replace type code with State/Strategy,
• replace conditional logic with polymorphism.

As concerning the first case, Tsantalis and Chatzigeorgiou [15] propose a method for identification of conditional statements (referred to as state-checking code fragments) that are appropriate for applying refactoring to the State/Strategy pattern. The identification of state-checking code fragments is based on the evaluation of a set of criteria on certain types of variables (State Variables¹ or type codes according to Fowler [1]) that participate in condition expressions of their branches. The concept is that a state-checking code fragment contains, in all branch conditions, equality comparisons of a certain State Variable (SV) against different constants. Each constant corresponds to a different state of the Context class. In the second case, a method is proposed for detection of code fragments that perform Runtime Type Identification (RTTI) and their consecutive replacement with invocations of polymorphic methods. An RTTI code fragment is a conditional statement that includes in all branch conditions instanceof or equivalent equality comparisons containing the same Superclass Type Variable (STV), i.e., a variable whose class is subclassed by other classes of the system. For instance, let v be a variable of type C and C_i, i = 1, 2, ..., be subclasses of C, then the presence of (v instanceof C_i) expressions in all branch conditions of a conditional statement results to its identification as RTTI code fragment.
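The two kinds of code fragment described above can be pictured with the following Java sketch. It is an assumed, simplified illustration (Task, Shape, Circle, Square and Shapes are hypothetical names), and it does not reproduce JDeodorant's actual detection rules. Task.label() compares one State Variable against a different named constant in every branch condition, while Shapes.area() tests the same Superclass Type Variable v with instanceof in every branch condition.

```java
// State-checking code fragment: the field 'state' plays the role of an SV.
class Task {
    static final int IDLE = 0;
    static final int RUNNING = 1;
    static final int STOPPED = 2;

    private int state = IDLE;

    void start() { state = RUNNING; }
    void stop() { state = STOPPED; }

    String label() {
        // Every branch condition is an equality test of 'state' against a constant.
        if (state == IDLE) {
            return "idle";
        } else if (state == RUNNING) {
            return "running";
        } else if (state == STOPPED) {
            return "stopped";
        }
        throw new IllegalStateException("unknown state: " + state);
    }
}

// RTTI code fragment: 'v' is declared with the superclass type Shape (an STV).
abstract class Shape { }

class Circle extends Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
}

class Square extends Shape {
    final double side;
    Square(double side) { this.side = side; }
}

class Shapes {
    static double area(Shape v) {
        // Every branch condition is a (v instanceof C_i) test on the same variable.
        if (v instanceof Circle) {
            Circle c = (Circle) v;
            return Math.PI * c.radius * c.radius;
        } else if (v instanceof Square) {
            Square s = (Square) v;
            return s.side * s.side;
        }
        throw new IllegalArgumentException("unknown shape");
    }
}
```

In the second fragment the natural remedy is an abstract area() method on Shape, so that the instanceof chain gives way to a polymorphic call.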

The approach of [15], for identification of “replace type code with State/Strategy” refactoring opportunities, keeps pace with [1], as regarding the common handling of both State and Strategy patterns. Their method, that will be henceforth referred to as JDeodorant, employs syntactic analysis in order to figure out properties of the candidate conditional statements that match the common structure of the State/Strategy design patterns. As the identification procedure does not focus on the intent or behavioural aspects that differentiate these patterns, the suggested refactoring candidates may correspond to opportunities for introducing either (a) the State pattern, (b) the Strategy pattern, or (c) plain polymorphism. In practice, the refactoring opportunities identified by JDeodorant correspond mainly to the State design pattern. This is justified by the nature of the identification criteria and the orientation of the method towards the State pattern. Moreover, since the identification procedure focuses on the presence of named constants in branch conditions, the method misses refactoring candidates that do not use named constants for explicit representation of alternative states/strategies.

¹ State Variables comprise (a) non-static fields, (b) parameters or (c) local variables declared before the state-checking code fragment. They correspond to primitive types or enumerations.

3.3. A method for automated refactoring to Strategy

We propose a method for automated refactoring to Strategy that supports identification of refactoring opportunities and transformation of the source code to comply with the Strategy design pattern. Our method complements JDeodorant, that focuses mainly on the State pattern, with capabilities for automated refactoring to the Strategy design pattern. The use of novel refactoring identification criteria, that also take into account behavioural properties of the Strategy design pattern during evaluation of the candidate conditional statements, differentiates our method from the approach of [15]. Specifically, the focus is on verifying that (a) branch selection in a conditional statement is not controlled by the Context class and (b) branch bodies correspond to non-trivial computations. These properties reflect the interchange of alternative strategies in the Strategy design pattern and their identification in conditional statements improves substantially the recall of refactoring opportunities to Strategy, as compared to JDeodorant. As concerning the proposed source code transformation for introducing the Strategy pattern, it supports, under certain conditions, the total replacement of the conditional statement with a polymorphic method call.
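As a rough picture of what total replacement of a conditional statement could look like, the Java sketch below contrasts a hypothetical Context before and after refactoring. All names (LegacyExporter, Exporter, ExportStrategy, CsvExport, TsvExport) are assumptions made for illustration, the branch bodies are kept deliberately short, and the code is not output of the described tool. After the change the client passes a Concrete Strategy instead of a type code, so the export method contains a single polymorphic call and no conditional.

```java
// Before: the branch is selected by a type code that the client supplies.
class LegacyExporter {
    static final int CSV = 0;
    static final int TSV = 1;

    private final int format;

    LegacyExporter(int format) { this.format = format; }

    String export(String[] cells) {
        if (format == CSV) {
            return String.join(",", cells);
        } else if (format == TSV) {
            return String.join("\t", cells);
        }
        throw new IllegalStateException("unknown format: " + format);
    }
}

// After: each former branch body becomes a Concrete Strategy.
interface ExportStrategy {
    String export(String[] cells);
}

class CsvExport implements ExportStrategy {
    public String export(String[] cells) { return String.join(",", cells); }
}

class TsvExport implements ExportStrategy {
    public String export(String[] cells) { return String.join("\t", cells); }
}

class Exporter {
    private final ExportStrategy strategy;

    // The client now selects the behaviour by passing a strategy object.
    Exporter(ExportStrategy strategy) { this.strategy = strategy; }

    String export(String[] cells) {
        return strategy.export(cells);   // the conditional is gone entirely
    }
}
```

Adding a new format then means adding one class that implements ExportStrategy, without touching Exporter, which is the extensibility benefit the Strategy pattern is meant to provide.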

Our method searches for refactoring opportunities among the conditional statements declared in a given Context class and evaluates them on the basis of properties of their branch bodies and conditions. JDeodorant's analysis of conditional statements, on the other hand, is based exclusively on properties of branch conditions. Moreover, the scope of applied checks for the evaluation of a given conditional statement, i.e., the code fragments that need to be analyzed in order to determine its suitability for refactoring to Strategy, is broader in this work as compared to JDeodorant. The latter targets its checks to the method that includes the candidate conditional statement (intra-procedural analysis), while our approach extends its scope to methods that directly or indirectly invoke that method (inter-procedural analysis) and belong either to the Context class or to client classes (intra- and inter-class analysis). The processing of conditional statements in this method is based on syntactic analysis, as well as control and data flow analysis relevant to variables in branch conditions and their values. The State Variables of the JDeodorant method are referred to, in this work, as Strategy Selection Variables (SSVs), since they represent a different, though overlapping, set of variables (fields and parameters of non-primitive type are also included, local variables are excluded) and serve a different purpose.

As mentioned above, the identification of refactoring opportunities is based on the role of the Context class and its clients in branch selection in the candidate conditional statements. This also contributes to the separation of Strategy from State refactoring opportunities. Specifically, if branch selection in a conditional statement is determined by clients of the Context class (class declaring the statement), then the code fragment is considered as candidate for refactoring to Strategy. This is in accord with selection of Concrete Strategies in the Strategy design pattern, as also highlighted in the comparison of the two patterns of Section 3.1. The proposed method evaluates branch selection in a conditional statement by identifying the classes that define (directly or indirectly) the values of SSVs in branch conditions.

3.3.1. Algorithm for identification of refactoring opportunities

This section introduces an algorithm for identification of refactoring opportunities towards the Strategy design pattern. The algorithm operates each time on a single class of the system under maintenance, that will be, henceforth, referred to as Context class. This characterization denotes the role of the class in the Strategy design pattern, in case that a refactoring is successfully identified and applied.

Fig. 2. Illustration of SSB, SSF and SSP instances in a code fragment.

The code fragments that are processed by the algorithm, in search of refactoring opportunities, are conditional statements (i.e., switch, if/else, if/else if/.../else statements) declared in Context methods. However, only a subset of conditional statements, providing hints for Strategy selection, are thoroughly evaluated. These statements are referred to as Strategy Selection Blocks (SSBs) and are determined by the decision function is_SSB. The function receives a conditional statement as parameter and returns true in case that the statement is an SSB. The rules that comprise the decision logic of is_SSB and must apply for a conditional statement s in order to be considered as SSB, are:

(a) s must have at least two branches,
(b) s must not be part of a static method or other static code block,
(c) s must not be part of a method that overrides a method declared in a non-system class, i.e., a class outside the system under maintenance (e.g., methods of the language runtime libraries like equals, toString, hashCode for Java),
(d) the average number of statements per branch (excluding setter and object instantiation statements) for s must be at least two, except for cases that branches contain invocations of Context methods.
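The four rules can be read as a single predicate. The following is a minimal sketch, assuming a simplified, hypothetical model of a conditional statement (the records `Branch` and `Conditional` are not part of the described tool, which operates on AST nodes). It also assumes that the exception in rule (d) applies as soon as any branch invokes a Context method, which is one possible reading of the rule.

```java
import java.util.List;

// Hypothetical, simplified model: a branch carries the number of statements that
// count for rule (d) (setters and object instantiations already excluded) and a
// flag telling whether it invokes a method of the Context class.
record Branch(int countedStatements, boolean invokesContextMethod) {}

// Hypothetical model of a conditional statement and of the facts that rules
// (b) and (c) need about its enclosing method.
record Conditional(List<Branch> branches,
                   boolean inStaticCode,
                   boolean inMethodOverridingNonSystemMethod) {}

final class SsbRules {
    // Sketch of the is_SSB decision function, following rules (a)-(d).
    static boolean isSsb(Conditional s) {
        if (s.branches().size() < 2) return false;                // rule (a)
        if (s.inStaticCode()) return false;                       // rule (b)
        if (s.inMethodOverridingNonSystemMethod()) return false;  // rule (c)
        // Rule (d): average branch size of at least two statements, unless
        // some branch invokes a Context method (assumed reading).
        boolean anyContextCall =
            s.branches().stream().anyMatch(Branch::invokesContextMethod);
        double average =
            s.branches().stream().mapToInt(Branch::countedStatements).average().orElse(0);
        return average >= 2.0 || anyContextCall;
    }
}
```

Keeping the rules as early returns mirrors their role as cheap filters: only statements that pass all of them proceed to the more expensive analysis of branch conditions.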

The identification of an SSB as refactoring opportunity is contingent on characteristics of the variables that participate in the condition expressions of its branches. A determining factor is the presence of certain variables of the Context class that potentially contribute to Strategy selection and will be referred to as Strategy Selection Variables. Let SSV be the set of these variables for the Context class. It comprises: (a) member variables (or fields) whose value assignment is not controlled by the Context (Strategy Selection Fields set, SSF), and (b) method parameters whose value is defined, either directly or indirectly, by Context class clients (Strategy Selection Parameters set, SSP). It holds that SSV = SSF ∪ SSP.

The elements of SSF are member variables that are not assigned a value inside the Context class. An exception to this rule are member variables that are assigned a value inside the body of setter methods that are invoked either by: (a) clients of the Context class, or (b) Context class methods that pass to the setter, as its argument, method parameters whose values originate from outside of the Context class (as determined by procedure get_parameter_trace, explained below).

Fig. 2 presents a code fragment from class SerializationUtility that supports object serialization/deserialization across different formats. The conditional statement declared in serializeObject method is a valid SSB, since it satisfies all rules of the is_SSB function. Member variable serializationMode is an SSF, as its value is not defined inside SerializationUtility. Note that serializationMode is characterized as an SSF despite its value assignment by setter method setSerializationMode. The reason is that the method is never invoked by its declaring class and, thus, its parameter value is always provided by third-parties. The same holds for parameter obj of serializeObject method that represents an SSP.
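A class with the shape described above could look as follows. This is only an illustrative sketch and not the listing of Fig. 2: the identifiers SerializationUtility, serializeObject, serializationMode, setSerializationMode and obj come from the text, whereas the constants `XML` and `JSON` and all branch bodies are assumptions made for the example.

```java
class SerializationUtility {
    // Hypothetical mode constants; the real values are not given in the text.
    static final int XML = 0;
    static final int JSON = 1;

    // SSF: only assigned in a setter that the class itself never invokes.
    private int serializationMode;

    void setSerializationMode(int serializationMode) {
        this.serializationMode = serializationMode;
    }

    // obj is an SSP: its value is always supplied by callers.
    String serializeObject(Object obj) {
        // SSB: serializationMode takes part in every explicit branch condition,
        // obj only in one of them.
        if (serializationMode == XML) {
            String body = String.valueOf(obj);
            return "<value>" + body + "</value>";
        } else if (serializationMode == JSON && obj != null) {
            String body = String.valueOf(obj);
            return "{\"value\":\"" + body + "\"}";
        } else {
            String body = String.valueOf(obj);
            return body;
        }
    }
}
```

In this sketch a client selects the serialization behaviour through the setter before calling serializeObject, which is the client-driven selection that the identification criteria look for.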

The identification of refactoring opportunities for a given class is performed by Algorithm 1. The algorithm generates, during its execution, a set of refactoring opportunities R, represented by pairs (s,V) where s is an SSB and V a set of decision variables of s that perform branch selection and whose value is not controlled by Context. For brevity reasons, additional notation will be introduced prior to algorithm specification. Let sm denote a conditional statement declared in method m of the Context class. The set DV(sm) for sm includes variables that participate in condition expressions of its branches, while $\overline{DV}(s_m)$ is a subset of DV(sm) with variables that participate in condition expressions of all branches of sm. Moreover, let LVm, Pm represent the sets of local variables and parameters of m respectively. Unlike SSVs, the presence of local variables in condition expressions of the branches of an SSB results in its rejection from refactoring opportunities. The reason is that local variables are always assigned a value inside the declaring method and, thus, their value is controlled by the Context.

The algorithm suggests a given SSB sm as refactoring candidate in case that the condition expressions of all its branches contain at least one variable v that is either: (a) a Strategy Selection Field, or (b) a Strategy Selection Parameter of method m, i.e., a parameter that (i) is not assigned a value (is not defined) in any execution path inside method m that leads to sm (function is_defined (m,s,v)) and (ii) is always bound to values provided by Context clients during method m invocations (procedure get_parameter_trace (m,v)). The presence in DV(sm) of any variable that is assigned a value by Context results in the rejection of sm from refactoring candidates. With reference to the code fragment of Fig. 2, Algorithm 1 returns a single refactoring opportunity R = {(s, serializationMode)}, where s is the delineated conditional statement. Note that although parameter obj is an SSP, it is not returned by our algorithm since it does not participate in branch conditions of all branches of s.
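The acceptance condition just described can be sketched with plain set operations. The code below is an illustration under simplifying assumptions and is not Algorithm 1 itself: variables are represented by their names, the set of SSVs for the enclosing method is assumed to be already computed, every variable outside that set is treated as controlled by Context, and only branches with an explicit condition are passed in. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class StrategyCandidateCheck {
    /**
     * Returns the set V of decision variables for an SSB, or an empty set when
     * the SSB is rejected.
     *
     * @param branchConditionVars variables read by each explicit branch condition
     * @param ssv                 Strategy Selection Variables (SSF plus the SSPs of
     *                            the enclosing method), assumed precomputed
     */
    static Set<String> decisionVariables(List<Set<String>> branchConditionVars,
                                         Set<String> ssv) {
        Set<String> dv = new HashSet<>();   // variables in any branch condition
        Set<String> inAllBranches = null;   // variables in every branch condition
        for (Set<String> vars : branchConditionVars) {
            dv.addAll(vars);
            if (inAllBranches == null) {
                inAllBranches = new HashSet<>(vars);
            } else {
                inAllBranches.retainAll(vars);
            }
        }
        // Reject when there are no conditions, or when any condition variable is
        // controlled by Context (modelled here as "not an SSV").
        if (inAllBranches == null || !ssv.containsAll(dv)) {
            return Set.of();
        }
        // Every remaining variable is an SSV; those present in all branch
        // conditions form V. An empty V means the SSB is not a candidate.
        return inAllBranches;
    }
}
```

For the situation of Fig. 2, passing the condition variables {serializationMode} and {serializationMode, obj} with both names in the SSV set yields {serializationMode}, in line with the result stated above.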


Algorithm 1. Refactoring to Strategy Identification Algorithm


The boolean function is_defined (m,s,v) determines whether a variable v is assigned a value in at least one execution path between the start of method m and a statement s that is included in the body of m. The function employs the Program Dependence Graph (PDG) of method m and returns true if the PDG node corresponding to statement s has at least one incoming data dependence edge for variable v. The functionality of is_defined relies on the base PDG specification, initially proposed by Ferrante et al. [17], enhanced with various additions (e.g., support of break, continue, try/catch statements). A comprehensive presentation of that PDG model is included in the work of [18] that employs the PDG for automation of the extract method refactoring.

Algorithm 2. Procedure get_parameter_trace

2 Set of direct or indirect invocations of method m where parameter p is controlledby client classes.

The get_parameter_trace (m,p) procedure is described by Algorithm 2. It represents a recursive procedure that (a) receives as input a method m of the Context class and a respective parameter p of that method and (b) returns a list of direct or indirect invocations of m, by methods of the Context class or its clients, along with the respective values supplied for parameter p. Its focus is to determine whether the value(s) bound to method parameters originate exclusively from clients of the Context class. In the positive case, the procedure returns a set T of tuples (m′,s,e) where m′ is a method that calls m (directly or indirectly), s the calling statement and e the expression that is supplied as value for parameter p. Whenever the method m is called with a value for parameter p determined by Context, the procedure returns the empty set.

Fig. 3 presents the recursive invocation of the get_parameter_trace procedure on methods of the RandomGenerator class. Its execution confirms that the values passed for parameter distr, during invocations of the generateRandomVariable method, are always indirectly provided by clients of the RandomGenerator class (in this case by VideoSource). The dashed arrows in Fig. 3 illustrate how the procedure “traces” the sources of distr values. Based on the successful outcome of get_parameter_trace, our refactoring identification algorithm suggests the delineated switch statement as a candidate for refactoring to Strategy with parameter distr having the role of the SSV.

The procedure operates iteratively on each method mp that invokes m (line 2 of Algorithm 2). Let Cm be the set of all methods that call directly a given method m. The role of m, in the example of Fig. 3, is enacted by generateRandomVariable, while its callers Cm can be derived from the respective call graph. For each statement s of mp ∈ Cm that invokes method m, the actual parameter a bound to parameter p is retrieved through function actual (p,m,s) (line 4). The function returns a variable name, in case that the actual parameter is a single variable expression. Otherwise, its return value is null. If mp is a Context method, the tuple (mp,s,a) is added to the set T (footnote 2) only in case that the actual parameter a corresponds to a Strategy Selection Field of Context (line 9), i.e., its value is not determined by that class. In our case study example, the callers of generateRandomVariable are also methods of RandomGenerator, but provide as value for parameter distr one of their parameters. In such cases, get_parameter_trace is executed recursively for each caller mp, in order to determine whether their respective parameter is, in its turn, controlled by Context clients (line 10). Fig. 3 shows the recursive invocations initiated by get_parameter_trace and the methods that they apply to.

In any case that a caller mp does not belong to Context class, the tuple (mp,s,a) is added without further processing to the set T (lines 5, 6). This condition holds at the execution of get_parameter_trace for methods genRv (char, float, float) and genRv (char, float, float, float) respectively. Note that the class of a method m is denoted by the predicate class (m). Thus, starting from a method m, get_parameter_trace traverses inversely the call graph that is formed by Context methods and their callers from client classes.

If actual parameter a is a local variable, constant or parameter with non-primitive type the procedure returns the empty set. Note that if any recursive call to get_parameter_trace returns the empty set, i.e., the parameter under check is controlled by Context, then the procedure also returns the empty set, regardless of the existence of any elements in T.

Fig. 3. Invocation of get_parameter_trace in a code fragment.

The finiteness of Algorithm 2 is not affected by the presence of cycles in the call graph of Context, i.e., existence of recursive methods or circular invocations among Context methods. These cases can be handled by marking the methods (call graph nodes) that have been used as parameters in get_parameter_trace invocations. The marking takes place upon get_parameter_trace entry by storing a reference to method m into a list structure. Each caller mp is looked up in that list prior to the invocation of get_parameter_trace in line 10 of the algorithm. In case of successful look up, the procedure is not invoked again with mp but, instead, its execution continues with the next iteration of the inner for-loop. These details are suppressed from Algorithm 2 for clarity of presentation.
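The traversal and the marking of visited methods can be sketched as follows. This is a simplified illustration of the behaviour described in the text, not a transcription of Algorithm 2. It assumes a hypothetical flat list of call sites instead of a real call graph, tracks a single traced parameter per method (so the parameter name is implicit), and classifies each actual argument up front through the hypothetical `ArgKind` enumeration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// How the actual argument bound to the traced parameter was obtained (assumed model).
enum ArgKind { SSF_FIELD, CALLER_PARAMETER, OTHER }

// One call site: `caller` invokes `callee` in `statement`, passing `argument`.
record Call(String caller, boolean callerInContext, String callee,
            String statement, String argument, ArgKind kind) {}

// One element (m', s, e) of the result set T.
record Trace(String caller, String statement, String argument) {}

final class ParameterTracer {
    private final List<Call> calls;
    private final Set<String> visited = new HashSet<>(); // marked call graph nodes

    ParameterTracer(List<Call> calls) { this.calls = calls; }

    // Returns T, or an empty list when some value is controlled by Context.
    List<Trace> trace(String m) {
        visited.add(m); // marking upon entry guards against call graph cycles
        List<Trace> t = new ArrayList<>();
        for (Call c : calls) {
            if (!c.callee().equals(m)) continue;
            if (!c.callerInContext() || c.kind() == ArgKind.SSF_FIELD) {
                // Client caller, or Context caller passing a Strategy Selection Field.
                t.add(new Trace(c.caller(), c.statement(), c.argument()));
            } else if (c.kind() == ArgKind.CALLER_PARAMETER) {
                if (visited.contains(c.caller())) continue; // already processed
                List<Trace> sub = trace(c.caller());
                if (sub.isEmpty()) return List.of();         // controlled by Context
                t.addAll(sub);
            } else {
                return List.of(); // local variable, constant, etc. inside Context
            }
        }
        return t;
    }
}
```

A fresh `ParameterTracer` is meant to be used per query, since the visited set is kept as instance state in this sketch.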

3.3.2. Applying the refactoring

Base Refactoring Procedure. The suggestions for refactoring to Strategy, derived by the algorithm of Section 3.3.1, can be applied to the source code through a sequence of elementary refactoring operations. In certain cases, the code transformation involves the complete elimination of the conditional statement. This section specifies the procedure that transforms an SSB, declared inside method methodA of a class named Context, to a code block that employs polymorphism based on the Strategy design pattern. Let b be the SSB and SSV its respective set of Strategy selection variables. The refactoring to Strategy procedure involves the following steps:

1. Create the Abstract Strategy class AbstractXStrategy. Let X represent the name of the Context's behaviour with alternative implementations that the pattern will support.

2. Add an abstract method, also named methodA, to AbstractXStrategy representing the behaviour with alternative implementations.

3. Add formal parameters to methodA that include:
   • the Context class, if the body of at least one branch of b includes statements that reference fields or methods of the Context class or its super-class,
   • one parameter for each parameter or local variable that is defined outside b and is read in statements of at least one branch of b.

4. For each branch i of b create a ConcreteXStrategyi class that inherits from AbstractXStrategy and
   • override methodA in each subclass with the body of the respective branch of the SSB,
   • add the pair li = (εi,ci) to a list L, where εi represents the branch condition expression and ci the branch's corresponding concrete Strategy class. The role of the list L is explained in the description of Total Replacement of Conditional Logic.

5. Add to Context a private member variable instXStrategy of type AbstractXStrategy that will hold the currently active concrete Strategy instance.

6. If the set SSV contains just a single Context.methodA parameter of primitive or enumerated type then
   • find the set T of invocations of methodA with the help of get_parameter_trace procedure,
   • if the variable expression part ej of each tuple tj ∈ T is a constant then Total Replacement of Conditional Logic in code fragment b is performed and the refactoring procedure terminates. This step will be analyzed in detail in the next subsection.

7. Replace the body of each conditional branch i of b with the following statements:
   • assignment of a new ConcreteXStrategyi instance to instXStrategy,
   • invocation of the method instXStrategy.methodA.
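The outcome of steps 1 to 5 and 7 can be illustrated on a toy Context class. The sketch below uses the generic names of the procedure (Context, methodA, AbstractXStrategy, ConcreteXStrategyi, instXStrategy); the field `offset`, the parameters `mode` and `input` and the branch bodies are assumptions chosen to keep the example short, so the branches are smaller than what rule (d) of is_SSB would normally require.

```java
// Steps 1-3: abstract Strategy declaring the polymorphic method. Context is a
// parameter because one branch reads a Context field; `input` is a parameter
// because it is defined outside the conditional and read inside the branches.
abstract class AbstractXStrategy {
    abstract int methodA(Context context, int input);
}

// Step 4: one concrete Strategy per branch, holding the former branch body.
class ConcreteXStrategy1 extends AbstractXStrategy {
    @Override
    int methodA(Context context, int input) {
        return input + context.offset;
    }
}

class ConcreteXStrategy2 extends AbstractXStrategy {
    @Override
    int methodA(Context context, int input) {
        return input * 2;
    }
}

class Context {
    int offset = 10; // hypothetical Context state used by the first branch

    // Step 5: holds the currently active concrete Strategy instance.
    private AbstractXStrategy instXStrategy;

    // Before refactoring (assumed original):
    //   if (mode == 1) { return input + offset; } else { return input * 2; }
    int methodA(int mode, int input) {
        // Step 7: each branch body becomes an assignment plus a polymorphic call.
        if (mode == 1) {
            instXStrategy = new ConcreteXStrategy1();
            return instXStrategy.methodA(this, input);
        } else {
            instXStrategy = new ConcreteXStrategy2();
            return instXStrategy.methodA(this, input);
        }
    }
}
```

The conditional statement survives in this base form; only the branch bodies move into the Strategy hierarchy. Step 6 decides whether the conditional itself can be removed as well.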

The application of this procedure to a refactoring opportunity identified in the open source project Pamvotis [19] will be described for illustrative purposes. Pamvotis simulates a network of wireless nodes that generate data traffic and coordinate their packet transmissions with the IEEE 802.11 protocol. The traffic, generated by each wireless node, has the form of a stream of packets. The packet length and inter-arrival time for each packet stream are random variables that follow probability distributions specified in the simulator's configuration parameters.

Packet streams in Pamvotis are generated by instances of FTPSource, HTTPSource, VideoSource and GenericSource classes that inherit from abstract class Source. With regard to the first three classes, they implement widely accepted traffic models for FTP, HTTP and Video applications respectively. All Source subclasses depend on the RandomGenerator class that has the responsibility of generating random numbers according to various probability distributions (e.g., exponential, pareto, uniform). Fig. 4 presents a class diagram with the relationships of the classes that are relevant to our illustrative case study.

Fig. 4. Class diagram of packet sources in Pamvotis simulator.

The refactoring opportunity identified by the proposed algorithm concerns the switch statement included in the generateRandomVariable method of RandomGenerator. The method's source code, before refactoring to Strategy, is depicted in the left part of Fig. 5. The role of the SSV in the switch statement is represented by the parameter distr that determines the random number generation algorithm during generateRandomVariable invocation. The domain of distr spans a set of character constants that correspond to different probability distributions. Each constant activates a different branch of the switch statement, implementing an appropriate pseudo-random generation algorithm for the probability distribution.

The left part of Fig. 5 presents relevant code fragments of RandomGenerator and VideoSource classes before refactoring. These classes play the roles of Context and Client in the Strategy design pattern that derives from refactoring and is depicted in the right part of the figure. The code that is produced by each step of the refactoring procedure is annotated with the respective step number. Specifically, AbstractRandomGen, that has the role of the abstract Strategy class, results from steps 1, 2, 3. During step 4, the body of each branch of the SSB is mapped to the implementation of a concrete Strategy class, e.g., the branch that corresponds to constant 'u' is refactored to Strategy class UniformRandomGen that inherits from AbstractRandomGen. After refactoring, the body of each branch is replaced by a call to an instance of an appropriate subclass of AbstractRandomGen (step 7). Note that step 6 is not included in Fig. 5, since it involves the elimination of the switch statement, which is not applicable in the SSB under examination.

Total Replacement of Conditional Logic. The proposed refactoring method supports, under certain conditions, the Total Replacement of Conditional Logic (TRCL) in an SSB. TRCL support is restricted to SSBs that include in all branch conditions a single Strategy Selection Variable of primitive or enumerated type, represented by a specific Context.methodA parameter p. TRCL applies in cases that the concrete Strategy implementation, that is used during each (direct or indirect) invocation of Context.methodA by client classes, can be inferred at compile time. This is determined by ensuring that parameter p is bound to constant values during client calls of Context.methodA. TRCL results in the transformation of an SSB to a simple non-conditional code fragment. This refactoring employs the list L, populated in step 4 of the Base Refactoring Procedure, as well as the tuple set T generated in step 6. On the basis of the previously introduced notation, the additional refactoring operations required during TRCL are:

(a) For each element tj = (mj,sj,ej) ∈ T, where mj represents the method of a client class that controls the value of parameter p, sj is the statement where the value is supplied through an invocation of a Context method and ej is a constant expression representing the actual value of the parameter, do the following:
   • Find the tuple li ∈ L whose expression εi evaluates to true once parameter p is replaced by ej. Recall that exactly one element of L is selected, since conditional expressions of the branches of an SSB are mutually exclusive.
   • Let methodCj be the Context method that is invoked in statement sj. Introduce in that statement an additional parameter to the methodCj invocation given by the expression new ConcreteXStrategyi().

(b) Introduce an additional formal parameter to Context.methodA with name x and type AbstractXStrategy. Assume that x is placed first in the list of methodA formal parameters.

(c) Replace the SSB b inside Context.methodA with the following two statements:
   • assignment of x to member variable instXStrategy,
   • invocation of instXStrategy.methodA. The latter may be invoked in a return statement in case it has a non-void return type.

(d) Let ssv be the Context.methodA parameter that has the role of a Strategy selection variable. If ssv is not referenced inside methodA eliminate it from the list of formal parameters. Moreover, eliminate the respective actual parameters in methodA calls.

(e) For each Context method methodCz that invokes methodA do the following:
   • apply step (b) to that method,
   • introduce the new parameter x as a first argument in methodA invocations,
   • apply step (d) to methodCz for the parameter that was supplied as a value for the ssv parameter in methodA calls,
   • apply step (e) recursively for each Context method that invokes methodCz.

Fig. 5. Refactoring to Strategy in Pamvotis.
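The effect of steps (a) to (d) can be illustrated on a toy Context class that uses the generic names of the procedure. The sketch assumes that, before refactoring, methodA had the signature `methodA(int mode, int input)` with a two-branch conditional on `mode`, and that the only client passed the constant 1 for `mode`. The `Client` class, the field `offset` and the branch bodies are hypothetical.

```java
abstract class AbstractXStrategy {
    abstract int methodA(Context context, int input);
}

class ConcreteXStrategy1 extends AbstractXStrategy {
    @Override
    int methodA(Context context, int input) {
        return input + context.offset;
    }
}

class ConcreteXStrategy2 extends AbstractXStrategy {
    @Override
    int methodA(Context context, int input) {
        return input * 2;
    }
}

class Context {
    int offset = 10; // hypothetical Context state read by one strategy
    private AbstractXStrategy instXStrategy;

    // Step (b): new first parameter x of the abstract Strategy type.
    // Step (d): the former selection parameter `mode` is gone, since the method
    // body no longer references it.
    int methodA(AbstractXStrategy x, int input) {
        // Step (c): the conditional is replaced by two statements.
        instXStrategy = x;
        return instXStrategy.methodA(this, input);
    }
}

class Client {
    static int run() {
        // Step (a): the assumed former call `new Context().methodA(1, 5)` passed a
        // constant selecting the first branch, so the matching concrete Strategy
        // is now instantiated at the call site.
        return new Context().methodA(new ConcreteXStrategy1(), 5);
    }
}
```

Step (e) does not appear in this sketch because no other Context method calls methodA; with such callers, the new parameter would be threaded through their signatures in the same way.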

The application of TRCL to the SSB of Fig. 5 is presented in Fig. 6. Note that our identification algorithm does not suggest TRCL for the given conditional statement, as the values supplied by class GenericSource (Fig. 4) for the SSV distr are not constants. The refactoring included in Fig. 6 is valid under the assumption that class GenericSource is removed from project Pamvotis.

The annotations in Fig. 6 correspond to the steps of TRCL procedure that introduce or eliminate the respective code fragments. Specifically, step (a) applies to invocations of method RandomGenerator.genRv by client classes VideoSource and HTTPSource (the latter is not included in the figure and will not be discussed further). It, initially, determines which AbstractRandomGen concrete implementation, included in list L, corresponds to each character literal that is bound to parameter distr on RandomGenerator.genRv method invocations. In the case of VideoSource class, the aforementioned concrete implementation is ParetoRandomGen. Step (a) completes with the introduction of a ParetoRandomGen object instantiation expression as the first parameter in RandomGenerator.genRv invocations. Step (b) adds a formal parameter of type AbstractRandomGen to method RandomGenerator.generateRandomVariable. The latter declares the refactored SSB (switch statement depicted in Fig. 5) that is replaced by a polymorphic invocation of AbstractRandomGen.generateRandomVariable in step (c). Thus, RandomGenerator.generateRandomVariable is adapted in order to use the concrete Strategy implementations provided by client classes. Step (d) removes distr from the formal parameters' list of generateRandomVariable, as it is no longer referenced in the method body. Moreover, respective actual parameters are eliminated from all invocations of that method. Finally, step (e) changes the signature of generateRandomVariable callers, declared in the same class, through the introduction of a formal parameter of type AbstractRandomGen. The parameter is also introduced as actual parameter in respective generateRandomVariable invocations. Step (e) completes with the elimination of any occurrence of parameter distr, since it is no longer referenced in RandomGenerator class.

Fig. 6. Total replacement of conditional logic in Pamvotis.

3.3.3. Preconditions

The source code transformation of Section 3.3.2 comes with a set of preconditions that are evaluated on the candidate conditional statements for refactoring. Applying the transformation to a refactoring candidate that violates any of these preconditions results in erroneous source code that either has compilation errors or does not preserve the external behaviour of the system. The preconditions that are associated with the Base Refactoring Procedure of Section 3.3.2 are the following:

• Branch bodies should not contain assignments to two or more different local variables. This precondition is required when the refactoring is applied to Java source code where method parameters are always passed by-value. Thus, it is not possible to return multiple values as method results. In other object oriented languages, such as C++ where pass by-reference is supported, this precondition can be revoked. On the other hand, if branch bodies include assignments to a single local variable, its value can be returned by the concrete implementations of the polymorphic method as a return value.
• Branch bodies should not contain unstructured control flow statements (break or continue) in case that the refactoring candidate is nested inside iteration statements. If this precondition is violated the refactored code will contain compilation errors.
• Branch bodies should not contain any super method invocations since the move of the branch body outside the Context class will lead to compilation errors.
• The names of the newly introduced classes that belong to the Strategy hierarchy should not conflict with names of existing classes of the system.

These preconditions are also required in the JDeodorant method during refactoring to State/Strategy [15]. As regards Total Replacement of Conditional Logic, an additional precondition is required that prevents the application of the transformation in cases that the method containing the refactoring candidate overrides a super-class method or implements an interface operation. This precondition prevents compilation errors that result from non-conformance of the Context method signature with the signature of the overridden methods.

3.3.4. Implementation details

The proposed method has been implemented as part of the JDeodorant plug-in for Eclipse [7]. The syntactic analysis of Java source files, performed during execution of the refactoring identification algorithm, is realized through (a) AST parsing capabilities provided by the Eclipse Java Development Tools (JDT) Core infrastructure and (b) utility classes included in the JDeodorant project. Moreover, the identification of refactoring candidates is based on control and data flow analysis performed on Program Dependence Graph (PDG) representations of class methods that are constructed through relevant infrastructure of JDeodorant [18]. Finally, the transformations required for refactoring to Strategy have been implemented with functionality provided by JDT and the Eclipse Language Toolkit (LTK).

4. Experimental evaluation

The proposed method for automated identification of refactoring opportunities towards the Strategy design pattern has been applied on a set of software projects for evaluation purposes. The objectives of the experimental evaluation are: (a) quality assessment of the refactoring to Strategy suggestions, on the basis of judgments made by two software engineers, (b) estimation of the scalability of the refactoring identification algorithm and its implementation, in terms of execution efficiency.

The selection of the Java code projects, serving as input for the evaluation of our method, is based on the following requirements: (i) the projects should have different sizes and serve various application domains in order to enable, to a certain extent, the generalization of the derived outcomes, (ii) projects should be familiar to the evaluators, ensuring, thus, a high reliability on their judgments, (iii) projects unfamiliar to the evaluators should be of small to medium size in order to facilitate the review and understanding of their functionality. Table 1 summarizes the projects used for the evaluation of this work. Quality assessment is based on three of these projects (FlowScheduler, IceHockeyManager and MSSDesigner), while the scalability of our approach is evaluated on all projects.

Table 1. Description of the software projects used in the evaluation.

| Project name | Open source? | Description |
|---|---|---|
| Pamvotis 1.1 [19] | Yes | IEEE 802.11 network simulator |
| MSSDesigner [20,21] | No | Graphical data modelling tool |
| Nutch 1.1 [22] | Yes | Web search system |
| Violet 0.21.2 [23] | Yes | Graphical UML editor |
| FlowScheduler [24] | No | Research prototype |
| IceHockeyManager 0.4 [25] | Yes | Ice hockey manager game |
| Apache Ant 1.8.2 [26] | Yes | Build management tool |
| Jade 4.1 [27] | Yes | Application development framework |

Table 2 provides information on the size and structure of the aforementioned projects such as numbers of Source Lines of Code (SLOC), classes and methods. The last column of this table provides the number of candidate SSBs that need to be evaluated by the proposed identification algorithm for each project.

4.1. Quality assessment of refactoring identification algorithm

The evaluation of the refactoring identification algorithm is based on the manual detection of refactoring candidates, in projects FlowScheduler, IceHockeyManager and MSSDesigner (Table 2), by two expert software engineers that will be henceforth referred to as evaluators. Both evaluators are familiar with the Strategy design pattern and its appropriateness for dealing with conditional complexity. For each Java project, the evaluators individually inspect all if/switch conditional statements with at least two branches (i.e., statements that satisfy the minimum requirements for being considered as refactoring candidates) in search of refactoring opportunities. Let S be this set of conditional statements for a given Java project. The approval of a conditional statement as refactoring candidate is contingent on the evaluator's judgement. The refactoring identification algorithm is executed on the Java projects and quality assessment, from the viewpoint of each evaluator, is performed on its results. Specifically, the automatically identified refactoring opportunities are compared against the refactorings suggested by evaluators and a distinct set of quality metrics is calculated for each one of them.

Depending on the characterization of the elements of S as refactoring opportunities by the evaluator and the identification algorithm, the set S can be divided into the following disjoint subsets: (a) True Positive elements ($S_{TP}$), i.e., refactoring opportunities correctly identified by the algorithm, with respect to the evaluator’s judgement, (b) False Positive elements ($S_{FP}$), i.e., refactoring opportunities incorrectly identified by the algorithm, (c) True Negative elements ($S_{TN}$), i.e., refactoring suggestions correctly rejected by the algorithm, (d) False Negative elements ($S_{FN}$), i.e., valid refactoring opportunities, according to the evaluator, that are falsely rejected by the algorithm.

The quality assessment of the identification algorithm is based on three metrics, namely Precision, Recall and Accuracy. For a given evaluator and Java project, these metrics are calculated on the cardinalities TP, FP, TN, FN of the respective subsets $S_{TP}$, $S_{FP}$, $S_{TN}$, $S_{FN}$:

\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]

\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]

\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \]
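As a worked illustration of these definitions, the following minimal Java sketch computes the three metrics from the four cardinalities. The class and method names are hypothetical and not part of JDeodorant; the input values are the FlowScheduler counts reported for the first evaluator in Table 3.

```java
// Illustrative helper (not part of JDeodorant): computes the three quality
// metrics from the cardinalities TP, FP, TN, FN defined above.
final class QualityMetrics {
    private QualityMetrics() {}

    static double precision(int tp, int fp) {
        return (double) tp / (tp + fp);
    }

    static double recall(int tp, int fn) {
        return (double) tp / (tp + fn);
    }

    static double accuracy(int tp, int fp, int tn, int fn) {
        return (double) (tp + tn) / (tp + fp + tn + fn);
    }

    /** Converts a ratio to a percentage rounded to two decimals. */
    static double percent(double ratio) {
        return Math.round(ratio * 10000.0) / 100.0;
    }

    public static void main(String[] args) {
        // FlowScheduler, Evaluator 1 (Table 3): TP = 6, FP = 8, TN = 245, FN = 9
        int tp = 6, fp = 8, tn = 245, fn = 9;
        System.out.println("Precision (%): " + percent(precision(tp, fp)));
        System.out.println("Recall (%):    " + percent(recall(tp, fn)));
        System.out.println("Accuracy (%):  " + percent(accuracy(tp, fp, tn, fn)));
    }
}
```

Accuracy is dominated by the large TN count, which is why it stays above 90% even when precision and recall are moderate.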

Table 3 includes the values for precision, recall and accuracy that result from the evaluation of the algorithm’s suggested refactorings on projects FlowScheduler, IceHockeyManager and MSSDesigner. The quality metrics are presented separately for each individual evaluator. Their different values are due to the non-identical sets of refactoring candidates identified by the two evaluators, denoting a degree of subjectivity in the assessment of whether branch bodies of a conditional statement represent algorithms that are significant enough to be represented as methods of Concrete Strategy classes. On the basis of the number of identified refactorings (TP + FN; 41 versus 65 candidates over the three projects), the second evaluator is more conservative in his suggestions, probably due to his higher familiarity with the Java projects. Nevertheless, the quality assessment results show that our refactoring identification algorithm provides, with satisfactory confidence, meaningful refactoring suggestions to both evaluators.

Evaluation results for the recall metric show that the algorithm identifies, on average, roughly half or more of the refactoring candidates suggested by the evaluators (recall between 37.50% and 80.00%). Note that recall decreases as the number of False Negatives (FNs) grows, i.e., as more “valid” refactorings are missed by the algorithm. A study of the FNs encountered in the Java projects led to the following categorization of the reasons for their rejection by our method:

1. presence of local variables in branch conditions. These local variables take their values from parameters or fields “controlled” by Context clients.

2. presence of attributes in branch conditions that are initialized inside the Context class. These attributes are assigned a default value during their declaration or take their value from parameters “controlled” by Context clients.

3. presence of attributes and local variables in branch conditions whose value is “controlled” by the Context class.

The majority of FNs fall into the first two cases and, thus, could possibly be eliminated with appropriate handling of local variables and fields (with assignments inside Context) participating in branch conditions. Specifically, more extensive data flow analysis is required, embracing these types of variables, in order to conclude whether their values originate from Context clients and, thus, to take into account the respective conditional statements in the refactoring identification process. In certain cases, local variables or Context fields are defined with results of method invocations that retrieve system properties (e.g., System.getProperty()) or information from configuration files. Although the source of their assigned values is external to the system, tracing it is a rather challenging task, since it requires semantic analysis of the system’s code.
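The situations described above can be illustrated with a small, hypothetical Context class. All names are invented for this sketch; it only shows what the conditions look like syntactically and does not reproduce any code from the evaluated projects.

```java
import java.util.Locale;

// Hypothetical Context class; every name below is illustrative.
class ReportExporter {
    // Case 2: a field that receives a default value at its declaration and is
    // later overwritten from a client-controlled parameter.
    private String layout = "plain";

    void setLayout(String layout) {
        this.layout = layout;
    }

    // Case 1: the branch condition tests a local variable whose value is
    // derived from a client-controlled parameter.
    String export(String requestedFormat, String data) {
        String format = requestedFormat.toLowerCase(Locale.ROOT); // local variable
        if (format.equals("csv")) {
            String escaped = data.replace("\"", "\"\"");
            return "\"" + escaped + "\"";
        } else {
            String escaped = data.replace("&", "&amp;").replace("<", "&lt;");
            return "<cell>" + escaped + "</cell>";
        }
    }

    // Case 2 in a branch condition: the field is assigned inside the Context
    // class, although clients ultimately decide its value via setLayout().
    String header(String title) {
        if (layout.equals("boxed")) {
            String padded = " " + title + " ";
            return "[" + padded + "]";
        } else {
            String upper = title.toUpperCase(Locale.ROOT);
            return upper + ":";
        }
    }

    // A local variable defined from a system property: the value originates
    // outside the system, but this is not visible from the syntax alone.
    String lineEnding() {
        String os = System.getProperty("os.name", "").toLowerCase(Locale.ROOT);
        if (os.contains("win")) {
            return "\r\n";
        } else {
            return "\n";
        }
    }
}
```

In each method the branch choice is ultimately decided outside the Context class, yet establishing this requires following the assignment of a local variable or field back to its source.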

Precision ranges around 50% (33.33–64.29%) across all projects and, thus, roughly half of the suggested refactorings are meaningful to the evaluators. Precision improves as the number of False Positives (FPs) declines. Analysis of the conditional statements marked as FPs during evaluation revealed that they are, generally, characterized by branch bodies with trivial logic, such as initialization of user interface components, object instantiation and parameterization, or output formatting. However, our algorithm suggests these conditional statements as refactoring candidates due to their relatively high average number of statements per branch (≥2). The elimination of these FPs requires semantic analysis of branch bodies and will be handled in future versions of this work.
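A hypothetical conditional of the kind described above might look as follows. The class is invented for illustration: its branch is selected by a client-supplied parameter and each branch has at least two statements, but both branches merely format output.

```java
import java.util.Locale;

// Hypothetical example; the class and method names are illustrative.
class PriceLabel {
    // Each branch contains at least two statements, yet both only format
    // output; there is no alternative algorithm worth extracting.
    static String render(boolean compact, String item, double price) {
        if (compact) {
            String amount = String.format(Locale.ROOT, "%.0f", price);
            return item + ":" + amount;
        } else {
            String amount = String.format(Locale.ROOT, "%.2f", price);
            String padded = String.format(Locale.ROOT, "%-10s", item);
            return padded + " " + amount + " EUR";
        }
    }
}
```

A purely syntactic size criterion cannot tell such a statement apart from one whose branches hold genuinely different algorithms.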

Table 2
Source code attributes of the software projects.

Project name            SLOC      Classes   Methods   If/switch statements with ≥2 branches
Pamvotis 1.1            5285      33        299       74
MSSDesigner             10,671    37        513       250
Nutch 1.1               16,810    206       1050      200
Violet 0.21.2           19,965    224       1430      107
FlowScheduler           23,452    213       1640      268
IceHockeyManager 0.4    27,113    310       2320      280
Apache Ant 1.8.2        103,148   1258      9430      1648
Jade 4.1                106,036   1736      8582      1665

Table 3
Evaluation of refactoring identification algorithm.

Project             TP    FP    TN    FN    Precision (%)   Recall (%)   Accuracy (%)
Evaluator 1
FlowScheduler       6     8     245   9     42.86           40.00        93.66
IceHockeyManager    18    12    242   8     60.00           69.23        92.86
MSSDesigner         9     7     219   15    56.25           37.50        91.20
Evaluator 2
FlowScheduler       9     5     245   9     64.29           50.00        94.78
IceHockeyManager    10    20    247   3     33.33           76.92        91.79
MSSDesigner         8     8     232   2     50.00           80.00        96.00

Table 5
Execution time of the refactoring identification algorithm.

Project name            Execution time in ms
                        AST creation   Identification time   Total
Pamvotis                3376           1453                  4829
MSSDesigner             11,454         4968                  16,422
Nutch                   18,500         5281                  23,781
Violet 0.21.2           9671           3922                  13,593
FlowScheduler           6544           3437                  9981
IceHockeyManager 0.4    15,360         12,312                27,672
Apache Ant 1.8.2        39,641         74,234                113,875
Jade 4.1                66,187         72,688                138,875

Since refactoring to patterns aims at improving code quality, the evaluation of our method for refactoring to Strategy also involves a study of its impact on software quality metrics. The focus is on assessing the improvement to cyclomatic complexity M that results from introducing the Strategy pattern to the automatically identified refactoring opportunities. Moreover, we estimate the reduction of method size MLOC (Method Lines of Code), in terms of lines of code, for methods declaring the respective conditional statements. We adopt McCabe’s Cyclomatic Complexity metric [28] as a quantitative indicator of conditional complexity and calculate its value before and after introducing the Strategy pattern, for each method that includes a refactoring candidate.

Table 4 summarizes these code quality measurements for the three projects that participate in quality assessment. Column Mp includes the average value of cyclomatic complexity, prior to refactoring, over all methods that declare a refactoring candidate identified by our algorithm. Column Ma includes the aforementioned metric after refactoring. The average size in lines of code for methods declaring a refactoring candidate, prior to and after refactoring, is presented in columns MLOCp and MLOCa respectively. The standard deviation corresponding to each average value is provided inside parentheses.

The code quality measurements show a considerable improvement to the McCabe Cyclomatic Complexity (15–30%) and MLOC (20–30%) metrics. Thus, our method succeeds in reducing the code size and complexity of several classes in each code base through the introduction of the Strategy design pattern. Note, however, that this process does not eliminate conditional logic (although TRCL, when applicable, achieves a partial elimination). Instead, it distributes the logic across the Context and Concrete Strategy classes, resulting in more maintainable and understandable code units.
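The effect on method-level complexity can be seen in a minimal before/after sketch. The classes below are hypothetical and the transformation shown is only one plausible shape of the result, not the exact output of the JDeodorant implementation: branch bodies move to Concrete Strategy classes, while a simplified selection conditional remains in the Context.

```java
// Hypothetical classes; all names are illustrative.

// Before: the branch taken depends on a client-supplied parameter and each
// branch holds a distinct computation. fee() has two decision points, so its
// cyclomatic complexity is 3.
class ShippingBefore {
    double fee(String mode, double weightKg) {
        if (mode.equals("express")) {
            double base = 12.0;
            return base + 2.5 * weightKg;
        } else if (mode.equals("economy")) {
            double base = 3.0;
            return base + 0.5 * weightKg;
        } else {
            double base = 6.0;
            return base + 1.0 * weightKg;
        }
    }
}

// After: one Concrete Strategy per former branch body.
interface FeeStrategy {
    double fee(double weightKg);
}

class ExpressFee implements FeeStrategy {
    public double fee(double weightKg) {
        double base = 12.0;
        return base + 2.5 * weightKg;
    }
}

class EconomyFee implements FeeStrategy {
    public double fee(double weightKg) {
        double base = 3.0;
        return base + 0.5 * weightKg;
    }
}

class StandardFee implements FeeStrategy {
    public double fee(double weightKg) {
        double base = 6.0;
        return base + 1.0 * weightKg;
    }
}

class ShippingAfter {
    // fee() now has no decision points (cyclomatic complexity 1).
    double fee(String mode, double weightKg) {
        return strategyFor(mode).fee(weightKg);
    }

    // The selection logic is not eliminated; it is reduced to trivial
    // branches that only pick a Concrete Strategy.
    private FeeStrategy strategyFor(String mode) {
        if (mode.equals("express")) {
            return new ExpressFee();
        } else if (mode.equals("economy")) {
            return new EconomyFee();
        } else {
            return new StandardFee();
        }
    }
}
```

The original method shrinks to a single polymorphic call, and a new delivery mode can be added as a further FeeStrategy implementation without touching the existing fee computations.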

Table 4
Impact of refactoring on software quality metrics.

Project name        Mp              Ma              MLOCp             MLOCa
MSSDesigner         9.31 (±5.31)    7.31 (±3.32)    51.75 (±31.78)    40.31 (±24.87)
FlowScheduler       7.64 (±3.03)    4.71 (±2.52)    35.71 (±16.56)    23.50 (±13.34)
IceHockeyManager    4.24 (±1.83)    3.14 (±1.51)    20.03 (±9.80)     13.86 (±8.87)

4.2. Scalability of the approach

The merits of automated identification of refactoring opportunities escalate with the size of the code project under maintenance. A basic reason is that large projects, in terms of Source Lines of Code (SLOC), number of classes and methods etc., are more difficult to review and maintain, as they are usually implemented by multiple developers. Thus, an essential part of the evaluation of our refactoring identification method relates to its scalability in terms of execution time. The projects that constitute the input data are included in Table 2 and have various size attributes in terms of SLOC, number of classes, methods and candidate SSBs.

Table 5 presents the execution time of the identification algorithm when applied to the projects of Table 2, as well as its decomposition into time components that correspond to different processing tasks of the algorithm. The measurements were collected by running the algorithm on a workstation equipped with a 2.2 GHz dual core processor and 4 GB of RAM. The column AST Creation represents the time required for parsing the source files and constructing the AST for each project, while Identification Time corresponds to the actual algorithm execution time. The total execution time for each project is presented in the last column of the table.
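Such a decomposition can be obtained by timing each phase separately and adding the results. The following Java sketch shows one way to do so; the harness and the two stand-in tasks are hypothetical and do not call any JDeodorant or Eclipse API.

```java
import java.util.function.Supplier;

// Hypothetical timing harness; the two tasks in main() are stand-ins for the
// AST creation and identification phases, not real JDeodorant calls.
final class PhaseTimer {
    /** Result of a task together with its elapsed wall-clock time. */
    static final class Timed<T> {
        final T value;
        final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    static <T> Timed<T> time(Supplier<T> task) {
        long start = System.nanoTime();
        T value = task.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Timed<>(value, elapsedMillis);
    }

    public static void main(String[] args) {
        // Stand-in for parsing source files and building the AST.
        Timed<int[]> ast = time(() -> new int[1_000_000]);
        // Stand-in for running the identification algorithm on that AST.
        Timed<Integer> candidates = time(() -> ast.value.length / 1000);

        long total = ast.millis + candidates.millis;
        System.out.println("AST creation (ms):        " + ast.millis);
        System.out.println("Identification time (ms): " + candidates.millis);
        System.out.println("Total (ms):               " + total);
    }
}
```

Reporting the phases separately makes it possible to see how much of the total is spent on parsing rather than on the identification step itself.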

Execution time results show a positive correlation between the AST Creation time and both the SLOC and the number of compilation units of each code base (Table 2). On the other hand, the Identification Time generally increases with the number of if/switch statements that are processed by the algorithm in each project. The results support the scalability of the approach, since total execution time is below 30 s for small to medium size projects and does not exceed 2.5 min for large projects. Note that the actual processing time of the identification algorithm is even lower (<13 s for medium size projects, <75 s for large ones), since the file parsing and AST creation tasks represent a significant part of the total execution time.

5. Conclusions and future work

We have proposed a method for simplification of conditional logic in object oriented source code, through the automated introduction of polymorphism with the use of the Strategy design pattern. The method introduces automation to (a) the identification of refactoring opportunities towards the Strategy pattern and (b) the application of the refactoring through a source code transformation. The identification of refactoring opportunities is based on an algorithm that operates on a given class (candidate Context class) and evaluates properties of its conditional statements through syntactic, control and data flow analysis. Suggested refactoring opportunities comprise conditional statements that are characterized by certain analogies to the Strategy pattern, in terms of the purpose and selection mode of strategies: (i) conditional branches include mutually exclusive, non-trivial computations, corresponding to alternative algorithm implementations represented by strategies, (ii) branch selection is not controlled by Context class logic but, instead, by its clients, analogously to Strategy selection in the Strategy pattern. The presence of these analogies in a conditional statement is detected by ensuring that the values of variables that participate in branch condition expressions are controlled by Context clients. The types of variables that are processed are method parameters and Context fields. On the other hand, the presence of local variables in branch condition expressions results in rejection of the conditional statement from refactoring opportunities, since local variables are always initialized and calculated inside the Context class. Concerning the application of refactoring to Strategy, we provide its specification as a series of primitive refactoring steps. A basic feature of the proposed transformation is the Total Replacement of Conditional Logic, which, under certain conditions, eliminates a conditional statement and replaces it with a polymorphic method invocation on a reference of the abstract Strategy class.
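The variable-kind criterion summarized above can be sketched as a simple filter. This is a deliberately simplified, hypothetical model and not the identification algorithm itself: it omits the control and data flow analysis, and the threshold of two statements per branch is an assumption read from the false-positive discussion in Section 4.1.

```java
import java.util.List;

// Simplified, hypothetical model of a conditional statement; it is not
// JDeodorant's internal representation.
enum VariableKind { PARAMETER, FIELD, LOCAL }

final class ConditionalSummary {
    final int branches;
    final int statementsInBranches; // total over all branches
    final List<VariableKind> conditionVariables;

    ConditionalSummary(int branches, int statementsInBranches,
                       List<VariableKind> conditionVariables) {
        this.branches = branches;
        this.statementsInBranches = statementsInBranches;
        this.conditionVariables = conditionVariables;
    }
}

final class CandidateFilter {
    private CandidateFilter() {}

    static boolean isCandidate(ConditionalSummary c) {
        // Minimum requirement: at least two branches.
        if (c.branches < 2) {
            return false;
        }
        // Assumed size threshold: on average at least two statements per branch.
        double averageStatements = (double) c.statementsInBranches / c.branches;
        if (averageStatements < 2.0) {
            return false;
        }
        // Any local variable in a branch condition leads to rejection;
        // method parameters and Context fields are processed further.
        return !c.conditionVariables.contains(VariableKind.LOCAL);
    }
}
```

A statement that passes this filter would still have to be checked by flow analysis to confirm that its parameters and fields are in fact controlled by Context clients.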

We have implemented and integrated the proposed method in the JDeodorant Eclipse plug-in for refactoring automation support in Java projects. On the basis of this implementation, we have evaluated the proposed identification algorithm in terms of quality of suggested refactorings and runtime efficiency. The evaluation involved execution of the identification algorithm on a set of Java code projects and rating of the refactoring suggestions, in terms of their relevance, by two expert software engineers. Quality assessment results support the effectiveness of our algorithm, since almost half of the suggested refactorings were meaningful to the evaluators. Moreover, the runtime efficiency of the algorithm is rather satisfactory, as the total execution time on the benchmark Java projects did not exceed 30 s and 2.5 min for code bases of medium and large size, respectively.

Our future work will focus on improvements to the refactoring identification method. Quality assessment of this work revealed that discovery of refactoring opportunities to Strategy is a challenging task, even for a software engineer, because the semantics of the Strategy design pattern are difficult to map to source code attributes. Towards this direction, we plan to create an extensive data-set of refactoring candidates from various code bases and apply statistical and machine learning methods on attributes of their respective code fragments. Our goal is to automatically infer additional rules for identification of refactorings to Strategy. As part of our future work, we also intend to broaden the scope of applicability of Total Replacement of Conditional Logic in order to further decrease the conditional complexity of refactored code fragments.

Acknowledgments

The authors thank Dr. N. Diamantidis and Dr. D. Vassis for their contribution as evaluators in the quality assessment of the proposed method. The authors would also like to thank Dr. Tsantalis, Prof. Chatzigeorgiou, as well as the rest of the JDeodorant development team members for providing us access to the project’s source code. Finally, we would like to thank the anonymous reviewers for their useful comments that improved the quality of this work.

References

[1] M. Fowler, Refactoring: Improving the Design of Existing Code, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1999.

[2] J. Shore, S. Warden, The Art of Agile Development, first ed., O’Reilly, 2007.

[3] S. Demeyer, Object-oriented reengineering, in: T. Mens, S. Demeyer (Eds.), Software Evolution, Springer, 2008, pp. 91–104.

[4] T. Mens, T. Tourwe, A survey of software refactoring, IEEE Transactions on Software Engineering 30 (2004) 126–139.

[5] J. Kerievsky, Refactoring to Patterns, Pearson Higher Education, 2004.

[6] E. Gamma, R. Helm, R. Johnson, J. Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 1995.

[7] A. Chatzigeorgiou, N. Tsantalis, T. Chaikalis, M. Fokaefs, Jdeodorant Eclipse Plug-in, 2011. <http://java.uom.gr/jdeodorant/>.

[8] L. Tokuda, D. Batory, Automated software evolution via design pattern transformations, in: International Symposium on Applied Corporate Computing.

[9] M. Cinneide, P. Nixon, A methodology for the automated introduction of design patterns, in: Proceedings of the IEEE International Conference on Software Maintenance, 1999 (ICSM ’99), pp. 463–472.

[10] S.-U. Jeon, J.-S. Lee, D.-H. Bae, An automated refactoring approach to design pattern-based program transformations in java programs, in: Proc. of Ninth Asia-Pacific Software Engineering Conference (APSEC’02), IEEE Computer Society, 2002, pp. 337–345.

[11] C. Jebelean, C.-B. Chirila, V. Cretu, A logic based approach to locate composite refactoring opportunities in object-oriented code, in: IEEE International Conference on Automation Quality and Testing Robotics (AQTR’10), vol. 3, pp. 1–6.

[12] ROOTS research group, JTransformer query and transformation engine for Java source code, Computer Science Department III, University of Bonn, 2011.

[13] N. Juillerat, B. Hirsbrunner, Toward an implementation of the “form template method” refactoring, in: Seventh IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM’07), pp. 81–90.

[14] N. Juillerat, Models and Algorithms for Refactoring Statements, Ph.D. Thesis, Faculty of Science of the University of Fribourg, Switzerland, 2009.

[15] N. Tsantalis, A. Chatzigeorgiou, Identification of refactoring opportunities introducing polymorphism, Journal of Systems and Software 83 (2010) 391–404.

[16] E. Freeman, E. Freeman, B. Bates, K. Sierra, Head First Design Patterns, O’Reilly & Associates, Inc., 2004.

[17] J. Ferrante, K.J. Ottenstein, J.D. Warren, The program dependence graph and its use in optimization, ACM Transactions on Programming Languages and Systems 9 (1987) 319–349.

[18] N. Tsantalis, A. Chatzigeorgiou, Identification of extract method refactoring opportunities for the decomposition of methods, Journal of Systems and Software 84 (2011) 1757–1782.

[19] D.E. Vassis, V.E. Zafeiris, IEEE 802.11 WLAN Simulator Pamvotis v.1.1, 2012. <http://www.pamvotis.org>.

[20] V. Zafeiris, C. Doulkeridis, Y. Stavrakas, M. Gergatsoulis, An infrastructure for manipulating multidimensional semistructured data, in: Proceedings of the Hellenic Data Management Symposium (HDMS’02).

[21] Y. Stavrakas, M. Gergatsoulis, C. Doulkeridis, V. Zafeiris, Representing and querying histories of semistructured databases using multidimensional OEM, Elsevier Information Systems 29 (2004) 461–482.

[22] Apache Software Foundation, Nutch Web-search Software v.1.1, 2012. <http://nutch.apache.org/>.

[23] C.S. Horstmann, A. de Pellegrin, Violet Open Source UML Editor v.0.21.2, 2012. <http://violet.sourceforge.net/>.

[24] V.E. Zafeiris, E. Giakoumakis, Optimized traffic flow assignment in multi-homed, multi-radio mobile hosts, Computer Networks 55 (2011) 1114–1131.

[25] IceHockeyManager, Ice Hockey Manager Game v.0.4, 2012. <http://ihm.sourceforge.net>.

[26] Apache Software Foundation, Ant Tool for Software Build Automation, 2012. <http://ant.apache.org>.

[27] Telecom Italia Lab, JADE (Java Agent DEvelopment Framework) Framework for Development of Agent-based Applications, 2012. <http://jade.tilab.com>.

[28] T. McCabe, A complexity measure, IEEE Transactions on Software Engineering SE-2 (1976) 308–320.