Threats to the Validity of Refactoring Studies Rainer Schaden Freie Universität Berlin June 30, 2016


Outline

Introduction

How to find Refactorings?

Repository Mining Tools

Planned Approach

FU Berlin, Threats to the Validity of Refactoring Studies, June 30, 2016


What is Refactoring?

• Opdyke - PhD thesis (1992)
• Fowler - Refactoring: Improving the Design of Existing Code (1999)

Definition: “A change made to the internal structure of software to make it easier to understand and cheaper to modify without changing its observable behavior”

— Martin Fowler


Why Refactoring?

• Prevents software aging
• Improves:
  • Design
  • Understandability
  • Maintainability
  • Developer’s productivity


Refactoring Tactics

• Floss Refactoring:
  • Frequent
  • Mixed with other program changes
• Root-Canal Refactoring:
  • Infrequent
  • Lengthy
  • Hardly any other program changes


Goal of this Thesis

• Examine refactoring studies with regard to:
  • Credibility
  • Relevance
  • Validity


How to find Refactorings?

• Goal: empirical refactoring data
• Four methods (Murphy-Hill et al.):
  • Mining the Commit Log
  • Analyzing Code Histories
  • Observing Programmers
  • Logging Refactoring Tool Use


Mining the Commit Log

• Search for "refactor", "rename", etc. in commit messages
• Problems:
  • Bad commit messages
  • Floss refactoring
  • Tendency towards certain, large refactorings
• Assumptions:
  • Developer recalls refactoring
  • And describes it accurately
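The keyword search can be sketched in a few lines of Python. This is only an illustration: the commit log is made up, the function name `find_refactoring_commits` is hypothetical, and the keyword list contains just the two terms named on the slide.

```python
import re
from typing import Iterable, List, Tuple

# Only the two keywords named on the slide; a real study would use a longer list.
KEYWORDS = ("refactor", "rename")


def find_refactoring_commits(
    commits: Iterable[Tuple[str, str]],
    keywords: Iterable[str] = KEYWORDS,
) -> List[str]:
    """Return the ids of commits whose message mentions any keyword."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords), re.IGNORECASE)
    return [commit_id for commit_id, message in commits if pattern.search(message)]


# Made-up commit log, for illustration only.
log = [
    ("a1", "Refactor parser into smaller methods"),
    ("b2", "Fix null pointer in login"),  # a floss refactoring could hide in here
    ("c3", "renamed Foo to Bar"),
    ("d4", "misc changes"),               # uninformative message
]

matches = find_refactoring_commits(log)
print(matches)
```

Commits `b2` and `d4` show the weakness listed above: if a refactoring is mixed into a bug fix or the message is uninformative, the search cannot see it.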


Analyzing Code Histories

• Analyzing versions of source code
• Problems:
  • Manual inspection is slow and error-prone
  • Comparing non-consecutive versions
  • Tools only recognize limited number of refactorings
  • Tools use heuristics
• Assumptions:
  • Adequate granularity in code history
  • Appropriate window of observation
  • Heuristics are parameterized correctly


Observing Programmers

• Direct observation: controlled experiment
• Indirect observation: survey
• Problems:
  • Controlled experiments are expensive
  • Bad external validity of controlled experiments
  • Indirect observation relies on memory of developers
• Assumptions:
  • Developers recall refactorings accurately
  • Frequent refactorings
  • External validity


Logging Refactoring Tool Use

• Use log files of automated refactoring tools
• Problems:
  • Requires extensive log files
  • Can’t track manual refactorings
• Assumptions:
  • Developers use tools for refactoring


Strengths and Weaknesses

• Implicit/Explicit
• Accuracy and Precision
• Context
• Fidelity


Ref-Finder

• Prete et al. (2010)
• Analyzing versions of source code
• Supports 63 refactorings
• Logic rules


Ref-Finder Description

• Input: Two versions of a program
• Basic facts
  • E.g. method(methodFullName, methodShortName, typeFullName)
• deleted_ and added_ facts
• similar_body facts
• Refactorings are rules
• Topological sort
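As a rough sketch of two of these steps, the Python below derives deleted_ and added_ facts as set differences between two fact bases and then orders rules topologically, so that a rule runs only after the rules it builds on. The fact tuples follow the `method(...)` shape above. The sample facts and the rule dependencies are invented for illustration and are not taken from Ref-Finder.

```python
from graphlib import TopologicalSorter

# Facts shaped like method(methodFullName, methodShortName, typeFullName).
old_facts = {
    ("method", "A.foo", "foo", "A"),
    ("method", "A.bar", "bar", "A"),
}
new_facts = {
    ("method", "A.foo", "foo", "A"),
    ("method", "B.bar", "bar", "B"),
}


def diff_facts(old, new):
    """Facts only in the old version become deleted_*, facts only in the new one added_*."""
    deleted = {("deleted_" + f[0],) + f[1:] for f in old - new}
    added = {("added_" + f[0],) + f[1:] for f in new - old}
    return deleted, added


deleted, added = diff_facts(old_facts, new_facts)

# Hypothetical dependencies: each rule maps to the rules that must be evaluated first.
rule_dependencies = {
    "move_method": set(),
    "pull_up_method": {"move_method"},
    "extract_superclass": {"pull_up_method"},
}
rule_order = list(TopologicalSorter(rule_dependencies).static_order())

print(deleted, added, rule_order)
```

`graphlib` is in the standard library from Python 3.9 on. The ordering matters because a complex refactoring rule can use the results of simpler ones as preconditions.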


Ref-Finder Example Rule

Example

  deleted_method(mFullName, mShortName, t1FullName) ∧
  added_method(newmFullName, mShortName, t2FullName) ∧
  similar_body(newmFullName, newmBody, mFullName, mBody) ∧
  NOT(equals(t1FullName, t2FullName))
  → move_method(mShortName, t1FullName, t2FullName)
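The rule can be read as a join over deleted and added method facts. Below is a minimal Python sketch of that reading. It assumes `difflib.SequenceMatcher` as a stand-in for the body-similarity check (the slides do not say which metric Ref-Finder uses), and the default threshold of 0.85 is borrowed from the Prete et al. quote on recall. All sample facts and bodies are made up.

```python
from difflib import SequenceMatcher


def similar_body(body_a: str, body_b: str, threshold: float = 0.85) -> bool:
    """Stand-in similarity check; difflib is an assumption, not Ref-Finder's metric."""
    return SequenceMatcher(None, body_a, body_b).ratio() >= threshold


def find_move_methods(deleted_methods, added_methods, old_bodies, new_bodies):
    """Apply the move_method rule to deleted_method / added_method facts.

    Each fact is (fullName, shortName, typeFullName); bodies map fullName -> source text.
    """
    moves = []
    for m_full, m_short, t1 in deleted_methods:
        for new_full, new_short, t2 in added_methods:
            if (
                m_short == new_short      # same short name on both sides
                and t1 != t2              # NOT(equals(t1FullName, t2FullName))
                and similar_body(new_bodies[new_full], old_bodies[m_full])
            ):
                moves.append((m_short, t1, t2))
    return moves


# Made-up facts: bar disappears from type A and reappears in type B.
deleted_methods = [("A.bar", "bar", "A")]
added_methods = [("B.bar", "bar", "B"), ("B.baz", "baz", "B")]
old_bodies = {"A.bar": "return x + y;"}
new_bodies = {"B.bar": "return x + y;", "B.baz": "log(x);"}

moves = find_move_methods(deleted_methods, added_methods, old_bodies, new_bodies)
print(moves)
```

The threshold is the kind of heuristic parameter that the "Analyzing Code Histories" slide lists as an assumption: set too low, unrelated methods match; set too high, a moved method that was also edited is missed.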


Ref-Finder Problems: Missing Refactorings

• Nine of 72 refactorings are not detected
  • Including “Substitute Algorithm”


Ref-Finder Problems: Wrong precision

• $\text{Precision} = \frac{|E \cap R|}{|R|}$
• $\text{Recall} = \frac{|E \cap R|}{|E|}$

Figure: Ref-Finder precision (Source: Prete et al.)
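A small worked example of the two measures, in Python. The slide does not define the symbols, so this sketch assumes E is the set of expected (actually performed) refactorings and R is the set the tool reports. The sample sets are invented.

```python
def precision(expected: set, reported: set) -> float:
    """|E ∩ R| / |R|: share of reported refactorings that are correct."""
    return len(expected & reported) / len(reported) if reported else 0.0


def recall(expected: set, reported: set) -> float:
    """|E ∩ R| / |E|: share of expected refactorings that were reported."""
    return len(expected & reported) / len(expected) if expected else 0.0


# Invented sets: four refactorings really happened, the tool reports three.
E = {"move_method:bar", "rename_method:foo", "extract_method:baz", "pull_up_field:id"}
R = {"move_method:bar", "rename_method:foo", "inline_method:qux"}

p = precision(E, R)  # 2 of the 3 reported refactorings are correct
r = recall(E, R)     # 2 of the 4 expected refactorings were found
print(p, r)
```

Precision can be checked by inspecting the tool's output alone. Recall needs an independent list of what actually happened, which is much harder to obtain.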


Ref-Finder Problems: Questionable Recall

Recall: “Since it is hard to find known refactorings, we ran REF-FINDER using similarity threshold σ = 0.65 and manually inspected randomly chosen refactorings until we found 10 correct refactorings. We then measured a recall against this data set at a more reasonable threshold, σ = 0.85.”

— Prete et al.
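One way to see why this procedure is questionable is a toy model in Python. Every number below is made up and only the two thresholds come from the quote. The reference set is built from the tool's own output at the lower threshold, so refactorings the tool can never detect are absent from the denominator.

```python
# Toy ground truth: the similarity score the tool would assign to each true
# refactoring, or None if no rule matches it at all. All values are made up.
true_refactorings = {
    "r1": 0.95, "r2": 0.90, "r3": 0.88, "r4": 0.80,
    "r5": 0.70, "r6": 0.60, "r7": 0.50, "r8": None,
}


def detected(threshold: float) -> set:
    """True refactorings the tool reports at a given similarity threshold."""
    return {r for r, s in true_refactorings.items() if s is not None and s >= threshold}


def recall(reported: set, reference: set) -> float:
    """Share of the reference set that was reported."""
    return len(reported & reference) / len(reference)


# Reference set built from the tool's own output at the lower threshold.
reference_from_tool = detected(0.65)
reported = detected(0.85)

recall_vs_tool_reference = recall(reported, reference_from_tool)
recall_vs_ground_truth = recall(reported, set(true_refactorings))
print(recall_vs_tool_reference, recall_vs_ground_truth)
```

In this toy setting, recall measured against the tool-derived reference is higher than recall against the full ground truth, because `r6`, `r7` and `r8` never enter the reference set.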


UMLDiff

• Xing and Stroulia (2005)
• Structural-differencing algorithm
• Name-similarity and structure-similarity heuristics
• Supports 33 refactorings


RefactoringCrawler

• Dig et al. (2006)
• Syntactic analysis using Shingles encoding
• Semantic analysis
• Supports seven refactorings
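For readers unfamiliar with shingles, the Python below is a generic sketch of the idea: a text is represented by its set of overlapping k-token windows, and two texts are compared by the overlap of those sets. It is not RefactoringCrawler's implementation. The tokenizer, the window size `k = 3` and the Jaccard comparison are all assumptions chosen for brevity.

```python
import re


def shingles(source: str, k: int = 3) -> set:
    """Set of all k-token windows (shingles) in the source text."""
    tokens = re.findall(r"\w+|[^\w\s]", source)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two shingle sets (1.0 if both are empty)."""
    return len(a & b) / len(a | b) if a | b else 1.0


# Made-up method bodies: an original, a lightly edited copy, and unrelated code.
old_body = "int total = price * quantity; return total;"
new_body = "int total = price * quantity; log(total); return total;"
unrelated = "while (queue.isEmpty()) { wait(); }"

sim_same = jaccard(shingles(old_body), shingles(old_body))
sim_edit = jaccard(shingles(old_body), shingles(new_body))
sim_other = jaccard(shingles(old_body), shingles(unrelated))
print(sim_same, sim_edit, sim_other)
```

A small edit changes only the shingles that overlap the edited tokens, so the edited copy stays closer to the original than unrelated code does. That robustness to small edits is what makes the encoding useful for finding candidate refactorings.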


Planned Approach

• Examine precision and recall of Ref-Finder
• Analyze results of studies using Ref-Finder
• Examine precision and recall of other tools and methods
• Analyze results of other studies


Further possibilities: Time pressure and releases

Recommendation: “The other time you should avoid refactoring is when you are close to a deadline. At that point the productivity gain from refactoring would only appear after the deadline and thus be too late.”

— Martin Fowler

• Studies equate time pressure with major releases
• Questionable for open-source software


Further possibilities: Bug-inducing vs. Bug-fix-inducing changes

• Are refactorings really responsible for bugs?
• Or do they help find bugs?


Summary

• There are four basic methods to gather empirical data about refactoring
• Tools to detect refactorings are necessary but flawed
• Goal: How valid are the results of refactoring studies?


For Further Reading I

Martin Fowler. Refactoring: Improving the Design of Existing Code. Addison-Wesley Longman Publishing Co., Inc., 1999.

Murphy-Hill, Black, Dig, Parnin. Gathering Refactoring Data: A Comparison of Four Methods. Proceedings of the 2nd Workshop on Refactoring Tools, 7:1–7:5, 2008.

Prete, Rachatasumrit, Sudan, Kim. Template-based Reconstruction of Complex Refactorings. Proceedings of the 2010 IEEE International Conference on Software Maintenance, 1–10, 2010.


Thank You!
