1
Things Structural Clones Tell that Simple Clones Don’t
Hamid Abdul Basit
2
Software Clones
• Simple clones
– the same or similar code fragments
• Structural clones
– higher-level, larger similarities
– Similarity of code and similarity of structure
3
Simple clones
int f2() {
    int d;
    while (isdigit(d)) {
        if (p == buf) p = grow_buf(p);
        skip();
        d = getc(finput);
    }
}

char f3(char) {
    char c;
    while (isdigit(c)) {
        if (p == buf) p = grow_buf(p);
        if (c == '-') return;
        c = getc(finput);
    }
}

int f1(byte) {
    int c;
    while (isalpha(c)) {
        if (p == buf) p = grow_buf(p);
        c = getc(finput);
    }
}
4
Structural Clones
5
Structural Clones
[Figure: Collaborative structural clone. Two parallel chains, CreateTask and CreateProject, span the User Interface, Business Logic and Database layers, linked by "visualizes", "executes" and "accesses" relationships:
CreateTask: CreateTask.UICreateTask(), CreateTask.BLValidateTask(), Task.DBAddTask(), Task Table
CreateProject: CreateProject.UICreateProject(), CreateProject.BLValidateProject(), Project.DBAddProject(), Project Table]
6
When are structural clones useful?
• Showing a bigger picture of the similarity situation: the forest rather than the trees
• Finding refactoring opportunities
• Architecture recovery, program understanding and maintenance
– Structural clones often represent application domain or design concepts
• Re-engineering for reuse
– The bigger the clones, the better for reuse
• Some benefits for plagiarism detection
7
A structure is a graph
[Figure: a graph with nodes A, B, C, D and edges labeled w, x, y, z]
• Entities {A,B,C,D}
• Relationships {w,x,y,z}
Entities
• Physically defined
– Code fragments, files, web pages, directories
• Semantically defined
– Methods, classes, packages
• Conceptually defined
– Components, sub-systems
8
Relationships
• Physical co-location
– Same file, same directory
• Runtime
– Message passing
– Hyperlink between web pages
• Design level
– Inheritance
– Association
– Composition
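The graph view of a structure can be sketched as a small data type: entities become nodes and relationships become labeled edges. The class and method names below are hypothetical, and the endpoints chosen for w, x, y and z are an assumption, since the slide only names the entity set {A,B,C,D} and the relationship set {w,x,y,z}.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// A structure as a labeled graph: entities are nodes, relationships are edges.
final class StructureGraph {

    // One labeled relationship between two entities.
    static final class Relationship {
        final String from;
        final String label;
        final String to;

        Relationship(String from, String label, String to) {
            this.from = from;
            this.label = label;
            this.to = to;
        }

        @Override
        public String toString() {
            return from + " -" + label + "-> " + to;
        }
    }

    private final Set<String> entities = new LinkedHashSet<>();
    private final List<Relationship> relationships = new ArrayList<>();

    // Adds a relationship and registers both endpoints as entities.
    void relate(String from, String label, String to) {
        entities.add(from);
        entities.add(to);
        relationships.add(new Relationship(from, label, to));
    }

    Set<String> entities() {
        return Collections.unmodifiableSet(entities);
    }

    List<Relationship> relationships() {
        return Collections.unmodifiableList(relationships);
    }

    // Entities {A,B,C,D} and relationships {w,x,y,z} as named on the slide.
    // ASSUMPTION: which entities each relationship connects is invented here.
    static StructureGraph slideExample() {
        StructureGraph g = new StructureGraph();
        g.relate("A", "x", "B");
        g.relate("A", "y", "C");
        g.relate("B", "z", "D");
        g.relate("C", "w", "D");
        return g;
    }

    public static void main(String[] args) {
        StructureGraph g = slideExample();
        System.out.println("Entities: " + g.entities());
        g.relationships().forEach(System.out::println);
    }
}
```

The same type would hold any of the entity kinds listed above (files, methods, components) as node names and any of the relationship kinds (co-location, message passing, inheritance) as edge labels.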
9
10
[Figure: structures S4 to S8, each made of entities (e4a to e4d, e5a to e5d, e6a to e6d, e7a to e7d, e8a to e8d) joined by relationships labeled w, x, y and z]
Structural Clones
• Higher-level similarities are composed of lower-level similarities
• They can be recovered by finding repeating configurations of lower-level similarities
11
Observation
Detecting Structural Clones
12
Simple clones in files, grouped into clone patterns
[Figure: simple clones numbered 1 to 15 located in files; the clone patterns recovered from them are:
1,4,8,10,11,12
1,4,7,8,10,11,12
2,5,9,13,15]
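One naive way to find repeating configurations of simple clones is to intersect the sets of simple clone IDs found in every pair of files. The sketch below does only that. It is a stand-in for illustration, not Clone Miner's actual algorithm, and the file names f1 to f4 with their contents are hypothetical, chosen so that two of the patterns listed on the slide emerge.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ClonePatternSketch {

    // For every pair of files, intersect their sets of simple clone IDs.
    // An intersection with at least minSize IDs is kept as a pattern and
    // mapped to the files of the pairs that produced it.
    static Map<Set<Integer>, Set<String>> findPatterns(
            Map<String, Set<Integer>> cloneIdsByFile, int minSize) {
        List<String> files = new ArrayList<>(cloneIdsByFile.keySet());
        Map<Set<Integer>, Set<String>> patterns = new LinkedHashMap<>();
        for (int i = 0; i < files.size(); i++) {
            for (int j = i + 1; j < files.size(); j++) {
                Set<Integer> shared = new TreeSet<>(cloneIdsByFile.get(files.get(i)));
                shared.retainAll(cloneIdsByFile.get(files.get(j)));
                if (shared.size() >= minSize) {
                    Set<String> support =
                            patterns.computeIfAbsent(shared, k -> new TreeSet<>());
                    support.add(files.get(i));
                    support.add(files.get(j));
                }
            }
        }
        return patterns;
    }

    // HYPOTHETICAL input: file names and their simple clone IDs are invented.
    static Map<String, Set<Integer>> sampleInput() {
        Map<String, Set<Integer>> m = new LinkedHashMap<>();
        m.put("f1", new TreeSet<>(Arrays.asList(1, 4, 8, 10, 11, 12)));
        m.put("f2", new TreeSet<>(Arrays.asList(1, 4, 7, 8, 10, 11, 12)));
        m.put("f3", new TreeSet<>(Arrays.asList(2, 5, 9, 13, 15)));
        m.put("f4", new TreeSet<>(Arrays.asList(2, 3, 5, 9, 13, 15)));
        return m;
    }

    public static void main(String[] args) {
        findPatterns(sampleInput(), 2)
                .forEach((pattern, files) ->
                        System.out.println(pattern + " shared by " + files));
    }
}
```

Pairwise intersection is quadratic in the number of files and misses files that contain a pattern only as part of a larger shared set, so a real detector would likely need a proper frequent-itemset technique.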
Detecting Structural Clones
13
File Analysis
[Figure: simple clones 1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13 and 15 grouped by file]
Detecting Structural Clones
14
Directory Analysis
[Figure: simple clones 1, 2, 4, 5, 7, 8, 9, 10, 11, 12, 13 and 15 grouped by directory]
Detecting Structural Clones
15
File Level Structural Clone Across Directories
[Figure: two groups of simple clones, {1, 4, 7, 8, 10} and {2, 5, 9, 13, 15}]
16
Simple Clone Structure (SCS) Across Files
[Figure: files F1 and F2 with code fragments a1, b1, c1 and a2, b2, c2]
17
File Clone Class (FCC)
[Figure: files F1 and F2]
18
File Clone Structure (FCS) Across / Within Directories
[Figure: directories D1 and D2 with files F1, F2, X1, X2, Y1, Y2, Z1 and Z2]
19
Structural Clones Detected by Clone Miner
• Simple Clone Structures (SCS) Across Methods
• Simple Clone Structures (SCS) Across Files
• Method Clone Classes (MCC)
• Method Clone Structures (MCS) Across Files
• File Clone Classes (FCC)
• File Clone Structures (FCS) Across Directories
• File Clone Structures (FCS) Within Directories
20
In earlier work
• We hypothesized the benefits of structural clones
– Re-engineering for reuse, architecture recovery, …
• Defined structural clones
• Implemented Clone Miner, a structural clone detector
• Did initial empirical evaluation
21
In work presented here
• How frequent are the different types of structural clones?
• Are structural clones more meaningful for program understanding and design recovery than simple clones?
• What is the value added by structural clone detection in identifying refactoring opportunities?
22
Case Study Systems
Apache Ivy http://ant.apache.org/ivy/index.html
Apache Ant http://ant.apache.org/
Columba http://sourceforge.net/projects/columba/
Dnsjava http://www.dnsjava.org/
Javax-Swing http://www.oracle.com/technetwork/java/javase/downloads/index.html
JFreeChart http://www.jfree.org/jfreechart/
ANTLR http://www.antlr.org/
DrJava http://www.drjava.org/
FreeCol http://www.freecol.org/
JEdit http://www.jedit.org/
JHotDraw http://www.jhotdraw.org/
23
Systems' Overview
System         Tokens    Files   Lines Of Code   Methods
Apache-Ivy     240711      386           54235      4156
Apache-Ant     711914     1174          224642     12849
Columba        569676     1549          157000      8334
DnsJava        137019      182           30084      2024
Javax-Swing    960906      620          289160     15251
JFreeChart     505448      585          198351      7842
Antlr          243461      258           61414      3618
DrJava         615739      605          140571      7831
FreeCol        518795      488          128854      5892
JEdit          536173      531          157079      6873
JHotDraw       164929      484           63364      4859
AVERAGE        473161      624          136796      7230
TOTAL         4964060     6476         1450519     75373
24
Simple Clone Classes (SCC)
                                  AVERAGE   TOTAL
SCC                                  1275   13231
SCC Instances                        4125   42924
Average Length of SCC Instance         48
Methods Containing SCC               1532   16198
% Methods Containing SCC               22
Files Containing SCC                  332    3476
% Files Containing SCC                 53
Directories Containing SCC             51     504
% Directories Containing SCC           83
25
[Chart: % Methods Containing SCC for each system (Apache-Ivy, Apache-Ant, Columba, DnsJava, Javax-Swing, JFreeChart, Antlr, DrJava, FreeCol, JEdit, JHotDraw); y-axis 0 to 40]
26
1. How frequent are the different types of structural clones?
27
Structural Clones are Frequent
• Simple clones tend to occur in groups
– 56% of simple clones are within structural clones
• There are fewer structural clones than simple clones
28
Simple Clone Classes (SCC) and Simple Clone Structures (SCS)
29
Simple Clone Structures (SCS) Across Files
                                    AVERAGE   TOTAL
SCS                                     727    7257
SCS Instances                          2567   25798
Average Instances In SCS                  3
Average Token Count (ATC) in SCS         87
Average % Cover (APC) in SCS              9
SCC Covered By SCS                      703    7204
% SCC Covered By SCS                     54
Methods Containing SCS                 1009   10665
% Methods Containing SCS                 14
Files Containing SCS                    298    3122
% Files Containing SCS                   48
Directories Containing SCS               48     467
% Directories Containing SCS             80
30
[Chart: % SCC Covered By SCS Across Files for each system (Apache-Ivy, Apache-Ant, Columba, DnsJava, Javax-Swing, JFreeChart, Antlr, DrJava, FreeCol, JEdit, JHotDraw); y-axis 0 to 80]
31
Simple Clone Structures (SCS) Within Files
                                    AVERAGE   TOTAL
SCS                                     306    3229
SCS Instances                          1158   12362
Average Instances In SCS                  3
SCC Covered By SCS                      694    7278
% SCC Covered By SCS                     55
Methods Containing SCS                 1054   11093
% Methods Containing SCS                 15
Files Containing SCS                    212    2216
% Files Containing SCS                   36
Directories Containing SCS               41     413
% Directories Containing SCS             72
32
[Chart: % Files Containing SCS Within Files for each system (Apache-Ivy, Apache-Ant, Columba, DnsJava, Javax-Swing, JFreeChart, Antlr, DrJava, FreeCol, JEdit, JHotDraw); y-axis 0 to 70]
33
Simple Clone Structures (SCS) Across Methods
                                    AVERAGE   TOTAL
SCS                                    1067   10996
SCS Instances                          3788   39398
Average Instances In SCS                  3
Average Token Count (ATC) in SCS         55
Average % Cover (APC) in SCS             53
SCC Covered By SCS                     1081   11168
% SCC Covered By SCS                     86
Methods Containing SCS                 3300   35666
% Methods Containing SCS                 21
Files Containing SCS                    319    3344
% Files Containing SCS                   52
Directories Containing SCS               50     492
% Directories Containing SCS             82
34
[Chart: % SCC Covered By SCS Across Methods for each system (Apache-Ivy, Apache-Ant, Columba, DnsJava, Javax-Swing, JFreeChart, Antlr, DrJava, FreeCol, JEdit, JHotDraw); y-axis 0 to 100]
35
Method Clone Classes (MCC) and Method Clone Structures (MCS)
36
Method Clone Classes (MCC)
                                    AVERAGE   TOTAL
MCC                                     165    1760
MCC Instances                           475    5068
Average Instances In MCC                  2
SCC Covered By MCC                      296    3139
% SCC Covered By MCC                     23
Methods Covered By MCC                  475    5068
% Methods Covered By MCC                  7
Files Containing MCC                    196    2066
% Files Containing MCC                   31
Directories Containing MCC               39     385
% Directories Containing MCC             66
37
Method Clone Structures (MCS) Across Files
                                    AVERAGE   TOTAL
MCS                                      22     230
MCS Instances                            58     612
Average Instances In MCS                  2
MCC Covered By MCS                       45     484
% MCC Covered By MCS                      4
Methods Covered By MCS                  162    1740
% Methods Covered By MCS                  2
Files Containing MCS                     36     384
% Files Containing MCS                    6
Directories Containing MCS               10     106
% Directories Containing MCS             23
38
File Clone Classes (FCC) and File Clone Structures (FCS)
39
File Clone Classes (FCC)
                                    AVERAGE   TOTAL
FCC                                      10     112
FCC Instances                            29     314
Average Instances In FCC                  3
SCC Covered By FCC                       69     755
% SCC Covered By FCC                      5
Files Containing FCC                     28     307
% Files Containing FCC                    4
Directories Containing FCC               10     110
% Directories Containing FCC             23
40
File Clone Structures (FCS) Across Directories
                                    AVERAGE   TOTAL
FCS                                       4      40
FCS Instances                            10      98
Average Instances In FCS                  2
FCC Covered By FCS                        4      38
% FCC Covered By FCS                     40
Files Containing FCS                     12     116
% Files Containing FCS                    2
Directories Containing FCS                7      64
% Directories Containing FCS             16
41
File Clone Structures (FCS) Within Directories
                                    AVERAGE   TOTAL
FCS                                       8      86
FCS Instances                            25     267
Average Instances In FCS                  2
FCC Covered By FCS                        8      83
% FCC Covered By FCS                     66
Files Covered By FCS                     20     218
% Files Covered By FCS                    3
Directories Containing FCS                6      64
% Directories Containing FCS             11
42
2. Are structural clones more meaningful for program understanding and design recovery than simple clones?
43
Improved Program Understanding and Design Recovery
• Analysis is more qualitative than quantitative
– Anecdotal evidence from interesting examples of various types of structural clones
• Larger program parts recovered as clones from a system are expected to be more meaningful than smaller ones
• High-level structural clones such as FCC and FCS appear to be very useful for design recovery: their size highlights design-level similarities between parts of the system
44
FCC Examples from Apache-Ant
File Names                TC (token count)   PC (% cover)
FTP.java                              4586            52%
FTPTaskMirrorImpl.java                4576            63%

TarFileSetTest.java                    451            90%
ZipFileSetTest.java                    451            90%
45
FCC examples from FreeCol
File Names                  TC    PC
ColonialAIPlayer.java     5130   76%
StandardAIPlayer.java     5084   52%

ReportCargoPanel.java      682   74%
ReportNavalPanel.java      686   69%
46
FCC examples from DrJava
File Names TC PC
BackSlashTest.java 2050 84%
SingleQuoteTest.java 2050 88%
47
FCC examples from JFreeChart
File Names                            TC    PC
CombinedDomainCategoryPlot.java     1008   50%
CombinedDomainXYPlot.java           1005   52%

CombinedDomainXYPlot.java           1182   61%
CombinedRangeCategoryPlot.java      1185   71%
CombinedRangeXYPlot.java            1182   63%
48
FCC Example from Javax-Swing
File Names                     TC    PC
MultiButtonUI.java            644   89%
MultiColorChooserUI.java      644   89%
MultiComboBoxUI.java          795   87%
MultiDesktopIconUI.java       644   89%
MultiDesktopPaneUI.java       644   89%
MultiFileChooserUI.java       833   75%
MultiInternalFrameUI.java     644   89%
MultiLabelUI.java             644   89%
MultiListUI.java              708   73%
MultiMenuBarUI.java           644   89%
MultiMenuItemUI.java          644   88%
MultiOptionPaneUI.java        737   87%
MultiPanelUI.java             644   89%
MultiPopupMenuUI.java         702   79%
MultiProgressBarUI.java       644   89%
49
FCC Example from Javax-Swing
File Names                     TC    PC
MultiScrollBarUI.java         644   89%
MultiScrollPaneUI.java        644   89%
MultiSeparatorUI.java         644   89%
MultiSliderUI.java            644   89%
MultiSpinnerUI.java           644   89%
MultiSplitPaneUI.java         862   80%
MultiTabbedPaneUI.java        708   74%
MultiTableHeaderUI.java       644   89%
MultiTableUI.java             644   89%
MultiTextUI.java              798   53%
MultiToolBarUI.java           644   89%
MultiToolTipUI.java           644   89%
MultiTreeUI.java              859   61%
MultiViewportUI.java          644   89%
MultiRootPaneUI.java          644   89%
50
FCS Example from JFreeChart
Directory              TC    PC   File Name
renderer/category/    786   51%   GradientBarPainter.java
renderer/xy/          786   51%   GradientXYBarPainter.java

renderer/category/     53   77%   BarPainter.java
renderer/xy/           53   76%   XYBarPainter.java
51
FCS Example from JFreeChart
[Figure: GradientBarPainter.java implements BarPainter.java and GradientXYBarPainter.java implements XYBarPainter.java. GradientBarPainter.java and GradientXYBarPainter.java are clones (51%, 786 tokens); BarPainter.java and XYBarPainter.java are clones (77%, 53 tokens)]
52
FCS Within Directory in Columba
File Names                            TC    PC
DownAction.java                      247   65%
UpAction.java                        247   63%

NextMessageAction.java               199   62%
PreviousMessageAction.java           199   62%

NextUnreadMessageAction.java         229   71%
PreviousUnreadMessageAction.java     229   71%
53
3. What is the value added by structural clone detection in identifying refactoring opportunities?
54
Better Refactoring Help
• Analysis of structural clones is helpful in locating places where high-level duplication is present that can be restructured or refactored.
• The Form Template Method refactoring can unify similar methods that follow the same high-level algorithm but differ in implementation details, by applying the Template Method design pattern
• Simple clones appear as candidates for the Extract Method refactoring; MCS Across Files could, in addition, indicate possible applications of the Extract Superclass refactoring
55
Template method design pattern
56
A structural clone suitable for Form Template Method refactoring (a toy example)

traverse() {
    backtrack = false;
    cur.i++;
    cur.next = n;
    checkSent(true);
}
init();
cur = findNext();
jumpToNext();
serialize(cur);
send(cur);
updatePos(-1);

processGraph() {
    setEdge(cur);
    cur.next = cur.i;
    checkSent(true);
}
init();
cur = findNext();
jumpToNext();
serialize(cur);
send(cur);
updatePos(1);
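A minimal sketch of how the toy example above might look after Form Template Method: the shared steps move into a final template method in a superclass, and the two varying parts become abstract hooks. The class names GraphWalk, Traverse and ProcessGraph and the hook names are hypothetical. How the fragments fit together inside each method is not recoverable from the slide, so the step order is an assumption, and the steps are recorded as strings rather than executed so that the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Superclass holding the algorithm skeleton shared by both clones.
abstract class GraphWalk {
    protected final List<String> trace = new ArrayList<>();

    // Template method: final, so subclasses cannot reorder the steps.
    // ASSUMPTION: the position of the varying block is a guess.
    final List<String> run() {
        trace.add("init()");
        trace.add("cur = findNext()");
        trace.add("jumpToNext()");
        visitCurrent();                       // varying step 1
        trace.add("serialize(cur)");
        trace.add("send(cur)");
        trace.add("updatePos(" + direction() + ")");  // varying step 2
        return trace;
    }

    // Hook for the block that differs between the two original methods.
    protected abstract void visitCurrent();

    // Hook for the argument passed to updatePos.
    protected abstract int direction();
}

// Variant corresponding to traverse() in the toy example.
final class Traverse extends GraphWalk {
    @Override
    protected void visitCurrent() {
        trace.add("backtrack = false; cur.i++; cur.next = n; checkSent(true)");
    }

    @Override
    protected int direction() {
        return -1;
    }
}

// Variant corresponding to processGraph() in the toy example.
final class ProcessGraph extends GraphWalk {
    @Override
    protected void visitCurrent() {
        trace.add("setEdge(cur); cur.next = cur.i; checkSent(true)");
    }

    @Override
    protected int direction() {
        return 1;
    }
}

final class FormTemplateMethodDemo {
    public static void main(String[] args) {
        new Traverse().run().forEach(System.out::println);
        System.out.println("---");
        new ProcessGraph().run().forEach(System.out::println);
    }
}
```

After the refactoring, the duplicated skeleton exists once in GraphWalk.run(), and each subclass holds only the statements that made the two clones differ.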
57
Target for Form Template Method refactoring from Javax-Swing
protected void layoutVScrollbar(JScrollBar sb) {
    Dimension sbSize = sb.getSize();
    Insets sbInsets = sb.getInsets();
    int itemW = sbSize.width - (sbInsets.left + sbInsets.right);
    int itemX = sbInsets.left;
    boolean squareButtons = DefaultLookup.getBoolean(
        scrollbar, this, "ScrollBar.squareButtons", false);
    int decrButtonH = squareButtons ? itemW : decrButton.getPreferredSize().height;
    int decrButtonY = sbInsets.top;
    int incrButtonH = squareButtons ? itemW : incrButton.getPreferredSize().height;
    int incrButtonY = sbSize.height - (sbInsets.bottom + incrButtonH);
    int sbInsetsH = sbInsets.top + sbInsets.bottom;
    int sbButtonsH = decrButtonH + incrButtonH;
    float trackH = sbSize.height - (sbInsetsH + sbButtonsH);
    float min = sb.getMinimum();
    float extent = sb.getVisibleAmount();
    float range = sb.getMaximum() - min;
    float value = getValue(sb);
    int thumbH = (range <= 0)
        ? getMaximumThumbSize().height : (int)(trackH * (extent / range));
    thumbH = Math.max(thumbH, getMinimumThumbSize().height);
    thumbH = Math.min(thumbH, getMaximumThumbSize().height);
    int thumbY = incrButtonY - thumbH;
    if (value < (sb.getMaximum() - sb.getVisibleAmount())) {
        float thumbRange = trackH - thumbH;
        thumbY = (int)(0.5f + (thumbRange * ((value - min) / (range - extent))));
        thumbY += decrButtonY + decrButtonH;
    }
    int sbAvailButtonH = (sbSize.height - sbInsetsH);
    if (sbAvailButtonH < sbButtonsH) {
        incrButtonH = decrButtonH = sbAvailButtonH / 2;
        incrButtonY = sbSize.height - (sbInsets.bottom + incrButtonH);
    }
    decrButton.setBounds(itemX, decrButtonY, itemW, decrButtonH);
    incrButton.setBounds(itemX, incrButtonY, itemW, incrButtonH);
    int itrackY = decrButtonY + decrButtonH;
    int itrackH = incrButtonY - itrackY;
    trackRect.setBounds(itemX, itrackY, itemW, itrackH);
    if (thumbH >= (int)trackH) {
        setThumbBounds(0, 0, 0, 0);
    } else {
        if ((thumbY + thumbH) > incrButtonY) {
            thumbY = incrButtonY - thumbH;
        }
        if (thumbY < (decrButtonY + decrButtonH)) {
            thumbY = decrButtonY + decrButtonH + 1;
        }
        setThumbBounds(itemX, thumbY, itemW, thumbH);
    }
}

protected void layoutHScrollbar(JScrollBar sb) {
    Dimension sbSize = sb.getSize();
    Insets sbInsets = sb.getInsets();
    int itemH = sbSize.height - (sbInsets.top + sbInsets.bottom);
    int itemY = sbInsets.top;
    boolean ltr = sb.getComponentOrientation().isLeftToRight();
    boolean squareButtons = DefaultLookup.getBoolean(
        scrollbar, this, "ScrollBar.squareButtons", false);
    int leftButtonW = squareButtons ? itemH : decrButton.getPreferredSize().width;
    int rightButtonW = squareButtons ? itemH : incrButton.getPreferredSize().width;
    if (!ltr) {
        int temp = leftButtonW;
        leftButtonW = rightButtonW;
        rightButtonW = temp;
    }
    int leftButtonX = sbInsets.left;
    int rightButtonX = sbSize.width - (sbInsets.right + rightButtonW);
    int sbInsetsW = sbInsets.left + sbInsets.right;
    int sbButtonsW = leftButtonW + rightButtonW;
    float trackW = sbSize.width - (sbInsetsW + sbButtonsW);
    float min = sb.getMinimum();
    float max = sb.getMaximum();
    float extent = sb.getVisibleAmount();
    float range = max - min;
    float value = getValue(sb);
    int thumbW = (range <= 0)
        ? getMaximumThumbSize().width : (int)(trackW * (extent / range));
    thumbW = Math.max(thumbW, getMinimumThumbSize().width);
    thumbW = Math.min(thumbW, getMaximumThumbSize().width);
    int thumbX = ltr ? rightButtonX - thumbW : leftButtonX + leftButtonW;
    if (value < (max - sb.getVisibleAmount())) {
        float thumbRange = trackW - thumbW;
        if (ltr) {
            thumbX = (int)(0.5f + (thumbRange * ((value - min) / (range - extent))));
        } else {
            thumbX = (int)(0.5f + (thumbRange * ((max - extent - value) / (range - extent))));
        }
        thumbX += leftButtonX + leftButtonW;
    }
    int sbAvailButtonW = (sbSize.width - sbInsetsW);
    if (sbAvailButtonW < sbButtonsW) {
        rightButtonW = leftButtonW = sbAvailButtonW / 2;
        rightButtonX = sbSize.width - (sbInsets.right + rightButtonW);
    }
    (ltr ? decrButton : incrButton).setBounds(leftButtonX, itemY, leftButtonW, itemH);
    (ltr ? incrButton : decrButton).setBounds(rightButtonX, itemY, rightButtonW, itemH);
    int itrackX = leftButtonX + leftButtonW;
    int itrackW = rightButtonX - itrackX;
    trackRect.setBounds(itrackX, itemY, itrackW, itemH);
    if (thumbW >= (int)trackW) {
        setThumbBounds(0, 0, 0, 0);
    } else {
        if (thumbX + thumbW > rightButtonX) {
            thumbX = rightButtonX - thumbW;
        }
        if (thumbX < leftButtonX + leftButtonW) {
            thumbX = leftButtonX + leftButtonW + 1;
        }
        setThumbBounds(thumbX, itemY, thumbW, itemH);
    }
}
58
Candidates for Form Template Method from JFreeChart
Classes                                             Methods
GroupedStackedBarRenderer, StackedBarRenderer       drawItem()
XYAreaRenderer, XYAreaRenderer2                     drawItem()
LineAndShapeRenderer, ScatterRenderer               getLegendItem()
DateAxis, PeriodAxis                                valueToJava2D()
ComparableObjectSeries, XYSeries                    add()
CategoryPlot, XYPlot                                readObject()
BarRenderer, LevelRenderer                          drawItem()
DateAxis, NumberAxis                                refreshTicksHorizontal()
XYBubbleRenderer, XYLineAndShapeRenderer            getLegendItem()
MiddlePinNeedle, PinNeedle                          drawNeedle()
ComparableObjectSeries, XYSeries                    hashCode()
AreaRenderer, BoxAndWhiskerRenderer                 getLegendItem()
XYAreaRenderer, XYStepAreaRenderer                  drawItem()
CompassPlot, PiePlot                                setSeriesNeedle()
ColumnArrangement, FlowArrangement                  arrangeNF()
LogAxis, NumberAxis                                 selectVerticalAutoTickUnit()
PaintMap, StrokeMap                                 equals()
XYLineAndShapeRenderer, XYShapeRenderer             drawSecondaryPass()
Minute, Second                                      parseMinute()
59
Extract Superclass candidate from JFreeChart

CombinedDomainCategoryPlot.java:
CombinedDomainCategoryPlot, CombinedDomainCategoryPlot, getGap, setGap, add, add, remove, getSubplots, findSubplot, zoomRangeAxes, zoomRangeAxes, zoomRangeAxes, calculateAxisSpace, draw, setFixedRangeAxisSpaceForSubplots, setOrientation, getDataRange, getLegendItems, getCategories, getCategoriesForAxis, handleClick, plotChanged, equals, clone

CombinedDomainXYPlot.java:
CombinedDomainXYPlot, CombinedDomainXYPlot, getPlotType, setOrientation, getDataRange, getGap, setGap, add, add, remove, getSubplots, calculateAxisSpace, draw, getLegendItems, zoomRangeAxes, zoomRangeAxes, zoomRangeAxes, findSubplot, setRenderer, setFixedRangeAxisSpace, setFixedRangeAxisSpaceForSubplots, handleClick, plotChanged, equals, clone
60
Extract Superclass and Structural Clones
• The two files contain a number of similarly named methods, but only through structural clone analysis could we find the methods that are also significantly similar in content
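To make the contrast concrete, the sketch below computes only the name overlap between the two JFreeChart files, using the method names listed on the previous slide (constructors omitted, overloads listed once). The class and method names in the sketch are hypothetical. This is the crude filter that name matching alone provides; it says nothing about whether the method bodies are similar, which is the part structural clone analysis adds.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SharedMethodNames {

    // Method names of CombinedDomainCategoryPlot as listed on the slide.
    static final List<String> CATEGORY_PLOT = Arrays.asList(
            "getGap", "setGap", "add", "remove", "getSubplots", "findSubplot",
            "zoomRangeAxes", "calculateAxisSpace", "draw",
            "setFixedRangeAxisSpaceForSubplots", "setOrientation",
            "getDataRange", "getLegendItems", "getCategories",
            "getCategoriesForAxis", "handleClick", "plotChanged",
            "equals", "clone");

    // Method names of CombinedDomainXYPlot as listed on the slide.
    static final List<String> XY_PLOT = Arrays.asList(
            "getPlotType", "setOrientation", "getDataRange", "getGap",
            "setGap", "add", "remove", "getSubplots", "calculateAxisSpace",
            "draw", "getLegendItems", "zoomRangeAxes", "findSubplot",
            "setRenderer", "setFixedRangeAxisSpace",
            "setFixedRangeAxisSpaceForSubplots", "handleClick",
            "plotChanged", "equals", "clone");

    // Returns the names that occur in both lists, sorted alphabetically.
    static Set<String> shared(List<String> a, List<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> common = shared(CATEGORY_PLOT, XY_PLOT);
        System.out.println(common.size() + " shared names: " + common);
    }
}
```

A shared name is only a hint that a method could move to a common superclass; whether it should depends on the content similarity that the clone detector reports.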
61
Another Candidate for Extract Superclass from JFreeChart
FlowArrangement:
FlowArrangement, FlowArrangement, add, arrange, arrangeFN, arrangeFR, arrangeFF, arrangeRR, arrangeRF, arrangeRN, arrangeNN, arrangeNF, clear, equals

ColumnArrangement:
ColumnArrangement, ColumnArrangement, add, arrange, arrangeFF, arrangeRR, arrangeRF, arrangeNN, arrangeNF, clear, equals
62
More Candidates for Extract Superclass from JFreeChart
BarRenderer, LevelRenderer
LogAxis, NumberAxis
AbstractCategoryItemRenderer, AbstractXYItemRenderer
MilliSecond, Second
PaintMap, StrokeMap
CombinedDomainXYPlot, CombinedRangeXYPlot
Minute, Second
CombinedRangeCategoryPlot, CombinedDomainCategoryPlot
CombinedRangeCategoryPlot, CombinedRangeXYPlot, CombinedDomainCategoryPlot
DefaultIntervalXYDataset, DefaultXYDataset, DefaultXYZDataset
63
Conclusions
• Structural clones often represent important design concepts
• Structural clone detection becomes an aid in design recovery
• Structural clones show a bigger picture of code duplication and guide the choice of the refactoring technique applicable in the given situation
64
Future Work
• Classification of structural clones
• Detection of other types of structural clones
• Better integration of structural clones with architecture recovery and re-modularization techniques
• Better visualization of structural clones
• Management of structural clones with meta-programming
• Re-engineering for reuse
65
Thank you